NameError: name 'pycrfsuite' is not defined - python

The following error appears when I run a cell on Google Colab:
NameError Traceback (most recent call last)
<ipython-input-36-5f325bc0550d> in <module>()
4
5 TAGGER_PATH = "crf_nlu.tagger" # path to the tagger- it will save/access the model from here
----> 6 ct = CRFTagger(feature_func=get_features) # initialize tagger with get_features function
7
8 print("training tagger...")
/usr/local/lib/python3.6/dist-packages/nltk/tag/crf.py in __init__(self, feature_func, verbose, training_opt)
81
82 self._model_file = ''
---> 83 self._tagger = pycrfsuite.Tagger()
84
85 if feature_func is None:
NameError: name 'pycrfsuite' is not defined
This is the code from the cell:
# Train the CRF BIO-tag tagger
import pycrfsuite
TAGGER_PATH = "crf_nlu.tagger" # path to the tagger- it will save/access the model from here
ct = CRFTagger(feature_func=get_features) # initialize tagger with get_features function
print("training tagger...")
ct.train(training_data, TAGGER_PATH)
print("done")
What causes this issue?
I have an import for the CRFTagger which is:
from nltk.tag import CRFTagger

I just sorted it out. On Google Colab, I had to add the following line before running the cell:
pip install sklearn-crfsuite

Use:
pip install python-crfsuite
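python-crfsuite is the package that provides the pycrfsuite module used by NLTK's CRFTagger. sklearn-crfsuite is a wrapper around it that provides an API similar to scikit-learn's.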
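Since the NameError only shows up when CRFTagger is constructed, it can help to check for the module up front. This is a minimal sketch, assuming the only problem is that the pycrfsuite module is not installed; is_module_available is a hypothetical helper name, not part of NLTK.

```python
import importlib.util


def is_module_available(name: str) -> bool:
    """Return True if the top-level module `name` can be found for import."""
    return importlib.util.find_spec(name) is not None


# Check before constructing CRFTagger, so the failure is reported clearly.
if is_module_available("pycrfsuite"):
    print("pycrfsuite is installed")
else:
    print("pycrfsuite is missing; run: pip install python-crfsuite")
```

If the check reports the module as missing, install it and rerun the cell that creates the tagger.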

Related

Can't create file into specific folder using gspread google colab

Yesterday I was still able to create Google Sheets files in a specific folder with gspread, using this code:
ss = gc.create(fileName,"my folder destination")
But today, this code yields an error.
Here is my full code:
from google.colab import auth
auth.authenticate_user()
import pandas as pd
import gspread
from google.auth import default
creds, _ = default()
gc = gspread.authorize(creds)
wb = gc.open('file name')
ws = wb.worksheet('sheet name')
# get_all_values gives a list of rows.
rows = ws.get_all_values()
df = pd.DataFrame.from_records(rows[1:],columns=rows[0])
modisseries = df["column name"]
uniquemodis = modisseries.drop_duplicates().tolist()
def createSpreadsheet(columnName):
    # Keep only the rows that match this value
    ndf = df[df["Modis"] == columnName]
    nlist = [ndf.columns.tolist()] + ndf.to_numpy().tolist()
    ss = gc.create(columnName, "Folder_id")
    nws = ss.sheet1
    nws.update_title(columnName)
    nws.update("A1", nlist, value_input_option="USER_ENTERED")

for modis in uniquemodis:
    createSpreadsheet(modis)
And here is the error message:
TypeError Traceback (most recent call last)
<ipython-input-1-5c297289782b> in <module>()
30
31 for column_value in uniquecolumn:
---> 32 createSpreadsheet(column_value)
<ipython-input-1-5c297289782b> in createSpreadsheet(column_value)
24 nlist = [ndf.columns.tolist()] + ndf.to_numpy().tolist()
25
---> 26 ss = gc.create(column_value,"folder_id")
27 nws = ss.sheet1
28 nws.update_title(column_value)
TypeError: create() takes 2 positional arguments but 3 were given
Which version of gspread are you using?
The folder_id parameter was introduced in version 3.5.0.
From the error message, it seems you have an older version that does not accept the folder_id parameter.
You can check the version you are currently using with: print(gspread.__version__)
If it is below 3.5.0, run the following command to upgrade:
python3 -m pip install -U gspread
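Comparing version strings directly can mislead, because "3.10.0" sorts before "3.5.0" as text. Here is a small sketch of a numeric comparison, assuming the 3.5.0 threshold stated above; version_tuple and supports_folder_id are hypothetical helper names, not part of gspread.

```python
import re


def version_tuple(version: str) -> tuple:
    """Turn '3.4.2' into (3, 4, 2); stops at the first non-numeric part."""
    numbers = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def supports_folder_id(installed: str, minimum: str = "3.5.0") -> bool:
    """Return True if the installed version is at least the minimum version."""
    return version_tuple(installed) >= version_tuple(minimum)


# In the notebook this would be called as supports_folder_id(gspread.__version__).
print(supports_folder_id("3.0.1"))
print(supports_folder_id("3.10.0"))
```

After upgrading on Colab, the runtime usually has to be restarted before the new version is picked up by import.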

How can I fix this error: AttributeError: module 'graphviz.backend' has no attribute 'ENCODING'?

I'm working on visualizing a tree using Graphviz. I'm basing my code on the examples on their website, and I think my code itself is fine. However, I can't run it because of this error. My code is:
import json
from graphviz import Digraph

with open('greetings.json') as json_file:
    data = json.load(json_file)

newstring = data['data']['stitches']['hiImSymmi']['content']
print(newstring)
g = Digraph('G', filename='hello.gv')
g.edge('Hello', newstring[0])
g.view()
The full error message is:
AttributeError Traceback (most recent call last)
Input In [7], in <cell line: 2>()
1 import json
----> 2 from graphviz import Digraph
3 #from treelib import Node, Tree
5 with open('greetings.json') as json_file:
File ~/opt/anaconda3/lib/python3.8/site-packages/graphviz/__init__.py:27, in <module>
1 # graphviz - create dot, save, render, view
3 """Assemble DOT source code and render it with Graphviz.
4
5 >>> dot = Digraph(comment='The Round Table')
(...)
24 }
25 """
---> 27 from .dot import Graph, Digraph
28 from .files import Source
29 from .lang import escape, nohtml
File ~/opt/anaconda3/lib/python3.8/site-packages/graphviz/dot.py:32, in <module>
3 r"""Assemble DOT source code objects.
4
5 >>> dot = Graph(comment=u'M\xf8nti Pyth\xf8n ik den H\xf8lie Grailen')
(...)
28 'test-output/m00se.gv.pdf'
29 """
31 from . import backend
---> 32 from . import files
33 from . import lang
35 __all__ = ['Graph', 'Digraph']
File ~/opt/anaconda3/lib/python3.8/site-packages/graphviz/files.py:22, in <module>
16 __all__ = ['File', 'Source']
19 log = logging.getLogger(__name__)
---> 22 class Base(object):
24 _engine = 'dot'
26 _format = 'pdf'
File ~/opt/anaconda3/lib/python3.8/site-packages/graphviz/files.py:28, in Base()
24 _engine = 'dot'
26 _format = 'pdf'
---> 28 _encoding = backend.ENCODING
30 @property
31 def engine(self):
32 """The layout commmand used for rendering (``'dot'``, ``'neato'``, ...)."""
AttributeError: module 'graphviz.backend' has no attribute 'ENCODING'
I understand that a similar question about the same kind of error was asked previously. However, I don't really understand the answers that were given to that question. Any help with a specific way to fix this would be greatly appreciated. Thanks a lot.

Colab + sedona: TypeError: 'JavaPackage' object is not callable

I'm trying to use Apache Sedona in Python on Google Colab, following the guide here: https://sedona.apache.org/setup/install-python/
However, following the guide results in a TypeError. Any ideas?
!pip install apache-sedona[spark]
from pyspark.sql import SparkSession
from sedona.register import SedonaRegistrator
from sedona.utils import SedonaKryoRegistrator, KryoSerializer
spark = SparkSession. \
    builder. \
    appName('appName'). \
    config("spark.serializer", KryoSerializer.getName). \
    config("spark.kryo.registrator", SedonaKryoRegistrator.getName). \
    config('spark.jars.packages',
           'org.apache.sedona:sedona-python-adapter-3.0_2.12:1.2.0-incubating,'
           'org.datasyslab:geotools-wrapper:1.1.0-25.2'). \
    getOrCreate()
SedonaRegistrator.registerAll(spark)
Result:
TypeError Traceback (most recent call last)
<ipython-input-73-3e0f20bf8072> in <module>()
11 'org.apache.sedona:sedona-python-adapter-3.0_2.12:1.0.0-incubating,'
12 'org.datasyslab:geotools-wrapper:geotools-24.0').getOrCreate()
---> 13 SedonaRegistrator.registerAll(spark)
14 dfx = spark.sql("SELECT ST_PolygonFromText('-74.0428197,40.6867969,-74.0421975,40.6921336,-74.0508020,40.6912794,-74.0428197,40.6867969', ',') AS polygonshape")
15 dfx
1 frames
/usr/local/lib/python3.7/dist-packages/sedona/register/geo_registrator.py in registerAll(cls, spark)
41 spark.sql("SELECT 1 as geom").count()
42 PackageImporter.import_jvm_lib(spark._jvm)
---> 43 cls.register(spark)
44 return True
45
/usr/local/lib/python3.7/dist-packages/sedona/register/geo_registrator.py in register(cls, spark)
46 @classmethod
47 def register(cls, spark: SparkSession):
---> 48 return spark._jvm.SedonaSQLRegistrator.registerAll(spark._jsparkSession)
49
50
TypeError: 'JavaPackage' object is not callable
You have to download the missing Sedona JAR files, using one of the methods described at https://sedona.apache.org/setup/install-python/#prepare-python-adapter-jar
One simple option is to download them into Spark's jars directory:
cd $SPARK_HOME/jars
wget https://repo1.maven.org/maven2/org/datasyslab/geotools-wrapper/1.1.0-25.2/geotools-wrapper-1.1.0-25.2.jar
wget https://repo1.maven.org/maven2/org/apache/sedona/sedona-python-adapter-3.0_2.12/1.2.0-incubating/sedona-python-adapter-3.0_2.12-1.2.0-incubating.jar
wget https://repo1.maven.org/maven2/org/apache/sedona/sedona-viz-3.0_2.12/1.2.0-incubating/sedona-viz-3.0_2.12-1.2.0-incubating.jar
This question is closely related to the question "sedona error: java.lang.NoClassDefFoundError: org/opengis/referencing/FactoryException".
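To confirm that the downloads landed where Spark looks for them, a short check can help. This is only a sketch: it assumes SPARK_HOME is set (a pip-installed pyspark may not set it) and that the file names start with the artifact names from the wget commands above. missing_jar_prefixes is a hypothetical helper, not part of Sedona or PySpark.

```python
import os
from pathlib import Path

# Artifact name prefixes taken from the download URLs above.
REQUIRED_JAR_PREFIXES = ("geotools-wrapper", "sedona-python-adapter")


def missing_jar_prefixes(jars_dir, required=REQUIRED_JAR_PREFIXES):
    """Return the required prefixes that no .jar file in jars_dir starts with."""
    jar_names = [path.name for path in Path(jars_dir).glob("*.jar")]
    return [
        prefix
        for prefix in required
        if not any(name.startswith(prefix) for name in jar_names)
    ]


spark_home = os.environ.get("SPARK_HOME")
if spark_home is None:
    print("SPARK_HOME is not set; pass the jars directory explicitly")
else:
    print("missing:", missing_jar_prefixes(Path(spark_home) / "jars"))
```

An empty list means both JARs are present. A Spark session that was already running before the files were added will most likely need to be restarted to load them.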

missing variable to complete cell execution

I hope someone can help me. I've been stuck on this error for a while. I have two .py files that I am importing in a Jupyter notebook. When running the last cell (see code below), I get a traceback I can't fix.
I think there is an error in my ch_data_prep.py file: the variable df_ch does not seem to be passed correctly between files. Is this possible? Any suggestions on how to solve this problem?
Thanks!
ch_data_prep.py
def seg_data(self):
    seg_startdate = input('Enter start date (yyyy-mm-dd): ')
    seg_finishdate = input('Enter end date (yyyy-mm-dd): ')
    df_ch_seg = df_ch[(df_ch['event_datetime'] > seg_startdate)
                      & (df_ch['event_datetime'] < seg_finishdate)
                      ]
    return df_ch_seg

df_ch_seg = seg_data(df_ch)
ch_db.py
def get_data():
    # Some omitted code here to connect to database and get data...
    df_ch = pd.DataFrame(result)
    return df_ch

df_ch = get_data()
Jupyter Notebook data_analysis.ipynb
In[1]: import ch_db
df_ch = ch_db.get_data()
In[2]: import ch_data_prep
When I run cell 2, I get this error:
--------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-3-fbe2d4fceba6> in <module>
----> 1 import ch_data_prep
~/clickstream/ch_data_prep.py in <module>
34 return df_ch_seg
35
---> 36 df_ch_seg = seg_data()
TypeError: seg_data() missing 1 required positional argument: 'df_ch'
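The traceback indicates that ch_data_prep.py calls seg_data() at import time without the argument its signature requires. A variable named df_ch in the notebook is also not visible inside the imported module, because each module has its own global namespace. Below is a minimal sketch of one possible restructuring, assuming the module-level call is removed and the data and dates are passed in explicitly. A list of dicts with ISO date strings stands in for the DataFrame so the example runs without pandas; the parameter names start_date and end_date and the sample values are hypothetical.

```python
def seg_data(rows, start_date, end_date):
    """Return the rows whose 'event_datetime' lies strictly between the two dates.

    ISO-formatted date strings (yyyy-mm-dd) compare correctly as plain strings.
    """
    return [row for row in rows if start_date < row["event_datetime"] < end_date]


# Stand-in for the data returned by ch_db.get_data() (hypothetical values).
rows = [
    {"event_datetime": "2020-01-05", "event": "click"},
    {"event_datetime": "2020-02-10", "event": "view"},
    {"event_datetime": "2020-03-15", "event": "click"},
]

segment = seg_data(rows, "2020-01-31", "2020-03-01")
print(segment)
```

With pandas, the function body would be the boolean-mask filter from the question, and the notebook would call the function on df_ch after importing both modules, instead of relying on a call that runs during import.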

Why do I get an AttributeError when using NeuroKit?

Why do I get an AttributeError when I run this code in Jupyter? I am trying to figure out how to use NeuroKit.
I've tried looking through the modules one by one, but I can't seem to find the error.
import neurokit as nk
import pandas as pd
import numpy as np
import sklearn
df = pd.read_csv("https://raw.githubusercontent.com/neuropsychology/NeuroKit.py/master/examples/Bio/bio_100Hz.csv")
# Process the signals
bio = nk.bio_process(ecg=df["ECG"], rsp=df["RSP"], eda=df["EDA"], add=df["Photosensor"], sampling_rate=1000 )
Output Message:
AttributeError Traceback (most recent call last)
<ipython-input-2-ad0abf8de45e> in <module>
11
12 # Process the signals
---> 13 bio = nk.bio_process(ecg=df["ECG"], rsp=df["RSP"], eda=df["EDA"], add=df["Photosensor"], sampling_rate=1000 )
14 # Plot the processed dataframe, normalizing all variables for viewing purpose
15 nk.z_score(bio["df"]).plot()
~\Anaconda3\lib\site-packages\neurokit\bio\bio_meta.py in bio_process(ecg, rsp, eda, emg, add, sampling_rate, age, sex, position, ecg_filter_type, ecg_filter_band, ecg_filter_frequency, ecg_segmenter, ecg_quality_model, ecg_hrv_features, eda_alpha, eda_gamma, scr_method, scr_treshold, emg_names, emg_envelope_freqs, emg_envelope_lfreq, emg_activation_treshold, emg_activation_n_above, emg_activation_n_below)
123 # ECG & RSP
124 if ecg is not None:
--> 125 ecg = ecg_process(ecg=ecg, rsp=rsp, sampling_rate=sampling_rate, filter_type=ecg_filter_type, filter_band=ecg_filter_band, filter_frequency=ecg_filter_frequency, segmenter=ecg_segmenter, quality_model=ecg_quality_model, hrv_features=ecg_hrv_features, age=age, sex=sex, position=position)
126 processed_bio["ECG"] = ecg["ECG"]
127 if rsp is not None:
~\Anaconda3\lib\site-packages\neurokit\bio\bio_ecg.py in ecg_process(ecg, rsp, sampling_rate, filter_type, filter_band, filter_frequency, segmenter, quality_model, hrv_features, age, sex, position)
117 # ===============
118 if quality_model is not None:
--> 119 quality = ecg_signal_quality(cardiac_cycles=processed_ecg["ECG"]["Cardiac_Cycles"], sampling_rate=sampling_rate, rpeaks=processed_ecg["ECG"]["R_Peaks"], quality_model=quality_model)
120 processed_ecg["ECG"].update(quality)
121 processed_ecg["df"] = pd.concat([processed_ecg["df"], quality["ECG_Signal_Quality"]], axis=1)
~\Anaconda3\lib\site-packages\neurokit\bio\bio_ecg.py in ecg_signal_quality(cardiac_cycles, sampling_rate, rpeaks, quality_model)
355
356 if quality_model == "default":
--> 357 model = sklearn.externals.joblib.load(Path.materials() + 'heartbeat_classification.model')
358 else:
359 model = sklearn.externals.joblib.load(quality_model)
AttributeError: module 'sklearn' has no attribute 'externals'
If you don't need the most recent fixes, you could downgrade your scikit-learn version using:
pip install scikit-learn==0.20.1
There is an open issue to fix this problem in a future version:
https://github.com/neuropsychology/NeuroKit.py/issues/101
I'm executing the exact same code as you and ran into the same problem.
I followed the link posted by Louis MAYAUD, and there they suggest simply adding:
from sklearn.externals import joblib
That solves the problem, and you don't need to downgrade your scikit-learn version.
Happy coding! :)
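The explicit import helps because a plain "import sklearn" does not load the sklearn.externals subpackage; importing it binds it as an attribute of sklearn, which is what NeuroKit's code looks up. As far as I know, sklearn.externals.joblib was deprecated in scikit-learn 0.21 and removed in 0.23, so on newer versions that import itself fails. For your own code (this does not patch NeuroKit), here is a hedged sketch of a fallback that tries the bundled copy first and then the standalone joblib package. import_first_available is a hypothetical helper name, not part of scikit-learn or NeuroKit.

```python
import importlib


def import_first_available(*module_names):
    """Import and return the first module in module_names that can be imported."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue  # try the next candidate
    raise ImportError(
        "none of these modules could be imported: " + ", ".join(module_names)
    )


# Intended use (requires scikit-learn or joblib to be installed):
# joblib = import_first_available("sklearn.externals.joblib", "joblib")

# Demonstration with a standard-library module so the sketch runs anywhere.
json_module = import_first_available("surely_missing_module_xyz", "json")
print(json_module.__name__)
```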
