So I'm trying out machine learning and following a tutorial I found online.
For some reason, when I run my code, numpy gives me an error even though I am not importing that library. (I've been having problems with numpy.)
Code:
#!/usr/bin/env python
from sklearn import tree
#1 = smooth 0 = bumpy
features = [[140, 1], [130, 1], [150, 0], [170, 0]] #input
labels = ["apple", "apple", "orange", "orange"] #desired output
#0 = apple 1 = orange
clf = tree.DecisionTreeClassifier()
clf = clf.fit(features, labels)
print clf.predict([[160, 0]])
Error:
C:\Windows\system32\cmd.exe /c (python ^<C:\Users\me\AppData\Local\Temp\22\VIi532A.tmp)
Traceback (most recent call last):
File "<stdin>", line 3, in <module>
File "E:\Python27\lib\site-packages\sklearn\__init__.py", line 134, in <module>
from .base import clone
File "E:\Python27\lib\site-packages\sklearn\base.py", line 9, in <module>
import numpy as np
File "E:\Python27\lib\site-packages\numpy\__init__.py", line 142, in <module>
from . import add_newdocs
File "E:\Python27\lib\site-packages\numpy\add_newdocs.py", line 13, in <module>
from numpy.lib import add_newdoc
File "E:\Python27\lib\site-packages\numpy\lib\__init__.py", line 8, in <module>
from .type_check import *
File "E:\Python27\lib\site-packages\numpy\lib\type_check.py", line 11, in <module>
import numpy.core.numeric as _nx
File "E:\Python27\lib\site-packages\numpy\core\__init__.py", line 21, in <module>
from . import function_base
File "E:\Python27\lib\site-packages\numpy\core\function_base.py", line 7, in <module>
from .numeric import (result_type, NaN, shares_memory, MAY_SHARE_BOUNDS,
ImportError: cannot import name shares_memory
shell returned 1
Hit any key to close this window...
Thanks
P.S.
Also looking for a couple of tutorial suggestions; one covering machine learning and NLP would be great.
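NumPy is a scikit-learn dependency: scikit-learn is built on top of NumPy, so importing sklearn imports numpy even though your script never does.
The traceback shows numpy failing inside its own package (cannot import name shares_memory), which usually points to a broken or mixed-version numpy install rather than a problem in your code. Creating a fresh virtualenv and installing numpy and scikit-learn into it is a good way to find out what the real issue is.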
The same code worked for me and I can tell you the prediction is "orange". :P
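If it helps with the diagnosis, here is a rough sketch for checking which copy of a package Python would pick up, without actually importing it (so it still works while the import itself is failing). It is written for Python 3 and uses only the standard library; the question's Python 2.7 does not have importlib.util.find_spec, so treat this as an illustration rather than something to paste in as-is. The helper name locate_package is made up for this example.

```python
import importlib.util


def locate_package(name):
    """Return the file a top-level package would be loaded from, or None.

    find_spec only looks the package up; it does not run the package's
    __init__, so a package that crashes on import can still be located.
    """
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Prints a path for each installed package, or None if it is not installed.
for pkg in ("numpy", "sklearn"):
    print(pkg, "->", locate_package(pkg))
```

If the path printed for numpy is not inside the environment you expect, that would be a hint that two installs are getting mixed up.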
Related
I tried to use distplot to plot an array of double values but it failed. Below is my source code:
>>> import seaborn as sns, numpy as np
>>> sns.set(); np.random.seed(0)
>>> x = np.random.randn(100)
>>> ax = sns.distplot(x)
Below is the error I got. I don't know what is wrong with my code. Does anyone know the issue?
>>> ax = sns.distplot(x)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/anaconda/lib/python3.6/site-packages/seaborn/distributions.py", line 221, in distplot
kdeplot(a, vertical=vertical, ax=ax, color=kde_color, **kde_kws)
File "/anaconda/lib/python3.6/site-packages/seaborn/distributions.py", line 604, in kdeplot
cumulative=cumulative, **kwargs)
File "/anaconda/lib/python3.6/site-packages/seaborn/distributions.py", line 270, in _univariate_kdeplot
cumulative=cumulative)
File "/anaconda/lib/python3.6/site-packages/seaborn/distributions.py", line 328, in _statsmodels_univariate_kde
kde.fit(kernel, bw, fft, gridsize=gridsize, cut=cut, clip=clip)
File "/anaconda/lib/python3.6/site-packages/statsmodels/nonparametric/kde.py", line 146, in fit
clip=clip, cut=cut)
File "/anaconda/lib/python3.6/site-packages/statsmodels/nonparametric/kde.py", line 506, in kdensityfft
f = revrt(zstar)
File "/anaconda/lib/python3.6/site-packages/statsmodels/nonparametric/kdetools.py", line 20, in revrt
y = X[:m/2+1] + np.r_[0,X[m/2+1:],0]*1j
TypeError: slice indices must be integers or None or have an __index__ method
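BTW, I am using Python 3.6.
This is caused by an old version of statsmodels; the problem is fixed in version 0.8.0. The failing line slices with m/2, which is a float under Python 3 and so cannot be used as a slice index. Upgrade statsmodels as described in https://github.com/mwaskom/seaborn/issues/1092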
conda update statsmodels
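To see why the old code breaks on Python 3, here is a small stand-alone sketch of the same kind of slice as in the traceback, using a plain list instead of the statsmodels array. The variable names X and m mirror the traceback line; the data is made up.

```python
X = list(range(10))
m = len(X)

try:
    X[: m / 2 + 1]  # "/" is true division in Python 3, so this index is a float
    float_slice_ok = True
except TypeError:
    float_slice_ok = False

# Floor division keeps the index an integer, so the slice works.
head = X[: m // 2 + 1]

print(float_slice_ok, head)
```

Under Python 2 the first slice would have worked because "/" on two ints was integer division, which is presumably why the old statsmodels code was written that way.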
I am running Python 3 and attempting to run this code:
from sklearn.preprocessing import LabelEncoder
cv=train.dtypes.loc[train.dtypes=='object'].index
print (cv)
le=LabelEncoder()
for i in cv:
train[i]=le.fit_transform(train[i])
test[i]=le.fit_transform(test[i])
However, I get this error:
le=LabelEncoder()
for i in cv:
train[i]=le.fit_transform(train[i])
test[i]=le.fit_transform(test[i])
Traceback (most recent call last):
File "<ipython-input-5-8739984f61b2>", line 3, in <module>
train[i]=le.fit_transform(train[i])
File "C:\Users\myname\Anaconda3\lib\site-packages\sklearn\preprocessing\label.py", line 127, in fit_transform
self.classes_, y = np.unique(y, return_inverse=True)
File "C:\Users\myname\Anaconda3\lib\site-packages\numpy\lib\arraysetops.py", line 195, in unique
perm = ar.argsort(kind='mergesort' if return_index else 'quicksort')
TypeError: unorderable types: str() > float()
Oddly enough, if I call the encoder on a specified column in my data, the output is successful. For instance:
le.fit_transform(test['Race'])
Results in:
le.fit_transform(test['Race'])
Out[7]: array([2, 4, 4, ..., 4, 1, 4], dtype=int64)
I've tried:
float(le.fit_transform(train[i]))
str(le.fit_transform(train[i]))
Neither has worked.
Could someone please help me out?
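The error message says a str was compared with a float, which is what happens when a text column also holds missing values (NaN is a float). That is only a guess about the data here. The sketch below reproduces the comparison failure with the standard library and shows one possible workaround: replacing missing entries with a placeholder string before encoding. The column values, the fill_missing helper and the "missing" placeholder are all made up for illustration, and the last two lines only imitate what a label encoder does (sorted unique values mapped to integer codes).

```python
import math

# Hypothetical column: text labels plus one missing value stored as NaN.
column = ["red", "blue", float("nan"), "green"]

try:
    sorted(column)  # Python 3 refuses to order str against float
    sortable = True
except TypeError:
    sortable = False


def fill_missing(values, placeholder="missing"):
    """Replace float NaN entries with a placeholder string."""
    return [
        placeholder if isinstance(v, float) and math.isnan(v) else v
        for v in values
    ]


cleaned = fill_missing(column)
classes = sorted(set(cleaned))               # sorted unique labels
codes = [classes.index(v) for v in cleaned]  # integer code per row

print(sortable, classes, codes)
```

With pandas data the equivalent step would be filling the missing values in each object column before calling the encoder; a column with no missing values would encode fine on its own, which would match what you see with the one column that works.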
I have this simple code to test matplotlib using Python 2.7 on 32-bit Fedora 23.
import matplotlib.pyplot as plt
x = [1,2,3,4]
y = [20, 21, 20.5, 20.8]
plt.plot(x, y)
plt.show()
which gives the following output:
Exception in Tkinter callback
Traceback (most recent call last):
File "/usr/lib/python2.7/lib-tk/Tkinter.py", line 1536, in __call__
return self.func(*args)
File "/usr/lib/python2.7/site-packages/matplotlib/backends/backend_tkagg.py", line 278, in resize
self.show()
File "/usr/lib/python2.7/site-packages/matplotlib/backends/backend_tkagg.py", line 350, in draw
tkagg.blit(self._tkphoto, self.renderer._renderer, colormode=2)
File "/usr/lib/python2.7/site-packages/matplotlib/backends/tkagg.py", line 21, in blit
_tkagg.tkinit(tk.interpaddr(), 1)
OverflowError: Python int too large to convert to C long
Any ideas what is wrong here?
Thanks
I created a program to do some calculations, part of which is an interpolation. For some reason the program cannot load the scipy interpolate module. How can I make it work? The path to the scipy interpolate module is right.
from scipy.interpolate import interplt
ftempacce = interplt(temp, acce, kind='linear')
ait = ftempacce(t)
Log Error:
SyntaxError: invalid syntax
>>> from scipy.interpolate import interplt
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python27\lib\site-packages\scipy\interpolate\__init__.py", line 150, in <module>
from .interpolate import *
File "C:\Python27\lib\site-packages\scipy\interpolate\interpolate.py", line 467
ait if self.bounds_error and below_bounds.any():
^
Variables temp and acce are lists of numbers:
>>> type(temp)
<type 'list'>
>>> type(acce)
<type 'list'>
>>>
I am trying to run an Augmented Dickey-Fuller test in statsmodels in Python, but I seem to be missing something.
This is the code that I am trying:
import numpy as np
import statsmodels.tsa.stattools as ts
x = np.array([1,2,3,4,3,4,2,3])
result = ts.adfuller(x)
I get the following error:
Traceback (most recent call last):
File "C:\Users\Akavall\Desktop\Python\Stats_models\stats_models_test.py", line 12, in <module>
result = ts.adfuller(x)
File "C:\Python27\lib\site-packages\statsmodels-0.4.1-py2.7-win32.egg\statsmodels\tsa\stattools.py", line 201, in adfuller
xdall = lagmat(xdiff[:,None], maxlag, trim='both', original='in')
File "C:\Python27\lib\site-packages\statsmodels-0.4.1-py2.7-win32.egg\statsmodels\tsa\tsatools.py", line 305, in lagmat
raise ValueError("maxlag should be < nobs")
ValueError: maxlag should be < nobs
My Numpy Version: 1.6.1
My statsmodels Version: 0.4.1
I am using windows.
I am looking at the documentation here but can't figure out what I am doing wrong. What am I missing?
Thanks in Advance.
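I figured it out. By default maxlag is set to None, in which case adfuller chooses a lag length automatically; for a series this short that automatic value is apparently not smaller than the number of observations, hence the error. Passing a small integer explicitly works. Something like this: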
import numpy as np
import statsmodels.tsa.stattools as ts
x = np.array([1,2,3,4,3,4,2,3])
result = ts.adfuller(x, 1) # maxlag is now set to 1
Output:
>>> result
(-2.6825663173365015, 0.077103947319183241, 0, 7, {'5%': -3.4775828571428571, '1%': -4.9386902332361515, '10%': -2.8438679591836733}, 15.971188911270618)
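For what it's worth, here is a rough check of why the default fails on 8 points. It assumes that when maxlag is None, adfuller falls back to the rule of thumb ceil(12 * (nobs / 100) ** 0.25), which is how I remember the statsmodels source; I have not verified it against version 0.4.1, so treat the formula as an assumption. The function name default_maxlag is made up for this sketch.

```python
import math


def default_maxlag(nobs):
    """Assumed rule of thumb for the automatic lag: ceil(12 * (nobs/100)^(1/4))."""
    return int(math.ceil(12.0 * (nobs / 100.0) ** 0.25))


nobs = 8                    # length of x in the question
lag = default_maxlag(nobs)  # automatic lag under the assumed rule
rows_after_diff = nobs - 1  # the test works on the differenced series

print(lag, rows_after_diff, lag < rows_after_diff)
```

Under that assumption the automatic lag comes out as 7, which is not smaller than the 7 differenced observations, so "maxlag should be < nobs" is raised. A longer series, or an explicit small maxlag as above, avoids it.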