I am new to numpy and use a lab instrument that returns data as a numpy array, which I am in turn storing as part of a list with the structure
raw_data = [[int, numpy.ndarray, int, int]]
I want to grab the biggest number from the second column of that numpy array (many rows and 2 columns), so I wrote this code:
…code that collects data and writes it to the raw_data list…
max_ccounts = raw_data[1][1].max(axis=0)[1]
print(max_ccounts)
But I constantly get the following error:
File "/usr/local/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/local/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/kimfook-lee/Desktop/automated_TPI.py", line 75, in <module>
max_ccounts = raw_data[1][1].max(axis=0)[1]
KeyError: '__builtins__'
I know that raw_data[1][1] points to a numpy array because I printed it. I wasn’t sure if I was accessing the list elements properly so I tried to do it on my own in the terminal, using the exact same numpy array that was printed:
>>> raw_data1 = np.array([[7200, 876],[8000,840271]])
>>> raw_data = [[0,raw_data1,10,10]]
>>> max_ccounts = raw_data[0][1].max(axis=0)[1]
>>> print(max_ccounts)
840271
Whenever I search for KeyError, I only find things related to dictionaries, and can’t find any information related to builtins or numpy. What is causing this error and how can it be resolved?
EDIT:
Following the advice of ForceBru I went ahead and added some temp variables and printed them to find the error.
…code that collects data and writes it to the raw_data list…
tmp1 = raw_data[1]
print('tmp1:', tmp1)
tmp2 = tmp1[1]
print('tmp2', tmp2)
tmp3=tmp2.max(axis=0)
print('tmp3', tmp3)
tmp4 = tmp3[1]
print('tmp4', tmp4)
Now there is the same error with a different traceback.
KeyError: '__builtins__'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/local/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/kimfook-lee/Desktop/automated_TPI.py", line 75, in <module>
print('tmp1:', tmp1)
RuntimeError: Unable to configure default ndarray.__repr__
I can't quite figure out the question, but for me the following works:
import numpy as np
raw_data1 = np.array([[7200, 876],[8000,840271]])
raw_data = [[0,raw_data1,10,10]]
max_ccounts = raw_data[0][1].max(axis=0)[1]
print(max_ccounts)
Output:
840271
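If only the largest value in the second column is needed, slicing that column first is a little more direct than taking the per-column maxima and then indexing the result. A minimal sketch using the same sample array as above (the name `counts` is only illustrative, and this does not explain the KeyError itself):

```python
import numpy as np

# Same sample values as in the terminal session above.
counts = np.array([[7200, 876], [8000, 840271]])
raw_data = [[0, counts, 10, 10]]

# raw_data[0] is the first record, [1] is the array inside it,
# and [:, 1] selects the second column of that array.
max_ccounts = raw_data[0][1][:, 1].max()
print(max_ccounts)
```

Both forms give the same value here; the column slice just avoids computing the maximum of the first column as well.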
I'm learning how to use Numba to speed up functions with jit and vectorize. I didn't have any issues with the jit version of this code, but I am getting an index error with vectorize. I suspect this question's answer is on the right track in pointing to a type error, but I'm not sure which direction to take in changing the indexing. Included below is the function I've been playing around with, which outputs the Fibonacci numbers up to a chosen index of the sequence. What is going wrong with the indexing, and how can I correct my code to account for it?
from numba import vectorize
import numpy as np
from timeit import timeit
@vectorize
def fib(n):
    '''
    Adjusted from:
    https://lectures.quantecon.org/py/numba.html
    https://en.wikipedia.org/wiki/Fibonacci_number
    https://www.geeksforgeeks.org/program-for-nth-fibonacci-number/
    '''
    if n == 1:
        return np.ones(1)
    elif n > 1:
        x = np.empty(n)
        x[0] = 1
        x[1] = 1
        for i in range(2,n):
            x[i] = x[i-1] + x[i-2]
        return x
    else:
        print('WARNING: Check validity of input.')
print(timeit('fib(10)', globals={'fib':fib}))
Which results in the following error output.
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/llvmlite/ir/instructions.py", line 619, in __init__
typ = typ.elements[i]
AttributeError: 'PointerType' object has no attribute 'elements'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/galen/Projects/myjekyllblog/test_code/quantecon_2.py", line 27, in <module>
print(timeit('fib(10)', globals={'fib':fib}))
File "/usr/lib/python3.6/timeit.py", line 233, in timeit
return Timer(stmt, setup, timer, globals).timeit(number)
File "/usr/lib/python3.6/timeit.py", line 178, in timeit
timing = self.inner(it, self.timer)
File "<timeit-src>", line 6, in inner
File "/usr/local/lib/python3.6/dist-packages/numba/npyufunc/dufunc.py", line 166, in _compile_for_args
return self._compile_for_argtys(tuple(argtys))
File "/usr/local/lib/python3.6/dist-packages/numba/npyufunc/dufunc.py", line 188, in _compile_for_argtys
cres, actual_sig)
File "/usr/local/lib/python3.6/dist-packages/numba/npyufunc/ufuncbuilder.py", line 157, in _build_element_wise_ufunc_wrapper
cres.objectmode, cres)
File "/usr/local/lib/python3.6/dist-packages/numba/npyufunc/wrappers.py", line 220, in build_ufunc_wrapper
env=envptr)
File "/usr/local/lib/python3.6/dist-packages/numba/npyufunc/wrappers.py", line 130, in build_fast_loop_body
env=env)
File "/usr/local/lib/python3.6/dist-packages/numba/npyufunc/wrappers.py", line 23, in _build_ufunc_loop_body
store(retval)
File "/usr/local/lib/python3.6/dist-packages/numba/npyufunc/wrappers.py", line 126, in store
out.store_aligned(retval, ind)
File "/usr/local/lib/python3.6/dist-packages/numba/npyufunc/wrappers.py", line 276, in store_aligned
self.context.pack_value(self.builder, self.fe_type, value, ptr)
File "/usr/local/lib/python3.6/dist-packages/numba/targets/base.py", line 482, in pack_value
dataval = self.data_model_manager[ty].as_data(builder, value)
File "/usr/local/lib/python3.6/dist-packages/numba/datamodel/models.py", line 558, in as_data
elems = self._as("as_data", builder, value)
File "/usr/local/lib/python3.6/dist-packages/numba/datamodel/models.py", line 530, in _as
self.get(builder, value, i)))
File "/usr/local/lib/python3.6/dist-packages/numba/datamodel/models.py", line 558, in as_data
elems = self._as("as_data", builder, value)
File "/usr/local/lib/python3.6/dist-packages/numba/datamodel/models.py", line 530, in _as
self.get(builder, value, i)))
File "/usr/local/lib/python3.6/dist-packages/numba/datamodel/models.py", line 624, in get
name="extracted." + self._fields[pos])
File "/usr/local/lib/python3.6/dist-packages/llvmlite/ir/builder.py", line 911, in extract_value
instr = instructions.ExtractValue(self.block, agg, idx, name=name)
File "/usr/local/lib/python3.6/dist-packages/llvmlite/ir/instructions.py", line 622, in __init__
% (list(indices), agg.type))
TypeError: Can't index at [0] in i8*
The error occurs because you are trying to vectorize a function that is essentially not vectorizable. I think you are confusing how @jit and @vectorize work: @jit is what you use to speed up a function, while @vectorize is used to create NumPy universal functions. From the official documentation:
Using vectorize(), you write your function as operating over input
scalars, rather than arrays. Numba will generate the surrounding loop
(or kernel) allowing efficient iteration over the actual inputs.
So it is essentially not possible to create a NumPy universal function that behaves like your Fibonacci function, which takes a scalar and returns a whole array. The official NumPy documentation on universal functions has more detail if you are interested.
So in order to use @vectorize, you need a function that can act as a NumPy universal function. For your purpose of speeding up your code, you simply need to use @jit.
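To illustrate the "scalar in, scalar out" shape that a universal function needs, here is a minimal sketch. It uses plain `np.vectorize` as a stand-in (it is not compiled and is not Numba's decorator), and `fib_n` is a hypothetical helper that returns only the n-th Fibonacci number rather than the whole sequence:

```python
import numpy as np

def fib_n(n):
    """Return the n-th Fibonacci number (1, 1, 2, 3, 5, ...) as a scalar."""
    a, b = 1, 1
    for _ in range(int(n) - 1):
        a, b = b, a + b
    return a

# np.vectorize supplies the element-wise loop around the scalar function.
fib_elementwise = np.vectorize(fib_n, otypes=[np.int64])

result = fib_elementwise(np.arange(1, 11))
print(result)
```

Each call maps one input scalar to one output scalar, which is the contract the quoted documentation describes. The original fib returns an array whose length depends on n, so it does not fit that contract and is better suited to @jit.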
I am confused. I have 21 files that have been generated by the same process and I am filtering them all using savitzky-golay filter with the same parameters.
It works normally for some files, but at some point, I receive the ValueError: array must not contain infs or NaNs. The problem is, I checked the file and there aren't any infs or NaNs!
print "nan", df.isnull().sum()
print "inf", np.isinf(df).sum()
gives
nan x 0
T 0
std_T 0
sterr_T 0
dtype: int64
inf x 0
T 0
std_T 0
sterr_T 0
dtype: int64
So might the problem be in the implementation of the filter? Could this result from, for example, the choice of window length or polyorder relative to the length or step of the data?
Complete traceback:
Traceback (most recent call last):
File "<ipython-input-7-40b33049ef41>", line 1, in <module>
runfile('D:/data/scripts/DailyProfiles_Gradients.py', wdir='D:/data/DFDP2/DFDP2B/DTS/DTS_scripts')
File "C:\Users\me\AppData\Local\Continuum\Anaconda2\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 714, in runfile
execfile(filename, namespace)
File "C:\Users\me\AppData\Local\Continuum\Anaconda2\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 74, in execfile
exec(compile(scripttext, filename, 'exec'), glob, loc)
File "D:/data/scripts/DailyProfiles_Gradients.py", line 142, in <module>
grad = gradient(y, x, scale,PO)
File "D:/data/scripts/DailyProfiles_Gradients.py", line 76, in Tgradient
smoothed=savgol_filter(list(x), scale, PO, deriv=1, delta=dy[0])
File "C:\Users\me\AppData\Local\Continuum\Anaconda2\lib\site-packages\scipy\signal\_savitzky_golay.py", line 337, in savgol_filter
coeffs = savgol_coeffs(window_length, polyorder, deriv=deriv, delta=delta)
File "C:\Users\me\AppData\Local\Continuum\Anaconda2\lib\site-packages\scipy\signal\_savitzky_golay.py", line 140, in savgol_coeffs
coeffs, _, _, _ = lstsq(A, y)
File "C:\Users\me\AppData\Local\Continuum\Anaconda2\lib\site-packages\scipy\linalg\basic.py", line 822, in lstsq
b1 = _asarray_validated(b, check_finite=check_finite)
File "C:\Users\me\AppData\Local\Continuum\Anaconda2\lib\site-packages\scipy\_lib\_util.py", line 187, in _asarray_validated
a = toarray(a)
File "C:\Users\me\AppData\Local\Continuum\Anaconda2\lib\site-packages\numpy\lib\function_base.py", line 1033, in asarray_chkfinite
"array must not contain infs or NaNs")
ValueError: array must not contain infs or NaNs
This problem is rather specific to the data and method and I have not been able to produce a minimum reproducible working example. I am not asking for fixing my code, I am just asking for some brainstorming: What aspects have I not checked yet that might be causing the error? What should the function parameters look like, other than that the window length must be an odd number greater than the polyorder?
I am grateful for the discussion that has arisen; it helped, eventually.
I can reproduce the error ValueError: array must not contain infs or NaNs if delta is extremely small (e.g. delta=1e-310). Check your code and data to ensure that the values that you pass for delta are reasonable.
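One way to check this before calling the filter is sketched below. It assumes, based on the traceback (the failure is in the right-hand side passed to lstsq), that the coefficients are scaled by roughly factorial(deriv) / delta**deriv, so a tiny delta overflows to inf. The helper name `delta_is_safe` is hypothetical and not part of SciPy:

```python
import math
import numpy as np

def delta_is_safe(delta, deriv=1):
    """Return True if factorial(deriv) / delta**deriv is a finite float."""
    # Silence NumPy's floating-point warnings; we inspect the result instead.
    with np.errstate(over="ignore", divide="ignore", under="ignore"):
        scale = np.float64(math.factorial(deriv)) / np.float64(delta) ** deriv
    return bool(np.isfinite(scale))

print(delta_is_safe(0.5))     # an ordinary sample spacing
print(delta_is_safe(1e-310))  # a denormal spacing; 1/delta overflows
print(delta_is_safe(0.0))     # a zero spacing; division by zero
```

In the script from the question, the value to inspect would be dy[0], since that is what is passed as delta.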
Hello I am using the following code to try to use scipy's interpolation.
I am getting the following error:
Traceback (most recent call last):
File "C:\Program Files (x86)\JetBrains\PyCharm Community Edition 3.4.1\helpers\pydev\pydevd.py", line 1733, in <module>
debugger.run(setup['file'], None, None)
File "C:\Program Files (x86)\JetBrains\PyCharm Community Edition 3.4.1\helpers\pydev\pydevd.py", line 1226, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File "C:/Users/%user%/PycharmProjects/price_options_multithread/curve_interpolation.py", line 81, in <module>
tck = interpolate.PiecewisePolynomial(moneyness_,vols_, orders = 2)
File "C:\Users\%user%\Miniconda\lib\site-packages\scipy\interpolate\polyint.py", line 731, in __init__
self._set_yi(yi0)
File "C:\Users\%user%\Miniconda\lib\site-packages\scipy\interpolate\polyint.py", line 131, in _set_yi
self._y_axis = (axis % yi.ndim)
ZeroDivisionError: integer division or modulo by zero
With the code to call the interpolation reading as:
tck = interpolate.PiecewisePolynomial(X,Y, orders = 2)
Where Y is:
[0.2015, 0.3469, 0.2985, 0.2113, 0.19989999999999997, 0.4262, 0.21355000000000002, 0.22260000000000002, 0.21194999999999997, 0.1846, 0.2627, 0.2058, 0.2276, 0.21715, 0.23099999999999998, 0.20235, 0.2165, 0.21165, 0.3836, 0.19594999999999999, 0.20450000000000002, 0.20375, 0.20145000000000002, 0.23525000000000001, 0.2242, 0.22645, 0.27455, 0.22425, 0.2232, 0.1977, 0.19635000000000002, 0.21995, 0.30325, 0.22565]
And X is:
[0.91157701638785, 0.586013796249332, 0.820419314749065, 0.937622073998931, 1.17202759249866, 0.520901152221628, 0.872509429971228, 0.989712189221094, 1.0548248332488, 1.02877977563772, 0.807396785943524, 0.92459954519339, 1.30225288055407, 1.01575724683218, 0.859486901165687, 0.976689660415553, 1.00273471802663, 1.23714023652637, 0.455788508193925, 1.10691494847096, 0.846464372360146, 0.963667131610013, 1.09389241966542, 0.781351728332443, 0.898554487582309, 1.08086989085988, 0.716239084304739, 0.833441843554605, 0.950644602804472, 1.04180230444326, 1.1199374772765, 1.06784736205434, 0.651126440277035, 0.885531958776768]
type(X) and type(Y) give: <type 'list'>
If I do a polynomial interpolation in excel it looks like the following:
Any pointers on what is generating this error would be super helpful.
Thank you!
A fit equivalent to the Excel one is best done in Python with numpy polyfit.
This does not address the source of the error, but I will leave it here in the hope that it helps someone else in the future.
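A minimal sketch of such a fit follows. The sample points are made up for illustration (they lie exactly on a known quadratic so the result is easy to check) and are not the X and Y lists from the question; degree 2 is an assumption matching the orders = 2 argument used above:

```python
import numpy as np

# Hypothetical sample data lying on y = 1.0*x**2 - 2.0*x + 1.2.
x = np.array([0.5, 0.7, 0.9, 1.1, 1.3])
y = 1.0 * x**2 - 2.0 * x + 1.2

# Least-squares fit of a degree-2 polynomial; coefficients come back
# ordered from the highest power to the constant term.
coeffs = np.polyfit(x, y, 2)

# Evaluate the fitted polynomial at a new point.
fitted_at_one = np.polyval(coeffs, 1.0)
print(coeffs, fitted_at_one)
```

With real data such as the moneyness and volatility lists above, the same two calls apply; np.polyfit accepts plain Python lists as well as arrays.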
I have the following code:
import pymc as pm
from matplotlib import pyplot as plt
from pymc.Matplot import plot as mcplot
import numpy as np
from matplotlib import rc
res = [18.752, 12.450, 11.832]
v = pm.Uniform('v', 0, 20)
errors = pm.Uniform('errors', 0, 100, size = 3)
taus = 1/(errors ** 2)
mydist = pm.Normal('mydist', mu = v, tau = taus, value = res, observed = True)
model=pm.Model([mydist, errors, taus, v, res])
mcmc=pm.MCMC(model) # This is line 19 where the TypeError originates
mcmc.sample(20000,10000)
mcplot(mcmc.trace('mydist'))
For some reason it doesn't work. I get the error 'TypeError: hasattr(): attribute name must be string', with the following trace:
Traceback (most recent call last):
File "<ipython-input-49-759ebaf4321c>", line 1, in <module>
runfile('C:/Users/Paul/.spyder2-py3/temp.py', wdir='C:/Users/Paul/.spyder2-py3')
File "C:\Users\Paul\Miniconda3\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 685, in runfile
execfile(filename, namespace)
File "C:\Users\Paul\Miniconda3\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 85, in execfile
exec(compile(open(filename, 'rb').read(), filename, 'exec'), namespace)
File "C:/Users/Paul/.spyder2-py3/temp.py", line 19, in <module>
mcmc=pm.MCMC(model)
File "C:\Users\Paul\Miniconda3\lib\site-packages\pymc\MCMC.py", line 82, in __init__
**kwds)
File "C:\Users\Paul\Miniconda3\lib\site-packages\pymc\Model.py", line 197, in __init__
Model.__init__(self, input, name, verbose)
File "C:\Users\Paul\Miniconda3\lib\site-packages\pymc\Model.py", line 99, in __init__
ObjectContainer.__init__(self, input)
File "C:\Users\Paul\Miniconda3\lib\site-packages\pymc\Container.py", line 606, in __init__
conservative_update(self, input_to_file)
File "C:\Users\Paul\Miniconda3\lib\site-packages\pymc\Container.py", line 549, in conservative_update
if not hasattr(obj, k):
TypeError: hasattr(): attribute name must be string
How do I make it work and output "mydist"?
Edit: I posted a wrong trace at first by accident.
Edit2: It all must be because res doesn't have a name, because it's an array, but I don't know how to assign a name to it, so it'll make this work.
I must admit that I'm not familiar with pymc, but changing it to the following at least made the application run:
mydist = pm.Normal('mydist', mu = v, tau = taus, value = res, observed = False)
mcmc=pm.MCMC([mydist, errors, taus, v, res])
This seems to be because you were wrapping everything in a Model, which is an extension of ObjectContainer. Since you passed it a list, file_items in Container.py tried to assign index 4 of the list using replace; but because Model is an ObjectContainer, it instead assigned the key 4 in its __dict__, causing the weird TypeError you got. Removing the wrapping Model let MCMC correctly use a ListContainer instead.
Now, there's probably a bug in Model.py on line 543, where observable stochastics aren't stored in the database. The expression is for object in self.stochastics | self.deterministics: but I suspect it should include self.observable_stochastics too. That is why I needed to change observed to False; otherwise the last line would throw a KeyError.
I'm not familiar enough with pymc to determine whether it's actually a bug or desired behaviour, so I leave it up to you to submit an issue about it.
You simply need to define res as a numpy array:
res = np.array([18.752, 12.450, 11.832])
Then you'll get an error at mcmc.trace('mydist'), because mydist is observed data and is therefore not sampled. You probably want to plot other variables...
I'm using numba to speed up my code, which works fine without numba. But after adding @jit, it crashes with this error:
Traceback (most recent call last):
File "C:\work_asaaki\code\gbc_classifier_train_7.py", line 54, in <module>
gentlebooster.train(X_train, y_train, boosting_rounds)
File "C:\work_asaaki\code\gentleboost_c_class_jit_v7_nolimit.py", line 298, in train
self.g_per_round, self.g = train_function(X, y, H)
File "C:\Anaconda\lib\site-packages\numba\dispatcher.py", line 152, in _compile_for_args
return self.jit(sig)
File "C:\Anaconda\lib\site-packages\numba\dispatcher.py", line 143, in jit
return self.compile(sig, **kws)
File "C:\Anaconda\lib\site-packages\numba\dispatcher.py", line 250, in compile
locals=self.locals)
File "C:\Anaconda\lib\site-packages\numba\compiler.py", line 183, in compile_bytecode
flags.no_compile)
File "C:\Anaconda\lib\site-packages\numba\compiler.py", line 323, in native_lowering_stage
lower.lower()
File "C:\Anaconda\lib\site-packages\numba\lowering.py", line 219, in lower
self.lower_block(block)
File "C:\Anaconda\lib\site-packages\numba\lowering.py", line 254, in lower_block
raise LoweringError(msg, inst.loc)
numba.lowering.LoweringError: Internal error:
NotImplementedError: ('cast', <llvm.core.Instruction object at 0x000000001801D320>, slice3_type, int64)
File "gentleboost_c_class_jit_v7_nolimit.py", line 103
Line 103 is below, in a loop:
weights = np.empty([n,m])
for curr_n in range(n):
    weights[curr_n,:] = 1.0/(n) # this is line 103
where n is a constant already defined somewhere above in my code.
How can I remove the error? What "lowering" is going on? I'm using Anaconda 2.0.1 with Numba 0.13.x and Numpy 1.8.x on a 64-bit machine.
Based on this: https://gist.github.com/cc7768/bc5b8b7b9052708f0c0a,
I figured out what to do to avoid the issue. Instead of using the colon : to refer to any row/column, I just opened up the loop into two loops to explicitly refer to the indices in each dimension of the array:
weights = np.empty([n,m])
for curr_n in range(n):
    for curr_m in range(m):
        weights[curr_n,curr_m] = 1.0/(n)
There were other places later in my code where I used the colon, but they didn't cause errors. I'm not sure why.
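If this initialisation can be done in plain NumPy, outside the jitted function, the whole array can be filled in one call with no loop or slice assignment at all. A minimal sketch, with hypothetical values for n and m; whether np.full is usable inside a jitted function on a Numba version as old as 0.13 is not something this sketch assumes:

```python
import numpy as np

n, m = 4, 3  # hypothetical sizes; use the real n and m from the script

# Every element gets the same value 1.0/n, as in the two nested loops.
weights = np.full((n, m), 1.0 / n)

# Reference version using the explicit double loop from above.
expected = np.empty([n, m])
for curr_n in range(n):
    for curr_m in range(m):
        expected[curr_n, curr_m] = 1.0 / n

print(np.array_equal(weights, expected))
```

The resulting array can then be passed into the compiled function as an argument.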