Too many indices error coming up in Python - python

This is my first bash at using python, I am making an array of fathers and children, I have a dataset in a .csv file which I need to go into python so I can later take it over to java script. However the same error message keeps coming which I have put below. Further down is my script. Would be most grateful for any advice!
Gabriella
>>> runfile('/Users/gkountourides/Desktop/FunctionalExperiment/untitled0.py', wdir='/Users/gkountourides/Desktop/FunctionalExperiment')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "//anaconda/lib/python3.5/site-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 714, in runfile
execfile(filename, namespace)
File "//anaconda/lib/python3.5/site-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 89, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "/Users/gkountourides/Desktop/FunctionalExperiment/untitled0.py", line 41, in <module>
father,child = fc_csv_to_javascript_arrays("/Users/gkountourides/Desktop/FunctionalExperiment/fatherChild.csv")
File "/Users/gkountourides/Desktop/FunctionalExperiment/untitled0.py", line 38, in fc_csv_to_javascript_arrays
father_str, child_str = from_fc_array_to_fc_string(full_array)
File "/Users/gkountourides/Desktop/FunctionalExperiment/untitled0.py", line 30, in from_fc_array_to_fc_string
father_array = input_2d_array[:,0]
IndexError: too many indices for array
>>>
And then my actual script:
import glob
import numpy as np
def list_all_jpgs():
javascript_string = "["
for filename in glob.glob("*.jpg"):
javascript_string += '"' + filename + '",'
javascript_string = javascript_string[0:-1] + "]"
return javascript_string
def load_into_np_array(input_csv_file):
loaded_array = np.genfromtxt(input_csv_file, dtype=str, delimiter=",")
return loaded_array
def from_single_array_to_string(input_1d_array):
output_string = '['
for entry in input_1d_array:
output_string += '"'+str(entry)+'",'
output_string = output_string[0:-1]+']'
return output_string
def from_fc_array_to_fc_string(input_2d_array):
father_array = input_2d_array[:,0]
child_array = input_2d_array[:,1]
father_string = from_single_array_to_string(father_array)
child_string = from_single_array_to_string(child_array)
return father_string, child_string
def fc_csv_to_javascript_arrays(csv_input):
full_array = load_into_np_array(csv_input)
father_str, child_str = from_fc_array_to_fc_string(full_array)
return father_str, child_str
father,child = fc_csv_to_javascript_arrays("/Users/gkountourides/Desktop/FunctionalExperiment/fatherChild.csv")
print(father)
print(child)

The too many indices error indicates that input_2d_array is not a two 2d array. genfromtxt() is not returning what you are expecting.
numpy.genfromtxt produces array of what looks like tuples, not a 2D array—why?
http://docs.scipy.org/doc/numpy/reference/generated/numpy.genfromtxt.html

Related

Python vestigial parameter and uncallable function

import numpy as np
from scipy.optimize import fsolve
from scipy.integrate import quad
import matplotlib.pyplot as plt
Rgas = 8.31446261815324 #Pa*m**3/mol*K
def Peng_Robinson_EOS(P,V,T,Tc,Pc,ω):
a = (1+(0.37464+1.54226*ω-0.26992*ω**2)*(1-(T/Tc)**(1/2)))**2*Rgas**2*Tc**2/Pc #Pa*m**3
b = 0.07780 * Rgas*Tc/Pc
return P + a/((V+(1-np.sqrt(2))*b)*(V+(1+np.sqrt(2)))) - Rgas*T/(V-b)
def PR_Psat(T,Tc,Pc,ω,V,Pguess = 100000):
def integral_diff (Pguess,T,Tc,Pc,ω,V):
def Psat_integrand (V,Pguess,T,Tc,Pc,ω):
integrand1 = fsolve(Peng_Robinson_EOS(Pguess,V,T,Tc,Pc,ω),Pguess)
integrand2 = Pguess
integrand = integrand1-integrand2
return integrand
Vl = fsolve(Psat_integrand(V,Pguess,T,Tc,Pc,ω),0)
Vv_guess = Rgas*T/Pguess
Vv = fsolve(Psat_integrand(V,Pguess,T,Tc,Pc,ω),Vv_guess)
Vinf_guess = (Vl + Vv)/2
Vinf = fsolve(Psat_integrand(V,Pguess,T,Tc,Pc,ω),Vinf_guess)
left = quad(Psat_integrand(V,Pguess,T,Tc,Pc,ω),Vl,Vinf)[0]
right = quad(Psat_integrand(V,Pguess,T,Tc,Pc,ω),Vinf,Vv)[0]
diff = left + right
return diff
Psat = fsolve(integral_diff(Pguess,T,Tc,Pc,ω,V),Pguess)
return Psat
There are two issues with this code.
1: in theory, PR_Psat should not depend on V, since all values of V used in calculation are found via fsolve. However, because Peng_Robinson_EOS depends on V, it Python won't let it be ignored in enclosing functions. Is there a way to eliminate the need to "specify" V?
from an earlier version (before V was a parameter of all functions), to demonstrate:
runfile('...', wdir='...')
Traceback (most recent call last):
File "<ipython-input-4-0875bc6411e8>", line 1, in <module>
runfile('...', wdir='...')
File "...\lib\site-packages\spyder\utils\site\sitecustomize.py", line 705, in runfile
execfile(filename, namespace)
File "...\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "...", line 40, in <module>
print(PR_Psat(300,647.1,22055000,0.345))
File "...", line 37, in PR_Psat
Psat = fsolve(integral_diff(Pguess,T,Tc,Pc,ω),Pguess)
File "...", line 25, in integral_diff
Vl = fsolve(Psat_integrand(V,Pguess,T,Tc,Pc,ω),0)
NameError: name 'V' is not defined
2: It seems that Peng_Robinson is not being treated as a callable function, but rather as a float. I'm not sure what is causing this.
runfile('...', wdir='...')
Traceback (most recent call last):
File "<ipython-input-13-0875bc6411e8>", line 1, in <module>
runfile('...', wdir='...')
File "C:\Users\Spencer\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 705, in runfile
execfile(filename, namespace)
File "...\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "...", line 42, in <module>
print(PR_Psat(300,647.1,22055000,0.345,1))
File "...", line 39, in PR_Psat
Psat = fsolve(integral_diff(Pguess,T,Tc,Pc,ω,V),Pguess)
File "...", line 27, in integral_diff
Vl = fsolve(Psat_integrand(V,Pguess,T,Tc,Pc,ω),0)
File "...", line 23, in Psat_integrand
integrand1 = fsolve(Peng_Robinson_EOS(Pguess,V,T,Tc,Pc,ω),Pguess)
File "...\lib\site-packages\scipy\optimize\minpack.py", line 148, in fsolve
res = _root_hybr(func, x0, args, jac=fprime, **options)
File "...\lib\site-packages\scipy\optimize\minpack.py", line 214, in _root_hybr
shape, dtype = _check_func('fsolve', 'func', func, x0, args, n, (n,))
File "...\lib\site-packages\scipy\optimize\minpack.py", line 27, in _check_func
res = atleast_1d(thefunc(*((x0[:numinputs],) + args)))
TypeError: 'numpy.float64' object is not callable
In theory, Peng_Robinson_EOS should, if plotted as P(V) with constant T, produce a cubic regression. The goal of PR_Psat is to find the value of P for which the integrals between P and the cubic regression cancel. (Hence, integral_diff being plugged into fsolve)
To summarize the questions,
1) Is there a way to eliminate the need for V in PR_Psat?
2) Why is Peng_Robinson_EOS being flagged as an un-callable numpy.float64 object?
The problem was a syntax error. Arguments for fsolve and quad were placed wrong. To fix both problems, I moved the args to the back. For example,
incorrect:
Vv = fsolve(Psat_integrand(V,Pguess,T,Tc,Pc,ω),Vv_guess)
correct:
Vv = fsolve(Psat_integrand,Vv_guess,args=(Pguess,T,Tc,Pc,ω))
The reason Peng_Robinson_EOS is being called a numpy.float64 is because with the syntax of the code in the question the program is evaluated before being passed to the solver.
Also, with proper syntax, V is no longer an issue.
def PR_Psat(T,Tc,Pc,ω,Pguess = 1000):
def integral_diff (Pguess,T,Tc,Pc,ω):
def Psat_integrand (V,Pguess,T,Tc,Pc,ω):
integrand1 = fsolve(Peng_Robinson_EOS,Pguess,args=(V,T,Tc,Pc,ω))
integrand2 = Pguess
integrand = integrand1-integrand2
return integrand
Vl = fsolve(Psat_integrand,0,args=(Pguess,T,Tc,Pc,ω))
Vv_guess = Rgas*T/Pguess
Vv = fsolve(Psat_integrand,Vv_guess,args=(Pguess,T,Tc,Pc,ω))
Vinf_guess = (Vl + Vv)/2
Vinf = fsolve(Psat_integrand,Vinf_guess,args=(Pguess,T,Tc,Pc,ω))
left = quad(Psat_integrand,Vl,Vinf,args=(Pguess,T,Tc,Pc,ω))[0]
right = quad(Psat_integrand,Vinf,Vv,args=(Pguess,T,Tc,Pc,ω))[0]
diff = left + right
return diff
Psat = fsolve(integral_diff,Pguess,args=(T,Tc,Pc,ω))
return Psat

TypeError: unhashable type: 'slice'

I am trying to run a regression using the following dataframe dfMyRoll the head of the dataframe looks like:
SCORE SCORE_LAG
date
2007-10-29 -0.031551 NaN
2007-10-30 0.000100 -0.031551
2007-10-31 0.000100 0.000100
2007-11-01 0.000100 0.000100
2007-11-02 0.000100 0.000100
The code that I am using is :
import glob
import pandas as pd
import os.path
import scipy
from scipy.stats import linregress
def main():
dataPath = "C:/Users/Stacey/Documents/data/Roll"
roll = 4
1ID = "BBG.XNGS.AAPL.S"
2ID = "BBG.XNGS.AMAT.S"
print(1ID,1ID)
cointergration = getCointergration(dataPath,1ID,2ID,roll)
return
def getCointergration(dataPath,1ID,2ID,roll):
for myRoll in range((roll-4),roll,1):
path = dataPath+str(myRoll)+'/'
filename='PairData_'+1ID+'_'+2ID+'.csv'
for fname in glob.iglob(path+filename):
dfMyRoll = pd.read_csv(fname, header=0, usecols=[0,31],parse_dates=[0], dayfirst=True,index_col=[0], names=['date', 'SCORE'])
dfMyRoll['SCORE_LAG'] = dfMyRoll['SCORE'].shift(1)
print('cointergration',dfMyRoll.head())
X = dfMyRoll[1:,'SCORE']
Y = dfMyRoll[1:,'SCORE_LAG']
slope,intercept,_,_,stderr=linregress(dfMyRoll[1:,'SCORE'],dfMyRoll[1:,'SCORE_LAG'])
if __name__ == "__main__":
print ("CointergrationTest...19/05/17")
try:
main()
except KeyboardInterrupt:
print ("Ctrl+C pressed. Stopping...")
I get the error: TypeError: unhashable type: 'slice'. I have looked at previous posts on this subject and tried adding iloc to the X and Y time series in the following way:
X = dfMyRoll.iloc[1:,'SCORE']
Y = dfMyRoll.iloc[1:,'SCORE_LAG']
but unfortunately I can't seem to find a solution. Please see below for a stack trace:
Traceback (most recent call last):
File "<ipython-input-3-431422978139>", line 1, in <module>
runfile('C:/Users/Stacey/Documents/scripts/cointergrationTest.py', wdir='C:/Users/Stacey/Documents/scripts')
File "C:\Anaconda\lib\site-packages\spyder\utils\site\sitecustomize.py", line 866, in runfile
execfile(filename, namespace)
File "C:\Anaconda\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "C:/Users/Stacey/Documents/scripts/cointergrationTest.py", line 64, in <module>
main()
File "C:/Users/Stacey/Documents/scripts/cointergrationTest.py", line 23, in main
cointergration = getCointergration(dataPath,1ID,2ID,roll)
File "C:/Users/Stacey/Documents/scripts/cointergrationTest.py", line 42, in getCointergration
X = dfMyRoll[1:,'SCORE']
File "C:\Anaconda\lib\site-packages\pandas\core\frame.py", line 2059, in __getitem__
return self._getitem_column(key)
File "C:\Anaconda\lib\site-packages\pandas\core\frame.py", line 2066, in _getitem_column
return self._get_item_cache(key)
File "C:\Anaconda\lib\site-packages\pandas\core\generic.py", line 1384, in _get_item_cache
res = cache.get(item)
TypeError: unhashable type: 'slice'
You need to use loc rather than iloc:
X = dfMyRoll.loc[1:,'SCORE']
Y = dfMyRoll.loc[1:,'SCORE_LAG']
iloc is read as "integer location", and only accepts integer position. loc is somewhat more forgiving and allows both (you can also use ix).

Python string to int for classification

I am reading the book Machine Learning in Action.
One example in Chapter 2 converts string to int for classification use. For example, 'student' = 1, 'teacher' = 2, engineer = 3.
See the code below in Line 12. While an error comes up while I execute it:
invalid literal for int() with base 10: 'largeDose'
Where is my problem.
def file2matrix(filename):
fr = open(filename)
numberOfLines = len(fr.readlines()) #get the number of lines in the file
returnMat = zeros((numberOfLines,3)) #prepare matrix to return
classLabelVector = [] #prepare labels return
fr = open(filename)
index = 0
for line in fr.readlines():
line = line.strip()
listFromLine = line.split('\t')
returnMat[index,:] = listFromLine[0:3]
classLabelVector.append(int(listFromLine[-1]))
index += 1
return returnMat,classLabelVector
caller code:
from numpy import *
import kNN
datingDataMat,datingLabels = kNN.file2matrix('datingTestSet.txt')
import matplotlib
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(111)
#ax.scatter(datingDataMat[:,1], datingDataMat[:,2])
ax.scatter(datingDataMat[:,1], datingDataMat[:,2], array(datingLabels), array(datingLabels))
plt.show()
Traceback and error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Anaconda2\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 714, in runfile
execfile(filename, namespace)
File "C:\Anaconda2\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 74, in execfile
exec(compile(scripttext, filename, 'exec'), glob, loc)
File "C:/Users/Zhiming Zhang/Documents/Machine Learning/kNN/execute.py", line 10, in <module>
datingDataMat,datingLabels = kNN.file2matrix('datingTestSet.txt')
File "kNN.py", line 48, in file2matrix
classLabelVector.append(int(listFromLine[-1]))
ValueError: invalid literal for int() with base 10: 'largeDoses'
You try to convert a string like "largeDose" to an int using the conversion function int(). But that's not how this works. The function int() converts only strings which look like integer numbers (e. g. "123") to integers.
In your case you can use either an if-elif-else cascade or a dictionary.
Cascade:
if listFromLine[-1] == 'largeDose':
result = 1
elif listFromLine[-1] == 'teacher':
result = 2
elif …
…
else:
result = 42 # or raise an exception or whatever
Dictionary:
conversion = {
'largeDose': 1,
'teacher': 2,
… }
# ...
# later, in the loop:
classLabelVector.append(conversion[listFromLine[-1]])
# The above will raise a KeyError if an unexpected value is given.
# Ir in case you want to use a default value:
classLabelVector.append(conversion.get(listFromLine[-1], 42))

TypeError: Can't convert '_io.TextIOWrapper' object to str implicitly

I would like to save a pandas.DataFrame object into a .csv file (using DataFrame.to_csv()). To make it simple, here is the situation; my project 'Algos' contains the folder 'Plip' with a .txt file called 'plop.txt':
1,2
3,4
5,6
My script is here:
import numpy as np
import pandas as pa
# opening the file
dossier = "Plip"
fichier = "plop.txt"
fichier = open(dossier + "/" + fichier)
data = fichier.readlines()
# creating the data frame
Pair = []
Impair = []
for m in data:
impair = int(m[0:1])
pair = int(m[2:3])
Impair.append(impair)
Pair.append(pair)
M = np.array([Impair, Pair]).transpose()
Table = pa.DataFrame(M, columns = ["Impair", "Pair"])
#creating the .csv file
Table.to_csv(dossier + fichier + ".csv")
Table has been correcty created but the script returns:
runfile('C:/Users/******/Documents/PYTHON/Algos/truc.py', wdir='C:/Users/******/Documents/PYTHON/Algos')
Traceback (most recent call last):
File "<ipython-input-110-3c7306b8d61d>", line 1, in <module>
runfile('C:/Users/******/Documents/PYTHON/Algos/truc.py', wdir='C:/Users/******/Documents/PYTHON/Algos')
File "C:\Users\******\Documents\Gratuiciels\WINPYTHON.3355\python-3.3.5\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 601, in runfile
execfile(filename, namespace)
File "C:\Users\******\Documents\Gratuiciels\WINPYTHON.3355\python-3.3.5\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 80, in execfile
exec(compile(open(filename, 'rb').read(), filename, 'exec'), namespace)
File "C:/Users/******/Documents/PYTHON/Algos/truc.py", line 30, in <module>
Table.to_csv(dossier + fichier + ".csv")
TypeError: Can't convert '_io.TextIOWrapper' object to str implicitly
I'm quite new on Python so could you be as precise as possible on your answer please ?
Thank you in advance
Seems like you reassigned fichier to a file object on L7, use a new variable instead!

Python wired index out of bound

I was trying to create a volume weighted average price using python.
Here is my code:
vwap = []
for i in range(window, int(price_times_qty)):
vol_total = 0
vol_price = 0
for j in range(window):
vol_total = vol_total + lastqty[i-j]
vol_price = vol_price + lastqty[i-j] * lastprice[i-j]
vwap.append( vol_price / vol_total )
I believe my idea is right, but when I execute the code, I have the error:
Traceback (most recent call last):
File "<ipython-input-9-9724942aa7be>", line 1, in <module>
runfile('/home/intern2/Desktop/VWAP/vwap-v1.py', wdir='/home/intern2/Desktop/VWAP')
File "/usr/lib/python3/dist-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 586, in runfile
execfile(filename, namespace)
File "/usr/lib/python3/dist-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 48, in execfile
exec(compile(open(filename, 'rb').read(), filename, 'exec'), namespace)
File "/home/intern2/Desktop/VWAP/vwap-v1.py", line 25, in <module>
vol_total = vol_total + lastqty[i-j]
IndexError: index out of bounds
I just have no idea why the error is index out of bound. When I check my code, I think the index is well within the bound.
Could anyone please help me with it?

Categories