df = pd.read_csv('dowjones.csv', index_col=0);
df['rm'] = 100 * (np.log(df.DJIA) - np.log(df.DJIA.shift(1)))
df.head()
I initially defined df here, in the code above
df = df.dropna()
formula = 'MSFTtrans ~ rm'
results2 = smf.ols(formula, df).fit(cov_type = 'HAC', cov_kwds={'maxlags':10,'use_correction':True})
print(results2.summary())
Then I ran the code above
NameError Traceback (most recent call last)
<ipython-input-3-b46efd5c722d> in <module>
2
3
----> 4 df = df.dropna()
5 formula = 'MSFTtrans ~ rm'
6 results2 = smf.ols(formula, df).fit(cov_type = 'HAC', cov_kwds={'maxlags':10,'use_correction':True})
NameError: name 'df' is not defined
This is the error I got saying df is not defined.
The semicolon at the end of df = pd.read_csv() is unnecessary, although on its own it does not cause this error.
Run the first block of code, and then run the second. The traceback says df is not defined, which most likely means the first block was never run in this session (or the kernel was restarted afterwards), so when the second block runs there is no df for it to use.
How can I read data from a CSV with chunksize and names?
I tried this:
sms = pd.read_table('demodata.csv', header=None, names=['label', 'good'])
X = sms.label.tolist()
y = sms.good.tolist()
and it worked totally fine. But if I try this, I get an error:
sms = pd.read_table('demodata.csv', chunksize=100, header=None, names=['label', 'good'])
X = sms.label.tolist()
y = sms.good.tolist()
And I get this error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-18-e3f35149ab7f> in <module>()
----> 1 X = sms.label.tolist()
2 y = sms.good.tolist()
AttributeError: 'TextFileReader' object has no attribute 'label'
Why does it work in the first case but not in the second?
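With chunksize set, pandas does not return a DataFrame but a TextFileReader, an iterator that yields one DataFrame per chunk, which is why sms.label fails. A minimal sketch of one way to handle it, assuming the file has two tab-separated columns and no header row; the in-memory sample below is invented and only stands in for demodata.csv, and read_csv with sep="\t" is used in place of read_table:

```python
import io
import pandas as pd

# Hypothetical stand-in for demodata.csv: tab-separated, no header row.
sample = io.StringIO("spam\t0\nham\t1\nham\t1\nspam\t0\nham\t1\n")

# With chunksize, read_csv returns an iterator of DataFrames, not a DataFrame.
reader = pd.read_csv(sample, sep="\t", chunksize=2, header=None,
                     names=["label", "good"])

X, y = [], []
for chunk in reader:
    # Each chunk is an ordinary DataFrame with at most `chunksize` rows.
    X.extend(chunk["label"].tolist())
    y.extend(chunk["good"].tolist())

print(X)
print(y)
```

If the whole file fits in memory, dropping chunksize (as in the first snippet) is simpler; chunking mainly helps when it does not.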
Hi, I am having problems with this code:
import numpy as np
# Summarize the data about minutes spent in the classroom
#total_minutes = total_minutes_by_account.values()
total_minutes = list(total_minutes_by_account.values())
type(total_minutes)
# Printing out the samething converting to a list
print('Printing out the samething converting to a list ')
print(type(total_minutes))
print ('Mean:', np.mean(total_minutes))
print ('Standard deviation:', np.std(total_minutes))
print ('Minimum:', np.min(total_minutes))
print ('Maximum:', np.max(total_minutes))
The error I get is:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-93-945375bf6098> in <module>()
3 # Summarize the data about minutes spent in the classroom
4 #total_minutes = total_minutes_by_account.values()
----> 5 total_minutes = list(total_minutes_by_account.values())
6 type(total_minutes)
7 #print(total_minutes)
AttributeError: 'list' object has no attribute 'values'
I would really like to know how I can make this work. I can do it with pandas by converting it to a numpy array and then getting the statistics I want with numpy.
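The traceback says total_minutes_by_account is already a list, and .values() is a dict method, so there is nothing to call. A minimal sketch, assuming the list holds plain numbers (the sample values are invented for illustration), that hands the list to NumPy directly and still covers the dict case:

```python
import numpy as np

# Hypothetical sample data; in the question this comes from earlier code.
total_minutes_by_account = [0.0, 30.0, 60.0, 90.0]

# Only dicts have .values(); a list can be handed to NumPy as-is.
if isinstance(total_minutes_by_account, dict):
    total_minutes = list(total_minutes_by_account.values())
else:
    total_minutes = list(total_minutes_by_account)

print('Mean:', np.mean(total_minutes))
print('Standard deviation:', np.std(total_minutes))
print('Minimum:', np.min(total_minutes))
print('Maximum:', np.max(total_minutes))
```

If the list turns out to hold something other than numbers, the values would need to be extracted first; that part depends on how the list was built.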
I have the following code:
from pyspark.sql import Row
z1=["001",1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,30,41,42,43]
print z1
r1 = Row.fromSeq(z1)
print (r1)
Then I got error:
AttributeError Traceback (most recent call last)
<ipython-input-6-fa5cf7d26ed0> in <module>()
2 z1=["001",1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,30,41,42,43]
3 print z1
----> 4 r1 = Row.fromSeq(z1)
5
6 print (r1)
AttributeError: type object 'Row' has no attribute 'fromSeq'
Anyone know what I might have missed? Thanks!
If you don't provide names, just use a tuple:
tuple(z1)
This is all that is needed to build a correct DataFrame.
I ran this statement dr=df.dropna(how='all') to remove missing values and got the error message shown below:
AttributeError Traceback (most recent call last)
<ipython-input-29-07367ab952bc> in <module>
----> 1 dr=df.dropna(how='all')
AttributeError: 'list' object has no attribute 'dropna'
According to the tabula-py documentation PDF (https://readthedocs.org/projects/tabula-py/downloads/pdf/latest/):
df = tabula.read_pdf(file, lattice=True, pages='all', area=(1, 1, 1000, 100), relative_area=True)
With pages='all', read_pdf probably returns a list of DataFrames rather than a single DataFrame.
So you have to loop over the list and call dropna on each one:
for sub_df in df:
    dr = sub_df.dropna(how='all')
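As a sketch of the loop above that runs without a PDF, the list below stands in for what tabula.read_pdf is assumed to return with pages='all'; the two small DataFrames are invented. Collecting the cleaned pieces and concatenating them is one option if a single DataFrame is wanted at the end:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the list returned by tabula.read_pdf(..., pages='all').
dfs = [
    pd.DataFrame({"a": [1.0, np.nan], "b": [2.0, np.nan]}),
    pd.DataFrame({"a": [np.nan, 3.0], "b": [np.nan, np.nan]}),
]

# dropna is a DataFrame method, so apply it to each element, not to the list.
cleaned = [sub_df.dropna(how="all") for sub_df in dfs]

# Optionally combine the per-page results into one DataFrame.
dr = pd.concat(cleaned, ignore_index=True)
print(dr)
```

Note that how='all' only drops rows in which every column is missing, so a row with a single missing value is kept.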