This works:
import textstat
text = (
"Playing games has always been thought to be important to ")
textstat.flesch_reading_ease(text)
But when I pass a DataFrame column, df['Contents']:
df['Read']= textstat.flesch_reading_ease(df['Contents'])
I am getting the error:
TypeError Traceback (most recent call last)
<ipython-input-50-b897dfd2f80f> in <module>
----> 1 df['Read']= textstat.flesch_reading_ease(df.Contents)
TypeError: unhashable type: 'Series'
I removed the null rows, but it still doesn't work. The result was the same:
TypeError: unhashable type: 'Series'
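A possible explanation, sketched under assumptions: flesch_reading_ease expects a single string, and the "unhashable" message suggests the function tries to hash its argument (for example for caching), which a Series does not support. The score function below is a hypothetical stand-in, not textstat itself; it only mimics a cached, one-string-at-a-time scorer so the pattern can run without the library. The point is to apply the function row by row instead of passing the whole column.

```python
from functools import lru_cache

import pandas as pd


@lru_cache(maxsize=128)
def score(text: str) -> float:
    """Hypothetical stand-in for textstat.flesch_reading_ease (one string in, one number out)."""
    return float(len(text.split()))


df = pd.DataFrame({"Contents": ["Playing games is fun.", "Short text."]})

# Passing the whole column fails: the cache cannot hash a Series.
error = None
try:
    score(df["Contents"])
except TypeError as exc:
    error = exc

# Applying the function element-wise scores each row separately.
df["Read"] = df["Contents"].apply(score)
print(df)
```

With the real library the equivalent call would be df['Contents'].apply(textstat.flesch_reading_ease), assuming every cell holds a string.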
After using the groupby function, I want to convert the result to a DataFrame object, but it raises an error.
My code:
dfgrp1 = df['Service 1'].groupby(['Service Type'])
dfgrp1 = dfgrp1.to_frame()
Output
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Input In [18], in <cell line: 2>()
1 dfgrp1 = df['Service 1'].groupby(['Service Type'])
----> 2 dfgrp1 = dfgrp1.to_frame()
File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\groupby\groupby.py:904, in GroupBy.__getattr__(self, attr)
901 if attr in self.obj:
902 return self[attr]
--> 904 raise AttributeError(
905 f"'{type(self).__name__}' object has no attribute '{attr}'"
906 )
AttributeError: 'DataFrameGroupBy' object has no attribute 'to_frame'
P.S. I have multiple sheets in the Excel workbook. I don't think that is a problem, but I am mentioning it in case it has an effect.
Apply an aggregation to the grouped result first.
For instance, what does dfgrp1 produce when you print it? Only an object reference, which cannot be turned into a frame.
Once you apply an aggregation to the groupby result, however, you get a Series (or a DataFrame), and a Series supports to_frame().
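A minimal sketch of that advice, using made-up data in place of the workbook. The column names come from the question; the values and the choice of sum() as the aggregation are assumptions for illustration only.

```python
import pandas as pd

# Made-up rows standing in for the Excel sheet.
df = pd.DataFrame(
    {
        "Service Type": ["A", "B", "A"],
        "Service 1": [10, 20, 30],
    }
)

# Group first, then aggregate: the aggregation returns a Series.
totals = df.groupby("Service Type")["Service 1"].sum()

# A Series has to_frame(); the GroupBy object does not.
dfgrp1 = totals.to_frame()
print(dfgrp1)
```

Any other aggregation (mean(), count(), and so on) works the same way. Calling reset_index() on the result instead turns the group keys into a regular column.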
Hi guys, I am getting this error:
AttributeError: 'numpy.float64' object has no attribute 'index'
The traceback looks like this:
AttributeError Traceback (most recent call last)
<ipython-input-50-dfcbcabe20ea> in <module>()
2 for name, df in all_data.items():
3 top_10 = df.mean().dropna().sort_values().iloc[-10]
----> 4 top_10_columns[name] = top_10.index
While running the following code:
top_10_columns = {}
for name, df in all_data.items():
top_10 = df.mean().dropna().sort_values().iloc[-10]
top_10_columns[name] = top_10.index
With .iloc[-10] you are not getting the "top 10" items, only the single 10th-from-last item, so top_10 is a scalar of type numpy.float64, which has no .index. Pass iloc a slice instead: .iloc[:10] for the first ten items or .iloc[-10:] for the last ten, depending on whether your sort is ascending or descending. Since sort_values() sorts ascending by default, .iloc[-10:] gives the ten largest means here.
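The slice fix can be sketched as follows. The contents of all_data are invented for illustration (one small frame with twelve numeric columns); only the loop mirrors the question.

```python
import pandas as pd

# Invented data: column "col{i}" has mean i + 0.5.
all_data = {
    "sample": pd.DataFrame({f"col{i}": [i, i + 1] for i in range(12)}),
}

top_10_columns = {}
for name, df in all_data.items():
    means = df.mean().dropna().sort_values()  # ascending by default
    top_10 = means.iloc[-10:]                 # slice keeps a Series (last ten rows)
    top_10_columns[name] = top_10.index       # a Series has .index

print(top_10_columns["sample"])
```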
Note that top_10_columns is not the problem: it is already declared as a dict (top_10_columns = {}) above the for loop, so assigning top_10_columns[name] works. The object Python reports as a float is top_10.
Initially, I was getting a "list object is not callable" error, but after adding "import list" a new error appeared, as shown below.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import list
data_cols=['user id','movie id','rating','timestamp']
item_cols=['movie id','movie title','release date','video release date','IMDb URL','unknown','Action','Adventure','Animation','Childrens','Comedy','Crime','Documentary','Drama','Fantasy','Film-Noir','Horror','Musical','Mystery','Romance ','Sci-Fi','Thriller','War' ,'Western']
user_cols = ['user id','age','gender','occupation','zip code']
#importing the data files onto dataframes
users=pd.read_csv('u.user',sep='|',names=user_cols,encoding='latin-1')
item=pd.read_csv('u.item',sep='|',names=item_cols,encoding='latin-1')
data=pd.read_csv('u.data',sep='\t',names=data_cols,encoding='latin-1')
dataset=pd.merge(pd.merge(item,data),users)
#print(dataset.head())
rating_total=dataset.groupby('movie title').size()
rating_mean=(dataset.groupby('movie title'))['movie title','rating']
rating_mean=rating_mean.mean()
rating_total=pd.DataFrame({'movie title':rating_total.index,'total ratings':rating_total.values})
rating_mean['movie title']=rating_mean.index
final=pd.merge(rating_mean,rating_total).sort_values(by='total ratings',ascending=False)
pop=final[:300].sort_values(by='rating',ascending=False)
pop=pop['movie title']
pop1=list(pop.head(10))
Output
TypeError Traceback (most recent call last)
<ipython-input-57-0b36af3a9876> in <module>
30 pop=pop['movie title']
31 #print(pop.head())
---> 32 pop1=list(pop.head(10))
TypeError: 'module' object is not callable
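A likely cause, offered as a sketch and not a confirmed diagnosis: an import statement for a module named list binds the name list to that module, which hides the built-in list type, and calling a module raises exactly this TypeError. The earlier "list object is not callable" error usually points to the same kind of shadowing, with a variable named list assigned somewhere in the session. The example below uses types.ModuleType as a stand-in for the imported module (an assumption, since the actual list module is not shown) and reaches the built-in through the builtins module.

```python
import builtins
import types

import pandas as pd

# Stand-in for what the name `list` refers to after importing a module called list.
shadow = types.ModuleType("list")

message = None
try:
    shadow([1, 2, 3])  # same failure as calling list(...) while it is shadowed
except TypeError as exc:
    message = str(exc)
print(message)

# The built-in is still reachable explicitly, so the conversion itself is fine.
pop = pd.Series(["Movie A", "Movie B", "Movie C"], name="movie title")
pop1 = builtins.list(pop.head(10))
print(pop1)
```

In a notebook, removing the import (and any variable named list), then running del list or restarting the kernel, should let plain list(pop.head(10)) work again.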
I have the following code:
from pyspark.sql import Row
z1=["001",1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,30,41,42,43]
print z1
r1 = Row.fromSeq(z1)
print (r1)
Then I got error:
AttributeError Traceback (most recent call last)
<ipython-input-6-fa5cf7d26ed0> in <module>()
2 z1=["001",1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,30,41,42,43]
3 print z1
----> 4 r1 = Row.fromSeq(z1)
5
6 print (r1)
AttributeError: type object 'Row' has no attribute 'fromSeq'
Anyone know what I might have missed? Thanks!
If you don't provide field names, just use a tuple:
tuple(z1)
That is all that is needed to build a correct DataFrame.
I ran this statement dr=df.dropna(how='all') to remove missing values and got the error message shown below:
AttributeError Traceback (most recent call last)
<ipython-input-29-07367ab952bc> in <module>
----> 1 dr=df.dropna(how='all')
AttributeError: 'list' object has no attribute 'dropna'
According to the tabula-py documentation (PDF): https://www.google.com/url?sa=t&source=web&rct=j&url=https://readthedocs.org/projects/tabula-py/downloads/pdf/latest/&ved=2ahUKEwiKr-mQ9qTnAhUKwqYKHcAtAcoQFjADegQIBRAB&usg=AOvVaw32D890VNjAq5wOkTo4icOi&cshid=1580168098808
df = tabula.read_pdf(file, lattice=True, pages='all', area=(1, 1, 1000, 100), relative_area=True)
With pages='all', read_pdf probably returns a list of DataFrames, and a list has no dropna method.
So you have to call dropna on each DataFrame in the list:
for sub_df in df:
    dr = sub_df.dropna(how='all')
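The loop above keeps only the last cleaned table in dr. If all tables are needed, one option is to collect them in a list, sketched here with hand-made DataFrames standing in for the list that read_pdf is assumed to return (tabula itself is not called, and the column names are invented).

```python
import numpy as np
import pandas as pd

# Stand-in for the list of tables assumed to come back from read_pdf(..., pages='all').
tables = [
    pd.DataFrame({"a": [1.0, np.nan], "b": [2.0, np.nan]}),
    pd.DataFrame({"a": [np.nan, 3.0], "b": [np.nan, np.nan]}),
]

# Drop rows that are entirely NaN from every table.
cleaned = [table.dropna(how="all") for table in tables]

# Optionally stack the cleaned tables into one DataFrame.
combined = pd.concat(cleaned, ignore_index=True)
print(combined)
```

Concatenating only makes sense when the tables share the same columns; otherwise keep working with the cleaned list.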