Getting attribute error: Series object has no attribute 'explode' [duplicate] - python

This question already has answers here:
How to unnest (explode) a column in a pandas DataFrame, into multiple rows
(16 answers)
Closed 2 years ago.
I am trying to run a Python script that uses explode(). It works fine on my local machine, but when I run it on the server it raises an error.
I am using the code below:
df_main1 = (df_main1.set_index(['rule_id', 'applied_sql_function1', 'input_condition', 'input_value', 'and_or_not_oprtor', 'output_condition', 'priority_order']).apply(lambda x: x.astype(str).str.split(',').explode()).reset_index())
The error I am getting:
("'Series' object has no attribute 'explode'", u'occurred at index comb_fld_order')

The problem is that the two machines have different versions of pandas. Series.explode is only available in later versions:
New in version 0.25.0.

Try:
df_main1 = df_main1.set_index(['rule_id', 'applied_sql_function1', 'input_condition', 'input_value', 'and_or_not_oprtor', 'output_condition', 'priority_order'])[col].str.split(',', expand=True).stack()
Here col is the name of the string column that you want to split and explode.
With expand=True, split spreads the pieces across columns (a horizontal explode), and stack then moves them all into a single column.
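As a rough illustration of the split/stack idea described above, here is a minimal sketch on a tiny Series. The sample values and the row labels r1 and r2 are made up for the example; only the column name comes from the question.

```python
import pandas as pd

# Hypothetical sample data; only the name mirrors the question.
s = pd.Series(["a,b", "c"], index=["r1", "r2"], name="comb_fld_order")

# expand=True spreads the pieces across columns, one column per piece.
wide = s.str.split(",", expand=True)

# stack() moves those columns into an inner index level.
# dropna() removes the padding left for shorter rows, in case stack() keeps it.
long = wide.stack().dropna()

# Drop the helper level so each piece keeps only its original row label.
result = long.reset_index(level=-1, drop=True)

print(result.tolist())        # ['a', 'b', 'c']
print(result.index.tolist())  # ['r1', 'r1', 'r2']
```

Each original row label is repeated once per piece, which is the same shape that Series.explode would give on the split lists.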

I used the code below to avoid explode():
df_main1 = (df_main1.set_index(['rule_id', 'applied_sql_function1', 'input_condition', 'input_value', 'and_or_not_oprtor', 'output_condition', 'priority_order'])['comb_fld_order']
.astype(str)
.str.split(',', expand=True)
.stack()
.reset_index(level=-1, drop=True)
.reset_index(name='comb_fld_order'))

Related

How to use .mode with groupby [duplicate]

This question already has answers here:
GroupBy pandas DataFrame and select most common value
(13 answers)
Closed 4 months ago.
I can use mean and median with groupby like this:
newdf.groupby('dropoff_site')['load_weight'].mean()
newdf.groupby('dropoff_site')['load_weight'].median()
But when I try the same thing for the mode:
newdf.groupby('dropoff_site')['load_weight'].mode()
an error is raised:
'SeriesGroupBy' object has no attribute 'mode'
What should I do?
Update:
Following "GroupBy pandas DataFrame and select most common value", I adapted
source2.groupby(['Country','City'])['Short name'].agg(pd.Series.mode)
to
newdf.groupby(['dropoff_site'])['load_weight'].agg(pd.Series.mode)
because some groups have more than one mode, but now the error is:
Must produce aggregated value
Try this...
newdf.groupby('dropoff_site')['load_weight'].agg(pd.Series.mode)
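If agg(pd.Series.mode) still fails with "Must produce aggregated value" when a group has several modes, one possible workaround is to reduce each group to a single object yourself. The sketch below is not from the original answers; the sample data is hypothetical and it assumes you either want the first mode or a list of all modes.

```python
import pandas as pd

# Hypothetical data: site "A" has two modes (10 and 20), site "B" has one.
newdf = pd.DataFrame({
    "dropoff_site": ["A", "A", "A", "A", "B", "B"],
    "load_weight": [10, 10, 20, 20, 5, 5],
})

grouped = newdf.groupby("dropoff_site")["load_weight"]

# Option 1: keep only the first (smallest) mode, so every group yields one scalar.
first_mode = grouped.agg(lambda s: s.mode().iloc[0])

# Option 2: keep every mode, stored as a list per group.
all_modes = grouped.apply(lambda s: s.mode().tolist())

print(first_mode.to_dict())  # {'A': 10, 'B': 5}
print(all_modes.to_dict())   # {'A': [10, 20], 'B': [5]}
```

Series.mode returns the modes in sorted order, so iloc[0] picks the smallest one when there is a tie.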

How do i split a column whose name has a hyphen? [duplicate]

This question already has answers here:
Pandas column access w/column names containing spaces
(6 answers)
Closed 10 months ago.
I am trying to split a column in a DataFrame (df1) using .str.split(). The column name is hyphenated (lat-lon). When I run the code below, Python reads only "lat" and ignores "-lon", and raises:
AttributeError: 'DataFrame' object has no attribute 'lat'
df1[['lat','lon']] = df1.lat-lon.str.split(" ",expand=True,)
df1
How do I get Python to read the entire column name (lat-lon) and not just lat?
df1[['lat','lon']] = df1["lat-lon"].str.split(" ",expand=True,)
You can't use attribute access for names that contain hyphens. The name must be a valid Python identifier (letters, digits, and underscores, not starting with a digit).
It is actually good practice never to use attribute access for column names in pandas.
Use:
df1[['lat','lon']] = df1['lat-lon'].str.split(" ",expand=True,)
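The reason for the AttributeError is that Python parses df1.lat-lon as the subtraction df1.lat - lon, so it looks up a column called lat first. A minimal sketch of the bracket form, using made-up coordinate strings:

```python
import pandas as pd

# Hypothetical sample data with a hyphenated column name.
df1 = pd.DataFrame({"lat-lon": ["6.5 3.4", "9.1 7.5"]})

# Bracket indexing takes the name as a plain string, so any characters work.
df1[["lat", "lon"]] = df1["lat-lon"].str.split(" ", expand=True)

print(df1["lat"].tolist())  # ['6.5', '9.1']
print(df1["lon"].tolist())  # ['3.4', '7.5']
```

The new columns hold strings; convert them with astype(float) if numeric values are needed.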

How do i change column name in a dataframe in python pandas [duplicate]

This question already has answers here:
Python renaming Pandas DataFrame Columns
(4 answers)
Multiple aggregations of the same column using pandas GroupBy.agg()
(4 answers)
Closed 11 months ago.
I'm new to Python. I used the following code:
StoreGrouper= DonHenSaless.groupby('Store')
StoreGrouperT= StoreGrouper["SalesDollars"].agg(np.sum)
StoreGrouperT.rename(columns={SalesDollars:TotalSalesDollars})
to group by store, sum SalesDollars, and then rename SalesDollars to TotalSalesDollars. It produced the following error:
NameError: name 'SalesDollars' is not defined
I also tried using quotes:
StoreGrouper= DonHenSaless.groupby('Store')
StoreGrouperT= StoreGrouper["SalesDollars"].agg(np.sum)
StoreGrouperT= StoreGrouperT.rename(columns={'SalesDollars':'TotalSalesDollars'})
This produced the error: rename() got an unexpected keyword argument 'columns'
Here is my df
df
To rename a column, the names must be quoted, so it would be:
StoreGrouperT.rename(columns={'SalesDollars':'TotalSalesDollars'})
I also usually assign the result back to a variable:
StoreGrouperT = StoreGrouperT.rename(columns={'SalesDollars':'TotalSalesDollars'})
Use the pandas rename method to change the column name. You can also pass inplace=True if you want the change applied to the DataFrame itself rather than assigning the result back to the df variable:
df.rename(columns={'old_name':'new_name'}, inplace=True)
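The second error in the question ("unexpected keyword argument 'columns'") suggests that StoreGrouperT is a Series rather than a DataFrame, because a single column was selected before aggregating. A minimal sketch under that assumption follows; the sales frame is a hypothetical stand-in for DonHenSaless.

```python
import pandas as pd

# Hypothetical stand-in for DonHenSaless.
sales = pd.DataFrame({
    "Store": ["A", "A", "B"],
    "SalesDollars": [10.0, 5.0, 7.0],
})

# Selecting one column before aggregating gives a Series, not a DataFrame.
totals = sales.groupby("Store")["SalesDollars"].sum()

# A Series has a single name, so rename it with a scalar ...
renamed = totals.rename("TotalSalesDollars")

# ... or turn it into a DataFrame and name the value column in one step.
as_frame = totals.reset_index(name="TotalSalesDollars")

print(renamed.name)            # TotalSalesDollars
print(list(as_frame.columns))  # ['Store', 'TotalSalesDollars']
```

Once the result is a DataFrame, rename(columns={...}) works as shown in the answers above.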

Can't merge Pandas Dataframe [duplicate]

This question already has answers here:
Append to Series in python/pandas not working
(2 answers)
Closed 2 years ago.
I have around 8 .csv files in a given directory. When I run this code, I get an empty DataFrame (the new_df that I created).
I have already seen how to use the concat function to get the job done, but I am wondering what I am doing wrong in my approach, since I read the documentation on DataFrame.append() and it should have worked.
path = Path("/content/Sales_data/")
new_df = pd.DataFrame()
for file in path.glob("*.csv"):
    df = pd.read_csv(file)
    new_df.append(df, ignore_index=True)
new_df
Appreciate any recommendation.
Try setting new_df to the DataFrame with appended data:
new_df = new_df.append(df, ignore_index=True)
The problem with your code is that append returns a new object; it does not modify the existing DataFrame in place.
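Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so on current versions the concat approach mentioned in the question is the one that works. A minimal sketch, using hypothetical in-memory frames in place of the CSV files:

```python
import pandas as pd

# Hypothetical frames standing in for the results of pd.read_csv(file).
frames = [
    pd.DataFrame({"item": ["a"], "qty": [1]}),
    pd.DataFrame({"item": ["b"], "qty": [2]}),
]

# Collect the pieces in a list and combine them once at the end.
new_df = pd.concat(frames, ignore_index=True)

print(new_df["qty"].tolist())  # [1, 2]
```

With the files from the question, the list could be built inside the loop with frames.append(pd.read_csv(file)), which is the list method and does modify the list in place.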

AttributeError: 'list' object has no attribute 'rename'

df.rename(columns={'nan': 'RK', 'PP': 'PLAYER','SH':'TEAM','nan':'GP','nan':'G','nan':'A','nan':'PTS','nan':'+/-','nan':'PIM','nan':'PTS/G','nan':'SOG','nan':'PCT','nan':'GWG','nan':'PPG','nan':'PPA','nan':'SHG','nan':'SHA'}, inplace=True)
This is my code to rename the columns according to http://www.espn.com/nhl/statistics/player/_/stat/points/sort/points/year/2015/seasontype/2
I want both tables to have the same column names. I am using Python 2 in the Spyder IDE.
When I run the code above, it gives me this error:
AttributeError: 'list' object has no attribute 'rename'
The original question was posted a long time ago, but I just came across the same issue and found the solution here: pd.read_html() imports a list rather than a dataframe
When you do pd.read_html you are creating a list of dataframes since the website may have more than 1 table. Add one more line of code before you try your rename:
dfs = pd.read_html(url, header=0)
and then df = dfs[0]. The df variable will then be a DataFrame, which lets you run the df.rename command from the original question.
This should fix it, where df is your dataset:
df.columns=['a','b','c','d','e','f']
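Assigning the whole list of names is also safer than the rename call in the question, because a dict literal keeps only the last value for a repeated key such as 'nan'. A minimal sketch with a hypothetical three-column table:

```python
import pandas as pd

# A dict literal keeps only the last value for a repeated key.
mapping = {"nan": "RK", "PP": "PLAYER", "nan": "GP"}
print(mapping)  # {'nan': 'GP', 'PP': 'PLAYER'}

# Hypothetical table with unhelpful headers.
df = pd.DataFrame([[1, "Player One", "TEAM1"]], columns=["x", "PP", "SH"])

# Assigning a full list renames by position; its length must match the column count.
df.columns = ["RK", "PLAYER", "TEAM"]
print(list(df.columns))  # ['RK', 'PLAYER', 'TEAM']
```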
