Index to column - python

I tried to convert my index to a column, but I get the error: AttributeError: 'DataFrame' object has no attribute 'reset_Seriennummer'. It should be simple, but it doesn't work.
My index is not called Index, but it is written the same way:
My df:
Seriennummer 0
701085.0 "(array([1.52558046e+03, 2.55900548e+02, 5.96901108e-01]), array([[ 9.41414894e+03, -2.07982124e+03, -2.30130078e+00],
[-2.07982124e+03, 1.44373786e+03, 9.59282709e-01],
[-2.30130078e+00, 9.59282709e-01, 7.75807643e-04]]))"
701086.0 "(array([1.19304206e+03, 2.71174688e+02, 6.59205468e-01]), array([[ 5.21906135e+03, -2.23855187e+03, -2.11896425e+00],
[-2.23855187e+03, 2.61036500e+03, 1.67396324e+00],
[-2.11896425e+00, 1.67396324e+00, 1.22581746e-03]]))"
What I tried so far:
df['Seriennummer'] = df.Seriennummer
or
df.reset_Seriennummer(level=0, inplace=True)

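This will work. The method is always called reset_index, whatever the index is named, and it returns a new DataFrame, so assign the result:
df = df.reset_index(level='Seriennummer')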
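
As a minimal sketch of that fix, assuming a small made-up frame whose index is named Seriennummer (the cell values are placeholders, not the real fit results), reset_index moves the index into an ordinary column:

```python
import pandas as pd

# Hypothetical stand-in for the real data: the index is named "Seriennummer".
df = pd.DataFrame(
    {0: ["fit result A", "fit result B"]},
    index=pd.Index([701085.0, 701086.0], name="Seriennummer"),
)

# The method name is always reset_index; the index name is passed as an argument.
# It returns a new DataFrame, so the result is assigned back.
df = df.reset_index(level="Seriennummer")

print(df.columns.tolist())
```

After this call the frame has a default integer index, and Seriennummer can be used like any other column.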

Related

I am having trouble indexing a dataframe

So I have a csv, and I am trying to load it into a dataframe via
df = pd.read_csv("watchlist.csv", sep='\s{2,}',)
It seems to work fine when I print(df)
Also, when I print columns, this is the output I get.
print(df.columns) #- OUTPUT:
Index([',Name,Growth,Recommendation,CurrentRatio,TotalCash,Debt,Revenue,PercentageSharesOut,PercentageInstitutions,PercentageInsiders,PricetoBook,ShortRatio,RegularMarketPrice'], dtype='object')
The trouble I'm having, is that when I try to then go and access a column with something like
med_debt = math.floor(df.Debt), or even
print(df.Debt)
I get an attribute error:
AttributeError: 'DataFrame' object has no attribute 'Debt'
Any assistance here would be appreciated.
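With sep='\s{2,}', pandas splits only on runs of two or more whitespace characters. A comma-separated header contains none, so the whole header line becomes a single column name (one string). For example: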
>>> df = pd.read_csv("weather", sep='\s{2,}')
>>> df.columns
Index(['Date/Time,Temp (C),Dew Point Temp (C),Rel Hum (%),Wind Spd (km/h),Visibility (km),Stn Press (kPa),Weather'], dtype='object')
>>> df.index
RangeIndex(start=0, stop=8784, step=1)
When you then try to access a specific column, math.floor(df.Debt) raises
AttributeError: 'DataFrame' object has no attribute 'Debt'
and df["Debt"] raises
raise KeyError(key) from err
KeyError: 'Debt'
To access individual columns this way, read the file with the default separator:
df = pd.read_csv("watchlist.csv")
The separator is not splitting the CSV correctly; try leaving it out and letting the CSV reader use its default value of , instead.
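
The difference can be reproduced with a small sketch. The sample text below is made up to mimic the layout of the question's header, and using the median for med_debt is an assumption about what the asker intended; math.floor expects a single number, so the column has to be reduced to one value first.

```python
import io
import math

import pandas as pd

# Made-up sample in the same comma-separated layout as the question's header.
csv_text = ",Name,Growth,Debt\n0,ACME,1.5,100.4\n1,Foo,2.0,250.9\n"

# "Two or more whitespace characters" never matches in this text,
# so each line stays one field and the header becomes one column name.
bad = pd.read_csv(io.StringIO(csv_text), sep=r"\s{2,}", engine="python")
print(len(bad.columns))

# The default separator "," splits the header into individual columns.
good = pd.read_csv(io.StringIO(csv_text))
print(good.columns.tolist())

# Reduce the column to one number before flooring it
# (median is an assumption about the intent behind med_debt).
med_debt = math.floor(good["Debt"].median())
print(med_debt)
```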

AttributeError: ("'float' object has no attribute 'strip'", 'occurred at index DIV') in Pandas Data Frame?

I want to strip the spaces across my pandas data frame.
I am using the following code for my data frame df1.
cols = df1.select_dtypes(object).columns
df1[cols] = df1[cols].applymap(lambda x: x.strip())
But I am getting the following error:
df1[cols] = df1[cols].applymap(lambda x: x.strip())
AttributeError: ("'float' object has no attribute 'strip'", 'occurred at index DIV')
How do I get rid of this?
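
The error message suggests that at least one object column (here DIV) holds a float, most likely a missing value (NaN), and floats have no strip method. One possible approach, sketched on a made-up frame, is to strip only the values that really are strings. The helper name strip_if_str is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: the text column "DIV" contains a missing value (a float NaN).
df1 = pd.DataFrame(
    {
        "DIV": pd.Series([" A ", np.nan, "B "], dtype=object),
        "n": [1, 2, 3],
    }
)

cols = df1.select_dtypes(include="object").columns


def strip_if_str(value):
    # Only strings have .strip(); NaN and other non-strings pass through unchanged.
    return value.strip() if isinstance(value, str) else value


# Apply the helper element by element to every object column.
df1[cols] = df1[cols].apply(lambda col: col.map(strip_if_str))
print(df1)
```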

Trying to split output by ','

I have an object for my output. Now I want to split my output and
create a df with the values.
This is the output I work with:
Seriennummer
701085.0 ([1525.5804581812297, 255.9005481721001, 0.596...
701086.0 ([1193.0420594479258, 271.17468806239793, 0.65...
701087.0 ([1265.5151604213813, 217.26487934586433, 0.60...
701088.0 ([1535.8282855508626, 200.6196628705149, 0.548...
701089.0 ([1500.4964672930257, 247.8883736673866, 0.583...
701090.0 ([1203.6453723293514, 258.5749562983118, 0.638...
701091.0 ([1607.1851164005993, 209.82194423587782, 0.56...
701092.0 ([1711.7277933836879, 231.1560159770871, 0.567...
dtype: object
This is what I am doing and my attempt to split my output:
x=df.T.iloc[1]
y=df.T.iloc[2]
def logifunc(x,c,a,b):
    return c / (1 + (a) * np.exp(-b*(x)))
result = df.groupby('Seriennummer').apply(lambda grp:
    opt.curve_fit(logifunc, grp.mrwSmpVWi, grp.mrwSmpP, p0=[110, 400, -2]))
print(result)
for element in result:
    parts = element.split(',')
    print (parts)
It doesn't work. I get the Error:
AttributeError: 'tuple' object has no attribute 'split'
@jezrael
It works. Now it shows a lot of data I don't need. Do you have an idea how I can drop every row with the data I don't need?
Seriennummer 0 1 2
701085.0 1525.5804581812297 255.9005481721001 0.5969011082719918
701085.0 [ 9.41414894e+03 -2.07982124e+03 -2.30130078e+00] [-2.07982124e+03 1.44373786e+03 9.59282709e-01] [-2.30130078e+00 9.59282709e-01 7.75807643e-04]
701086.0 1193.0420594479258 271.17468806239793 0.6592054681687264
701086.0 [ 5.21906135e+03 -2.23855187e+03 -2.11896425e+00] [-2.23855187e+03 2.61036500e+03 1.67396324e+00] [-2.11896425e+00 1.67396324e+00 1.22581746e-03]
701087.0 1265.5151604213813 217.26487934586433 0.607183527397275
Use Series.explode with the DataFrame constructor:
s = result.explode()
df1 = pd.DataFrame(s.tolist(), index=s.index)
If the data is small and/or performance is not important:
df1 = result.explode().apply(pd.Series)
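
A small sketch of the explode approach, which also shows one way to keep only the fitted parameters and drop the covariance rows from the follow-up comment. The Series below is a hypothetical stand-in for the curve_fit output with shortened, made-up numbers, and the column names c, a, b are assumed from the signature of logifunc.

```python
import pandas as pd

# Hypothetical stand-in for the curve_fit output: one (parameters, covariance)
# tuple per serial number, with shortened made-up numbers.
result = pd.Series(
    [
        ([1525.58, 255.90, 0.597],
         [[9414.1, -2079.8, -2.3], [-2079.8, 1443.7, 0.96], [-2.3, 0.96, 0.0008]]),
        ([1193.04, 271.17, 0.659],
         [[5219.1, -2238.6, -2.1], [-2238.6, 2610.4, 1.67], [-2.1, 1.67, 0.0012]]),
    ],
    index=pd.Index([701085.0, 701086.0], name="Seriennummer"),
)

# explode puts each tuple element on its own row: parameters first, then covariance.
s = result.explode()
df1 = pd.DataFrame(s.tolist(), index=s.index)

# Rows alternate parameters / covariance, so every second row holds the parameters.
params = df1.iloc[::2].astype(float)
params.columns = ["c", "a", "b"]  # assumed names, following logifunc(x, c, a, b)
print(params)
```

Selecting every second row relies on each tuple having exactly two elements, which holds for the default return value of curve_fit.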

Error message when using: groupby('').transform(pd.rolling_sum, window=30)

Why would this:
df2['rollsum'] = df2.groupby('ID')['yes'].transform(pd.rolling_sum, window=30, min_periods=1)
raise the error "AttributeError: module 'pandas' has no attribute 'rolling_sum'"?
Also, I tried
df2['rollsum'] = df2.groupby('ID')['yes'].rolling(30).mean()
which gives me this error: "TypeError: incompatible index of inserted column with frame index"
What am I doing wrong here?
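I think pd.rolling_sum has been deprecated and since removed, which is why the attribute no longer exists. What you can do instead is:
df2['rollsum'] = df2.groupby('ID')['yes'].transform(lambda x: x.rolling(30).sum())
or
df2['rollsum'] = df2.groupby('ID')['yes'].rolling(30).sum().reset_index(level=0, drop=True)
The reset_index drops the ID level that groupby adds to the result, so its index aligns with the frame's index when the column is assigned. Pass min_periods=1 to rolling if you want the partial sums from the original call.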
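
Both variants can be checked on a toy frame. The data below is made up, and a window of 2 is used instead of 30 so the effect is visible on a few rows.

```python
import pandas as pd

# Made-up data: two groups identified by "ID".
df2 = pd.DataFrame({"ID": ["a", "a", "a", "b", "b"], "yes": [1, 0, 1, 1, 1]})

# Variant 1: transform returns a result aligned with the original rows.
df2["rollsum"] = df2.groupby("ID")["yes"].transform(
    lambda x: x.rolling(2, min_periods=1).sum()
)

# Variant 2: groupby().rolling() returns a result indexed by (ID, original index);
# dropping the ID level restores the original index so assignment would align.
alt = (
    df2.groupby("ID")["yes"]
    .rolling(2, min_periods=1)
    .sum()
    .reset_index(level=0, drop=True)
)

print(df2)
```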

'DataFrame' object is not callable error when I try to create a new df

I am trying to create a new df out of df_exo, but I get the error 'DataFrame' object is not callable. df_exo is a DataFrame of shape (176, 1222). What is going wrong?
df_features = df_exo(['INDU.NL.INTM.1.BS.M', 'INDU.NL.CONS.1.BS.M_4',\
'INDU.NL.INTM.2.BS.M', 'INDU.NL.INTM.3.BS.M_12', 'INDU.NL.CONS.4.BS.M_10',\
'INDU.NL.INTM.COF.BS.M_3', 'INDU.NL.INTM.COF.BS.M_4', 'INDU.NL.INVE.5.BS.M_11',\
'INDU.NL.FOBE.7.BS.M_4', 'INDU.NL.TOT.1.BS.M_1', 'INDU.NL.TOT.6.BS.M_4',\
'INDU.NL.INTM.2.BS.M', 'SERV.NL.TOT.2.BS.M', 'SERV.NL.TOT.3.BS.M',\
'SERV.NL.TOT.1.BS.M_2', 'SERV.NL.TOT.1.BS.M_3', 'SERV.NL.TOT.3.BS.M_1',\
'SERV.NL.TOT.3.BS.M_2', 'SERV.NL.TOT.COF.BS.M_7', 'CONS.NL.TOT.7.BS.M',\
'CONS.NL.TOT.6.BS.M_12', 'CONS.NL.TOT.7.BS.M_1', 'CONS.NL.TOT.7.BS.M_2',\
'CONS.NL.TOT.7.BS.M_12', 'BUIL.NL.TOT.3.BS.M_12'])
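Select columns with square brackets and a list of names:
df_features = df_exo[['col1', 'col2']]
not with parentheses, which try to call the DataFrame like a function:
df_features = df_exo(['col1', 'col2'])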
Reference:
Selecting multiple columns in a pandas dataframe
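
A short sketch of list-based selection, using a made-up frame with placeholder column names (col1 to col3 are not from the question):

```python
import pandas as pd

# Made-up frame standing in for df_exo.
df_exo = pd.DataFrame({"col1": [1, 2], "col2": [3, 4], "col3": [5, 6]})

# Keep the wanted names in a list and index with it using square brackets.
wanted = ["col1", "col2"]
df_features = df_exo[wanted]

print(df_features.columns.tolist())
```

Note that the list in the question names 'INDU.NL.INTM.2.BS.M' twice. With list selection, a repeated label produces a repeated column, so it may be worth removing the duplicate.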
