I am trying to name the index column, but I can't. I want to name it so that I can reference it to view the index values, which are dates. I have tried
df3.rename(columns={0:'Date'}, inplace=True), but it's not working.
Can someone please help me out? Thank you.
Note that the DataFrame index cannot be accessed using df['Date'].
If you want to rename the index, you can use DataFrame.rename_axis:
df=df.rename_axis(index='Date')
If you want to access it as a column, you have to convert it into a column using:
df=df.reset_index()
Then you can use:
df['Date']
Otherwise, you can access the index with:
df.index
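The options above can be sketched together as follows. This is a minimal illustration on made-up data (the column C1 and the date strings are assumptions, not the asker's frame):

```python
import pandas as pd

# Hypothetical data: an unnamed index holding date strings.
df = pd.DataFrame({"C1": [1, 2]}, index=["2021-01-01", "2021-01-02"])

# Name the index; the values are still reached through .index.
named = df.rename_axis(index="Date")
print(named.index.name)

# Turn the named index into an ordinary column so that ["Date"] works.
as_column = named.reset_index()
print(as_column["Date"].tolist())
```

Note that rename(columns=...) only touches column labels, which is why it has no effect on the index.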
Since you did not post an example DataFrame, here is an arbitrary example to demonstrate the idea.
import datetime as dt
import pandas as pd
data = {'C1' : [1, 2],
'Date' : [dt.datetime.now().strftime('%Y-%m-%d'),dt.datetime.now().strftime('%Y-%m-%d')]}
df = pd.DataFrame(data)
df.index = df["Date"]
del df["Date"]
print(df.index.name) # this prints the name of the index, 'Date'
print(df) # print the dataframe
I have a pandas DataFrame that comes from an API, so I don't have much control over its structure. It looks similar to this:
I want to have the datetime as one column and the value as another column. Any hints?
You can use T to transpose the DataFrame and then reset_index to turn the current index into a new column. You may need to rename that column, since it is called index by default.
df = df.T.reset_index()
df.columns = df.iloc[0]
df = df[1:]
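Since the original frame is not shown, here is a rough sketch of the transpose idea on a hypothetical layout, assuming the timestamps are the column labels and there is a single row of values. The names wide, long, datetime and value are made up for illustration:

```python
import pandas as pd

# Hypothetical API-style frame: timestamps as column labels, one row of values.
wide = pd.DataFrame(
    [[1.5, 2.5, 3.5]],
    index=["value"],
    columns=["2021-01-01 00:00", "2021-01-01 01:00", "2021-01-01 02:00"],
)

# Transpose so each timestamp becomes a row, name the index,
# then move it into a regular column.
long = wide.T.rename_axis("datetime").reset_index()

# Convert the timestamp strings to real datetimes.
long["datetime"] = pd.to_datetime(long["datetime"])
print(long)
```

If your frame is laid out differently (for example, timestamps in the first row rather than in the column labels), the header promotion shown in the answer above is still needed.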
Using Python pandas, how can we change the DataFrame as follows?
First, how to copy the column name down to another cell (blue)
Second, delete the row and the index column (orange)
Third, modify the date format (green)
I would appreciate any feedback.
Update
df.iloc[1,1] = df.columns[0]
df = df.iloc[1:].reset_index(drop=True)
df.columns = df.iloc[0]
df = df.drop(df.index[0])
df = df.set_index('Date')
print(df.columns)
Question 1 - How to copy column name to a column (Edit- Rename column)
To rename a column, assign a new list to df.columns or use pandas.DataFrame.rename:
df.columns = ['Date','Asia Pacific Equity Fund']
# Here the list size should be 2 because you have 2 columns
# Rename using pandas.DataFrame.rename
df.rename(columns = {'Asia Pacific Equity Fund':'Date',"Unnamed: 1":"Asia Pacific Equity Fund"}, inplace = True)
df.columns returns all the column names of the DataFrame, and you can access each name by position.
Please refer to "Rename unnamed column pandas dataframe" for changing unnamed columns.
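As a small self-contained sketch of both renaming approaches, here is a made-up frame that mimics a spreadsheet import with an unnamed second column (the data values are assumptions):

```python
import pandas as pd

# Hypothetical frame mimicking a spreadsheet import.
df = pd.DataFrame({
    "Asia Pacific Equity Fund": ["2020-01-01", "2020-01-02"],
    "Unnamed: 1": [10.0, 10.5],
})

# Option 1: map old names to new names; unlisted columns are left untouched.
renamed = df.rename(columns={
    "Asia Pacific Equity Fund": "Date",
    "Unnamed: 1": "Asia Pacific Equity Fund",
})

# Option 2: replace all labels at once; the list length must match
# the number of columns.
relabelled = df.copy()
relabelled.columns = ["Date", "Asia Pacific Equity Fund"]

print(list(renamed.columns))
print(list(relabelled.columns))
```

Option 1 is usually safer when only some columns need new names, because a typo in the mapping leaves the frame unchanged instead of mislabelling columns.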
Question 2 - Delete a row
# Keep the rows from position 1 onward (this drops the first row)
df = df.iloc[1:].reset_index()
# To remove specific rows by index label
df = df.drop([0,1]).reset_index()
Question 3 - Modify the date format
current_format = '%Y-%m-%d %H:%M:%S'
desired_format = "%Y-%m-%d"
df['Date'] = pd.to_datetime(df['Date']).dt.strftime(desired_format)
# Pass the existing format explicitly
df['Date'] = pd.to_datetime(df['Date'], format=current_format).dt.strftime(desired_format)
# To update the date format of the index
df.index = pd.to_datetime(df.index, format=current_format).strftime(desired_format)
Please refer to pandas.to_datetime for more details.
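For reference, a minimal runnable sketch of the date reformatting, assuming the dates arrive as strings in the '%Y-%m-%d %H:%M:%S' format (the Price column and its values are made up):

```python
import pandas as pd

df = pd.DataFrame({
    "Date": ["2020-01-01 00:00:00", "2020-01-02 00:00:00"],
    "Price": [10.0, 10.5],
})

# Parse with an explicit format, then render as date-only strings.
parsed = pd.to_datetime(df["Date"], format="%Y-%m-%d %H:%M:%S")
df["Date"] = parsed.dt.strftime("%Y-%m-%d")

# The same idea for an index: a DatetimeIndex has strftime directly (no .dt).
df = df.set_index("Date")
df.index = pd.to_datetime(df.index, format="%Y-%m-%d").strftime("%Y-%m-%d")
print(df)
```

Keep in mind that strftime returns strings. If you need to filter or resample by date later, keep the parsed datetimes and only format them for display.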
I'm not sure I understand your questions. I mean, do you actually want to change the dataframe or how it is printed/displayed?
Indexes can be changed with the .set_index() or .reset_index() methods, or dropped altogether. If you just want to remove the first digit from each index (that's what I understood from the orange column), you should create a list with the new indexes and pass it as a column to your dataframe.
Regarding the date format, it depends on what you want the new format to be. Take a look at Python's datetime.
I would strongly suggest you take a closer look at the pandas features and documentation, and at how to handle a dataframe with this library. There are plenty of great sources a Google search away :)
Delete the first two rows using this.
Rename the second column using this.
Work with datetime format using the datetime package. Read about it here
How do I set the Date column as the index? I'm getting an error:
AttributeError: 'DataFrame' object has no attribute 'Date'
How to fix this?
The Date column is already the index column, isn't it?
You can reset the index column and set it again like this if you want to try.
You will get the same result.
However, if you want to modify your Date column, you can do it by resetting the index column, modifying it, and then setting it back to the index.
import pandas as pd
import pandas_datareader as web
df = web.DataReader('^BSESN', data_source='yahoo', start='2015-07-16', end='2020-07-16')
df.reset_index(level=0, inplace=True)
# If you want to modify your index column, you can do it here.
df['Date'] = pd.to_datetime(df.Date, format='%Y-%m-%d')
df.index = df['Date']
df.drop('Date', axis=1, inplace=True)
df
Looks like you already have Date as index.
To set any column as index, you can also try:
df = df.set_index('Date')
This sets your Date column as the index and removes it from the regular columns, so there is no duplicate of Date left in the DataFrame. Note that the previous index is discarded; call reset_index() first if you want to keep it as a column.
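The following sketch, on made-up data, shows why the AttributeError appears once Date is the index, and how to move it back to a column (the Close column and its values are assumptions):

```python
import pandas as pd

df = pd.DataFrame({"Date": ["2020-07-15", "2020-07-16"], "Close": [100.0, 101.0]})
df["Date"] = pd.to_datetime(df["Date"])

indexed = df.set_index("Date")

# Date is now the index, not a column, so column-style access fails.
print("Date" in indexed.columns)  # False
print(indexed.index.name)         # Date

try:
    indexed.Date
except AttributeError as exc:
    print("AttributeError:", exc)

# reset_index() brings it back as a regular column.
restored = indexed.reset_index()
print(list(restored.columns))
```

So if df.Date raises this error, the first thing to check is df.index.name.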
I want to get the row number of each row in the DataFrame and store it in a new column called Id. Please advise me on how to code it.
Current dataframe:
Expected outcome with the new Id column:
If the first column is the index, use DataFrame.insert, and subtract 1 if necessary:
df.insert(0, 'Id', df.index - 1)
If you need a count column as a general solution that works with any index values:
df.insert(0, 'Id', np.arange(len(df)))
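A self-contained version of the general approach, including the numpy import it needs, might look like this. The frame and its index labels are made up for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a non-sequential index.
df = pd.DataFrame(
    {"a": [34, 23, 37, 38], "b": [1, 2, 3, 4]},
    index=[10, 20, 30, 40],
)

# Positional row number, independent of the index labels.
df.insert(0, "Id", np.arange(len(df)))
print(df)
```

Use np.arange(1, len(df) + 1) instead if the numbering should start at 1.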
import pandas as pd
df = pd.DataFrame({'a':[34, 23,37,38],'b':[1,2,3,4]})
df.set_index('a', inplace=True)
Id = list(df.index)
df['Id'] = Id
First, reset the index; this adds it as a new column named index. Then rename that column to the desired name.
import pandas as pd
df = (df.reset_index()
        .rename(columns={'index': 'ID'}))
How do I set my column names from "Unnamed" to the first row of my DataFrame in Python?
import pandas as pd
df = pd.read_excel('example.xls', sheet_name='Day_Report', index_col=None, skipfooter=31)
df = df.dropna(how='all',axis=1)
df = df.dropna(how='all')
df = df.drop(2)
To set the column names (assuming that's what you mean by "indexes") to the first row, you can use
df.columns = df.loc[0, :].values
Following that, if you want to drop the first row, you can use
df.drop(0, inplace=True)
Edit
As coldspeed correctly notes below, if the source of this is reading a CSV, then adding the skiprows=1 parameter is much better.
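Both routes can be sketched on a small made-up example. The CSV text and the names raw, df_skip, df_raw and promoted are assumptions for illustration, not the asker's file:

```python
import io
import pandas as pd

# Hypothetical CSV text whose first line is a title rather than the header.
raw = "Daily report\nDate,Value\n2020-01-01,1\n2020-01-02,2\n"

# Option 1: skip the title line while reading, so the real header is used.
df_skip = pd.read_csv(io.StringIO(raw), skiprows=1)

# Option 2: the header ended up as the first data row; promote it afterwards.
df_raw = pd.DataFrame([["Date", "Value"], ["2020-01-01", 1], ["2020-01-02", 2]])
df_raw.columns = df_raw.iloc[0].values
promoted = df_raw.iloc[1:].reset_index(drop=True)

print(list(df_skip.columns))
print(list(promoted.columns))
```

Option 1 also lets pandas infer the column dtypes correctly, whereas promoting a row after the fact leaves the columns as object dtype until they are converted.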