How to refer to column that is not in index? - python

Let's say I have a data frame like this.
Max Min Open OpenA
Date
2017.10.18 1.18050 1.17858 1.17872 1.18028
2017.10.19 1.18575 1.17676 1.17804 1.18565
2017.10.20 1.18575 1.17621 1.17642 1.18532
2017.10.23 1.17770 1.17245 1.17281 1.17763
2017.10.24 1.17924 1.17423 1.17430 1.17866
I want to refer to the data['Date'] column, but I get this error:
KeyError: 'Date'
Cheers!

You can use reset_index and then treat it as a column:
df = df.reset_index()
df['date']
OR
you can use df.index.tolist(), which returns the index values as a list.
Ex:
In [2918]: df
Out[2918]:
emp_id
date
10/1/2018 staff_1
10/1/2018 staff_2
10/1/2018 staff_3
In [2922]: df.index.tolist()
Out[2922]: ['10/1/2018', '10/1/2018', '10/1/2018']
OR
In [2924]: df = df.reset_index()
In [2926]: df['date']
Out[2926]:
0 10/1/2018
1 10/1/2018
2 10/1/2018

That isn't actually a column but the index. So use data.index to fetch the values without changing the current structure of the dataframe.
You can further use data.reset_index() to make it a column.
Note - Don't use data.reset_index(drop=True) as that will drop the current index without even making it a column.
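To make the difference between these options concrete, here is a minimal sketch. The frame below is a hypothetical two-row stand-in for the one in the question, with 'Date' as the index name rather than a column.

```python
import pandas as pd

# Hypothetical frame mirroring the question: 'Date' names the index, it is not a column.
data = pd.DataFrame(
    {"Max": [1.18050, 1.18575], "Min": [1.17858, 1.17676]},
    index=pd.Index(["2017.10.18", "2017.10.19"], name="Date"),
)

# 'Date' is not among the columns, which is why data['Date'] raises KeyError.
has_date_column = "Date" in data.columns

# Read the index values without changing the frame.
dates = data.index.tolist()

# Turn the index into an ordinary column.
as_column = data.reset_index()

# drop=True discards the index instead of keeping it as a column.
dropped = data.reset_index(drop=True)
```

After this, as_column['Date'] works like any other column lookup, while dropped no longer contains the dates at all.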

Related

swap pandas column to values in another column

I have a pandas dataframe- got it from API so don't have much control over the structure of it- similar like this:
I want to have datetime a column and value as another column. Any hints?
You can use T to transpose the dataframe and then reset_index to move the old column labels into a new column. The new column is called index by default, so you may need to rename it.
df = df.T.reset_index()
df.columns = df.iloc[0]
df = df[1:]

python pandas: how to modify column header name and modify the date format

Using python pandas how can we change the data frame
First, how to copy the column name down to other cell(blue)
Second, delete the row and index column(orange)
Third, modify the date format(green)
I would appreciate any feedback~~
Update
df.iloc[1,1] = df.columns[0]
df = df.iloc[1:].reset_index(drop=True)
df.columns = df.iloc[0]
df = df.drop(df.index[0])
df = df.set_index('Date')
print(df.columns)
Question 1 - How to copy column name to a column (Edit - Rename column)
To rename a column, either assign a new list to df.columns or use pandas.DataFrame.rename:
df.columns = ['Date','Asia Pacific Equity Fund']
# The list must have 2 entries because the dataframe has 2 columns
# Rename using pandas.DataFrame.rename
df.rename(columns = {'Asia Pacific Equity Fund':'Date',"Unnamed: 1":"Asia Pacific Equity Fund"}, inplace = True)
df.columns returns all the column labels of the dataframe, and each label can be accessed by position.
Please refer to "Rename unnamed column pandas dataframe" for changing unnamed columns.
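As a small illustration of the rename call, the sketch below uses a made-up two-column frame whose labels match the ones named in this answer. The cell values are placeholders, not the asker's data.

```python
import pandas as pd

# Hypothetical frame with the column labels mentioned above; values are placeholders.
df = pd.DataFrame(
    {
        "Asia Pacific Equity Fund": ["2020-01-01", "2020-01-02"],
        "Unnamed: 1": [1.0, 2.0],
    }
)

# rename maps each old label to its new label independently,
# so shifting both names in one call is safe.
renamed = df.rename(
    columns={
        "Asia Pacific Equity Fund": "Date",
        "Unnamed: 1": "Asia Pacific Equity Fund",
    }
)

new_labels = list(renamed.columns)
```

Returning a new frame instead of passing inplace=True leaves the original df untouched, which makes it easier to compare before and after.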
Question 2 - Delete a row
# Keep the rows from position 1 onward and renumber the index
df = df.iloc[1:].reset_index(drop=True)
# Alternatively, remove specific rows by index label
df = df.drop([0,1]).reset_index(drop=True)
Question 3 - Modify the date format
current_format = '%Y-%m-%d %H:%M:%S'
desired_format = "%Y-%m-%d"
df['Date'] = pd.to_datetime(df['Date']).dt.strftime(desired_format)
# If you know the existing format, pass it via format=
df['Date'] = pd.to_datetime(df['Date'], format=current_format).dt.strftime(desired_format)
# To update the date format of the index
df.index = pd.to_datetime(df.index, format=current_format).strftime(desired_format)
Please refer pandas.to_datetime for more details
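A self-contained sketch of the conversion, assuming the strings really follow '%Y-%m-%d %H:%M:%S'. The two sample dates are invented for illustration. The format parameter of pd.to_datetime takes the parse pattern, whereas infer_datetime_format expects a boolean.

```python
import pandas as pd

# Hypothetical data; the strings follow '%Y-%m-%d %H:%M:%S'.
df = pd.DataFrame({"Date": ["2020-01-31 00:00:00", "2020-02-29 00:00:00"]})

current_format = "%Y-%m-%d %H:%M:%S"
desired_format = "%Y-%m-%d"

# format= tells pandas how to parse the strings;
# .dt.strftime controls how the parsed dates are written back out.
df["Date"] = pd.to_datetime(df["Date"], format=current_format).dt.strftime(desired_format)

# The same idea applied to an index: a DatetimeIndex has strftime directly.
indexed = df.set_index("Date")
indexed.index = pd.to_datetime(indexed.index, format=desired_format).strftime("%d/%m/%Y")
```

Note that strftime returns strings, so the column is no longer a datetime dtype afterwards.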
I'm not sure I understand your questions. I mean, do you actually want to change the dataframe or how it is printed/displayed?
Indexes can be changed by using methods .set_index() or .reset_index(), or can be dropped eventually. If you just want to remove the first digit from each index (that's what I understood from the orange column), you should then create a list with the new indexes and pass it as a column to your dataframe.
Regarding the date format, it depends on what you want the changed format to become. Take a look into python datetime.
I would strongly suggest taking a closer look at the pandas features and documentation, and at how to handle a dataframe with this library. There are plenty of great sources a Google search away :)
Delete the first two rows using this.
Rename the second column using this.
Work with datetime format using the datetime package. Read about it here

Getting an attribute error in python code

How to set date column as an index? I'm getting an error
AttributeError: 'DataFrame' object has no attribute 'Date'
How to fix this?
The Date column is already the index column, isn't it?
You can reset the index column and set it again like this if you want to try.
You will get the same result.
However, if you want to modify your Date column, you can do it by resetting the index column, modifying it, and then setting it back to the index.
import pandas as pd
import pandas_datareader as web
df = web.DataReader('^BSESN', data_source='yahoo', start='2015-07-16', end='2020-07-16')
df.reset_index(level=0, inplace=True)
# If you want to modify your index column, you can do it here.
df['Date'] = pd.to_datetime(df.Date, format='%Y-%m-%d')
df.index = df['Date']
df.drop('Date', axis=1, inplace=True)
df
Looks like you already have Date as index.
To set any column as index, you can also try:
df = df.set_index('Date')
This sets your Date column as the index and, because drop=True is the default, removes it from the regular columns, so there is no duplicate of Date in the DataFrame. Note that the previous index is replaced, not saved.
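A minimal sketch of what set_index does, using a made-up frame with placeholder values. It also shows why attribute access such as df.Date fails once Date is the index: in pandas' default behavior, the existing index is replaced.

```python
import pandas as pd

# Hypothetical frame with a default integer index; values are placeholders.
df = pd.DataFrame({"Date": ["2020-07-15", "2020-07-16"], "Close": [1.0, 2.0]})

# Default behaviour: 'Date' becomes the index and leaves the columns.
indexed = df.set_index("Date")

# drop=False keeps 'Date' as a column in addition to using it as the index.
kept = df.set_index("Date", drop=False)

# Once 'Date' is only the index, indexed.Date raises AttributeError;
# the values are available through indexed.index instead.
date_values = indexed.index.tolist()
```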

How do i name my index column of the attached dataset?

I am trying to name the index column, but I can't. I want to be able to name it so that I can reference it to view the index values, which are dates. I have tried
df3.rename(columns={0:'Date'}, inplace=True) but it's not working.
Please can someone help me out? Thank you.
Note that the dataframe index cannot be accessed using df['Date'].
If you want to rename the index, you can use DataFrame.rename_axis:
df=df.rename_axis(index='Date')
if you want to access it as a column then you have to transform it into a column using:
df=df.reset_index()
then you can use:
df['Date']
otherwise you can access the index by:
df.index
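Put together, the steps above can be sketched as follows. The frame is hypothetical (an unnamed index of date strings and one value column), since the original dataset was only shown as an image.

```python
import pandas as pd

# Hypothetical frame whose date index has no name yet.
df = pd.DataFrame({"value": [10, 20]}, index=["2021-01-01", "2021-01-02"])

# rename(columns=...) only touches column labels, so it cannot name the index.
# rename_axis gives the index a name without changing the data.
named = df.rename_axis(index="Date")

# reset_index turns the named index into a regular column.
as_column = named.reset_index()
dates = as_column["Date"].tolist()
```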
Since you did not provide an example data frame, here is an arbitrary example to demonstrate the idea.
import datetime as dt
import pandas as pd
data = {'C1' : [1, 2],
'Date' : [dt.datetime.now().strftime('%Y-%m-%d'),dt.datetime.now().strftime('%Y-%m-%d')]}
df = pd.DataFrame(data)
df.index = df["Date"]
del df["Date"]
print(df.index.name) # this will give you the new index column
print(df) #print the dataframe

Assign date to Dataframe column

Trying to assign a date to a column in a DataFrame.
Assigning in the following way gives an error
for date in sorted(list(set(dates))):
df.loc[:, 'DATE'] = date
Error Cannot set a frame with no defined index and a scalar
Okay, fine:
for date in sorted(list(set(dates))):
df['DATE'] = date
Warning: A value is trying to be set on a copy of a slice from a DataFrame, try using .loc ...
What exactly does Python want me to do here, so that I don't just trade an error for a warning?
Many thanks!
if you are sure that len(sorted(list(set(dates)))) == len(df) then you can simply do:
df['DATE'] = sorted(list(set(dates)))
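A small sketch of that assignment with the length check made explicit. The dates list and the frame are invented for illustration. If df was itself sliced from another frame, taking an explicit .copy() first is a common way to avoid the "copy of a slice" warning; that is an assumption about the asker's setup, which the question does not show.

```python
import pandas as pd

# Hypothetical inputs: a list of dates with duplicates and a three-row frame.
dates = ["2021-03-02", "2021-03-01", "2021-03-02", "2021-03-03"]
source = pd.DataFrame({"value": [1, 2, 3, 4]})

# Assumption: df comes from slicing another frame, so copy it explicitly.
df = source[source["value"] < 4].copy()

unique_dates = sorted(set(dates))

# Assigning a list requires one entry per row; otherwise pandas raises ValueError.
if len(unique_dates) == len(df):
    df["DATE"] = unique_dates
else:
    raise ValueError("number of unique dates does not match number of rows")
```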
