I have a huge .csv file with a date as one of the columns, and I'm trying to plot it on a graph, but I'm getting this error:
"time data '01-Sept-20' does not match format '%d-%b-%y' (match)"
I'm using this line of code to convert it into datetime format
df['Date'] = pd.to_datetime(df['Date'], format="%d-%b-%y")
I think this error occurs because 'Sept' should be 'Sep'.
How can I change 'Sept' to 'Sep'?
I'm using this dataset: covid19 api
As @Mayank pointed out in the comments, you can replace the "Sept" string, and that works.
However, your dataset also has a column named Date_YMD, which gives you the date without any string replacement.
A complete example:
import pandas as pd
df = pd.read_csv('covid.csv')
df['Date_YMD'] = pd.to_datetime(df['Date_YMD'])
df['Date'] = pd.to_datetime(df['Date'].str.replace('Sept', 'Sep'), format='%d-%b-%y')
I think the main point here is to familiarize yourself with the data before searching for a technical solution.
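To see the replacement in isolation, here is a minimal sketch that runs without the CSV. The three sample strings are made up for illustration and are not taken from the real dataset; it also assumes an English locale, since %b matches locale-dependent month abbreviations.

```python
import pandas as pd

# Hypothetical sample mimicking the 'Date' column (not real data from the dataset).
df = pd.DataFrame({"Date": ["30-Aug-20", "01-Sept-20", "02-Oct-20"]})

# Replace the four-letter abbreviation with the three-letter one that %b expects,
# then parse with the explicit format from the question.
cleaned = df["Date"].str.replace("Sept", "Sep", regex=False)
df["Date"] = pd.to_datetime(cleaned, format="%d-%b-%y")

print(df["Date"].dt.strftime("%Y-%m-%d").tolist())
```

Passing regex=False makes the call a plain substring replacement, which is all that is needed here.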
Related
I've searched for 2 hours but can't find an answer for this that works.
I have this dataset I'm working with and I'm trying to find the latest date, but it seems like my code is not taking the year into account. Here are some of the dates that I have in the dataset.
Date
01/09/2023
12/21/2022
12/09/2022
11/19/2022
Here's a snippet from my code
import pandas as pd
df=pd.read_csv('test.csv')
df['Date'] = pd.to_datetime(df['Date'])
st.write(df['Date'].max())
st.write gives me 12/21/2022 as the output instead of 01/09/2023 as it should be. So it seems like the code is not taking the year into account and just looking at the month and date.
I tried changing the format to
df['Date'] = df['Date'].dt.strftime('%Y%m%d').astype(int) but that didn't change anything.
pandas.read_csv lets you designate columns to be converted into dates. Let the content of test.csv be
Date
01/09/2023
12/21/2022
12/09/2022
11/19/2022
then
import pandas as pd
df = pd.read_csv('test.csv', parse_dates=["Date"])
print(df['Date'].max())
gives output
2023-01-09 00:00:00
Explanation: I pass a list of the names of the columns holding dates, and read_csv then parses them.
(tested in pandas 1.5.2)
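One possible cause of the original symptom, offered as a guess rather than a diagnosis, is that the maximum was taken over strings: text in MM/DD/YYYY form compares by month first. The sketch below uses an in-memory copy of the four sample dates (io.StringIO stands in for test.csv) and contrasts the string maximum with the datetime maximum, using an explicit format so the month/day order is unambiguous.

```python
import io
import pandas as pd

# In-memory stand-in for test.csv, using the dates from the question.
csv_text = "Date\n01/09/2023\n12/21/2022\n12/09/2022\n11/19/2022\n"
df = pd.read_csv(io.StringIO(csv_text))

# While the column is still text, max() compares character by character.
string_max = df["Date"].max()

# After parsing with an explicit month/day/year format, max() compares real dates.
df["Date"] = pd.to_datetime(df["Date"], format="%m/%d/%Y")
latest = df["Date"].max()

print(string_max)  # 12/21/2022
print(latest)      # 2023-01-09 00:00:00
```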
I need to convert the date to Day, Month and Year. I tried some alternatives, but I was unsuccessful.
import pandas as pd
df = pd.read_excel(r"C:\__Imagens e Planilhas Python\Instagram\Postagem.xlsx")
print(df)
It's very confusing, because you're using two different formats between the image and the expected result (and you write you want the same).
First make sure the data column is parsed as a datetime:
df['data'] = pd.to_datetime(df['data'])
Once you have this, just change the format with:
my_format = '%m-%d-%Y'
df['data'] = df['data'].dt.strftime(my_format)
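If separate day, month and year values are what you need, the .dt accessor exposes them directly. This is a minimal sketch: the two sample dates and the new column names (day, month, year, formatted) are assumptions for illustration, since the spreadsheet itself isn't shown.

```python
import pandas as pd

# Hypothetical stand-in for the spreadsheet's 'data' column.
df = pd.DataFrame({"data": ["2023-01-15", "2022-12-03"]})
df["data"] = pd.to_datetime(df["data"])

# Pull the individual components out of the datetime column.
df["day"] = df["data"].dt.day
df["month"] = df["data"].dt.month
df["year"] = df["data"].dt.year

# Or build a single day/month/year string.
df["formatted"] = df["data"].dt.strftime("%d/%m/%Y")

print(df)
```

Note that strftime returns strings, so keep the original datetime column around if you still need to sort or compare dates.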
I have a dataframe in which one of the columns, 'dates' (dtype: object), has the format YYYY-MM-DD HH:MM:SS+00:00 (there is a space between the day and the hour), but I want to simplify this to just YYYY-MM-DD. Is there a way to cut off the HH:MM:SS+00:00 part with a few lines of code? I've tried the following, but it didn't work:
pd.to_datetime(combined_csv['dates'], format='%Y-%m-%dT')
Any suggestions?
I hope that's useful for you
import pandas as pd
df = pd.read_csv("combined.csv")
df[["Date", "Time"]] = df["dates"].str.split(" ", expand=True)
The solutions I have found in a similar question are not working for me. I have a pandas DataFrame including mock sales data. I want to sort by date since they are currently out of order. I have tried converting to a datetime object. I also tried creating a Month and Day column and sorting by them but that did not work either. Date is in YYYY-MM-DD format
Here is my solution:
import pandas as pd
import datetime
data = pd.read_csv(path)
# sort by date (not working)
data['OrderDate'] = pd.to_datetime(data['OrderDate'])
data.sort_values(by='OrderDate')
data.reset_index(inplace=True)
# sort by month then day (not working)
data.sort_values(by='Month')
data.sort_values(by='Day')
data.reset_index(inplace=True)
# export csv
data.to_csv(fileName, index=False)
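A likely reason the sort appears not to work: sort_values returns a new, sorted DataFrame and leaves the original untouched unless the result is assigned back (or inplace=True is passed). A minimal sketch with made-up rows, where the Units column is hypothetical:

```python
import pandas as pd

# Hypothetical mock sales data, deliberately out of order.
data = pd.DataFrame({
    "OrderDate": ["2021-03-15", "2021-01-02", "2021-02-20"],
    "Units": [5, 3, 8],
})
data["OrderDate"] = pd.to_datetime(data["OrderDate"])

# Assign the result back; drop=True discards the old index instead of
# keeping it as an extra 'index' column.
data = data.sort_values(by="OrderDate").reset_index(drop=True)

print(data)
```

The same applies to the Month and Day attempt: two separate sort_values calls don't combine, whereas sort_values(by=['Month', 'Day']) sorts on both keys at once.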
I've just started using pandas and I'm trying to import an excel file but I get Date-Time values like 01/01/2019 00:00:00 instead of the 01/01/2019 format. The source data is Date by the way, not Date-Time.
I'm using the following code
import pandas as pd
df = pd.read_excel (r'C:\Users\abcd\Desktop\KumulData.xlsx')
print(df)
The columns that have date in them are "BDATE", "BVADE" and "AKTIVASYONTARIH" which correspond to 6th, 7th and 11th columns.
What code can I use to see the dates as Date format in Pandas Dataframe?
Thanks.
If they're already datetimes then you can extract the date part and reassign the columns:
df[["BDATE", "BVADE", "AKTIVASYONTARIH"]] = df[["BDATE", "BVADE", "AKTIVASYONTARIH"]].apply(lambda x: x.dt.date)
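For the sake of completeness, the columns can also be cast directly. Note that astype returns a new DataFrame, so the result has to be assigned back, and the "datetime64[D]" unit may not be accepted by every pandas version:
df[["BDATE", "BVADE", "AKTIVASYONTARIH"]] = df[["BDATE", "BVADE", "AKTIVASYONTARIH"]].astype("datetime64[D]")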