How to convert to just a date - Pandas, Python [duplicate]

I'm trying to convert a string to a date. I understand how to use the to_datetime function that comes with pandas, but I'd like to be able to do this without inserting a time.
I'm sure this is very simple but I'm a little new to this.

You don't need the time component. Whether you use datetime.strptime or to_datetime, the conversion is the same:
In [10]:
df = pd.DataFrame({'date':['2012/04/06']})
0 2012/04/06
In [11]:
import datetime as dt
df['date'].apply(lambda x: dt.datetime.strptime(x, '%Y/%m/%d'))
0 2012-04-06
Name: date, dtype: datetime64[ns]
In [13]:
pd.to_datetime(df['date'])
0 2012-04-06
Name: date, dtype: datetime64[ns]
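If the goal is a value with no time part at all, one option is the .dt accessor. The following is a minimal sketch that rebuilds the same one-row frame; the variable names parsed, date_only and midnight are illustrative and not from the answer above. Note that .dt.date produces plain datetime.date objects in an object column, while .dt.normalize() keeps a datetime column with the time set to midnight.

```python
import datetime as dt

import pandas as pd

df = pd.DataFrame({'date': ['2012/04/06']})
parsed = pd.to_datetime(df['date'], format='%Y/%m/%d')

# Plain datetime.date objects (the column dtype becomes object).
date_only = parsed.dt.date

# Still a datetime column, with the time component set to 00:00:00.
midnight = parsed.dt.normalize()

print(date_only)
print(midnight)
```

Which of the two fits better depends on what happens next: datetime.date objects display without a time, but the datetime column keeps the vectorised .dt methods available.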


How to convert datetime to strings in python [duplicate]

I have a dataframe which contains a column called period that has datetime values in it in the following format:
I want to convert the datetimes to strings in the format 03/01/2020 (month/day/year).
How would I do this?
import pandas as pd
df = pd.DataFrame({'period': ['2020-03-01T00:00:00.000000000', '2020-04-01T00:00:00.000000000']})
df['period'] = pd.to_datetime(df['period'])
df['period'] = df['period'].dt.strftime('%m/%d/%Y')
0 03/01/2020
1 04/01/2020

How to obtain just the year from pandas data frame? [duplicate]

So I wrote some code to turn a list of strings into date times:
s = pd.Series(["14 Nov 2020", "14/11/2020", "2020/11/14",
"Hello World", "Nov 14th, 2020"])
s_dates = pd.to_datetime(s, errors='coerce', exact=False)
It produced the following output:
0 2020-11-14
1 2020-11-14
2 2020-11-14
3 NaT
4 2020-11-14
dtype: datetime64[ns]
How would I obtain just the year from this?
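Since your series s_dates has dtype datetime64[ns], you can directly use
Series.dt.year like:
s_dates.dt.year
This will return a series containing only the year. The dtype is an integer type when there are no missing values; because of the NaT in row 3, the result here is float, with NaN in that position.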
Check the documentation for more useful datetime transformations.
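As a small illustration of the .dt.year approach, here is a sketch on made-up data (two valid dates and one missing value, not the series from the question). The conversion to the nullable 'Int64' dtype is an optional extra step, shown in case whole-number years are wanted despite the missing value.

```python
import pandas as pd

# Example data only: two valid dates and one missing value.
s_dates = pd.Series(pd.to_datetime(['2020-11-14', None, '2021-01-02']))

# NaT has no year, so that position comes back as NaN.
years = s_dates.dt.year

# Optional: a nullable integer dtype keeps whole-number years next to <NA>.
years_int = years.astype('Int64')

print(years)
print(years_int)
```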
Assuming your years would always be 4 digits, we can try using str.extract here:
s_dates["year"] = s_dates["dates_extracted"].str.extract(r'(\d{4})')
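The regex approach works on the raw strings rather than on parsed datetimes. A minimal sketch, applied to the original series s from the question; the name year_str is illustrative. The result holds strings, not integers, and rows without four consecutive digits become NaN.

```python
import pandas as pd

s = pd.Series(["14 Nov 2020", "14/11/2020", "2020/11/14",
               "Hello World", "Nov 14th, 2020"])

# expand=False returns a Series instead of a one-column DataFrame.
year_str = s.str.extract(r'(\d{4})', expand=False)

print(year_str)
```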

Convert date column (string) to datetime and match the format

I'm trying to convert the following date column (str) to datetime64, but I get an error saying the format doesn't match. Can anyone help me, please? :)
0 15/7/21
2541 13/9/21
dtype: object
What I tried:
pd.to_datetime(df["Date"], format = "%d/%m/%Y")
ValueError: time data '15/7/21' does not match format '%d/%m/%Y' (match)
I also tried:
pd.to_datetime(df["Date"].astype("datetime64"), format='%d/%m/%Y')
This converts the column to datetime, but for some dates the day and the month are swapped.
Does anyone know what to do?
%Y expects a 4-digit year. Use %y for a 2-digit year (see the docs):
>>> import pandas as pd
>>> df = pd.DataFrame({'Date':['15/7/21','13/9/21']})
>>> df['Date']
0 15/7/21
1 13/9/21
Name: Date, dtype: object
>>> pd.to_datetime(df['Date'], format='%d/%m/%y')
0 2021-07-15
1 2021-09-13
Name: Date, dtype: datetime64[ns]
Note that pandas is pretty good at guessing the format:
>>> pd.to_datetime(df['Date'])
0 2021-07-15
1 2021-09-13
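Guessing is what swaps day and month for ambiguous values, so an explicit format is the safer route when the layout is known. A minimal sketch, using a made-up ambiguous value '3/4/21' that is not in the original data, to show that the explicit day-first format reads it as 3 April rather than 4 March:

```python
import pandas as pd

# '3/4/21' is an invented ambiguous example; '15/7/21' is from the question.
df = pd.DataFrame({'Date': ['15/7/21', '3/4/21']})

# %d and %m accept non-padded values when parsing; %y reads the 2-digit year.
parsed = pd.to_datetime(df['Date'], format='%d/%m/%y')

print(parsed)
```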

pandas to_datetime converts non-zero padded month and day into datetime

I am using pd.to_datetime to convert strings into datetime:
df = pd.DataFrame(data={'id':['DD-83']})
pd.to_datetime(df['id'].str.replace(r'\D+', '', regex=True), errors='coerce', format='%d%m')
%d%m defines zero-padded day and month, but the code still converts the above string into:
0 1900-03-08
Name: id, dtype: datetime64[ns]
I am wondering how to avoid it being converted into datetime (e.g. convert to NaT in this case), if the month and day in a string are not 0-padded. So
will convert to
You need to check for a hyphen and pass only the strings that do not contain one:
df = pd.DataFrame(data={'id':['DD-83', 'DD0706', 'DD0306']})
df['date'] = pd.to_datetime(df['id'].loc[~df['id'].str.contains('-')].str.replace(r'\D+', '', regex=True), errors='coerce', format='%d%m')
id date
0 DD-83 NaT
1 DD0706 1900-06-07
2 DD0306 1900-06-03
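Filtering on the hyphen handles this particular data. A stricter variant, sketched below as an assumption rather than as part of the answer, is to accept only ids that end in exactly four digits and let everything else become NaT. The regex and the name digits are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'id': ['DD-83', 'DD0706', 'DD0306']})

# Capture the digits only when the id is non-digits followed by exactly
# four digits; any other shape (such as 'DD-83') yields a missing value.
digits = df['id'].str.extract(r'^\D*(\d{4})$', expand=False)

# Missing values pass through as NaT; the year defaults to 1900.
df['date'] = pd.to_datetime(digits, format='%d%m', errors='coerce')

print(df)
```

This way a non-padded value is rejected by its shape before parsing, regardless of which separator it happens to contain.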

Datetime and Timestamp equality in Python and Pandas

I've been playing around with datetimes and timestamps, and I've come across something that I can't understand.
import pandas as pd
import datetime
year_month = pd.DataFrame({'year':[2001,2002,2003], 'month':[1,2,3]})
year_month['date'] = [datetime.datetime.strptime(str(y) + str(m) + '1', '%Y%m%d') for y,m in zip(year_month['year'], year_month['month'])]
>>> year_month
month year date
0 1 2001 2001-01-01
1 2 2002 2002-02-01
2 3 2003 2003-03-01
I think the unique function is doing something to the timestamps that is changing them somehow:
first_date = year_month['date'].unique()[0]
>>> first_date == year_month['date'][0]
In fact:
>>> year_month['date'].unique()
'2003-02-28T16:00:00.000000000-0800'], dtype='datetime64[ns]')
My suspicion is that there is some sort of timezone difference underneath the functions, but I can't figure it out.
I just checked the python commands list(set()) as an alternative to the unique function, and that works. This must be a quirk of the unique() function.
You have to convert to datetime64 to compare:
In [12]:
first_date == year_month['date'][0].to_datetime64()
This is because unique has converted the dtype to datetime64:
In [6]:
first_date = year_month['date'].unique()[0]
I think this is because unique returns a NumPy array, and NumPy currently has no dtype that understands Timestamp. See: Converting between datetime, Timestamp and datetime64
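The type mismatch can be reproduced without relying on unique(), whose return type may differ between pandas versions. A minimal sketch, with two assumptions that are not from the question: the dates are built with pd.to_datetime from year/month/day columns, and .to_numpy() is used to obtain the NumPy value. Converting either operand so both sides have the same type makes the comparison behave as expected.

```python
import numpy as np
import pandas as pd

year_month = pd.DataFrame({'year': [2001, 2002, 2003], 'month': [1, 2, 3]})

# to_datetime can assemble dates from year/month/day columns.
year_month['date'] = pd.to_datetime(year_month[['year', 'month']].assign(day=1))

ts = year_month['date'][0]                    # pandas Timestamp
np_value = year_month['date'].to_numpy()[0]   # numpy.datetime64

# Convert one side so both operands have the same type.
same_as_timestamp = pd.Timestamp(np_value) == ts
same_as_datetime64 = ts.to_datetime64() == np_value

print(type(ts), type(np_value))
print(same_as_timestamp, same_as_datetime64)
```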
