Pandas Dataframe convert string to date without time - python

I have a Pandas Dataframe df:
a date
1 2014-06-29 00:00:00
df.dtypes returns:
a object
date object
I want to convert the date column to a date without the time, but:
df['date']=df['date'].astype('datetime64[s]')
returns:
a date
1 2014-06-28 22:00:00
df.dtypes returns:
a object
date datetime64[ns]
But the value is wrong.
I'd like to have:
a date
1 2014-06-29
or:
a date
1 2014-06-29 00:00:00

I would start by converting your dates with pd.to_datetime:
df['date'] = pd.to_datetime(df.date)
Now, you can see that the time component is still there:
df.date.values
array(['2014-06-28T19:00:00.000000000-0500'], dtype='datetime64[ns]')
If you are OK with ending up with strings (object dtype) again, you want:
df['date'] = [x.strftime("%Y-%m-%d") for x in df.date]
This version ends with datetime.date objects instead:
df['date'] = [x.date() for x in df.date]
df.date
datetime.date(2014, 6, 29)

Here you go. Just use this pattern:
pd.to_datetime(df['date']).dt.date
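To compare the options discussed in these answers side by side, here is a minimal sketch. It assumes the question's single-row frame, and the three new column names (date_midnight, date_only, date_str) are made up for illustration.

```python
import pandas as pd

# Rebuild the frame from the question (values assumed from the post).
df = pd.DataFrame({'a': ['1'], 'date': ['2014-06-29 00:00:00']})

# Parse the strings; the column becomes a datetime64 dtype.
df['date'] = pd.to_datetime(df['date'])

# Option 1: keep a datetime64 column, with the time set to midnight.
df['date_midnight'] = df['date'].dt.normalize()

# Option 2: plain datetime.date objects (the column dtype becomes object).
df['date_only'] = df['date'].dt.date

# Option 3: formatted strings (also object dtype).
df['date_str'] = df['date'].dt.strftime('%Y-%m-%d')

print(df.dtypes)
```

Option 1 is usually the most convenient if you still need date arithmetic or filtering, because the column stays a datetime dtype.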

Related

Convert dates to %y-%m-%d format in Python

I have a CSV file with a 'date' column that has dates in many different formats, such as ddmmyy, mmddyy, and yymmdd. I want to convert all the dates to y-m-d format.
df = pd.read_csv(file)
df = df['date'].dt.strftime('%y-%m-%d')
This code gives the error: "Can only use .dt accessor with datetimelike values"
You can use pd.to_datetime:
>>> import pandas as pd
>>>
>>> df = pd.DataFrame(['1/2/2020','12/31/2020','20-Jun-20'],columns=['Date'])
>>> df
Date
0 1/2/2020
1 12/31/2020
2 20-Jun-20
>>>
>>> df['Date'] = pd.to_datetime(df['Date'])
>>> df
Date
0 2020-01-02
1 2020-12-31
2 2020-06-20
>>>
>>> df['Date'] = pd.to_datetime(df['Date']).dt.strftime('%y-%m-%d')
>>>
>>> df
Date
0 20-01-02
1 20-12-31
2 20-06-20
>>>
Step 0:
Load your DataFrame:
df = pd.read_csv('your file name.csv')
Step 1:
First, convert your 'date' column to datetime with the to_datetime() method:
df['date'] = pd.to_datetime(df['date'])
Step 2:
If you want them back as strings, use:
df['date'] = df['date'].astype(str)
Now print df (or just evaluate df if you are using a Jupyter notebook).
Output:
0 2020-01-01
1 2020-12-31
2 2020-06-20
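When a column mixes several formats, automatic inference may not handle every row, and its behaviour differs between pandas versions. One alternative is to try a list of explicit formats in turn. The sketch below assumes the two formats from the sample data above; the list itself is an assumption that you would adapt to your file. Truly ambiguous inputs such as ddmmyy versus mmddyy cannot be told apart this way.

```python
import pandas as pd

raw = pd.Series(['1/2/2020', '12/31/2020', '20-Jun-20'])

# Candidate formats, tried in order (assumed for this sample data).
formats = ['%m/%d/%Y', '%d-%b-%y']

# errors='coerce' turns rows that do not match the format into NaT.
parsed = pd.to_datetime(raw, format=formats[0], errors='coerce')
for fmt in formats[1:]:
    # Fill only the rows that are still unparsed.
    parsed = parsed.fillna(pd.to_datetime(raw, format=fmt, errors='coerce'))

print(parsed.dt.strftime('%y-%m-%d'))
```

Rows that match none of the formats stay NaT, which makes them easy to find with parsed.isna().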

Pandas DateTime for Month

I have a month column with values formatted as: 2019M01
To find the seasonality I need this converted to the Pandas DateTime format.
How can I convert 2019M01 into a datetime so that I can use it for my seasonality plotting?
Thanks.
Use to_datetime with the format parameter:
print (df)
date
0 2019M01
1 2019M03
2 2019M04
df['date'] = pd.to_datetime(df['date'], format='%YM%m')
print (df)
date
0 2019-01-01
1 2019-03-01
2 2019-04-01
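For seasonality work it can help to pull the month number out of the parsed column, or to keep the values as monthly periods instead of first-of-month timestamps. A small sketch, where the month and period column names are made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({'date': ['2019M01', '2019M03', '2019M04']})

# 'M' in the format string is a literal character between year and month.
df['date'] = pd.to_datetime(df['date'], format='%YM%m')

# Month number (1-12), handy for grouping by season.
df['month'] = df['date'].dt.month

# Monthly Period values, which carry no day component.
df['period'] = df['date'].dt.to_period('M')

print(df)
```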

convert pandas._libs.tslibs.timestamps.Timestamp to datetime

I want to convert this Timestamp object to a datetime. The object was obtained after using asfreq on a DataFrame; it is the last index value:
Timestamp('2018-12-01 00:00:00', freq='MS')
<class 'pandas._libs.tslibs.timestamps.Timestamp'>
Wanted output:
2018-12-01
Do you want this?
import pandas as pd
# freq='MS' is left out: it is not needed for the conversion, and newer pandas versions no longer accept it
ts = pd.Timestamp('2018-12-01 00:00:00')
date_time = ts.to_pydatetime()
And if you just want a string then you can do this:
print(str(ts).split()[0])
out:
2018-12-01
You should be able to floor the timestamp to the date part (or any other part), which in this example gets rid of the hour-minute-second level detail.
df = pd.DataFrame({'ts': [pd.Timestamp('2019-01-01 00:10:10')]})
df.ts.dt.floor('d')
0 2019-01-01
Name: ts, dtype: datetime64[ns]
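For a single Timestamp rather than a column, the same results are available through methods on the object itself. A brief sketch (the variable names are illustrative only):

```python
import pandas as pd

ts = pd.Timestamp('2018-12-01 00:00:00')

as_datetime = ts.to_pydatetime()     # standard-library datetime.datetime
as_date = ts.date()                  # standard-library datetime.date
as_text = ts.strftime('%Y-%m-%d')    # plain string

print(as_datetime, as_date, as_text)
```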

How to convert pandas column to date when column is something like "Jan-18"?

What is an efficient way to convert column values into dates in "DD-MM-YYYY" format when the values are given like "Feb-15", which needs to become "01-02-2015"? If it's "Dec-46", it must return "01-12-1946".
You can pass the format '%b-%y' to to_datetime:
In[42]:
df = pd.DataFrame({'date':["Feb-15","Dec-46"]})
df['new_date'] = pd.to_datetime(df['date'], format='%b-%y')
df
Out[42]:
date new_date
0 Feb-15 2015-02-01
1 Dec-46 2046-12-01
Note that the new dtype is datetime64, so you cannot control the display output. If you insist on DD-MM-YYYY, you have to convert to strings using dt.strftime:
In[43]:
df['str_date'] = df['new_date'].dt.strftime('%d-%m-%Y')
df
Out[43]:
date new_date str_date
0 Feb-15 2015-02-01 01-02-2015
1 Dec-46 2046-12-01 01-12-2046
But then you have strings, which are not that useful if you need to perform arithmetic operations or filtering.
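EDIT
"Dec-46" comes out as 2046 rather than 1946 because %y follows the usual two-digit-year convention: 00-68 map to 2000-2068 and 69-99 map to 1969-1999. This is not a storage limit. datetime64[ns] can represent dates from roughly 1677 to 2262, so 1946 is a valid value once the century is corrected.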
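If the two-digit years are known to lie in the past, one possible way to get 1946 instead of 2046 is to shift every date that was parsed beyond some cutoff back by 100 years. This is only a sketch: the cutoff below is a hypothetical value chosen for this sample, and you would pick one that suits your data.

```python
import pandas as pd

df = pd.DataFrame({'date': ["Feb-15", "Dec-46"]})
parsed = pd.to_datetime(df['date'], format='%b-%y')

# Hypothetical cutoff: anything parsed later than this is assumed
# to belong to the previous century.
cutoff = pd.Timestamp('2025-01-01')
in_future = parsed > cutoff

# Keep dates up to the cutoff, move the rest back by 100 years.
df['new_date'] = parsed.where(~in_future, parsed - pd.DateOffset(years=100))
df['str_date'] = df['new_date'].dt.strftime('%d-%m-%Y')

print(df)
```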

H2O Python - How to convert an H2OFrame back to a DataFrame with correct characters and datetimes

I have a CSV file and want to use H2O for deep learning. It contains some Chinese text and datetime columns, and when I finish the deep learning and need to save the output to CSV, I can't get the original data back.
I use a small dataset to show my problem here.
In[1]: df = pd.DataFrame({'datetime':['2016-12-17 00:00:00'],'time':['00:00:30'],'month':['月'], 'weekend':['周六']})
print(df.dtypes)
df
out[1]: datetime object
time object
month object
weekend object
dtype: object
datetime time month weekend
0 2016-12-17 00:00:00 00:00:30 月 周六
In[2]: h2o_frame = h2o.H2OFrame(df);h2o_frame ;h2o_frame.types ;h2o_frame
C:\Users\thi\Anaconda3\lib\site-packages\h2o\utils\shared_utils.py:170: FutureWarning: Method .as_matrix will be removed in a future version. Use .values instead.
data = _handle_python_lists(python_obj.as_matrix().tolist(), -1)[1]
out[2]: Parse progress: |█████████████████████████████████████████████████████████| 100%
datetime time month weekend
2016-12-17 00:00:00 1970-01-01 00:00:30 <0xA4EB> <0xA9>P<0xA4BB>
For time, I want just 00:00:30. Is there any way to fix it?
For month and weekend, I can't find any way to make it show the Chinese characters, but I can still finish my deep learning.
But when I convert the H2OFrame back to a DataFrame and save it to a CSV file, it saves <0xA4EB> instead of 月, and datetime changes to int.
In[3]: dff = h2o_frame.as_data_frame();dff
out[3]: datetime time month weekend
0 1481932800000 30000 <0xA4EB> <0xA9>P<0xA4BB>
How do I correctly return characters from an H2OFrame to a DataFrame?
How do I correctly return datetimes from an H2OFrame to a DataFrame?
One simple way to solve this is to pass the column_types argument when you convert the pandas frame to an H2OFrame, as below:
In [69]: col_types
Out[69]: ['categorical', 'categorical', 'categorical', 'categorical']
In [70]: h2o_frame = h2o.H2OFrame(df,column_types=col_types);h2o_frame ;h2o_frame.types ;h2o_frame
Parse progress: |█████████████████████████████████████████████████████████████████████████████| 100%
Out[70]:
datetime month time weekend
------------------- ------- -------- ---------
2016-12-17 00:00:00 月 00:00:30 周六
[1 row x 4 columns]
In [71]: dff = h2o_frame.as_data_frame();dff
Out[71]:
datetime month time weekend
0 2016-12-17 00:00:00 月 00:00:30 周六
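If the datetime column comes back from the H2OFrame as integer milliseconds, you can convert it with pd.to_datetime:
allfiles = h2o.import_file(path='data/', pattern=".csv")
df = allfiles.as_data_frame()
df['datetime'] = pd.to_datetime(df["datetime"], unit='ms')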
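The integer values shown in the question (1481932800000 and 30000) are milliseconds, so they can be restored on the pandas side without H2O. A minimal sketch using only pandas, with the frame rebuilt by hand from the output shown in the question:

```python
import pandas as pd

# Values as they came back from as_data_frame() in the question.
dff = pd.DataFrame({'datetime': [1481932800000], 'time': [30000]})

# Milliseconds since the epoch -> datetime64.
dff['datetime'] = pd.to_datetime(dff['datetime'], unit='ms')

# Milliseconds since midnight -> 'HH:MM:SS' string.
dff['time'] = pd.to_datetime(dff['time'], unit='ms').dt.strftime('%H:%M:%S')

print(dff)
```

This does not address the garbled Chinese characters, which need to be handled when the frame is created.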
