Convert string with to_datetime when day and month are in the wrong place - python

I have a dataset like this
df = pd.DataFrame({'time': ('08.02.2020', '21.02.2020', '2020.05.04')})
df
I do
pd.to_datetime(df['time'])
0 2020-08-02
1 2020-02-21
2 2020-05-04
Name: time, dtype: datetime64[ns]
But the first row must be
0 2020-02-08
If I do
pd.to_datetime(df['time']).dt.strftime('%d-%m-%Y')
0 02-08-2020
1 21-02-2020
2 04-05-2020
Name: time, dtype: object
Again, I get 02-08-2020 instead of 08-02-2020.
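One possible approach, sketched under the assumption that every value is written either as day.month.year or as year.month.day: parse the column once per explicit format with errors='coerce', then fill the gaps left by the first pass with the second. The column name parsed is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'time': ['08.02.2020', '21.02.2020', '2020.05.04']})

# Assumption: each value is either day.month.year or year.month.day.
# Values that do not match the given format become NaT.
day_first = pd.to_datetime(df['time'], format='%d.%m.%Y', errors='coerce')
year_first = pd.to_datetime(df['time'], format='%Y.%m.%d', errors='coerce')

# Take the day-first result where it parsed, otherwise the year-first result.
df['parsed'] = day_first.fillna(year_first)
print(df['parsed'].dt.strftime('%d-%m-%Y'))
```

Giving explicit formats avoids the per-value guessing that turned 08.02.2020 into August 2nd.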

Related

Convert day-time column to integer

I've looked everywhere for a solution to this issue but nothing seems to work.
I have a column in my dataframe df_jan
459984 0
451375 0
660585 0
722735 78 days 00:00:00
448295 0
...
585781 4 days 00:00:00
612351 22 days 00:00:00
631985 16 days 00:00:00
462341 0
450073 0
Name: delta_sale, Length: 12978, dtype: object
I want to change it so that it is simply the integer value of days.
I've tried the following:
pd.to_datetime()
df_jan['delta_sale'] / np.timedelta64(1, 'D')
.astype(int)
However, none of them have worked, and I'm struggling to find any other questions about the same issue. All I'm trying to achieve is this:
459984 0
451375 0
660585 0
722735 78
448295 0
...
585781 4
612351 22
631985 16
462341 0
450073 0
Name: delta_sale, Length: 12978, dtype: int
Any help would be greatly appreciated.
You can use .apply() with a short lambda function, lambda x: x.day, which returns the day of the month of each Timestamp:
import pandas as pd
df = pd.DataFrame({'date': [pd.Timestamp.now(), pd.Timestamp.now()]})
df['date'].apply(lambda x: x.day)
This yields (because today is the 16th)
0 16
1 16
Name: date, dtype: int64
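Note that x.day reads the day of the month from a Timestamp. For a column that mixes integer 0 with Timedelta values, as the question's output suggests, a minimal sketch could use pd.to_timedelta and the .dt.days accessor instead. The sample data below is made up for illustration, and it assumes the non-zero entries are Timedelta objects rather than strings.

```python
import pandas as pd

# Hypothetical sample mirroring the question: integer 0 mixed with Timedelta objects.
df_jan = pd.DataFrame({
    'delta_sale': pd.Series(
        [0, pd.Timedelta(days=78), 0, pd.Timedelta(days=4)],
        index=[459984, 722735, 448295, 585781],
        dtype=object,
    )
})

# Integers are read as nanoseconds by default, so 0 becomes a zero-length timedelta.
days = pd.to_timedelta(df_jan['delta_sale']).dt.days
print(days)
```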

Converting large series from datetime64[ns] to ISO format with timezone (store as object type)

I have a large pandas Series of timestamps that I need to convert to ISO format with a timezone offset (+02:00).
For example:
1 NaT
2 NaT
3 2019-06-20 11:35:11
4 2020-09-30 12:57:26
...
9999999 2021-07-17 20:58:01
Name: timestampvalues, Length: 9999999, dtype: datetime64[ns]
Required result as:
1 None
2 None
3 2019-06-20T11:35:11.000000+02:00
4 2020-09-30T12:57:26.000000+02:00
...
9999999 2021-07-17T20:58:01.000000+02:00
Name: timestampvalues, Length: 9999999, dtype: object
I have tried it with
df['timestampvalues'].dt.strftime('%Y-%m-%dT%H:%M:%S.%f')
But it does not give the desired output.
You can use pd.Timestamp.isoformat:
import pandas as pd
df = pd.DataFrame(['2019-06-20 11:35:11','2020-09-30 12:57:26'], columns=['timestampvalues'])
df['timestampvalues'] = pd.to_datetime(df['timestampvalues'])
df['timestampvalues'] = df['timestampvalues'].apply(lambda x: pd.Timestamp.isoformat(x.tz_localize('CET')))
timestampvalues
0 2019-06-20T11:35:11+02:00
1 2020-09-30T12:57:26+02:00
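The question also asks for microseconds in the output and None for missing values. A hedged sketch of one way to get both: localize the whole Series once, then format each value with isoformat(timespec='microseconds'). The 'CET' zone is taken from the answer above, and the sample Series is made up for illustration.

```python
import pandas as pd

# Hypothetical sample with one missing value (NaT).
s = pd.Series(
    pd.to_datetime([None, '2019-06-20 11:35:11', '2020-09-30 12:57:26']),
    name='timestampvalues',
)

# Attach the timezone to the whole Series at once; NaT stays NaT.
localized = s.dt.tz_localize('CET')

# Format each timestamp; keep missing values as None in an object Series.
iso = pd.Series(
    [None if pd.isna(ts) else ts.isoformat(timespec='microseconds') for ts in localized],
    index=s.index,
    name=s.name,
    dtype=object,
)
print(iso)
```

The formatting step still loops over the values in Python, so on millions of rows it may be slow; localizing once up front at least avoids a per-row tz_localize call.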

Convert int64 column to time column

I have a pandas dataframe with a "time" column, which currently looks like:
ab['ZEIT'].unique()
array([ 0, 165505, 203355, ..., 73139, 75211, 74244], dtype=int64)
How can I convert it to the German time format hh:mm:ss, so that it looks like this:
array([00:00:00, 16:55:05, 20:33:55, ..., 07:31:39, 07:52:11, 07:42:44], dtype=?)
Use pd.to_datetime after converting the values to strings zero-padded to six digits, then use the .dt accessor to get the time:
In [12]: pd.to_datetime(df['ZEIT'].astype(str).str.zfill(6), format='%H%M%S').dt.time
Out[12]:
0 00:00:00
1 16:55:05
2 20:33:55
3 07:31:39
4 07:52:11
5 07:42:44
Name: ZEIT, dtype: object
Details
In [13]: df
Out[13]:
ZEIT
0 0
1 165505
2 203355
3 73139
4 75211
5 74244
In [14]: df['ZEIT'].astype(str).str.zfill(6)
Out[14]:
0 000000
1 165505
2 203355
3 073139
4 075211
5 074244
Name: ZEIT, dtype: object
In [15]: pd.to_datetime(df['ZEIT'].astype(str).str.zfill(6), format='%H%M%S')
Out[15]:
0 1900-01-01 00:00:00
1 1900-01-01 16:55:05
2 1900-01-01 20:33:55
3 1900-01-01 07:31:39
4 1900-01-01 07:52:11
5 1900-01-01 07:42:44
Name: ZEIT, dtype: datetime64[ns]

How to swap day and month in a Series with Python?

I have a column in which there are dates :
df['Date']
Date
0 2020-25-04
1 2020-26-04
2 2020-27-04
3 2020-12-05
4 2020-06-05
Name: Date, Length: 5, dtype: datetime64[ns]
I want to swap the day and the month, so that I get:
df['Date']
Date
0 2020-04-25
1 2020-04-26
2 2020-04-27
3 2020-05-12
4 2020-05-06
Name: Date, Length: 5, dtype: datetime64[ns]
Any help would be appreciated.
import pandas as pd
import numpy as np
df = pd.DataFrame({'Date':[np.datetime64('2020-04-25') ,np.datetime64('2020-04-26')]})
df['Date'] = df['Date'].apply(lambda x: x.strftime('%Y-%m-%d'))
print(df)
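I built the column from np.datetime64 values and applied a lambda that formats each date with strftime. Note that the result is a column of strings, not datetime64 values.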
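If the column actually holds strings written as year-day-month, which is an assumption here because a value such as 2020-25-04 cannot be a valid datetime64, a minimal sketch is to parse them with an explicit format so that day and month land in the right fields:

```python
import pandas as pd

# Assumption: the dates are strings in year-day-month order.
df = pd.DataFrame({'Date': ['2020-25-04', '2020-26-04', '2020-27-04',
                            '2020-12-05', '2020-06-05']})

# %d and %m are swapped relative to ISO order to match the input layout.
df['Date'] = pd.to_datetime(df['Date'], format='%Y-%d-%m')
print(df['Date'])
```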

How can I create a datetime column without 'date' part?

I have a dataframe with a column named 'Time' in the format HH:MM:SS:fffff, like this:
>>> df['Time']
0 09:42:29:75284
1 09:42:29:95584
2 09:42:31:15036
3 09:42:35:15138
4 09:42:35:95491
5 09:42:43:55414
6 09:42:45:35866
7 09:42:46:74638
8 09:42:47:35582
9 09:42:47:74774
10 09:42:48:94582
...
Name: Time, Length: 18924, dtype: object
I want to convert it to a datetime type to make calculations easier. Is it possible, using pandas.to_datetime, to get a datetime without the date part?
You can convert it to timedelta64[ns] dtype:
Source DF:
In [164]: df
Out[164]:
Time
0 09:42:29:75284
1 09:42:29:95584
2 09:42:31:15036
3 09:42:35:15138
4 09:42:35:95491
5 09:42:43:55414
6 09:42:45:35866
7 09:42:46:74638
8 09:42:47:35582
9 09:42:47:74774
10 09:42:48:94582
In [165]: df.dtypes
Out[165]:
Time object # <-------- NOTE!
dtype: object
Converted:
In [166]: df.Time = pd.to_timedelta(df.Time.str.replace(r':(\d+)$', r'.\1', regex=True),
errors='coerce')
In [167]: df
Out[167]:
Time
0 09:42:29.752840
1 09:42:29.955840
2 09:42:31.150360
3 09:42:35.151380
4 09:42:35.954910
5 09:42:43.554140
6 09:42:45.358660
7 09:42:46.746380
8 09:42:47.355820
9 09:42:47.747740
10 09:42:48.945820
In [168]: df.dtypes
Out[168]:
Time timedelta64[ns] # <-------- NOTE!
dtype: object
Please refer to the pandas to_datetime documentation.
import pandas as pd
df = pd.DataFrame({'Time': ['09:42:29:75284','09:42:29:95584','09:42:31:15036']})
df
Out[]:
Time
0 09:42:29:75284
1 09:42:29:95584
2 09:42:31:15036
You can convert this into datetime format by specifying format as follows:
pd.to_datetime(df['Time'], format='%H:%M:%S:%f')
Out[]:
0 1900-01-01 09:42:29.752840
1 1900-01-01 09:42:29.955840
2 1900-01-01 09:42:31.150360
Name: Time, dtype: datetime64[ns]
Note that doing this also adds the placeholder date 1900-01-01.
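If the 1900-01-01 date is unwanted, one option is to drop it after parsing. The sketch below shows two ways, assuming the same Time strings as above: .dt.time yields plain datetime.time objects (object dtype), while subtracting the normalized date yields a timedelta64 column that still supports arithmetic. The names as_time and as_delta are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Time': ['09:42:29:75284', '09:42:29:95584', '09:42:31:15036']})

parsed = pd.to_datetime(df['Time'], format='%H:%M:%S:%f')

# Option 1: plain time objects (object dtype, limited arithmetic support).
as_time = parsed.dt.time

# Option 2: offset from midnight as timedelta64, convenient for calculations.
as_delta = parsed - parsed.dt.normalize()

print(as_time)
print(as_delta)
```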
