My objective is to create the following pandas dataframe (with the 'date_time' column in '%Y-%m-%d %H:%M:%S%z' format):
   batt_no                  date_time
3        4  2019-09-19 20:59:06+00:00
4        5  2019-09-19 23:44:07+00:00
5        6  2019-09-20 00:44:06+00:00
6        7  2019-09-20 01:14:06+00:00
But the constraint is that I don't want to first create a dataframe as follows and then convert the 'date_time' column into the above format.
   batt_no   date_time
3        4  1568926746
4        5  1568936647
5        6  1568940246
6        7  1568942046
I need to directly create it by converting two lists of values into the desired dataframe.
The following is what I've tried but I get an error
(please note: the 'date_time' values are in epoch format, which I need to specify, but I want them converted into the '%Y-%m-%d %H:%M:%S%z' format):
pd.DataFrame({'batt_volt':[4,5,6,7],
'date_time':[1568926746,1568936647,1568940246,1568942046].dt.strftime('%Y-%m-%d %H:%M:%S%z')}, index=[3,4,5,6])
Can anyone please help?
Edit Note: My question is different from the one asked here.
The question there deals with conversion of a single value of pandas datetime to unix timestamp. Mine's different because:
My timestamp values are slightly different from any of the types mentioned there
I don't need to convert any single timestamp value; rather, I need to create a full-fledged dataframe holding values in the desired timestamp format, built from lists in the particular manner I've described in my question.
I've clearly stated how I attempted the process; it needs some modifications to run without error, which is in no way similar to the question asked in the aforementioned link.
Hence, my question is definitely different. I'd request to kindly reopen it.
As suggested, I'm posting the solution from my comment as an answer here.
pd.DataFrame({'batt_volt':[4,5,6,7], 'date_time': pd.to_datetime([1568926746,1568936647,1568940246,1568942046], unit='s', utc=True).strftime('%Y-%m-%d %H:%M:%S%z')}, index=[3,4,5,6])
pd.to_datetime works with a single date or a list of dates, and the input can be in many formats, including epoch ints. The unit='s' keyword ensures that those ints are interpreted as a number of seconds since 1970-01-01, not milliseconds, microseconds, nanoseconds, and so on.
So it is quite easy to use it to build the list of dates directly while creating the DataFrame.
Since a list of strings with a specific format was wanted, we can call .strftime on the result, which is a DatetimeIndex when to_datetime is called with a list. .strftime works on a DatetimeIndex too and is applied to every datetime in it, so we get an Index of strings in the wanted format. (By the way, outside any specific context, I maintain that it is probably preferable to store datetimes and convert to strings only for I/O operations. But I don't know the specific context.)
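As a minimal sketch of the two steps described above, the snippet below converts the epoch values from the question and then formats them. The variable names (epochs, stamps, labels, iso_like) are only for illustration, and the format string is my reading of the layout the question asks for.

```python
import pandas as pd

epochs = [1568926746, 1568936647, 1568940246, 1568942046]

# A list of epoch ints becomes a DatetimeIndex; unit='s' means seconds since 1970-01-01.
stamps = pd.to_datetime(epochs, unit='s', utc=True)
print(type(stamps).__name__)

# .strftime is applied to every element and returns an Index of plain strings.
labels = stamps.strftime('%Y-%m-%d %H:%M:%S%z')
print(labels[0])

# str() on each Timestamp keeps the colon in the UTC offset.
iso_like = [str(ts) for ts in stamps]
print(iso_like[0])
```

One thing to watch: '%z' normally prints the offset without a colon (+0000), so if the +00:00 form from the question matters, the str() variant at the end is probably the closer match.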
The last remaining problem was the timezone, and here there is no perfect solution, because a plain int like the ones we started with does not carry a timezone. By default, to_datetime creates datetimes without a timezone (just like those ints), so they are relative to whatever timezone the user decides they are in.
to_datetime can also create timezone-aware datetimes, but only in UTC, which is done with the keyword argument utc=True.
With that we get timezone-aware datetimes, assuming that the ints we provided were numbers of seconds since 1970-01-01 00:00:00+00:00.
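To illustrate the timezone point, here is a small sketch comparing the default (naive) result with the utc=True result, and storing the datetimes themselves in the DataFrame. The 'Europe/Paris' zone is an arbitrary example of mine, not something from the question; it only shows that a conversion to another zone can happen once UTC-aware values exist.

```python
import pandas as pd

epochs = [1568926746, 1568936647, 1568940246, 1568942046]

naive = pd.to_datetime(epochs, unit='s')             # no timezone attached
aware = pd.to_datetime(epochs, unit='s', utc=True)   # timezone-aware, in UTC

print(naive.tz)
print(aware.tz)

# Keep real datetimes in the column; format to strings only when writing out.
df = pd.DataFrame({'batt_volt': [4, 5, 6, 7], 'date_time': aware},
                  index=[3, 4, 5, 6])

# Converting to another zone is a separate step after the UTC-aware values exist.
paris = df['date_time'].dt.tz_convert('Europe/Paris')
print(paris.iloc[0])
```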
Does anyone know how to change 20130526T150000 to datetime format?
One note: the 'T' is useful. Use pd.to_datetime() directly; the T denotes the ISO format and avoids confusion between locales (some countries put the month first and then the day, others the opposite; ISO goes from most significant to least: year, month, day).
pd.to_datetime("20130526T150000")
Timestamp('2013-05-26 15:00:00')
If you want to be more explicit, specify the format:
pd.to_datetime("20130526T150000", format=...)
However, this might be a duplicate: How to define format when using pandas to_datetime? For best results, if you are converting a whole column, see Convert Pandas Column to DateTime.
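For what it's worth, here is a sketch of the explicit-format variant. The format string '%Y%m%dT%H%M%S' is my own assumption about how to spell out this particular layout; the second sample value in the Series is made up for the example.

```python
import pandas as pd

# '%Y%m%dT%H%M%S' spells out the compact layout: date, a literal 'T', then time.
ts = pd.to_datetime("20130526T150000", format="%Y%m%dT%H%M%S")
print(ts)

# The same format applied to a whole column (a Series) in one call.
col = pd.Series(["20130526T150000", "20130527T081530"])
parsed = pd.to_datetime(col, format="%Y%m%dT%H%M%S")
print(parsed.dtype)
```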
I'm learning python and software development...
I've been scraping data (date/time, interest rate) every minute since July from a website and appending it to a CSV file. Today I went to chart the data using Jupyter notebook, pandas, etc.
I sliced off the 'AM/PM' string characters and used the pandas.to_datetime method on the date/time column to properly format it.
data['date/time'] = data['date/time'].str[0:14].map(pandas.to_datetime)
However, it appears that the date/time data was at first interpreted by python/jupyter/pandas following the ddmmyy convention, but at the start of the new month the interpretation switched to mmddyy. On the 13th of the month it switched back to ddmmyy.
For example:
The CSV file shows the following string value within the respective cell:
31/07/22 23:59PM
01/08/22 00:00AM
...
12/08/22 23:59PM
13/08/22 00:00AM
However, the pandas dataframe, after using the 'to_datetime' method shows:
2022-07-31 23:59:00
2022-01-08 00:00:00
...
2022-12-08 23:59:00
2022-08-13 00:00:00
I've been trying to figure out:
Why this happened?
How can I avoid this moving forward?
How can I fix this so that I may chart/plot the time series data properly?
Update: It looks like the issue occurs while filtering from a larger CSV file into the CSV file I'm working with.
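You should pass dayfirst=True to pandas.to_datetime so that it assumes the day/month/year format. Otherwise, it assumes month/day/year whenever the first number could be a month (1 <= month <= 12) and only falls back to day/month/year when it could not, which is why the interpretation flipped back on the 13th.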
It's going to be like the following:
data['date/time'] = data['date/time'].str[0:14].map(lambda x: pd.to_datetime(x, dayfirst=True))
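An alternative sketch, assuming every row really follows the dd/mm/yy HH:MM layout shown in the question: spelling out the format leaves nothing for pandas to guess and converts the whole column in one call. The format string and the small sample Series are my own illustration.

```python
import pandas as pd

# The four sample strings from the question.
raw = pd.Series([
    "31/07/22 23:59PM",
    "01/08/22 00:00AM",
    "12/08/22 23:59PM",
    "13/08/22 00:00AM",
])

# Drop the trailing AM/PM (the clock is already 24-hour), then parse with an
# explicit day-first format so no row can be read as month/day.
parsed = pd.to_datetime(raw.str[0:14], format="%d/%m/%y %H:%M")
print(parsed)
```

With an explicit format, a row that does not match raises an error instead of being silently reinterpreted, which is usually what you want for a long-running scrape.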
I'd like to convert a column to date. My source data is from an Excel file in which the column is already formatted as a date data type. However, when pandas reads my file, the date columns are read as numbers, e.g. '44249'.
I tried the following code
RPT["Pot"] = RPT["Pot"].apply(lambda x: pd.to_datetime(x, format='%d%m%Y'))
but I got the error time data '44249' does not match format '%d%m%Y' (match).
I also tried this code:
RPT["PLANNED SUBMISSION DATE TO E&P"] = pd.to_datetime(RPT["PLANNED SUBMISSION DATE TO E&P"])
but the results were 1970-01-01 00:00:00.000044365, which is inaccurate.
Can anyone please help me?
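A possible approach, sketched under the assumption that the values are Excel serial day numbers from the default 1900 date system: treat them as a count of days from 1899-12-30. The two-row RPT frame below is a made-up stand-in for the real data, and pd.to_numeric is only there in case the column was read as strings.

```python
import pandas as pd

# Hypothetical stand-in for the real sheet; the values are the ones quoted above.
RPT = pd.DataFrame({"Pot": ["44249", "44365"]})

# Make sure the serials are numeric, then count days from Excel's effective origin.
serials = pd.to_numeric(RPT["Pot"])
RPT["Pot"] = pd.to_datetime(serials, unit="D", origin="1899-12-30")
print(RPT["Pot"])
```

The 1970 result in the second attempt comes from to_datetime reading the plain number as nanoseconds since the Unix epoch, which is its default unit for numeric input.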
So I have a Date/Time string in a pandas dataframe that looks like this:
2016-10-13 02:33:40
And I need to cut out the year/month/day completely, and convert the time to minutes. So that time/date above needs to be converted into just:
153
^^2 hours and 33 minutes = 153 minutes
I am basically trying to sift the data by the amount of time in between each entry and converting it all to minutes (since the amount of time that passes will not go past a day per session) seemed to make the most sense to me. But, I am open to any other suggestions!
Thanks for the help
Let's say the date and time are in a column called Datetime.
df["Datetime"] = pd.to_datetime(df["Datetime"])
df["Minutes"] = 60*df["Datetime"].dt.hour +df["Datetime"].dt.minute
Once the Datetime column is in the right format, you can access a lot of properties through .dt. See here for more: https://pandas.pydata.org/pandas-docs/stable/api.html#datetimelike-properties
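Here is a small self-contained sketch of the above, plus one way to get the time between consecutive entries directly, since that seems to be the end goal. The second timestamp and the GapMinutes column name are made up for the example.

```python
import pandas as pd

# First value is from the question; the second is a made-up later entry.
df = pd.DataFrame({"Datetime": ["2016-10-13 02:33:40", "2016-10-13 03:05:10"]})
df["Datetime"] = pd.to_datetime(df["Datetime"])

# Minutes since midnight, ignoring the date and the seconds.
df["Minutes"] = 60 * df["Datetime"].dt.hour + df["Datetime"].dt.minute

# Gap to the previous row in minutes; the first row has no predecessor, so it is NaN.
df["GapMinutes"] = df["Datetime"].diff().dt.total_seconds() / 60
print(df)
```

Taking the difference on the full datetimes also keeps working if a session happens to cross midnight, which minutes-since-midnight alone would not.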