I have a dataframe with dates in string format. I convert those dates to timestamps so that I can use the date column later in the code. Calculations, comparisons, etc. all work fine, but I would like the timestamp to appear in %d.%m.%Y format instead of the default %Y-%m-%d. Let me illustrate:
dt=pd.DataFrame({'date':['09.12.1998','07.04.2014']},index=[1,2])
dt
Out[4]:
         date
1  09.12.1998
2  07.04.2014
dt['date_1']=pd.to_datetime(dt['date'],format='%d.%m.%Y')
dt
Out[7]:
         date     date_1
1  09.12.1998 1998-12-09
2  07.04.2014 2014-04-07
I would like dt['date_1'] to be displayed in the same format as dt['date']. I don't want to use .strftime() because it converts the data type from timestamp to string.
In a nutshell: how can I get Python to display the timestamp in a format of my choice (months could be shown as APR, MAY, etc.) rather than the default format (like 1998-12-09), while the data type remains a timestamp rather than a string?
It seems Pandas didn't implement this option yet:
https://github.com/pandas-dev/pandas/issues/11501
Looking at https://pandas.pydata.org/pandas-docs/stable/options.html, it looks like you can set display options to achieve some of this, although not all.
display.date_dayfirst When True, prints and parses dates with the day first, e.g. 20/01/2005
display.date_yearfirst When True, prints and parses dates with the year first, e.g. 2005/01/20
So you can have dayfirst, but there is no option for month names.
On a more fundamental level, whenever you're displaying something it is a string, right? I'm not sure why you wouldn't be able to convert it when you're displaying it without having to change the original dataframe.
Your code would be:
pd.set_option("display.date_dayfirst", True)
Except this doesn't actually work:
https://github.com/pandas-dev/pandas/issues/11501
The options have been implemented for parsing, but not for displaying.
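The point above, that anything displayed is a string anyway, can be sketched like this: keep the datetime64 column for calculations and build a separate, display-only copy. This is only a sketch and not a built-in display option. The names `show_dates` and `view` are made up for illustration, and the upper-cased month abbreviation depends on the active locale.

```python
import pandas as pd

dt = pd.DataFrame({"date": ["09.12.1998", "07.04.2014"]}, index=[1, 2])
dt["date_1"] = pd.to_datetime(dt["date"], format="%d.%m.%Y")


def show_dates(frame, column, fmt):
    """Return a copy of `frame` with `column` rendered as text (hypothetical helper)."""
    return frame.assign(**{column: frame[column].dt.strftime(fmt)})


# Display-only copy; `dt` itself is not modified
view = show_dates(dt, "date_1", "%d.%m.%Y")
print(view)

# Month names such as APR: %b gives the locale's abbreviation, upper-cased here
print(show_dates(dt, "date_1", "%d-%b-%Y")["date_1"].str.upper())

# The original column is still a timestamp column
print(dt["date_1"].dtype)
```

The trade-off is that `view` has to be rebuilt whenever `dt` changes, but comparisons and arithmetic keep working on `dt["date_1"]`.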
Hello Stael/Cezar/Droravr, thank you all for your input. I value your time and appreciate your help a lot. Thanks for sharing the link https://github.com/pandas-dev/pandas/issues/11501 as well. I went through it and understood that this ultimately comes down to a display problem, as jreback also explains there. The request to display dates in a chosen format has been marked as an Enhancement, so it will probably be added in a future version.
All I wanted was to have the dates exported as dd-mm-yyyy, and formatting the dates while exporting solves that.
So I solved this by exporting the file with:
dt.to_csv(filename, date_format='%d-%m-%Y',index=False)
date date_1
09.12.1998 09-12-1998
07.04.2014 07-04-2014
Thus, this issue stands SOLVED.
Once again, thank you all for your kind help and the precious hours you spent with this issue. Deeply appreciated.
I downloaded a .csv file for practice. It has a column named "year_month" that holds strings in the format "YYYY-MM".
By doing:
df = pd.read_csv('C:/..../migration_flows.csv',parse_dates=["year_month"])
"year_month" is Dtype=object. So far so good.
By doing:
df["year_month"] = pd.to_datetime(df["year_month"],format='%Y-%m-%d')
it is converted to datetime64[ns]. So far so good.
I try to filter certain dates by doing:
filtered_df = df.loc[(df["year_month"]>= pd.Timestamp(2018-1-1))]
The program returns the whole column as if nothing happened. For instance, the output still starts from the date "2001-01-01".
Any thoughts on how to filter properly? Many thanks
How about this:
df.loc[(df["year_month"]>= pd.to_datetime('2018-01-01'))]
or
df.loc[(df["year_month"]>= pd.Timestamp('2018-01-01'))]
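Why the original filter kept every row: without quotes, 2018-1-1 is integer subtraction and evaluates to 2016, and pd.Timestamp reads a bare integer as nanoseconds since the Unix epoch. A small sketch with made-up data (the frame below is an assumption standing in for the asker's file):

```python
import pandas as pd

# 2018-1-1 without quotes is arithmetic: 2018 minus 1 minus 1
print(2018 - 1 - 1)

# A bare integer is read as nanoseconds since 1970-01-01
wrong = pd.Timestamp(2018 - 1 - 1)
right = pd.Timestamp("2018-01-01")
print(wrong.year, right.year)

# Made-up sample data standing in for the "year_month" column
df = pd.DataFrame(
    {"year_month": pd.to_datetime(["2001-01", "2017-12", "2018-01", "2019-06"], format="%Y-%m")}
)

# Every row is later than a moment in 1970, so nothing is filtered out
print(len(df.loc[df["year_month"] >= wrong]))

# Quoting the date gives the intended cutoff
filtered_df = df.loc[df["year_month"] >= right]
print(filtered_df)
```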
Is there a way in Python to find out the date format code of a string?
My input would be, e.g.:
2020-09-11T17:42:33.040Z
In this example, I would like to get:
'%Y-%m-%dT%H:%M:%S.%fZ'
The point is that I have different time formats for different files, so I don't know in advance what my datetime format code will look like.
To process my data I need Unix time, but to calculate that I first need a solution to this problem.
data["time_unix"] = data.time.apply(lambda row: (datetime.datetime.strptime(row, '%Y-%m-%dT%H:%M:%S.%fZ').timestamp()*100))
Thank you for the support!
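One possible approach, sketched under the assumption that the set of formats across the files is small and known: try each candidate with strptime and keep the first one that parses. The candidate list and the names `guess_format` and `to_unix_seconds` are made up for illustration. The conversion also assumes that a trailing Z means the value is in UTC.

```python
from datetime import datetime, timezone

# Candidate formats are an assumption: list the ones your files actually use
CANDIDATE_FORMATS = (
    "%Y-%m-%dT%H:%M:%S.%fZ",
    "%Y-%m-%dT%H:%M:%SZ",
    "%Y-%m-%d %H:%M:%S",
    "%d.%m.%Y %H:%M:%S",
)


def guess_format(text, candidates=CANDIDATE_FORMATS):
    """Return the first candidate format that parses `text`, else None."""
    for fmt in candidates:
        try:
            datetime.strptime(text, fmt)
        except ValueError:
            continue
        return fmt
    return None


def to_unix_seconds(text, fmt):
    """Parse `text` and return Unix seconds, assuming the value is in UTC."""
    parsed = datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
    return parsed.timestamp()


fmt = guess_format("2020-09-11T17:42:33.040Z")
print(fmt)
print(to_unix_seconds("2020-09-11T17:42:33.040Z", fmt))
```

If a string can match more than one candidate, the order of the list decides, so the most specific formats should come first.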
I know that there have been similar questions asked, but they seemed to have to do with the way datetime deals (or doesn't deal) with timezones.
The setup is a little complicated, and probably not relevant to the problem, but I thought it was important to include the code as is, so a little background:
I've got a dictionary of arrays. Each of these arrays represents an "attempt" by the same person, but taking place at different times. Ultimately I'm going to be looking for the earliest of these dates. This may be a bit of a roundabout solution, but I'm converting all of the dates to datetime objects, finding the earliest and then just using that index to pull out the first attempt:
Here's what the code looks like to setup that array of attempt datetimes:
for key in duplicates_set.keys():
    attempt_dates = [datetime.strptime(attempt['Attempt Date'], "%-m-%-d-%y %-H:%M:%S") for attempt in duplicates_set[key]]
Here's the format of what one of the original date strings looks like:
12-5-2016 3:27:58 PM
What I'm getting back is:
ValueError: '-' is a bad directive in format '%-m-%d-%y %-H:%M:%S'
I assume that's referring to the dashes placed before the 'm', 'd' and 'H' because they're non-zero-padded decimals. Why is it telling me that?
%-* -- to skip padding -- is a GNU libc extension. It's not part of POSIX strftime, and thus not guaranteed to be portable to systems where your time-formatting calls aren't eventually backed by GNU's strftime C library function.
The Python datetime module documentation explicitly specifies the format codes it supports, and this extension is not among them. strftime() hands the format to the platform's C library, so %-m may happen to work for output on GNU systems, but strptime() does its own parsing in Python and rejects it everywhere.
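For the date string in the question, the padded directives are enough, because strptime accepts values with or without a leading zero. A minimal sketch, assuming month-first order (as in the asker's format), a four-digit year, a 12-hour clock, and an English AM/PM marker:

```python
from datetime import datetime

raw = "12-5-2016 3:27:58 PM"

# %m, %d and %I accept values with or without a leading zero when parsing;
# %Y is the four-digit year and %p the AM/PM marker (locale-dependent)
parsed = datetime.strptime(raw, "%m-%d-%Y %I:%M:%S %p")
print(parsed.isoformat())

# The dashed form is rejected by strptime
try:
    datetime.strptime(raw, "%-m-%-d-%Y %-I:%M:%S %p")
except ValueError as exc:
    print("rejected:", exc)
```

Note that the original format also used %y and %H, which would not match a four-digit year or a 12-hour time with PM.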
I had the same issue with the date 1/9/21.
According to https://strftime.org/, the correct format would have been "%-d/%-m/%y", which gave the bad directive error.
"%d-/%m-/%y" didn't work either.
Weirdly enough, what worked was "%d/%m/%y".
I can't for the life of me figure out how to convert a timestamp of the form 1433140740000+0200 to a datetime object or to any human-readable representation. Also, what format is this specifically? I'm assuming the +0200 represents a timezone.
I can only seem to find questions regarding timestamps without timezones, such as this answer, where int("1433140740000+0200") would give me an error. Any help is appreciated. Thanks!
Edit: As mentioned in a comment, further examination of the API from which I am getting these values reveals other timestamps with different values for what I thought to represent timezones. E.g: 315529200000+0100. The entire line of data looks like this: "ArrivalTime": "/Date(1433051640000+0200)/", and the full response can be found here.
Second edit: As far as I can tell, the timestamps are unix timestamps, but they're given in milliseconds (hence the trailing zeros), and the +0200 indicates timezone UTC+02:00. So for now, I'll just trim out the extra zeros and the timezone, and convert as shown in the linked question, before adding the timezone manually afterwards. The timestamps with +0100 remain a mystery to me, but I've found they're always the same date, 1/1/1980 12:00am. They also have a different identifier: ActualTime, as opposed to ArrivalTime on the others. Anyway, thanks for the help guys!
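You can use string split to remove the timezone:
import datetime

# Drop the "+0200" offset, keeping only the epoch value in milliseconds
intstring = int('1433140740000+0200'.split('+')[0])
# fromtimestamp() expects seconds and returns local time
print(
    datetime.datetime.fromtimestamp(intstring / 1000).strftime('%Y-%m-%d %H:%M:%S')
)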
I had to change it to this to make it work
intstring /1000
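If the offset should be kept rather than discarded, a sketch along these lines may help. It assumes the asker's reading of the format: the number is Unix time in milliseconds and the suffix is a UTC offset in +HHMM or -HHMM form. The function name `parse_ms_timestamp` is made up for illustration.

```python
import re
from datetime import datetime, timedelta, timezone


def parse_ms_timestamp(value):
    """Parse '1433140740000+0200' or '/Date(1433140740000+0200)/' (hypothetical helper).

    Assumes epoch milliseconds followed by a +HHMM or -HHMM UTC offset.
    """
    match = re.search(r"(-?\d+)([+-])(\d{2})(\d{2})", value)
    if match is None:
        raise ValueError(f"unrecognised timestamp: {value!r}")
    millis, sign, hours, minutes = match.groups()
    offset = timedelta(hours=int(hours), minutes=int(minutes))
    if sign == "-":
        offset = -offset
    # Build an aware datetime in the zone given by the offset
    return datetime.fromtimestamp(int(millis) / 1000, tz=timezone(offset))


print(parse_ms_timestamp("1433140740000+0200").isoformat())
print(parse_ms_timestamp("/Date(315529200000+0100)/").isoformat())
```

Under this reading, the +0100 value from the question comes out as midnight on 1 January 1980 at UTC+01:00, which matches what the asker observed.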
I'm quite new to Python and don't know much about it, but I need to make a small script that takes a date entered in any format and converts it into yyyy-mm-dd format.
The script should be able to share elements of the entered date, and identify patterns.
It might be easy and obvious to some, but writing one by myself is over my head.
Thanks in advance!
This is a difficult task to do yourself; you might want to take a look at dateutil which has a rather robust parse() method that you can use to try and parse arbitrarily formatted date strings.
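A minimal sketch of the dateutil suggestion, assuming the third-party python-dateutil package is installed. The wrapper name `to_iso_date` is made up for illustration. The `dayfirst` flag tells the parser how to read ambiguous numeric dates.

```python
from dateutil import parser  # third-party: pip install python-dateutil


def to_iso_date(text, dayfirst=False):
    """Parse a free-form date string and return it as yyyy-mm-dd (hypothetical helper)."""
    return parser.parse(text, dayfirst=dayfirst).strftime("%Y-%m-%d")


print(to_iso_date("7 April 2014"))
print(to_iso_date("09.12.1998", dayfirst=True))
```

The parser still has to guess when a date is ambiguous, so the script needs to know, or ask, which convention the user follows.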
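You can do something like this (not tested):
import locale
import datetime
...
# Use the date format of the user's locale (nl_langinfo is Unix-only)
locale.setlocale(locale.LC_TIME, '')
parsedDate = datetime.datetime.strptime(your_string, locale.nl_langinfo(locale.D_FMT))
# %m is the month; %M would be minutes
print(parsedDate.strftime("%Y-%m-%d"))
This assumes that the user follows their own locale's convention for dates.
You can use strftime for output (your format is "%Y-%m-%d").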
For parsing input there's a corresponding function, strptime. But you won't be able to handle "any format". You have to know what you're getting in the first place. Otherwise you can't tell the difference between, for example, American dates and others. What does 01.02.03 mean, for example? It could be:
yy.mm.dd
dd.mm.yy
mm.dd.yy
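The ambiguity listed above is easy to reproduce: the same string parses without error under all three formats and yields three different dates. A short sketch (the helper name `interpretations` is made up for illustration):

```python
from datetime import datetime

# The three readings listed above, as strptime formats
FORMATS = ("%y.%m.%d", "%d.%m.%y", "%m.%d.%y")


def interpretations(text, formats=FORMATS):
    """Map each format to the ISO date it yields for `text`."""
    return {fmt: datetime.strptime(text, fmt).date().isoformat() for fmt in formats}


for fmt, iso in interpretations("01.02.03").items():
    print(fmt, "->", iso)
```

None of the three calls fails, so the code alone cannot tell which reading was intended.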