I have a date in my dataframe with format like this
"2018-05-01"
"2018-05-02"
"2018-05-03"
I want to convert the date into something like this in my JSON
"2018-05-01T00:00:00.000Z"
"2018-05-02T00:00:00.000Z"
I tried using pd.to_datetime(df['date']),
but the result in my JSON file becomes something like 1523404800000. What should I do to get a format like "2018-05-02T00:00:00.000Z" in my JSON?
You are looking for date_format='iso' in df.to_json()
Full example:
import io
import pandas as pd
csvdata = '''\
date
2018-05-01
2018-05-02
2018-05-03'''
# pd.compat.StringIO is not available in current pandas; io.StringIO does the same job
fileobj = io.StringIO(csvdata)
df = pd.read_csv(fileobj, sep=r'\s+')
# without conversion
print(df.to_json())
# with conversion
df['date'] = pd.to_datetime(df['date'])
print(df.to_json())
# with conversion and adding date_format = iso
print(df.to_json(date_format='iso'))
Prints:
{"date":{"0":"2018-05-01","1":"2018-05-02","2":"2018-05-03"}}
{"date":{"0":1525132800000,"1":1525219200000,"2":1525305600000}}
{"date":{"0":"2018-05-01T00:00:00.000Z","1":"2018-05-02T00:00:00.000Z","2":"2018-05-03T00:00:00.000Z"}}
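If the JSON is built with the standard json module rather than df.to_json(), or if the installed pandas version formats the timestamps differently, one option is to format the column explicitly. This is a minimal sketch, assuming the values are midnight UTC dates with no sub-second part (the ".000Z" suffix is hardcoded, and the names iso_strings and payload are only illustrative):

```python
import json

import pandas as pd

df = pd.DataFrame({"date": ["2018-05-01", "2018-05-02", "2018-05-03"]})
df["date"] = pd.to_datetime(df["date"])

# Format each timestamp as text; ".000Z" is a literal suffix here,
# which is only valid under the assumption that the values are UTC.
iso_strings = df["date"].dt.strftime("%Y-%m-%dT%H:%M:%S.000Z").tolist()

payload = json.dumps({"date": iso_strings})
print(payload)
```

The column becomes plain strings in the output, so this is best done as the last step before serializing.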
Related
I have a date in YYYY-MM-DD format (2022-11-01). I want to convert it to 'YYYYMMDD' format (without hyphens). Please help.
I tried this, but no luck:
df['ConvertedDate']= df['DateOfBirth'].dt.strftime('%m/%d/%Y')
If I understand correctly, the format mask you should be using with strftime is %Y%m%d:
df["ConvertedDate"] = df["DateOfBirth"].dt.strftime('%Y%m%d')
Pandas itself can convert the strings in a DataFrame to datetime and then write them back out in the desired format:
df['ConvertedDate'] = pd.to_datetime(df['DateOfBirth'], format='%Y-%m-%d').dt.strftime('%Y%m%d')
Referenced Example:
import pandas as pd
values = {'DateOfBirth': ['2021-01-14', '2022-11-01', '2022-11-01']}
df = pd.DataFrame(values)
df['ConvertedDate'] = pd.to_datetime(df['DateOfBirth'], format='%Y-%m-%d').dt.strftime('%Y%m%d')
print(df)
Output:
  DateOfBirth ConvertedDate
0  2021-01-14      20210114
1  2022-11-01      20221101
2  2022-11-01      20221101
This works
from datetime import datetime
initial = "2022-11-01"
time = datetime.strptime(initial, "%Y-%m-%d")
print(time.strftime("%Y%m%d"))
I have a yfinance download that is working fine, but I want the Date column to be in YYYY/MM/DD format when I write to disk.
The Date column is the Index, so I first remove the index. Then I have tried using Pandas' "to_datetime" and also ".str.replace" to get the column data to be formatted in YYYY/MM/DD.
Here is the code:
import pandas
import yfinance as yf
StartDate_T = '2021-12-20'
EndDate_T = '2022-05-14'
df = yf.download('CSCO', start=StartDate_T, end=EndDate_T, rounding=True)
df.sort_values(by=['Date'], inplace=True, ascending=False)
df.reset_index(inplace=True) # Make it no longer an Index
df['Date'] = pandas.to_datetime(df['Date'], format="%Y/%m/%d") # Tried this, but it fails
#df['Date'] = df['Date'].str.replace('-', '/') # Tried this also - but error re str
file1 = open('test.txt', 'w')
df.to_csv(file1, index=True)
file1.close()
How can I fix this?
Change the format of the date after resetting the index:
df.reset_index(inplace=True)
df['Date'] = df['Date'].dt.strftime('%Y/%m/%d')
As noted in "Convert datetime to another format without changing dtype", you cannot change the display format and keep the datetime dtype, because of how datetime values are stored internally. So I would use the line above just before writing to the file (which turns the column into strings) and convert it back to datetime afterwards to regain the datetime properties.
df['Date'] = pandas.to_datetime(df['Date'])
You can pass a date format to the to_csv function:
df.to_csv(file1, date_format='%Y/%m/%d')
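A small self-contained sketch of the date_format approach, using a made-up two-row frame in place of the yfinance download (the Close values and the in-memory buffer are assumptions for illustration). It shows that only the written text changes while the column keeps its datetime dtype:

```python
import io

import pandas as pd

# Stand-in data; the real frame would come from yf.download(...)
df = pd.DataFrame({
    "Date": pd.to_datetime(["2021-12-20", "2021-12-21"]),
    "Close": [61.5, 62.0],
})

buffer = io.StringIO()
# date_format only affects how datetime columns are written out
df.to_csv(buffer, index=False, date_format="%Y/%m/%d")
csv_text = buffer.getvalue()
print(csv_text)

# The column in memory is still datetime, not string
print(df["Date"].dtype)
```

Because the DataFrame is untouched, no conversion back to datetime is needed afterwards.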
I want to convert the timestamp column into a readable date column. But when I tried the following code, every date in the output was the same. Can anyone help me with this problem?
import json
import pandas as pd
with open('/Users/Damon/Desktop/percent-utx-os-in-profit.json', 'r') as f:
data = json.load(f)
df = pd.DataFrame(data)
——> what df looks like before
from datetime import date
df["date"] = pd.to_datetime(df.t)
——> what you get and what you want to get
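One likely cause, assuming the t column holds Unix timestamps in seconds (the screenshots are not available, so this is a guess): pd.to_datetime treats plain integers as nanoseconds by default, so every value lands on 1970-01-01. Passing unit="s" is a possible fix. The sample values below are made up:

```python
import pandas as pd

# Hypothetical data shaped like the JSON in the question: "t" = Unix seconds
df = pd.DataFrame({"t": [1525132800, 1525219200], "v": [0.91, 0.93]})

# Default: integers are read as nanoseconds since the epoch,
# so both rows fall on 1970-01-01
naive = pd.to_datetime(df["t"])
print(naive.dt.date.tolist())

# Telling pandas the unit gives the intended dates
df["date"] = pd.to_datetime(df["t"], unit="s")
print(df)
```

If the timestamps are in milliseconds instead, unit="ms" would be the corresponding choice.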
I'm having an issue where the date format does not match. In my .csv file the dates are in %m/%d/%Y form (e.g. 11/3/2001), but the error mentions %Y/%m/%d or %Y/%d/%m. I've tried every permutation of year, month and day and I keep receiving the same error: ValueError: time data '2001-11-03 ' %Y:%m %d %H:%M:%S'. Below is my code. Thanks.
df = pd.read_excel('.xlsx', header=None)
df.to_csv('.csv', header=None, index=False)
df= pd.read_csv('.csv', index_col[5,8,9,12], date_parser=lambda x: datetime.datetime.strptime(x, '%Y/%m/$d %H:%M:%S').strptime('%m/%d/%Y))
Note: What I'm trying to do is convert an .xlsx file to .csv and then remove the trailing 0:00 from multiple columns within the .csv file. Hope this helps.
Use parse from dateutil.parser to parse the dates. It is convenient because it infers the format of each value, although that inference makes it slower than parsing with an explicit format.
from dateutil.parser import parse
df = pd.read_csv('filename.csv', date_parser = parse, index_..)
Or you can use to_datetime, which is native to pandas:
pd.to_datetime(df['Date Col'])
In order to format the date properly, you should use the following:
date_parser=lambda x: parse(x)
#parse from dateutil.parser
df['Date Col'] = df['Date Col'].dt.strftime('%m/%d/%Y')  # .dt is required to call strftime on a datetime column
df.to_csv('New File.csv')
You can use to_datetime since you are using pandas. MoreInfo
import pandas as pd
df = pd.DataFrame({"a": ["11/3/2001", '2001-11-03']})
df["a"] = pd.to_datetime(df["a"])
print(df["a"])
Output:
0 2001-11-03
1 2001-11-03
Name: a, dtype: datetime64[ns]
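When every value in the column shares one known layout, passing an explicit format avoids relying on format inference. A minimal sketch, assuming all dates look like 11/3/2001 as in the question's CSV (the second sample value and the name formatted are made up):

```python
import pandas as pd

df = pd.DataFrame({"a": ["11/3/2001", "12/25/2001"]})

# Parse with the exact layout of the input strings
df["a"] = pd.to_datetime(df["a"], format="%m/%d/%Y")

# Write the dates back out as text without any time component
formatted = df["a"].dt.strftime("%m/%d/%Y")
print(formatted.tolist())
```

Note that strftime zero-pads the day, so 11/3/2001 comes back as 11/03/2001.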
I am reading from an Excel sheet. The header is a date in Month-Year format and I want to keep it that way. But when pandas reads it, the format changes to "2014-01-01 00:00:00". I wrote the following piece of code to fix it, but it doesn't work.
import pandas as pd
import numpy as np
import datetime
from datetime import date
import time
file_loc = "path.xlsx"
df = pd.read_excel(file_loc, index_col=None, na_values=['NA'], parse_cols = 37)
df.columns=pd.to_datetime(df.columns, format='%b-%y')
Which didn't do anything. On another try, I did the following:
df.columns = datetime.datetime.strptime(df.columns, '%Y-%m-%d %H:%M:%S').strftime('%b-%y')
Which raises a "must be str, not datetime.datetime" error. I don't know how to make it go through the header cell by cell and read the strings!
Here is a sample data:
NaT 11/14/2015 00:00:00 12/15/2015 00:00:00 1/15/2016 00:00:00
A 5 1 6
B 6 3 3
My main problem with this is that it does not recognize it as the header, e.g., df['11/14/2015 00:00:00'] returns a KeyError.
Any help is appreciated.
UPDATE: Here is a photo to illustrate what I keep getting! Box 6 is the implementation of apply, and box 7 is what my data looks like.
import datetime
df = pd.DataFrame({'data': ["11/14/2015 00:00:00", "11/14/2015 00:10:00", "11/14/2015 00:20:00"]})
df["data"].apply(lambda x: datetime.datetime.strptime(x, '%m/%d/%Y %H:%M:%S').strftime('%b-%y'))
EDIT
If you'd like to work with df.columns, you could use the map function:
df.columns = list(map(lambda x: datetime.datetime.strptime(x, '%m/%d/%Y %H:%M:%S').strftime('%b-%y'), df.columns))
You need list() if you are using Python 3.x, because map returns an iterator there rather than a list.
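If the header cells were already parsed as datetimes rather than strings (which the "must be str, not datetime.datetime" error suggests, though that is an assumption), strptime is not needed at all and the columns can be formatted directly. A minimal sketch with made-up data mirroring the sample above:

```python
import pandas as pd

# Stand-in for the sheet: column labels are datetimes, as read_excel may produce
df = pd.DataFrame(
    [[5, 1, 6], [6, 3, 3]],
    index=["A", "B"],
    columns=pd.to_datetime(["2015-11-14", "2015-12-15", "2016-01-15"]),
)

# DatetimeIndex.strftime returns an Index of strings
# (%b gives the abbreviated month name of the current locale)
df.columns = df.columns.strftime("%b-%y")
print(df)

# The string labels can now be used for lookups
print(df["Nov-15"])
```

With an English locale the labels come out as Nov-15, Dec-15 and Jan-16.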
The problem might be that the data in Excel isn't stored in the string format you think it is. Perhaps it is stored as a number, and just displayed as a date string in Excel.
Excel stores dates internally as serial numbers (days counted from its epoch), not as text.
Check the actual values you see in the df.
What does this show?
from pprint import pprint
pprint(df)
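If the inspection shows plain numbers such as 42322, they may be Excel serial dates. A hedged sketch of converting them, assuming the workbook uses Excel's default 1900 date system (the serial values and the column name raw are made up for illustration):

```python
import pandas as pd

# Hypothetical serial numbers as they might appear if Excel dates were read as numbers
df = pd.DataFrame({"raw": [42322, 42353, 42384]})

# In the 1900 date system, serial numbers count days from 1899-12-30
df["date"] = pd.to_datetime(df["raw"], unit="D", origin="1899-12-30")
print(df)
```

This applies only to numeric cells. If pprint shows Timestamp objects, the values are already dates and need no conversion.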