I want to convert the timestamp column into a readable date column. But when I tried the following code, every value in the date output was the same. Can anyone help me with this problem?
import json
import pandas as pd
with open('/Users/Damon/Desktop/percent-utx-os-in-profit.json', 'r') as f:
    data = json.load(f)
df = pd.DataFrame(data)
(Image: what df looks like before)
from datetime import date
df["date"] = pd.to_datetime(df.t)
(Image: what you get and what you want to get)
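One likely cause, assuming the "t" column holds Unix epoch seconds (the source data is not shown, so this is a guess): without a unit, pd.to_datetime reads integers as nanoseconds since 1970-01-01, so every row lands within a few seconds of that date and looks identical. A minimal sketch with hypothetical sample values:

```python
import pandas as pd

# Hypothetical sample in the shape of the question's JSON.
# Assumption: "t" holds Unix epoch seconds.
data = [{"t": 1609459200, "v": 0.91}, {"t": 1609545600, "v": 0.93}]
df = pd.DataFrame(data)

# Tell pandas the integers are seconds, not the default nanoseconds.
df["date"] = pd.to_datetime(df["t"], unit="s")
print(df)
```

If the values turn out to be milliseconds instead, unit="ms" would be the setting to try.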
I have a yfinance download that is working fine, but I want the Date column to be in YYYY/MM/DD format when I write to disk.
The Date column is the Index, so I first remove the index. Then I have tried using Pandas' "to_datetime" and also ".str.replace" to get the column data to be formatted in YYYY/MM/DD.
Here is the code:
import pandas
import yfinance as yf
StartDate_T = '2021-12-20'
EndDate_T = '2022-05-14'
df = yf.download('CSCO', start=StartDate_T, end=EndDate_T, rounding=True)
df.sort_values(by=['Date'], inplace=True, ascending=False)
df.reset_index(inplace=True) # Make it no longer an Index
df['Date'] = pandas.to_datetime(df['Date'], format="%Y/%m/%d") # Tried this, but it fails
#df['Date'] = df['Date'].str.replace('-', '/') # Tried this also - but error re str
file1 = open('test.txt', 'w')
df.to_csv(file1, index=True)
file1.close()
How can I fix this?
Change the format of the date after resetting the index:
df.reset_index(inplace=True)
df['Date'] = df['Date'].dt.strftime('%Y/%m/%d')
As noted in "Convert datetime to another format without changing dtype", you cannot change the display format and keep the datetime dtype, because of how datetime values are stored internally. So I would use the line above just before writing to the file (it converts the column to strings) and convert it back to datetime afterwards to regain the datetime properties:
df['Date'] = pd.to_datetime(df['Date'])
You can pass a date format to the to_csv function:
df.to_csv(file1, date_format='%Y/%m/%d')
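A small self-contained sketch of the date_format approach, using a hypothetical two-row frame in place of the yfinance download and an in-memory buffer in place of test.txt. It shows that only the written text changes while the column keeps its datetime dtype:

```python
import io
import pandas as pd

# Hypothetical stand-in for the downloaded data.
df = pd.DataFrame(
    {
        "Date": pd.to_datetime(["2022-05-13", "2022-05-12"]),
        "Close": [49.5, 48.9],
    }
)

buffer = io.StringIO()  # stands in for the file on disk
df.to_csv(buffer, index=False, date_format="%Y/%m/%d")
csv_text = buffer.getvalue()

print(csv_text)
print(df["Date"].dtype)  # the column itself is still datetime
```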
I'm trying to sort the contents of a CSV file by its timestamps, but it just doesn't seem to work for me. The timestamps look like this:
2021-04-16 12:59:26+02:00
My current code:
from datetime import datetime
import csv
from csv import DictReader
with open('List_32_Data_New.csv', 'r') as read_obj:
    csv_dict_reader = DictReader(read_obj)
    csv_dict_reader = sorted(csv_dict_reader, key = lambda row: datetime.strptime(row['Timestamp'], "%Y-%m-%d %H:%M:%S%z"))
writer = csv.writer(open("Sorted.csv", 'w'))
for row in csv_dict_reader:
    writer.writerow(row)
However it always throws the error:
time data '2021-04-16 12:59:26+02:00' does not match format '%Y-%m-%d %H:%M:%S%z'
I already tried an online compiler, and apparently it works there.
Any help would be much appreciated.
If you use pandas as a library it could be a bit easier (Credits to: MrFuppes).
import pandas as pd
df = pd.read_csv(r"path/your.csv")
df['new_timestamps'] = pd.to_datetime(df['timestamps'], format='%Y-%m-%d %H:%M:%S%z')
df = df.sort_values(['new_timestamps'], ascending=True)
df.to_csv(r'path/your.csv')
If you still have errors you can also try to parse the date like this (Credits to: Zerox):
from datetime import datetime  # needed for datetime.strptime below
from dateutil.parser import parse
df['new_timestamps'] = df['timestamps'].map(lambda x: datetime.strptime((parse(x)).strftime('%Y-%m-%d %H:%M:%S%z'), '%Y-%m-%d %H:%M:%S%z'))
Unsure about the correct datetime format? You can try auto-detection with infer_datetime_format=True (deprecated since pandas 2.0, where format inference became the default behaviour):
df['new_timestamps'] = pd.to_datetime(df['timestamps'], infer_datetime_format=True)
Tested with following sample:
df = pd.DataFrame(['2021-04-15 12:59:26+02:00','2021-04-13 12:59:26+02:00','2021-04-16 12:59:26+02:00'], columns=['timestamps'])
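For the csv-module code in the question, a standard-library sketch is also possible. It assumes Python 3.7 or newer, where datetime.fromisoformat accepts offsets written with a colon (before 3.7, %z in strptime rejected "+02:00", which may explain why the same code worked in an online compiler). The in-memory strings and the "Value" column are hypothetical stand-ins for the real files:

```python
import csv
import io
from datetime import datetime

# Hypothetical stand-in for List_32_Data_New.csv.
source = io.StringIO(
    "Timestamp,Value\n"
    "2021-04-16 12:59:26+02:00,c\n"
    "2021-04-13 12:59:26+02:00,a\n"
    "2021-04-15 12:59:26+02:00,b\n"
)

reader = csv.DictReader(source)
# fromisoformat returns timezone-aware datetimes, which compare correctly.
rows = sorted(reader, key=lambda row: datetime.fromisoformat(row["Timestamp"]))

# DictWriter writes the dict values; csv.writer would write only the keys.
target = io.StringIO()  # stands in for Sorted.csv
writer = csv.DictWriter(target, fieldnames=reader.fieldnames)
writer.writeheader()
writer.writerows(rows)

print(target.getvalue())
```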
I want to convert the date string "1/09/2020" to the string "1-Sep-2020" in Python. I have tried every solution mentioned on Stack Overflow but cannot get it to work. Sometimes I get a ValueError saying the data does not match the format; when I change the format to match, I get a "day is out of range" error instead. Is there a problem in the Excel data, or is my code wrong? Please help me solve this issue.
xlsm_files=['202009 - September - Diamond Plod Day & Night MKY021.xlsm']
import time
import pandas as pd
import numpy as np
import datetime
df=pd.DataFrame()
for fn in xlsm_files:
    all_dfs=pd.read_excel(fn, sheet_name=None, engine='openpyxl')
    list_data = all_dfs.keys()
    all_dfs.pop('Date',None)
    all_dfs.pop('Ops Report',None)
    all_dfs.pop('Fuel Report',None)
    all_dfs.pop('Bit Report',None)
    all_dfs.pop('Plod Example',None)
    all_dfs.pop('Plod Definitions',None)
    all_dfs.pop('Consumables',None)
    df2 = pd.DataFrame(columns=["PlodDate"])
    for ws in list_data:
        df1 = all_dfs[ws]
        new_row = {'PlodDate':df1.iloc[3,3]}
        df2 = df2.append(new_row,ignore_index=True)
    df2['PlodDate']=pd.to_datetime(df2['PlodDate'].astype(str), format="%d/%m/%Y")
    df2['PlodDate']=df2['PlodDate'].apply(lambda x: x.strftime("%d-%b-%Y"))
df2
ValueError: day is out of range for month, or the data does not match the format
Method 1: tried because of the "day is out of range" error
try:
    datetime.datetime.strptime(df2['PlodDate'].astype(str).values[0],"%d/%m/%Y")
except ValueError:
    continue
df2['PlodDate']=pd.to_datetime(df2['PlodDate'].astype(str), format="%d/%m/%Y")
df2['PlodDate']=df2['PlodDate'].apply(lambda x: x.strftime("%d-%b-%Y"))
Excel File Attached
df2['PlodDate']=pd.to_datetime(df2['PlodDate'].astype(str), format="%d/%m/%Y")
date = df2['PlodDate'].split('/')
df2['PlodDate'] = datetime.date(int(date[2]), int(date[1]), int(date[0])).strftime('%d-%b-%Y')
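One possible cause, offered as a guess since the workbook is not available: cells that Excel already stores as dates come back from read_excel as datetime objects, and astype(str) turns them into text like "2020-09-01 00:00:00", which no longer matches "%d/%m/%Y". A hedged sketch that handles both kinds of cell; the sample values and the helper name to_plod_date are hypothetical:

```python
from datetime import datetime

import pandas as pd

# Hypothetical sample: text dates, one real date value, and one bad cell.
raw = pd.Series(["1/09/2020", "15/09/2020", pd.Timestamp("2020-09-30"), "not a date"])

def to_plod_date(value):
    """Return a Timestamp, or NaT when the value cannot be parsed."""
    if isinstance(value, datetime):  # pd.Timestamp is a datetime subclass
        return pd.Timestamp(value)
    return pd.to_datetime(str(value), format="%d/%m/%Y", errors="coerce")

parsed = raw.map(to_plod_date)

# d.day has no leading zero, giving "1-Sep-2020" rather than "01-Sep-2020".
formatted = parsed.map(
    lambda d: "" if pd.isna(d) else f"{d.day}-{d.strftime('%b-%Y')}"
)
print(formatted.tolist())
```

Note that %b follows the current locale, so the month abbreviation is English only under an English or C locale.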
I have a date column in my dataframe with values like this:
"2018-05-01"
"2018-05-02"
"2018-05-03"
I want to convert the date into something like this in my JSON
"2018-05-01T00:00:00.000Z"
"2018-05-02T00:00:00.000Z"
I tried using pd.to_datetime(df['date']),
but the result becomes something like 1523404800000 in my JSON file. What should I do to get a format like "2018-05-02T00:00:00.000Z" in my JSON?
You are looking for date_format='iso' in df.to_json()
Full example:
import io
import pandas as pd
csvdata = '''\
date
2018-05-01
2018-05-02
2018-05-03'''
fileobj = io.StringIO(csvdata)  # pd.compat.StringIO is gone from current pandas
df = pd.read_csv(fileobj, sep=r'\s+')
# without conversion
print(df.to_json())
# with conversion
df['date'] = pd.to_datetime(df['date'])
print(df.to_json())
# with conversion and adding date_format = iso
print(df.to_json(date_format='iso'))
Prints:
{"date":{"0":"2018-05-01","1":"2018-05-02","2":"2018-05-03"}}
{"date":{"0":1525132800000,"1":1525219200000,"2":1525305600000}}
{"date":{"0":"2018-05-01T00:00:00.000Z","1":"2018-05-02T00:00:00.000Z","2":"2018-05-03T00:00:00.000Z"}}
I am making a generic tool that can take any CSV file. I have a CSV file that looks something like this. The first row holds the column names and the second row holds the type of each variable.
sam.csv
Time,M1,M2,M3,CityName
temp,num,num,num,city
20-06-13,19,20,0,aligarh
20-02-13,25,42,7,agra
20-03-13,23,35,4,aligarh
20-03-13,21,32,3,allahabad
20-03-13,17,27,1,aligarh
20-02-13,16,40,5,aligarh
Other CSV file looks like:
Time,M1,M2,M3,CityName
temp,num,num,num,city
20/8/16,789,300,10,new york
12/6/17,464,67,23,delhi
12/6/17,904,98,78,delhi
So the column could be in any date format, or it could be a timestamp. I want to convert it to a "%d-%b-%y" format string (for example "20-May-13") every time and sort the column from the oldest date to the newest. I have been able to find the column whose type is "temp" and tried to convert it to the required format, but all the methods I found require me to specify the original format, which is not possible in my case.
Code:
import csv
import time
from datetime import datetime,date
import pandas as pd
import dateutil
from dateutil.parser import parse
filename = 'sam.csv'
data_date = pd.read_csv(filename)
column_name = data_date.ix[:, data_date.loc[0] == "temp"]
column_work = column_name.iloc[1:]
column_some = column_work.iloc[:,0]
default_date = datetime.combine(date.today(), datetime.min.time()).replace(day=1)
for line in column_some:
    print(parse(line[0], default=default_date).strftime("%d-%b-%y"))
In "sam.csv" the dates are in 2013. My output has the correct format, but all 6 dates come out as 2-Mar-2018.
You can use the dateutil library for converting any date format to your required format.
Ex:
import csv
from dateutil.parser import parse
p = "PATH_TO_YOUR_CSV.csv" #I have used your sample data to test.
with open(p, "r") as infile:
    reader = csv.reader(infile)
    next(reader) #Skip Header
    next(reader) #Skip Header
    for line in reader:
        print(parse(line[0]).strftime("%d-%B-%y")) #Parse Date and convert it to date-month-year
Output:
20-June-13
20-February-13
20-March-13
20-March-13
20-March-13
20-February-13
20-August-16
06-December-17
06-December-17
More info on dateutil
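To also cover the sorting and the abbreviated month the question asked for, here is a hedged sketch built on the same dateutil parser. The helper name normalise_dates is hypothetical, and dayfirst=True is an assumption about the data: with it, "12/6/17" is read as 12 June, whereas the default used above reads it as 6 December. No parser can settle that ambiguity without knowing the source convention.

```python
from dateutil.parser import parse

def normalise_dates(values, dayfirst=True):
    """Parse date strings of unknown format, sort oldest first, format as %d-%b-%y."""
    parsed = [parse(value, dayfirst=dayfirst) for value in values]
    return [d.strftime("%d-%b-%y") for d in sorted(parsed)]

# Date columns taken from the two sample files in the question.
sam = ["20-06-13", "20-02-13", "20-03-13", "20-03-13", "20-03-13", "20-02-13"]
other = ["20/8/16", "12/6/17", "12/6/17"]

print(normalise_dates(sam))
print(normalise_dates(other))
```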