I have a script that collects data, and I am running into this error: TypeError: Timestamp subtraction must have the same timezones or no timezones. I have looked at other posts about this error but could not find a solution that works for me.
How can I get around this error? Once the data is collected I don't manipulate it, and I don't understand why I cannot save this DataFrame to an Excel document. Can anyone help?
import pandas as pd
import numpy as np
import os
import datetime
import pvlib
from pvlib.forecast import GFS, NAM
#directories and filepaths
barnwell_dir = r'D:\Saurabh\Production Forecasting\Machine Learning\Sites\Barnwell'
barnwell_training = r'8760_barnwell.xlsx'
#constants
writer = pd.ExcelWriter('test' + '_PythonExport.xlsx', engine='xlsxwriter')
time_zone = 'Etc/GMT+5'
barnwell_list = [r'8760_barnwell.xlsx', 33.2376, -81.3510]
def get_gfs_processed_data1():
    start = pd.Timestamp(datetime.date.today(), tz=time_zone) #used for testing last week
    end = start + pd.Timedelta(days=6)
    gfs = GFS(resolution='quarter')
    #get processed data for lat/long point
    forecasted_data = gfs.get_processed_data(barnwell_list[1], barnwell_list[2], start, end)
    forecasted_data.to_excel(writer, sheet_name='Sheet1')
get_gfs_processed_data1()
When I run your sample code I get the following warning from XlsxWriter at the end of the stacktrace:
"Excel doesn't support timezones in datetimes. "
TypeError: Excel doesn't support timezones in datetimes.
Set the tzinfo in the datetime/time object to None or use the
'remove_timezone' Workbook() option
I think that is reasonably self-explanatory. To strip the timezones from the timestamps, pass the remove_timezone option as recommended:
writer = pd.ExcelWriter('test' + '_PythonExport.xlsx',
                        engine='xlsxwriter',
                        options={'remove_timezone': True})
When I make this change the sample runs and produces an xlsx file. Note that the remove_timezone option requires XlsxWriter >= 0.9.5. On newer pandas versions (2.0 and later) ExcelWriter no longer accepts the options keyword; there you would pass engine_kwargs={'options': {'remove_timezone': True}} instead.
You can remove the timezone from all your datetime columns like this:
for col in df.select_dtypes(['datetimetz']).columns:
    df[col] = df[col].dt.tz_convert(None)
df.to_excel('test' + '_PythonExport.xlsx')
After that, the DataFrame saves to Excel without any problem.
Note:
To select Pandas datetimetz dtypes, use 'datetimetz' (new in 0.20.0)
or 'datetime64[ns, tz]'
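A minimal sketch of the column-stripping idea above, under a few assumptions: strip_timezones and its keep_local_time flag are hypothetical names, not part of pandas. It shows the difference between tz_localize(None), which keeps the local wall-clock time, and tz_convert(None), which converts to UTC first. It also handles a timezone-aware DatetimeIndex, which select_dtypes does not cover because it only looks at columns.

```python
import pandas as pd


def strip_timezones(df, keep_local_time=True):
    """Return a copy of df with timezone info removed (hypothetical helper)."""
    out = df.copy()
    for col in out.select_dtypes(include=["datetimetz"]).columns:
        if keep_local_time:
            # Drop the timezone but keep the local wall-clock time.
            out[col] = out[col].dt.tz_localize(None)
        else:
            # Convert to UTC first, then drop the timezone.
            out[col] = out[col].dt.tz_convert(None)
    # select_dtypes only inspects columns, so the index is handled separately.
    if isinstance(out.index, pd.DatetimeIndex) and out.index.tz is not None:
        if keep_local_time:
            out.index = out.index.tz_localize(None)
        else:
            out.index = out.index.tz_convert(None)
    return out


df = pd.DataFrame({"when": pd.to_datetime(["2021-04-16 12:59:26+02:00"])})
print(strip_timezones(df))                         # local wall-clock time
print(strip_timezones(df, keep_local_time=False))  # UTC time
```

Which variant is right depends on whether the spreadsheet should show local time or UTC.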
Related
I'm trying to sort the contents of a CSV file by its timestamps, but it just doesn't seem to work for me. The timestamps are given in this format:
2021-04-16 12:59:26+02:00
My current code:
from datetime import datetime
import csv
from csv import DictReader
with open('List_32_Data_New.csv', 'r') as read_obj:
    csv_dict_reader = DictReader(read_obj)
    csv_dict_reader = sorted(csv_dict_reader, key = lambda row: datetime.strptime(row['Timestamp'], "%Y-%m-%d %H:%M:%S%z"))
    writer = csv.writer(open("Sorted.csv", 'w'))
    for row in csv_dict_reader:
        writer.writerow(row)
However it always throws the error:
time data '2021-04-16 12:59:26+02:00' does not match format '%Y-%m-%d %H:%M:%S%z'
I already tried it in an online compiler, and apparently it works there.
Any help would be much appreciated.
If you use pandas as a library it could be a bit easier (Credits to: MrFuppes).
import pandas as pd
df = pd.read_csv(r"path/your.csv")
df['new_timestamps'] = pd.to_datetime(df['timestamps'], format='%Y-%m-%d %H:%M:%S%z')
df = df.sort_values(['new_timestamps'], ascending=True)
df.to_csv(r'path/your.csv')
If you still get errors, you can also try to parse the date like this (Credits to: Zerox):
from datetime import datetime
from dateutil.parser import parse
df['new_timestamps'] = df['timestamps'].map(lambda x: datetime.strptime((parse(x)).strftime('%Y-%m-%d %H:%M:%S%z'), '%Y-%m-%d %H:%M:%S%z'))
Unsure about the correct datetime format? You can try auto-detection with infer_datetime_format=True (deprecated since pandas 2.0, where format inference is the default behaviour):
df['new_timestamps'] = pd.to_datetime(df['timestamps'], infer_datetime_format=True)
Tested with following sample:
df = pd.DataFrame(['2021-04-15 12:59:26+02:00','2021-04-13 12:59:26+02:00','2021-04-16 12:59:26+02:00'], columns=['timestamps'])
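For comparison, here is a standard-library sketch that avoids pandas. It assumes Python 3.7 or later, where datetime.fromisoformat exists and where %z in strptime also accepts offsets with a colon such as +02:00 (older versions reject the colon, which is one likely cause of the "does not match format" error). The sample data and the Value column are made up for illustration, and sort_rows_by_timestamp is a hypothetical helper name.

```python
import csv
import io
from datetime import datetime


def sort_rows_by_timestamp(rows, column="Timestamp"):
    """Sort dict rows by an ISO-style timestamp column (hypothetical helper)."""
    # fromisoformat parses '2021-04-16 12:59:26+02:00' into an aware datetime,
    # so rows with different UTC offsets still compare correctly.
    return sorted(rows, key=lambda row: datetime.fromisoformat(row[column]))


# In-memory sample standing in for the CSV file.
sample = io.StringIO(
    "Timestamp,Value\n"
    "2021-04-16 12:59:26+02:00,a\n"
    "2021-04-13 12:59:26+02:00,b\n"
    "2021-04-16 11:30:00+00:00,c\n"
)
reader = csv.DictReader(sample)
rows = sort_rows_by_timestamp(reader)

# DictWriter writes the values of each row; csv.writer.writerow(row) on a
# dict would write only its keys.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=reader.fieldnames)
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```

To work on real files, replace the StringIO objects with open(..., newline='') handles.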
I want to convert the date string "1/09/2020" to the string "1-Sep-2020" in Python. I have tried every solution I could find on Stack Overflow without success. Sometimes I get a ValueError saying the data does not match the format, and when I adjust the format I get "day is out of range for month" instead. Is there a problem with the Excel data, or is my code wrong? Please help me solve this issue.
xlsm_files=['202009 - September - Diamond Plod Day & Night MKY021.xlsm']
import time
import pandas as pd
import numpy as np
import datetime
df=pd.DataFrame()
for fn in xlsm_files:
    all_dfs=pd.read_excel(fn, sheet_name=None, engine='openpyxl')
    list_data = all_dfs.keys()
    all_dfs.pop('Date',None)
    all_dfs.pop('Ops Report',None)
    all_dfs.pop('Fuel Report',None)
    all_dfs.pop('Bit Report',None)
    all_dfs.pop('Plod Example',None)
    all_dfs.pop('Plod Definitions',None)
    all_dfs.pop('Consumables',None)
    df2 = pd.DataFrame(columns=["PlodDate"])
    for ws in list_data:
        df1 = all_dfs[ws]
        new_row = {'PlodDate':df1.iloc[3,3]}
        df2 = df2.append(new_row,ignore_index=True)
df2['PlodDate']=pd.to_datetime(df2['PlodDate'].astype(str), format="%d/%m/%Y")
df2['PlodDate']=df2['PlodDate'].apply(lambda x: x.strftime("%d-%b-%Y"))
df2
ValueError: day is out of range for month, or the data does not match the format
Method 1: tried because the error says the day is out of range
try:
    datetime.datetime.strptime(df2['PlodDate'].astype(str).values[0],"%d/%m/%Y")
except ValueError:
    continue
df2['PlodDate']=pd.to_datetime(df2['PlodDate'].astype(str), format="%d/%m/%Y")
df2['PlodDate']=df2['PlodDate'].apply(lambda x: x.strftime("%d-%b-%Y"))
Excel File Attached
df2['PlodDate']=pd.to_datetime(df2['PlodDate'].astype(str), format="%d/%m/%Y")
date = df2['PlodDate'].split('/')
df2['PlodDate'] = datetime.date(int(date[2]), int(date[1]), int(date[0])).strftime('%d-%b-%Y')
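A possible way to approach this, sketched with the standard library only. One assumption worth checking is that some cells read from Excel may already be date or datetime objects rather than "d/m/Y" strings, in which case converting them to str and parsing with "%d/%m/%Y" would fail. The helper name to_day_mon_year is hypothetical.

```python
from datetime import date, datetime


def to_day_mon_year(value):
    """Return e.g. '1-Sep-2020' for '1/09/2020' or for a date/datetime object."""
    if isinstance(value, (date, datetime)):
        d = value  # already a date object, no parsing needed
    else:
        # Day-first text such as '1/09/2020'; raises ValueError on other formats.
        d = datetime.strptime(str(value).strip(), "%d/%m/%Y")
    # %d would zero-pad the day ('01'), so the day number is formatted separately.
    # %b gives the abbreviated month name in the current locale.
    return f"{d.day}-{d.strftime('%b-%Y')}"


print(to_day_mon_year("1/09/2020"))
print(to_day_mon_year(datetime(2020, 9, 1)))
```

With a DataFrame this could be applied as df2['PlodDate'] = df2['PlodDate'].map(to_day_mon_year), assuming every cell is either a date object or a day-first string.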
Is it possible to write date values to Excel using pywin32 without the time? Even though the datetime object I create has neither a time nor a timezone associated with it, the value written to Excel still gets an hour component, which seems to be related to UTC. How can I solve this?
import win32com.client
from datetime import datetime
excel = win32com.client.Dispatch('Excel.Application')
excel.Visible = True
wb = excel.Workbooks.Add()
ws = wb.Sheets['Sheet1']
# Writes '01/01/2019 03:00:00' instead of '01/01/2019'
ws.Cells(1, 1).Value = datetime(2019, 1, 1)
If you just want the date with no time of day, you can call datetime.date() to get it. Unfortunately the value must be converted to a string, because win32com.client won't accept a datetime.date object directly.
# Writes '1/1/2019'
ws.Cells(1, 1).Value = str(datetime(2019, 1, 1).date())
Update:
You can work around the cell holding a text entry by assigning an Excel formula to the cell instead. This lets you use the cell more easily with other formulas and with Excel's other capabilities (such as sorting and charting).
# Writes 1/1/2019
ws.Cells(1, 1).Formula = datetime(2019, 1, 1).strftime('=DATE(%Y,%m,%d)')
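The formula string from the update can also be built by a small helper, sketched here without any Excel dependency so it can be checked on its own. The name excel_date_formula is hypothetical; the result is meant to be assigned to a cell's Formula property as in the answer above.

```python
from datetime import date, datetime


def excel_date_formula(d):
    """Build an Excel DATE formula string for a date or datetime (hypothetical helper)."""
    # Only year, month and day are used, so any time-of-day component is dropped.
    return f"=DATE({d.year},{d.month},{d.day})"


print(excel_date_formula(datetime(2019, 1, 1)))
print(excel_date_formula(date(2019, 12, 31)))
```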
First adding the code:
import csv
from datetime import datetime, date, time
with open('T2.csv') as readcsvfile:
    readcsv=csv.reader(readcsvfile)
    header=next(readcsv)
    data=[]
    for row in readcsv:
        # if-else construct to read both empty & time string
        if row[0]==str():
            date=str()
        else:
            date=datetime.strptime(row[0],'%y%m%d')
        # stripping the 170101 part with str(ing)p(arse)time and
        # changing the style/format into '%Y/%m/%d' format with strftime.
        ID=str(row[5])
        if row[6]==str():
            O_all=str()
        else:
            O_all=datetime.strptime(row[6],'%H:%M').strftime('%H:%M')
        Combined_datetime=datetime.combine(date,O_all)
        data.append([Combined_datetime,ID])
print(data)
Yields the error:
Combined_datetime=datetime.combine(date,O_all)
TypeError: combine() argument 2 must be datetime.time, not str
But if I check the types, both "date" and "O_all" appear to be 'datetime.datetime' objects. I guess I'm missing or misunderstanding something. What is the remedy to get a combined value named 'Combined_datetime'?
Update your code like this:
import csv
from datetime import datetime
with open('T2.csv') as readcsvfile:
    readcsv=csv.reader(readcsvfile)
    header=next(readcsv)
    data=[]
    for row in readcsv:
        # an empty cell is read as an empty string; keep it as '' so the check below skips it
        if not row[0]:
            date=''
        else:
            date=datetime.strptime(row[0],'%y%m%d')
        ID=str(row[5])
        if not row[6]:
            O_all=''
        else:
            # .time() turns the parsed datetime into a datetime.time, which combine() requires
            O_all=datetime.strptime(row[6],'%H:%M').time()
        if date and O_all:
            Combined_datetime = datetime.combine(date, O_all).strftime('%y%m%d %H:%M')
            data.append([Combined_datetime,ID])
        else:
            data.append(['',ID])
print(data)
There are more Pythonic ways to write this; the code above is deliberately basic.
The problem was elsewhere. While reading my CSV, I could read every row with a simple if-else construct (an empty time was treated as an empty string). But datetime.combine(d,t) could not handle the empty strings when combining the date and time.
@amarnath Your suggestion to add .time() helped, though. This time I skipped the rows with an empty time, which avoids the TypeError: combine() argument 2 must be datetime.time, not str error.
Here is the complete working code:
#Reading T2 & writing new csv
import csv
from datetime import datetime, date, time
with open('T2.csv') as readcsvfile:
    readcsv=csv.reader(readcsvfile)
    header=next(readcsv)
    data=[]
    for row in readcsv:
        #if-else construct to read both empty & time string
        if row[6]!=str():
            d=datetime.strptime(row[0],'%y%m%d')
            t=datetime.strptime(row[6],'%H:%M').time()
            print(type(d))
            ##stripping the 170101 part with str(ing)p(arse)time and
            ##changing t from datetime.datetime to datetime.time with `.time()`
            print(type(t))
            ID=str(row[5])
            Combined_datetime = datetime.combine(d, t)
            data.append([Combined_datetime,ID])
print(data)
with open('T2_w3.csv','w',newline='') as writecsvfile:
    writecsv=csv.writer(writecsvfile)
    writecsv.writerow(['Combined_datetime','ID'])
    for i in range(len(data)):
        ROW=data[i]
        CDT=ROW[0]
        ID=ROW[1]
        Final_list=[CDT,ID]
        writecsv.writerow(Final_list)
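The parsing and combining step could also be isolated in a small function, which makes the empty-cell case explicit. This is only a sketch: combine_date_time is a hypothetical name, and the '%y%m%d' and '%H:%M' formats are taken from the code above.

```python
from datetime import datetime


def combine_date_time(date_text, time_text):
    """Combine 'yymmdd' and 'HH:MM' strings into a datetime, or return None if either is empty."""
    if not date_text or not time_text:
        return None  # empty CSV cells arrive as empty strings
    d = datetime.strptime(date_text, "%y%m%d").date()
    # strptime always returns a datetime; .time() extracts the datetime.time
    # object that datetime.combine() expects as its second argument.
    t = datetime.strptime(time_text, "%H:%M").time()
    return datetime.combine(d, t)


print(combine_date_time("170101", "08:30"))
print(combine_date_time("170101", ""))
```

Rows for which the function returns None can then be skipped or written with an empty value, depending on what the output file should contain.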
Python newbie here! :)
Basically, I am trying to scan an excel file's column A (which contains all dates) and if the date in the cell is 7 days in the future...do something. Since I am learning, I am just looking at one cell before I progress and start looping through the data.
Here is my current code which isn't working.
import openpyxl, smtplib, datetime, xlrd
from openpyxl import load_workbook
from datetime import datetime
wb = load_workbook(filename = 'FRANKLIN.xlsx')
sheet = wb.get_sheet_by_name('Master')
msg = 'Subject: %s\n%s' % ("Shift Reminder", "Dear a rem ")
cell = sheet['j7'].value
if xlrd.xldate_as_tuple(cell.datemode) == datetime.today.date() + 7:
    print('ok!')
Here is the error code I am getting: 'datetime.datetime' object has no attribute 'datemode'
I've tried searching high and low, but can't quite find the solution.
Your cell variable seems to be a datetime.datetime object, so you can compare it like this:
from datetime import timedelta
if cell.date() == (datetime.now().date() + timedelta(days=7)):
    print("ok")
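For the later step of looping over a whole column, the comparison could be wrapped in a function that also tolerates empty or non-date cells. A minimal sketch, where is_days_ahead and its today parameter are hypothetical names (the parameter exists only to make the check reproducible):

```python
from datetime import date, datetime, timedelta


def is_days_ahead(cell_value, days=7, today=None):
    """Return True if cell_value is a date exactly `days` days after today."""
    if today is None:
        today = date.today()
    if isinstance(cell_value, datetime):
        cell_value = cell_value.date()  # ignore the time of day
    if not isinstance(cell_value, date):
        return False  # empty cells or text are never a match
    return cell_value == today + timedelta(days=days)


# Fixed reference date so the result does not depend on when this runs.
reference = date(2024, 1, 1)
print(is_days_ahead(datetime(2024, 1, 8, 9, 0), today=reference))
print(is_days_ahead(None, today=reference))
```

With openpyxl, a loop such as for cell in sheet['A']: followed by if is_days_ahead(cell.value): should cover column A, assuming the date cells are read back as datetime objects as in the question.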