Comparing Excel dates to current date in Python - python

Python newbie here! :)
Basically, I am trying to scan an excel file's column A (which contains all dates) and if the date in the cell is 7 days in the future...do something. Since I am learning, I am just looking at one cell before I progress and start looping through the data.
Here is my current code which isn't working.
import openpyxl, smtplib, datetime, xlrd
from openpyxl import load_workbook
from datetime import datetime
wb = load_workbook(filename = 'FRANKLIN.xlsx')
sheet = wb.get_sheet_by_name('Master')
msg = 'Subject: %s\n%s' % ("Shift Reminder", "Dear a rem ")
cell = sheet['j7'].value
if xlrd.xldate_as_tuple(cell.datemode) == datetime.today.date() + 7:
    print('ok!')
Here is the error code I am getting: 'datetime.datetime' object has no attribute 'datemode'
I've tried searching high and low, but can't quite find the solution.

Your cell variable appears to be a datetime.datetime object already, so there is no need to convert it with xlrd. You can compare it directly:
from datetime import timedelta
if cell.date() == (datetime.now().date() + timedelta(days=7)):
    print("ok")
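For looping over a whole column later, the same comparison can be wrapped in a small helper. This is only a sketch: the name is_due_in and its today parameter are made up for illustration, and it assumes the cell value is either a datetime/date or something that should simply be skipped.

```python
from datetime import date, datetime, timedelta


def is_due_in(cell_value, days=7, today=None):
    """Return True if cell_value falls exactly `days` days after `today`."""
    if today is None:
        today = date.today()
    # datetime is a subclass of date, so check it first and drop the time part.
    if isinstance(cell_value, datetime):
        cell_value = cell_value.date()
    if not isinstance(cell_value, date):
        return False  # empty cell, text, number, ...
    return cell_value == today + timedelta(days=days)


# A fixed "today" keeps the example reproducible.
fixed_today = date(2024, 1, 1)
print(is_due_in(datetime(2024, 1, 8, 9, 30), today=fixed_today))  # True
print(is_due_in(datetime(2024, 1, 9), today=fixed_today))         # False
print(is_due_in(None, today=fixed_today))                         # False
```

Passing today explicitly is just a convenience for testing; in the real script it can be left out so that date.today() is used.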

Related

Not able to change DateTime Format to a specified format in python

I want to convert the date string "1/09/2020" to the string "1-Sep-2020" in Python. I have tried every solution I could find on Stack Overflow, but none of them works. Sometimes I get a ValueError saying the data does not match the format, and when I change the format to match, I get a "day is out of range" error. Is there a problem with the Excel data, or is my code wrong?
xlsm_files=['202009 - September - Diamond Plod Day & Night MKY021.xlsm']
import time
import pandas as pd
import numpy as np
import datetime
df=pd.DataFrame()
for fn in xlsm_files:
    all_dfs=pd.read_excel(fn, sheet_name=None, engine='openpyxl')
    list_data = all_dfs.keys()
    all_dfs.pop('Date',None)
    all_dfs.pop('Ops Report',None)
    all_dfs.pop('Fuel Report',None)
    all_dfs.pop('Bit Report',None)
    all_dfs.pop('Plod Example',None)
    all_dfs.pop('Plod Definitions',None)
    all_dfs.pop('Consumables',None)
    df2 = pd.DataFrame(columns=["PlodDate"])
    for ws in list_data:
        df1 = all_dfs[ws]
        new_row = {'PlodDate':df1.iloc[3,3]}
        df2 = df2.append(new_row,ignore_index=True)
    df2['PlodDate']=pd.to_datetime(df2['PlodDate'].astype(str), format="%d/%m/%Y")
    df2['PlodDate']=df2['PlodDate'].apply(lambda x: x.strftime("%d-%b-%Y"))
df2
ValueError: day is out of range for month or doesnot match format
Method 1 (tried because the error said the day was out of range):
try:
    datetime.datetime.strptime(df2['PlodDate'].astype(str).values[0],"%d/%m/%Y")
except ValueError:
    continue
df2['PlodDate']=pd.to_datetime(df2['PlodDate'].astype(str), format="%d/%m/%Y")
df2['PlodDate']=df2['PlodDate'].apply(lambda x: x.strftime("%d-%b-%Y"))
df2['PlodDate']=pd.to_datetime(df2['PlodDate'].astype(str), format="%d/%m/%Y")
date = df2['PlodDate'].split('/')
df2['PlodDate'] = datetime.date(int(date[2]), int(date[1]), int(date[0])).strftime('%d-%b-%Y')
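One possible cause, not confirmed by the question, is that some cells are already read as datetime values rather than text, so converting them with astype(str) yields ISO-style text such as "2020-09-01 00:00:00" that cannot match "%d/%m/%Y". The sketch below uses only the standard library and accepts both cases; the function name to_plod_format is hypothetical, and it assumes text dates are day-first. Note that "%d-%b-%Y" zero-pads the day ("01-Sep-2020"), so the day is formatted separately here.

```python
from datetime import date, datetime


def to_plod_format(value):
    """Format a date-like value as e.g. '1-Sep-2020' (day without zero padding)."""
    if isinstance(value, str):
        # Assumption: text dates are day-first, as in "1/09/2020".
        value = datetime.strptime(value.strip(), "%d/%m/%Y")
    elif not isinstance(value, date):
        raise TypeError(f"unsupported value: {value!r}")
    # %b is the abbreviated month name ("Sep" in the default C/English locale).
    return f"{value.day}-{value:%b}-{value.year}"


print(to_plod_format("1/09/2020"))           # text cell
print(to_plod_format(datetime(2020, 9, 1)))  # cell already read as a datetime
```

Because pandas Timestamp objects are datetime subclasses, the same function could be applied to a column with df2['PlodDate'].apply(to_plod_format).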

Editing timeslot for SQL in Python

I have an Excel file that contains hourly values, with the date and time stored as the timeslot.
I want to import this file into a PostgreSQL database, but I need to edit the date and time first; otherwise the import fails with an error.
Editing the date and time in Excel would also be fine, but I think doing it in Python is much easier.
I want to transform the date and time into a timeslot like this:
2018-05-27 00:00:00
I solved it as follows, and it worked. I am posting the solution in case someone else needs it.
import pandas as pd
from xlwt import Workbook
from datetime import datetime
df = pd.read_excel('yourexcell file.xlsx')
date=[]
for i in df.index:
    start_date = df['date'][i] #date column name in excel file
    start_time=df['time'][i] #time column name in excel file
    date_time = datetime.strptime(start_date, "%d.%m.%Y")
    comb=datetime.combine(date_time,start_time)
    date.append(comb)
#date and time combined and ready.
wb = Workbook()
sheet = wb.add_sheet('Sheet 1')
for k in range(len(date)):
    sheet.write(k,0,date[k])
#note: xlwt writes the binary .xls format, whatever the file extension says
wb.save('datetime_generated'+".CSV")
print("csv file saved.")
and here it is:
output
2018-05-27 00:00:00
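If a plain-text CSV is what the database import expects, the same combination can be sketched with the standard library alone. The rows below are hypothetical stand-ins for the Excel 'date' and 'time' columns, and to_timeslot is an illustrative name; the sketch assumes the date arrives as a "dd.mm.yyyy" string and the time as a datetime.time, as in the solution above.

```python
import csv
import io
from datetime import datetime, time

# Hypothetical rows standing in for the Excel 'date' and 'time' columns.
rows = [("27.05.2018", time(0, 0)), ("27.05.2018", time(1, 0))]


def to_timeslot(date_text, time_value):
    """Combine a 'dd.mm.yyyy' string and a datetime.time into one datetime."""
    day = datetime.strptime(date_text, "%d.%m.%Y").date()
    return datetime.combine(day, time_value)


timeslots = [to_timeslot(d, t) for d, t in rows]

# Write real CSV text; an in-memory buffer keeps the example self-contained.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["timeslot"])
for slot in timeslots:
    writer.writerow([slot.strftime("%Y-%m-%d %H:%M:%S")])

print(buffer.getvalue())
```

Replacing the StringIO buffer with open('timeslots.csv', 'w', newline='') would write the same rows to a file.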

TypeError: Timestamp subtraction

I have a script that collects data. I am running into the error TypeError: Timestamp subtraction must have the same timezones or no timezones. I have looked at other posts about this error, but could not find a solution that applies to my case.
How can I get around this error? Once the data is collected I don't manipulate it, so I don't understand why I cannot save this DataFrame to an Excel document. Can anyone help?
import pandas as pd
import numpy as np
import os
import datetime
import pvlib
from pvlib.forecast import GFS, NAM
#directories and filepaths
barnwell_dir = r'D:\Saurabh\Production Forecasting\Machine Learning\Sites\Barnwell'
barnwell_training = r'8760_barnwell.xlsx'
#constants
writer = pd.ExcelWriter('test' + '_PythonExport.xlsx', engine='xlsxwriter')
time_zone = 'Etc/GMT+5'
barnwell_list = [r'8760_barnwell.xlsx', 33.2376, -81.3510]
def get_gfs_processed_data1():
    start = pd.Timestamp(datetime.date.today(), tz=time_zone) #used for testing last week
    end = start + pd.Timedelta(days=6)
    gfs = GFS(resolution='quarter')
    #get processed data for lat/long point
    forecasted_data = gfs.get_processed_data(barnwell_list[1], barnwell_list[2], start, end)
    forecasted_data.to_excel(writer, sheet_name='Sheet1')
get_gfs_processed_data1()
When I run your sample code I get the following warning from XlsxWriter at the end of the stacktrace:
"Excel doesn't support timezones in datetimes. "
TypeError: Excel doesn't support timezones in datetimes.
Set the tzinfo in the datetime/time object to None or use the
'remove_timezone' Workbook() option
I think that is reasonably self-explanatory. To strip the timezones from the timestamps pass the remove_timezone option as recommended:
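writer = pd.ExcelWriter('test' + '_PythonExport.xlsx',
                        engine='xlsxwriter',
                        options={'remove_timezone': True})
When I make this change the sample runs and produces an xlsx file. Note that the remove_timezone option requires XlsxWriter >= 0.9.5. In recent pandas versions, which no longer accept the options keyword, the same setting is passed as engine_kwargs={'options': {'remove_timezone': True}}.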
You can remove the timezone from all your datetime columns like this:
for col in df.select_dtypes(['datetimetz']).columns:
    df[col] = df[col].dt.tz_convert(None)
df.to_excel('test' + '_PythonExport.xlsx')
After that, the Excel file saves without any problem.
Note:
To select pandas datetimetz dtypes, use 'datetimetz' (new in 0.20.0)
or 'datetime64[ns, tz]'
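There are two ways to drop a timezone, and they give different clock times. The standard-library sketch below illustrates the difference on a single value; the sample timestamp is made up, and the fixed UTC-5 offset stands in for the 'Etc/GMT+5' zone from the question. In pandas, dt.tz_localize(None) corresponds to the first option (keep local wall-clock time) and dt.tz_convert(None) to the second (convert to UTC, then drop the timezone).

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC-5 offset, standing in for the 'Etc/GMT+5' zone used in the question.
utc_minus_5 = timezone(timedelta(hours=-5))
aware = datetime(2018, 5, 27, 8, 0, tzinfo=utc_minus_5)

# Option 1: keep the local wall-clock time and just drop the tzinfo.
naive_local = aware.replace(tzinfo=None)

# Option 2: convert to UTC first, then drop the tzinfo.
naive_utc = aware.astimezone(timezone.utc).replace(tzinfo=None)

print(naive_local)  # 2018-05-27 08:00:00
print(naive_utc)    # 2018-05-27 13:00:00
```

Which one is appropriate depends on whether the spreadsheet should show site-local hours or UTC hours.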

How to change str object to datetime.time in python3?

First adding the code:
import csv
from datetime import datetime, date, time
with open('T2.csv') as readcsvfile:
    readcsv=csv.reader(readcsvfile)
    header=next(readcsv)
    data=[]
    for row in readcsv:
        # if-else construct to read both empty & time string
        if row[0]==str():
            date=str()
        else:date=datetime.strptime(row[0],'%y%m%d')
        # stripping the 170101 part with str(ing)p(arse)time and
        # changing the style/format into '%Y/%m/%d' format with strftime.
        ID=str(row[5])
        if row[6]==str():
            O_all=str()
        else:O_all=datetime.strptime(row[6],'%H:%M').strftime('%H:%M')
        Combined_datetime=datetime.combine(date,O_all)
        data.append([Combined_datetime,ID])
print(data)
Yields the error:
Combined_datetime=datetime.combine(date,O_all)
TypeError: combine() argument 2 must be datetime.time, not str
But when I check the types, both "date" and "O_all" appear to be 'datetime.datetime' objects. I guess I am missing or misunderstanding something. How can I get a combined datetime named 'Combined_datetime'?
Update your code like this:
import csv
from datetime import datetime, date, time
with open('T2.csv') as readcsvfile:
    readcsv=csv.reader(readcsvfile)
    header=next(readcsv)
    data=[]
    for row in readcsv:
        # an empty cell is read as an empty string
        if len(row[0])==0:
            date=str()
        else:
            date=datetime.strptime(row[0],'%y%m%d')
        ID=str(row[5])
        if len(row[6])==0:
            O_all=str()
        else:
            # .time() turns the parsed datetime into a datetime.time
            O_all=datetime.strptime(row[6],'%H:%M').time()
        # combine only when both the date and the time are present
        if date and O_all:
            Combined_datetime = datetime.combine(date, O_all).strftime('%y%m%d %H:%M')
            data.append([Combined_datetime,ID])
        else:
            data.append(['',ID])
print(data)
There are more Pythonic ways to write this; the code above is deliberately basic.
The problem was elsewhere. When reading my CSV, a simple if-else construct could handle every row (an empty time was read as an empty string), but datetime.combine(d, t) cannot handle empty strings.
@amarnath Your suggestion to add .time() helped, though. This time I skipped the rows with an empty time, which avoids the TypeError: combine() argument 2 must be datetime.time, not str error.
Here is the complete working code:
#Reading T2 & writing new csv
import csv
from datetime import datetime, date, time
with open('T2.csv') as readcsvfile:
    readcsv=csv.reader(readcsvfile)
    header=next(readcsv)
    data=[]
    for row in readcsv:
        #only process rows whose time field is not empty
        if row[6]!=str():
            d=datetime.strptime(row[0],'%y%m%d')
            t=datetime.strptime(row[6],'%H:%M').time()
            print(type(d))
            ##stripping the 170101 part with str(ing)p(arse)time and
            ##changing t from datetime.datetime to datetime.time with `.time()`
            print(type(t))
            ID=str(row[5])
            Combined_datetime = datetime.combine(d, t)
            data.append([Combined_datetime,ID])
print(data)
with open('T2_w3.csv','w',newline='') as writecsvfile:
    writecsv=csv.writer(writecsvfile)
    writecsv.writerow(['Combined_datetime','ID'])
    for i in range(len(data)):
        ROW=data[i]
        CDT=ROW[0]
        ID=ROW[1]
        Final_list=[CDT,ID]
        writecsv.writerow(Final_list)
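The core of the fix can be seen in isolation: strptime(...).strftime(...) yields a str, whereas strptime(...).time() yields the datetime.time that datetime.combine() requires. The helper below is a minimal sketch; the name parse_row is made up for illustration, and returning None for rows with an empty field is an assumption about how such rows should be treated.

```python
from datetime import datetime


def parse_row(date_text, time_text):
    """Return a combined datetime, or None when either field is empty."""
    if not date_text or not time_text:
        return None
    d = datetime.strptime(date_text, "%y%m%d").date()
    # strptime(...).strftime(...) would give a str; .time() gives datetime.time.
    t = datetime.strptime(time_text, "%H:%M").time()
    return datetime.combine(d, t)


print(parse_row("170101", "08:30"))  # 2017-01-01 08:30:00
print(parse_row("170101", ""))       # None
```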

How to read timestamp from excel list in python

I am quite new to Python and am already struggling with a seemingly easy task: importing the timestamps of a series of measurements from an Excel sheet.
The Excel file has one column for the date and one for the time. I need the data for further calculations such as time differences.
I tried two different ways to get the data. So far my code looks like this:
method with pyexcel
import pyexcel as pe
import datetime
import time
from datetime import time
import timestring
for n in range(len(users)):
    sheet = pe.get_sheet(file_name=users[n],name_columns_by_row=0)
    sheet = sheet.to_array()
    data_meas = np.array(sheet)
    for row in range(len(data_meas)):
        print(type(row))
        input_time = data_meas[row,1]
        input_date = data_meas[row,0]
        times = [datetime.datetime.strptime(input_date, input_time, "%d %b %Y %H:%M:%S")]
I get this error for the last line:
TypeError: strptime() takes exactly 2 arguments (3 given)
method with xlrd
import xlrd
from datetime import time
inputdata = xlrd.open_workbook('file.xls')
sheet = inputdata.sheet_by_index(0)
for row in sheet:
    input_date=sheet.cell_value(row,0)
    input_time=sheet.cell_value(row,1)
    date_values = xlrd.xldate_as_tuple(input_time, inputdata.datemode)
    time_value = time(*date_values[3:])
TypeError: 'Sheet' object is not iterable
Does anybody know how to help me?
I appreciate every hint.
Regarding your first approach: strptime takes a single date string plus a format string, not two separate strings.
You should join input_date and input_time first:
input_time = '18:20:00'
input_date = 'Mon, 30 Nov 2015'
time = datetime.datetime.strptime(' '.join([input_date, input_time]), "%a, %d %b %Y %H:%M:%S")
To create the whole list of datetime objects, you can try:
times = [datetime.datetime.strptime(' '.join([data_meas[row,0], data_meas[row,1]]), "%a, %d %b %Y %H:%M:%S") for row in range(len(data_meas))]
Edit:
If you want to keep the for loop, you have to append each datetime object to your list (otherwise you will only keep the last date):
data_meas = np.array([['07/11/2015 18:20:00'],['09/11/2015 21:20:00']])
#list initialization
times = []
for row in range(len(data_meas)):
    input_date = data_meas[row,0]
    #we add a new item to our list
    times.append(datetime.datetime.strptime(input_date, "%d/%m/%Y %H:%M:%S"))
Now, you can access each datetime in the list times. To calculate time differences, you can check the documentation on timedelta.
#Create a timedelta object
t1 = times[1] - times[0]
#Convert the time difference to seconds
t2 = t1.total_seconds()
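For the xlrd attempt, the "'Sheet' object is not iterable" error is typically avoided by looping over row indices, for example for row in range(sheet.nrows). The cell values xlrd returns for dates are Excel serial numbers, which xlrd.xldate_as_tuple(value, datemode) converts. The sketch below shows the same kind of conversion with the standard library only, assuming the workbook uses Excel's default 1900 date system; the function name and the sample serial values are made up for illustration.

```python
from datetime import datetime, timedelta

# Day zero of Excel's default 1900 date system (valid for serials >= 61).
EXCEL_EPOCH_1900 = datetime(1899, 12, 30)


def excel_serial_to_datetime(serial):
    """Convert an Excel serial number (days, with fractional time) to a datetime."""
    return EXCEL_EPOCH_1900 + timedelta(days=serial)


first = excel_serial_to_datetime(42338.5)    # noon on 30 Nov 2015
second = excel_serial_to_datetime(42338.75)  # 18:00 on the same day

print(first)
print(second)
print((second - first).total_seconds())
```

When date and time sit in separate columns, adding the two serial numbers before converting gives the combined timestamp.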
