First adding the code:
import csv
from datetime import datetime, date, time
with open('T2.csv') as readcsvfile:
    readcsv = csv.reader(readcsvfile)
    header = next(readcsv)
    data = []
    for row in readcsv:
        # if-else construct to read both empty & time string
        if row[0] == str():
            date = str()
        else:
            date = datetime.strptime(row[0], '%y%m%d')
        # stripping the 170101 part with str(ing)p(arse)time and
        # changing the style/format into '%Y/%m/%d' format with strftime.
        ID = str(row[5])
        if row[6] == str():
            O_all = str()
        else:
            O_all = datetime.strptime(row[6], '%H:%M').strftime('%H:%M')
        Combined_datetime = datetime.combine(date, O_all)
        data.append([Combined_datetime, ID])
print(data)
Yields the error:
Combined_datetime=datetime.combine(date,O_all)
TypeError: combine() argument 2 must be datetime.time, not str
But when I check the types, both "date" and "O_all" appear to be 'datetime.datetime' objects. I guess I'm missing or misunderstanding something. What could be the remedy to get a combined date and time named 'Combined_datetime'?
Update with this code:
import csv
from datetime import datetime
with open('T2.csv') as readcsvfile:
    readcsv = csv.reader(readcsvfile)
    header = next(readcsv)
    data = []
    for row in readcsv:
        # An empty cell is read as an empty string, which is falsy.
        if not row[0]:
            d = None
        else:
            d = datetime.strptime(row[0], '%y%m%d')
        ID = str(row[5])
        if not row[6]:
            O_all = None
        else:
            # .time() converts the parsed datetime into a datetime.time
            O_all = datetime.strptime(row[6], '%H:%M').time()
        if d is not None and O_all is not None:
            Combined_datetime = datetime.combine(d, O_all).strftime('%y%m%d %H:%M')
            data.append([Combined_datetime, ID])
        else:
            data.append(['', ID])
print(data)
You could explore more Pythonic ways of writing this. The code above is quite basic, sorry for that.
The problem was elsewhere. While reading my csv, I could read all the rows with a simple if-else construct (an empty time was treated as an empty string). But datetime.combine(d,t) cannot handle empty strings, so combining the date and time failed.
@amarnath Your suggestion to add .time() helped though. This time I skipped the rows with an empty time, which avoids the "TypeError: combine() argument 2 must be datetime.time, not str" error.
Here is the complete working code:
#Reading T2 & writing new csv
import csv
from datetime import datetime, date, time
with open('T2.csv') as readcsvfile:
    readcsv = csv.reader(readcsvfile)
    header = next(readcsv)
    data = []
    for row in readcsv:
        # if-else construct to read both empty & time string
        if row[6] != str():
            d = datetime.strptime(row[0], '%y%m%d')
            t = datetime.strptime(row[6], '%H:%M').time()
            print(type(d))
            ## stripping the 170101 part with str(ing)p(arse)time and
            ## changing t from datetime.datetime to datetime.time with `.time()`
            print(type(t))
            ID = str(row[5])
            Combined_datetime = datetime.combine(d, t)
            data.append([Combined_datetime, ID])
print(data)
with open('T2_w3.csv', 'w', newline='') as writecsvfile:
    writecsv = csv.writer(writecsvfile)
    writecsv.writerow(['Combined_datetime', 'ID'])
    for i in range(len(data)):
        ROW = data[i]
        CDT = ROW[0]
        ID = ROW[1]
        Final_list = [CDT, ID]
        writecsv.writerow(Final_list)
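A more compact variant of the same idea, as a sketch rather than a drop-in replacement: parse each cell into a real date or time object and skip rows where either cell is empty, because datetime.combine() only accepts those types. The helper name combine_rows and the inline sample rows are made up for illustration; the column positions (0, 5, 6) and the formats follow the code above. Written for Python 3.

```python
import csv
import io
from datetime import datetime

def combine_rows(rows):
    """Return [datetime, ID] pairs, skipping rows with an empty date or time."""
    combined = []
    for row in rows:
        if not row[0] or not row[6]:
            continue  # an empty cell cannot be passed to datetime.combine()
        d = datetime.strptime(row[0], '%y%m%d').date()
        t = datetime.strptime(row[6], '%H:%M').time()
        combined.append([datetime.combine(d, t), row[5]])
    return combined

# Hypothetical sample standing in for T2.csv (header plus two data rows).
sample = io.StringIO(
    "date,a,b,c,d,ID,time\n"
    "170101,,,,,A1,08:30\n"
    "170102,,,,,A2,\n"
)
reader = csv.reader(sample)
next(reader)  # skip the header row
result = combine_rows(reader)
print(result)
```

The second sample row has no time, so only the first row ends up in the result.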
Is there a way to convert a string date that is stored in some non-traditional custom manner into a date using datetime (or something equivalent)? The dates I am dealing with are S3 partitions that look like this:
year=2023/month=2/dayofmonth=3
I can accomplish this with several replaces, but I'm hoping to find a clean single operation to do this.
You can pass datetime.datetime.strptime a format string that contains the literal text, in this case:
import datetime
dt = datetime.datetime.strptime("year=2023/month=2/dayofmonth=3","year=%Y/month=%m/dayofmonth=%d")
d = dt.date()
print(d) # 2023-02-03
You can do that by converting your string into a datetime object with the strptime() method of "datetime".
strptime() takes two arguments: the first is the string to be parsed, and the second is a string with the format.
Here's an example:
from datetime import datetime
# your string
date_string = "year=2023/month=2/dayofmonth=3"
# parse the string into a datetime object
date = datetime.strptime(date_string, "year=%Y/month=%m/dayofmonth=%d")
# print the datetime object
print(date)
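Going the other way, from a date back to a partition path, needs a little care: the month and day in the path are not zero-padded, and strftime has no portable format code for that. A small sketch, where the function names partition_to_date and date_to_partition are hypothetical and only the partition layout comes from the question:

```python
from datetime import date, datetime

PARTITION_FORMAT = "year=%Y/month=%m/dayofmonth=%d"

def partition_to_date(partition):
    """Parse 'year=2023/month=2/dayofmonth=3' into a datetime.date."""
    return datetime.strptime(partition, PARTITION_FORMAT).date()

def date_to_partition(d):
    """Build the partition path without zero-padding the month or day."""
    return "year={}/month={}/dayofmonth={}".format(d.year, d.month, d.day)

d = partition_to_date("year=2023/month=2/dayofmonth=3")
print(d)
print(date_to_partition(d))
```

Plain string formatting is used for the reverse direction because it sidesteps the padding question entirely.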
I wrote a script that retrieves a date from a textfile, converts that to a datetime and checks if the current time is later than the datetime in the file. I wrote the following code for that:
from datetime import datetime
f = open("token.txt", "r")
expiry_date = f.readline()
f.close()
if datetime.now() >= datetime.strptime(expiry_date, "%Y-%m-%d %H:%M:%S.%f"):
    pass  # DO STUFF
However, I get the following error:
ValueError: unconverted data remains:
Anyone knows where I went wrong and how I can fix this?
The line I want to retrieve from the textfile contains a date formatted like this:
2020-05-10 19:29:51.503962
readline() returns the line with its trailing \n still attached, so strptime() finds extra characters after the timestamp. Strip the newline first.
Please try:
if datetime.now() >= datetime.strptime(expiry_date.strip(), "%Y-%m-%d %H:%M:%S.%f"):
    pass  # DO STUFF
It will work.
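To see why the newline matters, here is a small sketch that reproduces the error and the fix without touching a file. The function name parse_expiry is an assumption for illustration; the format string and the sample timestamp are the ones from the question.

```python
from datetime import datetime

EXPIRY_FORMAT = "%Y-%m-%d %H:%M:%S.%f"

def parse_expiry(line):
    """Parse one line as returned by readline(), ignoring surrounding whitespace."""
    return datetime.strptime(line.strip(), EXPIRY_FORMAT)

raw_line = "2020-05-10 19:29:51.503962\n"  # what readline() returns

try:
    datetime.strptime(raw_line, EXPIRY_FORMAT)
    failed = False
except ValueError:
    failed = True  # the trailing "\n" is the unconverted data

expiry = parse_expiry(raw_line)
print(failed, expiry)
```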
In my script I have a string containing the date and the time in the following format:
>>>mystring.text
'05/08/201714:00:00'
What is the best way to compare the string with the output from the datetime.now() method to check if the string contains the most recent hour? Basically, what is the 'operation' I need to do in order to make the following conditional statement:
time = operation(mystring.text)
if time == datetime.now().replace(microsecond=0,second=0,minute=0):
pass
It would probably make sense to do the comparison on datetime objects, as follows:
from datetime import datetime
mystring = "05/08/201714:00:00"
dt_mystring = datetime.strptime(mystring, "%d/%m/%Y%H:%M:%S")
print(dt_mystring.replace(microsecond=0, second=0, minute=0) == datetime.now().replace(microsecond=0, second=0, minute=0))
strptime() is used to convert your string into a datetime object.
The format codes are listed in the Python documentation under "strptime() Behavior".
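The same comparison can be wrapped in a function that takes the current time as a parameter, which makes it easier to check. This is only a sketch: the name is_current_hour and the fixed timestamp used in the usage lines are invented for illustration, while the format string is the one from the answer above.

```python
from datetime import datetime

def is_current_hour(text, now=None):
    """Return True if text ('%d/%m/%Y%H:%M:%S') falls in the same hour as now."""
    if now is None:
        now = datetime.now()
    parsed = datetime.strptime(text, "%d/%m/%Y%H:%M:%S")
    top_of_hour = now.replace(minute=0, second=0, microsecond=0)
    return parsed.replace(minute=0, second=0, microsecond=0) == top_of_hour

# A fixed "now" so the result does not depend on when the code runs.
fixed_now = datetime(2017, 8, 5, 14, 25, 10)
print(is_current_hour("05/08/201714:00:00", now=fixed_now))
print(is_current_hour("05/08/201713:00:00", now=fixed_now))
```

Truncating both sides to the top of the hour means a string such as '05/08/201714:37:00' would also count as the current hour.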
My CSV file is arranged so that there's a row named "Dates," and below that row is a gigantic column of a million dates, in the traditional format like "4/22/2015" and "3/27/2014".
How can I write a program that identifies the earliest and latest dates in the CSV file, while maintaining the original format (month/day/year)?
I've tried
for line in count_dates:
    dates = line.strip().split(sep="/")
    all_dates.append(dates)
print(all_dates)
I've tried to take away the "/" and replace it with a blank space, but it does not print anything.
import pandas as pd
import datetime
df = pd.read_csv('file_name.csv')
df['Dates'] = df['Dates'].apply(lambda v: datetime.datetime.strptime(v, '%m/%d/%Y'))
print(df['Dates'].min(), df['Dates'].max())
Considering you have a large file, reading it in its entirety into memory is a bad idea.
Read the file line by line, manually keeping track of the earliest and latest dates. Use datetime.datetime.strptime to convert the strings to dates (it takes the string format as a parameter).
import datetime
with open("input.csv") as f:
    f.readline()  # get the "Dates" header out of the way
    first = f.readline().strip()
    earliest = datetime.datetime.strptime(first, "%m/%d/%Y")
    latest = datetime.datetime.strptime(first, "%m/%d/%Y")
    for line in f:
        date = datetime.datetime.strptime(line.strip(), "%m/%d/%Y")
        if date < earliest: earliest = date
        if date > latest: latest = date
print("Earliest date:", earliest)
print("Latest date:", latest)
Let's open the csv file, read out all the dates. Then use strptime to turn them into comparable datetime objects (now, we can use max). Lastly, let's print out the biggest (latest) date
import csv
from datetime import datetime as dt
with open('path/to/file') as infile:
    reader = csv.reader(infile)
    next(reader)  # skip the "Dates" header row
    latest = max(dt.strptime(row[0], "%m/%d/%Y") for row in reader)
print(dt.strftime(latest, "%m/%d/%Y"))
Naturally, you can use min to get the earliest date. However, this takes two linear runs, and you can do this with just one, if you are willing to do some heavy lifting yourself:
import csv
from datetime import datetime as dt
with open('path/to/file') as infile:
    reader = csv.reader(infile)
    next(reader)  # skip the "Dates" header row
    date, *_rest = next(reader)
    # the first date starts out as both the earliest and the latest
    earliest = latest = dt.strptime(date, "%m/%d/%Y")
    for date, *_rest in reader:
        date = dt.strptime(date, "%m/%d/%Y")
        earliest = min(date, earliest)
        latest = max(date, latest)
print("earliest:", dt.strftime(earliest, "%m/%d/%Y"))
print("latest:", dt.strftime(latest, "%m/%d/%Y"))
A bit of an RTFM answer: Open the file in csv format (see the csv library), and then iterate line by line converting the field that is a date into a date object (see the docs for converting a string to a date object), and if it is less than minimum so far store it as minimum, similar for max, with a special condition on the first line that the date becomes both min and max dates.
Or for some overkill you could just use Pandas to read it into a data frame specifying the specific column as date format then just use max & min.
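The line-by-line approach described above can be sketched as follows. To keep the original month/day/year text, this version remembers the raw string next to each parsed date instead of formatting the result afterwards. The function name earliest_and_latest and the three sample dates after the header are made up for illustration, and the code is written for Python 3.

```python
import csv
import io
from datetime import datetime

def earliest_and_latest(lines, fmt="%m/%d/%Y"):
    """Return (earliest, latest) as the original strings, in a single pass."""
    reader = csv.reader(lines)
    next(reader)  # skip the "Dates" header row
    earliest = latest = None
    for row in reader:
        if not row:
            continue  # ignore blank lines
        text = row[0]
        parsed = datetime.strptime(text, fmt)
        # The first date seen becomes both the minimum and the maximum.
        if earliest is None or parsed < earliest[0]:
            earliest = (parsed, text)
        if latest is None or parsed > latest[0]:
            latest = (parsed, text)
    if earliest is None:
        raise ValueError("no dates found")
    return earliest[1], latest[1]

# Hypothetical stand-in for the real file; pass an open file object instead.
sample = io.StringIO("Dates\n4/22/2015\n3/27/2014\n12/1/2014\n")
first, last = earliest_and_latest(sample)
print(first, last)
```

Only two dates are held in memory at any time, so the size of the file does not matter.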
I think it is more convenient to use pandas for this purpose.
import pandas as pd
df = pd.read_csv('file_name.csv')
df['name_of_column_with_date'] = pd.to_datetime(df['name_of_column_with_date'], format='%m/%d/%Y')
print('min_date: {}'.format(min(df['name_of_column_with_date'])))
print('max_date: {}'.format(max(df['name_of_column_with_date'])))
The built-in functions work well with Pandas Dataframes.
To learn more about the format argument of pd.to_datetime, you can use a Python strftime cheat sheet.
I would like to read in a csv file of dates (shown below) and loop through it using solar.GetAltitude on each date to calculate a list of sun altitudes. (I'm using Python 2.7.2 on Windows 7 Enterprise.)
CSV file:
TimeStamp
01/01/2014 00:10
01/01/2014 00:20
01/01/2014 00:30
01/01/2014 00:40
My code gives the following error ValueError: unconverted data remains:. This suggests the wrong date format, but it works fine on a single date, rather than a string of dates.
I've researched this topic carefully on Stack Overflow. I've also tried the map function, np.datetime64 and reading to a list rather than a string but get a different error referring to no attribute 'year'.
I'd really appreciate any help because I'm running out of ideas.
import datetime
from datetime import datetime
import julian
import solar
from solar import *
import os
import csv
# Create lists to hold the records.
dates = []
# Navigate to correct directory
os.chdir('D:\\Di_Python')
filename = 'SPA timestamp small.csv'
# Read through the entire file, skip the first line
with open(filename) as f:
    # Create a csv reader object.
    reader = csv.reader(f)
    # Ignore the header row.
    next(reader)
    # Store the dates in the appropriate list.
    for row in reader:
        dates.append(row)
        print row
# Change list to string so can use a function on it
lines = []
for date in dates:
    lines.append('\t'.join(map(str, date)))
result = '\n'.join(lines)
print result
minutes = []
minutes.append(datetime.datetime.strptime(result,'%d/%m/%Y %H:%M'))
# Inputs
latitude_deg = 52.8
longitude_deg = -1.2
elevation = 0
# i should be 52560 - 10 min interval whole year
for i in minutes:
    utc_datetime = i
    altitude = solar.GetAltitude(latitude_deg, longitude_deg, utc_datetime)
    altitude_list.append(altitude)
print altitude_list
First of all, the code is not indented properly making it harder to guess.
I think the input to datetime.datetime.strptime is not correct. You create result by using a '\n'.join(...) but the format string does not contain the '\n'. Creating a string from the list of dates seems unnecessary to me.
I think what you want is this:
for date in dates:
    minutes.append(datetime.datetime.strptime(date, '%d/%m/%Y %H:%M'))
Note that the names you use for the lists are misleading as minutes holds datetime.datetime objects rather than minute values!
Many thanks to Vikramis and Lutz Horn for their help and comments. After experimenting with Vikramis' code, I achieved a working version which I have copied below.
My error occurred at line 40:
minutes.append(datetime.datetime.strptime(result,'%d/%m/%Y %H:%M'))
I found that I needed to create a string from the list to avoid the following error: "TypeError: must be string, not list". I have now tidied this up by using str(date) to replace the for loop and hopefully used more sensible names.
My problem was with the formatting. The format needs to be "['%d/%m/%Y %H:%M']", because each row read from the CSV is a list and str(date) wraps the value in brackets and quotes, rather than "'%d/%m/%Y %H:%M'", which works in the shell for a single date.
import datetime
from datetime import datetime
import julian
import solar
from solar import *
import os
import csv
# Create lists to hold the records.
dates = []
datetimeObj = []
altitude_list = []
# Navigate to correct directory
os.chdir('D:\\Di_Python')
filename = 'SPA timestamp small.csv'
# Read through the entire file, skip the first line
with open(filename) as f:
    # Create a csv reader object.
    reader = csv.reader(f)
    # Ignore the header row.
    next(reader)
    # Store the dates in the appropriate list.
    for row in reader:
        dates.append(row)
        print row
# Change format to datetime
# str(date) used to avoid TypeError: must be string, not list
for date in dates:
    datetimeObj.append(datetime.datetime.strptime(str(date),"['%d/%m/%Y %H:%M']"))
for j in datetimeObj:
    print j
# Inputs
latitude_deg = 52.8
longitude_deg = -1.2
elevation = 0
# i should be 52560 - 10 min interval whole year
for i in datetimeObj:
    utc_datetime = i
    altitude = solar.GetAltitude(latitude_deg, longitude_deg, utc_datetime)
    print altitude
    altitude_list.append(altitude)
# print altitude_list
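The bracketed format string works because str() of a one-item list puts brackets and quotes around the value. A possibly simpler alternative, sketched here under the assumption that the timestamp is always the first column, is to pass the first cell of each row to strptime directly. The function name read_timestamps and the inline sample are invented for illustration, the code is written for Python 3, and it stops before the solar.GetAltitude call.

```python
import csv
import io
from datetime import datetime

TIMESTAMP_FORMAT = "%d/%m/%Y %H:%M"

def read_timestamps(lines):
    """Parse the first column of each data row into a datetime."""
    reader = csv.reader(lines)
    next(reader)  # skip the "TimeStamp" header row
    return [datetime.strptime(row[0], TIMESTAMP_FORMAT) for row in reader if row]

# Hypothetical stand-in for 'SPA timestamp small.csv'.
sample = io.StringIO("TimeStamp\n01/01/2014 00:10\n01/01/2014 00:20\n")
timestamps = read_timestamps(sample)
print(timestamps)

# What the str(date) workaround is actually matching against:
as_text = str(["01/01/2014 00:10"])
print(as_text)
```

With row[0] the plain '%d/%m/%Y %H:%M' format is enough, and the code no longer depends on how Python prints a list.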