Using Python 2.7, I am trying to create an array of relative time, where relative time is the difference between the middle of a thunderstorm and some other time during that storm. Ultimately, I am hoping that my relative time array will be of the format -minutes, 0, and +minutes (where -minutes are minutes before the middle of the storm, 0 is the middle of the storm, and +minutes are minutes after the middle of the storm). I figured a loop was the most efficient way to do this. I already have a 1-D array, MDAdatetime, filled with the other storm times I mentioned, as strings. I specified the middle of the storm at the beginning of my code, and it is a string, as well.
So far, my code is as follows:
import csv
import datetime
casedate = '06052009'
MDAfile = "/atmomounts/home/grad/mserino/Desktop/"+casedate+"MDA.csv"
stormmidpoint = '200906052226' #midpoint in YYYYMMDDhhmm
relativetime = []
MDAlons = []
MDAlats = []
MDAdatetime = []
with open(MDAfile, 'rU') as f: #open to read in universal-newline mode
    reader = csv.reader(f, dialect=csv.excel, delimiter=',')
    for i,row in enumerate(reader):
        if i == 0:
            continue #skip header row
        MDAlons.append(float(row[1]))
        MDAlats.append(float(row[2]))
        MDAdatetime.append(str(row[0])) #this is the array I'm dealing with now in the section below; each string is of the format YYYYMMDDhhmmss
## This is the section I'm having trouble with ##
for j in range(len(MDAdatetime)):
    reltime = datetime.datetime.strptime(MDAdatetime[j],'%YYYY%mm%dd%HH%MM%SS') - datetime.datetime(stormmidpoint,'%YYYY%mm%dd%HH%MM')
    retime.strftime('%MM') #convert the result to minutes
    reativetime.append(reltime)
print relativetime
So far, I have been getting the error:
ValueError: time data '20090605212523' does not match format '%YYYY%mm%dd%HH%MM%SS'
I am trying to learn as much as I can about the datetime module. I have seen some other posts and resources mention dateutil, but it seems that datetime will be the most useful for me. I could be wrong, though, and I appreciate any advice and help. Please let me know if I need to clarify anything or provide more information.
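%Y matches a four-digit year such as 2016; %YYYY is not a valid directive. The same goes for the month, day, and the other fields: each directive is a single letter.
So, your format string should be %Y%m%d%H%M%S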
Something like this:
datetime.datetime.strptime("20090605212523", "%Y%m%d%H%M%S")
Demo
>>> datetime.datetime.strptime("20090605212523", "%YYYY%mm%dd%HH%MM%SS")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/_strptime.py", line 325, in _strptime
(data_string, format))
ValueError: time data '20090605212523' does not match format '%YYYY%mm%dd%HH%MM%SS'
>>> datetime.datetime.strptime("20090605212523", "%Y%m%d%H%M%S")
datetime.datetime(2009, 6, 5, 21, 25, 23)
After @karthikr pointed out that my strptime formatting was incorrect, I was able to get an answer in minutes. There may be a better way, but I converted to minutes by hand in the code. I also changed my stormmidpoint variable to a string, as it should be. @karthikr was also right about that. Thanks for the help!
import math

for j in range(len(MDAdatetime)):
    # both format strings must match their input exactly; a midpoint without seconds needs '%Y%m%d%H%M'
    reltime = datetime.datetime.strptime(MDAdatetime[j],'%Y%m%d%H%M%S') - datetime.datetime.strptime(stormmidpoint,'%Y%m%d%H%M%S')
    relativetime.append(int(math.ceil(((reltime.seconds/60.0) + (reltime.days*1440.0))))) #convert to integer minutes
print relativetime
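As a possible alternative to converting by hand, timedelta.total_seconds() (available from Python 2.7) already folds days and seconds together. The sketch below assumes the storm times are YYYYMMDDhhmmss strings and the midpoint is a YYYYMMDDhhmm string, as described in the question; the function name relative_minutes and the sample times are made up for illustration.

```python
from datetime import datetime

def relative_minutes(times, midpoint,
                     time_fmt="%Y%m%d%H%M%S", mid_fmt="%Y%m%d%H%M"):
    """Return signed minutes between each time and the storm midpoint."""
    mid = datetime.strptime(midpoint, mid_fmt)  # parse the midpoint once
    return [
        (datetime.strptime(t, time_fmt) - mid).total_seconds() / 60.0
        for t in times
    ]

# Hypothetical sample values: before, at, and after the midpoint.
storm_times = ["20090605212523", "20090605222600", "20090605230030"]
minutes = relative_minutes(storm_times, "200906052226")
print(minutes)
```

Times before the midpoint come out negative, the midpoint itself is 0.0, and later times are positive. Rounding (for example with int(round(m))) can be applied afterwards if whole minutes are needed.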
I am trying to read a text file line by line as integers. I tried every suggestion I found here, but none of them works for me. Here is the code I'm using. It reads some seismic data from the datadir and evaluates the signal-to-noise ratio (SNR) to decide whether to keep the data or remove it. To do so, I need to calculate the distance between the stations and the earthquake, and that information comes from the input files.
from obspy import UTCDateTime
import os
datadir = "/home/alireza/Desktop/Saman/Eqcomplete"
homedir = "/home/alireza/Desktop/Saman"
eventlist = os.path.join (homedir, 'events.dat')
stationlist = os.path.join (homedir, 'all_st')
e = open (eventlist, "r")
for event in e.readlines():
    year, mon, day, time, lat, lon = event.split (" ")
    h = str (time.split (":")[0]) # hour
    m = str (time.split (":")[1]) # minute
    s = str (time.split (":")[2]) # second
    s = open (stationlist, "r")
    for station in s.readlines():
        stname, stlo, stla = station.split (" ")
        OafterB = UTCDateTime (int(year), int(mon), int(day), int(h), int(m), int(s))
        print (OafterB) # just to have an output!
    s.close ()
e.close ()
Update:
There are two input files:
events.dat which is like:
2020 03 18 17:45:39 -11.0521 115.1378
all_st which is like:
AHWZ 48.644 31.430
AFRZ 59.015 33.525
NHDN 60.050 31.493
BDRS 48.881 34.054
BMDN 48.825 33.772
HAGD 49.139 34.922
Here is the output:
Traceback (most recent call last):
File "SNR.py", line 21, in <module>
OafterB = UTCDateTime (int(year), int(mon), int(day), int(h), int(m), int(s))
TypeError: int() argument must be a string, a bytes-like object or a number, not '_io.TextIOWrapper'
To test the code, you need to install the obspy package; pip install obspy may work.
You define s here:
s = str (time.split (":")[2]) # second
But then, immediately afterward, you redefine it:
s = open (stationlist, "r")
Now s points to a file object, so int(s) fails with the above error. Name your station list file object something different, and the problem goes away.
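The name clash can be reproduced without obspy or any files. This is only a minimal sketch: io.StringIO stands in for the opened station list (an assumption made to keep the example self-contained), and the variable names seconds_ok and error are illustrative.

```python
import io

time_text = "17:45:39"
s = time_text.split(":")[2]   # s is the string "39"
seconds_ok = int(s)           # converting the string works

# Rebinding the same name to a file-like object discards the string.
s = io.StringIO("AHWZ 48.644 31.430\n")
try:
    int(s)
    error = None
except TypeError as exc:
    error = str(exc)          # int() cannot convert a file object

print(seconds_ok)
print(error)
```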
Other tips which you may find helpful:
split() will automatically split on whitespace unless you tell it otherwise, so there's no need to specify " ".
You can use multiple assignment to assign h, m, and s the same way you did with the previous line. Currently, you're performing the same split operation three different times.
It's recommended to open files using the with keyword, which will automatically handle closing the file, even if an exception occurs.
You can iterate over a file object directly, without creating a list with readlines().
Using pathlib can make it much simpler and cleaner to deal with filesystem paths and separators.
It's considered bad form to put spaces between the name of a function and the parentheses.
There's also a convention that variable names (other than class names) are usually all lowercase, with underscores between words as needed. (See PEP 8 for a helpful rundown of all such style conventions. They're not hard and fast rules, but they can help make code more consistent and readable.)
With those things in mind, here's a slightly spruced up version of your above code:
from pathlib import Path
from obspy import UTCDateTime
data_dir = Path('/home/alireza/Desktop/Saman/Eqcomplete')
home_dir = Path('/home/alireza/Desktop/Saman')
event_list = home_dir / 'events.dat'
station_list = home_dir / 'all_st'
with open(event_list) as e_file:
    for event in e_file:
        year, mon, day, time, lat, lon = event.split()
        h, m, s = time.split(':')
        with open(station_list) as s_file:
            for station in s_file:
                stname, stlo, stla = station.split()
                o_after_b = UTCDateTime(
                    int(year), int(mon), int(day), int(h), int(m), int(s)
                )
                print(o_after_b)
I am able to parse strings containing date/time with time.strptime
>>> import time
>>> time.strptime('30/03/09 16:31:32', '%d/%m/%y %H:%M:%S')
(2009, 3, 30, 16, 31, 32, 0, 89, -1)
How can I parse a time string that contains milliseconds?
>>> time.strptime('30/03/09 16:31:32.123', '%d/%m/%y %H:%M:%S')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.5/_strptime.py", line 333, in strptime
data_string[found.end():])
ValueError: unconverted data remains: .123
Python 2.6 added a new strftime/strptime directive, %f. The docs are a bit misleading as they only mention microseconds, but %f actually parses any decimal fraction of a second with up to 6 digits, so it also works for milliseconds, centiseconds, or even deciseconds.
time.strptime('30/03/09 16:31:32.123', '%d/%m/%y %H:%M:%S.%f')
However, time.struct_time doesn't actually store milliseconds/microseconds. You're better off using datetime, like this:
>>> from datetime import datetime
>>> a = datetime.strptime('30/03/09 16:31:32.123', '%d/%m/%y %H:%M:%S.%f')
>>> a.microsecond
123000
As you can see, .123 is correctly interpreted as 123 000 microseconds.
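To illustrate that %f accepts one to six digits, here is a small sketch that parses fractions of different lengths and reduces the result to whole milliseconds. The helper name milliseconds is hypothetical; only datetime.strptime and the microsecond attribute come from the standard library.

```python
from datetime import datetime

def milliseconds(text, fmt="%d/%m/%y %H:%M:%S.%f"):
    """Parse text and return the fractional part as whole milliseconds."""
    parsed = datetime.strptime(text, fmt)
    return parsed.microsecond // 1000  # 1 ms = 1000 microseconds

for text in ("30/03/09 16:31:32.1",
             "30/03/09 16:31:32.123",
             "30/03/09 16:31:32.123456"):
    print(text, milliseconds(text))
```

Shorter fractions are padded on the right, so .1 is read as 100 ms and .123 as 123 ms.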
I know this is an older question but I'm still using Python 2.4.3 and I needed to find a better way of converting the string of data to a datetime.
The solution if datetime doesn't support %f and without needing a try/except is:
(dt, mSecs) = row[5].strip().split(".")
dt = datetime.datetime(*time.strptime(dt, "%Y-%m-%d %H:%M:%S")[0:6])
mSeconds = datetime.timedelta(microseconds = int(mSecs))
fullDateTime = dt + mSeconds
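This works for the input string "2010-10-06 09:42:52.266000". Note that it treats the digits after the period as a microsecond count, so it is only correct when the fraction has exactly six digits.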
To give the code that nstehr's answer refers to (from its source):
def timeparse(t, format):
    """Parse a time string that might contain fractions of a second.

    Fractional seconds are supported using a fragile, miserable hack.
    Given a time string like '02:03:04.234234' and a format string of
    '%H:%M:%S', time.strptime() will raise a ValueError with this
    message: 'unconverted data remains: .234234'. If %S is in the
    format string and the ValueError matches as above, a datetime
    object will be created from the part that matches and the
    microseconds in the time string.
    """
    try:
        return datetime.datetime(*time.strptime(t, format)[0:6]).time()
    except ValueError, msg:
        if "%S" in format:
            msg = str(msg)
            mat = re.match(r"unconverted data remains:"
                           " \.([0-9]{1,6})$", msg)
            if mat is not None:
                # fractional seconds are present - this is the style
                # used by datetime's isoformat() method
                frac = "." + mat.group(1)
                t = t[:-len(frac)]
                t = datetime.datetime(*time.strptime(t, format)[0:6])
                microsecond = int(float(frac)*1e6)
                return t.replace(microsecond=microsecond)
            else:
                mat = re.match(r"unconverted data remains:"
                               " \,([0-9]{3,3})$", msg)
                if mat is not None:
                    # fractional seconds are present - this is the style
                    # used by the logging module
                    frac = "." + mat.group(1)
                    t = t[:-len(frac)]
                    t = datetime.datetime(*time.strptime(t, format)[0:6])
                    microsecond = int(float(frac)*1e6)
                    return t.replace(microsecond=microsecond)
        raise
DNS's answer above is actually incorrect. The OP is asking about milliseconds, but the answer is for microseconds. Unfortunately, Python doesn't have a directive for milliseconds, just microseconds (see doc), but you can work around it by appending three zeros at the end of the string and parsing the string as microseconds, something like:
datetime.strptime(time_str + '000', '%d/%m/%y %H:%M:%S.%f')
where time_str is formatted like 30/03/09 16:31:32.123.
Hope this helps.
My first thought was to try passing it '30/03/09 16:31:32.123' (with a period instead of a colon between the seconds and the milliseconds.) But that didn't work. A quick glance at the docs indicates that fractional seconds are ignored in any case...
Ah, version differences. This was reported as a bug and now in 2.6+ you can use "%S.%f" to parse it.
From the Python mailing lists: see the "parsing millisecond" thread. There is a function posted there that seems to get the job done, although as mentioned in the author's comments it is kind of a hack. It uses regular expressions to handle the exception that gets raised, and then does some calculations.
You could also try doing the regular expressions and calculations up front, before passing the string to strptime.
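One way the up-front approach might look is sketched below. It assumes the fraction, if present, is the last thing in the string and is separated by a period or comma; the function name parse_with_fraction is made up for this example and is not the function from the mailing list.

```python
import re
from datetime import datetime

def parse_with_fraction(text, fmt):
    """Split off a trailing .fraction (or ,fraction) before calling strptime."""
    match = re.match(r"(.+?)[.,](\d{1,6})$", text)
    if match is None:
        return datetime.strptime(text, fmt)  # no fraction present
    base, frac = match.group(1), match.group(2)
    parsed = datetime.strptime(base, fmt)
    # Right-pad to six digits so "123" becomes 123000 microseconds.
    return parsed.replace(microsecond=int(frac.ljust(6, "0")))

stamp = parse_with_fraction("30/03/09 16:31:32.123", "%d/%m/%y %H:%M:%S")
print(stamp.microsecond)
```

Because the format string never sees the fraction, this also works on Python versions whose strptime lacks %f.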
For Python 2 I did this:
print ( time.strftime("%H:%M:%S", time.localtime(time.time())) + "." + str(time.time()).split(".",1)[1])
It prints the time as "%H:%M:%S", then splits str(time.time()) into two substrings at the "." (xxxxxxx.xx). Since the part after the "." is the fraction of a second, which I use as my milliseconds, I append that second substring to my "%H:%M:%S" string.
Hope that makes sense :)
Example output:
13:31:21.72
Blink 01
13:31:21.81
END OF BLINK 01
13:31:26.3
Blink 01
13:31:26.39
END OF BLINK 01
13:31:34.65
Starting Lane 01
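Splitting str(time.time()) drops trailing zeros, which appears to be why the output above shows values such as 13:31:26.3. A sketch that always prints three fractional digits, assuming a datetime object is acceptable in place of time.time(); the helper name clock_with_millis is hypothetical.

```python
from datetime import datetime

def clock_with_millis(moment):
    """Format a datetime as HH:MM:SS.mmm with exactly three fractional digits."""
    return moment.strftime("%H:%M:%S.") + "%03d" % (moment.microsecond // 1000)

print(clock_with_millis(datetime.now()))
print(clock_with_millis(datetime(2009, 3, 30, 13, 31, 26, 300000)))
```

Reading the clock once through datetime.now() also avoids the small mismatch that can occur when time.time() is called twice in one expression.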
I have this code that should count the number of missing rows of numbers within the CSV, for a script in Python 3.6. However, the program raises the following error:
Error:
Traceback (most recent call last):
File "C:\Users\GapReport.py", line 14, in <module>
EndDoc_Padded, EndDoc_Padded = (int(s.strip()[2:]) for s in line)
File "C:\Users\GapReport.py", line 14, in <genexpr>
EndDoc_Padded, EndDoc_Padded = (int(s.strip()[2:]) for s in line)
ValueError: invalid literal for int() with base 10: 'AC-SEC 000000001'
Code:
import csv
def out(*args):
    print('{},{}'.format(*(str(i).rjust(4, "0") for i in args)))

prev = 0
data = csv.reader(open('Padded Numbers_export.csv'))
print(*next(data), sep=', ') # header
for line in data:
    EndDoc_Padded, EndDoc_Padded = (int(s.strip()[2:]) for s in line)
    if start != prev+1:
        out(prev+1, start-1)
    prev = end
    out(start, end)
I'm stumped on how to fix these issues. Also, I think the CSV has many lines in it, so if there's a section of the code that limits it to a few numbers, please let me know.
CSV Snippet (Sorry if I wasn't clear before!):
The values you have in your CSV file are not numeric.
For example, FMAC-SEC 000000001 is not a number. The [2:] slice only drops the first two characters, leaving 'AC-SEC 000000001', so when you run int(s.strip()[2:]) it is not able to convert it to an int.
Some more comments on the code:
What is the utility of doing EndDoc_Padded, EndDoc_Padded = (...)? Currently you are assigning values to two variables with the same name. Either name one of them something else, or just have one variable there.
Are you trying to get the two different values from each column? Note that csv.reader already splits each row on commas, so line is a list of fields and does not need a further split(','). If your file uses a different separator, pass it to csv.reader through the delimiter argument.
You are running this inside a loop, so each time the values of the two variables would get updated to the values from the last line. If you're trying to obtain 2 lists of all the values, then this won't work.
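Putting those comments together, a gap report might be sketched as follows. This assumes each row has exactly two fields (a start and an end document number) that finish with a run of digits, as in 'FMAC-SEC 000000001'. The sample data, the column names, and the helper names trailing_number and find_gaps are all hypothetical stand-ins for the real export.

```python
import csv
import io
import re

def trailing_number(field):
    """Return the run of digits at the end of a field as an int."""
    match = re.search(r"(\d+)\s*$", field)
    if match is None:
        raise ValueError("no trailing digits in %r" % field)
    return int(match.group(1))

def find_gaps(rows):
    """Return (first_missing, last_missing) pairs between consecutive ranges."""
    gaps = []
    prev_end = 0
    for row in rows:
        start, end = (trailing_number(field) for field in row)
        if start != prev_end + 1:
            gaps.append((prev_end + 1, start - 1))
        prev_end = end
    return gaps

# Hypothetical sample standing in for the real export file.
sample = io.StringIO(
    "BegDoc,EndDoc\n"
    "FMAC-SEC 000000001,FMAC-SEC 000000003\n"
    "FMAC-SEC 000000006,FMAC-SEC 000000008\n"
)
reader = csv.reader(sample)
next(reader)  # skip the header row
gaps = find_gaps(reader)
print(gaps)
```

Collecting the gaps in a list, rather than overwriting two variables on every pass, keeps the results from all rows.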
I need to find the time difference between date/time TrackLogs on a table using a for loop. The trackLog entries are found in the following format:
2013-08-02T14:30:10Z
I need to take into consideration that the 'for' loop fails when it tries to calculate the first timediff, because the first log has no preceding entry to use in the calculation. Therefore I need to include an 'if' statement (with a boolean) so that the script can get past the first log.
This is what I have got so far:
for row in cur:
    period=row.tracklogs
    year=period[0:4]
    month=period[6:7]
    day=period[8:10]
    hour=period[11:13]
    minut=period[14:16]
    second=period[17:19]
    print period
    firstline="yes"
    if firstline=="yes":
        prev_period=row.tracklogs
        prev_coord=row.shape
        firstline="no"
    else:
        new_period=row.tracklogs
        new_coord=row.shape
        period1=datetime.datetime(int(prev_period))
        period2=datetime.datetime(int(new_period))
        timediff=(a2-a1).seconds
        print timediff
I think I need an integer for the datetime operation but then I run into the following exception:
line 36, in <module>
tid1=datetime.datetime(int(tidl_tid))
ValueError: invalid literal for int() with base 10: '2009-05-19T11:51:47Z'
I know I'm doing something wrong but can't tell what it is. Any help is much appreciated.
You are trying to convert the string 2009-05-19T11:51:47Z to an integer, but the string is not a valid base-10 number, so Python raises the error.
You need to parse your date strings and obtain the necessary datetime objects.
Assuming
date_string = "2009-05-19T11:51:47Z"
Either do
from dateutil import parser
d1 = parser.parse(date_string)
or
from datetime import datetime
d1 = datetime.strptime(date_string, "%Y-%m-%dT%H:%M:%SZ")
And use the next command to get the difference between two dates d1 and d2 in seconds
(d1-d2).total_seconds()
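For the loop itself, one common pattern is to remember the previous timestamp and skip the subtraction while there is none yet. This is a sketch under the assumption that the entries arrive in order as strings like 2013-08-02T14:30:10Z; the function name consecutive_gaps and the sample list are illustrative, and the database cursor is replaced by a plain list.

```python
from datetime import datetime

def consecutive_gaps(timestamps, fmt="%Y-%m-%dT%H:%M:%SZ"):
    """Return the seconds elapsed between each entry and the one before it."""
    previous = None  # no preceding entry before the first log
    gaps = []
    for text in timestamps:
        current = datetime.strptime(text, fmt)
        if previous is not None:
            gaps.append((current - previous).total_seconds())
        previous = current
    return gaps

logs = ["2013-08-02T14:30:10Z", "2013-08-02T14:30:40Z", "2013-08-02T14:32:10Z"]
print(consecutive_gaps(logs))
```

The previous variable must be initialised before the loop; setting a first-line flag inside the loop body would reset it on every row.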
I'm trying to run a predictive RNN from this repo https://github.com/jgpavez/LSTM---Stock-prediction. "python lstm_forex.py"
It seems to be having trouble creating an empty Numpy array
The function giving me problems, starting with the line 'days', fourth from the bottom.
def read_data(path="full_USDJPY.csv", dir="/Users/Computer/stock/LSTM2/",
              max_len=30, valid_portion=0.1, columns=4, up=False, params_file='params.npz',min=False):
    '''
    Reading forex data, daily or minute
    '''
    path = os.path.join(dir, path)
    #data = read_csv(path,delimiter=delimiter)
    data = genfromtxt(path, delimiter=',',skip_header=1)
    # Adding data by minute
    if min == False:
        date_index = 1
        values_index = 3
        hours = data[:,2]
    else:
        date_index = 0
        values_index = 1
    dates = data[:,date_index]
    print (dates)
    days = numpy.array([datetime.datetime(int(str(date)[0:-2][0:4]),int(str(date)[0:-2][4:6]),
                        int(str(date)[0:-2][6:8])).weekday() for date in dates])
    months = numpy.array([datetime.datetime(int(str(date)[0:-2][0:4]),int(str(date)[0:-2][4:6]),
                          int(str(date)[0:-2][6:8])).month for date in dates])
Gives the error...
Traceback (most recent call last):
File "lstm_forex.py", line 778, in <module>
tick=tick
File "lstm_forex.py", line 560, in train_lstm
train, valid, test, mean, std = read_data(max_len=n_iter, path=dataset, params_file=params_file,min=(tick=='minute'))
File "/Users/Computer/stock/LSTM2/forex.py", line 85, in read_data
int(str(date)[0:-2][6:8])).weekday() for date in dates])
ValueError: invalid literal for int() with base 10: 'n'
I've seen a similar problem that involved putting '.strip' at the end of something. This code is so complicated that I don't quite know where to put it. I tried it everywhere and usually got the same error, 'has no attribute', on others. Now I'm not sure what might fix it.
You're trying to call int() on the string 'n' in your list comprehension. To get the same error:
int('n')
ValueError Traceback (most recent call last)
<ipython-input-18-35fea8808c96> in <module>()
----> 1 int('n')
ValueError: invalid literal for int() with base 10: 'n'
What exactly are you trying to pull out in that list comprehension? It looks like sort of a tuple of date information, but a bit more information about what you're trying to pull out, or comments in the code explaining the logic more clearly would help us get you to the solution.
EDIT: If you use pandas.Timestamp it may do all that conversion for you - now that I look at the code it looks like you're just trying to pull out the day of the week, and the month. It may not work if it can't convert the timestamp for you, but it's pretty likely that it would. A small sample of the CSV data you're using would confirm easily enough.
days = numpy.array([pandas.Timestamp(date).weekday() for date in dates])
months = numpy.array([pandas.Timestamp(date).month for date in dates])
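One plausible source of the 'n' is a missing or non-numeric date: genfromtxt stores such cells as nan, str(nan) is 'nan', and the [0:-2] slice leaves just 'n'. That is an inference from the slicing in the question, not something confirmed against the actual CSV. The sketch below uses only the standard library, assumes the dates are floats shaped like 20090605.0, and uses a hypothetical helper name, weekday_and_month.

```python
import math
from datetime import datetime

def weekday_and_month(value):
    """Return (weekday, month) for a float like 20090605.0, or None for NaN."""
    if math.isnan(value):
        return None  # skip cells that could not be read as numbers
    parsed = datetime.strptime(str(int(value)), "%Y%m%d")
    return parsed.weekday(), parsed.month

fragment = str(float("nan"))[0:-2]  # what the original slicing passes to int()
print(fragment)
print(weekday_and_month(20090605.0))
print(weekday_and_month(float("nan")))
```

If the sketch's guess is right, printing the raw column and looking for nan entries should show which rows need cleaning or a different dtype.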