I am trying to create a list of times and dates at specific intervals. The times and dates are present in a time-series CSV, and I want to write code that extracts data from specific time intervals. I made two lists, one for day and one for hour, and I am creating a new variable that stores the dates and times of interest. I have tried the following code, but I get an error:
day = ['01', '02', '03', '04', "05", '06', '07', '08', '09', '10', '11', '12','13','14','15','16','17','18'
'19','20','21','22','23','24','25','26','27','28','29','30','31']
hour = ['0', '3', '6', '9', '12', '15','18','21']
year, month, day, hour = year, month, day, hour # 2016-01-01 #01:00 am
day_time = []
for i in day.index:
    for j in hour.index:
        day_time = int("".join(day[i], hour[j], "00",))
print(day_time)
TypeError Traceback (most recent call last)
<ipython-input-72-15de17abf279> in <module>
6 year, month, day, hour = year, month, day, hour # 2016-01-01 #01:00 am
7 day_time = []
----> 8 for i in day.index:
9 for j in hour.index:
10 day_time = int("".join(day[i], hour[j], "00",))
TypeError: 'builtin_function_or_method' object is not iterable
Can someone suggest a solution?
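index is a method of list objects, not an iterable attribute, so looping over day.index raises the TypeError above; loop over the list itself instead. Please refer to Data Structures in the Python tutorial.
Also, the join method of str takes a single iterable of strings, not several separate arguments; see the documentation for str.join.
Also, as @Lecdi pointed out, you should use append to add to a list instead of rebinding the variable with =; see the documentation for list.append.
To do what you want: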
day = ['01', '02', '03', '04', '05', '06', '07', '08', '09', '10', '11', '12', '13', '14', '15', '16', '17', '18',
       '19', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '30', '31']
hour = ['0', '3', '6', '9', '12', '15', '18', '21']
day_time = []
for day_i in day:
    for hour_i in hour:
        # join takes one iterable of strings; append adds the result to the list
        day_time.append(int("".join([day_i, hour_i, "00"])))
print(day_time)
I think enumerate() would work better for you
for indexDay, valueDay in enumerate(day):
    for indexHour, valueHour in enumerate(hour):
        day_time.append(int("".join([valueDay, valueHour, "00"])))
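If the goal is to match timestamps in the CSV, building real datetime objects may be less fragile than concatenated integers, since converting to int drops the leading zero of the day. A minimal sketch, assuming January 2016 (taken from the comment in the question) and a 3-hour step; the names start, end and step are illustrative:

```python
from datetime import datetime, timedelta

# Assumption: January 2016, one entry every 3 hours.
start = datetime(2016, 1, 1, 0, 0)
end = datetime(2016, 2, 1, 0, 0)  # exclusive upper bound
step = timedelta(hours=3)

day_time = []
current = start
while current < end:
    day_time.append(current)
    current += step

print(len(day_time))
print(day_time[0].strftime("%Y-%m-%d %H:%M"))
print(day_time[-1].strftime("%Y-%m-%d %H:%M"))
```

Each entry can then be compared directly with timestamps parsed from the file, or formatted with strftime to match the text in the CSV.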
I'm having some difficulty understanding why my loop is not deleting all the invalid dates from a list of date tuples in the format dd/mm/yyyy. Here's what I have so far:
dates = [('12','10','1987'),('13','09','2010'), ('34','02','2002'), ('02','15','2005'),('37','10','2016'),('39','11','2001')]
print(dates)
for date in dates:
    day = int(date[0])
    month = int(date[1])
    year = int(date[2])
    if day > 31:
        dates.remove(date)
    if month > 12:
        dates.remove(date)
print(dates)
And here's the result:
[('12', '10', '1987'), ('13', '09', '2010'), ('34', '02', '2002'), ('02', '15', '2005'), ('37', '10', '2016'), ('39', '11', '2001')]
[('12', '10', '1987'), ('13', '09', '2010'), ('02', '15', '2005'), ('39', '11', '2001')]
I'm a total beginner and any help would be much appreciated.
Never modify the (length of the) list you are looping over. Instead, use for example a temporary list:
dates = [('12','10','1987'),('13','09','2010'), ('34','02','2002'), ('02','15','2005'),('37','10','2016'),('39','11','2001')]
print(dates)
out = []
for date in dates:
    day = int(date[0])
    month = int(date[1])
    year = int(date[2])
    if day > 31 or month > 12:
        continue
    out.append(date)
dates = out
print(dates)
The continue statement skips the rest of the current iteration and moves on to the next item, so the unwanted dates are never appended to out.
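To see why the original loop misses some entries, here is a small sketch (the list nums is made up for illustration). The for loop walks the list by position, so removing the current item shifts the next one into the slot that was just visited, and that item is skipped:

```python
# Hypothetical data: try to remove every value greater than 31 while iterating.
nums = [12, 34, 37, 5, 39]
visited = []
for n in nums:
    visited.append(n)
    if n > 31:
        nums.remove(n)  # shifts all later items one position to the left

print(visited)  # the values the loop actually looked at
print(nums)     # whatever is left, including any skipped invalid value
```

The value 37 is never visited, which is the same effect that leaves ('02', '15', '2005') and ('39', '11', '2001') in the question's output.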
Better alternative concerning dates
Commenting on the "date checking" functionality of the program: it can be really hard to encode by hand which dates are acceptable and which are not. Consider, for example, 29 February, which is valid only in leap years (roughly every fourth year).
What you could do instead is to use the datetime library to try to parse the strings to datetime objects, and if the parsing fails, you know the date is illegal.
import datetime as dt
dates = [('12','10','1987'),('13','09','2010'), ('34','02','2002'), ('02','15','2005'),('37','10','2016'),('39','11','2001')]
def filter_bad_dates(dates):
    out = []
    for date in dates:
        try:
            dt.datetime.strptime('-'.join(date), '%d-%m-%Y')
        except ValueError:
            continue
        out.append(date)
    return out
dates = filter_bad_dates(dates)
print(dates)
This try/except pattern is usually called EAFP ("easier to ask forgiveness than permission"), and it goes hand in hand with "duck typing":
If it looks like a date and gets parsed like a proper date, then it is probably a proper date.
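As a quick check of the leap-year case mentioned above, the sketch below applies the same strptime test to a few tricky dates. The helper name is_valid_date is made up for this example:

```python
import datetime as dt

def is_valid_date(date):
    """Return True if the (day, month, year) string tuple is a real calendar date."""
    try:
        dt.datetime.strptime('-'.join(date), '%d-%m-%Y')
    except ValueError:
        return False
    return True

print(is_valid_date(('29', '02', '2016')))  # 2016 is a leap year
print(is_valid_date(('29', '02', '2015')))  # 2015 is not
print(is_valid_date(('31', '04', '2010')))  # April has only 30 days
```

A plain day > 31 or month > 12 check would accept all three of these, which is the main argument for letting datetime do the validation.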
You can easily accomplish that with this list comprehension:
dates = [('12','10','1987'),('13','09','2010'), ('34','02','2002'), ('02','15','2005'),('37','10','2016'),('39','11','2001')]
dates = [date for date in dates if int(date[1]) <= 12 and int(date[0]) <= 31]
print(dates)
Output:
[('12', '10', '1987'), ('13', '09', '2010')]
I like @AnnZen's comprehension approach (+1), though my tendency would be to go more symbolic at the cost of some time and space:
dates = [
    ('12', '10', '1987'),
    ('13', '09', '2010'),
    ('34', '02', '2002'),
    ('02', '15', '2005'),
    ('37', '10', '2016'),
    ('39', '11', '2001'),
]
# String comparison works here only because day and month are zero-padded to two digits
dates = [date for (day, month, _), date in zip(dates, dates) if day <= '31' and month <= '12']
print(dates)
OUTPUT
> python3 test.py
[('12', '10', '1987'), ('13', '09', '2010')]
>
As for @np8's "Never modify the list you are looping over", that's excellent advice. Though, again, I might waste some space making a copy up front to keep my code simple:
for date in list(dates):  # iterate over a copy
    day, month, _ = date
    if int(day) > 31 or int(month) > 12:
        dates.remove(date)
Though in the end, @np8's filtering through datetime seems the most reliable solution. (+1)
I am trying to write a script that reads a USGS seismic bulletin and extracts some data to build a new txt file, which will serve as input for another program, Zmap, used for seismic statistics.
I have the following USGS bulletin format:
time,latitude,longitude,depth,mag,magType,nst,gap,dmin,rms,net,id,updated,place,type,horizontalError,depthError,magError,magNst,status,locationSource,magSource
2016-03-31T07:53:28.830Z,-22.6577,-68.5345,95.74,4.8,mww,,33,0.35,0.97,us,us20005dm3,2016-05-07T05:09:39.040Z,"43km NW of San Pedro de Atacama, Chile",earthquake,6.5,4.3,,,reviewed,us,us
2016-03-31T07:17:19.300Z,-18.779,-67.3104,242.42,4.5,mb,,65,1.987,0.85,us,us20005dlx,2016-04-24T07:21:05.358Z,"55km WSW of Totoral, Bolivia",earthquake,10.2,12.6,0.204,7,reviewed,us,us
The file contains many seismic events, so I wrote the following code, which reads the file, splits each line, and saves some variables in lists so I can put them all together in a final *.txt file.
import os, sys
import csv
import string
from itertools import (takewhile,repeat)
os.chdir('D:\\Seismic_Inves\\b-value_osc\\try_tonino')
archi=raw_input('NOMBRE DEL BOLETIN---> ')
ff=open(archi,'rb')
bufgen=takewhile(lambda x: x, (ff.read(1024*1024) for _ in repeat(None)))
numdelins= sum(buf.count(b'\n') for buf in bufgen if buf) - 1
with open(archi,'rb') as f:
    next(f)
    tiempo=[]
    lat=[]
    lon=[]
    prof=[]
    mag=[]
    t_mag=[]
    leo=csv.reader(f,delimiter=',')
    for line in leo:
        tiempo.append(line[0])
        lat.append(line[1])
        lon.append(line[2])
        prof.append(line[3])
        mag.append(line[4])
        t_mag.append(line[5])
tiempo=[s.replace('T', ' ') for s in tiempo] # replace the T with a space
tiempo=[s.replace('Z','') for s in tiempo] # remove the Z
tiempo=[s.replace(':',' ') for s in tiempo] # replace the colons with spaces
tiempo=[s.replace('-',' ') for s in tiempo] # replace the hyphens with spaces
From the USGS catalog I'd like to take the latitude (lat), longitude (lon), time (tiempo), depth (prof), magnitude (mag), and type of magnitude (t_mag). With this part of the code I took the variables I needed:
next(f)
tiempo=[]
lat=[]
lon=[]
prof=[]
mag=[]
t_mag=[]
leo=csv.reader(f,delimiter=',')
for line in leo:
    tiempo.append(line[0])
    lat.append(line[1])
    lon.append(line[2])
    prof.append(line[3])
    mag.append(line[4])
    t_mag.append(line[5])
But I had some trouble with the time, so I applied my newbie knowledge to convert it from 2016-03-31T07:53:28.830Z to 2016 03 31 07 53 28.830.
Now I am struggling to get the years in one list ([2016,2016,2016,...]), the months in another ([01,01,...03,03,...12]), the days in another ([12,14,...03,11]), the hours in another ([13,22,14,17...]), and the minutes and seconds merged with a point (.), like ([minute.seconds]) or ([12.234,14.443,...]). I tried the following to split on the spaces, without success:
tiempo2=[]
for element in tiempo:
    tiempo2.append(element.split(' '))
print tiempo2
It did not work, because I got this result:
[['2016', '03', '31', '07', '53', '28.830'], ['2016', '03', '31', '07', '17', '19.300']]
Can you give me a hand with this part? Or is there a Pythonic way to split the date as I described?
Thank you for the time you spent reading it.
best regards.
Tonino
Suppose tiempo2 holds the following values extracted from the CSV:
>>> tiempo2 = [['2016', '03', '31', '07', '53', '28.830'], ['2016', '03', '31', '07', '17', '19.300']]
>>> list (map (list, (map (float, items) if index == 5 else map (int, items) for index, items in enumerate (zip (*tiempo2)))))
[[2016, 2016], [3, 3], [31, 31], [7, 7], [53, 17], [28.83, 19.3]]
Here zip(*tiempo2) transposes the rows, grouping together the years, the months, the days, and so on.
Each group is then mapped to int, except the last one (index 5, the seconds), which is mapped to float.
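If separate, named lists are easier to work with downstream, the same transposition can be unpacked directly. A minimal sketch, assuming tiempo2 has the shape shown above; the variable names years, months and so on are illustrative:

```python
tiempo2 = [['2016', '03', '31', '07', '53', '28.830'],
           ['2016', '03', '31', '07', '17', '19.300']]

# zip(*rows) turns a list of rows into a sequence of columns.
years, months, days, hours, minutes, seconds = zip(*tiempo2)

# Convert each column: integers everywhere except the fractional seconds.
years = [int(y) for y in years]
months = [int(m) for m in months]
days = [int(d) for d in days]
hours = [int(h) for h in hours]
minutes = [int(m) for m in minutes]
seconds = [float(s) for s in seconds]

print(years)
print(minutes)
print(seconds)
```

This is longer than the one-liner, but each list has a name, which may be easier to maintain when the columns are written out to the Zmap input file.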
I would suggest using the time.strptime() function to parse the time string into a Python time.struct_time which is a namedtuple. That means you can access any attributes you want using . notation.
Here's what I mean:
import time
time_string = '2016-03-31T07:53:28.830Z'
timestamp = time.strptime(time_string, '%Y-%m-%dT%H:%M:%S.%fZ')
print(type(timestamp))
print(timestamp.tm_year) # -> 2016
print(timestamp.tm_mon) # -> 3
print(timestamp.tm_mday) # -> 31
print(timestamp.tm_hour) # -> 7
print(timestamp.tm_min) # -> 53
print(timestamp.tm_sec) # -> 28
print(timestamp.tm_wday) # -> 3
print(timestamp.tm_yday) # -> 91
print(timestamp.tm_isdst) # -> -1
You could process a list of time strings by using a for loop as shown below:
import time
tiempo = ['2016-03-31T07:53:28.830Z', '2016-03-31T07:17:19.300Z']
for time_string in tiempo:
    timestamp = time.strptime(time_string, '%Y-%m-%dT%H:%M:%S.%fZ')
    print('year: {}, mon: {}, day: {}, hour: {}, min: {}, sec: {}'.format(
        timestamp.tm_year, timestamp.tm_mon, timestamp.tm_mday,
        timestamp.tm_hour, timestamp.tm_min, timestamp.tm_sec))
Output:
year: 2016, mon: 3, day: 31, hour: 7, min: 53, sec: 28
year: 2016, mon: 3, day: 31, hour: 7, min: 17, sec: 19
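One caveat, offered as a side note: time.struct_time has no field for fractions of a second, so the .830 part is discarded by the approach above. If the fractional seconds matter, as the question suggests, datetime.datetime.strptime accepts the same format string and keeps them as microseconds. A small sketch (the name seconds_with_fraction is illustrative):

```python
from datetime import datetime

time_string = '2016-03-31T07:53:28.830Z'
stamp = datetime.strptime(time_string, '%Y-%m-%dT%H:%M:%S.%fZ')

# Recombine whole seconds and microseconds into a single float.
seconds_with_fraction = stamp.second + stamp.microsecond / 1e6

print(stamp.year, stamp.month, stamp.day, stamp.hour, stamp.minute)
print(stamp.microsecond)
print(seconds_with_fraction)
```

Note that the trailing Z is matched as a literal character here, so the resulting datetime carries no time zone information.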
Another solution with the iso8601 add-on (pip install iso8601)
>>> import iso8601
>>> dt = iso8601.parse_date('2016-03-31T07:17:19.300Z')
>>> dt.year
2016
>>> dt.month
3
>>> dt.day
31
>>> dt.hour
7
>>> dt.minute
17
>>> dt.second
19
>>> dt.microsecond
300000
>>> dt.tzname()
'UTC'
Edited 2017/8/6 12h55
IMHO, it is a bad idea to split the datetime objects into components (year, month, ...) stored in individual lists. Keeping the datetime objects as returned by iso8601.parse_date(...) makes it easy to compute time deltas between events, check the chronological order, and so on. See the datetime module documentation for more: https://docs.python.org/3/library/datetime.html
Having distinct lists for year, month, (...) would make such operations difficult. Anyway, if you prefer this solution, here are the changes:
import iso8601
# Start as in the former solution
with open(archi,'rb') as f:
    next(f)
    # tiempo=[]
    dt_years = []
    dt_months = []
    dt_days = []
    dt_hours = []
    dt_minutes = []
    dt_timezones = []
    lat=[]
    lon=[]
    prof=[]
    mag=[]
    t_mag=[]
    leo=csv.reader(f,delimiter=',')
    for line in leo:
        # tiempo.append(line[0])
        dt = iso8601.parse_date(line[0])
        dt_years.append(dt.year)
        dt_months.append(dt.month)
        dt_days.append(dt.day)
        dt_hours.append(dt.hour)
        # decimal minutes: minutes + seconds/60 + microseconds/60,000,000
        dec_minutes = dt.minute + (dt.second / 60.0) + (dt.microsecond / 60000000.0)
        dt_minutes.append(dec_minutes)
        dt_timezones.append(dt.tzname())
        lat.append(line[1])
        lon.append(line[2])
        prof.append(line[3])
        mag.append(line[4])
        t_mag.append(line[5])
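To illustrate the point about keeping datetime objects, here is a minimal sketch using only the standard library. datetime.strptime stands in for iso8601.parse_date so the example has no third-party dependency, and the variable names are illustrative. It sorts the two events from the bulletin sample and computes the time between them:

```python
from datetime import datetime

raw = ['2016-03-31T07:53:28.830Z', '2016-03-31T07:17:19.300Z']
fmt = '%Y-%m-%dT%H:%M:%S.%fZ'  # the trailing Z is matched literally

# Parsing to datetime objects makes chronological sorting trivial.
events = sorted(datetime.strptime(s, fmt) for s in raw)

delta = events[1] - events[0]  # a datetime.timedelta
print(delta)
print(delta.total_seconds())
```

The same subtraction would take noticeably more code if the year, month, day, hour and minute lived in separate lists.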
I am new to Python and would greatly appreciate some help.
I have data generated by a weather station (rawdate) in the format 2015-04-26 00:00:48, like this:
Date,Ambient Temperature (C),Wind Speed (m/s)
2015-04-26 00:00:48,10.75,0.00
2015-04-26 00:01:48,10.81,0.43
2015-04-26 00:02:48,10.81,0.32
and I would like to split them into year, month, day, hour and minute. My attempt so far is this:
for i in range(len(rawdate)):
    x=rawdate[1].split()
    date.append(x)
but it gives me a list full of empty lists. My goal is to convert this into a list of lists (using split), where each entry x has the form [date, time]. Then I want to split further, using split with "-" and ":". Can someone offer some advice?
>>> from datetime import datetime
>>> str_date = '2015-04-26 00:00:48'
>>> datte = datetime.strptime(str_date, '%Y-%m-%d %H:%M:%S')
>>> t = datte.timetuple()
>>> y, m, d, h, min, sec, wd, yd, i = t
>>> y
2015
>>> m
4
>>> min
0
>>> sec
48
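Applied to the whole column rather than a single string, the same call could look like the sketch below. It assumes rawdate is a list of timestamp strings read from the Date column; the sample values are copied from the question and the name parsed is illustrative:

```python
from datetime import datetime

# Assumption: rawdate is a list of strings, one per row of the CSV.
rawdate = ['2015-04-26 00:00:48', '2015-04-26 00:01:48', '2015-04-26 00:02:48']

date = []
for raw in rawdate:
    parsed = datetime.strptime(raw, '%Y-%m-%d %H:%M:%S')
    # Keep only the fields the question asks for.
    date.append([parsed.year, parsed.month, parsed.day, parsed.hour, parsed.minute])

print(date)
```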
Your code is clearly broken, because the loop does nothing other than repeat the same operation on rawdate[1], len(rawdate) times.
It's possible that you meant i where you have 1.
For this to make sense, your rawdate would have to be a list of strings (as suggested by @SuperBiasedMan).
Maybe something close to what you were after is this:
>>> dates = []
>>> rawdates = ['2015-04-26 00:00:48', '2015-04-26 00:00:49']
>>> for i in range(len(rawdates)):
...     the_date = rawdates[i].split()
...     dates.append(the_date)
...
>>> dates
[['2015-04-26', '00:00:48'], ['2015-04-26', '00:00:49']]
>>>
Always use meaningful names.
If rawdate is a single string, rawdate[1] will always return '0', because '2015...'[1] is '0'.
>>> a = '2015-04-26 00:00:48'
>>> print(a.split(' ')[0].split('-') + a.split(' ')[1].split(':'))
['2015', '04', '26', '00', '00', '48']
I have data like this:
startDateTime: {'timeZoneID': 'America/New_York', 'date': {'year': '2014', 'day': '29', 'month': '1'}, 'second': '0', 'hour': '12', 'minute': '0'}
This is just a representation of one attribute. I have 5 other attributes like this: LastModified, created, etc.
I want to render this in ISO date format, yyyy-mm-dd hh:mi:ss. Is this the right way of doing it?
def parse_date(datecol):
    x=datecol;
    y=str(x.get('date').get('year'))+'-'+str(x.get('date').get('month')).zfill(2)+'-'+str(x.get('date').get('day')).zfill(2)+' '+str(x.get('hour')).zfill(2)+':'+str(x.get('minute')).zfill(2)+':'+str(x.get('second')).zfill(2)
    print y;
    return;
That works, but I'd say it's cleaner to use the string formatting operator here:
def parse_date(c):
    d = c["date"]
    # %d needs integers, so convert the string fields with int, not str
    print("%04d-%02d-%02d %02d:%02d:%02d" % tuple(map(int, (d["year"], d["month"], d["day"], c["hour"], c["minute"], c["second"]))))
Alternatively, you can use the time module to convert your fields into a Python time value, and then format that using strftime. Remember the time zone, though.
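A minimal sketch of that alternative, using datetime rather than the time module for brevity. The function name to_iso_string is made up for illustration, and the timeZoneID field is deliberately ignored, so the result is a naive timestamp; converting between zones would need extra handling.

```python
from datetime import datetime

def to_iso_string(c):
    """Format the nested date dict as 'yyyy-mm-dd hh:mi:ss' (time zone ignored)."""
    d = c["date"]
    stamp = datetime(int(d["year"]), int(d["month"]), int(d["day"]),
                     int(c["hour"]), int(c["minute"]), int(c["second"]))
    return stamp.strftime("%Y-%m-%d %H:%M:%S")

startDateTime = {'timeZoneID': 'America/New_York',
                 'date': {'year': '2014', 'day': '29', 'month': '1'},
                 'second': '0', 'hour': '12', 'minute': '0'}
print(to_iso_string(startDateTime))
```

A side benefit of going through datetime is that impossible values, such as month 13, raise a ValueError instead of being formatted silently.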