Iteration over two lists using index - python

I am trying to create a list of dates and times at specific intervals. The dates and times are present in a time series CSV, and I want to write code that extracts data from specific time intervals. I made two lists for day and hour, and I am creating a new variable that stores the date and time of interest. I have tried the following code, but I get an error:
day = ['01', '02', '03', '04', "05", '06', '07', '08', '09', '10', '11', '12','13','14','15','16','17','18'
'19','20','21','22','23','24','25','26','27','28','29','30','31']
hour = ['0', '3', '6', '9', '12', '15','18','21']
year, month, day, hour = year, month, day, hour # 2016-01-01 #01:00 am
day_time = []
for i in day.index:
    for j in hour.index:
        day_time = int("".join(day[i], hour[j], "00",))
print(day_time)
TypeError Traceback (most recent call last)
<ipython-input-72-15de17abf279> in <module>
6 year, month, day, hour = year, month, day, hour # 2016-01-01 #01:00 am
7 day_time = []
----> 8 for i in day.index:
9 for j in hour.index:
10 day_time = int("".join(day[i], hour[j], "00",))
TypeError: 'builtin_function_or_method' object is not iterable
Can someone suggest a solution?

index is a method of list, not an attribute you can iterate over, so day.index is a function object rather than a sequence of positions (see the "Data Structures" section of the Python tutorial).
Also, str.join takes a single iterable of strings, not several separate arguments.
Also, as @Lecdi pointed out, you should use append to add to a list instead of rebinding the variable with =.
To do what you want:
# comma added after '18'; without it the adjacent literals '18' '19' merge into '1819'
day = ['01', '02', '03', '04', "05", '06', '07', '08', '09', '10', '11', '12','13','14','15','16','17','18',
       '19','20','21','22','23','24','25','26','27','28','29','30','31']
hour = ['0', '3', '6', '9', '12', '15','18','21']
# the self-assignment of year, month, day, hour is dropped: year and month are never defined
day_time = []
for day_i in day:
    for hour_i in hour:
        day_time.append(int("".join([day_i, hour_i, "00"])))
print(day_time)
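
If the goal is real dates and times rather than concatenated digits, one possible alternative is to generate datetime objects at a fixed step. This is only a sketch: the helper name timestamps is hypothetical, and the range (January 2016, every 3 hours) is an assumption taken from the "2016-01-01" comment and the hour list in the question.

```python
from datetime import datetime, timedelta

def timestamps(start, end, step_hours=3):
    """Return datetimes from start (inclusive) to end (exclusive), step_hours apart."""
    out = []
    current = start
    while current < end:
        out.append(current)
        current += timedelta(hours=step_hours)
    return out

# Assumed range: all of January 2016, one entry every 3 hours.
stamps = timestamps(datetime(2016, 1, 1), datetime(2016, 2, 1))
print(len(stamps))                            # 31 days * 8 entries = 248
print(stamps[1].strftime("%Y-%m-%d %H:%M"))   # 2016-01-01 03:00
```

Keeping datetime objects also handles months with fewer than 31 days, which the two hand-written lists do not.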

I think enumerate() would work better for you if you also need the index alongside each value:
for indexDay, valueDay in enumerate(day):
    for indexHour, valueHour in enumerate(hour):
        day_time.append(int("".join([valueDay, valueHour, "00"])))

Related

Python - Read file, rearrange date and change year from yy to yyyy

I'm a Python newb. I've been learning a bit and am on files. So I have a program that I'm trying to write where it reads data from a file. Each line in the file is a date in the format dd-mmm-yy for example, 13-SEP-20. For simplicity, each year is assumed to be in the year 2000 or later. I need to replace SEP with 09 and rearrange the output to be mm-dd-yyyy. So the end result would be:
13-SEP-20 to 09-13-2020
So far, I have the below code. It's replacing values on each line using the dictionary for the month and rearranging to the correct order. Next, I need to change the year to 20nn but I'm not sure how to do that part.
import datetime
months = {'JAN': '01', 'FEB': '02', 'MAR': '03', 'APR': '04', 'MAY': '05', 'JUN': '06',
'JUL': '07', 'AUG': '08', 'SEP': '09', 'OCT': '10', 'NOV': '11', 'DEC': '12'}
result_dic = {}
#replace mmm with mm int
with open('date_file.txt') as fh:
    giv_date=fh.read().splitlines()
    for line in giv_date:
        line = line.rstrip()
        for mon_alph, mon_num in months.items():
            if mon_alph in line:
                line = line.replace(mon_alph, mon_num)
        line = datetime.datetime.strptime(line,'%d-%m-%y')
        line = datetime.datetime.strftime(line,'%m-%d-%y')
        print(line)
The output from the above for the first few lines is:
09-13-20
09-11-20
09-10-20
08-27-19
08-24-20
Can someone assist me with how I can change the yy to yyyy? For simplicity I'd say possibly just adding 2000 to each year but I think that would possibly be too complex and there may be a simpler way? Thank you in advance for your assistance.
Try %Y instead of %y. You can read more about it in the Python documentation for strftime. Basically, %y gives only the last two digits of the year (20), while %Y gives the full four-digit year (2020).
To get a four-digit year, replace the strftime line with:
line = datetime.datetime.strftime(line,'%m-%d-%Y')
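
As a possible simplification, strptime can also parse the month abbreviation directly with %b, which would make the lookup dictionary unnecessary. This is a sketch under two assumptions: the helper name reformat_date is hypothetical, and the program runs under an English (or C) locale, since %b matches locale-specific month names.

```python
from datetime import datetime

def reformat_date(text):
    """Convert 'dd-mmm-yy' (e.g. '13-SEP-20') to 'mm-dd-yyyy'."""
    # %b matches the abbreviated month name (case-insensitively when parsing);
    # %y reads a two-digit year, and 00-68 are interpreted as 2000-2068.
    parsed = datetime.strptime(text.strip(), "%d-%b-%y")
    return parsed.strftime("%m-%d-%Y")

for sample in ["13-SEP-20", "27-AUG-19"]:
    print(sample, "to", reformat_date(sample))
```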

Looping through a list of tuples and removing them

I'm having some difficulty understanding why my loop is not deleting invalid dates from a list of date tuples in the format dd/mm/yyyy. Here's what I have so far:
dates = [('12','10','1987'),('13','09','2010'), ('34','02','2002'), ('02','15','2005'),('37','10','2016'),('39','11','2001')]
print(dates)
for date in dates :
    day = int(date[0])
    month = int(date[1])
    year = int(date[2])
    if day > 31 :
        dates.remove(date)
    if month > 12 :
        dates.remove(date)
print(dates)
And here's the result:
[('12', '10', '1987'), ('13', '09', '2010'), ('34', '02', '2002'), ('02', '15', '2005'), ('37', '10', '2016'), ('39', '11', '2001')]
[('12', '10', '1987'), ('13', '09', '2010'), ('02', '15', '2005'), ('39', '11', '2001')]
I'm a total beginner and any help would be much appreciated.
Never modify the (length of the) list you are looping over. Instead, use for example a temporary list:
dates = [('12','10','1987'),('13','09','2010'), ('34','02','2002'), ('02','15','2005'),('37','10','2016'),('39','11','2001')]
print(dates)
out = []
for date in dates :
    day = int(date[0])
    month = int(date[1])
    year = int(date[2])
    if day > 31 or month > 12:
        continue
    out.append(date)
dates = out
print(dates)
The continue statement skips the rest of the loop body and moves on to the next iteration, so the unwanted dates are never appended.
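
To see why removing items during iteration skips elements, here is a small sketch with made-up numbers (the values list is purely illustrative). Removing an item shifts the later items one position to the left, so the loop's internal index steps over the item that moved into the freed slot.

```python
values = [40, 41, 5, 50]

# Unsafe: after 40 is removed, 41 slides into index 0 and is never visited.
unsafe = list(values)
for v in unsafe:
    if v > 31:
        unsafe.remove(v)

# Safe: build a new list instead of mutating the one being iterated.
safe = [v for v in values if v <= 31]

print(unsafe)  # [41, 5]
print(safe)    # [5]
```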
Better alternative concerning dates
Commenting on the "date checking" functionality of the program: it can be really hard to determine by your own rules which dates are acceptable and which are not. Consider, for example, Feb 29th, which is valid only in leap years.
What you could do instead is use the datetime module to try to parse the strings into datetime objects; if the parsing fails, you know the date is invalid.
import datetime as dt
dates = [('12','10','1987'),('13','09','2010'), ('34','02','2002'), ('02','15','2005'),('37','10','2016'),('39','11','2001')]
def filter_bad_dates(dates):
    out = []
    for date in dates:
        try:
            dt.datetime.strptime('-'.join(date), '%d-%m-%Y')
        except ValueError:
            continue
        out.append(date)
    return out
dates = filter_bad_dates(dates)
print(dates)
This try/except pattern is usually called EAFP ("easier to ask forgiveness than permission"), and it is in the same spirit as duck typing:
If it looks like a date and gets parsed like a proper date, then it is probably a proper date.
You can easily accomplish that with this list comprehension (note <= rather than <, so that day 31 and month 12 are kept):
dates = [('12','10','1987'),('13','09','2010'), ('34','02','2002'), ('02','15','2005'),('37','10','2016'),('39','11','2001')]
dates = [date for date in dates if int(date[1]) <= 12 and int(date[0]) <= 31]
print(dates)
Output:
[('12', '10', '1987'), ('13', '09', '2010')]
I like @AnnZen's comprehension approach (+1), though my tendency would be to go more symbolic at the cost of some time and space:
dates = [
    ('12', '10', '1987'),
    ('13', '09', '2010'),
    ('34', '02', '2002'),
    ('02', '15', '2005'),
    ('37', '10', '2016'),
    ('39', '11', '2001'),
]
# string comparison works here only because day and month are zero-padded two-digit strings
dates = [date for (day, month, _), date in zip(dates, dates) if day <= '31' and month <= '12']
print(dates)
OUTPUT
> python3 test.py
[('12', '10', '1987'), ('13', '09', '2010')]
>
As for @np8's "Never modify the list you are looping over", that's excellent advice. Though, again, I might waste some space making the copy up front to make my code simpler:
for date in list(dates): # iterate over a copy
    day, month, _ = date
    if int(day) > 31 or int(month) > 12:
        dates.remove(date)
Though in the end, @np8's filtering through datetime seems the most reliable solution. (+1)

In python..looking for a simple code to output string from datetime and float from a list

I would like to loop through each average(index[0]) and each hour(index[1]) (in this order) in the first five lists of:
a = [[38.59, '15'], [23.81, '02'], [21.52, '20'], [16.8, '16'], [16.01, '21'], [14.74, '13'], [13.44, '10'], [13.24, '18'], [13.23, '14'], [11.46, '17']]
I would like to use the str.format() method to print the hour and average in the following format:
output str = "15:00: 38.59 average comments per post"
To format the hours, I can use the datetime.strptime() constructor to return a datetime object and then use the strftime() method to specify the format of the time.
To format the average, I can use {:.2f} to indicate that just two decimal places should be used.
How can I accomplish this with 2-3 lines of coding?
from datetime import datetime as dt
a = [[38.59, '15'], [23.81, '02'], [21.52, '20'], [16.8, '16'], [16.01, '21'], [14.74, '13'], [13.44, '10'], [13.24, '18'], [13.23, '14'], [11.46, '17']]
for elem in a[:5]:
    dtObj = dt.strptime(elem[1], '%H')
    timeString = dt.strftime(dtObj, '%H:%M')
    roundedAverageString = "{0:.2f}".format(elem[0])
    print("{}: {} average comments per post".format(timeString, roundedAverageString))
output example:
15:00: 38.59 average comments per post
Using datetime is overkill given the data:
data = [[38.59, '15'], [23.81, '02'], [21.52, '20'], [16.8, '16'], [16.01, '21'], [14.74, '13'], [13.44, '10'], [13.24, '18'], [13.23, '14'], [11.46, '17']]
for ave,hr in data[:5]:
    print(f'{hr}:00: {ave:5.2f} average comments per post')
15:00: 38.59 average comments per post
02:00: 23.81 average comments per post
20:00: 21.52 average comments per post
16:00: 16.80 average comments per post
21:00: 16.01 average comments per post
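
Since the question asks specifically for str.format(), the same idea can be sketched without f-strings. The names rows, template and lines are illustrative only, and the data is the first five entries from the question.

```python
rows = [[38.59, '15'], [23.81, '02'], [21.52, '20'], [16.8, '16'], [16.01, '21']]

# Named fields keep the template readable; .2f pads 16.8 to 16.80.
template = "{hour}:00: {avg:.2f} average comments per post"
lines = [template.format(hour=hour, avg=avg) for avg, hour in rows]
print("\n".join(lines))
```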

create separate new list from one mother list

I am trying to write a script that reads a USGS seismic bulletin and extracts some data to build a new txt file, which will serve as input for another program called Zmap that computes seismic statistics.
So I have the following USGS bulletin format:
time,latitude,longitude,depth,mag,magType,nst,gap,dmin,rms,net,id,updated,place,type,horizontalError,depthError,magError,magNst,status,locationSource,magSource
2016-03-31T07:53:28.830Z,-22.6577,-68.5345,95.74,4.8,mww,,33,0.35,0.97,us,us20005dm3,2016-05-07T05:09:39.040Z,"43km NW of San Pedro de Atacama, Chile",earthquake,6.5,4.3,,,reviewed,us,us
2016-03-31T07:17:19.300Z,-18.779,-67.3104,242.42,4.5,mb,,65,1.987,0.85,us,us20005dlx,2016-04-24T07:21:05.358Z,"55km WSW of Totoral, Bolivia",earthquake,10.2,12.6,0.204,7,reviewed,us,us
This contains many seismic events, so I wrote the following code, which basically tries to read, split, and save some variables in lists to put them all together in a final *.txt file.
import os, sys
import csv
import string
from itertools import (takewhile,repeat)
os.chdir('D:\\Seismic_Inves\\b-value_osc\\try_tonino')
archi=raw_input('NOMBRE DEL BOLETIN---> ')
ff=open(archi,'rb')
bufgen=takewhile(lambda x: x, (ff.read(1024*1024) for _ in repeat(None)))
numdelins= sum(buf.count(b'\n') for buf in bufgen if buf) - 1
with open(archi,'rb') as f:
    next(f)
    tiempo=[]
    lat=[]
    lon=[]
    prof=[]
    mag=[]
    t_mag=[]
    leo=csv.reader(f,delimiter=',')
    for line in leo:
        tiempo.append(line[0])
        lat.append(line[1])
        lon.append(line[2])
        prof.append(line[3])
        mag.append(line[4])
        t_mag.append(line[5])
tiempo=[s.replace('T', ' ') for s in tiempo] # replace the T with a space
tiempo=[s.replace('Z','') for s in tiempo] # remove the Z
tiempo=[s.replace(':',' ') for s in tiempo] # replace the : with spaces
tiempo=[s.replace('-',' ') for s in tiempo] # replace the - with spaces
From the USGS catalog I'd like to take the latitude (lat), longitude (lon), time (tiempo), depth (prof), magnitude (mag), and type of magnitude (t_mag). With this part of the code I took the variables I needed:
next(f)
tiempo=[]
lat=[]
lon=[]
prof=[]
mag=[]
t_mag=[]
leo=csv.reader(f,delimiter=',')
for line in leo:
    tiempo.append(line[0])
    lat.append(line[1])
    lon.append(line[2])
    prof.append(line[3])
    mag.append(line[4])
    t_mag.append(line[5])
but I had some trouble with the time, so I applied my newbie knowledge to convert the time from 2016-03-31T07:53:28.830Z to 2016 03 31 07 53 28.830.
Now I am struggling to get the years in one list ([2016,2016,2016,...]), the months in another ([01,01,...03,03,...12]), the days in another ([12,14,...03,11]), the hours in another ([13,22,14,17...]), and the minutes and seconds merged by a point (.) like ([minute.seconds]) or ([12.234,14.443,...]). So I tried this (to split on the spaces), with no success:
tiempo2=[]
for element in tiempo:
    tiempo2.append(element.split(' '))
print tiempo2
No success, because I got this result:
[['2016', '03', '31', '07', '53', '28.830'], ['2016', '03', '31', '07', '17', '19.300'].
Can you give me a hand with this part, or is there a Pythonic way to split the date like I described?
Thank you for the time you spent reading this.
Best regards.
Tonino
Suppose tiempo2 holds the following value extracted from the CSV:
>>> tiempo2 = [['2016', '03', '31', '07', '53', '28.830'], ['2016', '03', '31', '07', '17', '19.300']]
>>> list(map(list, (map(float, items) if index == 5 else map(int, items) for index, items in enumerate(zip(*tiempo2)))))
[[2016, 2016], [3, 3], [31, 31], [7, 7], [53, 17], [28.83, 19.3]]
Here zip(*tiempo2) transposes the rows, grouping all the years together, all the months together, and so on.
Each group is then converted conditionally: to float if it is the last one (index 5, the seconds), otherwise to int.
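
The one-liner above can be hard to read, so here is the same transposition spelled out step by step. This is only a sketch: the function name split_columns and the dictionary keys are hypothetical, and every row is assumed to have exactly six fields.

```python
tiempo2 = [['2016', '03', '31', '07', '53', '28.830'],
           ['2016', '03', '31', '07', '17', '19.300']]

def split_columns(rows):
    """Transpose rows of [year, month, day, hour, minute, second] strings into typed lists."""
    years, months, days, hours, minutes, seconds = zip(*rows)
    return {
        "year": [int(v) for v in years],
        "month": [int(v) for v in months],
        "day": [int(v) for v in days],
        "hour": [int(v) for v in hours],
        "minute": [int(v) for v in minutes],
        "second": [float(v) for v in seconds],  # seconds keep their fraction
    }

columns = split_columns(tiempo2)
print(columns["minute"])  # [53, 17]
print(columns["second"])  # [28.83, 19.3]
```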
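I would suggest using the time.strptime() function to parse the time string into a Python time.struct_time, which behaves like a named tuple. That means you can access any attribute you want using . notation. Note that struct_time has no field for fractional seconds, so the .830 part is not kept.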
Here's what I mean:
import time
time_string = '2016-03-31T07:53:28.830Z'
timestamp = time.strptime(time_string, '%Y-%m-%dT%H:%M:%S.%fZ')
print(type(timestamp))
print(timestamp.tm_year) # -> 2016
print(timestamp.tm_mon) # -> 3
print(timestamp.tm_mday) # -> 31
print(timestamp.tm_hour) # -> 7
print(timestamp.tm_min) # -> 53
print(timestamp.tm_sec) # -> 28
print(timestamp.tm_wday) # -> 3
print(timestamp.tm_yday) # -> 91
print(timestamp.tm_isdst) # -> -1
You could process a list of time strings by using a for loop as shown below:
import time
tiempo = ['2016-03-31T07:53:28.830Z', '2016-03-31T07:17:19.300Z']
for time_string in tiempo:
    timestamp = time.strptime(time_string, '%Y-%m-%dT%H:%M:%S.%fZ')
    print('year: {}, mon: {}, day: {}, hour: {}, min: {}, sec: {}'.format(
        timestamp.tm_year, timestamp.tm_mon, timestamp.tm_mday,
        timestamp.tm_hour, timestamp.tm_min, timestamp.tm_sec))
Output:
year: 2016, mon: 3, day: 31, hour: 7, min: 53, sec: 28
year: 2016, mon: 3, day: 31, hour: 7, min: 17, sec: 19
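
As the output shows, the fractional part of the seconds is lost with time.struct_time. If it matters, a possible variant is datetime.strptime with the same format string, which keeps the fraction in the microsecond attribute. In this sketch the names FORMAT and parse_event_time are hypothetical, and the trailing Z is matched as a literal character, so the resulting datetime is naive (it carries no time zone).

```python
from datetime import datetime

FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"

def parse_event_time(text):
    """Parse a USGS-style timestamp such as '2016-03-31T07:53:28.830Z'."""
    return datetime.strptime(text, FORMAT)

stamp = parse_event_time("2016-03-31T07:53:28.830Z")
# Combine whole seconds and microseconds into decimal seconds.
seconds = stamp.second + stamp.microsecond / 1000000
print(stamp.year, stamp.month, stamp.day, stamp.hour, stamp.minute, seconds)
```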
Another solution with the iso8601 add-on (pip install iso8601)
>>> import iso8601
>>> dt = iso8601.parse_date('2016-03-31T07:17:19.300Z')
>>> dt.year
2016
>>> dt.month
3
>>> dt.day
31
>>> dt.hour
7
>>> dt.minute
17
>>> dt.second
19
>>> dt.microsecond
300000
>>> dt.tzname()
'UTC'
Edited 2017/8/6 12h55
IMHO, it is a bad idea to split the datetime timestamp objects into components (year, month, ...) in individual lists. Keeping the datetime timestamp objects as provided by iso8601.parse_date(...) could help to compute time deltas between events, check the chronological order, ... See the doc of the datetime module for more https://docs.python.org/3/library/datetime.html
Having distinct lists for year, month, (...) would make such operations difficult. Anyway, if you prefer this solution, here are the changes
import csv
import iso8601
# Start as former solution
with open(archi,'rb') as f:  # Python 2 style; on Python 3 use open(archi, newline='')
    next(f)
    # tiempo=[]
    dt_years = []
    dt_months = []
    dt_days = []
    dt_hours = []
    dt_minutes = []
    dt_timezones = []
    lat=[]
    lon=[]
    prof=[]
    mag=[]
    t_mag=[]
    leo=csv.reader(f,delimiter=',')
    for line in leo:
        # tiempo.append(line[0])
        dt = iso8601.parse_date(line[0])
        dt_years.append(dt.year)
        dt_months.append(dt.month)
        dt_days.append(dt.day)
        dt_hours.append(dt.hour)
        # decimal minutes: seconds / 60 plus microseconds / 60,000,000
        dec_minutes = dt.minute + (dt.second / 60.0) + (dt.microsecond / 60000000.0)
        dt_minutes.append(dec_minutes)
        dt_timezones.append(dt.tzname())
        lat.append(line[1])
        lon.append(line[2])
        prof.append(line[3])
        mag.append(line[4])
        t_mag.append(line[5])
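
To illustrate why keeping whole datetime objects can be more convenient than separate component lists, here is a rough sketch that sorts events chronologically and computes the gap between them. It uses the standard library's datetime.strptime instead of iso8601 so that it runs without extra packages; the names FORMAT, raw, events and gaps are illustrative, and the two timestamps are the sample rows from the question.

```python
from datetime import datetime

FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"
raw = ["2016-03-31T07:53:28.830Z", "2016-03-31T07:17:19.300Z"]

# datetime objects compare chronologically, so sorting needs no custom key.
events = sorted(datetime.strptime(text, FORMAT) for text in raw)

# Subtracting two datetimes gives a timedelta; total_seconds() turns it into a float.
gaps = [(later - earlier).total_seconds()
        for earlier, later in zip(events, events[1:])]
print(events[0].isoformat())
print(gaps)
```

With separate year, month, day and hour lists, both the sort and the subtraction would need hand-written carry logic.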

python parse java calendar to isodate

I have data like this.
startDateTime: {'timeZoneID': 'America/New_York', 'date': {'year': '2014', 'day': '29', 'month': '1'}, 'second': '0', 'hour': '12', 'minute': '0'}
This is just the representation of one attribute. I have five other attributes like this: LastModified, created, etc.
I want to convert this to the ISO date format yyyy-mm-dd hh:mi:ss. Is this the right way to do it?
def parse_date(datecol):
    x=datecol;
    y=str(x.get('date').get('year'))+'-'+str(x.get('date').get('month')).zfill(2)+'-'+str(x.get('date').get('day')).zfill(2)+' '+str(x.get('hour')).zfill(2)+':'+str(x.get('minute')).zfill(2)+':'+str(x.get('second')).zfill(2)
    print y;
    return;
That works, but I'd say it's cleaner to use the string formatting operator here. The values are strings, so convert them with int before formatting them with %d:
def parse_date(c):
    d = c["date"]
    print("%04d-%02d-%02d %02d:%02d:%02d" % tuple(map(int, (d["year"], d["month"], d["day"], c["hour"], c["minute"], c["second"]))))
Alternatively, you can use the time module to convert your fields into a Python time value, and then format that using strftime. Remember the time zone, though.
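
A similar approach can be sketched with the datetime module, which validates the fields as a side effect of constructing the object. The function name to_datetime is hypothetical, and this sketch deliberately ignores the timeZoneID field, so the result is a naive datetime in whatever zone the data was recorded in.

```python
from datetime import datetime

def to_datetime(cal):
    """Build a naive datetime from the nested calendar dictionary (time zone ignored)."""
    d = cal["date"]
    return datetime(int(d["year"]), int(d["month"]), int(d["day"]),
                    int(cal["hour"]), int(cal["minute"]), int(cal["second"]))

start = {'timeZoneID': 'America/New_York',
         'date': {'year': '2014', 'day': '29', 'month': '1'},
         'second': '0', 'hour': '12', 'minute': '0'}

formatted = to_datetime(start).strftime("%Y-%m-%d %H:%M:%S")
print(formatted)  # 2014-01-29 12:00:00
```

An out-of-range field (for example day 32) raises ValueError here instead of silently producing a malformed string.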
