I have a pandas dataframe in which each cell of a column contains a timestamp, saved as a string:
>>> dataset['DateTime'][1]
'2018-03-14 00:34:46'
I would like to create a new column in which those dates are manipulated in the following way:
year += 1,
month += 2,
day += 3,
hour += 4,
minute += 5,
second += 6
(It is important that the original date and the new date have a one-to-one relation, so that I can transform the date back later.)
In my case, the output I am looking for is as follows:
>>> dataset['newTimestamp'][1]
'2019-05-17 04:39:52'
To do so I am using the datetime library to create datetime objects (as a test, I have started with one variable first):
timestamp = dataset['DateTime'][1]
p = datetime.datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
Currently I am doing arithmetic on the individual fields:
year = p.year + 1
if p.month < 12:
    month = p.month + 1
else:
    month = 1
    year += 1
However, as with the months, there are cases where simply adding a value no longer yields a valid timestamp (12 + 1 = 13, which is not an actual month).
I could program every rule explicitly, but that seems too much work and I expect there are better ways. How could I do this faster?
Use DateOffset:
dataset['newTimestamp'] = pd.to_datetime(dataset['DateTime']) + pd.DateOffset(years=1, months=2, days=3, hours=4, minutes=5, seconds=6)
Also, have a look at the relativedelta module (from dateutil) for this kind of manipulation.
You should try out the beautiful-date library:
pip install beautiful-date
And use it like so:
from beautiful_date import *
...
dataset['DateTime'].apply(lambda d: d + 1 * years + 2 * months + ... + 6 * seconds)
should do the trick.
strptime() and strftime() are the functions you are looking for.
Look up the two functions and you should be able to solve the stated problem.
They convert between strings and datetime objects, which you can then manipulate directly.
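As a rough illustration of the strptime()/strftime() route, the sketch below parses the string, carries months over into years with divmod, adds the fixed-length parts with timedelta, and formats the result back to a string. The helper name shift_timestamp and the day-clamping rule are assumptions made for this example, not part of any library. Note that clamping (for example, the 31st becoming the 28th) means the mapping is not strictly one-to-one near month ends.

```python
import calendar
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"

def shift_timestamp(text, years=1, months=2, days=3, hours=4, minutes=5, seconds=6):
    """Hypothetical helper: add calendar years/months, then fixed-length parts."""
    p = datetime.strptime(text, FMT)
    # Use a zero-based month index so the carry into the year is a single divmod.
    total = p.year * 12 + (p.month - 1) + years * 12 + months
    year, month0 = divmod(total, 12)
    month = month0 + 1
    # Clamp the day if the target month is shorter (assumption of this sketch).
    day = min(p.day, calendar.monthrange(year, month)[1])
    shifted = p.replace(year=year, month=month, day=day)
    shifted += timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)
    return shifted.strftime(FMT)

print(shift_timestamp("2018-03-14 00:34:46"))  # 2019-05-17 04:39:52
```

On a dataframe column this could be applied with dataset['DateTime'].apply(shift_timestamp).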
I am creating an app where I have two values: a committee starting date, e.g. 2022,01,02, and the number of months it will continue (here, 4 months). I save some data in my database month by month, and these dates are saved too. I get the right result with the code below as long as the number of months is less than or equal to 12.
number of members = 12
starting date = 2022,01,01
for i in range(1,17):
    print('date', (2022,i,10))
The issue comes when the number of months is greater than 12: the output then continues with month 13 (2022,13,10), which is wrong because I also want the year to increment to 2023. This approach also feels inefficient. Can anyone tell me whether there is a better way to do this?
While you can use the datetime library to handle dates, it doesn't provide any methods to increase dates month by month.
Now, previous suggestions/answers suggest you increase the year when month == 12, but that will cause December to be skipped. Also, your code doesn't consider any given month in the starting date. So a better solution would be:
>>> year = 2022
>>> month = 7
>>> day = 23
>>>
>>> for i in range(1, 8):
...     month += 1
...     if month == 13:
...         month = 1
...         year += 1
...     print(f'{year}-{month}-{day}')
...
2022-8-23
2022-9-23
2022-10-23
2022-11-23
2022-12-23
2023-1-23
2023-2-23
You could do something like this:
date = [2022, 1, 10]
for i in range(1, 17):
    print('date', tuple(date))
    date[1] += 1
    if date[1] > 12:  # past December: roll over into the next year
        date[0] += 1
        date[1] = 1
From the tone of your question I think you are a beginner, so I won't recommend the datetime module yet, and I appreciate that you tried to do it on your own.
Still, you can simply use an if statement and keep separate variables for the year, month and day:
yr = 2022
mn = 1
dt = 1
for i in range(1, 17):
    print('date', (yr, mn, dt))
    mn += 1
    if mn > 12:  # past December: roll over into the next year
        yr += 1
        mn = 1
I also want to share the modern approach using the datetime module, but it requires an extra package.
In your cmd enter the command
pip install python-dateutil
Once installed, close cmd and refresh your IDE.
This is the code you may want to use:
from datetime import datetime
from dateutil.relativedelta import relativedelta
date_time = datetime(2022, 1, 1)  # create a datetime object
for i in range(1, 17):
    date = date_time.date()
    print(date)
    date_time = date_time + relativedelta(months=1)
I have two Python datetime objects and I want to count the days between those dates, counting ONLY the days belonging to the month I choose. The range might overlap multiple months/years.
Example:
If I have 2017-10-29 & 2017-11-04 and I chose to count the days in October, I get 3 (29, 30 & 31 Oct.).
I can't find a straightforward way to do this so I think I'm going to iterate over the days using datetime.timedelta(days=1), and increment a count each time the day belongs to the month I chose.
Do you know a more performant method?
I'm using Python 2.7.10 with the Django framework.
Iterating over the days would be the most straightforward way to do it. Otherwise, you would need to know how many days are in a given month and you would need different code for different scenarios:
1. The given month is the month of the first date
2. The given month is the month of the second date
3. The given month is between the first and the second date (if dates span more than two months)
If you want to support dates spanning more than one year then you would need the input to include month and year.
Your example fits scenario #1, which I guess you could do like this:
>>> from datetime import datetime, timedelta
>>>
>>> first_date = datetime(2017, 10, 29)
>>>
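>>> # Note: month + 1 raises ValueError for December; roll over to January of the next year in that case.
>>> first_day_of_next_month = first_date.replace(month=first_date.month + 1, day=1)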
>>> last_day_of_this_month = first_day_of_next_month - timedelta(1)
>>> number_of_days_in_this_month = last_day_of_this_month.day
>>> number_of_days_in_this_month - first_date.day + 1
3
This is why I would suggest implementing it the way you originally intended and only turning to this if there's a performance concern.
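For reference, the iterating approach described in the question might look like the following minimal sketch. The function name count_days_in_month is hypothetical, and the sketch assumes date objects with both endpoints counted; it also takes the year as input so that ranges spanning several years are handled.

```python
from datetime import date, timedelta

def count_days_in_month(start, end, year, month):
    """Count the days in [start, end] (inclusive) that fall in year/month."""
    count = 0
    current = start
    while current <= end:
        if current.year == year and current.month == month:
            count += 1
        current += timedelta(days=1)  # step one day at a time
    return count

print(count_days_in_month(date(2017, 10, 29), date(2017, 11, 4), 2017, 10))  # 3
```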
You can get the difference between two datetime objects by simply subtracting them.
So, we start by getting the number of days between the two dates:
diff = (end_date - start_date).days
And then we generate all the dates between the two using
gen = (start_date + datetime.timedelta(days=e) for e in range(diff + 1))
And since we only want the dates that fall in the chosen month (October here), we apply a filter.
filter(lambda x: x.month == 10, gen)
Then we count what is left.
And the final code is this:
import datetime
diff = (end_date - start_date).days
gen = (start_date + datetime.timedelta(days=e) for e in range(diff + 1))
filtered_dates = filter(
    lambda x: x.month == 10,
    gen
)
count = sum(1 for e in filtered_dates)
You can also use reduce but sum() is a lot more readable.
One way to achieve this is to first check whether the start or end date falls in the month you want to count.
For example:
from datetime import datetime
start = datetime(2017, 10, 29)
end = datetime(2017, 11, 4)
We create a function to compare the dates like so:
def daysofmonth(start, end, monthsel):
    # Assumes start and end lie in different months of the same year and
    # monthsel is not December; the end day itself is not counted.
    if start.month == monthsel:
        days = (datetime(start.year, monthsel + 1, 1) - start).days
    elif end.month == monthsel:
        days = (end - datetime(end.year, monthsel, 1)).days
    elif not (start.month < monthsel < end.month):
        return 0
    else:
        days = (datetime(start.year, monthsel + 1, 1) - datetime(start.year, monthsel, 1)).days
    return days
So, in our example, setting monthsel to 10 gives:
>>> daysofmonth(start, end, 10)
3
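A variant that avoids the branch-per-scenario logic is to intersect the date range with the chosen month and measure the overlap. This is only a sketch under stated assumptions: the name days_in_month_overlap is made up for illustration, the inputs are date objects, both endpoints are counted, and the year is passed explicitly so that December and multi-year ranges work.

```python
import calendar
from datetime import date

def days_in_month_overlap(start, end, year, month):
    """Days of [start, end] (both inclusive) that fall in the given year/month."""
    month_first = date(year, month, 1)
    month_last = date(year, month, calendar.monthrange(year, month)[1])
    lo = max(start, month_first)   # later of the two starting points
    hi = min(end, month_last)      # earlier of the two end points
    return max(0, (hi - lo).days + 1)  # no overlap gives 0

print(days_in_month_overlap(date(2017, 10, 29), date(2017, 11, 4), 2017, 10))  # 3
```

No loop over individual days is needed, so the cost does not grow with the length of the range.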
Using pandas with your dates:
import pandas as pd
from datetime import datetime
first_date = datetime(2017, 10, 29)
second_date = datetime(2017, 11, 4)
days_count = (second_date - first_date).days
month_date = first_date.strftime("%Y-%m")
values = pd.date_range(start=first_date,periods=days_count,freq='D').to_period('M').value_counts()
print(values)
print(values[month_date])
outputs
2017-10 3
2017-11 3
3
I've been trying to insert into a MySQL table using Python. I'm trying to create a list of all dates from April 2016 to now so I can insert them individually in the SQL INSERT. I searched but couldn't find how to format each value depending on whether it has one digit or two:
dates = ['2016-04-'+str(i+1) for i in range(9,30)]
I would like a 0 to be prepended every time i is a single digit (i.e. 1, 2, 3, etc.)
and the value left unchanged when it has two digits (i.e. 10, 11, 12, etc.).
dates = ['2016-04-' + '{:02d}'.format(i) for i in range(9, 30)]
>>> print(dates)
['2016-04-09', '2016-04-10', '2016-04-11', '2016-04-12', '2016-04-13', '2016-04-14', '2016-04-15',
 '2016-04-16', '2016-04-17', '2016-04-18', '2016-04-19', '2016-04-20', '2016-04-21', '2016-04-22',
 '2016-04-23', '2016-04-24', '2016-04-25', '2016-04-26', '2016-04-27', '2016-04-28', '2016-04-29']
Using C style formatting, all the dates in April:
dates = ['2016-04-%02d'%i for i in range(1,31)]
Need range(1,31) since the last value in the range is not used, or use range(30) and add 1 to i.
The same using .format():
dates = ['2016-04-{:02}'.format(i) for i in range(1,31)]
You can use dateutil module
from datetime import datetime
from dateutil.rrule import rrule, DAILY
start_date = datetime(2016, 4, 1)  # no leading zeros: 04 is a syntax error in Python 3
w = [each.strftime('%Y-%m-%d') for each in rrule(freq=DAILY, dtstart=start_date, until=datetime(2016, 5, 9))]
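If installing dateutil is not an option, the same list can be sketched with the standard library alone. The helper name iso_dates is an assumption for this example; it relies on date.isoformat(), which always zero-pads the month and day.

```python
from datetime import date, timedelta

def iso_dates(start, end):
    """Return 'YYYY-MM-DD' strings for every day from start to end, inclusive."""
    n_days = (end - start).days
    return [(start + timedelta(days=i)).isoformat() for i in range(n_days + 1)]

# From 1 April 2016 up to today, as the question asks.
dates = iso_dates(date(2016, 4, 1), date.today())
print(dates[:3])  # ['2016-04-01', '2016-04-02', '2016-04-03']
```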
I want to get all months between now and August 2010, as a list formatted like this:
['2010-08-01', '2010-09-01', .... , '2016-02-01']
Right now this is what I have:
months = []
for y in range(2010, 2016):
    for m in range(1, 13):
        if (y == 2010) and m < 8:
            continue
        if (y == 2016) and m > 2:
            continue
        month = '%s-%s-01' % (y, ('0%s' % (m)) if m < 10 else m)
        months.append(month)
What would be a better way to do this?
dateutil.relativedelta is handy here.
I've left the formatting out as an exercise.
from dateutil.relativedelta import relativedelta
import datetime
result = []
today = datetime.date.today()
current = datetime.date(2010, 8, 1)
while current <= today:
    result.append(current)
    current += relativedelta(months=1)
I had a look at the dateutil documentation. It turns out it provides an even more convenient way than using dateutil.relativedelta: recurrence rules.
For the task at hand, it's as easy as
from dateutil.rrule import *
from datetime import date
months = list(map(  # list() is needed in Python 3, where map returns an iterator
    date.isoformat,
    rrule(MONTHLY, dtstart=date(2010, 8, 1), until=date.today())
))
The fine print
Note that we're cheating a little bit here. The elements dateutil.rrule.rrule produces are of type datetime.datetime, even if we pass dtstart and until of type datetime.date, as we do above. I let map feed them to date's isoformat function, which happens to convert them to strings as if they were plain dates without any time-of-day information.
Therefore, the seemingly equivalent list comprehension
[day.isoformat()
for day in rrule(MONTHLY, dtstart=date(2010, 8, 1), until=date.today())]
would return a list like
['2010-08-01T00:00:00',
'2010-09-01T00:00:00',
'2010-10-01T00:00:00',
'2010-11-01T00:00:00',
⋮
'2015-12-01T00:00:00',
'2016-01-01T00:00:00',
'2016-02-01T00:00:00']
Thus, if we want to use a list comprehension instead of map, we have to do something like
[dt.date().isoformat()
for dt in rrule(MONTHLY, dtstart=date(2010, 8, 1), until=date.today())]
Use the datetime and timedelta classes from the standard library, without installing any new libraries:
from datetime import datetime, timedelta
now = datetime(datetime.now().year, datetime.now().month, 1)
ctr = datetime(2010, 8, 1)
months = [ctr.strftime('%Y-%m-%d')]  # avoid naming a variable "list"
while ctr < now:
    ctr += timedelta(days=32)  # always lands somewhere in the next month
    ctr = datetime(ctr.year, ctr.month, 1)  # snap back to the 1st so the drift does not accumulate
    months.append(ctr.strftime('%Y-%m-%d'))
I'm adding 32 days to reach the next month every time (the longest months have 31 days), then snapping back to the first day of that month.
It seems like there's a very simple and clean way to do this by generating a list of dates and subsetting to take only the first day of each month, as shown in the example below.
import datetime
import pandas as pd
start_date = datetime.date(2010,8,1)
end_date = datetime.date(2016,2,1)
date_range = pd.date_range(start_date, end_date)
date_range = date_range[date_range.day==1]
print(date_range)
I got another way using datetime, timedelta and calendar:
from calendar import monthrange
from datetime import datetime, timedelta
def monthdelta(d1, d2):
    # Count how many whole months fit between d1 and d2.
    delta = 0
    while True:
        mdays = monthrange(d1.year, d1.month)[1]
        d1 += timedelta(days=mdays)
        if d1 <= d2:
            delta += 1
        else:
            break
    return delta
start_date = datetime(2016, 1, 1)
end_date = datetime(2016, 12, 1)
# Note: the year is always start_date.year, so this only works within a single year.
num_months = [i-12 if i>12 else i for i in range(start_date.month, monthdelta(start_date, end_date)+start_date.month+1)]
monthly_daterange = [datetime(start_date.year,i, start_date.day, start_date.hour) for i in num_months]
You could merge the two if statements into one, since a second if with the same body as the first is redundant. That takes two lines instead of four:
if (y == 2010 and m < 8) or (y == 2016 and m > 2):
    continue
I don't know whether it's better, but an approach like the following might be considered more 'pythonic':
months = [
    '{}-{:0>2}-01'.format(year, month)
    for year in xrange(2010, 2016 + 1)
    for month in xrange(1, 12 + 1)
    if not (year <= 2010 and month < 8 or year >= 2016 and month > 2)
]
The main differences here are:
As we want the iteration(s) to produce a list, use a list comprehension instead of aggregating list elements in a for loop.
Instead of explicitly making a distinction between numbers below 10 and numbers 10 and above, use the capabilities of the format specification mini-language for the .format() method of str to specify
a field width (the 2 in the {:0>2} place holder)
right-alignment within the field (the > in the {:0>2} place holder)
zero-padding (the 0 in the {:0>2} place holder)
xrange instead of range returns a lazy sequence instead of a list, so that the iteration values can be produced as they're being consumed and don't have to be held in memory. (Doesn't matter for ranges this small, but it's a good idea to get used to this in Python 2.) Note: In Python 3, there is no xrange, and range itself is already lazy rather than returning a list.
Make the + 1 for the upper bounds explicit. This makes it easier for human readers of the code to recognize that we want to specify an inclusive bound to a method (range or xrange) that treats the upper bound as exclusive. Otherwise, they might wonder what's the deal with the number 13.
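For readers on Python 3, the same idea could be sketched with range and an f-string. The tuple comparison in the filter is a substitution made for this example, not part of the answer above; it relies on Python comparing (year, month) pairs element by element.

```python
months = [
    f"{year}-{month:02d}-01"          # :02d zero-pads the month to two digits
    for year in range(2010, 2016 + 1)
    for month in range(1, 12 + 1)
    if (2010, 8) <= (year, month) <= (2016, 2)  # keep Aug 2010 through Feb 2016
]

print(months[0], months[-1])  # 2010-08-01 2016-02-01
```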
A different approach that doesn't require any additional libraries, nor nested or while loops. Simply convert your dates into an absolute number of months from some reference point (it can be any date really, but for simplicity we can use 1st January 0001). For example
a=datetime.date(2010,2,5)
abs_months = a.year * 12 + a.month
Once you have a number representing the month you are in you can simply use range to loop over the months, and then convert back:
Solution to the generalized problem:
import datetime
def range_of_months(start_date, end_date):
    months = []
    for i in range(start_date.year * 12 + start_date.month, end_date.year * 12 + end_date.month + 1):
        months.append(datetime.date((i - 13) // 12 + 1, (i - 1) % 12 + 1, 1))
    return months
Additional Notes/explanation:
Here // divides rounding down to the nearest whole number, and % 12 gives the remainder when divided by 12, e.g. 13 % 12 is 1.
(Note also that in the above date.year *12 + date.month does not give the number of months since the 1st of January 0001. For example if date = datetime.datetime(1,1,1), then date.year * 12 + date.month gives 13. If I wanted to do the actual number of months I would need to subtract 1 from the year and month, but that would just make the calculations more complicated. All that matters is that we have a consistent way to convert to and from some integer representation of what month we are in.)
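The conversion in both directions can be checked with a small round trip. The helper names month_index and first_of_month are hypothetical and exist only for this sketch; the formulas are the ones used in the function above.

```python
from datetime import date

def month_index(d):
    """Map a date to an integer that increases by 1 per calendar month."""
    return d.year * 12 + d.month

def first_of_month(index):
    """Inverse of month_index: return the first day of the encoded month."""
    return date((index - 13) // 12 + 1, (index - 1) % 12 + 1, 1)

print(month_index(date(1, 1, 1)))                   # 13, as noted above
print(first_of_month(month_index(date(2010, 12, 25))))  # 2010-12-01
```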
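A fresh pythonic near one-liner from me:
from dateutil.relativedelta import relativedelta
import datetime
start_date = datetime.date(2010, 8, 1)
end_date = datetime.date(2016, 2, 1)
delta = relativedelta(end_date, start_date)  # end minus start; .months alone ignores whole years
[(start_date + relativedelta(months=+m)).isoformat()
 for m in range(delta.years * 12 + delta.months + 1)]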
In case you don't have any months duplicates and they are in correct order you can get what you want with this.
from datetime import date, timedelta
first = date.today()
last = first + timedelta(weeks=20)
date_format = "%Y-%m"
results = []
while last >= first:
    results.append(last.strftime(date_format))
    last -= timedelta(days=last.day)  # jump to the last day of the previous month
Similar to #Mattaf, but simpler...
pandas.date_range() has a frequency option freq='M' (month end)...
Here I am adding a day (pd.Timedelta('1d')) in order to reach the beginning of each new month:
import pandas as pd
date_range = pd.date_range('2010-07-01','2016-02-01',freq='M')+pd.Timedelta('1d')
print(list(date_range))
I have two yyyymm values that will be input by a user:
yyyymm_1 = '201406'
yyyymm_2 = '201501'
I want to be able to iterate through this range in increasing month order:
for yyyy and mm in the range of yyyymm_1 to yyyymm_2
my_function( yyyy, mm )
How can this be done in python?
Update:
Ideally, the solution should be as simple as possible without requiring external libraries. I'm not looking for a generic date manipulation solution, but a solution to answer the specific question I have asked above.
I had seen lots of generic solutions before posting my question. However, being a python noob, couldn't see how to adapt them to my question:
Generate a list of datetimes between an interval
Iterating through a range of dates in Python
On that note, the other questions linked to from this page are much more generic. If you are looking to generate a range of yyyymm values, I urge you to look at the selected answer on this page.
Here's another rather simple variant, without even using datetime. Just split the date, calculate the 'total month', and iterate.
def to_month(yyyymm):
    y, m = int(yyyymm[:4]), int(yyyymm[4:])
    return y * 12 + m
def iter_months(start, end):
    for month in range(to_month(start), to_month(end) + 1):
        y, m = divmod(month-1, 12)  # ugly fix to compensate
        yield y, m + 1              # for 12 % 12 == 0
for y, m in iter_months('201406', '201501'):
    print(y, m)
Output:
2014 6
2014 7
...
2014 12
2015 1
For output in the same yyyymm format, use print("%d%02d" % (y, m)).
You can do this using the builtin datetime module and the third party package dateutil.
The code first converts your strings to datetime.datetime objects using datetime.datetime.strptime. It then uses the relativedelta function from dateutil to create a period of one month that can be added to your datetimes.
Within the while loop you can either work with the datetime objects directly, or construct the month and year as strings using strftime, I've shown an example of both in print functions.
import datetime as dt
from dateutil.relativedelta import relativedelta
yyyymm_1 = '201406'
yyyymm_2 = '201501'
MONTH = relativedelta(months=+1)
fmt = '%Y%m'
date_1 = dt.datetime.strptime(yyyymm_1, fmt).date()
date_2 = dt.datetime.strptime(yyyymm_2, fmt).date()
d = date_1
while d <= date_2:
    print(d)
    print(d.strftime('%Y'), d.strftime('%m'))
    d += MONTH