Converting Julian to calendar date using pandas - python

I am trying to convert Julian codes to calendar dates in pandas using:
pd.to_datetime(43390, unit = 'D', origin = 'Julian')
This is giving me ValueError: origin Julian cannot be converted to a Timestamp

The origin string is case-sensitive, so you need to set origin = 'julian':
pd.to_datetime(43390, unit = 'D', origin = 'julian')
However, this number (43390) still raises
OutOfBoundsDatetime: 43390 is Out of Bounds for origin='julian'
because the valid Julian day range is 2333836 to 2547339
(Timestamp('1677-09-21 12:00:00') to Timestamp('2262-04-11 12:00:00')).
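For comparison, a true Julian Day Number inside that range does convert. This is a small sketch; the value 2458409 is an illustrative choice of mine, not a value from the question, and it corresponds to noon on 17 October 2018:

```python
import pandas as pd

# 2458409 is an illustrative Julian Day Number (an assumption, not from the question).
# Julian days begin at noon, so an integer day number maps to 12:00.
ts = pd.to_datetime(2458409, unit="D", origin="julian")
print(ts)
```

A value such as 43390 is far too small to be a Julian Day Number, which suggests it comes from a different day-count system.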

Method 1 - using 'julian' as the origin did not work for this value.
Method 2 - using the Excel start date as the origin, so that every value is counted in days from Excel's default start date.
This is what finally worked for me:
pd.to_datetime(43390, unit = 'D', origin=pd.Timestamp("1899-12-30"))
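The same idea extends to a whole column. The sketch below assumes the numbers are Excel serial day numbers; the Series contents other than 43390 are made up for illustration:

```python
import pandas as pd

# Hypothetical column of Excel serial day numbers (only 43390 is from the question).
serials = pd.Series([43390, 43391, 43101])

# 1899-12-30 acts as day 0 for Excel serials (valid for dates after February 1900).
dates = pd.to_datetime(serials, unit="D", origin="1899-12-30")
print(dates.dt.strftime("%Y-%m-%d").tolist())
```

Passing the origin as an ISO-formatted string avoids any day-first versus month-first ambiguity when the origin is parsed.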

The code below works only for 6-digit Julian values. It handles both leap and non-leap years.
The Julian date is assumed to have the form "CYYDDD", where C is the century (0 for 19YY, 1 for 20YY), YY is the two-digit year, and DDD is the day of the year, which is then converted to a month and day.
import pandas as pd
from datetime import datetime, timedelta
jul_date = '120075'
add_days = int(jul_date[3:6])
cal_date = pd.to_datetime(datetime.strptime(str(19+int(jul_date[0:1]))+jul_date[1:3]+'-01-01','%Y-%m-%d'))-timedelta(1)+pd.DateOffset(days= add_days)
print(cal_date.strftime('%Y-%m-%d'))
output: 2020-03-15
without timedelta(1): 2020-03-16
Here datetime.strptime parses the string into a datetime.
%Y is the four-digit year (e.g. 1980).
%m and %d are the zero-padded month and day.
strftime('%Y-%m-%d') formats the result without the time component.
timedelta(1) subtracts one day because the base date is already January 1 of that year. Day 001 must map to January 1, so DDD days are added to December 31 of the previous year; otherwise the result is one day late.
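The same conversion can be sketched with only the standard library. The helper name cyyddd_to_date is hypothetical, and the century mapping (C=0 for 19YY, C=1 for 20YY) is assumed from the snippet above:

```python
from datetime import date, timedelta


def cyyddd_to_date(code: str) -> date:
    """Convert a 6-digit CYYDDD string to a calendar date.

    Assumes C=0 means 19YY and C=1 means 20YY (taken from the snippet above).
    """
    if len(code) != 6 or not code.isdigit():
        raise ValueError(f"expected 6 digits, got {code!r}")
    year = 1900 + 100 * int(code[0]) + int(code[1:3])
    day_of_year = int(code[3:6])
    # 365 or 366, depending on whether the year is a leap year.
    days_in_year = (date(year + 1, 1, 1) - date(year, 1, 1)).days
    if not 1 <= day_of_year <= days_in_year:
        raise ValueError(f"day {day_of_year} is not valid for {year}")
    # Day 001 is January 1, so add (DDD - 1) days to January 1.
    return date(year, 1, 1) + timedelta(days=day_of_year - 1)


print(cyyddd_to_date("120075"))
```

Adding DDD - 1 days to January 1 is equivalent to the timedelta(1) adjustment described above, and the range check rejects day 366 in non-leap years.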

Related

How can I get the year, month, and day from a Deephaven DateTime in Python?

I have a Deephaven DateTime in the New York (US-East) timezone and I'd like to get the year, month, and day (of the month) numbers from it as integers in Python.
Deephaven's time module has these utilities. You may have used it to create a Deephaven DateTime in the first place.
from deephaven import time as dhtu
timestamp = dhtu.to_datetime("2022-04-01T12:00:00 NY")
The following three methods will give you what you're looking for:
year - Gets the year
month_of_year - Gets the month
day_of_month - Gets the day of the month
All three methods will give you what you want based on the DateTime itself and your preferred time zone.
tz_ny = dhtu.TimeZone.NY
year = dhtu.year(timestamp, tz_ny)
month = dhtu.month_of_year(timestamp, tz_ny)
day = dhtu.day_of_month(timestamp, tz_ny)

Work out if date is in current week python

I'm trying to write a bit of code to check if a document has been updated this week, and if not to read in the data and update it. I need to be able to check if the last modified date/time of the document occurred in this week or not (Monday-Sunday).
I know this code gives me the last modified time of the file as a float of seconds since the epoch:
os.path.getmtime('path')
And I know I can use time.ctime to get that as a string date:
time.ctime(os.path.getmtime('path'))
But I'm not sure how to check whether that date falls in the current week. I also don't know if it's easier to convert to a datetime object rather than use ctime for this.
You can use datetime.isocalendar and compare the week attribute (named attributes need Python 3.9 or later; the index form works on older versions):
import os
from datetime import datetime
t_file = datetime.fromtimestamp(os.path.getmtime(filepath))
t_now = datetime.now()
print(t_file.isocalendar().week == t_now.isocalendar().week)
# or print(t_file.isocalendar()[1]== t_now.isocalendar()[1])
# to compare the year as well, use e.g.
print(t_file.isocalendar()[:2] == t_now.isocalendar()[:2])
The ISO year consists of 52 or 53 full weeks, and where a week starts on a Monday and ends on a Sunday. The first week of an ISO year is the first (Gregorian) calendar week of a year containing a Thursday. This is called week number 1, and the ISO year of that Thursday is the same as its Gregorian year.
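Putting this together, a minimal sketch of a reusable check might look like the following. The function names same_iso_week and modified_this_week are hypothetical; the comparison uses both ISO year and ISO week so that equal week numbers in different years do not match:

```python
import os
from datetime import datetime


def same_iso_week(a: datetime, b: datetime) -> bool:
    """True if both datetimes fall in the same ISO year and ISO week (Monday-Sunday)."""
    return a.isocalendar()[:2] == b.isocalendar()[:2]


def modified_this_week(path, now=None) -> bool:
    """True if the file at `path` was last modified in the current ISO week."""
    now = now or datetime.now()
    modified = datetime.fromtimestamp(os.path.getmtime(path))
    return same_iso_week(modified, now)


# Monday 2024-12-30 and Wednesday 2025-01-01 share ISO week 1 of 2025.
print(same_iso_week(datetime(2024, 12, 30), datetime(2025, 1, 1)))
```

Both timestamps are naive local times here, which matches what os.path.getmtime and datetime.now produce by default.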

Conversion of set of numbers into Date Format using Python

I have a dataframe named 'train' with a column ID that encodes a date and hour in a very unusual manner.
For example, the ID value 2013043002 represents 30/04/2013 02:00:00.
The first 4 digits are the year, the next 2 the month, the next 2 the day, and the last 2 the hour.
I want to convert this into a proper datetime format to perform time series analysis.
Use to_datetime with parameter format - check http://strftime.org/:
df = pd.DataFrame({'ID':[2013043002,2013043002]})
df['ID'] = pd.to_datetime(df['ID'], format='%Y%m%d%H')
print(df)
ID
0 2013-04-30 02:00:00
1 2013-04-30 02:00:00
print(df['ID'].dtype)
datetime64[ns]
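If some IDs may be malformed, a slightly more defensive variant is sketched below. Casting to string first and using errors='coerce' are my additions, and the sample values other than 2013043002 are made up:

```python
import pandas as pd

# Hypothetical sample: the last ID has month 13 and is deliberately invalid.
df = pd.DataFrame({"ID": [2013043002, 2013043023, 2013130102]})

# Cast to string so the format is applied to text, and turn bad values into NaT.
parsed = pd.to_datetime(df["ID"].astype(str), format="%Y%m%d%H", errors="coerce")
print(parsed)
```

Rows that fail to parse become NaT, so they can be inspected or dropped before the time series analysis.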
Use the datetime module for date-time manipulation (d is the ID, converted to a string):
datetime.strptime(str(d),"%Y%m%d%H").strftime("%d/%m/%Y %H:%M:%S")
First, if the ID ALWAYS has the same layout, you can simply slice it as a string:
Id = str(2013043002)
Year = Id[0:4]
Month = Id[4:6]
Day = Id[6:8]
Time = Id[8:10]
DateFormat = "{}-{}-{}".format(Day, Month, Year)
TimeFormat = "{}:00:00".format(Time)
print(DateFormat, TimeFormat)
Output:
30-04-2013 02:00:00
You could then wrap this in a function and apply it to every ID in a loop.
Of course, if you don't know the incoming ID format in advance, you should use the other datetime parsing options and format the result in whatever order you want.
Using the datetime module, you can do that easily with the strptime function (the ID must be passed as a string):
my_date = datetime.datetime.strptime(str(ID), "%Y%m%d%H")
"%Y%m%d%H" is the format of your date: %Y is the year, %m is the month (zero-padded), %d is the day (zero-padded) and %H is the hour (24-hour clock, zero-padded). See http://strftime.org/ for more.

In Python/Pandas how do I convert century-months to DateTimeIndex?

I am working with a dataset that encodes dates as the integer number of months since December 1899, so month 1 is January 1900 and month 1165 is January 1997. I would like to convert to a pandas DateTimeIndex. So far the best I've come up with is:
month0 = np.datetime64('1899-12-15')
one_month = np.timedelta64(30, 'D') + np.timedelta64(630, 'm')  # 30 days 10.5 hours; timedelta64 needs an integer count
birthdates = pandas.DatetimeIndex(month0 + one_month * resp.cmbirth)
The start date is the 15th of the month, and the timedelta is 30 days 10.5 hours, the average length of a calendar month. So the date within the month drifts by a day or two.
So this seems a little hacky and I wondered if there's a better way.
You can use built-in pandas date-time functionality.
import pandas as pd
import numpy as np
indexed_months = np.random.randint(0, 1166, size=100)  # random integers from 0 to 1165 inclusive
month0 = pd.to_datetime('1899-12-01')
date_list = [month0 + pd.DateOffset(months=mnt) for mnt in indexed_months]
birthdates = pd.DatetimeIndex(date_list)
I've made an assumption that your resp.cmbirth object looks like an array of integers between 0 and 1165.
I'm not quite clear on why you want the bin edges of the indices to be offset from the start or end of the month. This can be done:
shifted_birthdates = birthdates.shift(15, freq='D')
and similarly for hours if you want. There is also useful info in the answers to this SO question and the related pandas github issue.
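As an alternative to the list comprehension, the conversion can be vectorized with NumPy month arithmetic. This is a sketch under the assumption that the codes are plain integers; the array cm stands in for resp.cmbirth:

```python
import numpy as np
import pandas as pd

# Hypothetical century-month codes; 1 and 1165 are the examples from the question.
cm = np.array([1, 1165, 1200])

# Arithmetic on datetime64[M] is exact in whole months, so nothing drifts
# within the month the way a fixed 30-day-10.5-hour step does.
months = np.datetime64("1899-12", "M") + cm.astype("timedelta64[M]")
birthdates = pd.DatetimeIndex(months.astype("datetime64[ns]"))
print(birthdates)
```

Each result lands on the first day of its month, so month 1 becomes 1900-01-01 and month 1165 becomes 1997-01-01.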

Compute number of dates between two string dates and return an integer

I have a .txt file data-set like this with the date column of interest:
1181206,3560076,2,01/03/2010,46,45,M,F
2754630,2831844,1,03/03/2010,56,50,M,F
3701022,3536017,1,04/03/2010,40,38,M,F
3786132,3776706,2,22/03/2010,54,48,M,F
1430789,3723506,1,04/05/2010,55,43,F,M
2824581,3091019,2,23/06/2010,59,58,M,F
4797641,4766769,1,04/08/2010,53,49,M,F
I want to work out the number of days between each date and 01/03/2010 and replace the date with the days offset {0, 2, 3, 21...} yielding an output like this:
1181206,3560076,2,0,46,45,M,F
2754630,2831844,1,2,56,50,M,F
3701022,3536017,1,3,40,38,M,F
3786132,3776706,2,21,54,48,M,F
1430789,3723506,1,64,55,43,F,M
2824581,3091019,2,114,59,58,M,F
4797641,4766769,1,156,53,49,M,F
I've been trying for ages and it's getting really frustrating. I've tried converting to datetime using the datetime.datetime.strptime( '01/03/2010', "%d/%m/%Y").date() method and then subtracting the two dates, but that gives me an output such as '3 days, 0:00:00' and I can't seem to get only the number!
The difference between two dates is a timedelta. Every timedelta instance has a days attribute, which is the integer value you want.
This is fairly simple. Using the code you gave:
date1 = datetime.datetime.strptime('01/03/2010', '%d/%m/%Y').date()
date2 = datetime.datetime.strptime('04/03/2010', '%d/%m/%Y').date()
You get two date objects.
(date2-date1)
will give you the time delta. The mistake you're making is to convert that timedelta to a string. timedelta objects have a days attribute. Therefore, you can get the number of days using it:
(date2-date1).days
This generates the desired output.
Using your input (a bit verbose...)
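#!/usr/bin/env python3
import datetime

# Reference date: every offset is counted in days from 01/03/2010.
d_first = datetime.date(2010, 3, 1)
with open('input') as fd:
    for line in fd:
        # The date is the fourth comma-separated field, in DD/MM/YYYY form.
        date = line.split(',')[3]
        day, month, year = date.split('/')
        d = datetime.date(int(year), int(month), int(day))
        diff = d - d_first
        print(diff.days)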
Gives
0
2
3
21
64
114
156
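To produce the full rows requested in the question, with the date replaced in place, something like the following sketch could be used. The function name replace_dates_with_offsets and its parameters are hypothetical:

```python
from datetime import datetime


def replace_dates_with_offsets(lines, reference="01/03/2010", column=3, fmt="%d/%m/%Y"):
    """Replace the date field of each CSV line with its day offset from `reference`."""
    ref = datetime.strptime(reference, fmt).date()
    result = []
    for line in lines:
        fields = line.rstrip("\n").split(",")
        current = datetime.strptime(fields[column], fmt).date()
        # timedelta.days is the integer number of days between the two dates.
        fields[column] = str((current - ref).days)
        result.append(",".join(fields))
    return result


sample = [
    "1181206,3560076,2,01/03/2010,46,45,M,F",
    "3786132,3776706,2,22/03/2010,54,48,M,F",
]
for row in replace_dates_with_offsets(sample):
    print(row)
```

To process the real file, pass the open file object as lines and write each returned row to the output file.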
Have a look at pleac, a lot of date-example there using python.
