Break-up year, months & days in Pandas - python

I have an input parameter dictionary as below:
InparamDict = {'DataInputDate':'2014-10-25'
}
Using the field InparamDict['DataInputDate'], I want to pull data from 2013-10-01 through 2013-10-25, that is, from the first day of the same month one year earlier up to the same day one year earlier. What would be the best way to do this using Pandas?
The sql equivalent is -
DATEFROMPARTS(DATEPART(year,GETDATE())-1,DATEPART(month,GETDATE()),'01')

You didn't mention whether you're pulling the data from a DataFrame, a Series, or something else. If you just want the date parts, read the attributes you need from the Timestamp object.
from pandas import Timestamp
dt = Timestamp(InparamDict['DataInputDate'])
dt.year, dt.month, dt.day

If the dates are in a DataFrame (df) and you convert them to datetimes instead of strings, you can also select the data by range, for instance:
from datetime import datetime
df[df['DataInputDate'] > datetime(2013, 10, 1)]
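To tie this back to the question, here is a minimal sketch of deriving both ends of the window from the input date and filtering on them. The frame df and its value column are made up for illustration, and the sketch assumes the DataInputDate column already holds real datetimes rather than strings.

```python
import pandas as pd

InparamDict = {'DataInputDate': '2014-10-25'}

# End of the window: the same calendar day one year earlier.
end = pd.Timestamp(InparamDict['DataInputDate']) - pd.DateOffset(years=1)
# Start of the window: the first day of that month.
start = end.replace(day=1)

# Hypothetical frame with one row per day (not from the question).
df = pd.DataFrame({'DataInputDate': pd.date_range('2013-09-28', '2013-10-28', freq='D')})
df['value'] = range(len(df))

# between() is inclusive on both ends by default.
window = df[df['DataInputDate'].between(start, end)]
print(start.date(), end.date(), len(window))
```

This mirrors the DATEFROMPARTS expression in the question: subtract one year, then pin the day to 1.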

Related

How to change pandas' Datetime Index from "End of month" To just "Month"

I'm using pandas to analyze some data about the House Price Index of all states from quandl:
HPI_Data = quandl.get("FMAC/HPI_AK")
The data looks something like this:
HPI Alaska
Date
1975-01-31 35.105461
1975-02-28 35.465209
1975-03-31 35.843110
and so on.
I've got a second dataframe with some special dates in it:
Date
Name
David 1979-08
Allen 1980-08
Hugo 1989-09
The values for "Date" here are of "string" type and not "date".
I'd like to go 6 months back from each date in the special dataframe and see the values in the HPI dataframe.
I'd like to use .loc, but I have not been able to convert the first dataframe's index from "END OF MONTH" to "MONTH", even after resampling to "1D" and then back to "M".
I would appreciate any help, whether it solves the problem a different way or the janky data-deleting way I want :).
I'm not sure I understand correctly, so please clarify your question if this is not what you mean.
You can convert a string to a pandas datetime object using pd.to_datetime, and use the format parameter to specify how to parse the string.
import pandas as pd
# Creating a dummy Series
sr = pd.Series(['2012-10-21 09:30', '2019-7-18 12:30', '2008-02-2 10:30',
'2010-4-22 09:25', '2019-11-8 02:22'])
# Convert the underlying data to datetime
sr = pd.to_datetime(sr)
# Subtract 6 months from the datetime series
# (note the plural "months": month=6 would instead set the month to June)
sr - pd.DateOffset(months=6)
As for reducing the datetime to just the month, i.e. 2012-10-21 09:30 --> 2012-10, I would do this:
sr.dt.to_period('M')
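Putting the two pieces together for the original question, here is a rough sketch of looking up the six months before a special date with .loc. The HPI values below are invented placeholders, not real quandl data, and the sketch assumes the index is sorted and has one row per month.

```python
import pandas as pd

# Hypothetical stand-in for the quandl frame: month-end dates, made-up values.
hpi = pd.DataFrame(
    {'HPI Alaska': [float(i) for i in range(9)]},
    index=pd.to_datetime([
        '1979-01-31', '1979-02-28', '1979-03-31', '1979-04-30', '1979-05-31',
        '1979-06-30', '1979-07-31', '1979-08-31', '1979-09-30',
    ]),
)

# Turn the "end of month" timestamps into monthly periods (1979-02-28 -> 1979-02).
hpi.index = hpi.index.to_period('M')

# The special dates are strings such as '1979-08'.
special = pd.DataFrame({'Date': ['1979-08']}, index=pd.Index(['David'], name='Name'))

end = pd.Period(special.loc['David', 'Date'], freq='M')
start = end - 6                  # integer arithmetic on a Period steps by its frequency
window = hpi.loc[start:end]      # label slicing includes both endpoints
print(window)
```

Working in periods avoids having to know the last day of each month, which is what made matching against the month-end index awkward.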

How to extract a date from a SQL Server Table and store it in a variable in Pandas without noise, only the date

I am trying to extract a date from a SQL Server table. My query returns the data like this:
Hours = pd.read_sql_query("select * from tblAllHours",con)
Now I convert my "Start" Column in the Hours dataframe like this:
Hours['Start'] = pd.to_datetime(Hours['Start'], format='%Y-%m-%d')
then I select the row I want in the column like this:
StartDate1 = Hours.loc[Hours.Month == Sym1, 'Start'].values
Now, if I print my variable print(StartDate1) I get this result:
[datetime.date(2020, 10, 1)]
What I need is actually 2020-10-01
How can I get this result?
You currently have a column of datetimes. The format you're requesting is a string format.
Use pandas.Series.dt.strftime to convert the datetimes to strings.
In pd.to_datetime(Hours['Start'], format='%Y-%m-%d'), format tells the parser what format your dates are in so they can be converted to datetimes; it does not set the format in which the datetimes are displayed.
Review pandas.to_datetime
If you want only the values, not the Series, use .values at the end of the following command, as you did in the question.
start_date_str = Hours.Start.dt.strftime('%Y-%m-%d')
Try:
print(Hours['Start'].dt.strftime('%Y-%m-%d').values)
The result is an array of strings in YYYY-MM-DD format:
['2020-07-03' '2020-07-02']
This is a bit similar to: How to change the datetime format in pandas
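As a small end-to-end illustration of the answers above, the sketch below pulls one date out as a plain string. The Hours frame and the value of Sym1 are invented stand-ins for the SQL result, which is not shown in the question.

```python
import datetime
import pandas as pd

# Hypothetical stand-in for the SQL result; 'Start' holds datetime.date objects.
Hours = pd.DataFrame({
    'Month': ['Oct', 'Nov'],
    'Start': [datetime.date(2020, 10, 1), datetime.date(2020, 11, 1)],
})

# Parse to datetime64 so the .dt accessor becomes available.
Hours['Start'] = pd.to_datetime(Hours['Start'])

Sym1 = 'Oct'  # assumed lookup key
# Format as strings, then take the first match as a scalar instead of an array.
start_date = Hours.loc[Hours['Month'] == Sym1, 'Start'].dt.strftime('%Y-%m-%d').iloc[0]
print(start_date)
```

Using .iloc[0] instead of .values returns a single string, which avoids the surrounding brackets when printing.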

parse odd dataframe index to datetime

I have a dataframe that I've pulled from the EIA API, however, all of the index values are of the format 'YYYY mmddTHHZ dd'. For example, 11am on today's date appears as '2020 0317T11Z 17'.
What I would like to be able to do is parse this index so that there are separate ['Date'] and ['Time'] columns, with the date in YYYY-mm-dd format and the hour as a single number, e.g. 11.
It is not a datetime object and I'm not sure how to parse an index and replace in this manner. Any help is appreciated.
Thanks.
Remove the redundant trailing part, then parse what is left:
import pandas as pd
s = pd.Series(['2020 0317T11Z 17'])
datetimes = pd.to_datetime(s.str[:-4], format='%Y %m%dT%H')
# after converting to datetime, you can extract
dates = datetimes.dt.normalize()
times = datetimes.dt.time
# or better
# times = datetimes - dates
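Since the question asks for columns built from the index, with the hour as a single number, here is a possible variant of the same idea applied to an index. The frame and its value column are made up; only the index format comes from the question.

```python
import pandas as pd

# Hypothetical frame whose index uses the 'YYYY mmddTHHZ dd' format.
df = pd.DataFrame(
    {'value': [1.0, 2.0]},
    index=['2020 0317T11Z 17', '2020 0317T12Z 17'],
)

# Drop the trailing 'Z dd' (4 characters) and parse the remainder.
parsed = pd.to_datetime(df.index.str[:-4], format='%Y %m%dT%H')

df['Date'] = parsed.strftime('%Y-%m-%d')  # date as a YYYY-mm-dd string
df['Time'] = parsed.hour                  # hour as an integer, e.g. 11
print(df)
```

If real datetime values are preferred over strings for the Date column, parsed.normalize() could be assigned instead.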

Sort and Filter data from a Panda Dataframe according to date range

My dataframe has two columns: (i) a date column in a string format and (ii) an int value. I would like to convert the date string into a date object and then filter and sort the data according to a date range. Converting one string to a date worked fine with:
date = dateutil.parser.parse(date_string)
date = ("%02d:%02d:%02d" % (date.hour, date.minute, date.second))
How can I iterate over all the values in the dataframe and apply the parsing, so I can then use pandas on the df to filter and sort the data as follows?
df.sort(['etime'])
df[df['etime'].isin([begin_date, end_date])]
Sample of my dataframe data is below:
etime instantaneous_ops_per_sec
3 2016-06-15T15:30:09Z 26
4 2016-06-15T15:30:14Z 26
5 2016-06-15T15:30:19Z 24
6 2016-06-15T15:30:24Z 27
You want to use pd.to_datetime, with a format that matches the strings in your sample:
df['etime'] = pd.to_datetime(df['etime'], format="%Y-%m-%dT%H:%M:%SZ")
Try this:
df['etime'] = pd.to_datetime(df['etime'], format="%Y-%m-%dT%H:%M:%SZ")
df[df['etime'].between(begin_date, end_date)]
Caution: your variable is named date, but you format it as a time-of-day string and then sort on that, so the results may not be what you are after. You usually want to filter first and then sort, but the code in the question does the opposite.
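For reference, a minimal sketch of the whole parse, filter, sort sequence on data shaped like the sample. The begin_date and end_date values are arbitrary choices for the example, and the rows are shuffled on purpose so the sort has something to do.

```python
import pandas as pd

# Sample rows in the question's format, deliberately out of order.
df = pd.DataFrame({
    'etime': ['2016-06-15T15:30:19Z', '2016-06-15T15:30:09Z',
              '2016-06-15T15:30:24Z', '2016-06-15T15:30:14Z'],
    'instantaneous_ops_per_sec': [24, 26, 27, 26],
})

# Vectorised parse: no explicit loop over the rows is needed.
df['etime'] = pd.to_datetime(df['etime'], format='%Y-%m-%dT%H:%M:%SZ')

# Assumed range bounds for illustration.
begin_date = pd.Timestamp('2016-06-15 15:30:10')
end_date = pd.Timestamp('2016-06-15 15:30:20')

# Filter first, then sort the smaller result.
result = df[df['etime'].between(begin_date, end_date)].sort_values('etime')
print(result)
```

Note that isin([begin_date, end_date]) in the question would only match rows equal to one of the two endpoints, whereas between selects everything inside the range. sort_values is the current replacement for the older df.sort.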

Pandas: select all dates with specific month and day

I have a dataframe full of dates, and I would like to select all dates where month==12 and day==25 and replace the zero in the xmas column with a 1.
Is there any way to do this? The second line of my code errors out.
df = DataFrame({'date':[datetime(2013,1,1).date() + timedelta(days=i) for i in range(0,365*2)], 'xmas':np.zeros(365*2)})
df[df['date'].month==12 and df['date'].day==25] = 1
A pandas Series of datetimes now behaves differently: date parts are read through the .dt accessor.
This is how it should be done now:
df.loc[(df['date'].dt.day==25) & (df['date'].dt.month==12), 'xmas'] = 1
Basically, what you tried won't work because you need to use & to combine boolean arrays element-wise, and you need parentheses because & binds more tightly than ==. On top of this, you should use loc to perform the indexing. On a Series, the month and day are read through the .dt accessor:
df.loc[(df['date'].dt.month==12) & (df['date'].dt.day==25), 'xmas'] = 1
This question needed an updated answer. As of today, there's a slight difference in how you extract months from datetime objects in a pd.Series.
So from the very start, in case you have a raw date column of strings, first convert it to datetime objects with a simple function:
import datetime as dt
def read_as_datetime(str_date):
    # replace %Y-%m-%d with your own date format
    return dt.datetime.strptime(str_date, '%Y-%m-%d')
Then apply this function to your dates column and save the results in a new column named datetime:
df['datetime'] = df.dates.apply(read_as_datetime)
Finally, to select dates by day and month, use the same piece of code that @Shayan RC explained, with one slight change; notice the .dt accessor after the datetime column:
df.loc[(df['datetime'].dt.month==12) & (df['datetime'].dt.day==25), 'xmas'] = 1
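One detail the answers pass over: the frame in the question stores datetime.date objects, so the column has object dtype and the .dt accessor is not available until it is converted. A minimal sketch of the full flow on the question's own data, assuming pd.to_datetime is an acceptable way to do that conversion:

```python
from datetime import datetime, timedelta

import numpy as np
import pandas as pd

n = 365 * 2
df = pd.DataFrame({
    'date': [datetime(2013, 1, 1).date() + timedelta(days=i) for i in range(n)],
    'xmas': np.zeros(n),
})

# The column holds datetime.date objects (object dtype); convert to datetime64 first.
df['date'] = pd.to_datetime(df['date'])

# Element-wise & with parentheses around each comparison.
is_xmas = (df['date'].dt.month == 12) & (df['date'].dt.day == 25)
df.loc[is_xmas, 'xmas'] = 1

print(df.loc[is_xmas])
```

Building the mask as a named variable first keeps the loc line short and lets the same mask be reused for inspection.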
