Today's Date in pandas with specific format - python 2.7 - python

I need to get today's date in pandas in the following int format:
2015,7,27
I am using it to get some data from a certain time frame:
sdate = date(2015,7,27)
edate = date(2015,7,31) #I would like not to hardcode it.
I tried:
today = datetime.today().strftime("%Y/%m/%d")
It outputs a string that I have to convert.
<type 'str'>
If I use the str it gives:
TypeError: an integer is required
Is there a pythonic way to solve this?
Thanks in advance for suggestions.

You don't need an int tuple. date is a constructor that takes three int arguments: the year, the month and the day. To avoid hardcoding the end date, call date and pass it the components of the current date:
edate = date(datetime.today().year, datetime.today().month, datetime.today().day)
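A shorter variant, as a minimal sketch: the standard library's date.today() already returns a datetime.date for the current local day, so the three separate datetime.today() calls can be avoided. The name today_tuple is only illustrative and is not from the question.

```python
from datetime import date

# date.today() returns a datetime.date for the current local day.
edate = date.today()

# The int fields remain available if a (year, month, day) tuple is needed.
today_tuple = (edate.year, edate.month, edate.day)
print(edate, today_tuple)
```

The result can be used wherever date(2015,7,31) was hardcoded before.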

Related

Wrong string format when converting my date column

I am trying to convert my date column, df['CO date'], which holds values such as 3/02/21 in day/month/year order. The problem arises when I parse it and then format it back to a string, like this:
df['CO date'] = pd.to_datetime(df['CO date']).dt.strftime("%d/%m/%y")
For some reason, after converting from datetime back to a string with this format, the date comes out in the American order, 02/03/21. I don't understand why this happens. The only thing I can think of is that Python's %d directive zero-pads the day (01, 02, 03, 04, etc.), whereas the day in my df is originally "3" (no zero padding).
Does anybody know how I can solve this problem?
Many thanks in advance
Your formatting looks right. The only way to get that result is if your data frame contains wrong or corrupted data. You can make a sanity check by:
pd.to_datetime("2021-03-02").strftime("%d/%m/%y")
>>>
'02/03/21'
I think the conversion goes wrong at the start, in the pd.to_datetime(df['CO date']) part, which parses with the wrong format. If you know the exact format, you should pass it through the format argument of pd.to_datetime, like:
pd.to_datetime("2021-02-03", format="%Y-%d-%m").strftime("%d/%m/%y")
>>>
'02/03/21'
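Applied to the column from the question, this could look like the sketch below. The two sample values are hypothetical and only assume the day/month/two-digit-year layout the question describes.

```python
import pandas as pd

# Hypothetical sample in day/month/two-digit-year order, as described in the question.
df = pd.DataFrame({"CO date": ["3/02/21", "14/02/21"]})

# An explicit format makes pandas read the first field as the day, not the month.
parsed = pd.to_datetime(df["CO date"], format="%d/%m/%y")

# Format back to a string; %d zero-pads the day, so "3" becomes "03".
df["CO date"] = parsed.dt.strftime("%d/%m/%y")
print(df)
```

If the exact format is not known, pd.to_datetime also accepts dayfirst=True, but an explicit format is stricter.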
Parse the dates inside a try/except block to find the values in the dataframe column that are invalid. You can also check the day, month and year against their valid ranges and raise a custom error if they are exceeded.
print(date.day)
print(date.month)
print(date.year)
def date_check(date):
    try:
        datetime.strptime(date, '%d/%m/%Y')
        return True
    except ValueError:
        return False
or
if pd.to_datetime(df['date'], format='%d-%b-%Y', errors='coerce').notnull().all():
    pass  # every value in the column parsed with this format
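To see which rows fail rather than only whether all of them pass, the same errors='coerce' idea can be turned into a mask. This is a sketch with made-up values; the column name date and the %d/%m/%Y format are assumptions.

```python
import pandas as pd

# Hypothetical column: one valid date, one impossible day, one impossible month.
df = pd.DataFrame({"date": ["03/02/2021", "31/02/2021", "15/13/2021"]})

# errors="coerce" turns unparseable values into NaT instead of raising.
parsed = pd.to_datetime(df["date"], format="%d/%m/%Y", errors="coerce")

# Rows whose value could not be parsed with the expected format.
bad_rows = df[parsed.isna()]
print(bad_rows)
```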

Create date from one year with string and int error - PYTHON

I have the following problem. I want to create a date from another. To do this, I extract the year from the database date and then create the chosen date (day = 30 and month = 9) being the year extracted from the database.
The code is the following
bbdd20Q3['year']=(pd.DatetimeIndex(bbdd20Q3['datedaymonthyear']).year)
y=(bbdd20Q3['year'])
m=int(9)
d=int(30)
bbdd20Q3['mydate']=dt.datetime(y,m,d)
But error message is this
"cannot convert the series to <class 'int'>"
I think dt means datetime, so the line dt.datetime(y,m,d) creates a single datetime object.
Should bbdd20Q3['mydate'] hold an int?
If so, try to think of another way to store the date (as an 8-digit number, maybe).
Hope I helped :)
I assume that you did import datetime as dt. Then by doing:
bbdd20Q3['year']=(pd.DatetimeIndex(bbdd20Q3['datedaymonthyear']).year)
y=(bbdd20Q3['year'])
m=int(9)
d=int(30)
bbdd20Q3['mydate']=dt.datetime(y,m,d)
you are passing a Series as the first argument to datetime.datetime, which expects an int or something that can be converted to an int. You should create one datetime.datetime for each element of the Series, not a single datetime.datetime. Consider the following example:
import datetime
import pandas as pd
df = pd.DataFrame({"year":[2001,2002,2003]})
df["day"] = df["year"].apply(lambda x:datetime.datetime(x,9,30))
print(df)
Output:
   year        day
0  2001 2001-09-30
1  2002 2002-09-30
2  2003 2003-09-30
Here's a sample code with the required logic -
import pandas as pd
df = pd.DataFrame.from_dict({'date': ['2019-12-14', '2020-12-15']})
print(df.dtypes)
# convert the date in string format to datetime object,
# if the date column(Series) is already a datetime object then this is not required
df['date'] = pd.to_datetime(df['date'])
print(f'after conversion \n {df.dtypes}')
# logic to create a new data column
df['new_date'] = pd.to_datetime({'year':df['date'].dt.year,'month':9,'day':30})
@eollon I see that you are also new to Stack Overflow. It would be better if you could add some simple sample code that others can try out independently
(keeping the comment here since I don't have permission to comment :) )

reformatting the timestamp in my dataset to have it as datetime

I want to reformat the timestamp in my dataset to have it as a date + time.
here is my dataset
and I tried this
data1 = pd.read_excel(r"C:\Users\user\Desktop\Consumption.xlsx")
data1['Timestamp']= pd.to_datetime(['Timestamp'], unit='s')
and I got this error
ValueError: non convertible value Timestamp with the unit 's'
I also tried not to pass the "unit" in the pd.to_datetime function and it gave an error
The type of time stamp is Object. Please any help.
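The values are not Unix timestamps, which is why the error is raised. Note also that pd.to_datetime(['Timestamp'], unit='s') passes a list containing the literal string 'Timestamp', not the column data1['Timestamp']. You can split the values on ; and select the second element of each list with str[1], then convert to datetimes: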
data1['Timestamp']= pd.to_datetime(data1['Timestamp'].str.split(';').str[1])
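Since the dataset itself is not shown, here is a sketch on made-up values. The "id;date time" layout and the day/month/year format are assumptions, not taken from the original file.

```python
import pandas as pd

# Hypothetical values: the real file layout is not shown in the question,
# so the "id;date time" shape used here is an assumption.
data1 = pd.DataFrame({"Timestamp": ["1;27/07/2015 10:15:00", "2;28/07/2015 11:30:00"]})

# Keep the part after the first ';' and parse it with an explicit format.
second_part = data1["Timestamp"].str.split(";").str[1]
data1["Timestamp"] = pd.to_datetime(second_part, format="%d/%m/%Y %H:%M:%S")
print(data1.dtypes)
```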
I would suggest you check the documentation of the function here
If the values include both a date and a time, you can pass a format like this:
format='%d/%m/%Y %H:%M:%S'
Try this:
data1['Date'] = pd.to_datetime(data1['Timestamp'], format='%d/%m/%Y %H:%M:%S')

How to filter by date with python's datatable

I have the following datatable, which I would like to filter by dates greater than "2019-01-01". The problem is that the dates are strings.
dt_dates = dt.Frame({"days_date": ['2019-01-01','2019-01-02','2019-01-03']})
This is my best attempt.
dt_dates[f.days_date > datetime.strptime(f.days_date, "2019-01-01")]
this returns the error
TypeError: strptime() argument 1 must be str, not Expr
what is the best way to filter dates in python's datatable?
Reference
python datatable
f-expressions
Your strptime call is incorrect for converting a string to a datetime. The first argument is the string to parse and the second argument is the date format:
datetime.strptime("2019-01-01", "%Y-%m-%d")
However, let's take a step back, because comparing a string column to a datetime isn't the right way to do it.
First, we should convert all the dates in your Frame to datetimes. I'll be honest, I've never used datatable, but the syntax looks very similar to a pandas DataFrame.
In a DataFrame, we can do the following:
df_date['days_date'] = df_date['days_date'].apply(lambda x: datetime.strptime(x, '%Y-%m-%d'))
This goes through each row of the 'days_date' column and converts each string into a datetime.
From there, we can use a filter to get the relevant rows:
df_date = df_date[df_date['days_date'] > datetime.strptime("2019-01-01", "%Y-%m-%d")]
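Put together as a runnable pandas sketch (this is pandas, not datatable, and it uses pd.to_datetime in place of the row-by-row apply; the names cutoff and later are only illustrative):

```python
from datetime import datetime

import pandas as pd

df_date = pd.DataFrame({"days_date": ["2019-01-01", "2019-01-02", "2019-01-03"]})

# Convert the whole string column to datetime64 in one call.
df_date["days_date"] = pd.to_datetime(df_date["days_date"], format="%Y-%m-%d")

# Parse the cutoff once, then keep only rows strictly after it.
cutoff = datetime.strptime("2019-01-01", "%Y-%m-%d")
later = df_date[df_date["days_date"] > cutoff]
print(later)
```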
datatable version 1.0.0 introduced native support for date and time data types. Note the difference between these two ways to initialize data:
dt_dates = dt.Frame({"days_date": ['2019-01-01','2019-01-02','2019-01-03']})
dt_dates.stypes
> (stype.str32,)
and
dt_dates = dt.Frame({"days_date": ['2019-01-01','2019-01-02','2019-01-03']}, stype="date32")
dt_dates.stypes
> (stype.date32,)
The latter frame contains days_date column of type datatable.Type.date32 that represents a calendar date. Then one can filter by date as follows:
split_date = datetime.datetime.strptime("2019-01-01", "%Y-%m-%d")
dt_split_date = dt.time.ymd(split_date.year, split_date.month, split_date.day)
dt_dates[dt.f.days_date > dt_split_date, :]

Comparison between datetime and datetime64[ns] in pandas

I'm writing a program that checks an excel file and if today's date is in the excel file's date column, I parse it
I'm using:
cur_date = datetime.today()
for today's date. I'm checking if today is in the column with:
bool_val = cur_date in df['date'] #evaluates to false
I do know for a fact that today's date is in the file in question. The dtype of the series is datetime64[ns]
Also, I am only checking the date itself and not the timestamp afterwards, if that matters. I'm doing this to make the timestamp 00:00:00:
cur_date = datetime.strptime(cur_date.strftime('%Y_%m_%d'), '%Y_%m_%d')
And the type of that object after printing is datetime as well
For anyone who stumbled across this while comparing a dataframe date to a date variable and did not find an exact answer here, you can use the code below.
Instead of:
self.df["date"] = pd.to_datetime(self.df["date"])
You can import datetime and then add .dt.date to the end like:
self.df["date"] = pd.to_datetime(self.df["date"]).dt.date
You can use
pd.Timestamp('today')
or
pd.to_datetime('today')
But both of those give the date and time for 'now'.
Try this instead:
pd.Timestamp('today').floor('D')
or
pd.to_datetime('today').floor('D')
You could also have passed the datetime object to pandas.to_datetime, but I like the other option more.
pd.to_datetime(datetime.datetime.today()).floor('D')
Pandas also has a Timedelta object
pd.Timestamp('now').floor('D') + pd.Timedelta(-3, unit='D')
Or you can use the offsets module
pd.Timestamp('now').floor('D') + pd.offsets.Day(-3)
To check for membership, try one of these
cur_date in df['date'].tolist()
Or
df['date'].eq(cur_date).any()
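The reason the original check evaluates to False is that the in operator on a Series tests the index labels, not the values. A small sketch with a fixed, hypothetical date (so the result does not depend on the day it is run):

```python
import pandas as pd

# Hypothetical frame with a datetime64[ns] column and the default integer index.
df = pd.DataFrame({"date": pd.to_datetime(["2015-07-27", "2015-07-28"])})

# Stand-in for "today": normalize() sets the time part to 00:00:00.
cur_date = pd.Timestamp("2015-07-27 14:30:00").normalize()

# `in` on a Series looks at the index (0, 1), so this is False.
in_index = cur_date in df["date"]

# Comparing against the values gives the intended answer.
in_values = df["date"].eq(cur_date).any()
print(in_index, in_values)
```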
When converting a datetime64 value with pd.Timestamp(), it is important to compare it to another Timestamp (not to a datetime.date).
Convert a date to numpy.datetime64:
date = '2022-11-20 00:00:00'
date64 = np.datetime64(date)
Seven days ago, as a Timestamp:
sevenDaysAgoTs = pd.to_datetime('today') - pd.Timedelta(days=7)
Convert date64 to a Timestamp and check whether it falls within the last 7 days:
print(pd.Timestamp(date64) >= sevenDaysAgoTs)
