I have a DataFrame with a date_time column. The date_time column contains a date and time. I also managed to convert the column to a datetime object.
I want to create a new DataFrame containing all the rows of a specific DAY.
I managed to do it when I set the date column as the index and used the "loc" method.
Is there a way to do it even if the date column is not set as the index? I only found a method which returns the rows between two days.
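You can use the groupby() function. Assuming your DataFrame is called df and the datetime column is called date_time, group on the date part of each timestamp. Grouping on the raw column would create one group per distinct timestamp rather than one per day.
df_group = df.groupby(df['date_time'].dt.date)
Now you can get the rows for any day by passing a datetime.date object to the get_group function,
df_group.get_group(datetime.date(2022, 8, 2)) # example date; requires "import datetime"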
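If only one day is needed, a boolean mask may be simpler than grouping, and it works without setting the column as the index. The sketch below is only an illustration: the sample data, the value column and target_day are made up, and date_time is the column name taken from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column name date_time comes from the question.
df = pd.DataFrame({
    "date_time": pd.to_datetime([
        "2022-08-01 09:15:00",
        "2022-08-02 00:00:00",
        "2022-08-02 17:45:30",
        "2022-08-03 08:00:00",
    ]),
    "value": [10, 20, 30, 40],
})

target_day = pd.Timestamp("2022-08-02")  # assumed day of interest

# dt.normalize() resets each timestamp to midnight, so all rows of the
# same calendar day compare equal to target_day.
same_day = df[df["date_time"].dt.normalize() == target_day]

# Equivalent half-open range filter: [day, day + 1 day).
next_day = target_day + pd.Timedelta(days=1)
in_range = df[(df["date_time"] >= target_day) & (df["date_time"] < next_day)]

print(same_day)
```

Both filters return a new DataFrame and leave df unchanged.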
Related
Hello, I have a DataFrame containing a date column. I would like to loop through these dates and compare each one to the current date to see if any entry is from today. I tried converting the column to a list using the tolist() method, but the output was not the plain date but rather "Timestamp('2022-08-02 00:00:00')", even though my column only contains dates formatted as %Y-%m-%d.
Assuming that your Dataframe is called df, here's a possible way of solving your issue:
df.loc[df.Date == pd.Timestamp.now().date().strftime('%Y-%m-%d')]
I think this is a straightforward solution: it filters the DataFrame on the "Date" column, comparing it with today's date in the same %Y-%m-%d format.
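For what it's worth, the same check can be done without going through a string. This is just a sketch with made-up data (the item column and the two rows are hypothetical), assuming the Date column has a datetime dtype as in the question.

```python
import pandas as pd

today = pd.Timestamp.now().normalize()  # today's date at midnight

# Hypothetical sample data: one row dated today, one dated yesterday.
df = pd.DataFrame({
    "Date": [today, today - pd.Timedelta(days=1)],
    "item": ["a", "b"],
})

# Vectorized comparison; no explicit loop or tolist() is needed.
is_today = df["Date"].dt.normalize() == today
todays_rows = df.loc[is_today]
has_entry_today = bool(is_today.any())

# If a list of plain date objects is wanted instead of Timestamps,
# convert explicitly before calling tolist().
dates_as_list = df["Date"].dt.date.tolist()

print(has_entry_today)
print(dates_as_list)
```

The Timestamp objects seen in the tolist() output are simply how pandas represents individual datetime values, so the column itself is fine.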
I want to add a new column 'timestamp' to an existing pandas DataFrame. I tried the code below,
df["timestamp"]=datetime.datetime.now().replace(microsecond=0).replace(second=0).isoformat()+"Z"
But I got the same timestamp for every row. What I actually need is a new column containing a series of timestamps that starts from a particular timestamp.
Something like this, maybe? It creates one timestamp per row, one second apart, starting from the current time:
df["timestamp"] = pd.date_range(datetime.datetime.now(), freq='1s', periods=len(df)).strftime('%Y-%m-%dT%H:%M:%SZ')
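To start from a particular timestamp rather than from now, the same idea could look like the sketch below. The start value, the one-minute step and the reading column are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical frame; any existing DataFrame would do.
df = pd.DataFrame({"reading": [1.0, 2.0, 3.0]})

# Assumed start time and one-minute step; adjust both as needed.
start = pd.Timestamp("2022-08-02 10:00:00")
stamps = pd.date_range(start=start, periods=len(df), freq="min")

# strftime returns an Index of strings. The trailing "Z" is a literal
# character here, so it is only correct if the times really are UTC.
df["timestamp"] = stamps.strftime("%Y-%m-%dT%H:%M:%SZ")

print(df)
```

If the column should stay a real datetime column rather than strings, assign stamps directly and skip the strftime call.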
I have a DataFrame named inject. I have set a column named date as the index of the DataFrame, and I want to find the rows corresponding to a particular date. The data type of the date column is datetime.
inject_2017["2017-04-20"]
This code throws an error.
Try inject_2017.loc["2017-04-20"]
This way you can select the row (or group of rows) with the corresponding datetime index.
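As a small illustration of this, assuming the index is a DatetimeIndex: the frame below is a made-up stand-in for inject_2017, and the dose column is hypothetical.

```python
import pandas as pd

# Hypothetical data standing in for the inject_2017 frame from the question.
inject_2017 = pd.DataFrame(
    {"dose": [1, 2, 3]},
    index=pd.to_datetime([
        "2017-04-19 08:00",
        "2017-04-20 09:30",
        "2017-04-20 18:00",
    ]),
)
inject_2017.index.name = "date"

# With a DatetimeIndex, .loc accepts a date string and returns every row
# whose timestamp falls on that day.
rows = inject_2017.loc["2017-04-20"]

print(rows)
```

Plain square brackets on a DataFrame look up column labels, which is the likely reason the original expression fails.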
for date in list_of_dates:
    df = *dataframe with identifiers for rows and dates for columns*
    new_column = *a new column with a new date to be added to the df*
    df_incl_new_column = *original df merged with new column*
I want to use the 'df_incl_new_column' at the start of the loop as the new 'df' and keep iteratively adding new columns for each date, and using the dataframe with the new column as the start 'df' again and again.
I want to do this for a list of over 20 dates to build a new dataframe with all the new columns.
Each new column has data which changes depending on the previous new column having been added to the df.
What is the best way to do this?
It may be that a for loop is not appropriate, but I need to build the dataframe gradually, using the latest data in the dataframe to add the next column.
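You should try this. Create df once before the loop, and use each date as the name of the new column so that columns are added instead of overwritten:
df = *dataframe with identifiers for rows and dates for columns*
for date in list_of_dates:
    df[date] = *a new column with a new date to be added to the df*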
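A runnable sketch of this loop pattern follows. Everything in it is hypothetical: the identifiers, the dates, and in particular the "double the previous column" rule, which only stands in for whatever calculation produces the real new column.

```python
import pandas as pd

# Hypothetical starting frame: identifiers as rows, one date column so far.
df = pd.DataFrame({"2022-01-01": [1.0, 2.0]}, index=["id_a", "id_b"])
list_of_dates = ["2022-01-02", "2022-01-03"]

for date in list_of_dates:
    # The most recently added column is always the last one.
    previous = df.iloc[:, -1]
    # Placeholder rule standing in for the real calculation.
    df[date] = previous * 2

print(df)
```

Because df is modified in place, each iteration sees the columns added by earlier iterations, so no separate df_incl_new_column variable is needed.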
I am trying to get some data through the API from quandl, but the date column doesn't seem to behave like the other columns. E.g. when I use the following code:
data = quandl.get("WIKI/KO", trim_start="2000-12-12", trim_end="2014-12-30",
                  authtoken=quandl.ApiConfig.api_key)
print(data['Open'])
I end up with the below result
Date
2000-12-12 57.69
2000-12-13 57.75
2000-12-14 56.00
2000-12-15 55.00
2000-12-18 54.00
That is, the dates appear alongside the 'Open' column. And when I try to include Date directly, like this:
print(data[['Open','Date']])
it says Date doesn't exist as a column. So I have two questions: (1) How do I make Date an actual column and (2) How do I select only the 'Open' column (and thus not the dates).
Thanks in advance
Why does print(data['Open']) show dates even though Date is not a column?
quandl.get returns a pandas DataFrame whose index is a DatetimeIndex.
Thus, to access the dates you would use data.index instead of data['Date'].
(1) How do I make Date an actual column
If you wish to make the DatetimeIndex into a column, call reset_index:
data = data.reset_index()
print(data[['Open', 'Date']])
(2) How do I select only the 'Open' column (and thus not the dates)
To obtain a NumPy array of values without the index, use data['Open'].values.
(All pandas Series and DataFrames have Indexes (that's pandas' raison d'être!), so the only way to obtain the values without the index is to convert the Series or DataFrame to a different kind of object, like a NumPy array.)
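Both points can be tried out without calling the API. In the sketch below, the frame is a hand-built stand-in for what quandl.get returns, using the first three values from the output in the question. The to_string(index=False) line is an extra suggestion for printing only.

```python
import pandas as pd

# Hypothetical stand-in for the frame returned by quandl.get:
# an 'Open' column with a DatetimeIndex named 'Date'.
data = pd.DataFrame(
    {"Open": [57.69, 57.75, 56.00]},
    index=pd.DatetimeIndex(
        ["2000-12-12", "2000-12-13", "2000-12-14"], name="Date"
    ),
)

# (1) Turn the index into an ordinary column.
with_date = data.reset_index()
print(with_date[["Open", "Date"]])

# (2) Values only, as a NumPy array (to_numpy() is the method form of .values).
open_values = data["Open"].to_numpy()
print(open_values)

# For display purposes only, the index can also be left out when printing.
text = data["Open"].to_string(index=False)
print(text)
```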