Iteratively add columns to the same dataframe - python

for date in list_of_dates:
    df = *dataframe with identifiers for rows and dates for columns*
    new_column = *a new column with a new date to be added to the df*
    df_incl_new_column = *original df merged with new column*
I want to use 'df_incl_new_column' as the new 'df' at the start of the next iteration, so that each pass through the loop adds one more date column to the dataframe produced by the previous pass.
I want to do this for a list of over 20 dates to build a new dataframe with all the new columns.
The data in each new column depends on the previous new column having already been added to the df.
What is the best way to do this?
A for loop may not be appropriate, but I need to build the dataframe gradually, using the latest data in the dataframe to compute the next column.

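You should try this. Create the dataframe once, before the loop, and assign each new column under its own name (here, the date) so that one iteration does not overwrite the column added by the previous one:
df = *dataframe with identifiers for rows and dates for columns*
for date in list_of_dates:
    df[date] = *a new column with a new date to be added to the df*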
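As a minimal sketch of the loop above: the identifiers, the dates, and the rule for computing each column (previous column plus one) are all made up for illustration. The point is only that every iteration reads the latest column of the same df, including columns added earlier in the loop.

```python
import pandas as pd

# Hypothetical starting data: identifiers as the index, one date column.
df = pd.DataFrame({"2021-01-01": [1.0, 2.0, 3.0]}, index=["id_a", "id_b", "id_c"])
list_of_dates = ["2021-01-02", "2021-01-03", "2021-01-04"]

for date in list_of_dates:
    previous = df.iloc[:, -1]  # latest column, including ones added by this loop
    df[date] = previous + 1    # placeholder rule; replace with the real calculation
```

Because df is modified in place, there is no need for a separate df_incl_new_column variable.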

Related

Moving a row next to another row in a pandas data frame

I am trying to reshape a data frame from two rows into one row, but I am running into some issues. Do you have any idea how to do that? Here are the code and df:
Thanks!
If you are looking to convert two rows into one, you can do the following:
Stack the dataframe and reset the index at level=1. Stacking moves the column headers into the index; resetting level 1 turns them into a column (called level_1), with the data in another column (called 0).
Then set level_1 as the index, so that the column names become the index.
Remove the index name (level_1), then transpose the dataframe.
The code is shown below.
df3 = df3.stack().reset_index(level=1).set_index('level_1')
df3.index.name = None
df3 = df3.T
Output
df3
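Since the original data is not shown, here is a small sketch of the same three steps on a made-up two-row frame (the column names a and b and the values are assumptions). It keeps the result in a separate variable, flat, so the input stays available for comparison.

```python
import pandas as pd

# Hypothetical two-row input; the real df3 from the question is not available.
df3 = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Stack, turn the column-name level into a regular column, then index by it.
flat = df3.stack().reset_index(level=1).set_index("level_1")
flat.index.name = None
flat = flat.T  # one row, one column per original cell
```

Note that the resulting single row has repeated column names (a and b each appear twice), so it may be worth renaming them afterwards.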

Split Data Frame into New Dataframe for each consecutive Column

I want to split the columns of this data frame into multiple data frames, each holding the date column and one of the other columns. How do I write a function that automates this? We would end up with n data frames, where n is the number of columns in the original data frame minus 1 (the date column).
First, set the date column as the index. set_index returns a new dataframe, so assign the result back:
df = df.set_index('Date')
Then, when you select a single column you get a Series object, indexed by date, that holds your column of interest:
e.g. df.P19245Y8E will give a Series of the second column.
I think this will do what you need, but if you really want a separate dataframe for each column, iterate through the columns and select with double brackets, which returns a one-column DataFrame rather than a Series:
new_dfs = []
for col in df.columns:
    new_dfs.append(df[[col]])
or with a list comprehension:
new_dfs = [df[[col]] for col in df.columns]
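If the date should stay a regular column instead of the index, a variant along these lines may be closer to what the question asks for. The column names col_a and col_b and the values are invented for the example; only Date comes from the question.

```python
import pandas as pd

# Hypothetical input: a Date column plus two value columns.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2021-01-01", "2021-01-02"]),
    "col_a": [1, 2],
    "col_b": [3, 4],
})

# One two-column dataframe (Date + value column) per non-date column.
value_cols = [c for c in df.columns if c != "Date"]
frames = {c: df[["Date", c]] for c in value_cols}
```

Keying the results by column name in a dict makes it easy to look up a particular frame later, e.g. frames["col_a"].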

Add Row to Dataframe With Streamlit

I have a data frame with my stock portfolio. I want to be able to add a stock to my portfolio on streamlit. I have text input in my sidebar where I input the Ticker, etc. When I try to add the Ticker at the end of my dataframe, it does not work.
ticker_add = st.sidebar.text_input("Ticker")
df['Ticker'][len(df)+1] = ticker_add
The code [len(df)+1] does not work. When I try to do [len(df)-1], it works but I want to add it to the end of the dataframe, not replace the last stock. It seems like it can't add a row to the dataframe.
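Solution
First check the type of ticker_add (st.text_input returns the entered text as a string):
type(ticker_add)
Adding a new row to a dataframe
Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so the examples here use pd.concat instead.
Assuming ticker_add is a dictionary with the column names of the dataframe df as the keys, you can do this:
df = pd.concat([df, pd.DataFrame([ticker_add])], ignore_index=True)
Assuming it is a single non-array-like input, you can do this:
# adds a new row to a single-column ("Ticker") dataframe
df = pd.concat([df, pd.DataFrame([{'Ticker': ticker_add}])], ignore_index=True)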
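A self-contained sketch of two ways to add a row at the end, without Streamlit. The ticker symbols are placeholders, and ticker_add stands in for the value returned by the text input.

```python
import pandas as pd

df = pd.DataFrame({"Ticker": ["AAPL", "MSFT"]})
ticker_add = "GOOG"  # stands in for st.sidebar.text_input("Ticker")

# Option 1: label-based enlargement. With the default 0..n-1 index,
# len(df) is the next unused row label (len(df) + 1 would skip one).
df.loc[len(df), "Ticker"] = ticker_add

# Option 2: build a one-row dataframe and concatenate it.
new_row = pd.DataFrame([{"Ticker": "AMZN"}])
df = pd.concat([df, new_row], ignore_index=True)
```

The chained form df['Ticker'][len(df)+1] = ... from the question assigns into an intermediate Series, which is why it cannot grow the dataframe; .loc on the dataframe itself can.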
References
Add one row to pandas DataFrame
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.append.html

Select DataFrame rows of a specific day

I have a DataFrame with a date_time column. The date_time column contains a date and time. I also managed to convert the column to a datetime object.
I want to create a new DataFrame containing all the rows of a specific DAY.
I managed to do it when I set the date column as the index and used the "loc" method.
Is there a way to do it even if the date column is not set as the index? I only found a method which returns the rows between two days.
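You can use the groupby() function. Let's say your dataframe is df and the datetime column is called date_time. Group by the calendar date rather than by the full timestamp, otherwise every distinct time of day forms its own group:
df_group = df.groupby(df['date_time'].dt.date)
Now you can access the rows of any day by passing that day, as a datetime.date object, to the get_group function:
df_group.get_group(date_here)  # date_here is a datetime.date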
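If only one day is needed, a boolean mask may be simpler than grouping. The sketch below assumes the column is named date_time, as in the question, and uses invented timestamps and values.

```python
import pandas as pd

df = pd.DataFrame({
    "date_time": pd.to_datetime(
        ["2021-03-01 08:00", "2021-03-01 17:30", "2021-03-02 09:15"]
    ),
    "value": [1, 2, 3],
})

# normalize() sets the time part to midnight, so rows can be compared by day.
day = pd.Timestamp("2021-03-01")
same_day = df[df["date_time"].dt.normalize() == day]
```

This works without setting the date column as the index.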

Adding a new column to a pandas dataframe

I have a dataframe df with one column and 500k rows (the first 5 elements of df are given below). I want to add new data to the existing column. The new data is a matrix of 200k rows and 1 column. How can I do it? I also want to add a new column named op.
X098_DE_time
0.046104
-0.037134
-0.089496
-0.084906
-0.038594
We can use the concat function after renaming the column of the second dataframe (df2) to match the first.
df2.rename(columns={'op': 'X098_DE_time'}, inplace=True)
new_df = pd.concat([df, df2], axis=0, ignore_index=True)
Note: If we don't rename the df2 column, the resulting new_df will have 2 different columns.
To add a new column you can use
df["new column"] = list_of_values  # one value per row
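Put together, the two steps might look like the sketch below. The frames are tiny stand-ins for the 500k-row and 200k-row data, and the values written into op are placeholders.

```python
import pandas as pd

df = pd.DataFrame({"X098_DE_time": [0.046104, -0.037134]})
df2 = pd.DataFrame({"op": [0.5, 0.6, 0.7]})  # hypothetical new data

# Rename so both frames share one column name, then stack the rows.
df2 = df2.rename(columns={"op": "X098_DE_time"})
new_df = pd.concat([df, df2], axis=0, ignore_index=True)

# Add the new column; it needs one value per row of new_df.
new_df["op"] = range(len(new_df))  # placeholder values
```

ignore_index=True gives the combined frame a fresh 0..n-1 index instead of repeating the row labels of the two inputs.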
