Adding a new column to a pandas dataframe - python

I have a dataframe df with one column and 500k rows (the first 5 elements of df are shown below). I want to append new data to the existing column. The new data is a matrix with 200k rows and 1 column. How can I do it? I also want to add a new column named op.
X098_DE_time
0.046104
-0.037134
-0.089496
-0.084906
-0.038594

We can use the concat function after renaming the column of the second dataframe so that it matches the first.
df2.rename(columns={'op': 'X098_DE_time'}, inplace=True)
new_df = pd.concat([df, df2], axis=0)
Note: If we don't rename the df2 column, the resulting new_df will have 2 different columns.
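
As a minimal sketch of the row-wise concat described above, assuming the new data arrives as a NumPy array of shape (n, 1): the tiny frames and the names new_data and combined are hypothetical stand-ins for the 500k-row and 200k-row data in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the 500k-row frame and the 200k x 1 matrix.
df = pd.DataFrame({"X098_DE_time": [0.046104, -0.037134, -0.089496]})
new_data = np.array([[0.5], [0.25]])  # shape (2, 1)

# Wrap the matrix in a DataFrame that reuses the existing column name,
# so concat stacks the rows instead of creating a second column.
df2 = pd.DataFrame(new_data, columns=df.columns)

# ignore_index=True renumbers the rows 0..n-1 in the result.
combined = pd.concat([df, df2], axis=0, ignore_index=True)
print(combined.shape)
```

Building df2 with columns=df.columns avoids the rename step, since the names already match when the frames are concatenated.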

To add a new column, assign a list with one value per row:
df["new column"] = list_of_values
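
A short sketch of column assignment, using a hypothetical three-row frame and made-up values for op. The assigned list needs one value per row; a scalar is broadcast to every row instead.

```python
import pandas as pd

# Hypothetical three-row frame standing in for the real data.
df = pd.DataFrame({"X098_DE_time": [0.046104, -0.037134, -0.089496]})

# A list (or array) must contain exactly one value per row.
df["op"] = [1, 2, 3]

# A scalar is broadcast to every row.
df["flag"] = 0

print(df.columns.tolist())
```

If the list length does not match the number of rows, pandas raises a ValueError rather than padding the column.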

Related

How to modify pandas dataframe format?

There is a problem with my pandas dataframe. df is my original dataframe. First I select specific columns of df:
df1=df[['cod_far','geo_lat','geo_lon']]
Then I set new names for those columns:
df1.columns = ['cod_far', 'lat', 'lon']
And finally I group df1 by specific columns and convert the result to a new dataframe called "occur":
occur = df1.groupby(['cod_far','lat','lon' ]).size()
occur=pd.DataFrame(occur)
The problem is that I am getting a dataframe with only ONE column, labelled 0. The rows are fine, but there should be 3 columns! Is there any way to drop that "0" and convert my dataframe "occur" into a dataframe with 3 columns?
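
One possible approach, sketched with made-up data: groupby(...).size() returns a Series whose index holds the group keys, which is why only a single column labelled 0 shows up after wrapping it in a DataFrame. Calling reset_index moves the keys back into ordinary columns. The column name count is an assumption, and the result has four columns (the three keys plus the group size).

```python
import pandas as pd

# Hypothetical data using the column names from the question.
df1 = pd.DataFrame({
    "cod_far": [1, 1, 2],
    "lat": [10.0, 10.0, 20.0],
    "lon": [5.0, 5.0, 6.0],
})

# size() yields a Series indexed by the group keys; reset_index turns
# the keys into regular columns and names the size column "count".
occur = df1.groupby(["cod_far", "lat", "lon"]).size().reset_index(name="count")
print(occur.columns.tolist())
```

If only the three key columns are wanted, the count column can be dropped afterwards with occur.drop(columns="count").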

how to make two rows to single row based on index in python

I am trying to convert a data frame with two rows into a single row. Here I am placing tables for better understanding.
This is my actual output
how to convert above table to single row (see below table)
Use DataFrame.set_index with DataFrame.stack to get a Series with a MultiIndex, convert it to a one-column DataFrame with Series.to_frame, transpose it to get a one-row DataFrame, and finally flatten the MultiIndex columns by joining the two levels:
df1 = s.set_index('Gender').stack().to_frame().T
df1.columns = df1.columns.map(lambda x: f'{x[0]} {x[1]}')
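
Since the tables from the question are not available here, the following sketch runs the same steps on a made-up two-row frame. The column names Count and Mean and all values are hypothetical; only Gender comes from the answer.

```python
import pandas as pd

# Hypothetical two-row input; only the "Gender" column name is from the answer.
s = pd.DataFrame({
    "Gender": ["Male", "Female"],
    "Count": [10, 12],
    "Mean": [1.5, 2.5],
})

# stack() gives a Series indexed by (Gender, original column name);
# to_frame().T turns it into a single row with MultiIndex columns.
df1 = s.set_index("Gender").stack().to_frame().T

# Flatten each (Gender, name) pair into a single label such as "Male Count".
df1.columns = df1.columns.map(lambda x: f"{x[0]} {x[1]}")
print(df1.shape)
```

Because stack() puts all values into one Series, mixed integer and float columns end up as floats in the result.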

appending in pandas - row wise

I'm trying to append two columns of my dataframe to an existing dataframe with this:
dataframe.append(df2, ignore_index = True)
and this does not seem to be working.
This is what I'm looking for (kind of) --> a dataframe with 2 columns and 6 rows:
although this is not correct and it's using two print statements to print the two dataframes, I thought it might be helpful to have a selection of the data in mind.
I tried to use concat(), but that leads to some issues as well.
dataframe = pd.concat([dataframe, df2])
but that appears to concatenate the second dataframe as columns rather than rows, in addition to giving NaN values:
Any ideas on what I should do?
I assume this happened because your dataframes have different column names, so concat aligns them as separate columns and fills the gaps with NaN. Try giving the second dataframe the same column names as the first.
df2.columns = dataframe.columns
dataframe_new = pd.concat([dataframe, df2], ignore_index=True)
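
To illustrate, here is a small sketch with made-up frames (the column names a, b, x, y are hypothetical). It shows the NaN-filled result when the names differ and the stacked result once they match.

```python
import pandas as pd

# Hypothetical frames with three rows each and different column names.
dataframe = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
df2 = pd.DataFrame({"x": [7, 8, 9], "y": [10, 11, 12]})

# Different names: concat keeps all four columns and pads with NaN.
misaligned = pd.concat([dataframe, df2], ignore_index=True)

# Same names: the rows of df2 land underneath the rows of dataframe.
df2_renamed = df2.copy()
df2_renamed.columns = dataframe.columns
stacked = pd.concat([dataframe, df2_renamed], ignore_index=True)

print(misaligned.shape, stacked.shape)
```

Working on a copy leaves df2 untouched; assigning to df2.columns directly, as in the answer, works just as well if the original names are no longer needed.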

Add array to a dataframe in python

I have a dataframe df with df.shape: (971, 1)
And I have an array with anarray.shape: (971, 80).
How can I add the array to my dataframe, so that I end up with the shape (971, 81)?
I only find solutions where the array goes into one column, but in my case it should go into several columns.
I believe you need a helper DataFrame with the same index as df, and then DataFrame.join:
df = df.join(pd.DataFrame(anarray, index=df.index))
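
A scaled-down sketch of the same idea, with a 5-row frame and a (5, 3) array instead of (971, 1) and (971, 80). The column name value, the string index, and the arr_ prefix are assumptions added for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical small frame with a non-default index.
df = pd.DataFrame({"value": np.arange(5)}, index=list("abcde"))
anarray = np.ones((5, 3))

# Reusing df.index makes the join align row by row; add_prefix gives
# the array columns string names (arr_0, arr_1, ...) instead of 0, 1, ...
helper = pd.DataFrame(anarray, index=df.index).add_prefix("arr_")
wide = df.join(helper)
print(wide.shape)
```

Passing index=df.index matters when df does not use the default 0..n-1 index; without it the join would match on labels that do not exist in the helper and produce NaN.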

Slicing a set of columns when a pandas dataframe does not include column labels

How to slice the last n columns from pandas dataframe assuming the dataframe does not include column labels? For instance, I want to slice the last 4 columns:
data = np.random.uniform(0,10,(4,10)).astype(int)
df = pd.DataFrame(data)
print(df.ix[:,4])
Can someone fix this up?
You can use iloc instead of ix, with -4: as the column slice:
print(df.iloc[:,-4:])
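
Put together as a runnable sketch, using the builtin int for the cast (an assumption about the NumPy version in use, since the np.int alias is not available in recent releases):

```python
import numpy as np
import pandas as pd

# 4 rows x 10 unlabeled columns, so the column labels default to 0..9.
data = np.random.uniform(0, 10, (4, 10)).astype(int)
df = pd.DataFrame(data)

# iloc slices by position, so -4: selects the last four columns
# regardless of what the column labels are.
last4 = df.iloc[:, -4:]
print(last4.columns.tolist())
```

Note that df.iloc[:, 4] without the slice would return only the single column at position 4, as a Series.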
