Overwrite data frame value


I have two data frames, df and ddff.
df has 3 rows and 5 columns:
import pandas as pd
import numpy as np
df = pd.DataFrame(np.array([[0,1,0,0,1], [1,0,0,1,0], [0,1,1,0,0]]))
df
   0  1  2  3  4
0  0  1  0  0  1
1  1  0  0  1  0
2  0  1  1  0  0
ddff holds the neighbour columns of each column of df. It has 5 rows and 3 columns, and each value in ddff is a column name of df:
ddff = pd.DataFrame(np.array([[3,2,1], [4,2,3], [3,1,4], [4,1,2], [2,3,1]]))
ddff
   0  1  2
0  3  2  1
1  4  2  3
2  3  1  4
3  4  1  2
4  2  3  1
Now I need a final data frame in which the neighbour columns of df are set to 1 (overwriting the previous values).
Expected output:
   0  1  2  3  4
0  0  1  1  1  0
1  1  0  0  1  0
2  0  1  1  0  0

You can filter the relevant column numbers from ddff, and set the values in those columns in the first row equal to 1 and set the values in the remaining columns to 0:
relevant_columns = ddff.loc[0]
df.loc[0,relevant_columns] = 1
df.loc[0,df.columns[~df.columns.isin(relevant_columns)]] = 0
Output:
   0  1  2  3  4
0  0  1  1  1  0
1  1  0  0  1  0
2  0  1  1  0  0

You can use:
s = ddff.loc[0].values
df.loc[0] = np.where(df.columns.isin(s), 1, 0)
>>> df
   0  1  2  3  4
0  0  1  1  1  0
1  1  0  0  1  0
2  0  1  1  0  0
Breaking it down:
>>> np.where(df.columns.isin(s), 1, 0)
array([0, 1, 1, 1, 0])
# Before the update
>>> df.loc[0]
0    0
1    1
2    0
3    0
4    1
# After the assignment back
>>> df.loc[0]
0    0
1    1
2    1
3    1
4    0
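As a self-contained sketch of the idea behind both answers, the row update can be wrapped in a small function. The name overwrite_row and its parameters are hypothetical (they are not part of pandas), and the sketch assumes, as in the question, that the values in ddff are column labels of df and that the row labels of df also exist in ddff.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([[0, 1, 0, 0, 1], [1, 0, 0, 1, 0], [0, 1, 1, 0, 0]]))
ddff = pd.DataFrame(np.array([[3, 2, 1], [4, 2, 3], [3, 1, 4], [4, 1, 2], [2, 3, 1]]))


def overwrite_row(frame, neighbours, row):
    """Hypothetical helper: return a copy of `frame` in which `row` is 1 in the
    columns listed in neighbours.loc[row] and 0 in every other column."""
    out = frame.copy()
    # True for each column of `frame` whose label appears in the neighbour row
    mask = out.columns.isin(neighbours.loc[row])
    out.loc[row] = mask.astype(int)
    return out


result = overwrite_row(df, ddff, 0)
print(result)
```

Returning a copy keeps the original df untouched, which makes it easy to compare the frame before and after the update.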

Related

Splitting a non delimited column and create an additional column to count which number value

I have a problem in which I want to take Table 1 and turn it into Table 2 using Python.
Does anybody have any ideas? I've tried to split the Value column from table 1 but run into issues in that each value is a different length, hence I can't always define how much to split it.
Equally I have not been able to think through how to create a new column that counts the position that value was in the string.
Table 1, before:
ID  Value
1   000000S
2   000FY
Table 2, after:
ID  Position  Value
1   1         0
1   2         0
1   3         0
1   4         0
1   5         0
1   6         0
1   7         S
2   1         0
2   2         0
2   3         0
2   4         F
2   5         Y

You can split each string into individual characters and explode:
out = (df
.assign(Value=df['Value'].apply(list))
.explode('Value')
)
Output:
   ID Value
0   1     0
0   1     0
0   1     0
0   1     0
0   1     0
0   1     0
0   1     S
1   2     0
1   2     0
1   2     0
1   2     F
1   2     Y

Given:
   ID    Value
0   1  000000S
1   2    000FY
Doing:
df.Value = df.Value.apply(list)
df = df.explode('Value')
df['Position'] = df.groupby('ID').cumcount() + 1
Output:
   ID Value  Position
0   1     0         1
0   1     0         2
0   1     0         3
0   1     0         4
0   1     0         5
0   1     0         6
0   1     S         7
1   2     0         1
1   2     0         2
1   2     0         3
1   2     F         4
1   2     Y         5
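For reference, here is a self-contained sketch that builds the input from Table 1 and returns the columns in the order of Table 2. It assumes each ID occurs only once in the input; if an ID were repeated, the per-ID counter would keep running across those rows.

```python
import pandas as pd

df = pd.DataFrame({"ID": [1, 2], "Value": ["000000S", "000FY"]})

out = (
    df.assign(Value=df["Value"].apply(list))  # "000FY" -> ["0", "0", "0", "F", "Y"]
      .explode("Value")                       # one row per character
      .reset_index(drop=True)                 # replace the repeated index labels
)
# Number the characters within each ID, starting at 1
out["Position"] = out.groupby("ID").cumcount() + 1
out = out[["ID", "Position", "Value"]]
print(out)
```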

Data Frame value set to 1 based on condition

There are two data frames:
import pandas as pd
import numpy as np
df = pd.DataFrame(np.array([[0,1,0,0,1], [1,0,0,1,0], [0,1,1,0,0]]))
df
   0  1  2  3  4
0  0  1  0  0  1
1  1  0  0  1  0
2  0  1  1  0  0
ddff = pd.DataFrame(np.array([[3,2,1], [4,2,3], [3,1,4], [4,1,2], [2,3,1]]))
ddff
   0  1  2
0  3  2  1
1  4  2  3
2  3  1  4
3  4  1  2
4  2  3  1
Now I need to modify row 0 of df based on ddff. Row 0 of ddff contains the values [3, 2, 1], so wherever columns 3, 2 and 1 of df hold 0 in row 0, the value should be set to 1.
Expected output:
   0  1  2  3  4
0  0  1  1  1  1
1  1  0  0  1  0
2  0  1  1  0  0

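This sets every element of row 0 of df to 1 in the columns listed in row 0 of ddff:
df.loc[0, ddff.iloc[0].values] = 1
# Out:
#    0  1  2  3  4
# 0  0  1  1  1  1
# 1  1  0  0  1  0
# 2  0  1  1  0  0
Explanation: ddff.iloc[0].values reads the column names from row 0 of ddff. Selecting the row and the columns in a single .loc call, rather than chaining two indexing steps, makes sure the assignment writes to df itself and not to a temporary copy.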
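The question asks to change a value only where it is currently 0. On a 0/1 frame that is the same as setting the listed columns to 1, but the condition can also be spelled out. The following is a minimal sketch under that reading; the variable names cols and current are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([[0, 1, 0, 0, 1], [1, 0, 0, 1, 0], [0, 1, 1, 0, 0]]))
ddff = pd.DataFrame(np.array([[3, 2, 1], [4, 2, 3], [3, 1, 4], [4, 1, 2], [2, 3, 1]]))

cols = ddff.iloc[0].to_numpy()   # column labels listed in row 0 of ddff: [3, 2, 1]
current = df.loc[0, cols]        # current values of row 0 in those columns
# Replace only the zeros with 1; any non-zero value is left as it is.
df.loc[0, cols] = current.mask(current == 0, 1).to_numpy()
print(df)
```

Column 4 is not listed in row 0 of ddff, so its value in row 0 stays 1.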

Pandas: sort according to a row

I have a DataFrame like this (with labels on rows and columns):
    0  1  2  3
 0  1  1  0  0
 1  0  1  1  0
 2  1  0  1  0
-1  5  6  3  2
I would like to order the columns according to the last row, largest value first, and then drop that row:
   0  1  2  3
0  1  1  0  0
1  1  0  1  0
2  0  1  1  0

Try np.argsort to get the order, then iloc to rearrange columns and drop rows:
df.iloc[:-1, np.argsort(-df.iloc[-1])]
Output:
   1  0  2  3
0  1  1  0  0
1  1  0  1  0
2  0  1  1  0
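An equivalent label-based sketch, using sort_values on the last row instead of np.argsort, may be easier to read. It rebuilds the frame from the question (the row label -1 is taken from there) and assumes descending order, as the expected output suggests.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 1, 0, 0], [0, 1, 1, 0], [1, 0, 1, 0], [5, 6, 3, 2]],
    index=[0, 1, 2, -1],
)

# Column labels ordered by the values of the last row, largest first
order = df.iloc[-1].sort_values(ascending=False, kind="stable").index
# Rearrange the columns, then drop the last row by position
result = df.loc[:, order].iloc[:-1]
print(result)
```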

How do I create a column whose values count the number of 1s in that row that appear for the first time in their own column?

How do I do this operation using pandas?
Initial Df:
   A  B  C  D
0  0  1  0  0
1  0  1  0  0
2  0  0  1  1
3  0  1  0  1
4  1  1  0  0
5  1  1  1  0
Final Df:
   A  B  C  D  Param
0  0  1  0  0      1
1  0  1  0  0      0
2  0  0  1  1      2
3  0  1  0  1      0
4  1  1  0  0      1
5  1  1  1  0      0
Basically, Param is the number of 1s in that row that appear for the first time in their own column.
Example:
index 0: the 1 in column B appears for the first time, hence Param = 1
index 1: none of the 1s appears for the first time in its own column, hence Param = 0
index 2: the 1s in columns C and D appear for the first time in their columns, hence Param = 2
index 3: none of the 1s appears for the first time in its own column, hence Param = 0
index 4: the 1 in column A appears for the first time in its column, hence Param = 1
index 5: none of the 1s appears for the first time in its own column, hence Param = 0

I would use idxmax and value_counts:
df['Param']=df.idxmax().value_counts().reindex(df.index,fill_value=0)
df
   A  B  C  D  Param
0  0  1  0  0      1
1  0  1  0  0      0
2  0  0  1  1      2
3  0  1  0  1      0
4  1  1  0  0      1
5  1  1  1  0      0
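One caveat with the idxmax approach: for a column that contains no 1 at all, idxmax still returns the first row label, so that column would be counted against the first row. The sketch below adds a hypothetical all-zero column E to show the effect, and filters with df.any() as one possible guard.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [0, 0, 0, 0, 1, 1],
    "B": [1, 1, 0, 1, 1, 1],
    "C": [0, 0, 1, 0, 0, 1],
    "D": [0, 0, 1, 1, 0, 0],
    "E": [0, 0, 0, 0, 0, 0],  # hypothetical column without any 1
})

first_rows = df.idxmax()  # row label of the first maximum in each column

# Without a guard, column E is counted for row 0
naive = first_rows.value_counts().reindex(df.index, fill_value=0)

# Keep only the columns that actually contain a non-zero value
guarded = first_rows[df.any()].value_counts().reindex(df.index, fill_value=0)

print(naive.tolist())
print(guarded.tolist())
```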

You can check for duplicated values, multiply with df and sum:
df['Param'] = df.apply(lambda x: ~x.duplicated()).mul(df).sum(1)
Output:
   A  B  C  D  Param
0  0  1  0  0      1
1  0  1  0  0      0
2  0  0  1  1      2
3  0  1  0  1      0
4  1  1  0  0      1
5  1  1  1  0      0

Assuming these are integers, you can use cumsum() twice to isolate the first occurrence of 1.
df2 = (df.cumsum() > 0).cumsum() == 1
df['Param'] = df2.sum(axis = 1)
print(df)
If df elements are strings, you should first convert them to integers.
df = df.astype(int)
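This answer does not show its output, so here is a self-contained run of the double cumsum on the frame from the question. The intermediate name first_one is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [0, 0, 0, 0, 1, 1],
    "B": [1, 1, 0, 1, 1, 1],
    "C": [0, 0, 1, 0, 0, 1],
    "D": [0, 0, 1, 1, 0, 0],
})

# df.cumsum() > 0 is True from the first 1 of each column onwards.
# The second cumsum numbers those True cells 1, 2, 3, ... down each column,
# so comparing with 1 marks exactly the first 1 of every column.
first_one = (df.cumsum() > 0).cumsum() == 1
df["Param"] = first_one.sum(axis=1)
print(df)
```

Unlike idxmax, this never marks a cell in a column that has no 1, because the running sum in such a column stays at 0.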

How to cell values as new columns in pandas dataframe

I have a dataframe like the following:
   Labels
1  Nail_Polish,Nails
2  Nail_Polish,Nails
3  Foot_Care,Targeted_Body_Care
4  Foot_Care,Targeted_Body_Care,Skin_Care
I want to generate the following matrix:
   Nail_Polish  Nails  Foot_Care  Targeted_Body_Care  Skin_Care
1            1      1          0                   0          0
2            1      1          0                   0          0
3            0      0          1                   1          0
4            0      0          1                   1          1
How can I achieve this?

Use str.get_dummies:
df2 = df['Labels'].str.get_dummies(sep=',')
The resulting output:
   Foot_Care  Nail_Polish  Nails  Skin_Care  Targeted_Body_Care
1          0            1      1          0                   0
2          0            1      1          0                   0
3          1            0      0          0                   1
4          1            0      0          1                   1
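str.get_dummies returns the columns in sorted order, whereas the matrix in the question lists them in order of first appearance. If that order matters, one possible sketch is to collect the labels in the order they occur and reindex the columns with them. The frame is rebuilt from the question, and the names order and ordered are only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {"Labels": [
        "Nail_Polish,Nails",
        "Nail_Polish,Nails",
        "Foot_Care,Targeted_Body_Care",
        "Foot_Care,Targeted_Body_Care,Skin_Care",
    ]},
    index=[1, 2, 3, 4],
)

df2 = df["Labels"].str.get_dummies(sep=",")  # columns come back sorted

# Labels in order of first appearance across the rows
order = list(df["Labels"].str.split(",").explode().unique())
ordered = df2[order]
print(ordered)
```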
