I'm trying to delete empty cells with pandas. I want to delete only the empty cells, but I have no idea how to do that.
Example (columns A to J, with empty cells scattered between the values):
row 1: apple, price, 10, quantity, 5
row 2: pineapple, price, 12, condition, good, quantity, 4
What I want:
   A          B      C   D          E     F         G  H  I  J
1  apple      price  10  quantity   5
2  pineapple  price  12  condition  good  quantity  4
I need all the values without the empty cells, so I don't want to delete a whole row or column. I want to delete each empty cell and pull the values behind it forward to fill the gap.
Real data: (Excel screenshot)
I solved it using the approach from this question:
Removing nan from pandas dataframe and reshaping dataframe
Key point: replace Invalid_val with the length of the value strings.
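For readers without the linked answer to hand, here is a minimal sketch of one way to pack each row's values to the left. It is not the linked answer's method. It assumes the empty cells were read in as NaN (as pandas does for blank Excel cells), the positions of the empty cells in the sample frame are invented for illustration, and `shift_left` is a hypothetical helper name.

```python
import numpy as np
import pandas as pd

# Hypothetical layout: the positions of the empty cells are made up for illustration.
df = pd.DataFrame(
    [
        ["apple", np.nan, "price", 10, np.nan, "quantity", np.nan, 5],
        ["pineapple", "price", np.nan, 12, "condition", "good", "quantity", 4],
    ],
    columns=list("ABCDEFGH"),
)


def shift_left(frame):
    """Drop the empty cells of every row and pack the remaining values to the left."""
    # Keep the non-empty values of each row in their original order.
    rows = [row.dropna().tolist() for _, row in frame.iterrows()]
    # pandas pads the shorter rows with missing values on the right.
    packed = pd.DataFrame(rows, index=frame.index)
    packed.columns = frame.columns[: packed.shape[1]]
    # Restore the original column set so the frame keeps its width.
    return packed.reindex(columns=frame.columns)


result = shift_left(df)
print(result)
```

Because every row is rebuilt independently, no whole row or column is dropped; only the gaps inside each row disappear.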
Context: I'd like to "bump" the index level of a multi-index dataframe up. In other words, I'd like the index name to sit at the same level as the column names of a multi-indexed dataframe.
Let's say we have this dataframe:
tt = pd.DataFrame({'A':[1,2,3],'B':[4,5,6],'C':[7,8,9]})
tt.index.name = 'Index Column'
And we perform this change to add a multi-index level (like a label of a table)
tt = pd.concat([tt],keys=['Multi-Index Table Label'], axis=1)
Which results in this:
Multi-Index Table Label
              A  B  C
Index Column
0             1  4  7
1             2  5  8
2             3  6  9
Desired Output: How can I make it so that the dataframe looks like this instead (notice the removal of the empty level on the dataframe/table):
Multi-Index Table Label
Index Column  A  B  C
0             1  4  7
1             2  5  8
2             3  6  9
Attempts: I was testing something out and you can essentially remove the index level by doing this:
tt.index.name = None
Which would result in :
Multi-Index Table Label
   A  B  C
0  1  4  7
1  2  5  8
2  3  6  9
This essentially removes that extra level/empty line, but I do want to keep the Index Column name, because it tells the reader what kind of data the index holds (in this example just 0, 1, 2, but it could be years, dates, etc.).
How could I do that?
Thank you all in advance :)
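How about this:
import pandas as pd

tt = pd.DataFrame({'A':[1,2,3],'B':[4,5,6],'C':[7,8,9]})
# Copy the index into a regular column so it sits alongside A, B and C
tt.insert(loc=0, column='Index Column', value=tt.index)
tt = pd.concat([tt],keys=['Multi-Index Table Label'], axis=1)
# Hide the now-redundant index when rendering (Styler.hide_index() was removed in pandas 2.0)
tt = tt.style.hide(axis='index')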
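If a Styler object is not wanted, a similar layout can be sketched with plain DataFrame methods. This is a variant, not the answer's method: it assumes the index is already named 'Index Column', uses `reset_index()` to turn it into a regular column, and uses `to_string(index=False)` for plain-text output. The variable name `flat` is only illustrative.

```python
import pandas as pd

tt = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]})
tt.index.name = "Index Column"

# Move the named index into a regular column so it lines up with A, B and C.
flat = tt.reset_index()

# Add the table label as the top level of the column MultiIndex.
flat = pd.concat([flat], keys=["Multi-Index Table Label"], axis=1)

# Render without the default 0..n-1 index that reset_index() left behind.
text = flat.to_string(index=False)
print(text)
```

The result stays an ordinary DataFrame, so it can still be filtered or exported, at the cost of the index no longer being an index.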
I want to copy the information (quantity) of one dataframe's column to the other dataframe's quantity column but do so matching the SKU column.
So for example the dataframes look like:
Dataframe 1:
SKU Quantity Title
A 3 Scissors
B 4 Cable
C 5 Goat
D 6 Cheese
Dataframe 2:
SKU Quantity Title
A 1 Blue Scissors
B 2 Red Cables
C 1 Fat Goat
D 2 Smelly Cheese
So I would like to get Dataframe 1's quantities and place them into Dataframe 2, but matching the SKUs (A, B, C, D etc) even though some other columns (such as Title) might have different information.
You can set SKU as the index on both dataframes so that they align on it, then copy the Quantity column across; the assignment matches rows by index label. Finally, reset the index to restore SKU as a regular column.
df1a = df1.set_index('SKU')
df2a = df2.set_index('SKU')
df2a['Quantity'] = df1a['Quantity']
df2 = df2a.reset_index()
Result:
print(df2)
SKU Quantity Title
0 A 3 Blue Scissors
1 B 4 Red Cables
2 C 5 Fat Goat
3 D 6 Smelly Cheese
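You could also try this, provided both dataframes have the same length and row order, because the comparison is done row by row:
import numpy as np
# np.where needs both a "match" value and a fallback; keep df2's own quantity where the SKUs differ
df2['Quantity'] = np.where(df1['SKU'] == df2['SKU'], df1['Quantity'], df2['Quantity'])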
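When the two dataframes are not in the same row order, a lookup keyed on SKU avoids relying on position. The sketch below is an alternative to the answers above, assuming each SKU appears only once in df1. The rows of df2 are deliberately reversed here to show that order does not matter, and `lookup` is just an illustrative name.

```python
import pandas as pd

df1 = pd.DataFrame({
    "SKU": ["A", "B", "C", "D"],
    "Quantity": [3, 4, 5, 6],
    "Title": ["Scissors", "Cable", "Goat", "Cheese"],
})
# Same data as Dataframe 2 in the question, but in reverse row order.
df2 = pd.DataFrame({
    "SKU": ["D", "C", "B", "A"],
    "Quantity": [2, 1, 2, 1],
    "Title": ["Smelly Cheese", "Fat Goat", "Red Cables", "Blue Scissors"],
})

# Series mapping each SKU to its quantity in df1 (assumes SKUs are unique in df1).
lookup = df1.set_index("SKU")["Quantity"]

# Look up every SKU of df2; keep df2's own quantity where df1 has no such SKU.
df2["Quantity"] = df2["SKU"].map(lookup).fillna(df2["Quantity"]).astype(int)
print(df2)
```

The Title column of df2 is left untouched, since only Quantity is assigned.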
I have a dataframe with many rows like this:
ID  Variable
1   A1_1 - Red
2   A1_2 - Blue
3   A1_3 - Yellow
I'm trying to iterate over all rows so that all the 2nd column's values change to just "A1". The code I've come up with is:
for row in df.iterrows():
    current_response_id=row[1][0]
    columncount=0
    for columncount in range(2):
        variable=row[1][1];
        row[1][1]=variable.split("_")[0].split(" -")[0]
        variable=row[1][1];
However, this isn't achieving the desired result. How could I go about this?
Try:
df["Variable"] = df["Variable"].str.split("_").str[0]
print(df)
Prints:
ID Variable
0 1 A1
1 2 A1
2 3 A1
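As for why the loop in the question has no effect: iterrows() yields a new Series for each row, and with mixed column types that Series is a copy, so assigning into it does not write back to the dataframe. The sketch below rebuilds the sample data and shows the vectorised form next to a loop that writes through `df.at`; the name `looped` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [1, 2, 3],
    "Variable": ["A1_1 - Red", "A1_2 - Blue", "A1_3 - Yellow"],
})

# Loop version: write through df.at so the change lands in the dataframe itself.
looped = df.copy()
for idx, value in looped["Variable"].items():
    looped.at[idx, "Variable"] = value.split("_", 1)[0]

# Vectorised version: keep the text before the first underscore.
df["Variable"] = df["Variable"].str.split("_", n=1).str[0]
print(df)
```

Both versions give the same result; the vectorised one is shorter and avoids a Python-level loop.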
I have the following dataframe in Pandas
OfferPreference_A OfferPreference_B OfferPreference_C
A B A
B C C
C S G
I have the following dictionary covering the unique values in all the columns:
dict1 = {'A': 1, 'B': 2, 'C': 3, 'S': 4, 'G': 5, 'D': 6}
I also have a list of the column names:
columnlist=['OfferPreference_A', 'OfferPreference_B', 'OfferPreference_C']
I am trying to get the following table as the output:
OfferPreference_A OfferPreference_B OfferPreference_C
1 2 1
2 3 3
3 4 5
How do I do this?
Use:
# values without a match become missing (None/NaN)
# (on pandas 2.1+, DataFrame.map replaces the deprecated applymap)
df = df[columnlist].applymap(dict1.get)
Or:
# values without a match keep their original value
df = df[columnlist].replace(dict1)
Or:
# values without a match become NaN
df = df[columnlist].stack().map(dict1).unstack()
print (df)
OfferPreference_A OfferPreference_B OfferPreference_C
0 1 2 1
1 2 3 3
2 3 4 5
You can use map for this, as shown below, assuming every value has a match in the dictionary:
for col in columnlist:
    df[col] = df[col].map(dict1)
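A self-contained version of the per-column mapping, using the data from the question, might look like the sketch below. It also shows what happens to a value that has no dictionary entry; the value 'Z' and the names `mapped` and `unmatched` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "OfferPreference_A": ["A", "B", "C"],
    "OfferPreference_B": ["B", "C", "S"],
    "OfferPreference_C": ["A", "C", "G"],
})
dict1 = {"A": 1, "B": 2, "C": 3, "S": 4, "G": 5, "D": 6}
columnlist = ["OfferPreference_A", "OfferPreference_B", "OfferPreference_C"]

# Apply Series.map to each listed column in turn.
mapped = df[columnlist].apply(lambda col: col.map(dict1))
print(mapped)

# A value missing from the dictionary ('Z' is a made-up example) becomes NaN.
unmatched = pd.Series(["A", "Z"]).map(dict1)
print(unmatched)
```

If unmatched values should be kept rather than turned into NaN, the replace variant from the other answer is the one to reach for.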
I have a Dataframe df with 3 columns. A,B and C
A B C
2 4 4
5 2 5
6 9 5
My goal is to use itertools.combinations to find all non-repeating column pairs, and to put the first column of each pair in one DataFrame and the second in another. The pairs here would be A:B, A:C, B:C.
So the first dataframe, df1, would hold the first column of each pair:
A A B
2 2 4
5 5 2
6 6 9
and the second, df2, the second column of each pair:
B C C
4 4 4
2 5 5
9 5 5
I'm trying to do something with itertools like:
for cola, colb in itertools.combinations(df, 2):
    df1[cola]=cola
    df2[colb]=colb
I know that makes no sense. I could convert each column to a list, run itertools over the list of lists, append the results to two lists A and B, and then turn those back into DataFrames, but then I'm missing the headers. I tried adding the headers to the lists, but when I rebuild the DataFrame the indexing seems off and I can't fix it. So I'm trying to see whether there is a way to run itertools over entire columns, headers included.
Use zip to group the columns that belong in each DataFrame, then use pandas.concat to build the new DataFrames:
from itertools import combinations
import pandas as pd

# The first tuple holds the first column of every pair, the second tuple the second column
df1_cols, df2_cols = zip(*combinations(df.columns,2))
df1 = pd.concat([df[col] for col in df1_cols],axis=1)
df2 = pd.concat([df[col] for col in df2_cols],axis=1)
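To check the result against the question's data, here is a runnable sketch. It is a small variant of the answer: it selects the columns with a list of labels instead of pandas.concat, which also keeps the repeated headers. The names `first_cols` and `second_cols` are only illustrative.

```python
from itertools import combinations

import pandas as pd

df = pd.DataFrame({"A": [2, 5, 6], "B": [4, 2, 9], "C": [4, 5, 5]})

# All unordered column pairs: (A, B), (A, C), (B, C).
pairs = list(combinations(df.columns, 2))

# Split the pairs into "first members" and "second members".
first_cols, second_cols = zip(*pairs)

# Selecting with a list of labels keeps headers and allows repeated columns.
df1 = df[list(first_cols)]
df2 = df[list(second_cols)]

print(df1)
print(df2)
```

Both outputs have repeated column labels (A twice in df1, C twice in df2), so later lookups such as df1['A'] return a DataFrame rather than a single Series.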