This question already has answers here:
How to add an empty column to a dataframe?
(15 answers)
Closed 2 years ago.
I have a dataframe:
a b
1 dasd
2 fsfr12341
3 %%$dasd11
4 &^hkyo1
I need to remove all the values in column b and make it a blank column:
a b
1
2
3
4
Kindly help me with this. Thanks a lot.
Try changing the b column to empty strings '', like this:
df['b'] = ''
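A minimal end-to-end sketch of that assignment, using the sample data from the question:

```python
import pandas as pd

# Sample frame mirroring the question's data
df = pd.DataFrame({'a': [1, 2, 3, 4],
                   'b': ['dasd', 'fsfr12341', '%%$dasd11', '&^hkyo1']})

# Overwrite every value in column b with an empty string;
# the column itself stays in place, just blanked out
df['b'] = ''
```

If you would rather have missing values than empty strings (so that `dropna`/`isna` treat the column as empty), assign `np.nan` instead of `''`.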
This question already has answers here:
Pandas dataframe column value case insensitive replace where <condition>
(2 answers)
Closed 1 year ago.
I have a column in the data frame that should only contain values present in a defined list.
E.g.: given a list l1 = [1, 2, 5, 6], I need to replace every value in the column with 0 if it is not present in the list:
column    Expected column
1         1
5         5
2         2
3         0
4         0
3         0
6         6
I have tried using loc:
df.loc[~l1, 0, df.column]
but this raises a TypeError. What is an efficient way in pandas to replace the values?
df['Expected column'] = df['column']
df.loc[~df['column'].isin(l1), 'Expected column'] = 0
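The same idea in a runnable sketch, using `Series.where` as a one-line alternative to the `loc` assignment (sample data taken from the question):

```python
import pandas as pd

l1 = [1, 2, 5, 6]
df = pd.DataFrame({'column': [1, 5, 2, 3, 4, 3, 6]})

# Keep values that appear in l1; replace everything else with 0
df['Expected column'] = df['column'].where(df['column'].isin(l1), 0)
```

`where` keeps each value where the condition is True and substitutes the second argument elsewhere, which avoids having to pre-copy the column before a masked assignment.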
This question already has answers here:
Drop rows containing empty cells from a pandas DataFrame
(8 answers)
Closed 2 years ago.
I have a dataframe df with values as below:
Common_words count
0 realdonaldtrump 2932
2 new 2347
3 2030
4 trump 2013
5 good 1553
6 1440
7 great 200
I only need the rows where there is actual text. For example, rows which have a blank value, like row 3 and row 6, need to be removed.
Tried:
df = df.dropna(how='any',axis=0)
but I still get the same result. I suspect these are not null values but spaces, so I also tried the below:
df.Common_words = df.Common_words.str.replace(' ', '')
But still the same result: rows 3 and 6 are not removed. What should I do?
You can try (assigning the results back, with numpy imported as np):
df = df.replace(r'^\s+$', np.nan, regex=True)
df = df.dropna()
You can also do it on just the one column (anchoring the pattern so only whitespace-only cells match, and again assigning the dropna result back):
df.Common_words = df.Common_words.replace(r"^\s+$", np.nan, regex=True)
df = df.dropna()
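A self-contained sketch of the whitespace-to-NaN-to-dropna approach, with sample data approximating the question's frame:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Common_words': ['realdonaldtrump', 'new', ' ', 'trump', 'good', ' ', 'great'],
    'count': [2932, 2347, 2030, 2013, 1553, 1440, 200],
})

# Cells containing only whitespace become NaN, then dropna removes those rows
df = df.replace(r'^\s*$', np.nan, regex=True).dropna()
```

The anchored pattern `^\s*$` matches cells that are empty or whitespace-only, so words containing internal spaces are left alone.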
This question already has answers here:
Pandas Merging 101
(8 answers)
How to filter Pandas dataframe using 'in' and 'not in' like in SQL
(11 answers)
Closed 2 years ago.
I have two Pandas DataFrames, each with one column (ID).
The first one looks like this:
ID
1
2
3
4
5
and the second one looks like this:
ID
3
4
5
6
7
I want to make a new DataFrame by combining those two DataFrames, keeping only the values that exist in both.
This is the result that I want:
ID
3
4
5
Can you show me how to do this in the most efficient way with pandas? Thank you.
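One way to get only the shared IDs is an inner join; a sketch, assuming the two frames are named df1 and df2:

```python
import pandas as pd

df1 = pd.DataFrame({'ID': [1, 2, 3, 4, 5]})
df2 = pd.DataFrame({'ID': [3, 4, 5, 6, 7]})

# An inner merge keeps only rows whose ID exists in both frames
result = df1.merge(df2, on='ID', how='inner')
```

An equivalent filter-based approach is `df1[df1['ID'].isin(df2['ID'])]`, which keeps df1's rows whose ID appears in df2.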
This question already has answers here:
How to extract first 8 characters from a string in pandas
(4 answers)
Closed 3 years ago.
I want to remove some characters from the 'name' column, keeping only the first two characters of each value and removing the rest.
So this is my data:
name id
0 ABC-G 3
1 ERT-R 4
2 IGF 2
The result should be:
name id
0 AB 3
1 ER 4
2 IG 2
You can use str.slice(..) on the column, like:
df['name'] = df['name'].str.slice(stop=2)
Or if you use some sort of filtering:
df.loc[some_filter, 'name'] = df[some_filter]['name'].str.slice(stop=2)
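A runnable sketch of the unfiltered case, using the sample data from the question:

```python
import pandas as pd

df = pd.DataFrame({'name': ['ABC-G', 'ERT-R', 'IGF'],
                   'id': [3, 4, 2]})

# Keep only the first two characters of each name
df['name'] = df['name'].str.slice(stop=2)
# Equivalent shorthand: df['name'] = df['name'].str[:2]
```

`str.slice` (and the `.str[...]` indexer) operate element-wise, so values shorter than the slice are returned as-is rather than raising an error.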
This question already has answers here:
Pandas DENSE RANK
(4 answers)
pandas group by and assign a group id then ungroup
(3 answers)
Closed 5 years ago.
I have a pandas dataframe with a column, call it range_id, that looks something like this:
range_id
1
1
2
2
5
5
5
8
8
10
10
...
I want to maintain the number buckets (rows that share a value should still share a value), but make the numbers ascend uniformly. So the new column would look like this:
range_id
1
1
2
2
3
3
3
4
4
5
5
...
I could write a lambda function that maps these values to achieve the desired output, but I was wondering whether pandas has any built-in functionality for this, as it has always surprised me before with what it is capable of doing. Thanks for the help!
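pandas does have built-ins for this; a sketch with the sample values from the question, using a dense rank:

```python
import pandas as pd

df = pd.DataFrame({'range_id': [1, 1, 2, 2, 5, 5, 5, 8, 8, 10, 10]})

# Dense rank assigns consecutive integers (1, 2, 3, ...) to each
# distinct value, preserving the original ordering of the buckets
df['range_id'] = df['range_id'].rank(method='dense').astype(int)
```

If the buckets are already in sorted order, `pd.factorize(df['range_id'])[0] + 1` gives the same labels and can be faster, since factorize numbers values by order of first appearance.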