I have this dataframe:
I want to create a ZIP column which will get the value of ZIP_y when ZIP_x is NaN and the value of ZIP_x when ZIP_x is not NaN.
I tried this code:
dm["ZIP"]=numpy.where(dm["ZIP_x"] is numpy.nan, dm["ZIP_y"],dm["ZIP_x"])
But that gave me this output:
As you can see, the ZIP column seems to be getting the values of ZIP_x in each of its cells.
Do you know how to achieve what I am after?
You want this:
dm["ZIP"]=numpy.where(dm["ZIP_x"].isnull(), dm["ZIP_y"],dm["ZIP_x"])
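You can't use is (or ==, for that matter) to test for NaN. The expression dm["ZIP_x"] is numpy.nan compares the whole Series object with numpy.nan and evaluates to a single False, so numpy.where picks ZIP_x for every row.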
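To make the difference concrete, here is a minimal sketch. The sample values in dm are hypothetical stand-ins for the merged frame in the question, and the fillna line is an alternative spelling that the answer itself does not mention.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the asker's merged frame `dm`.
dm = pd.DataFrame({
    "ZIP_x": ["10001", np.nan, "60601"],
    "ZIP_y": ["99999", "94105", np.nan],
})

# `is` compares the Series object itself with nan, giving one plain False,
# not one boolean per row.
single_flag = dm["ZIP_x"] is np.nan

# isnull() returns one boolean per row, which is what np.where needs.
dm["ZIP"] = np.where(dm["ZIP_x"].isnull(), dm["ZIP_y"], dm["ZIP_x"])

# A pandas-only alternative: fill the gaps in ZIP_x from ZIP_y.
dm["ZIP_alt"] = dm["ZIP_x"].fillna(dm["ZIP_y"])

print(dm)
```

Both spellings should give the same column here; np.where is the one used in the answer above.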
I'm facing a strange issue in which I'm trying to replace all NaN values in a dataframe with values taken from another one (same length) that has the relevant values.
Here's a glimpse for the "target dataframe" in which I want to replace the values:
data_with_null
Here's the dataframe where I want to take data from: predicted_paticipant_groups
I've tried:
data_with_null.participant_groups.fillna(predicted_paticipant_groups.participant_groups, inplace=True)
but it just fills all NaN values with the first one (Infra).
Is it because the index values of data_with_null are all zeros?
Reset the index and try again.
data_with_null.reset_index(drop=True, inplace=True)
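The reason this helps is that fillna with a Series matches values by index label, not by position. A small sketch of the effect follows, assuming (as the question suggests) that every row of the target has index label 0. The sample values other than Infra are made up, and the second frame is named predicted here for brevity.

```python
import numpy as np
import pandas as pd

# Hypothetical data: every row of the target carries index label 0.
data_with_null = pd.DataFrame(
    {"participant_groups": [np.nan, "Dev", np.nan]}, index=[0, 0, 0]
)
predicted = pd.DataFrame({"participant_groups": ["Infra", "QA", "Ops"]})

# fillna aligns on index labels, so every NaN at label 0 receives
# predicted's value at label 0.
misaligned = data_with_null["participant_groups"].fillna(
    predicted["participant_groups"]
)

# With a fresh 0..n-1 index, the labels line up row by row.
fixed = data_with_null.reset_index(drop=True)
fixed["participant_groups"] = fixed["participant_groups"].fillna(
    predicted["participant_groups"]
)

print(misaligned.tolist())
print(fixed["participant_groups"].tolist())
```

Assigning the result back to the column, rather than calling fillna with inplace=True on the attribute, is a choice made for this sketch; it avoids relying on in-place modification of a selected column.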
I have a dataframe which looks as follows:
I want to multiply elements in a row except for the "depreciation_rate" column with the value in the same row in the "depreciation_rate" column.
I tried df2.iloc[:,6:26]*df2["depreciation_rate"] as well as df2.iloc[:,6:26].mul(df2["depreciation_rate"])
I get the same results with both which look as follows. I get NaN values with additional columns which I don't want. I think the elements in rows also multiply with values in other rows in the "depreciation_rate" column. What would be a good way to solve this issue?
Try using mul() along axis=0:
df2.iloc[:,6:26].mul(df2["depreciation_rate"], axis=0)
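A small sketch of why axis=0 matters, using a made-up three-column frame instead of the 20 columns in the question. The column names asset_a and asset_b are hypothetical.

```python
import pandas as pd

# Hypothetical frame; only "depreciation_rate" comes from the question.
df2 = pd.DataFrame({
    "asset_a": [100.0, 200.0],
    "asset_b": [10.0, 20.0],
    "depreciation_rate": [0.5, 0.25],
})
values = df2.drop(columns="depreciation_rate")

# Default alignment matches the Series index (0, 1) against the column
# labels, which never match, so the result is all NaN with extra columns.
wrong = values * df2["depreciation_rate"]

# axis=0 matches the Series index against the row index instead, so each
# row is scaled by its own rate.
right = values.mul(df2["depreciation_rate"], axis=0)

print(right)
```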
I want to change the NaN values in a specific column in a list of DataFrames. I have tried the methods below, but I am unable to change the NaN values to zero. Is there any way to replace them with zero?
Data is the list of DataFrame and qobs is the specific column in each DataFrame
for value in data:
    value['qobs'] = value['qobs'].replace(np.nan, 0)
for value in data:
    value['qobs'] = value['qobs'].fillna(0)
If data is a single DataFrame rather than a list, you can change the column like this:
data['qobs'] = data['qobs'].fillna(0)
print(data)
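For the case described in the question, where data is a list of DataFrames, a minimal sketch of the loop is shown below. The two small frames are hypothetical; only the column name qobs comes from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical list of frames, each with a "qobs" column containing NaN.
data = [
    pd.DataFrame({"qobs": [1.0, np.nan]}),
    pd.DataFrame({"qobs": [np.nan, 4.0]}),
]

# Assigning back to the column modifies each frame held in the list.
for frame in data:
    frame["qobs"] = frame["qobs"].fillna(0)

for frame in data:
    print(frame)
```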
I have a pandas DataFrame, df, and I'd like to get the mean for columns 180 through the end (not including the last column), only using the first 100K rows.
If I use the whole DataFrame:
df.mean().isnull().any()
I get False
If I use only the first 100K rows:
train_means = df.iloc[:100000, 180:-1].mean()
train_means.isnull().any()
I get: True
I'm not sure how this is possible, since the second approach is only getting the column means for a subset of the full DataFrame. So if no column in the full DataFrame has a mean of NaN, I don't see how a column in a subset of the full DataFrame can.
For what it's worth, I ran:
df.columns[df.isna().all()].tolist()
and I get: []. So I don't think I have any columns where every entry is NaN (which would cause a NaN in my train_means calculation).
Any idea what I'm doing incorrectly?
Thanks!
Try looking at:
(df.iloc[:100000, 180:-1].isnull().sum()==100000).any()
If this returns True, at least one of those columns is entirely NaN in the first 100000 rows.
As for why the means over the whole dataframe are all non-null: mean() has skipna=True by default, so it drops NaN values before averaging. A column only gets a NaN mean when it has no non-NaN values left, which can happen within the first 100000 rows even though later rows contain values.
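The effect can be reproduced on a tiny frame. This is a sketch with made-up data, where the first two rows stand in for the first 100K rows of the question.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: column "b" is NaN only in the first two rows.
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [np.nan, np.nan, 5.0, 7.0],
})

# skipna=True (the default) ignores the NaNs, so "b" averages 5.0 and 7.0.
full_means = df.mean()

# In the first two rows "b" has nothing left to average, so its mean is NaN.
head_means = df.iloc[:2].mean()

# The check from the answer, written with all(): which columns are
# entirely NaN inside the slice?
all_nan_in_head = df.iloc[:2].isnull().all()

print(full_means.isnull().any(), head_means.isnull().any())
```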
I have a dataframe called firstpart.
I'm trying to update the values in one column (Key), but only for rows in which another column (Zone) has no data.
I'm using this code, which doesn't work:
firstpart.ix[firstpart.Zone ==np.nan,"Key"] = "newvalue"
Neither does this:
firstpart.ix[firstpart.Zone =="","Key"] = "newvalue"
Using this syntax I'm able to update values in rows for which Zone has another value, but for some reason not if I try to select the rows in which it is blank.
What am I doing wrong?
firstpart.loc[firstpart.Zone.isnull(), "Key"] = "newvalue"
You can't equate NaN to anything.
In [1]: NaN == NaN
Out[1]: False
You need special methods for that, and this is what .isnull() is about.
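Here is a minimal sketch of the comparison failing and the isnull() mask working. The sample rows are hypothetical, and it uses .loc with an explicit column because .ix was removed in pandas 1.0.

```python
import numpy as np
import pandas as pd

# Hypothetical data; only the column names Zone and Key come from the question.
firstpart = pd.DataFrame({
    "Zone": ["north", np.nan, "south", np.nan],
    "Key": ["a", "b", "c", "d"],
})

# Comparing with NaN is False for every row, so this mask selects nothing.
eq_mask = firstpart["Zone"] == np.nan

# isnull() flags the missing rows; naming the "Key" column limits the
# assignment to that column instead of overwriting whole rows.
firstpart.loc[firstpart["Zone"].isnull(), "Key"] = "newvalue"

print(firstpart)
```

If blank Zone cells are empty strings rather than NaN, the mask would need to test for "" as well; that depends on how the data was loaded.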