This question already has answers here:
How do I select rows from a DataFrame based on column values?
(16 answers)
Closed 4 years ago.
I have what I believe is a simple question but I can't find what I'm looking for in the docs.
I have a dataframe with a Categorical column called mycol with categories a and b, and I would like to mask a subset of the dataframe as follows:
df_a = df[df.mycol.equal('a')]
Currently I am doing:
df_a = df[df.mycol.cat.codes.values==df.mycol.cat.categories.to_list().index('a')]
which is obviously extremely verbose and inelegant. Since df.mycol has both the codes and the coded labels, it has all the information to perform this operation, so I'm wondering the best way to go about this...
df_a = df[df["mycol"]=='a']
I believe this should work, unless by 'mask' you mean that you actually want to zero out the values that are not 'a'.
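As a minimal sketch of the comparison suggested above: the frame below is made up for illustration (only the column name mycol and the categories a and b come from the question), and it assumes that comparing a Categorical column to a scalar with == returns an ordinary boolean mask.

```python
import pandas as pd

# Hypothetical frame; only "mycol" and its categories come from the question.
df = pd.DataFrame({
    "mycol": pd.Categorical(["a", "b", "a", "b"], categories=["a", "b"]),
    "value": [1, 2, 3, 4],
})

# Comparing the Categorical column to a scalar yields a boolean Series.
mask = df["mycol"] == "a"

# Boolean indexing keeps only the rows where the mask is True.
df_a = df[mask]
print(df_a)
```

There is no need to go through cat.codes here; the equality comparison works on the labels directly.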
This question already has answers here:
How to drop a list of rows from Pandas dataframe?
(15 answers)
Delete a column from a Pandas DataFrame
(20 answers)
Closed 6 months ago.
I am new to both pandas and python in general.
I have a dataset that I have transposed(T) and I want to use the same transposed format to drop some rows and columns.
I am able to transpose in a different window but when I try to drop some rows, it returns untransposed results.
I am looking for something like this(to combine transpose & drop)
datafraw.describe().T, drop(labels =['rowName', index = 1]
When I run the two separately, the transposition seems to be overshadowed by the drop command. (Screenshots: transposed table; combined drop and transposed table.)
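One possible way to combine the two steps, sketched under assumptions: the data and the column names (height, weight, age) are hypothetical, and the idea is simply that drop returns a new frame, so it can be chained onto the transposed result instead of being called on the original.

```python
import pandas as pd

# Hypothetical data; the column names are placeholders.
df = pd.DataFrame({
    "height": [1.0, 2.0, 3.0],
    "weight": [4.0, 5.0, 6.0],
    "age": [7.0, 8.0, 9.0],
})

# After .T the original columns become rows and the describe()
# statistics (count, mean, std, ...) become columns.
summary = df.describe().T

# drop does not modify in place; chain it onto the transposed frame.
# index= names rows to remove, columns= names columns to remove.
trimmed = df.describe().T.drop(index=["age"], columns=["count", "25%", "75%"])
print(trimmed)
```

If the drop is applied to a separate, untransposed object, the result will of course come back untransposed, which may be what happened in the question.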
This question already has answers here:
Selecting non-adjacent columns by column number pandas [duplicate]
(1 answer)
Selecting a range of columns in a dataframe
(5 answers)
Closed 1 year ago.
How to select multiple columns in Python using .iloc function?
Let's say I have a data frame with X rows and 100 columns and I would like to select the first 50 columns, then 75 to 80, and then columns 90 and 95.
So far I have read about two ways of selecting in Python, single columns df = df1.iloc[:,[1,2,3]] and a range df = df1.iloc[:,1:30], but is there any way to combine them in a more complex selection?
I.e., in my example I would expect code like this:
df = df1.iloc[:,[1:50,75:80,90,95]]
But it does not work. I also tried different syntax (using brackets etc.) but cannot find the correct solution.
I believe you should try using np.r_ (with numpy imported as np). In this case, please try:
df1.iloc[:, np.r_[1:50, 75:80, 90, 95]]
This should let you select multiple groups of columns in a single call.
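A small self-contained sketch of the np.r_ approach, using a made-up 2 x 100 frame. Note that positions are zero-based and slice ends are exclusive, so this sketch assumes "the first 50 columns" means 0:50 rather than the 1:50 used above; adjust the bounds to whatever you actually need.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with 2 rows and 100 columns labelled 0..99.
df1 = pd.DataFrame(np.arange(200).reshape(2, 100))

# np.r_ concatenates slices and scalars into one flat integer array.
# 0:50 covers positions 0..49, 75:80 covers positions 75..79.
cols = np.r_[0:50, 75:80, 90, 95]

# Pass the combined positions to iloc as the column indexer.
df = df1.iloc[:, cols]
print(df.shape)
```

If "75 to 80" is meant to include column 80, the slice would need to be 75:81.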
This question already has answers here:
How to pivot a dataframe in Pandas? [duplicate]
(2 answers)
Closed 1 year ago.
Hi there, I have a data set that looks like df1 below and I want to make it look like df2 using pandas. I have tried to use pivot and transpose but can't wrap my head around how to do it. I'd appreciate any help!
This should do the job
df.pivot_table(index=["AssetID"], columns='MeterName', values='MeterValue')
index: Identifier
columns: row values that will become columns
values: values to put in those columns
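To illustrate the three arguments, here is a minimal sketch. The column names come from the call above, but the rows are invented, since the original df1 is not shown.

```python
import pandas as pd

# Hypothetical long-format data; only the column names come from the answer.
df = pd.DataFrame({
    "AssetID": [1, 1, 2, 2],
    "MeterName": ["Voltage", "Current", "Voltage", "Current"],
    "MeterValue": [230.0, 5.0, 231.0, 6.0],
})

# One row per AssetID, one column per distinct MeterName,
# cells filled with the matching MeterValue.
wide = df.pivot_table(index=["AssetID"], columns="MeterName", values="MeterValue")
print(wide)
```

Keep in mind that pivot_table aggregates with the mean by default, so if an AssetID/MeterName pair occurs more than once the values are averaged rather than kept separately.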
I often have the same trouble:
https://towardsdatascience.com/reshape-pandas-dataframe-with-pivot-table-in-python-tutorial-and-visualization-2248c2012a31
This could help next time.
This question already has answers here:
How to unnest (explode) a column in a pandas DataFrame, into multiple rows
(16 answers)
Closed 1 year ago.
I have a dataframe like this.
Is there any way to convert it to this?
I tried to iterate over the data frame, then over the grades array, and add each item to a new data frame, but that doesn't seem like the most efficient or easiest way. Is there a built-in method or a better way to approach this problem?
PS: I searched for similar questions but couldn't find one. If a question like this already exists, I am very sorry and I will delete this one immediately.
What you want is pandas.DataFrame.explode().
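import ast
import pandas as pd

# Make sure the Grades column holds real lists first
# (skip this step if it already does).
df['Grades'] = df['Grades'].apply(ast.literal_eval)
# or, as an alternative to the line above:
# df['Grades'] = df['Grades'].apply(pd.eval)

# One row per list element, then rename the column to the singular.
df = df.explode('Grades')
df = df.rename(columns={'Grades': 'Grade'})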
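A runnable sketch of the same idea on invented data. The Name column and the values are hypothetical, and it assumes the grades are stored as strings that look like lists, which is why ast.literal_eval is applied first.

```python
import ast
import pandas as pd

# Hypothetical data: grades stored as list-like strings.
df = pd.DataFrame({
    "Name": ["Ann", "Bob"],
    "Grades": ["[90, 85]", "[70]"],
})

# Parse the strings into real Python lists.
df["Grades"] = df["Grades"].apply(ast.literal_eval)

# explode repeats the other columns once per list element;
# reset_index gives the result a fresh 0..n-1 index.
long = (
    df.explode("Grades")
      .rename(columns={"Grades": "Grade"})
      .reset_index(drop=True)
)
print(long)
```

Without reset_index, the exploded rows keep the index of the row they came from, so index values repeat.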
This question already has answers here:
How to replace NaNs by preceding or next values in pandas DataFrame?
(10 answers)
Closed 3 years ago.
Table 1 represents the format of my raw data. The dataset was prepared in such a way that the name of variable 1 is only mentioned for the first observation. I am exploring the dataset and would like to report the count of certain features grouped by the first variable. To achieve this, I would have to transform my data into the second table (Output).
How can I achieve this with pandas?
The solution can be found in the pandas documentation under Upsampling. The method used is called ffill() and is used as such:
df.ffill()
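A minimal sketch of forward-filling followed by the grouped count the question asks about. The column names and values are hypothetical, and it assumes the blank cells are real missing values (NaN), not empty strings.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the group name appears only on the first row of each group.
df = pd.DataFrame({
    "variable1": ["x", np.nan, np.nan, "y", np.nan],
    "feature": [1, 2, 3, 4, 5],
})

# Propagate the last seen name downward into the missing cells.
df["variable1"] = df["variable1"].ffill()

# With the names filled in, grouping and counting works as usual.
counts = df.groupby("variable1")["feature"].count()
print(counts)
```

If the blanks were read in as empty strings, they would need to be converted to NaN first (for example with replace), since ffill only fills missing values.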