Selecting columns - python [duplicate]

This question already has answers here:
Selecting non-adjacent columns by column number pandas [duplicate]
(1 answer)
Selecting a range of columns in a dataframe
(5 answers)
Closed 1 year ago.
How do I select multiple columns in Python using the .iloc function?
Let's say I have a data frame with X rows and 100 columns, and I would like to select the first 50 columns, then columns 75 to 80, and then columns 90 and 95.
So far I have read about two ways of selecting in Python: single columns, df = df1.iloc[:,[1,2,3]], and ranges, df = df1.iloc[:,1:30]. But is there any way to combine them into a more complex selection?
I.e., in my example I would expect code like this:
df = df1.iloc[:,[1:50,75:80,90,95]]
But it does not work. I have also tried different syntax (brackets etc.) but cannot find the correct solution.

I believe you should try using np.r_. In this case, please try:
df1.iloc[:, np.r_[1:50, 75:80, 90, 95]]
This should allow you to select multiple groups of columns at once.
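For illustration, here is a minimal runnable sketch with a made-up 100-column frame (the data is hypothetical; only the indexing pattern matters):
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.arange(300).reshape(3, 100))

# np.r_ concatenates slices and scalars into a single index array,
# e.g. np.r_[1:4, 10] -> array([ 1,  2,  3, 10])
cols = np.r_[1:50, 75:80, 90, 95]
df = df1.iloc[:, cols]

print(df.shape)  # (3, 56): 49 + 5 + 2 selected columns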

Related

Is there a way I can combine the describe() method and transpose in Pandas [duplicate]

This question already has answers here:
How to drop a list of rows from Pandas dataframe?
(15 answers)
Delete a column from a Pandas DataFrame
(20 answers)
Closed 6 months ago.
I am new to both pandas and python in general.
I have a dataset that I have transposed (.T), and I want to use the same transposed format to drop some rows and columns.
I am able to transpose in a different window, but when I try to drop some rows, it returns untransposed results.
I am looking for something like this (to combine transpose & drop):
datafraw.describe().T.drop(labels=['rowName'])
When I run the two separately, the transposition seems to be overshadowed by the drop command.
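For what it's worth, a minimal sketch with made-up data: describe() returns a regular DataFrame, so .T and .drop can be chained directly, as long as each result is used or assigned (drop returns a new frame rather than modifying in place):
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4.0, 5.0, 6.0]})  # hypothetical data

summary = df.describe().T.drop(labels=['a'])   # drop a row of the transposed table
summary = summary.drop(columns=['count'])      # drop a stats column as well
print(summary)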

Concatenate 2 dataframes while matching multiple columns [duplicate]

This question already has answers here:
Pandas Merging 101
(8 answers)
Closed 12 months ago.
I have 2 almost identical pandas dataframes with 5 common columns.
I want to add the second dataframe, which has a new column, to the first.
Dataframe 1
Dataframe 2
But I want it to update the same row when the columns 'Lot name', 'wafer' and 'site' match (highlighted in green). If the columns do not match, I want the value NaN, as shown below.
Desired output
I have to do this with over 160 discrete columns, but with possible matching Lot name, WAFER and SITE values.
I have tried the various merge options (left, right, outer) and concat, but just can't seem to get it right. Any help/comments are appreciated.
Edit, follow up question:
I am trying to use this in a loop, where each iteration generates a new dataframe assigned to temp that needs to be merged with the previous dataframe. I cannot merge with an empty dataframe, as it gives a merge error. How can I achieve this?
alldata = pd.DataFrame()
for i in range(len(operation)):
    temp = data[data['OPE_NO'].isin([operation[i]])]
    temp = temp[temp['PARAM_NAME'].isin([parameter[i]])]
    temp = temp.reset_index(drop=True)
    temp = temp[["LOT", 'Lot name', 'WAFER', "SITE", "PRODUCT", 'PARAM_VALUE_NUMBER']]
    temp = temp.rename(columns={'PARAM_VALUE_NUMBER': 'PMRM28LEMCKLYTFR.1~' + operation[i] + '~' + parameter[i]})
    alldata.merge(temp, how='outer')
This can be done with the following code:
df1.merge(df2, how="outer")
If I'm misunderstanding the problem, please let me know. My English is not good, but I have a good heart and want to help you.
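A minimal self-contained sketch of what seems to be wanted, with made-up key values (the key column names come from the question's code); it also shows one way around the empty-dataframe error from the follow-up:
import pandas as pd

keys = ['Lot name', 'WAFER', 'SITE']

# Tiny made-up frames standing in for the real data.
df1 = pd.DataFrame({'Lot name': ['L1', 'L2'], 'WAFER': [1, 1],
                    'SITE': [0, 0], 'meas_a': [0.1, 0.2]})
df2 = pd.DataFrame({'Lot name': ['L1', 'L3'], 'WAFER': [1, 2],
                    'SITE': [0, 0], 'meas_b': [9.9, 8.8]})

# An outer merge on the key columns fills the new column where the keys
# match and leaves NaN where one side has no match.
merged = df1.merge(df2, on=keys, how='outer')
print(merged)

# For the follow-up: skip the merge on the first iteration instead of
# merging into an empty frame.
alldata = None
for temp in (df1, df2):  # stand-ins for the frames built in the loop
    alldata = temp if alldata is None else alldata.merge(temp, on=keys, how='outer')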

Count values in one column based on the categories of another column [duplicate]

This question already has answers here:
Python: get a frequency count based on two columns (variables) in pandas dataframe some row appears
(3 answers)
Closed last year.
I'm working on the following dataset:
and I want to count each value in the LearnCode column for each Age category. I've tried doing it with the groupby method but didn't manage to get it right. Can anyone help with how to do it?
You can do this using a groupby on the two columns:
results = df.groupby(by=['Age', 'LearnCode']).count()
This outputs a count for each ['Age', 'LearnCode'] pair.
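As a self-contained variant with made-up data: .size() gives the per-pair row counts directly, whereas .count() reports non-null counts for every remaining column:
import pandas as pd

df = pd.DataFrame({
    'Age': ['18-24', '18-24', '25-34', '25-34', '25-34'],
    'LearnCode': ['School', 'Online', 'Online', 'Online', 'Books'],
})

counts = df.groupby(['Age', 'LearnCode']).size()
print(counts)
# Age    LearnCode
# 18-24  Online       1
#        School       1
# 25-34  Books        1
#        Online       2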

How to transpose values in one column to columns using Pandas? [duplicate]

This question already has answers here:
How to pivot a dataframe in Pandas? [duplicate]
(2 answers)
Closed 1 year ago.
Hi there, I have a data set that looks like df1 below, and I want to make it look like df2 using pandas. I have tried pivot and transpose but can't wrap my head around how to do it. Appreciate any help!
This should do the job:
df.pivot_table(index=["AssetID"], columns='MeterName', values='MeterValue')
index: the identifier, one output row per value
columns: the column whose row values become the new columns
values: the values used to fill those columns
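A small self-contained sketch with made-up AssetID/MeterName/MeterValue data (the column names come from the answer above); note that pivot_table aggregates duplicate pairs with the mean by default:
import pandas as pd

df = pd.DataFrame({
    'AssetID': [1, 1, 2, 2],
    'MeterName': ['Fuel', 'Hours', 'Fuel', 'Hours'],
    'MeterValue': [50, 100, 75, 200],
})

df2 = df.pivot_table(index=['AssetID'], columns='MeterName', values='MeterValue')
print(df2)
# MeterName  Fuel  Hours
# AssetID
# 1          50.0  100.0
# 2          75.0  200.0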
I often have the same trouble:
https://towardsdatascience.com/reshape-pandas-dataframe-with-pivot-table-in-python-tutorial-and-visualization-2248c2012a31
This could help next time.

Pandas Categorical masking [duplicate]

This question already has answers here:
How do I select rows from a DataFrame based on column values?
(16 answers)
Closed 4 years ago.
I have what I believe is a simple question but I can't find what I'm looking for in the docs.
I have a dataframe with a Categorical column called mycol, with categories a and b, and I would like to mask a subset of the dataframe as follows:
df_a = df[df.mycol.equal('a')]
Currently I am doing:
df_a = df[df.mycol.cat.codes.values==df.mycol.cat.categories.to_list().index('a')]
which is obviously extremely verbose and inelegant. Since df.mycol has both the codes and the coded labels, it has all the information needed to perform this operation, so I'm wondering about the best way to go about it.
df_a = df[df["mycol"]=='a']
I believe this should work, unless by 'mask' you mean you want to actually zero out the values that are not 'a'.
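A quick self-contained sketch (made-up data) covering both readings of 'mask':
import pandas as pd

df = pd.DataFrame({'mycol': pd.Categorical(['a', 'b', 'a', 'b']),
                   'val': [1, 2, 3, 4]})

# Boolean comparison works directly on a Categorical column.
df_a = df[df['mycol'] == 'a']

# If "mask" means blanking non-'a' rows rather than dropping them,
# Series.where keeps matches and replaces the rest (NaN by default, 0 here).
df['val_masked'] = df['val'].where(df['mycol'] == 'a', 0)
print(df_a)
print(df)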
