This question already has answers here:
How to unnest (explode) a column in a pandas DataFrame, into multiple rows
(16 answers)
Closed 1 year ago.
I have a dataframe like this:
Is there any way to convert it to this?
I tried iterating over the dataframe, then over each Grades array, and adding the values to a new dataframe, but that doesn't seem like the most efficient or easiest way. Is there a built-in method or a better way to approach this problem?
PS: I searched for similar questions but couldn't find any. If a question like this already exists, I am very sorry; I will delete this one immediately.
What you want is pandas.DataFrame.explode().
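import ast
import pandas as pd
# Make sure the Grades column holds real lists first (only needed if they are stored as strings).
df['Grades'] = df['Grades'].apply(ast.literal_eval)
# or
df['Grades'] = df['Grades'].apply(pd.eval)
# One row per list element, then rename the column to the singular form.
df = df.explode('Grades')
df = df.rename(columns={'Grades': 'Grade'})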
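As a minimal end-to-end sketch of the answer above: the sample frame and the Name column are made up for illustration, and the Grades column is assumed to arrive as strings (for example after reading a CSV). If the column already holds lists, the parsing step can be skipped.

```python
import ast
import pandas as pd

# Hypothetical data: the list column is stored as strings.
df = pd.DataFrame({
    "Name": ["Ann", "Bob"],
    "Grades": ["[90, 85]", "[70]"],
})

# Parse the strings into real Python lists.
df["Grades"] = df["Grades"].apply(ast.literal_eval)

# One row per grade; ignore_index=True gives a fresh 0..n-1 index.
result = (
    df.explode("Grades", ignore_index=True)
      .rename(columns={"Grades": "Grade"})
)
print(result)
```

Note that explode leaves the column with object dtype, so a follow-up astype(int) may be wanted before doing arithmetic on the grades.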
This question already has answers here:
How to convert index of a pandas dataframe into a column
(9 answers)
Closed last month.
I'm reading the data I'm working on and organizing it with the following code.
import pandas as pd
# The chain is wrapped in parentheses so it can span several lines.
df = (
    pd.read_csv("data.csv")
    .assign(date=lambda x: pd.to_datetime(x['date']))
    .groupby([pd.Grouper(key='date', freq='M'), pd.Grouper(key='item_id')]).count().reset_index()
    .pivot(index='date', columns='item_id').fillna(0).astype(int)
)
This way I can see the indexes and their values.
What should I do if I want to operate using the values in the indexes? How can I access them?
You can treat your index as a normal column:
df['new_column'] = df.index
Or
df = df.reset_index(drop=False)
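A small sketch of both options, assuming a frame shaped roughly like the pivoted one in the question. The item_1 and item_2 columns and the month column are hypothetical names used only for illustration.

```python
import pandas as pd

# Hypothetical pivoted frame: 'date' is the index, item ids are the columns.
df = pd.DataFrame(
    {"item_1": [3, 0], "item_2": [1, 4]},
    index=pd.to_datetime(["2021-01-31", "2021-02-28"]).rename("date"),
)

# Option 1: derive a regular column from the index (the index itself stays).
with_copy = df.copy()
with_copy["month"] = with_copy.index.month

# Option 2: move the index into a regular column named 'date'.
flat = df.reset_index()
print(flat)
```

The first option is handy when only a derived value is needed; the second is usually simpler when the dates have to take part in merges or groupby calls.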
This question already has answers here:
Pandas column of lists, create a row for each list element
(10 answers)
Closed 1 year ago.
I need to transform the DataFrame below into the required format without using a loop (or any other inefficient logic), because the DataFrame is huge: 950 thousand rows, and each value in the Points column is a list with more than 1000 elements. I get this data by deserializing blob data from the database, and I need it to build some ML models.
input:
output:
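FinalDF = pd.DataFrame()
for index, val in df.iterrows():
    # Repeat the scalar columns once per point in this row.
    tempDF = pd.DataFrame(
        [[
            df['I'][index], df['x'][index],
            df['y'][index], df['points'][index],
        ]] * int(df['points'][index]))
    tempDF["Data"] = df['data'][index]
    tempDF["index"] = list(range(1, int(df['k'][index]) + 1))
    # DataFrame.append was removed in pandas 2.0; pd.concat is the replacement.
    FinalDF = pd.concat([FinalDF, tempDF], ignore_index=True)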
I have tried using a for loop, but for 950 thousand rows it takes so long that this approach is just not feasible. Please help me find a pandas-based solution or, failing that, some other method to do this.
*I had to post screenshots because I was unable to post the dataframe as a table. Sorry, I'm new to Stack Overflow.
Use DataFrame.explode:
df.explode('points')
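The loop in the question also builds a per-row running counter, which explode alone does not produce. One possible sketch, assuming the list column is called points and using made-up sample values, numbers the elements with groupby(level=0).cumcount():

```python
import pandas as pd

# Hypothetical stand-in for the question's frame.
df = pd.DataFrame({
    "I": [1, 2],
    "x": [0.5, 1.5],
    "points": [[10, 20, 30], [40, 50]],
})

exploded = df.explode("points")

# explode repeats the original index label for every element, so counting
# within each label gives the 1-based position inside the original list.
positions = exploded.groupby(level=0).cumcount() + 1
exploded["index"] = positions.to_numpy()

exploded = exploded.reset_index(drop=True)
print(exploded)
```

This stays vectorized, so it should scale far better than building one small frame per row, although with hundreds of millions of resulting rows memory may become the limiting factor.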
This question already has answers here:
How to pivot a dataframe in Pandas? [duplicate]
(2 answers)
Closed 1 year ago.
Hi there, I have a data set that looks like df1 below, and I want to make it look like df2 using pandas. I have tried pivot and transpose but can't wrap my head around how to do it. I'd appreciate any help!
This should do the job
df.pivot_table(index=["AssetID"], columns='MeterName', values='MeterValue')
index: the identifier column(s) that label each row of the result
columns: the column whose values will become the new column names
values: the column whose values fill those new columns
I often have the same trouble:
https://towardsdatascience.com/reshape-pandas-dataframe-with-pivot-table-in-python-tutorial-and-visualization-2248c2012a31
This could help next time.
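For a concrete picture of what the pivot_table call does, here is a small sketch. The column names come from the answer; the sample values (and the Voltage and Current meter names) are invented for illustration.

```python
import pandas as pd

# Hypothetical long-format data using the column names from the answer.
df1 = pd.DataFrame({
    "AssetID": [1, 1, 2, 2],
    "MeterName": ["Voltage", "Current", "Voltage", "Current"],
    "MeterValue": [230.0, 5.0, 231.0, 6.0],
})

df2 = df1.pivot_table(index="AssetID", columns="MeterName", values="MeterValue")

# Optional tidy-up: drop the 'MeterName' label on the columns
# and turn AssetID back into a regular column.
df2 = df2.rename_axis(columns=None).reset_index()
print(df2)
```

Keep in mind that pivot_table aggregates duplicate (AssetID, MeterName) pairs with the mean by default; if each pair is unique and that should be enforced, pivot raises an error on duplicates instead.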
This question already has an answer here:
Python Pandas Key Error When Trying to Access Index
(1 answer)
Closed 2 years ago.
My DataFrame looks like this :
# Pivoting the DF
df = df.pivot(index='Date',columns='Track Name',values='Streams')
After pivoting, I can't access the index key.
The DataFrame looks like this:
Can anyone tell me what I am doing wrong? I am new to pandas; references to helpful resources are welcome.
You can try this:
df.index.values
It returns a NumPy array of the values in your index,
which you can then index or slice, for example df.index.values[0].
Or, if you prefer, convert it to a list:
list(df.index.values)
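The likely cause of the error is that pivot moves Date out of the columns and into the index. A short sketch, reusing the column names from the question with invented sample rows, shows a few ways to reach it:

```python
import pandas as pd

# Hypothetical data with the column names from the question.
df = pd.DataFrame({
    "Date": ["2020-01-01", "2020-01-01", "2020-01-02", "2020-01-02"],
    "Track Name": ["Song A", "Song B", "Song A", "Song B"],
    "Streams": [100, 80, 120, 90],
})
df = df.pivot(index="Date", columns="Track Name", values="Streams")

# 'Date' is now the index, not a column, so df["Date"] would raise KeyError.
dates = df.index.tolist()
first_day = df.loc["2020-01-01"]   # row lookup by index label
as_column = df.reset_index()       # 'Date' becomes a regular column again
print(as_column)
```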
This question already has answers here:
How do I select rows from a DataFrame based on column values?
(16 answers)
Closed 4 years ago.
I have what I believe is a simple question but I can't find what I'm looking for in the docs.
I have a dataframe with a Categorical column called mycol with categories a and b, and I would like to mask a subset of the dataframe as follows:
df_a = df[df.mycol.equal('a')]
Currently I am doing:
df_a = df[df.mycol.cat.codes.values==df.mycol.cat.categories.to_list().index('a')]
which is obviously extremely verbose and inelegant. Since df.mycol has both the codes and the coded labels, it has all the information to perform this operation, so I'm wondering the best way to go about this...
df_a = df[df["mycol"]=='a']
I believe this should work, unless by 'mask' you mean you want to actually zero out the values that don't have a
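A quick sketch with a made-up frame (the value column is hypothetical) suggests that the plain comparison works on a categorical column, and shows one way to blank out values rather than drop rows, in case that is what "mask" means here:

```python
import pandas as pd

df = pd.DataFrame({
    "mycol": pd.Categorical(["a", "b", "a", "b"], categories=["a", "b"]),
    "value": [1, 2, 3, 4],
})

# Comparing a categorical column with one of its categories yields a boolean mask.
is_a = df["mycol"] == "a"
df_a = df[is_a]

# Alternative reading of "mask": keep every row but blank the non-matching values.
masked = df["value"].where(is_a)
print(df_a)
print(masked)
```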