I am trying to convert a pandas DataFrame to a dictionary. I would like pid to be the key and the remaining two columns to be the values, stored as tuples.
I have tried aggregated_events.set_index('pid').to_dict('list') and aggregated_events.set_index('pid').to_dict(), but I know I am missing something. Any help would be greatly appreciated!
Original Dataframe
You can first set pid as the index and transpose the dataframe, so that the pid values become the column names, something like this:
df = df.set_index('pid').T
Then you can use to_dict to convert the transposed dataframe to a dictionary keyed by pid (to_dict('list') gives a list of the remaining values for each pid).
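As a minimal sketch of the transpose approach above: the data and the two value column names (event_count, total_duration) are made up for illustration, since the original dataframe is not shown. The second dictionary is an alternative that builds tuples directly with itertuples rather than lists.

```python
import pandas as pd

# Hypothetical stand-in for aggregated_events; only 'pid' comes from the question.
aggregated_events = pd.DataFrame({
    "pid": [1, 2, 3],
    "event_count": [10, 20, 30],
    "total_duration": [1.5, 2.5, 3.5],
})

indexed = aggregated_events.set_index("pid")

# After transposing, each pid is a column, so to_dict('list') maps pid -> list of values.
# Note that transposing mixed int/float columns upcasts everything to float.
as_lists = indexed.T.to_dict("list")

# Alternative: pair each pid with a plain tuple of its row, keeping per-column dtypes.
as_tuples = dict(zip(indexed.index, indexed.itertuples(index=False, name=None)))
```

The itertuples variant may be closer to what the question asks for, since the values come out as tuples and the integer column is not converted to float.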
I can't seem to find a way to split the array values in a column of a dataframe.
I have managed to get all the array values using this code:
The dataframe is as follows:
I want to use value_counts() on the dataframe, and I get this:
I want the array values that are clubbed together to be split so that I can get the accurate count of every value.
Thanks in advance!
You could try .explode(), which would create a new row for every value in each list.
df_mentioned_id_exploded = pd.DataFrame(df_mentioned_id.explode('entities.user_mentions'))
With the above code you would create a new dataframe df_mentioned_id_exploded with a single column entities.user_mentions, which you could then use .value_counts() on.
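A small sketch of the explode-then-count idea, assuming the column holds Python lists; the sample values are invented, and only the column name comes from the snippet above.

```python
import pandas as pd

# Hypothetical data: each cell is a list of mentioned ids.
df_mentioned_id = pd.DataFrame({
    "entities.user_mentions": [["a", "b"], ["b"], ["a", "b", "c"]],
})

# explode() creates one row per list element, repeating the original index.
exploded = df_mentioned_id.explode("entities.user_mentions")

# Counting now works on individual values instead of whole lists.
counts = exploded["entities.user_mentions"].value_counts()
```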
I have a dataframe with a column that contains a list of dictionaries. For each dictionary I want to extract the values and put them in another column as a list. Please see the picture below for an example, which shows only one row of the dataframe. For each title shown in the picture, I want to extract the values and put them in a list, for all the rows in the dataframe.
Use ast.literal_eval to convert the string to a list of dicts, then extract the `title` key from each record:
import ast
df['activities'].apply(lambda x: [d['title'] for d in ast.literal_eval(x)])
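A hedged sketch of the same idea, storing the result in a new column. The sample strings and the column name titles are assumptions for illustration; the helper also accepts cells that are already lists, in case the column was not stored as text.

```python
import ast
import pandas as pd

# Hypothetical data: each cell is the string representation of a list of dicts.
df = pd.DataFrame({
    "activities": [
        "[{'title': 'Hiking', 'id': 1}, {'title': 'Skiing', 'id': 2}]",
        "[{'title': 'Reading', 'id': 3}]",
    ]
})

def extract_titles(cell):
    # Parse the string safely; pass through cells that are already lists.
    records = ast.literal_eval(cell) if isinstance(cell, str) else cell
    return [record["title"] for record in records]

# 'titles' is a hypothetical name for the new column.
df["titles"] = df["activities"].apply(extract_titles)
```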
I have the following dataframe in pandas:
US46434V7617 US3160928731 US4642865251
2021-07-20 13.741297 53.793367 104.151499
How can I convert this to a dict with the column names as keys and the column values as values? For example:
[{'US46434V7617': 13.741297048948578, 'US3160928731': 53.7933674972021, 'US4642865251': 104.15149908700006}]
You can use df.to_dict with orient='records':
df.to_dict(orient='records')
This will give you a list of dictionaries, one for each row. By the way, the structure you provided in the question is not valid; it must be either a list of dictionaries or a dictionary of key: value pairs.
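For illustration, here is a sketch using the single row from the question, with the rounded values as displayed there. The orient='index' variant is shown only as an alternative that keeps the date as the key; it is not part of the original answer.

```python
import pandas as pd

# One-row dataframe rebuilt from the values shown in the question.
df = pd.DataFrame(
    {
        "US46434V7617": [13.741297],
        "US3160928731": [53.793367],
        "US4642865251": [104.151499],
    },
    index=pd.to_datetime(["2021-07-20"]),
)

# One dict per row; the index (the date) is dropped.
records = df.to_dict(orient="records")

# Alternative: keep the index as the outer key, giving {Timestamp: {column: value}}.
by_date = df.to_dict(orient="index")
```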
I have a dataframe like this:
I would like to convert it to a flat table as below:
You can use pd.DataFrame.stack(), which moves the column labels into the row index and so produces one row per (index, column) value:
df.stack().reset_index()
Maybe pandas.DataFrame.to_numpy does the trick?
Or to keep the index use pandas.DataFrame.to_records
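The suggestions above can be sketched on a small made-up dataframe, since the original one is not shown. The column names row, column and value are assumptions chosen for readability.

```python
import pandas as pd

# Hypothetical wide dataframe.
df = pd.DataFrame({"x": [1, 2], "y": [3, 4]}, index=["a", "b"])

# stack() yields a Series indexed by (row label, column label);
# reset_index() turns both index levels into ordinary columns.
flat = df.stack().reset_index()
flat.columns = ["row", "column", "value"]

# to_records() keeps the index as the first field of a NumPy record array.
records = df.to_records()
```

Here flat has one row per cell of the original dataframe, while records keeps the wide layout and simply carries the index along.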
I want to create a dictionary from a dataframe in Python.
In this dataframe, one column contains all the keys and another column contains the multiple values for each key.
DATAKEY DATAKEYVALUE
name mayank,deepak,naveen,rajni
empid 1,2,3,4
city delhi,mumbai,pune,noida
I tried this code to first convert it into a simple dataframe, but the values are not being separated row-wise:
columnnames=finaldata['DATAKEY']
collist=list(columnnames)
dfObj = pd.DataFrame(columns=collist)
collen=len(finaldata['DATAKEY'])
for i in range(collen):
    colname=collist[i]
    keyvalue=finaldata.DATAKEYVALUE[i]
    valuelist2=keyvalue.split(",")
    dfObj = dfObj.append({colname: valuelist2}, ignore_index=True)
You should modify the title of your question. It is misleading because pandas dataframes are "kind of" dictionaries in themselves, which is why the first comment you got referred to pandas' built-in .to_dict() method.
What you actually want to do is iterate over your dataframe row-wise and, for each row, take the dictionary key from the first column and build a list of values from the second column.
For that you will have to use:
an empty dictionary: dict()
the method for iterating over dataframe rows: dataframe.iterrows()
a method to split a single string of separator-delimited values into a list, such as the split() method you already used: str.split()
With these tools, all you have to do is:
output = dict()
for index, row in finaldata.iterrows():
    output[row['DATAKEY']] = row['DATAKEYVALUE'].split(',')
Note that this generates a dictionary whose values are lists of strings, and it will not work if the contents of the 'DATAKEYVALUE' column are not single strings.
Also note that this may not be the most efficient solution if you have a very large dataframe.
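As one possible alternative for larger dataframes, the loop can be replaced by the vectorised .str.split accessor combined with zip. This is a sketch rebuilt from the sample data in the question, not a benchmarked solution; the final step, turning the dictionary into a dataframe, assumes every key has the same number of values.

```python
import pandas as pd

# Sample data rebuilt from the question.
finaldata = pd.DataFrame({
    "DATAKEY": ["name", "empid", "city"],
    "DATAKEYVALUE": [
        "mayank,deepak,naveen,rajni",
        "1,2,3,4",
        "delhi,mumbai,pune,noida",
    ],
})

# Split every string in one pass, then pair each key with its list of values.
output = dict(zip(finaldata["DATAKEY"], finaldata["DATAKEYVALUE"].str.split(",")))

# With equal-length lists, the dictionary converts directly to a row-wise table.
table = pd.DataFrame(output)
```

As with the loop, the values remain strings, so a column such as empid would still need an explicit conversion (for example with astype(int)) if numbers are wanted.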