Python - How to convert a pandas DataFrame to a JSON object?

I'm using df.to_json() to convert a DataFrame to JSON, but it gives me a JSON string and not an object.
How can I get a JSON object?
Also, when I append this data to an array, it adds single quotes before and after the JSON, which ruins the JSON structure.
How can I export to a JSON object and append it properly?
Code used:
array = []
array.append(df1.to_json(orient='records', lines=True))
array.append(df2.to_json(orient='records', lines=True))
Result:
['{"test":"w","param":1}', '{"test":"w2","param":2}']
Required Result:
[{"test":"w","param":1},{"test":"w2","param":2}]
Thank you!

I believe you need to create a list of dicts and then convert it to JSON:
import json
d = df1.to_dict(orient='records')
j = json.dumps(d)
Or if possible:
j = df1.to_json(orient='records')

Here's what worked for me:
import pandas as pd
import json
df = pd.DataFrame([{"test":"w","param":1},{"test":"w2","param":2}])
print(df)
test param
0 w 1
1 w2 2
So now we convert to a json string:
d = df.to_json(orient='records')
print(d)
[{"test":"w","param":1},{"test":"w2","param":2}]
And now we parse this string to a list of dicts:
data = json.loads(d)
print(data)
[{'test': 'w', 'param': 1}, {'test': 'w2', 'param': 2}]
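To build the single list the question asks for from several DataFrames, one option is to parse each JSON string back into Python objects and extend one list with the records. This is only a minimal sketch: df1 and df2 are stand-ins built from the sample rows in the question, and records and payload are illustrative names.

```python
import json
import pandas as pd

# Stand-in frames built from the sample rows in the question (assumed data)
df1 = pd.DataFrame([{"test": "w", "param": 1}])
df2 = pd.DataFrame([{"test": "w2", "param": 2}])

records = []
for df in (df1, df2):
    # to_json returns a string; json.loads turns it into a list of dicts
    records.extend(json.loads(df.to_json(orient="records")))

# Serialize once, at the end, when a JSON string is actually needed
payload = json.dumps(records)
print(payload)
```

Appending the strings themselves is what produced the quoted entries in the question; extending with parsed dicts keeps the list as plain Python objects until the final json.dumps call.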

Related

Converting an xlsx file to a dictionary in Python pandas

I am trying to import a DataFrame from an xlsx file into Python and then convert this DataFrame to a dictionary. This is what my Excel file looks like:
A B
1 a b
2 c d
where A and B are names of columns and 1 and 2 are names of rows.
I want to convert the data frame to a dictionary in python, using pandas. My code is pretty simple:
import pandas as pd
my_dict = pd.read_excel(r'.\inflation.xlsx', sheet_name='Sheet2', index_col=0).to_dict()
print(my_dict)
What I want to get is:
{'a': 'b', 'c': 'd'}
But what I get is:
{'b': {'c': 'd'}}
What might be the issue?
This does what is requested:
import pandas as pd
d = pd.read_excel(r'.\inflation.xlsx', sheet_name='Sheet2', index_col=0, header=None).transpose().to_dict('records')[0]
print(d)
Output:
{'a': 'b', 'c': 'd'}
The to_dict() function takes an orient parameter which specifies how the data will be manipulated. There are other options if you have more rows.
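As a rough illustration of how orient changes the result, the sketch below builds a small DataFrame in memory instead of reading the Excel file. The frame contents and variable names are assumptions chosen to mirror the two-column sheet in the question.

```python
import pandas as pd

# Small frame mirroring the two-column sheet in the question (assumed data)
df = pd.DataFrame({"A": ["a", "c"], "B": ["b", "d"]})

by_column = df.to_dict()        # default orient='dict': column -> {index -> value}
by_row = df.to_dict("records")  # one dict per row
as_lists = df.to_dict("list")   # column -> list of values

# Pair the first column with the second to map key -> value directly
pairs = dict(zip(df["A"], df["B"]))
print(pairs)
```

When the goal is a flat key-to-value mapping, zipping two columns is often simpler than picking an orient and reshaping the result.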
This should work
import pandas as pd
my_dict = pd.read_excel(r'.\inflation.xlsx', sheet_name='Sheet2', header=0, index_col=None).to_dict('records')
print(my_dict)

Removing Values from Pandas Read Excel

I am trying to read values from an excel and change them to json to use in my API.
I am getting:
{"Names":{"0":"Tom","1":"Bill","2":"Sally","3":"Cody","4":"Betty"}}
I only want to see the values. What I would like to get is this:
{"Names":{"Tom", "Bill", "Sally", "Cody", "Betty"}}
I haven't figured out how to remove the numbers before the values.
The code I am using is as follows:
import pandas as pd
df = pd.read_excel(r'C:\Users\User\Desktop\Names.xlsx')
json_str = df.to_json()
print(json_str)
As mentioned in the comments, your desired result is not valid JSON.
Maybe you can do this instead, which turns the values into a JSON array:
import json
import pandas as pd
df = pd.read_excel(r'C:\Users\User\Desktop\Names.xlsx')
json_str = df.to_json()
temp = json.loads(json_str)
temp['Names'] = list(temp['Names'].values())
print(json.dumps(temp))
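A possible shortcut, sketched under the assumption that the sheet has a single Names column as in the question, is to ask pandas for plain lists with orient="list" and serialize that dictionary. The DataFrame here is built in memory as a stand-in for the Excel file.

```python
import json
import pandas as pd

# Stand-in for the frame read from Names.xlsx (assumed data)
df = pd.DataFrame({"Names": ["Tom", "Bill", "Sally", "Cody", "Betty"]})

# orient="list" maps each column to a plain list, dropping the row index
json_str = json.dumps(df.to_dict(orient="list"))
print(json_str)
```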

How can I write a dictionary to a csv file?

The dictionary looks like the following.
res = {'Qatar': ['68.61994212', '59.03245947', '55.10905996'],
'Burundi': ['0.051012487', '0.048311391', '0.046681908'],
'Japan': ['9.605144835', '9.247590692', '9.542878595', ]}
I want to get rid of ' [ ] in my csv file
I want to get the output csv as,
Qatar 68.61994212 59.03245947 55.10905996
Burundi 0.051012487 0.048311391 0.046681908
Japan 9.605144835 9.247590692 9.542878595
Try the code below. You are probably getting '[]' because you are writing each dictionary value as-is, and each value is a list. Instead, you need to take the values out of the list and write them as separate fields.
import csv
res = {'Qatar': ['68.61994212', '59.03245947', '55.10905996'],
'Burundi': ['0.051012487', '0.048311391', '0.046681908'],
'Japan': ['9.605144835', '9.247590692', '9.542878595', ]}
with open('./result.csv', 'w', newline='') as res_file:
    csv_writer = csv.writer(res_file)
    for k, v in res.items():
        # Put the key first, followed by the values from its list
        res_val = [x for x in v]
        res_val.insert(0, k)
        csv_writer.writerow(res_val)
OUTPUT:
The contents of the file (result.csv) are as below:
Burundi,0.051012487,0.048311391,0.046681908
Japan,9.605144835,9.247590692,9.542878595
Qatar,68.61994212,59.03245947,55.10905996
Aside from Jay's answer, if you are allowed to use pandas, you can use its to_csv function to write the CSV in one line. Note that this writes each country as a column rather than as a row.
import pandas as pd
df = pd.DataFrame(res)
df.to_csv('my_result.csv', index=False)
Try this:
[(k,) + tuple(res[k]) for k in res]
You will get a list of tuples like this, which you can write to a CSV file:
[('Burundi', '0.051012487', '0.048311391', '0.046681908'), ('Japan', '9.605144835', '9.247590692', '9.542878595'), ('Qatar', '68.61994212', '59.03245947', '55.10905996')]
Pandas will do it:
import pandas as pd
res = {'Qatar': ['68.61994212', '59.03245947', '55.10905996'],
'Burundi': ['0.051012487', '0.048311391', '0.046681908'],
'Japan': ['9.605144835', '9.247590692', '9.542878595', ]}
df = pd.DataFrame.from_dict(res, orient='index')
df.to_csv('res.csv', header=False)
Be sure to use orient='index' when creating the DataFrame so that each key becomes a row in the CSV:
Qatar,68.61994212,59.03245947,55.10905996
Burundi,0.051012487,0.048311391,0.046681908
Japan,9.605144835,9.247590692,9.542878595
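The output requested in the question is separated by spaces rather than commas. A minimal sketch of that variant with the standard csv module follows; it writes to an in-memory buffer so it can run anywhere, and the names buffer and writer are illustrative.

```python
import csv
import io

res = {'Qatar': ['68.61994212', '59.03245947', '55.10905996'],
       'Burundi': ['0.051012487', '0.048311391', '0.046681908'],
       'Japan': ['9.605144835', '9.247590692', '9.542878595']}

buffer = io.StringIO()  # in-memory stand-in for an open file
writer = csv.writer(buffer, delimiter=' ', lineterminator='\n')
for country, values in res.items():
    # One row per key: the key first, then its values unpacked
    writer.writerow([country, *values])

print(buffer.getvalue())
```

To write to disk instead, the buffer could be replaced with open('result.csv', 'w', newline='').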

Reading dictionary stored on text file and convert to pandas dataframe [duplicate]

This question already has answers here:
Pandas read nested json
(3 answers)
Closed 4 years ago.
I have a text file that contains a series of data in the form of dictionary.
I would like to read and store as a data frame in pandas.
How would I read it?
I tried pd.read_csv, but it does not give me the DataFrame.
Can anyone help me with that?
You can download the text file Here
Thanks,
Zep,
The problem is you have a nested json. Try using json_normalize instead:
import requests #<-- requests library helps us handle http-requests
import pandas as pd
id_ = '1DbfQxBJKHvWO2YlKZCmeIN4al3xG8Wq5'
url = 'https://drive.google.com/uc?authuser=0&id={}&export=download'.format(id_)
r = requests.get(url)
df = pd.json_normalize(r.json())  # top-level name since pandas 1.0; pd.io.json.json_normalize is deprecated
print(df.columns)
Or read it from disk. json_normalize expects a dictionary object, not a path, so load the file first:
import pandas as pd
import json
with open('myfile.json') as f:
    data = json.load(f)  # parsed dict, not a string
df = pd.json_normalize(data)
Returns:
Index(['average.accelerations', 'average.aerialDuels', 'average.assists',
'average.attackingActions', 'average.backPasses', 'average.ballLosses',
'average.ballRecoveries', 'average.corners', 'average.crosses',
'average.dangerousOpponentHalfRecoveries',
...
'total.successfulLongPasses', 'total.successfulPasses',
'total.successfulPassesToFinalThird', 'total.successfulPenalties',
'total.successfulSmartPasses', 'total.successfulThroughPasses',
'total.successfulVerticalPasses', 'total.throughPasses',
'total.verticalPasses', 'total.yellowCards'],
dtype='object', length=171)
Another idea would be to store each nested object in a Series (and let a dictionary hold those Series).
dfs = {k: pd.Series(v) for k,v in r.json().items()}
print(dfs.keys())
# dict_keys(['average', 'seasonId', 'competitionId', 'positions', 'total', 'playerId', 'percent'])
print(dfs['percent'])
Returns:
aerialDuelsWon 23.080
defensiveDuelsWon 18.420
directFreeKicksOnTarget 0.000
duelsWon 33.470
fieldAerialDuelsWon 23.080
goalConversion 22.581
headShotsOnTarget 0.000
offensiveDuelsWon 37.250
penaltiesConversion 0.000
shotsOnTarget 41.940
...
yellowCardsPerFoul 12.500
dtype: float64
The data only has one entry though.
You can read the file as a string, parse it as JSON, and then use json_normalize to convert the parsed data to a DataFrame.
Example:
import json
import pandas as pd
# Open for reading; "w+" would truncate the file before it is read
with open("file.txt") as f:
    contents = f.read()
contents = contents.replace("\n", "")
json_data = json.loads(contents)
df = pd.json_normalize(json_data)
You should have your data as a dataframe after that.
Hope this helps!
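To see what json_normalize does with nested data without downloading the file, here is a small sketch on a hand-written record. The keys echo names that appear in the output above, but the record itself and all its values are made up for illustration.

```python
import pandas as pd

# Hypothetical record shaped like the file in the question; values are made up
record = {
    "playerId": 1,
    "average": {"assists": 0.5, "corners": 1.2},
    "total": {"assists": 3, "corners": 7},
}

# Nested keys are flattened into dotted column names, one row per record
df = pd.json_normalize(record)
print(sorted(df.columns))
```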

Convert numpy.nd array to json [duplicate]

This question already has answers here:
NumPy array is not JSON serializable
(15 answers)
Closed 4 years ago.
I have a DataFrame, genre_rail, in which one column contains numpy.ndarray values.
The array in it looks like this:
['SINGTEL_movie_22906' 'SINGTEL_movie_22943' 'SINGTEL_movie_24404'
'SINGTEL_movie_22924' 'SINGTEL_movie_22937' 'SINGTEL_movie_22900'
'SINGTEL_movie_24416' 'SINGTEL_movie_24422']
I tried with the following code
import json
json_content = json.dumps({'mydata': [genre_rail.iloc[i]['content_id'] for i in range(len(genre_rail))] })
But got an error
TypeError: array is not JSON serializable
I need output as
{"Rail2_contend_id":
["SINGTEL_movie_22894","SINGTEL_movie_22898",
"SINGTEL_movie_22896","SINGTEL_movie_24609","SINGTEL_movie_2455",
"SINGTEL_movie_24550","SINGTEL_movie_24548","SINGTEL_movie_24546"]}
How about converting the array to a list with the .tolist() method?
Then you can write it to a JSON file like this:
import json
np_array_to_list = np_array.tolist()
json_file = "file.json"
with open(json_file, 'w', encoding='utf-8') as f:
    json.dump(np_array_to_list, f, sort_keys=True, indent=4)
Load all the data into a dictionary, then dump it to JSON. The code below might help you:
import json
# Data
d = ['SINGTEL_movie_22906', 'SINGTEL_movie_22943', 'SINGTEL_movie_24404',
     'SINGTEL_movie_22924', 'SINGTEL_movie_22937', 'SINGTEL_movie_22900',
     'SINGTEL_movie_24416', 'SINGTEL_movie_24422']
# Create dict
dic = {}
dic['Rail2_contend_id'] = d
print(dic)
# Dump the dict to JSON
j = json.dumps(dic)
Output:
{'Rail2_contend_id': ['SINGTEL_movie_22906', 'SINGTEL_movie_22943', 'SINGTEL_movie_24404', 'SINGTEL_movie_22924', 'SINGTEL_movie_22937', 'SINGTEL_movie_22900', 'SINGTEL_movie_24416', 'SINGTEL_movie_24422']}
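If NumPy values turn up in many places, another option is to teach json.dumps how to handle them through a custom encoder instead of calling .tolist() by hand each time. The class below is only an illustrative sketch: NumpyEncoder is a name chosen here, not something provided by NumPy or the json module, and the two-element array is sample data taken from the question.

```python
import json
import numpy as np

class NumpyEncoder(json.JSONEncoder):
    """Illustrative encoder (a hypothetical helper, not part of NumPy or json)."""

    def default(self, obj):
        if isinstance(obj, np.ndarray):
            return obj.tolist()   # arrays become plain lists
        if isinstance(obj, np.generic):
            return obj.item()     # NumPy scalars become Python scalars
        return super().default(obj)

content_ids = np.array(['SINGTEL_movie_22906', 'SINGTEL_movie_22943'])
json_content = json.dumps({'Rail2_contend_id': content_ids}, cls=NumpyEncoder)
print(json_content)
```

json.dumps only calls default for objects it cannot serialize on its own, so ordinary lists, strings, and numbers are unaffected.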
