How to convert JSON Object to Python List - python

I have this JSON object and I am trying to turn it into a Python list, but I am getting some characters along with it that I don't need:
import json

data = {
    "products": [
        {
            "product_cp": 100.0,
            "product_sp": 120.0,
            "product_name": "coke",
        },
        {
            "product_cp": 100.5,
            "product_sp": 120.0,
            "product_name": "fanta",
        },
        {
            "product_cp": 70.5,
            "product_sp": 100.5,
            "product_name": "pepsi",
        }
    ]
}

data = json.dumps(data)
print(data)
print('\v')
data = json.loads(data)
data_list = list(data['products'])
When I do:
print(data_list)
I get:
[{u'product_cp': 100.0, u'product_sp': 120.0, u'product_name': u'coke'},{u'product_cp': 100.5, u'product_sp': 120.0, u'product_name': u'fanta'}, {u'product_cp': 70.5, u'product_sp': 100.5, u'product_name': u'pepsi'}]
How do I make it so that the {, [, }, ] and u' characters don't show up?

To print individual product details without the syntactical overhead ({, [, ], }), you can use a nested loop:
for product in data["products"]:
    for info in product:
        print("%s: %s" % (info, product[info]))
    print()
The output is:
product_cp: 100.0
product_sp: 120.0
product_name: coke
product_cp: 100.5
product_sp: 120.0
product_name: fanta
product_cp: 70.5
product_sp: 100.5
product_name: pepsi

Try this code under Python 3.6; I am a Python rookie as well.
resultresp = []  # must be initialised before the loop
jsonobject = json.dumps(data)
jsonobjectToString = json.loads(jsonobject)
for level1 in jsonobjectToString["products"]:
    # named "row" rather than "str" to avoid shadowing the builtin
    row = (level1["product_cp"], level1["product_sp"], level1["product_name"])
    resultresp.append(row)
print(resultresp)
The output is:
[(100.0, 120.0, 'coke'), (100.5, 120.0, 'fanta'), (70.5, 100.5, 'pepsi')]


Unable to avoid list items in JSON file from being printed in new lines

Edit: problem solved, thanks to Oyono and chepner!
I'm trying to save a dictionary which includes long lists as a JSON file, without having each new item in the lists on a new line.
To save the dictionary, I'm using the command:
with open('file.json', 'w') as fp:
    json.dump(coco, fp, indent=2)
The file is really big and I cannot open it in Colab, so I open it in VS Code, where each item in the list is shown on a separate line. But when I try to print only a small part of the dictionary in Colab, I get everything on a single line.
Any idea why this happens or how to avoid it?
This is how it looks in VS Code:
"annotation": [
    {
        "segmentation": [
            [
                75.0,
                74.5,
                ...(many more lines like this),
                435.0,
                435.5
            ]
        ],
        "iscrowd": 0,
        "category_id": 1,
        "image_id": 43,
        "id": 430,
        "bbox": [
            11.0,
            280.0,
            117.0,
            156.0
        ],
        "area": 9897
    }
]
And this is how I want it to look (I can't tell if there is an actual difference between the files):
{
    "segmentation": [ [ 316.0, 171.5, 320.5, 168.0, 332.5, 153.0, 332.5, 149.0, 330.0, 146.5, 305.0, 134.5, 292.0, 125.5, 280.0, 120.5, 275.0, 116.5, 270.0, 115.5, 261.5, 130.0, 258.0, 133.5, 251.5, 136.0, 255.0, 140.5, 282.0, 153.5, 285.0, 156.5, 289.0, 156.5, 296.0, 159.5, 310.0, 170.5, 316.0, 171.5 ] ],
    "iscrowd": 0,
    "image_id": 5,
    "category_id": 1,
    "id": 5,
    "bbox": [ 251.5, 115.5, 81.0, 56.0 ],
    "area": 2075.0
}
You have to remove indent=2 in your call to dump:
with open('file.json', 'w') as fp:
    json.dump(coco, fp)
The indent keyword makes dump write your dictionary as a pretty-printed JSON object, which means each list item is printed on its own line.
You can try this in a Python console to see how indent influences the output JSON:
json.dumps({1: "a", "b": [1, 2]}, indent=2)
# will output: '{\n  "1": "a",\n  "b": [\n    1,\n    2\n  ]\n}'
json.dumps({1: "a", "b": [1, 2]})
# will output: '{"1": "a", "b": [1, 2]}'
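Going the other way, if the single-line form is still too large, json.dumps also accepts a separators argument that drops the spaces after , and : entirely:

```python
import json

# separators=(",", ":") produces the most compact encoding
compact = json.dumps({1: "a", "b": [1, 2]}, separators=(",", ":"))
print(compact)  # '{"1":"a","b":[1,2]}'
```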

DataFrame to nested JSON with Python?

I am trying to extract data from SQL and convert it into a JSON file.
I have also tried other "techniques" mentioned on various websites, but without any success.
So basically I'm stuck after the statement below:
j = (df.groupby(['SectionCode'])
       .apply(lambda x: x[['Barcode', 'BrandCode', 'PurchaseRate', 'SalesRate', 'unit', 'Item']].to_dict('r'))
       .reset_index()
       .rename(columns={0: 'Products'})
       .to_json(r'D:\DataToFirbaseWithPython\Export_DataFrame.json'))
print(j)
I need this JSON format:
"SectionsWithItem": {            # Root_Node_In_Firebase
    "0001": {                    # SectionCode
        "Products": {
            "018123": {          # Barcode
                "Barcode": "018123",
                "BrandCode": "1004",
                "PurchaseRate": 105.0,
                "SalesRate": 125.0,
                "Units": "Piece",
                "name": "Shahi Delux Mouth Freshener"
            },
            "0039217": {         # Barcode
                "Barcode": "0039217",
                "BrandCode": "0814",
                "PurchaseRate": 140.0,
                "SalesRate": 160.0,
                "Units": "Piece",
                "name": "Maizban Gota Pan Masala Medium Jar"
            }
        }
    },
    "0002": {                    # SectionCode
        "Products": {
            "03905": {           # Barcode
                "Barcode": "03905",
                "BrandCode": "0189",
                "PurchaseRate": 15.4,
                "SalesRate": 17.0,
                "Units": "Piece",
                "name": "Peek Freans Rio Chocolate Half Roll"
            },
            "0003910": {         # Barcode
                "Barcode": "0003910",
                "BrandCode": "0189",
                "PurchaseRate": 110.32,
                "SalesRate": 120.0,
                "Units": "Piece",
                "name": "Peek Freans Gluco Ticky Pack Box"
            }
        }
    }
}
My DataFrame
Barcode,Item,SalesRate,PurchaseRate,unit,BrandCode,SectionCode
0005575,Broom Soft A Quality,100.0,80.0,,2037,0045
0005850,Safa Tomato Paste 800g,340.0,275.0,800g,1004,0009
0005921,Dettol Liquid 1Ltr,800.0,719.99,1Ltr,0475,0045
Grouping by the barcode as well should help with indexing like the desired output.
import pandas as pd
import json

df = pd.read_csv('stac1 - Sheet1.csv', dtype=str)  # made dataframe with provided data
j = (df.groupby(['SectionCode', 'Barcode'])
       .apply(lambda x: x[['Barcode', 'BrandCode', 'PurchaseRate', 'SalesRate', 'unit', 'Item']].to_dict('records'))  # 'r' is a deprecated alias for 'records'
       .reset_index()
       .rename(columns={0: 'Products'})
       .to_json(r'Export_DataFrame.json'))

with open('Export_DataFrame.json') as f:
    data = json.load(f)
print(data)
Hopefully this helps get you in the right direction!
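Note that to_json on the grouped frame will not by itself produce the nested SectionCode -> Products -> Barcode shape from the question. A minimal sketch that builds that structure explicitly with plain dictionaries, using two rows of the posted DataFrame (the column names come from the question; everything else here is an assumption, not the original code):

```python
import json
import pandas as pd

# two sample rows taken from the posted DataFrame
df = pd.DataFrame([
    {"Barcode": "0005575", "Item": "Broom Soft A Quality", "SalesRate": 100.0,
     "PurchaseRate": 80.0, "unit": "", "BrandCode": "2037", "SectionCode": "0045"},
    {"Barcode": "0005921", "Item": "Dettol Liquid 1Ltr", "SalesRate": 800.0,
     "PurchaseRate": 719.99, "unit": "1Ltr", "BrandCode": "0475", "SectionCode": "0045"},
])

nested = {"SectionsWithItem": {}}
for (section, barcode), group in df.groupby(["SectionCode", "Barcode"]):
    row = group.iloc[0]
    # one "Products" dict per section, keyed by barcode
    section_entry = nested["SectionsWithItem"].setdefault(section, {"Products": {}})
    section_entry["Products"][barcode] = {
        "Barcode": row["Barcode"],
        "BrandCode": row["BrandCode"],
        "PurchaseRate": float(row["PurchaseRate"]),  # plain float so json can serialise it
        "SalesRate": float(row["SalesRate"]),
        "Units": row["unit"],
        "name": row["Item"],
    }

print(json.dumps(nested, indent=2))
```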

"Not all parameters were used in the SQL statement" - how to insert API json data in mysql with python?

I need to insert the JSON metrics obtained from the Yandex API into an already created MySQL table. The code, the JSON, and the error text are below. As far as I understand, the problem is that the data is not in the correct format: the data from the JSON needs to be decomposed into table rows, with the metrics split across two columns.
This is what I get from the API (its JSON):
{
    "data": [
        {
            "dimensions": [
                {
                    "name": "2019-02-03"
                }
            ],
            "metrics": [
                100.0,
                1000.0
            ]
        },
        {
            "dimensions": [
                {
                    "name": "2019-02-01"
                }
            ],
            "metrics": [
                200.0,
                2000.0
            ]
        },
        {
            "dimensions": [
                {
                    "name": "2019-02-02"
                }
            ],
            "metrics": [
                300.0,
                3000.0
            ]
        }
    ],
    "data_lag": 148,
    "max": [
        300.0,
        3000.0
    ],
    "min": [
        100.0,
        1000.0
    ],
    "query": {
        "adfox_event_id": "0",
        "attribution": "Last",
        "auto_group_size": "1",
        "currency": "RUB",
        "date1": "2019-02-01",
        "date2": "2019-02-03",
        "dimensions": [
            "ym:s:date"
        ],
        "filters": "ym:s:lastsignUTMSource=='yandex_market'",
        "group": "Week",
        "ids": [
            COUNTER_ID
        ],
        "limit": 100,
        "metrics": [
            "ym:s:ecommercePurchases",
            "ym:s:ecommerceRevenue"
        ],
        "offline_window": "21",
        "offset": 1,
        "quantile": "50",
        "sort": [
            "-ym:s:ecommercePurchases"
        ]
    },
    "sample_share": 1.0,
    "sample_size": 619636,
    "sample_space": 619636,
    "sampled": false,
    "total_rows": 3,
    "total_rows_rounded": false,
    "totals": [
        600.0,
        6000.0
    ]
}
Code:
import requests
import json
import mysql.connector

headers = {'Authorization': 'OAuth TOKEN'}
ids = {
    'Count_1': COUNTER_ID,
}
body = {
    'metrics': 'ym:s:ecommercePurchases,ym:s:ecommerceRevenue',
    'dimensions': 'ym:s:date',
    'date1': '2019-02-01',
    'date2': '2019-02-03',
    'filters': "ym:s:lastsignUTMSource=='yandex_market'",
    'ids': COUNTER_ID,
    'accuracy': 'full',
}

while True:
    try:
        req = requests.get('https://api-metrika.yandex.ru/stat/v1/data', params=body, headers=headers)
        req.encoding = 'utf-8'  # UTF-8
        # success/error messages
        # ...
        elif req.status_code == 200:  # success message
            print("Report success")
            parsed = json.loads(req.text)
            print(json.dumps(parsed, indent=4, sort_keys=True))  # JSON with hierarchy
            break
        # ...

con = mysql.connector.connect(
    host="host_IP",
    user="USER",
    passwd="PWD",
    database="DB_NAME"
)
mycursor = con.cursor()

sql = "INSERT INTO API_METRIKA(Date, Purchases, Revenue) VALUES (%s, %s, %s)"
mycursor.executemany(sql, parsed)  # inserting api-data into mysql table
print('Inserted rows:', mycursor.rowcount)  # how many rows were inserted
con.commit()
con.close()
Full error text:
ProgrammingError Traceback (most recent call last)
in ()
115 print (parsed)
116
--> 117 mycursor.executemany(sql, parsed)
118
119 print('Inserted rows:', mycursor.rowcount)
1 frames
/usr/local/lib/python3.6/dist-packages/mysql/connector/cursor.py in _batch_insert(self, operation, seq_params)
595 if psub.remaining != 0:
596 raise errors.ProgrammingError(
--> 597 "Not all parameters were used in the SQL statement")
598 #for p in self._process_params(params):
599 # tmp = tmp.replace(b'%s',p,1)
ProgrammingError: Not all parameters were used in the SQL statement
To pass specific parameters from the JSON feed, you need to extract the specific values from the tree structure. Currently, you pass the entire JSON into the executemany() call. Also, consider the .json() method of the response object, which avoids the need for json.loads() and even for importing the json library:
parsed = req.json()
...
From your posted result, the needed dimensions and metrics show up at the data list level:
for v in parsed["data"]:
    print(v)
# {'metrics': [100.0, 1000.0], 'dimensions': [{'name': '2019-02-03'}]}
# {'metrics': [200.0, 2000.0], 'dimensions': [{'name': '2019-02-01'}]}
# {'metrics': [300.0, 3000.0], 'dimensions': [{'name': '2019-02-02'}]}
Therefore, you can create a vals list via a list comprehension to be passed into executemany:
sql = "INSERT INTO API_METRIKA (`Date`, `Purchases`, `Revenue`) VALUES (%s, %s, %s)"

# LIST OF TUPLES
vals = [(v['dimensions'][0]['name'],
         v['metrics'][0],
         v['metrics'][1]) for v in parsed["data"]]

# inserting api-data into mysql table
mycursor.executemany(sql, vals)
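Putting it together as a runnable sketch on the sample response from the question, the comprehension yields one tuple per row, ready for executemany (the database call itself is unchanged):

```python
# sample of the parsed API response from the question
parsed = {
    "data": [
        {"dimensions": [{"name": "2019-02-03"}], "metrics": [100.0, 1000.0]},
        {"dimensions": [{"name": "2019-02-01"}], "metrics": [200.0, 2000.0]},
        {"dimensions": [{"name": "2019-02-02"}], "metrics": [300.0, 3000.0]},
    ]
}

# one (date, purchases, revenue) tuple per data row
vals = [(v["dimensions"][0]["name"], v["metrics"][0], v["metrics"][1])
        for v in parsed["data"]]
print(vals)
# [('2019-02-03', 100.0, 1000.0), ('2019-02-01', 200.0, 2000.0), ('2019-02-02', 300.0, 3000.0)]
```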

How to calculate some data with python?

I have a JSON file. I have parsed it and extracted some data: classes and code smells. Now I need to calculate the number of smells in each class. I tried this with one code smell as an example, and it returned the number of occurrences of that smell in the whole JSON file.
This is part of the JSON file, because it is too long:
{
    "methods": [
        {
            "parametersTypes": [
                "Bundle"
            ],
            "sourceFile": {
                "file": {
                    "path": "/mnt/c/shortrain-master/app/src/main/java/com/nirhart/shortrain/MainActivity.java"
                }
            },
            "metricsValues": {
                "ParameterCount": 1.0,
                "NumberOfAccessedVariables": 9.0,
                "ChangingClasses": 0.0,
                "CouplingDispersion": 0.5,
                "MethodLinesOfCode": 21.0,
                "MaxNesting": 0.0,
                "CyclomaticComplexity": 1.0,
                "MaxCallChain": 2.0,
                "ChangingMethods": 0.0,
                "CouplingIntensity": 4.0
            },
            "fullyQualifiedName": "com.nirhart.shortrain.MainActivity.onCreate",
            "smells": []
        },
        {
            "parametersTypes": [],
            "sourceFile": {
                "file": {
                    "path": "/mnt/c/shortrain-master/app/src/main/java/com/nirhart/shortrain/MainActivity.java"
                }
            },
            "metricsValues": {
                "ParameterCount": 0.0,
                "NumberOfAccessedVariables": 2.0,
                "ChangingClasses": 1.0,
                "CouplingDispersion": 0.0,
                "MethodLinesOfCode": 6.0,
                "MaxNesting": 0.0,
                "CyclomaticComplexity": 1.0,
                "MaxCallChain": 6.0,
                "ChangingMethods": 3.0,
                "CouplingIntensity": 0.0
            },
            "fullyQualifiedName": "com.nirhart.shortrain.MainActivity.finishActivity",
            "smells": [
                {
                    "name": "MessageChain",
                    "reason": "MAX_CALL_CHAIN = 6.0",
                    "startingLine": 54,
                    "endingLine": 66
                }
            ]
        },
        {
            "parametersTypes": [
                "View"
            ],
            "sourceFile": {
                "file": {
                    "path": "/mnt/c/shortrain-master/app/src/main/java/com/nirhart/shortrain/MainActivity.java"
                }
            },
            "metricsValues": {
                "ParameterCount": 1.0,
                "NumberOfAccessedVariables": 4.0,
                "ChangingClasses": 0.0,
                "CouplingDispersion": 1.0,
                "MethodLinesOfCode": 6.0,
                "MaxNesting": 1.0,
                "CyclomaticComplexity": 3.0,
                "MaxCallChain": 1.0,
                "ChangingMethods": 0.0,
                "CouplingIntensity": 2.0
            },
            "fullyQualifiedName": "com.nirhart.shortrain.MainActivity.onClick",
            "smells": []
        },
        {
            "parametersTypes": [],
            "sourceFile": {
                "file": {
                    "path": "/mnt/c/shortrain-master/app/src/main/java/com/nirhart/shortrain/MainActivity.java"
                }
            },
            "metricsValues": {
                "ParameterCount": 0.0,
                "NumberOfAccessedVariables": 0.0,
                "ChangingClasses": 0.0,
                "CouplingDispersion": 1.0,
                "MethodLinesOfCode": 3.0,
                "MaxNesting": 0.0,
                "CyclomaticComplexity": 1.0,
                "MaxCallChain": 1.0,
                "ChangingMethods": 0.0,
                "CouplingIntensity": 1.0
            },
            "fullyQualifiedName": "com.nirhart.shortrain.MainActivity.onBackPressed",
            "smells": []
        }
    ],
    "sourceFile": {
        "file": {
            "path": "/mnt/c/shortrain-master/app/src/main/java/com/nirhart/shortrain/MainActivity.java"
        }
    },
    "metricsValues": {
        "ClassLinesOfCode": 40.0,
        "OverrideRatio": null,
        "WeighOfClass": 1.0,
        "LCOM2": 0.5,
        "TightClassCohesion": 0.0,
        "LCOM3": 0.6666666666666666,
        "NumberOfAccessorMethods": 0.0,
        "WeightedMethodCount": 6.0,
        "IsAbstract": 0.0,
        "PublicFieldCount": 0.0
    },
    "fullyQualifiedName": "com.nirhart.shortrain.MainActivity",
    "smells": []
},
]
This is my code:
import pandas as pd
import json

all_smells = ['LazyClass', 'ComplexClass', 'LongParameterList', 'FeatureEnvy', 'LongMethod', 'GodClass', 'MessageChain']

with open('/content/result_smells.json') as handle:
    dictdump = json.loads(handle.read())

my_map = {}
for elem in dictdump:
    my_map[elem["fullyQualifiedName"]] = []
    # adding all class smells
    for class_smell in elem["smells"]:
        my_map[elem["fullyQualifiedName"]].append(class_smell)
    # adding all methods smells
    for method in elem["methods"]:
        for method_smell in method["smells"]:
            my_map[elem["fullyQualifiedName"]].append(method_smell)

for elem in my_map:
    print(elem)
    for smell in my_map[elem]:
        print(smell["name"])
This is the result: the name of each class followed by the smells in it:
com.nirhart.shortrain.MainActivity
MessageChain
com.nirhart.shortrain.path.PathParser
ComplexClass
FeatureEnvy
LongParameterList
LongParameterList
LongMethod
com.nirhart.shortrain.path.PathPoint
LazyClass
LongParameterList
com.nirhart.shortrain.path.TrainPath
FeatureEnvy
com.nirhart.shortrain.rail.RailActionActivity
FeatureEnvy
LongMethod
com.nirhart.shortrain.rail.RailInfo
com.nirhart.shortrain.train.TrainActionActivity
ComplexClass
SpaghettiCode
LongMethod
LongMethod
IntensiveCoupling
I tried to calculate the number of MessageChain smells in the com.nirhart.shortrain.MainActivity class, which is one, but it returns 5, which is the number of MessageChain smells in the whole JSON file.
This is my code:
x = 0
for elem in my_map:
    print(elem)
    for smell in my_map[elem]:
        if smell["name"] == 'MessageChain':
            x += 1
Then I need to put all the results in a CSV to analyse them.
This is an example of a CSV file with one smell.
A Python Counter() can be used to simplify counting the smells, and a csv.DictWriter() can then be used to write the resulting dictionary holding all of the counts. For example:
from collections import Counter
import csv
import json

all_smells = ['LazyClass', 'ComplexClass', 'LongParameterList', 'FeatureEnvy', 'LongMethod', 'GodClass', 'MessageChain']
my_map = {}

with open('result_smells.json') as f_json:
    json_data = json.load(f_json)

for entry in json_data:
    my_map[entry["fullyQualifiedName"]] = []
    # adding all class smells
    for class_smell in entry["smells"]:
        my_map[entry["fullyQualifiedName"]].append(class_smell)
    # adding all methods smells
    for method in entry["methods"]:
        for method_smell in method["smells"]:
            my_map[entry["fullyQualifiedName"]].append(method_smell)

with open('output.csv', 'w', newline='') as f_output:
    csv_output = csv.DictWriter(f_output, fieldnames=["NameOfClass", *all_smells], extrasaction='ignore', restval='NaN')
    csv_output.writeheader()
    for elem in my_map:
        smell_counts = Counter()
        for smell in my_map[elem]:
            smell_counts[smell["name"]] += 1
        smell_counts['NameOfClass'] = elem
        csv_output.writerow(smell_counts)
Giving you an output CSV file looking like:
NameOfClass,LazyClass,ComplexClass,LongParameterList,FeatureEnvy,LongMethod,GodClass,MessageChain
com.nirhart.shortrain.MainActivity,NaN,NaN,NaN,NaN,NaN,NaN,1
com.nirhart.shortrain.path.PathParser,NaN,1,2,1,1,NaN,NaN
com.nirhart.shortrain.path.PathPoint,1,NaN,1,NaN,NaN,NaN,NaN
com.nirhart.shortrain.path.TrainPath,NaN,NaN,NaN,1,NaN,NaN,NaN
com.nirhart.shortrain.rail.RailActionActivity,NaN,NaN,NaN,1,1,NaN,NaN
com.nirhart.shortrain.rail.RailInfo,NaN,NaN,NaN,NaN,NaN,NaN,NaN
com.nirhart.shortrain.train.TrainActionActivity,NaN,1,1,1,2,NaN,2
com.nirhart.shortrain.train.TrainDirection,NaN,NaN,NaN,NaN,NaN,NaN,NaN
com.nirhart.shortrain.train.TrainView,NaN,NaN,NaN,NaN,NaN,NaN,NaN
com.nirhart.shortrain.tutorial.TutorialFragment,NaN,NaN,NaN,NaN,NaN,NaN,NaN
com.nirhart.shortrain.tutorial.TutorialFragment.OnNextSlideClicked,1,NaN,NaN,NaN,NaN,NaN,NaN
com.nirhart.shortrain.tutorial.TutorialViewPagerAdapter,1,NaN,NaN,1,NaN,NaN,NaN
com.nirhart.shortrain.utils.ShortcutsUtils,NaN,NaN,1,NaN,NaN,NaN,2
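If you only need the counts in memory (for example, the MessageChain count for one class, as in the question) rather than in a CSV, the same Counter idea works per class without the writer step. A sketch on a small hypothetical stand-in for my_map:

```python
from collections import Counter

# hypothetical stand-in for the my_map built from the JSON file
my_map = {
    "com.nirhart.shortrain.MainActivity": [{"name": "MessageChain"}],
    "com.nirhart.shortrain.path.PathParser": [
        {"name": "ComplexClass"},
        {"name": "LongParameterList"},
        {"name": "LongParameterList"},
    ],
}

# one Counter per class, so counts never leak across classes
counts = {cls: Counter(s["name"] for s in smells) for cls, smells in my_map.items()}
print(counts["com.nirhart.shortrain.MainActivity"]["MessageChain"])        # 1
print(counts["com.nirhart.shortrain.path.PathParser"]["LongParameterList"])  # 2
```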

Python dictionary comprehension filtering

I have a list of dictionaries, for instance:
movies = [
    {
        "name": "The Help",
        "imdb": 8.0,
        "category": "Drama"
    },
    {
        "name": "The Choice",
        "imdb": 6.2,
        "category": "Romance"
    },
    {
        "name": "Colonia",
        "imdb": 7.4,
        "category": "Romance"
    },
    {
        "name": "Love",
        "imdb": 6.0,
        "category": "Romance"
    },
    {
        "name": "Bride Wars",
        "imdb": 5.4,
        "category": "Romance"
    },
    {
        "name": "AlphaJet",
        "imdb": 3.2,
        "category": "War"
    },
    {
        "name": "Ringing Crime",
        "imdb": 4.0,
        "category": "Crime"
    }
]
I want to filter them by IMDB rating > 5.5.
I tried this code:
[ { k:v for (k,v) in i.items() if i.get("imdb") > 5.5 } for i in movies]
and the output:
[{'name': 'The Help', 'imdb': 8.0, 'category': 'Drama'},
{'name': 'The Choice', 'imdb': 6.2, 'category': 'Romance'},
{'name': 'Colonia', 'imdb': 7.4, 'category': 'Romance'},
{'name': 'Love', 'imdb': 6.0, 'category': 'Romance'},
{},
{},
{}]
When the IMDB rating is lower than 5.5, it returns an empty dictionary. Any ideas? Thank you!
A dictionary comprehension is not necessary to filter a list of dictionaries.
You can just use a list comprehension with a condition based on a dictionary value:
res = [d for d in movies if d['imdb'] > 5.5]
The way your code is written, the dictionary comprehension produces an empty dictionary in cases where i['imdb'] <= 5.5.
An alternative to using a list comprehension is the built-in filter function. It takes a function and an iterable, and returns a "filter object" that keeps only the items for which the function returns True.
In this case, it would be:
list(filter(lambda x: x["imdb"] > 5.5, movies))
I included the list() around everything to convert the filter object to a list you can use. If you want to learn more about the filter builtin, you can read about it in the Python documentation.
Other answers have already provided better alternative ways of doing this, but let's look at the way you were going about it and see what was going on.
If I delete some things from your code, I get:
[{} for i in movies]
Looking at just that should make it clear why a dictionary is created for each movie. You do have an if condition inside that dictionary comprehension, but because it is inside, it doesn't change whether the dictionary itself is created.
To do this the way you were going about it, you'd essentially need to check twice making the first check irrelevant:
[
    {k: v for (k, v) in i.items() if i.get("imdb") > 5.5} for i in movies if i.get("imdb") > 5.5
]
which can be simplified to just
[
    {k: v for (k, v) in i.items()} for i in movies if i.get("imdb") > 5.5
]
and now, since we aren't changing the item, just:
[
    i for i in movies if i.get("imdb") > 5.5
]
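The simplified comprehension and the filter variant produce the same list; a quick check on a small hypothetical subset of the data:

```python
movies = [
    {"name": "The Help", "imdb": 8.0, "category": "Drama"},
    {"name": "AlphaJet", "imdb": 3.2, "category": "War"},
]

via_comprehension = [i for i in movies if i.get("imdb") > 5.5]
via_filter = list(filter(lambda x: x["imdb"] > 5.5, movies))

print(via_comprehension == via_filter)  # True
```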
If you are happy to use a 3rd party library, Pandas accepts a list of dictionaries via the pd.DataFrame constructor:
import pandas as pd
df = pd.DataFrame(movies)
res = df[df['imdb'] > 5.5].to_dict('records')
Result:
[{'category': 'Drama', 'imdb': 8.0, 'name': 'The Help'},
{'category': 'Romance', 'imdb': 6.2, 'name': 'The Choice'},
{'category': 'Romance', 'imdb': 7.4, 'name': 'Colonia'},
{'category': 'Romance', 'imdb': 6.0, 'name': 'Love'}]
