I am trying to import a nested JSON file. I am using Python 3.
The JSON file looks like this:
"funds": [
{
"branch": "****",
"controlDigits": "**",
"accountNumber": "7605390244",
"balance": {
"amount": "71.1",
"currency": "EUR"
},
"fundName": "Eurobits Funds 0",
"webAlias": "Eurobits Funds 0",
"performance": "4.41",
"performanceDescription": "",
"yield": {
"amount": "0.0",
"currency": "EUR"
},
"quantity": "10.00",
"valueDate": "30/03/2017",
"transactions": [
{
"operationType": "1",
"operationDescription": "MOVILIZACION HACIA DENTRO",
"operationDate": "30/03/2017",
"fundName": "B EVOLUCION PRUDEN",
"quantity": "-809.27",
"unitPrice": "7.98",
"operationAmount": {
"amount": "-6457.97",
"currency": "EUR"
}
}
]
}
]
I am using this code:
import json
from pandas.io.json import json_normalize  # in pandas >= 1.0: from pandas import json_normalize

with open("prueba.json", mode='r', encoding="utf8") as f:
    data_python = json.load(f)

json_normalize(data_python['funds'])
This code works fine, but the transactions field is not expanded.
In order to expand transactions I have tried this:
json_normalize(data_python,['funds','transactions'])
The information from transactions is expanded, but I lose the other information.
Besides that, the amount field looks like this:
{'amount': '1.00', 'currency': ''}
and I am not able to split it into separate fields.
My question is: how can I combine all the information into a single dataframe?
Try d['funds'][0]['transactions'] instead, where d is the name of your dictionary.
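To get everything into a single dataframe, here is a sketch using json_normalize's record_path, meta and record_prefix parameters (the column names below follow the sample data; pd.json_normalize requires pandas >= 1.0):

```python
import pandas as pd

# Minimal sample mirroring the structure in the question
data_python = {
    "funds": [{
        "accountNumber": "7605390244",
        "balance": {"amount": "71.1", "currency": "EUR"},
        "fundName": "Eurobits Funds 0",
        "transactions": [{
            "operationType": "1",
            "operationDate": "30/03/2017",
            "operationAmount": {"amount": "-6457.97", "currency": "EUR"},
        }],
    }]
}

# record_path expands each transaction into its own row; meta carries the
# fund-level fields along.  record_prefix avoids clashes between fund-level
# and transaction-level column names (both levels have "fundName" in the
# real data).
df = pd.json_normalize(
    data_python["funds"],
    record_path="transactions",
    meta=["accountNumber", "fundName",
          ["balance", "amount"], ["balance", "currency"]],
    record_prefix="tx.",
)

# Nested dicts inside each transaction (operationAmount) are flattened as
# well, so the amount ends up in separate "tx.operationAmount.amount" and
# "tx.operationAmount.currency" columns.
print(sorted(df.columns))
```

With this, each row is one transaction and the fund-level columns are repeated alongside it, so nothing is lost.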
I have a resultant JSON from an intermediate stage as follows:
a = [{
    "ID": "1201",
    "SubID": "S1201",
    "Information": {
        "Name": "Kim",
        "Age": "41"
    }
}, {
    "ID": "1433",
    "subID": "G1433",
    "Information": {
        "Name": "John",
        "Age": "32"
    }
}]
I have another JSON that needs to be compared with the above JSON:
c = [{
    "ID": "1201",
    "SubID": "S1201"
}, {
    "ID": "3211",
    "subID": "G3211"
}]
Some of the JSON objects in my intermediate result (a) are also present in the other JSON (c). I want to retain only the objects that are repeated.
expected output:
[{
    "ID": "1201",
    "SubID": "S1201",
    "Information": {
        "Name": "Kim",
        "Age": "41"
    }
}]
I'm not clear on what approach to take to achieve this. Please guide me on this. Thanks.
ids = [e['ID'] for e in c]
repeated = [e for e in a if e['ID'] in ids]
print(repeated)
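A self-contained run with the sample data from the question (using a set instead of a list makes each membership test O(1)):

```python
a = [
    {"ID": "1201", "SubID": "S1201", "Information": {"Name": "Kim", "Age": "41"}},
    {"ID": "1433", "subID": "G1433", "Information": {"Name": "John", "Age": "32"}},
]
c = [
    {"ID": "1201", "SubID": "S1201"},
    {"ID": "3211", "subID": "G3211"},
]

# Collect the IDs present in c, then keep only the objects of a
# whose ID appears in that set.
ids = {e["ID"] for e in c}
repeated = [e for e in a if e["ID"] in ids]
print(repeated)
# [{'ID': '1201', 'SubID': 'S1201', 'Information': {'Name': 'Kim', 'Age': '41'}}]
```

Note this matches on ID only, which sidesteps the SubID/subID key inconsistency in the sample data.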
I have a JSON file with lots of data, and I want to keep only specific fields.
My plan is to read the file, extract the data I want, and save it as a new JSON file.
The JSON is like this:
{
    "event": [
        {
            "date": "2019-01-01",
            "location": "world",
            "url": "www.com",
            "comments": "null",
            "country": "china",
            "genre": "blues"
        },
        {
            "date": "2000-01-01",
            "location": "street x",
            "url": "www.cn",
            "comments": "null",
            "country": "turkey",
            "genre": "reds"
        },
        {...
and I want it to be like this (with just date and url from each event):
{
    "event": [
        {
            "date": "2019-01-01",
            "url": "www.com"
        },
        {
            "date": "2000-01-01",
            "url": "www.cn"
        },
        {...
I can open the JSON and read from it using
with open('xx.json') as f:
    data = json.load(f)
data2 = data["event"][0]["date"]
But I still need to understand how to save the data I want in a new JSON file, keeping its structure.
You can use a list comprehension to loop over the events and build dictionaries containing only the keys that you want.
data = { "event": [
{
"date": "2019-01-01",
"location": "world",
"url": "www.com",
"comments": None,
"country": "china",
"genre": "blues",
},
{
"date": "2000-01-01",
"location": "street x",
"url": "www.cn",
"comments": None,
"country" :"turkey",
"genre":"reds",
}
]}
# List comprehension
data["event"] = [{"date": x["date"], "url": x["url"]} for x in data["event"]]
Alternatively, you can map a function over the events list
keys_to_keep = ["date", "url"]
def subset_dict(d):
    return {x: d[x] for x in keys_to_keep}

data["event"] = list(map(subset_dict, data["event"]))
I'm trying to index some pandas dataframes into Elasticsearch. I have some trouble parsing the JSON that I'm generating. I think my problem comes from the mapping. Please find my code below.
import logging
from pprint import pprint
from elasticsearch import Elasticsearch
import pandas as pd

def create_index(es_object, index_name):
    created = False
    # index settings
    settings = {
        "settings": {
            "number_of_shards": 1,
            "number_of_replicas": 0
        },
        "mappings": {
            "danger": {
                "dynamic": "strict",
                "properties": {
                    "name": {
                        "type": "text"
                    },
                    "first_name": {
                        "type": "text"
                    },
                    "age": {
                        "type": "integer"
                    },
                    "city": {
                        "type": "text"
                    },
                    "sex": {
                        "type": "text",
                    },
                }
            }
        }
    }
    try:
        if not es_object.indices.exists(index_name):
            # ignore=400 means to ignore the "Index Already Exists" error
            es_object.indices.create(index=index_name, ignore=400,
                                     body=settings)
            print('Created Index')
            created = True
    except Exception as ex:
        print(str(ex))
    finally:
        return created

def store_record(elastic_object, index_name, record):
    is_stored = True
    try:
        outcome = elastic_object.index(index=index_name, doc_type='danger', body=record)
        print(outcome)
    except Exception as ex:
        print('Error in indexing data')

data = [['Hook', 'James', '90', 'Austin', 'M'],
        ['Sparrow', 'Jack', '15', 'Paris', 'M'],
        ['Kent', 'Clark', '13', 'NYC', 'M'],
        ['Montana', 'Hannah', '28', 'Las Vegas', 'F']]
df = pd.DataFrame(data, columns=['name', 'first_name', 'age', 'city', 'sex'])
result = df.to_json(orient='records')
result = result[1:-1]

es = Elasticsearch()
if es is not None:
    if create_index(es, 'cracra'):
        out = store_record(es, 'cracra', result)
        print('Data indexed successfully')
I got the following error:
POST http://localhost:9200/cracra/danger [status:400 request:0.016s]
Error in indexing data
RequestError(400, 'mapper_parsing_exception', 'failed to parse')
Data indexed successfully
I don't know where it is coming from. If anyone can help me solve this, I would be grateful.
Thanks a lot !
Try to remove extra commas from your mappings:
"mappings": {
"danger": {
"dynamic": "strict",
"properties": {
"name": {
"type": "text"
},
first_name": {
"type": "text"
},
"age": {
"type": "integer"
},
"city": {
"type": "text"
},
"sex": {
"type": "text", <-- here
}, <-- and here
}
}
}
UPDATE
It seems that the index is created successfully and the problem is in the data indexing. As Nishant Saini noted, you are probably trying to index several documents at a time. That can be done using the Bulk API. Here is an example of a correct request that indexes two documents:
POST cracra/danger/_bulk
{"index": {"_id": 1}}
{"name": "Hook", "first_name": "James", "age": "90", "city": "Austin", "sex": "M"}
{"index": {"_id": 2}}
{"name": "Sparrow", "first_name": "Jack", "age": "15", "city": "Paris", "sex": "M"}
Every document in the request body must appear on a new line, preceded by a line of metadata. In this case the metadata contains only the id that will be assigned to the document.
You can either make this query by hand or use the Elasticsearch Helpers for Python, which take care of adding the correct metadata.
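For reference, here is a sketch of building the bulk actions from the dataframe for the elasticsearch-py helpers (the helpers.bulk call itself needs a running cluster, so it is left commented out; the index name cracra comes from the question):

```python
import pandas as pd
# from elasticsearch import Elasticsearch, helpers  # pip install elasticsearch

data = [['Hook', 'James', '90', 'Austin', 'M'],
        ['Sparrow', 'Jack', '15', 'Paris', 'M']]
df = pd.DataFrame(data, columns=['name', 'first_name', 'age', 'city', 'sex'])

# One bulk action per row: to_dict(orient='records') yields one dict per
# row, which becomes the document body (_source).  Older clusters that
# still use mapping types would also need "_type": "danger" in each action.
actions = [
    {"_index": "cracra", "_source": row}
    for row in df.to_dict(orient="records")
]

# helpers.bulk adds the metadata lines of the Bulk API for you:
# es = Elasticsearch()
# helpers.bulk(es, actions)
print(actions[0])
```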
I'm building a JSON object which I want to insert into an already existing collection in MongoDB using pymongo. The data looks like this:
[
    {
        "title": "PyMongo",
        "publication_date": "2015-09-07 10:00:00",
        "tags": ["python", "mongodb", "nosql"],
        "author": {
            "name": "David",
            "author_info": {
                "$oid": "1870981708ddb1a352189e25w"
            }
        },
        "date": {
            "$date": 1484071215000
        }
    }
]
I noticed that author has an ObjectId and date is a timestamp. How can I create the values for author_info and date and insert them into MongoDB?
I have a function that builds the values like this:
def build_json(title, pubctndate, taglist, author):
    json_ob = [{
        "title": title,
        "publication_date": pubctndate,
        "tags": taglist,
        "author": {
            "name": author,
            "author_info": {
                "$oid": " "
            },
        },
        "date": {"$date": " "},
    }]
    return json_ob
which I intend to call like this: json.dumps(build_json(title, pubctndate, taglist, author))
Please bear with me, I'm a complete beginner.
I am trying to integrate PayPal into my website using the paypal-python-SDK. When I type the item_list manually like this, it works:
{"name": "Sparzy", "sku": "music beat", "price": "25.0", "currency": "USD", "quantity": 1}
But when I try to add it in the form of a variable, e.g.:
itemlist = {"name": "Sparzy", "sku": "music beat", "price": "25.0", "currency": "USD", "quantity": 1}
I get the following error:
Payment Error: {u'message': u'Incoming JSON request does not map to API request', u'debug_id': u'394fa35b1b301', u'name': u'MALFORMED_REQUEST', u'information_link': u'https://developer.paypal.com/webapps/developer/docs/api/#MALFORMED_REQUEST'}
I really need it as a variable so that I can dynamically generate the list when a user adds products to the cart. Thanks for your help.
I think you are trying to use a dictionary as JSON. The two syntaxes look similar, but dictionaries are a Python structure while JSON is a data format.
In the comments you said you use your itemlist dictionary like this:
"transactions": [{
"item_list": { "items": [ itemlist ] },
"amount": { "total": total, "currency": "USD" },
"description": "This is the payment transaction description."
}]
If this is the entire JSON data you want to send, you can do something like:
dict_data = {"transactions": [{
    "item_list": { "items": [ itemlist ] },
    "amount": { "total": total, "currency": "USD" },
    "description": "This is the payment transaction description."
}]}
json_data = json.dumps(dict_data)
json.dumps() takes a structure and serializes it into a JSON-formatted string.
If instead you are doing something like:
json_data = {"transactions": [{
    "item_list": { "items": [ json.dumps(itemlist) ] },
    "amount": { "total": total, "currency": "USD" },
    "description": "This is the payment transaction description."
}]}
then you are wrong, because the element transactions:item_list will be a list of strings and not a list of dicts.
It's important to understand that JSON is just a way to serialize a structure; you could serialize your dict to XML in exactly the same way, for example.
So be sure to build the data you want to send as a dictionary first, and then, at the very end, serialize the whole thing into a JSON string.
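A minimal sketch of that workflow (total is a placeholder value):

```python
import json

itemlist = {"name": "Sparzy", "sku": "music beat", "price": "25.0",
            "currency": "USD", "quantity": 1}
total = "25.0"

# Keep everything as Python dicts/lists while building the structure...
dict_data = {"transactions": [{
    "item_list": {"items": [itemlist]},
    "amount": {"total": total, "currency": "USD"},
    "description": "This is the payment transaction description.",
}]}

# ...and serialize once, at the very end.
json_data = json.dumps(dict_data)

# The item stays a dict inside the structure; only json_data is a string.
assert isinstance(dict_data["transactions"][0]["item_list"]["items"][0], dict)
assert isinstance(json_data, str)
```

Note that the paypal-python-SDK actually expects the Python dict itself, so in that case you would pass dict_data straight to the Payment object and skip json.dumps entirely.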