Need to remove array in JSON object - python

My json object is: {"values": {"empid": 20000, "empName": "Sourav", "empSal": 8200}}
But I want to remove the "values" wrapper and keep only the inner object. How can I do this? I have written the code in Python.
In the background it reads streaming data from MySQL and sends it to Kinesis.
def main():
    connection = {
        "host": "127.0.0.1",
        "port": int(sys.argv[1]),
        "user": str(sys.argv[2]),
        "passwd": str(sys.argv[3])}
    kinesis = boto3.client("kinesis", region_name='ap-south-1')
    stream = BinLogStreamReader(
        connection_settings=connection,
        only_events=[DeleteRowsEvent, WriteRowsEvent, UpdateRowsEvent],
        server_id=100,
        blocking=True,
        log_file='mysql-bin.000003',
        resume_stream=True,
    )
    for binlogevent in stream:
        for row in binlogevent.rows:
            print(json.dumps(row, cls=DateTimeEncoder))
            kinesis.put_record(StreamName=str(sys.argv[4]), Data=json.dumps(row, cls=DateTimeEncoder),
                               PartitionKey="default",)

You can call row['values'] which will return the values inside of values.
An example in your code would be
kinesis.put_record(StreamName=str(sys.argv[4]), Data=json.dumps(row['values'], cls=DateTimeEncoder), PartitionKey="default")
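As a minimal sketch of the row['values'] approach, assuming a row shaped like the one in the question: the sample dictionary is hard-coded for illustration, and the DateTimeEncoder from the question is left out because the sample contains no dates.

```python
import json

# Sample row, copied from the JSON object shown in the question.
row = {"values": {"empid": 20000, "empName": "Sourav", "empSal": 8200}}

# Index into the wrapper first, then serialize only the inner dictionary.
payload = json.dumps(row["values"])
print(payload)
```

The string in payload is what would be passed as Data to put_record.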

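If you want to remove the "values" wrapper from the string that json.dumps produces, you can do a string replace, but the search text has to match the serialized output exactly (lowercase, with quotes), and the extra closing brace has to go as well:
json_string = json.dumps(row, cls=DateTimeEncoder)
json_string = json_string.replace('{"values": ', '', 1)[:-1]
and then use put_record on that string. This is fragile and only works when "values" is the only key in the row. Your JSON object is a dictionary, so you can't just delete the "values" key and keep its contents: if you actually removed the values key, the object would be empty.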

Related

I need to connect to a live open source data base and want to record data if a certain key is present and ignore all other data

JSON format:
[{"SH_MSG": {"time": "1657291114000", "area_id": "D1", "address": "54", "msg_type": "SH", "data": "8CFB0B00"}}, {"SF_MSG": {"time": "1657291114000", "area_id": "D2", "address": "0A", "msg_type": "SF", "data": "1F"}}, ...}][...]
I want to record all data that has a CA_MSG tag at the start.
I am using stomp to obtain messages.
msg = json.loads(frame.body)
msg is a list such that:
msg = [{'SF_MSG': {'...'}}, ...]
I am trying:
for m in msg:
    new_msg = []
    if m.keys() == 'CA_MSG':
        new_msg.append(m)
But this is just returning [] every time.
I ended up getting there in the end by skipping out the for loops for a list comprehension:
CA_MSGS = [msg['CA_MSG'] for msg in message if 'CA_MSG' in list(msg.keys())]
dict.keys() returns a list object in Python 2 and a dict_keys view in Python 3.
Neither of them ever compares equal to a string, so this check can never be True:
if m.keys() == 'CA_MSG':  # False all the time
This is probably what you are looking for:
# Python 2
if m.keys().count('CA_MSG') > 0:
# Python 3: converting dict_keys to a set works, although a plain 'CA_MSG' in m performs the same membership test directly
if 'CA_MSG' in set(m.keys()):
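The membership test can be sketched end to end as follows. This is only an illustration: the frame body is built by hand, the fields inside CA_MSG are made up, and extract_ca_messages is a hypothetical helper name, not part of stomp.

```python
import json

# Hypothetical frame body; the CA_MSG fields are invented for illustration.
body = json.dumps([
    {"SF_MSG": {"area_id": "D2", "msg_type": "SF"}},
    {"CA_MSG": {"area_id": "D1", "msg_type": "CA"}},
])

def extract_ca_messages(raw_body):
    """Return the payloads of all messages whose top-level key is CA_MSG."""
    messages = json.loads(raw_body)
    # "key in dict" tests the keys directly, in both Python 2 and Python 3.
    return [m["CA_MSG"] for m in messages if "CA_MSG" in m]

ca_messages = extract_ca_messages(body)
print(ca_messages)
```

Building the result list once, outside any loop, also avoids the original problem of resetting new_msg on every iteration.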

Printing dictionary from inside a list puts one character on each line

Yes, yet another. I can't figure out what the issue is. I'm trying to iterate over a list that is a subsection of JSON output from an API call.
This is the section of JSON that I'm working with:
[
{
"created_at": "2017-02-22 17:20:29 UTC",
"description": "",
"id": 1,
"label": "FOO",
"name": "FOO",
"title": "FOO",
"updated_at": "2018-12-04 16:37:09 UTC"
}
]
The code that I'm running that retrieves this and displays it:
#!/usr/bin/python
import json
import sys
try:
    import requests
except ImportError:
    print "Please install the python-requests module."
    sys.exit(-1)
SAT_API = 'https://satellite6.example.com/api/v2/'
USERNAME = "admin"
PASSWORD = "password"
SSL_VERIFY = False # Ignore SSL for now
def get_json(url):
    # Performs a GET using the passed URL location
    r = requests.get(url, auth=(USERNAME, PASSWORD), verify=SSL_VERIFY)
    return r.json()
def get_results(url):
    jsn = get_json(url)
    if jsn.get('error'):
        print "Error: " + jsn['error']['message']
    else:
        if jsn.get('results'):
            return jsn['results']
        elif 'results' not in jsn:
            return jsn
        else:
            print "No results found"
    return None
def display_all_results(url):
    results = get_results(url)
    if results:
        return json.dumps(results, indent=4, sort_keys=True)
def main():
    orgs = display_all_results(SAT_API + "organizations/")
    for org in orgs:
        print org
if __name__ == "__main__":
    main()
I appear to be missing a concept because when I print org I get each character per line such as
[
{
"
c
r
e
a
t
e
d
_
a
t
"
It does this through to the final ]
I've also tried to print org['name'] which throws the TypeError: list indices must be integers, not str Python error. This makes me think that org is being seen as a list rather than a dictionary which I thought it would be due to the [{...}] format.
What concept am I missing?
EDIT: An explanation for why I'm not getting this: I'm working with a script in the Red Hat Satellite API Guide which I'm using to base another script on. I'm basically learning as I go.
display_all_results returns a string, because json.dumps(results, indent=4, sort_keys=True) serializes the list of dictionaries (which came from r.json() in get_json) into JSON text.
In main you then iterate over the characters of that string, which is why you see one character per line.
Instead, just return results from display_all_results and the code will work as intended:
def display_all_results(url):
    # results is already a list of dictionaries, just return it
    results = get_results(url)
    if results:
        return results
orgs is the result of json.dumps, which produces a string. So instead of this code:
for org in orgs:
    print(org)
replace it with simply:
# for org in orgs:
print(orgs)
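A small sketch of the difference between iterating over the parsed data and iterating over its serialized form may help here. The sample list is a shortened, hard-coded copy of the JSON from the question, not a real API response.

```python
import json

# Shortened copy of the API result from the question.
results = [{"id": 1, "label": "FOO", "name": "FOO"}]

# json.dumps turns the list into text; iterating over text yields characters.
as_text = json.dumps(results, indent=4, sort_keys=True)
first_item_of_text = next(iter(as_text))

# Iterating over the list itself yields dictionaries, so keys are available.
names = [org["name"] for org in results]

print(repr(first_item_of_text))
print(names)
```

If the pretty-printed text is still wanted for display, it can be produced separately with json.dumps while the loop works on the list.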

Parsing json file to collect data and store in a list/array

I am trying to build an IOT setup. I am thinking of using a json file to store states of the sensors and lights of the setup.
I have created a function to test out my concept. Here is what I wrote so far for the data side of things.
{
"sensor_data": [
{
"sensor_id": "302CEM/lion/light1",
"sensor_state": "on"
},
{
"sensor_id": "302CEM/lion/light2",
"sensor_state": "off"
}
]
}
def read_from_db():
    with open('datajson.json') as f:
        data = json.load(f)
    for sensors in data['sensor_data']:
        name = sensors['sensor_id']
read_from_db()
What I want to do is to parse the sensor_id into an array so that I can access them by saying for example sensor_name[0]. I am not sure how to go about it. I tried array.array but it doesn't save any values, have also tried .append but not the result I expected. Any suggestions?
If I understood correctly, all you have to do is collect the sensor IDs into a list called names using a list comprehension and then return the result:
import json
def read_from_db():
    with open('sensor_data.json') as f:
        data = json.load(f)
    names = [sensors['sensor_id'] for sensors in data['sensor_data']]
    return names
sensor_names = read_from_db()
for i in range(len(sensor_names)):
    print(sensor_names[i])
This will print:
302CEM/lion/light1
302CEM/lion/light2
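For an IoT setup it may also be useful to look up a state by sensor ID. The following is a sketch under the assumption that the file has the structure shown in the question; the JSON is inlined as a string here so the example runs without a file, and states_by_id is a hypothetical name.

```python
import json

# Same structure as the file in the question, inlined as a string.
raw = """
{
  "sensor_data": [
    {"sensor_id": "302CEM/lion/light1", "sensor_state": "on"},
    {"sensor_id": "302CEM/lion/light2", "sensor_state": "off"}
  ]
}
"""

data = json.loads(raw)

# A list keeps the order, so sensor_names[0] is the first sensor in the file.
sensor_names = [s["sensor_id"] for s in data["sensor_data"]]

# A dictionary gives direct access to the state of a given sensor.
states_by_id = {s["sensor_id"]: s["sensor_state"] for s in data["sensor_data"]}

print(sensor_names[0])
print(states_by_id["302CEM/lion/light2"])
```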

Writing JSON data in python. Format

I have this method that writes json data to a file. The title is based on books and data is the book publisher,date,author, etc. The method works fine if I wanted to add one book.
Code
import json
def createJson(title,firstName,lastName,date,pageCount,publisher):
    print "\n*** Inside createJson method for " + title + "***\n";
    data = {}
    data[title] = []
    data[title].append({
        'firstName:', firstName,
        'lastName:', lastName,
        'date:', date,
        'pageCount:', pageCount,
        'publisher:', publisher
    })
    with open('data.json','a') as outfile:
        json.dump(data,outfile , default = set_default)
def set_default(obj):
    if isinstance(obj,set):
        return list(obj)
if __name__ == '__main__':
    createJson("stephen-king-it","stephen","king","1971","233","Viking Press")
JSON File with one book/one method call
{
"stephen-king-it": [
["pageCount:233", "publisher:Viking Press", "firstName:stephen", "date:1971", "lastName:king"]
]
}
However if I call the method multiple times , thus adding more book data to the json file. The format is all wrong. For instance if I simply call the method twice with a main method of
if __name__ == '__main__':
createJson("stephen-king-it","stephen","king","1971","233","Viking Press")
createJson("william-golding-lord of the flies","william","golding","1944","134","Penguin Books")
My JSON file looks like
{
"stephen-king-it": [
["pageCount:233", "publisher:Viking Press", "firstName:stephen", "date:1971", "lastName:king"]
]
} {
"william-golding-lord of the flies": [
["pageCount:134", "publisher:Penguin Books", "firstName:william","lastName:golding", "date:1944"]
]
}
Which is obviously wrong. Is there a simple fix to edit my method to produce a correct JSON format? I look at many simple examples online on putting json data in python. But all of them gave me format errors when I checked on JSONLint.com . I have been racking my brain to fix this problem and editing the file to make it correct. However all my efforts were to no avail. Any help is appreciated. Thank you very much.
Simply appending new objects to your file doesn't create valid JSON. You need to add your new data inside the top-level object, then rewrite the entire file.
This should work:
def createJson(title, firstName, lastName, date, pageCount, publisher):
    print("\n*** Inside createJson method for " + title + "***\n")
    # Load any existing json data,
    # or create an empty object if the file is not found,
    # or is empty
    try:
        with open('data.json') as infile:
            data = json.load(infile)
    except (FileNotFoundError, ValueError):
        # ValueError covers json.JSONDecodeError, raised for an empty or invalid file
        data = {}
    if not data:
        data = {}
    data[title] = []
    data[title].append({
        'firstName:', firstName,
        'lastName:', lastName,
        'date:', date,
        'pageCount:', pageCount,
        'publisher:', publisher
    })
    with open('data.json', 'w') as outfile:
        json.dump(data, outfile, default=set_default)
A JSON document has a single top-level value, usually an array or an object (a dictionary in Python). In your case the file contains two top-level objects written one after the other, one with the key stephen-king-it and another with william-golding-lord of the flies. Either of these on its own would be okay, but the way you combine them is invalid.
Using an array you could do this:
[
{ "stephen-king-it": [] },
{ "william-golding-lord of the flies": [] }
]
Or a dictionary style format (I would recommend this):
{
"stephen-king-it": [],
"william-golding-lord of the flies": []
}
Also, the data you are appending looks like it should be formatted as key-value pairs in a dictionary (which would be ideal). You need to change it to this:
data[title].append({
    'firstName': firstName,
    'lastName': lastName,
    'date': date,
    'pageCount': pageCount,
    'publisher': publisher
})
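Putting both suggestions together, the load, update, and rewrite cycle might look like the sketch below. It is not the original createJson: add_book is a hypothetical helper, the book details are passed as keyword arguments, and a temporary directory stands in for the real data.json so the example leaves no files behind.

```python
import json
import os
import tempfile

def add_book(path, title, **fields):
    """Load the existing top-level object (if any), add one book, rewrite the file."""
    try:
        with open(path) as infile:
            data = json.load(infile)
    except (FileNotFoundError, ValueError):
        # Missing, empty, or invalid file: start from an empty object.
        data = {}
    # Real key-value pairs, so no set-to-list conversion is needed.
    data.setdefault(title, []).append(fields)
    with open(path, "w") as outfile:
        json.dump(data, outfile, indent=2)
    return data

path = os.path.join(tempfile.mkdtemp(), "data.json")
add_book(path, "stephen-king-it", firstName="stephen", lastName="king")
add_book(path, "william-golding-lord of the flies", firstName="william", lastName="golding")

with open(path) as f:
    library = json.load(f)
print(sorted(library))
```

Because the whole object is rewritten in "w" mode on every call, the file always holds exactly one top-level value and stays valid JSON.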

(Python) merge new and existing JSON with deduplication

I'm querying an API with Python, This API sends JSON of the last X events and I want to keep a history of what it sent me.
So this is what the API sends, and I have the same type of elements in my flat history file (but with many more of the same objects).
The API and my final file doesn't have a key on which to setup a dictionary.
[{
    "Item1": "01234",
    "Item2": "Company",
    "Item3": "XXXXXXXXX",
    "Item4": "",
    "Item5": "2015-12-17T12:00:01.553",
    "Item6": "2015-12-18T12:00:00"
},
{
    "Item1": "01234",
    "Item2": "Company2",
    "Item3": "XXXXXXX",
    "Item4": null,
    "Item5": "2015-12-17T16:49:23.76",
    "Item6": "2015-12-18T11:00:00"
}]
How do I add up elements of the API only if they are not in the original file?
I have a skeleton of opening/closing file but have not many ideas about the processing.
main_file = open("History.json", "r")
new_items = []
api_data = requests.get(#here lies the api address and the header)
# here should be the deduplication/processing step
for item in api_data:
    if item not in main_file:
        new_items.append(item)
main_file.close()
try:
    file_updated = open("History.json", 'w')
    file_updated.write(new_items + main_file)
    file_updated.close()
    print("File updated")
except:
    print("Error writing file")
EDIT : I used the json to object method to do this :
from collections import namedtuple
Event = namedtuple('Event', 'Item1, Item2, Item3, Item4, Item5, Item6')
def parse_json_events(text):
    events = [Event(**k) for k in json.loads(text)]
    return events
if path.exists('Mainfile.json'):
    with open('Mainfile.json') as data_file:
        local_data = json.load(data_file)
    print(local_data.text) #debug purposes
    events_local = parse_json_events(local_data.text)
else:
    events_local = []
events_api = parse_json_events(api_request.text)
inserted_events = 0
for e in events_api[::-1]:
    if e not in events_local:
        events_local.insert(0, e)
        inserted_events = inserted_events + 1
print("inserted elements %d" % inserted_events)
print(events_local) # this is OK, gives me a list of events
print(json.dump(events_local)) # this ... well... I want the list of object to be serialized but I get this error :
TypeError: dump() missing 1 required positional argument: 'fp'
Normally you solve this kind of problem by defining a schema, with or without a third-party tool (like Avro, Thrift, etc.). Basically, every record you get from the API needs to be translated into an entity in the programming language you are using.
Let's take as an example this JSON object:
{
"Item1": "01234",
"Item2": "Company",
"Item3": "XXXXXXXXX",
"Item4": "",
"Item5": "2015-12-17T12:00:01.553",
"Item6": "2015-12-18T12:00:00"
},
If you have a schema like
class Company(object):
    company_number = ...
    name = ...
    # other fields
Then, all you need to do is to serialize and deserialize the raw data.
Ideally, you'd read the JSON response from the API and then you could simply split each json object as a schema object (with or without a tool). In pseudocode:
api_client = client(http://..., )
response = api_client.get("/resources")
json = response.json
companies = parse_json_companies(json) # list of Company objects
At this point, it's really easy to handle the data you got from the api. You should do the same for the files you have stored on the filesystem. Load your files and deserialize the records (to Company objects). Then, it will be easy to compare the objects, as they will be like "normal" Python objects, so that you can perform comparisons, etc etc.
For example:
from collections import namedtuple
import json
Company = namedtuple('Company', 'Item1, Item2, Item3, Item4, Item5, Item6')
def parse_json_companies(text):
    companies = [Company(**k) for k in json.loads(text)]
    return companies
>>> companies = parse_json_companies(response.text)
>>> companies
[Company(Item1='01234', Item2='Company', Item3='XXXXXXXXX', Item4=u'', Item5='2015-12-17T12:00:01.553', Item6='2015-12-18T12:00:00'), Company(Item1='01234', Item2='Company2', Item3='XXXXXXX', Item4=None, Item5='2015-12-17T16:49:23.76', Item6='2015-12-18T11:00:00')]
Update after the error on .dump(obj, fp).
If you get the error with json.dump, please refer to the documentation. It clearly states that obj and fp are required arguments:
Serialize obj as a JSON formatted stream to fp (a .write()-supporting file-like object) using this conversion table.
So you need to pass an object that supports .write (e.g., a file opened in write mode). If you want the JSON as a string instead, for example to print it, use json.dumps(obj), which takes no fp.
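The two calls can be compared in a short sketch. It assumes a reduced Event namedtuple with only two of the six fields, and uses io.StringIO as a stand-in for a file opened in write mode. Note that a namedtuple is serialized as a JSON array, so _asdict() is used here to get JSON objects back.

```python
import io
import json
from collections import namedtuple

# Reduced version of the Event tuple from the question (two fields only).
Event = namedtuple("Event", "Item1, Item2")
events = [Event("01234", "Company"), Event("01234", "Company2")]

# json.dumps returns a string; namedtuples come out as plain arrays.
as_arrays = json.dumps(events)

# Converting each event to a dict first preserves the field names.
records = [e._asdict() for e in events]
as_objects = json.dumps(records)

# json.dump writes to a file-like object instead of returning a string.
buffer = io.StringIO()  # stand-in for open("Mainfile.json", "w")
json.dump(records, buffer)

print(as_arrays)
print(as_objects)
```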
I think the best way of solving this would be to think about your data structure. It seems like you're using the same data structure as the API at the moment.
Is there an ID among these item fields? If so, use that field for deduplication. For this example I'll use the company name (Item2), and assume history.json stores a dictionary keyed by that name rather than a list.
with open('history.json') as f:
    historic_data = json.load(f)
api_data = requests.get(api_url).json()  # api_url is a placeholder for your API address
for item in api_data:
    historic_data[item['Item2']] = item
with open('history.json', 'w') as f:
    f.write(json.dumps(historic_data))
Every time the name already exists in the dictionary, its entry is overwritten. If the name doesn't exist yet, it is added.
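If no single field is a reliable ID, one alternative is to treat the whole record as its own key. The sketch below is an assumption-laden illustration, not the method from either answer: merge_events and fingerprint are hypothetical names, the records are cut down to two fields, and a record counts as a duplicate only when every field matches.

```python
import json

def merge_events(history, incoming):
    """Return history plus any incoming records not already present, new ones first."""
    def fingerprint(record):
        # sort_keys makes the text independent of key order within a record.
        return json.dumps(record, sort_keys=True)

    seen = {fingerprint(r) for r in history}
    fresh = []
    for record in incoming:
        key = fingerprint(record)
        if key not in seen:
            seen.add(key)
            fresh.append(record)
    return fresh + history

history = [{"Item1": "01234", "Item2": "Company"}]
incoming = [
    {"Item2": "Company", "Item1": "01234"},   # same record, different key order
    {"Item1": "01234", "Item2": "Company2"},  # not in the history yet
]
merged = merge_events(history, incoming)
print(merged)
```

The merged list can then be written back with json.dump to a file opened in write mode, which keeps the history file in the same list format the API uses.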
