Searching a JSON file for a specific value, Python

I have a JSON file that looks like this:
I have a list of device ID's, and I'd like to search my JSON for a specific value of the id, to get the name.
The data that is now in JSON format used to be in XML format, for which I used to do this:
device = xml.find("devices/device[@id='%s']" % someDeviceID)
deviceName = device.attrib['name']
So far, based on answers online, I have managed to search the JSON for a key, but I haven't yet managed to search for a value.

Personally, to read a JSON file I use the jsondatabase module. With it, I would use the following code:
from jsondb.db import Database

db = Database('PATH/TO/YOUR/JSON/FILE')
for device in db['devices']:
    if device['id'] == 'SEARCHEDID':
        print(device['name'])
Of course, when your JSON is online you can fetch it with the requests module and then pass it to the jsondatabase module.
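If you'd rather avoid an extra dependency, here is a minimal sketch using only the standard json module, assuming the file has a top-level "devices" list mirroring the old XML layout (the file name and id are placeholders):

import json

# load the whole file into a Python dict
with open('devices.json') as f:
    data = json.load(f)

some_device_id = '1234'  # placeholder id
# linear search for the device whose id matches
device_name = next(
    (d['name'] for d in data['devices'] if d['id'] == some_device_id),
    None,  # returned when nothing matches
)
print(device_name)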


How to read Json files in a directory separately with a for loop and performing a calculation

Update: Sorry, it seems my question wasn't asked properly. I am analyzing a transportation network consisting of more than 5000 links, with all the data included in one big CSV file. I have several JSON files, each of which contains a subset of this network. I am trying to loop through all the JSON files INDIVIDUALLY (i.e. not concatenating them), read each JSON file, extract the corresponding information from the CSV file, perform a calculation, and save the result along with the name of the file in a new dataframe.
This is the code I wrote, but I'm not sure if it's efficient enough.
import glob
import json
import os

import pandas as pd

name = []
percent_of_truck = []
path_to_json = r'\\directory'  # folder containing the JSON files

for i in glob.glob(os.path.join(path_to_json, '*.json')):
    with open(i, 'r') as myfile:
        l = json.load(myfile)
    name.append(i)
    d_2019 = final.loc[final['LINK_ID'].isin(l)]  # retrieve data from the main CSV file (already loaded as `final`)
    avg_m = (d_2019['AADTT16'] / d_2019['AADT16'] * d_2019['Length']).sum() / d_2019['Length'].sum()  # length-weighted average
    percent_of_truck.append(avg_m)

f = pd.DataFrame()
f['Name'] = name
f['% of truck'] = percent_of_truck
I'm assuming here you just want a dictionary of all the JSON. If so, use the standard json library (import json); this code may be of use:
import json

def importSomeJSONFile(f):
    # make sure the file exists in the same directory
    with open(f) as fh:
        return json.load(fh)

example = importSomeJSONFile("example.json")
print(example)

# access a value within it, replacing "name" with whatever key you want
print(example["name"])
Since you haven't added any schema or other specific requirements, you can follow this approach to solve your problem in any language you prefer (a Python sketch follows the list):
Get the directory of the JSON files that need to be read
Get the list of all files present in that directory
For each file name returned in Step 2:
Read the file
Parse the JSON from the string
Perform the required calculation
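A minimal sketch of those steps in Python; the directory path is a placeholder and the per-file calculation is left as a stub, since the question doesn't pin it down:

import glob
import json
import os

path_to_json = 'path/to/json/dir'  # Step 1: directory containing the JSON files

results = {}
for filename in glob.glob(os.path.join(path_to_json, '*.json')):  # Steps 2-3
    with open(filename) as fh:      # read the file
        data = json.load(fh)        # parse the JSON from the string
    results[filename] = len(data)   # placeholder for the required calculation

print(results)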

reading data from the json file in Python for automating the web application

I have code for automating a web application (I am using Python and Selenium) where I am entering static data. I want to use a JSON file to send the data to the application. Can anyone please help me with how to write the code to pick the data from the JSON file? Here is my code:
import unittest

from selenium import webdriver

driver = webdriver.Chrome()  # driver setup was omitted in the original snippet

name = driver.find_element_by_xpath("some xpath").send_keys("xxxx")
password = driver.find_element_by_xpath("some xpath").send_keys("xxxx")  # `pass` is a reserved word, so renamed
phone_no = driver.find_element_by_xpath("some xpath").send_keys("xxxx")
Please help me with how to read data from the JSON file.
Not sure if this is what you're looking for, but the simplest way to read JSON is with the json module. json.load deserializes the JSON into a Python object, usually a dictionary.
import json

with open('file-name.json') as data_file:
    data = json.load(data_file)

# access the JSON (now a Python object) like this: data['some-field']
When using the send_keys function, you can't just send the JSON as is, because it will add '\n' after every JSON attribute. Here are three ways to do it:
Way 1 - Do not use it as JSON, just use the plain text:
import os

path = os.path.abspath("../excel_upload.json")
with open(path, "r") as fp:
    obj = fp.read()  # read the whole file as plain text
print(obj)

driver.find_element_by_name("name").send_keys(obj)
Way 2 - Serialize the JSON, which is loaded as a dict, back to a string with json.dumps.
Way 3 - Load the JSON and then strip the '\n' characters from its string form.
I used the first way to take the complete JSON text, and the second way to send key/value pairs. There may be a better way; please update with better solutions.
Loading and reading JSON is fine, but using it in send_keys is a different issue.
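As a rough sketch of Way 2 (the file name, field names, and locators below are made up; json.dumps turns the loaded dict back into one compact string with no literal newlines):

import json

from selenium import webdriver

driver = webdriver.Chrome()  # assumes a chromedriver setup

with open('excel_upload.json') as fp:
    data = json.load(fp)  # Way 3 starts the same way

# Way 2: serialize the dict back to one compact string before sending it
driver.find_element_by_name('name').send_keys(json.dumps(data))

# ...or send individual key/value pairs instead of the whole document
driver.find_element_by_name('phone').send_keys(data.get('phone', ''))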

AWS Boto output to a json file

I have to use boto because not all employees have access to the CLI or know how to use it. Boto feels like a guessing game, because I do not see the result of the API calls I make with it. Here is an example:
groups = autoscale_connection.get_all_groups()
print(groups)
Using the AWS CLI you can get output as JSON that you can parse easily. It would be great if boto could also store its output in a JSON file, so that I can look at it and operate on the data in the file.
This didn't work?
from boto.ec2.autoscale import AutoScaleConnection

conn = AutoScaleConnection()
git_em = conn.get_all_groups()
print(git_em)
If you have your .boto and other config files set to be json, it should pop right out.
In Python, the dictionary data structure is the closest equivalent to a JSON object.
If any groups exist, your code will give you a list of Auto Scaling groups (note: get_all_groups returns a list). You then just need to convert that list into a dict, as per your requirement, for example a map of group name (key) to its description (value); that's easy to do in Python.
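A rough sketch of that idea, dumping a name-to-settings map into a JSON file you can inspect later; the exact attributes available on each group object depend on your boto version, so treat them as assumptions:

import json

from boto.ec2.autoscale import AutoScaleConnection

conn = AutoScaleConnection()
groups = conn.get_all_groups()

# build a plain dict so it can be serialized; the attribute names are assumptions
summary = {g.name: {'min_size': g.min_size,
                    'max_size': g.max_size,
                    'desired_capacity': g.desired_capacity}
           for g in groups}

with open('autoscale_groups.json', 'w') as out:
    json.dump(summary, out, indent=2)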

Take a Json object from three webpages and write that in a file?

I am trying to write a python script that will take a json object from a web page and write it to a flat file. There are ten lines in the flat file and three web pages. I have come to this code with the help of various online resources:
#!/usr/bin/python
import requests
import simplejson

r = requests.get('https://github.com/timeline.json')
c = r.content
j = simplejson.loads(c)

for item in j:
    print(item['repository']['name'])
This code returns JSON objects from the GitHub timeline events. The returned JSON objects are parsed and printed as dictionaries. I want to know: is this a good way, or is there a better way to do this?
Also, is there a way to send the JSON object to a Python script that will update the flat file with the inputs from a webpage?
P.S. Flat files are data files that contain records with no structured relationships, i.e. a normal .txt file with data.
requests can decode json text for you:
#!/usr/bin/env python
import requests

r = requests.get('https://github.com/timeline.json')
for item in r.json() or []:
    print(item['repository']['name'])
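To cover the flat-file part of the question, here is a small sketch that loops over several pages and appends the repository names to a plain text file. The URLs are placeholders (the GitHub timeline endpoint no longer exists), and the 'repository'/'name' keys are taken from the snippet above:

import requests

urls = [
    'https://example.com/page1.json',  # placeholder URLs for the three pages
    'https://example.com/page2.json',
    'https://example.com/page3.json',
]

with open('repos.txt', 'a') as flat_file:  # append to the flat file
    for url in urls:
        for item in requests.get(url).json() or []:
            flat_file.write(item['repository']['name'] + '\n')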

Convert BibTex file to database entries using Python

Given a bibTex file, I need to add the respective fields(author, title, journal etc.) to a table in a MySQL database (with a custom schema).
After doing some initial research, I found that there exists Bibutils which I could use to convert a bib file to xml. My initial idea was to convert it to XML and then parse the XML in python to populate a dictionary.
My main questions are:
Is there a better way I could do this conversion?
Is there a library which directly parses a bibTex and gives me the fields in python?
(I did find bibliography.parsing, which uses bibutils internally, but there is not much documentation on it and I am finding it tough to get it to work.)
Old question, but I am doing the same thing at the moment using the Pybtex library, which has an inbuilt parser:
from pybtex.database.input import bibtex
#open a bibtex file
parser = bibtex.Parser()
bibdata = parser.parse_file("myrefs.bib")
#loop through the individual references
for bib_id in bibdata.entries:
b = bibdata.entries[bib_id].fields
try:
# change these lines to create a SQL insert
print b["title"]
print b["journal"]
print b["year"]
#deal with multiple authors
for author in bibdata.entries[bib_id].persons["author"]:
print author.first(), author.last()
# field may not exist for a reference
except(KeyError):
continue
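To turn those print lines into actual database inserts, here is a minimal sketch using the standard sqlite3 module just to keep it self-contained; swap in your MySQL driver, connection details, and schema for a real MySQL table:

import sqlite3

from pybtex.database.input import bibtex

conn = sqlite3.connect('refs.db')
conn.execute('CREATE TABLE IF NOT EXISTS refs (title TEXT, journal TEXT, year TEXT)')

bibdata = bibtex.Parser().parse_file('myrefs.bib')
for bib_id in bibdata.entries:
    fields = bibdata.entries[bib_id].fields
    conn.execute(
        'INSERT INTO refs (title, journal, year) VALUES (?, ?, ?)',
        (fields.get('title'), fields.get('journal'), fields.get('year')),
    )
conn.commit()
conn.close()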
My workaround is to use bibtexparser to export the relevant fields to a CSV file:
import bibtexparser
import pandas as pd

with open("../../bib/small.bib") as bibtex_file:
    bib_database = bibtexparser.load(bibtex_file)

df = pd.DataFrame(bib_database.entries)
selection = df[['doi', 'number']]
selection.to_csv('temp.csv', index=False)
Then write the CSV to a table in the database and delete temp.csv.
This avoids some complications I ran into with pybtex.
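If the temporary CSV is only an intermediate step, pandas can also write the DataFrame straight to the database with to_sql; the connection string and table name below are placeholders, and this assumes SQLAlchemy and a MySQL driver are installed:

import bibtexparser
import pandas as pd
from sqlalchemy import create_engine

with open('../../bib/small.bib') as bibtex_file:
    bib_database = bibtexparser.load(bibtex_file)

df = pd.DataFrame(bib_database.entries)
engine = create_engine('mysql+pymysql://user:password@localhost/refs')  # placeholder DSN
df[['doi', 'number']].to_sql('refs', engine, if_exists='append', index=False)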
You can also use Python BibtexParser: https://github.com/sciunto/python-bibtexparser
Documentation: https://bibtexparser.readthedocs.org
It's very straightforward (I use it in production).
For the record, I am not the developer of this library.
Converting to XML is a fine idea.
XML exists as an application-independent data format, so that you can parse it with readily-available libraries; using it as an intermediary has no particular drawbacks. In fact, you can usually import XML into a database without even going through a programming language such as Python (although the amount of Python you'd have to write for a task like this is trivial).
So far as I know, there is no direct, mature bibTeX reader for Python.
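If you do go the XML route, parsing the converted file takes only a few lines with the standard library. The element names below are purely illustrative, since they depend on the exact output format your converter produces:

import xml.etree.ElementTree as ET

tree = ET.parse('refs.xml')  # the file produced by the bib-to-XML conversion
records = []
for entry in tree.getroot():  # iterate over whatever the per-record element is
    # collect every child element into a plain dict: tag name -> text
    records.append({child.tag: child.text for child in entry})

print(records[:3])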
You could use the Perl package Bib2ML (a.k.a. Bib2HTML). It contains a bib2sql tool that generates a SQL database from a BibTeX database, with its own fixed schema.
An alternative tool: bibsql and bibtosql.
Then you can map the result onto your own schema by writing some SQL conversion queries.
