I am working on a quick and dirty script to get Chromium's bookmarks and turn them into a pipe menu for Openbox. Chromium stores its bookmarks in a file called Bookmarks, which holds the data in a dictionary-like form, like this:
{
    "checksum": "99999999999999999999",
    "roots": {
        "bookmark_bar": {
            "children": [ {
                "date_added": "9999999999999999999",
                "id": "9",
                "name": "Facebook",
                "type": "url",
                "url": "http://www.facebook.com/"
            }, {
                "date_added": "999999999999",
                "id": "9",
                "name": "Twitter",
                "type": "url",
                "url": "http://twitter.com/"
How would I open the dictionary in this file in Python and assign it to a variable? I know you open a file with open(), but I don't really know where to go from there. In the end, I want to be able to access the info in the dictionary from a variable, like bookmarks['bookmark_bar']['children'][0]['name'], and have it return 'Facebook'.
Do you know if this is a JSON file? If so, Python provides a json library.
JSON can be used as a data serialization/interchange format. It's nice because it's cross-platform. Importing it the way you ask is fairly easy; an example from the docs:
>>> import json
>>> json.loads('["foo", {"bar":["baz", null, 1.0, 2]}]')
[u'foo', {u'bar': [u'baz', None, 1.0, 2]}]
So in your case it would look something like:
import json

with open('file.txt') as f:
    text = f.read()

bookmarks = json.loads(text)
print(bookmarks['roots']['bookmark_bar']['children'][0]['name'])
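And since the end goal is an Openbox pipe menu, here is a minimal sketch of emitting one from the parsed data; it assumes Openbox's standard pipe-menu XML elements and chromium as the launch command, so adjust to taste:

import json
from xml.sax.saxutils import quoteattr, escape

with open('Bookmarks') as f:
    bookmarks = json.load(f)

print('<openbox_pipe_menu>')
for item in bookmarks['roots']['bookmark_bar']['children']:
    if item['type'] == 'url':
        # escape names and URLs so the XML stays well-formed
        print('  <item label=%s>' % quoteattr(item['name']))
        print('    <action name="Execute"><command>%s</command></action>'
              % escape('chromium ' + item['url']))
        print('  </item>')
print('</openbox_pipe_menu>')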
JSON is definitely the "right" way to do this, but for a quick-and-dirty script eval() might suffice (note that eval() runs whatever is in the file as code, and it will fail on JSON literals like null, true, and false):
with open('file.txt') as f:
    bookmarks = eval(f.read())
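A safer middle ground, if you want to avoid eval() without reaching for json, is ast.literal_eval; a minimal sketch (it only evaluates Python literals, so it too rejects null/true/false and only works on dict-literal-style data):

import ast

with open('file.txt') as f:
    bookmarks = ast.literal_eval(f.read())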
I'll try to explain my goal. I have to write reports based on a document sent to me that has common strings in it. For example, it contains data like:
"reportId": 84561234,
"dateReceived": "2020-01-19T17:54:31.000+0000",
"reportingEsp": {
"firstName": "Google",
"lastName": "Reviewer",
"addresses": {
"address": [
{
"street1": "1600 Ampitheater Parkway",
"street2": null,
"city": "Mountainview",
"postalCode": "94043",
"state": "CA",
"nonUsaState": null,
"country": "US",
"type": "BUSINESS"
This is an example of the 'raw' data. It is also presented in a PDF. I have tried scraping the PDF using tabula, but there seems to be some issue with fonts, so I only get about 10% of the text. I am wondering whether going after the raw data will be more accurate/easier (if you think scraping the PDF would be easier, please let me know).
So I used this code:
with open('filetobesearched.txt', 'r') as searchfile:
    for line in searchfile:
        if 'reportId' in line:
            print(line)
        if 'dateReceived' in line:
            print(line)
        if 'firstName' in line:
            print(line)
and this is where the trouble starts... there are multiple occurrences of the string 'firstName' in the file, so my code as it exists prints each of them one after the other. In the raw file those fields live in different sections, each preceded by a section header like 'reportingEsp' in the example above. I'd like my code to somehow know that one occurrence of 'firstName' belongs to a given section and the next occurrence belongs to another section, and print it with that section... (make sense?)
Eventually I'd like to parse out the address information but omit any fields with a null.
And ULTIMATELY I'd like the data output to a file I could then import into my report template and fill those fields as applicable. That seems like a huge thing to me... so I'll be happy with help simply parsing through the raw data and writing the results to a file in the proper order.
Thanks in advance for any help!
Thanks, yes, TIL - it's JSON data. So I accomplished my goal like this:
JSON data: (the same snippet shown in the question above)
My code:
import json

# read and parse the file
with open('file.json', 'r') as myjsonfile:
    obj = json.load(myjsonfile)

# parse through the json data to populate report variables
rptid = str(obj['reportId'])
dateReceived = obj['dateReceived']  # already a string in the JSON
print('Report ID: ', rptid)
print('Date Received: ', dateReceived)
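To go a step further toward the address parsing, here is a minimal sketch that continues from obj above, walking the nested structure and skipping any null fields (the structure is assumed to match the snippet in the question):

address = obj['reportingEsp']['addresses']['address'][0]
for key, value in address.items():
    # omit any fields with a null value
    if value is not None:
        print(key, ':', value)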
So now that I have those as variables, I am trying to use them to fill a docx template... but that's another question, I think.
Consider this one answered. Thanks again!
I am looking for a way to set the schema from a JSON file in Python on BigQuery. The following document says I can set it with SchemaField one by one, but I want to find a more efficient way:
https://cloud.google.com/bigquery/docs/schemas
I am skeptical that autodetect would get it right in this case.
I would appreciate any help.
You can create a JSON file with columns/data types and use the below code to build BigQuery Schema.
JSON File (schema.json):
[
    {
        "name": "emp_id",
        "type": "INTEGER"
    },
    {
        "name": "emp_name",
        "type": "STRING"
    }
]
Python Code:
import json
from google.cloud import bigquery

bigquerySchema = []
with open('schema.json') as f:
    bigqueryColumns = json.load(f)
for col in bigqueryColumns:
    bigquerySchema.append(bigquery.SchemaField(col['name'], col['type']))
print(bigquerySchema)
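If it helps, the resulting list can then be handed straight to a table definition; a hedged usage sketch (the project/dataset/table ID is a placeholder):

client = bigquery.Client()
table = bigquery.Table('my-project.my_dataset.my_table', schema=bigquerySchema)
table = client.create_table(table)  # creates the table with the parsed schema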
Soumendra Mishra's answer is already helpful, but here is a bit more general version that can optionally accept additional fields such as mode or description:
JSON File (schema.json):
[
    {
        "name": "emp_id",
        "type": "INTEGER",
        "mode": "REQUIRED"
    },
    {
        "name": "emp_name",
        "type": "STRING",
        "description": "Description of this field"
    }
]
Python Code:
import json
from google.cloud import bigquery

table_schema = []
# open JSON file read only
with open('schema.json', 'r') as f:
    json_schema = json.load(f)
for entry in json_schema:
    # rename key; bigquery.SchemaField expects `type` to be called `field_type`
    entry["field_type"] = entry.pop("type")
    # ** effectively provides the dict as argument:value pairs (e.g. name="emp_id")
    table_schema.append(bigquery.SchemaField(**entry))
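As an aside, if your google-cloud-bigquery version is recent enough, SchemaField.from_api_repr accepts the REST-style keys directly, so no renaming is needed; a hedged sketch:

import json
from google.cloud import bigquery

with open('schema.json') as f:
    # from_api_repr understands "name", "type", "mode", "description" as-is
    table_schema = [bigquery.SchemaField.from_api_repr(entry)
                    for entry in json.load(f)]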
I would like to grab some values within a JSON object using Python and assign them to variables for further use in the HTML frontend.
I tried it using the w3c resource and by googling, but I can't get a successful print:
import requests
import json
response = requests.get("https://www.api-football.com/demo/api/v2/teams/team/33")
team_data = response.json()
team_name = team_data.teams.name
print(team_name)
This is the JSON Object i get from the external API:
{
    "api": {
        "results": 1,
        "teams": [
            {
                "team_id": 33,
                "name": "Manchester United",
                "code": "MUN",
                "logo": "https://media.api-football.com/teams/33.png",
                "country": "England",
                "founded": 1878,
                "venue_name": "Old Trafford",
                "venue_surface": "grass",
                "venue_address": "Sir Matt Busby Way",
                "venue_city": "Manchester",
                "venue_capacity": 76212
            }
        ]
    }
}
The debug console tells me AttributeError: 'dict' object has no attribute 'teams'
JSON (JavaScript Object Notation) is essentially the JavaScript object syntax used for storing serialized data. The Python interpreter has a specific module, import json, that allows loading JSON files into Python dicts in the RAM of your machine.
In your situation:
import json

with open('team_data.json', 'r') as fobj:
    dataset = json.load(fobj)
print('Teams data: ', dataset['api']['teams'])
From here you can work with it as a normal dict.
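Applied back to the original requests call, the same dict indexing fixes the AttributeError; dicts are indexed with ['key'], not attribute access, and "teams" is a list, so take element 0 first:

import requests

response = requests.get("https://www.api-football.com/demo/api/v2/teams/team/33")
team_data = response.json()
team_name = team_data['api']['teams'][0]['name']
print(team_name)  # Manchester United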
I'm trying to use the Python extension for PDAL to read in a laz file.
To do so, I'm using the simple pipeline structure as exampled here: https://gis.stackexchange.com/questions/303334/accessing-raw-data-from-laz-file-in-python-with-open-source-software. It would be useful for me, however, to insert the value contained in a variable for the "filename" field. I've tried the following, where fullFileName is a str variable containing the name (full path) of the file, but I am getting an error that no such file exists. I am assuming my JSON syntax is slightly off or something; can anyone help?
pipeline="""{
"pipeline": [
{
"type": "readers.las",
"filename": "{fullFileName}"
}
]
}"""
You can follow this code:
import json
import pdal

file = "D:/Lidar data/input.laz"
pipeline = {
    "pipeline": [
        {
            "type": "readers.las",
            "filename": file
        },
        {
            "type": "filters.sort",
            "dimension": "Z"
        }
    ]
}
r = pdal.Pipeline(json.dumps(pipeline))
r.validate()
points = r.execute()
print(points)
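For completeness, the original triple-quoted approach can also be made to work by letting Python actually substitute the variable, e.g. with an f-string; a sketch (literal braces in f-strings must be doubled, and the path is an assumption):

fullFileName = "D:/Lidar data/input.laz"  # assumed path
pipeline = f'''{{
    "pipeline": [
        {{
            "type": "readers.las",
            "filename": "{fullFileName}"
        }}
    ]
}}'''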
{
    "id": "APA91bE9N6D9Tp79gv1kUgWLhsCmbKPKJQlzgtr1iGKlL5249bzD5DxySBiaIzDmk7rOAdrWcNcP0ZxPnaj7e6Esc _iGIYJlDte-E1pMO9GME4QufgdQQOIccM2tExMd9L9RsQthR3160KbQeRmtfxW6gvuPXYN0zw",
    "platform": "android",
    "user": ObjectId("545b2833b21e898413de9314"),
    "_id": ObjectId("545b5e76d6be01755625b284"),
    "createdAt": Date(1415274102856),
    "__v": 0
}{
    "__v": 0,
    "_id": ObjectId("545b67c4d6be01755625b2c1"),
    "createdAt": Date(1415276484321),
    "id": "APA91bFRxirYHIko33D1LiHODpBd77IlRhebK4tMRWecFxb5E6nfWSMFarr5mlwmY9bPQP56DGP7cnli4_jOrS8Ynn3Y9w9uaRoESoEPglqR-rA-3phsh8UtSxMC5lNoOqIrohz3hBjzzpCH_vExwo6B5yV6Mb8jyg",
    "platform": "android",
    "user": ObjectId("545b69a5d6be01755625b2d2")
}
Here is the content of the JSON file. The code which I am using for importing is:
import json

with open("test.json") as json_file:
    json_data = json.load(json_file)
print(json_data)
As #Puffin pointed out, you need to handle ObjectId, Date, etc. if you are dumping MongoDB BSON into JSON and accessing it.
If possible, use pymongo to access MongoDB directly from Python rather than dumping into JSON and accessing the data.
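A hedged sketch of that pymongo route (the host, database, and collection names are assumptions):

from pymongo import MongoClient

client = MongoClient('mongodb://localhost:27017')
# ObjectId and dates come back as native bson/datetime types, no parsing needed
for doc in client['mydb']['tokens'].find({'platform': 'android'}):
    print(doc['_id'], doc['createdAt'])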
Your first string is not valid JSON, so it is not possible to ingest it directly with a JSON parser. What I would do is write a preprocessor which expands the non-JSON elements like ObjectId or Date into, e.g., strings. Something along the lines of this answer.
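A minimal sketch of such a preprocessor, assuming the dump looks like the snippet above (ObjectId/Date rewritten as plain strings, and the back-to-back documents wrapped into a JSON array):

import json
import re

with open('test.json') as f:
    raw = f.read()

# ObjectId("545b...") -> "545b...", Date(1415274102856) -> "1415274102856"
raw = re.sub(r'ObjectId\("([^"]*)"\)', r'"\1"', raw)
raw = re.sub(r'Date\((\d+)\)', r'"\1"', raw)
# the dump concatenates documents back to back; join them into an array
raw = '[' + re.sub(r'\}\s*\{', '},{', raw) + ']'

docs = json.loads(raw)
print(docs[0]['platform'])  # android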