The goal is to open a JSON file or a website URL so that I can view earthquake data. I wrote a function that treats the JSON as a dictionary and a list, but in the terminal an "invalid argument" error appears. What is the best way to open a JSON file using Python?
import requests
def earthquake_daily_summary():
    req = requests.get("https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_day.geojson")
    data = req.json()  # The .json() method converts the JSON data from the server to a dictionary
    # Open json file
    f = open('https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_day.geojson')
    # returns JSON object as a dictionary
    data = json.load(f)
    # Iterating through the json
    # list
    for i in data['emp_details']:
        print(i)
    f.close()
print("\n=========== PROBLEM 5 TESTS ===========")
earthquake_daily_summary()
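open() only accepts local file paths, so passing it a URL is what raises the invalid-argument error. requests.get() has already downloaded the data, so you can convert the response to JSON immediately and read what you need.
I didn't find the 'emp_details' key in the feed, so I replaced it with 'features'.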
import requests
def earthquake_daily_summary():
    data = requests.get("https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_day.geojson").json()
    for row in data['features']:
        print(row)
print("\n=========== PROBLEM 5 TESTS ===========")
earthquake_daily_summary()
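Once the feed is parsed, each entry in data['features'] is a nested dictionary, so individual fields can be pulled out instead of printing whole rows. Here is a minimal sketch, assuming each feature carries 'mag' and 'place' keys under 'properties' (which is how the USGS GeoJSON feed appears to be laid out). The summarize_features helper and the sample data are made up for illustration so the snippet runs without network access:

```python
def summarize_features(data):
    """Return (magnitude, place) pairs from a GeoJSON-style dict."""
    summary = []
    for feature in data.get("features", []):
        props = feature.get("properties", {})
        # .get() returns None instead of raising if a key is missing
        summary.append((props.get("mag"), props.get("place")))
    return summary


# Hypothetical sample shaped like the feed; the real data would come from
# requests.get(url).json()
sample = {
    "features": [
        {"properties": {"mag": 1.2, "place": "10 km N of Somewhere"}},
        {"properties": {"mag": 4.5, "place": "Offshore Example"}},
    ]
}

for mag, place in summarize_features(sample):
    print(f"M{mag}: {place}")
```

Using .get() throughout means a feature without the expected keys yields None rather than stopping the loop with a KeyError.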
I am currently trying to convert an XML document with approximately 2k records to JSON to upload to MongoDB.
I have written a Python script for the conversion, but when I upload the result, MongoDB reads the collection as one document with 2k sub-arrays (objects), whereas I am trying to get 2k documents. My guess is that the Python code is the cause. Can anyone help?
# Program to convert an xml
# file to json file
# import json module and xmltodict
# module provided by python
import json
import xmltodict
# open the input xml file and read
# data in form of python dictionary
# using xmltodict module
with open("test.xml") as xml_file:
    data_dict = xmltodict.parse(xml_file.read())
    # xml_file.close()
# generate the object using json.dumps()
# corresponding to json data
json_data = json.dumps(data_dict)
# Write the json data to output
# json file
with open("data.json", "w") as json_file:
    json_file.write(json_data)
    # json_file.close()
I am not sure why you would expect an XML-to-JSON converter to automatically split the XML at "record" boundaries. After all, XML doesn't have a built-in concept of "records" - that's something in the semantics of your vocabulary, not in the syntax of XML.
The easiest way to split an XML file into multiple files is with a simple XSLT 2.0+ stylesheet. If you use XSLT 3.0 then you can invoke the JSON conversion at the same time.
Here is my solution.
import xmltodict
import json
import pprint
# Open xml file
with open(r"test.xml", "rb") as xml_file:
    # data_dict = xmltodict.parse(xml_file.read())
    dict_data = xmltodict.parse(xml_file)
output_data = dict_data["root"]["course_listing"]
json_data = json.dumps(output_data, indent=2)
print(json_data)
with open("datanew.json", "w") as json_file:
    json_file.write(json_data)
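To make the "one document per record" idea concrete, here is a rough sketch that pulls each record out of the XML and emits one JSON object per line. It swaps xmltodict for the standard library's xml.etree.ElementTree so that it is self-contained, and it assumes flat records under a root/course_listing layout like the one above. The sample XML and the records_from_xml helper are hypothetical:

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical XML with the root/course_listing layout used above
xml_text = """
<root>
  <course_listing><title>Math</title><credits>3</credits></course_listing>
  <course_listing><title>Art</title><credits>2</credits></course_listing>
</root>
""".strip()


def records_from_xml(text, record_tag="course_listing"):
    """Turn each <record_tag> element into a flat dict of child tag -> text."""
    root = ET.fromstring(text)
    return [{child.tag: child.text for child in rec} for rec in root.iter(record_tag)]


records = records_from_xml(xml_text)

# One JSON object per line, so each record can become its own document
json_lines = "\n".join(json.dumps(rec) for rec in records)
print(json_lines)
```

The resulting list of dicts could also be handed to a bulk insert directly; writing one object per line is just one option, and as far as I know it is a layout that mongoimport reads as separate documents.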
I am getting a JSON file from a curl request and I want to read a specific value from it.
Suppose that I have a JSON file, like the following one. How can I insert the "result_count" value into a variable?
Currently, after getting the response from curl, I am writing the JSON objects into a txt file like this.
json_response = connect_to_endpoint(url, headers)
f.write(json.dumps(json_response, indent=4, sort_keys=True))
Your json_response isn't JSON content (JSON is a formatted string) but a Python dict, so you can access its values by key:
res_count = json_response['meta']['result_count']
Use the json module from the python standard library.
data itself is just a python dictionary, and can be accessed as such.
import json
with open('path/to/file/filename.json') as f:
    data = json.load(f)
result_count = data['meta']['result_count']
You can parse a JSON string using the json.loads() method in the json module. This assumes connect_to_endpoint returns the raw JSON text; if it already returns a dict, skip this step.
response = connect_to_endpoint(url, headers)
json_response = json.loads(response)
After that you can extract an element by giving its key names in brackets:
result_count = json_response['meta']['result_count']
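For a version that runs on its own, the sketch below parses a JSON string and reads the nested value. The response body here is invented for illustration; only the meta/result_count keys come from the question:

```python
import json

# Hypothetical response body; the real one would come from the curl request
raw = '{"data": [{"id": "1"}, {"id": "2"}], "meta": {"result_count": 2}}'

parsed = json.loads(raw)  # str -> dict
result_count = parsed["meta"]["result_count"]

# .get() with defaults avoids a KeyError when "meta" is absent
safe_count = parsed.get("meta", {}).get("result_count", 0)

print(result_count, safe_count)
```

The chained .get() form is worth considering if the endpoint can return a body without a "meta" section.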
I am trying to read a JSON file (BioRelEx dataset: https://github.com/YerevaNN/BioRelEx/releases/tag/1.0alpha7) in Python. The JSON file is a list of objects, one per sentence.
This is how I try to do it:
def _read(self, file_path):
    with open(cached_path(file_path), "r") as data_file:
        for line in data_file.readlines():
            if not line:
                continue
            items = json.loads(line)
            text = items["text"]
            label = items.get("label")
My code is failing on items = json.loads(line). It looks like the data is not formatted as the code expects it to be, but how can I change it?
Thanks in advance for your time!
Best,
Julia
With json.load() you don't need to read each line, you can do either of these:
import json
def open_json(path):
    with open(path, 'r') as file:
        return json.load(file)
data = open_json('./1.0alpha7.dev.json')
Or, even cooler, you can GET request the json from GitHub
import json
import requests
url = 'https://github.com/YerevaNN/BioRelEx/releases/download/1.0alpha7/1.0alpha7.dev.json'
response = requests.get(url)
data = response.json()
These will both give the same output. The data variable will be a list of dictionaries that you can iterate over in a for loop for further processing.
Your code is reading one line at a time and parsing each line individually as JSON. Unless the creator of the file created the file in this format (which given it has a .json extension is unlikely) then that won't work, as JSON does not use line breaks to indicate end of an object.
Load the whole file content as JSON instead, then process the resulting items in the array.
def _read(self, file_path):
    with open(cached_path(file_path), "r") as data_file:
        data = json.load(data_file)
    for item in data:
        text = item["text"]
        # label appears to be buried in item["interaction"]
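The difference between a single JSON document and a one-object-per-line file can be sketched as below. The parse_json_or_jsonl helper and the sample strings are hypothetical, and the fallback is only a heuristic. It also skips blank lines with line.strip(), since a check like "if not line" never fires for lines that still carry their trailing newline:

```python
import json


def parse_json_or_jsonl(text):
    """Parse text as one JSON document, falling back to one document per line."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Fall back to JSON Lines: skip blank lines, parse the rest
        return [json.loads(line) for line in text.splitlines() if line.strip()]


array_text = '[{"text": "a"}, {"text": "b"}]'
lines_text = '{"text": "a"}\n\n{"text": "b"}\n'

print(parse_json_or_jsonl(array_text))
print(parse_json_or_jsonl(lines_text))
```

One caveat: a JSON Lines file containing a single object parses successfully on the first attempt and comes back as a dict rather than a one-element list.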
I have an API endpoint to which I am uploading data using Python. The endpoint accepts
putHeaders = {
    'Authorization': user,
    'Content-Type': 'application/octet-stream' }
My current code does this:
1. Save a dictionary as a CSV file
2. Encode the CSV as UTF-8
dataFile = (open(fileData['name'], 'r').read()).encode('utf-8')
3. Upload the file to the API endpoint
fileUpload = requests.put(url,
                          headers=putHeaders,
                          data=(dataFile))
What I am trying to achieve is loading the data without saving it to a file.
So far I have tried converting my dictionary to bytes using
data = json.dumps(payload).encode('utf-8')
and uploading that to the API endpoint. This works, but the output at the API endpoint is not correct.
Question
Does anyone know how to upload CSV-type data without actually saving the file?
EDIT: use io.StringIO() as your file-like object when you're writing your dict to CSV. Then call getvalue() and pass the result as the data parameter to requests.put().
See this question for more details: How do I write data into CSV format as string (not file)?.
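A minimal sketch of that in-memory approach, assuming the payload is a single flat dict (the dict_to_csv_bytes name and the sample values are made up for illustration):

```python
import csv
import io


def dict_to_csv_bytes(row):
    """Serialize one dict as a header line plus one data line, UTF-8 encoded."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=list(row))
    writer.writeheader()
    writer.writerow(row)
    # getvalue() returns everything written so far, regardless of position
    return buffer.getvalue().encode("utf-8")


payload = dict_to_csv_bytes({"col1": 1, "col2": 2})
print(payload)
```

The resulting bytes can then be passed as the data argument of the upload call, so nothing is written to disk.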
Old answer:
If your dict is this:
my_dict = {'col1': 1, 'col2': 2}
then you could convert it to a csv format like so:
csv_data = ','.join(my_dict.keys()) + '\n'  # header row
csv_data += ','.join(str(v) for v in my_dict.values())  # values must be strings to join
csv_data = csv_data.encode('utf8')
And then do your requests.put() call with data=csv_data.
Updated answer
I hadn't realized your input was a dictionary, you had mentioned the dictionary was being saved as a file. I assumed the dictionary lookup in your code was referencing a file. More work needs to be done if you want to go from a dict to a CSV file-like object.
Based on the I/O from your question, it appears that your input dictionary has this structure:
file_data = {"name": {"Col1": 1, "Col2": 2}}
Given that, I'd suggest trying the following using csv and io:
import csv
import io
import requests
session = requests.Session()
session.headers.update(
{"Authorization": user, "Content-Type": "application/octet-stream"}
)
file_data = {"name": {"Col1": 1, "Col2": 2}}
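with io.StringIO(newline="") as f:
    name = file_data["name"]
    writer = csv.DictWriter(f, fieldnames=list(name))
    writer.writeheader()
    writer.writerows([name])  # `name` is a dict but writerows expects a list of dicts
    # The stream position is now at the end, so send the buffered text itself
    response = session.put(url, data=f.getvalue().encode("utf-8"))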
You may want to test using the correct MIME type passed in the request header. While the endpoint may not care, it's best practice to use the correct type for the data. CSV should be text/csv. Python also provides a MIME types module:
>>> import mimetypes
>>>
>>> mimetypes.types_map[".csv"]
'text/csv'
Original answer
Just open the file in bytes mode rather than worrying about encoding or reading it into memory.
Additionally, use a context manager to handle the file rather than assigning to a variable, and pass your header to a Session object so you don't have to repeatedly pass header data in your request calls.
Documentation on the PUT method:
https://requests.readthedocs.io/en/master/api/#requests.put
data – (optional) Dictionary, list of tuples, bytes, or file-like object to send in the body of the Request.
import requests
session = requests.Session()
session.headers.update(
{"Authorization": user, "Content-Type": "application/octet-stream"}
)
with open(file_data["name"], "rb") as f:
    response = session.put(url, data=f)
Note: I modified your code to more closely follow python style guides.
I have this script, which extracts the JSON objects from a webpage. The JSON objects are converted into dictionaries. Now I need to write those dictionaries to a file. Here's my code:
#!/usr/bin/python
import requests
r = requests.get('https://github.com/timeline.json')
for item in r.json() or []:
    print(item['repository']['name'])
There are ten lines in the file. I need to write the dictionaries to that file, which consists of ten lines. How do I do that? Thanks.
To address the original question, something like:
with open("pathtomyfile", "w") as f:
    for item in r.json() or []:
        try:
            f.write(item['repository']['name'] + "\n")
        except KeyError:  # you might have to adjust what you are writing accordingly
            pass  # or something else
Note that not every item will be a repository; there are also gist events (and possibly others).
Better would be to just save the JSON to a file.
#!/usr/bin/python
import json
import requests
r = requests.get('https://github.com/timeline.json')
with open("yourfilepath.json", "w") as f:
    f.write(json.dumps(r.json()))
Then you can open it:
with open("yourfilepath.json", "r") as f:
    obj = json.loads(f.read())
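The save-then-reload round trip can also be written with json.dump() and json.load(), which work on the file object directly. This is a small sketch with made-up data standing in for the parsed response and a temporary directory in place of a real path:

```python
import json
import os
import tempfile

# Hypothetical data standing in for the parsed response
events = [{"repository": {"name": "demo"}}, {"type": "GistEvent"}]

path = os.path.join(tempfile.mkdtemp(), "events.json")

with open(path, "w", encoding="utf-8") as f:
    json.dump(events, f, indent=2)  # serialize straight to the file

with open(path, "r", encoding="utf-8") as f:
    restored = json.load(f)  # parse straight from the file

print(restored == events)
```

Keeping the whole structure in the file, rather than only the repository names, means nothing has to be decided up front about which fields matter.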