Turning Python Object into a JSON string - python

I have been working on this bot that takes the twitter api and brings it into my database. Before I was grabbing 1 tweet at a time which wasn't efficient considering I was using 1 request out of the limit they had. So instead I decided to grab 150 results. I get these results back:
[Status(ID=780587171757625344, ScreenName=Ampsx, Created=Tue Sep 27 01:57:39 +0000 2016, Text='You know who you are #memes').
I get about 150 of these. Is there a library where I can turn this into JSON?

If you're using 2.6+ there's a bundled library you can use (docs), just:
import json
json_string = json.dumps(object)
We use this a lot for quick API endpoints, you just need to be careful about having functions or complex nesting in the objects you're trying to serialize, it's quite configurable (so you can skip fields, customize output of some, etc.) but can get messy pretty quickly.

Yes, a quick Google search would have revealed the json module.
import json
# instead of an empty list, create a list of dict objects
# representing the statues as you'd like to see them in JSON.
statuses = { 'statuses': [] }
with open('file.json', 'w') as f:
f.write(json.dumps(statuses).encode('utf-8'))

import json
jsondata = json.dumps(TwitterStatusObject.__dict__)

Related

Is there a way to search for a string and copy text in front until it reaches a comma?

I am new to python and wanted to store the recentAveragePrice inside a variable (from a string like this one)
{"assetStock":null,"sales":250694,"numberRemaining":null,"recentAveragePrice":731,"originalPrice":null,"priceDataPoints":[{"value":661,"date":"2022-08-11T05:00:00Z"},{"value":592,"date":"2022-08-10T05:00:00Z"},{"value":443,"date":"2022-08-09T05:00:00Z"}],"volumeDataPoints":[{"value":155,"date":"2022-08-11T05:00:00Z"},{"value":4595,"date":"2022-08-10T05:00:00Z"},{"value":12675,"date":"2022-08-09T05:00:00Z"},{"value":22179,"date":"2022-08-08T05:00:00Z"},{"value":15181,"date":"2022-08-07T05:00:00Z"},{"value":14541,"date":"2022-08-06T05:00:00Z"},{"value":15310,"date":"2022-08-05T05:00:00Z"},{"value":14146,"date":"2022-08-04T05:00:00Z"},{"value":13083,"date":"2022-08-03T05:00:00Z"},{"value":14460,"date":"2022-08-02T05:00:00Z"},{"value":16809,"date":"2022-08-01T05:00:00Z"},{"value":17571,"date":"2022-07-31T05:00:00Z"},{"value":23907,"date":"2022-07-30T05:00:00Z"},{"value":39007,"date":"2022-07-29T05:00:00Z"},{"value":38823,"date":"2022-07-28T05:00:00Z"}]}
My current solution is this:
var = sampleStr[78] + sampleStr[79] + sampleStr[80]
It works for the current string but if the recentAveragePrice was above 999 it would stop working and i was wondering if instead of getting a fixed number i could search for it inside the string.
Your replit code shows that you're acquiring JSON data from some website. Here's an example based on the URL that you're using. It shows how you check the response status, acquire the JSON data as a Python dictionary then print a value associated with a particular key. If the key is missing, it will print None:
import requests
(r := requests.get('https://economy.roblox.com/v1/assets/10159617728/resale-data')).raise_for_status()
jdata = r.json()
print(jdata.get('recentAveragePrice'))
Output:
640
Since this is json you should just be able to parse it and access recentAveragePrice:
import json
sample_string = '''{"assetStock":null,"sales":250694,"numberRemaining":null,"recentAveragePrice":731,"originalPrice":null,"priceDataPoints":[{"value":661,"date":"2022-08-11T05:00:00Z"},{"value":592,"date":"2022-08-10T05:00:00Z"},{"value":443,"date":"2022-08-09T05:00:00Z"}],"volumeDataPoints":[{"value":155,"date":"2022-08-11T05:00:00Z"},{"value":4595,"date":"2022-08-10T05:00:00Z"},{"value":12675,"date":"2022-08-09T05:00:00Z"},{"value":22179,"date":"2022-08-08T05:00:00Z"},{"value":15181,"date":"2022-08-07T05:00:00Z"},{"value":14541,"date":"2022-08-06T05:00:00Z"},{"value":15310,"date":"2022-08-05T05:00:00Z"},{"value":14146,"date":"2022-08-04T05:00:00Z"},{"value":13083,"date":"2022-08-03T05:00:00Z"},{"value":14460,"date":"2022-08-02T05:00:00Z"},{"value":16809,"date":"2022-08-01T05:00:00Z"},{"value":17571,"date":"2022-07-31T05:00:00Z"},{"value":23907,"date":"2022-07-30T05:00:00Z"},{"value":39007,"date":"2022-07-29T05:00:00Z"},{"value":38823,"date":"2022-07-28T05:00:00Z"}]}'''
data = json.loads(sample_string)
recent_price = data['recentAveragePrice']
print(recent_price)
outputs:
731
Your data is in a popular format called JSON (JavaScript Object Notation). It's commonly used to exchange data between different systems like a server and a client, or a Python program and JavaScript program.
Now Python doesn't use JSON per-se, but it has a data type called a dictionary that behaves very similarly to JSON. You can access elements of a dictionary as simply as:
print(my_dictionary["recentAveragePrice"])
Python has a built-in library meant specifically to handle JSON data, and it includes a function called loads() that can convert a string into a Python dictionary. We'll use that.
Finally, putting all that together, here is a more robust program to help parse your string and pick out the data you need. Dictionaries can do a lot more cool stuff, so make sure you take a look at the links above.
# import the JSON library
# specifically, we import the `loads()` function, which will convert a JSON string into a Python object
from json import loads
# let's store your string in a variable
original_string = """
{"assetStock":null,"sales":250694,"numberRemaining":null,"recentAveragePrice":731,"originalPrice":null,"priceDataPoints":[{"value":661,"date":"2022-08-11T05:00:00Z"},{"value":592,"date":"2022-08-10T05:00:00Z"},{"value":443,"date":"2022-08-09T05:00:00Z"}],"volumeDataPoints":[{"value":155,"date":"2022-08-11T05:00:00Z"},{"value":4595,"date":"2022-08-10T05:00:00Z"},{"value":12675,"date":"2022-08-09T05:00:00Z"},{"value":22179,"date":"2022-08-08T05:00:00Z"},{"value":15181,"date":"2022-08-07T05:00:00Z"},{"value":14541,"date":"2022-08-06T05:00:00Z"},{"value":15310,"date":"2022-08-05T05:00:00Z"},{"value":14146,"date":"2022-08-04T05:00:00Z"},{"value":13083,"date":"2022-08-03T05:00:00Z"},{"value":14460,"date":"2022-08-02T05:00:00Z"},{"value":16809,"date":"2022-08-01T05:00:00Z"},{"value":17571,"date":"2022-07-31T05:00:00Z"},{"value":23907,"date":"2022-07-30T05:00:00Z"},{"value":39007,"date":"2022-07-29T05:00:00Z"},{"value":38823,"date":"2022-07-28T05:00:00Z"}]}
"""
# convert the string into a dictionary object
dictionary_object = loads(original_string)
# access the element you need
print(dictionary_object["recentAveragePrice"])
Output upon running this program:
$ python exp.py
731

How to extract only wanted property from JSON object

When I run the code:
import requests
import json
def get_fact():
catFact = requests.get("https://catfact.ninja/fact?max_length=140")
json_data = json.loads(catFact.text)
return json_data
print(get_fact())
The output is like
{'fact': "Cats are the world's most popular pets, outnumbering dogs by as many as three to one", 'length': 84}
However I just want the fact.
How do I get rid of the 'fact:' at the front and 'length:' at the back?
What you want is to access the key in the python dict you made with the json.loads call. We actually don't need the json library as requests can read and deserialize JSON itself.
This code also checks if the response was OK and fails with informative error message. It follows PEP 20 – The Zen of Python.
import requests
def get_fact():
# Get the facts dictionary in a JSON serialized form.
cat_fact_response = requests.get("https://catfact.ninja/fact?max_length=140")
# Let the response raise the exception if something bad happened to the cat facts server connection.
cat_fact_response.raise_for_status()
# Deserialize the json (make a Python dict from the text we got). requests can do that on it's own:
cat_fact_dict = cat_fact_response.json()
# Access the fact from the json from the dictionary
return cat_fact_dict['fact']
print(get_fact())
When called you get following output as wanted:
# python3 script.py
The cat's tail is used to maintain balance.
Short answer:
you need to use either get_fact()['fact'] or get_fact().get('fact'). The former will throw an exception if fact doesn't exist whereas the latter will return None.
Why:
In your code sample you fetch some json data, and then print out the entire bit of json. When you parse json, the output is a key/value map called a dictionary (or map or object in other languages). The dictionary in this case contains two keys: fact and length. If you only one want of the values, then you need to tell python that you want only a single value -- fact in this case.
Remember though: this wouldn't apply to every json object you read. Not every one is going to have a fact key.
What you are returning in get_fact is a complete JSON object which you are then printing.
To get just its property fact (without the length) use a reference to that key or property like:
return json_data["fact"]
Below is also a link to a tutorial on using JSON in Python:
w3schools: Python JSON
To extract fact field from the response, use:
import requests
import json
def get_fact():
catFact = requests.get("https://catfact.ninja/fact?max_length=140")
json_data = json.loads(catFact.text)
return json_data['fact'] # <- HERE
print(get_fact())
Output:
Cats have "nine lives" thanks to a flexible spine and powerful leg and back muscles
Note: you don't need json module here, use json() method of Response instance returned by requests:
import requests
def get_fact():
catFact = requests.get("https://catfact.ninja/fact?max_length=140").json()
return catFact['fact']
print(get_fact())

Check if JSON var has nullable key (Twitter Streaming API)

I'm downloading tweets from Twitter Streaming API using Tweepy. I manage to check if downloaded data has keys as 'extended_tweet', but I'm struggling with an specific key inside another key.
def on_data(self, data):
savingTweet = {}
if not "retweeted_status" in data:
dataJson = json.loads(data)
if 'extended_tweet' in dataJson:
savingTweet['text'] = dataJson['extended_tweet']['full_text']
else:
savingTweet['text'] = dataJson['text']
if 'coordinates' in dataJson:
if 'coordinates' in dataJson['coordinates']:
savingTweet['coordinates'] = dataJson['coordinates']['coordinates']
else:
savingTweet['coordinates'] = 'null'
I'm checking 'extended_key' propertly, but when I try to do the same with ['coordinates]['coordinates] I get the following error:
TypeError: argument of type 'NoneType' is not iterable
Twitter documentation says that key 'coordinates' has the following structure:
"coordinates":
{
"coordinates":
[
-75.14310264,
40.05701649
],
"type":"Point"
}
I achieved to solve it by just putting the conflictive check in a try, except, but I think this is not the most suitable approach to the problem. Any other idea?
So the twitter API docs are probably lying a bit about what they return (shock horror!) and it looks like you're getting a None in place of the expected data structure. You've already decided against using try, catch, so I won't go over that, but here are a few other suggestions.
Using dict get() default
There are a couple of options that occur to me, the first is to make use of the default ability of the dict get command. You can provide a fall back if the expected key does not exist, which allows you to chain together multiple calls.
For example you can achieve most of what you are trying to do with the following:
return {
'text': data.get('extended_tweet', {}).get('full_text', data['text']),
'coordinates': data.get('coordinates', {}).get('coordinates', 'null')
}
It's not super pretty, but it does work. It's likely to be a little slower that what you are doing too.
Using JSONPath
Another option, which is likely overkill for this situation is to use a JSONPath library which will allow you to search within data structures for items matching a query. Something like:
from jsonpath_rw import parse
matches = parse('extended_tweet.full_text').find(data)
if matches:
print(matches[0].value)
This is going to be a lot slower that what you are doing, and for just a few fields is overkill, but if you are doing a lot of this kind of work it could be a handy tool in the box. JSONPath can also express much more complicated paths, or very deeply nested paths where the get method might not work, or would be unweildy.
Parse the JSON first!
The last thing I would mention is to make sure you parse your JSON before you do your test for "retweeted_status". If the text appears anywhere (say inside the text of a tweet) this test will trigger.
JSON parsing with a competent library is usually extremely fast too, so unless you are having real speed problems it's not necessarily worth worrying about.

How to Return Nested Values from Complicated JSON API

I am setting up a weather camera which will provide a live stream of the current conditions outside, but I also would like to overlay continuously updated weather conditions (temperature, wind speed/direction, current weather) from a local National Weather Service weather station, from a browser API source provided in JSON format.
I have had success extracting the desired values from a different API source using a Python script I wrote; however long story short that API source is unreliable. Therefore I am using API from the official National Weather Service ASOS station at my nearby airport. The output from the new API source I am polling from is rather complicated, however, with various tiers of indentation. I have not worked with Python very long and tutorials and guides online have either been for other languages (Java or C++ mostly) or have not worked for my specific case.
First off, here is the structure of the JSON that I am receiving:
I underlined the values I am trying to extract. They are listed under the OBSERVATIONS section, associated with precip_accum_24_hour_value_1, wind_gust_value_1, wind_cardinal_direction_value_1d, and so on. The issue is there are two values underneath each observation so the script I have tried isn't returning the values I want. Here is the code I have tried:
import urllib.request
import json
f = urllib.request.urlopen('https://api.synopticdata.com/v2/stations/latest?token=8c96805fbf854373bc4b492bb3439a67&stid=KSTC&complete=1&units=english&output=json')
json_string = f.read()
parsed_json = json.loads(json_string)
for each in parsed_json['STATION']:
observations = each['OBSERVATIONS']
print(observations)
This prints out everything underneath the OBSERVATIONS in the JSON as expected, as one long string.
{'precip_accum_24_hour_value_1': {'date_time': '2018-12-06T11:53:00Z', 'value': 0.01}, 'wind_gust_value_1': {'date_time': '2018-12-12T01:35:00Z', 'value': 14.0},
to show a small snippet of the output I am receiving. I was hoping I could individually extract the values I want from this string, but everything I have attempted is not working. I would really appreciate some guidance for finishing this piece of code so I can return the values I am looking for. I realize it may be some kind of loop or special syntax.
Try something like this:
for each in parsed_json['STATION']:
observations = each['OBSERVATIONS']
for k, v in observations.items():
print(k, v["value"])
JSON maps well into python's dictionary and list types, so accessing substructures can be done with a[<index-or-key>] syntax. Iteration over key-value pairs of a dictionary can be done as I've shown above. If you're not familiar with dictionaries in python yet, I'd recommend reading about them. Searching online should yield a lot of good tutorials.
Does this help?
When you say the JSON is complicated, it really is just nested dictionaries within the main JSON response. You would access them in the same way as you would the initial JSON blob:
import urllib.request
import json
f = urllib.request.urlopen('https://api.synopticdata.com/v2/stations/latest?token=8c96805fbf854373bc4b492bb3439a67&stid=KSTC&complete=1&units=english&output=json')
json_string = f.read()
parsed_json = json.loads(json_string)
for each in parsed_json['STATION']:
for value in each:
print(value, each[value])

How can I ensure that my Python regular expression outputs a dictionary?

I'm using Beej's Python Flickr API to ask Flickr for JSON. The unparsed string Flickr returns looks like this:
jsonFlickrApi({'photos': 'example'})
I want to access the returned data as a dictionary, so I have:
photos = "jsonFlickrApi({'photos': 'test'})"
# to match {'photos': 'example'}
response_parser = re.compile(r'jsonFlickrApi\((.*?)\)$')
parsed_photos = response_parser.findall(photos)
However, parsed_photos is a list, not a dictionary (according to type(parsed_photos). It outputs like:
["{'photos': 'test'}"]
How can I ensure that my parsed data ends up as a dictionary type?
If you're using Python 2.6, you can just use the JSON module to parse JSON stuff.
import json
json.loads(dictString)
If you're using an earlier version of Python, you can download the simplejson module and use that.
Example:
>>> json.loads('{"hello" : 4}')
{u'hello': 4}
You need to use a JSON parser to convert the string representation to actual Python data structure. Take a look at the documentation of the json module in the standard library for some examples.
In other words you'd have to add the following line at the end of your code
photos = json.loads(parsed_photos[0])
PS. In theory you could also use eval to achieve the same effect, as JSON is (almost) compatible with Python literals, but doing that would open a huge security hole. Just to let you know.

Categories