Python - Count JSON elements before extracting data

I use an API which gives me a JSON file structured like this:
{
offset: 0,
results: [
{
source_link: "http://www.example.com/1",
source_link/_title: "Title example 1",
source_link/_source: "/1",
source_link/_text: "Title example 1"
},
{
source_link: "http://www.example.com/2",
source_link/_title: "Title example 2",
source_link/_source: "/2",
source_link/_text: "Title example 2"
},
...
And I use this code in Python to extract the data I need:
import json
import urllib2
u = urllib2.urlopen('myapiurl')
z = json.load(u)
u.close()
link = z['results'][1]['source_link']
title = z['results'][1]['source_link/_title']
The problem is that to use it I have to know the index of the element I'm extracting the data from. My results can have a different length every time, so I want to count the number of elements in results first, and then set up a loop to extract data from each element.

To check the length of the results key:
len(z["results"])
But if you're just looping over them, a for loop is perfect:
for result in z["results"]:
    print(result["source_link"])

You don't need to know the length of the results; a for loop is enough:
for result in z['results']:
    # process the results here
    pass
Anyway, if you want to know the length of 'results': len(z['results'])

If you want to get the length, you can try:
len(z['results'])
But in Python, what we usually do is:
for i in z['results']:
    # do whatever you like with `i`
    pass
Hope this helps.

You don't need, and likely don't want, to count them in order to loop over them. You could do:
import json
import urllib2
u = urllib2.urlopen('myapiurl')
z = json.load(u)
u.close()
for result in z['results']:
    link = result['source_link']
    title = result['source_link/_title']
    # do something with link/title
Or you could do:
u = urllib2.urlopen('myapiurl')
z = json.load(u)
u.close()
link = [result['source_link'] for result in z['results']]
title = [result['source_link/_title'] for result in z['results']]
# do something with links/titles lists

A few pointers:
There is no need to know the length of results to iterate over it. You can use for result in z['results'].
List indices start from 0.
If you do need the index, take a look at enumerate.
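The pointers above can be sketched as follows. The sample document is a hypothetical stand-in for the API response (it only mirrors the structure shown in the question), so nothing is fetched over the network:

```python
import json

# Hypothetical sample mirroring the structure shown in the question.
raw = """
{
  "offset": 0,
  "results": [
    {"source_link": "http://www.example.com/1", "source_link/_title": "Title example 1"},
    {"source_link": "http://www.example.com/2", "source_link/_title": "Title example 2"}
  ]
}
"""

z = json.loads(raw)

# len() gives the number of elements, if you really need it.
count = len(z["results"])

# enumerate() yields (index, element) pairs; indices start from 0.
pairs = []
for index, result in enumerate(z["results"]):
    pairs.append((index, result["source_link"], result["source_link/_title"]))
    print(index, result["source_link"], result["source_link/_title"])
```

The loop works for any number of results, including an empty list, without computing the length first.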

Use this to print the number of results to the terminal:
print(len(z['results']))

Related

Converting nested dict into a single unchanging unique number

I'm working on a minimax algorithm project and I am trying to find a way to save board values in a text file so they don't need to be calculated over and over again each time the program is tested. I have the board stored as a nested dictionary.
rows = {
4:{1:0,2:0,3:0,4:0,5:0,6:0,7:0,8:0},
3:{1:0,2:0,3:0,4:0,5:0,6:0,7:0,8:0},
2:{1:0,2:0,3:0,4:0,5:0,6:0,7:0,8:0},
1:{1:0,2:0,3:0,4:0,5:0,6:0,7:0,8:0},
}
I tried doing this, which gives the desired result but is not at all optimized and I'm sure there is a way to do this better.
rows = {
4:{1:0,2:0,3:0,4:0,5:0,6:0,7:0,8:0},
3:{1:0,2:0,3:0,4:0,5:0,6:0,7:0,8:0},
2:{1:0,2:0,3:0,4:0,5:0,6:0,7:0,8:0},
1:{1:0,2:0,3:0,4:0,5:0,6:0,7:0,8:0},
}
e = []
for key in rows:
    e.append(list(rows[key].values()))
e = str(e)
e = e.replace("[", "")
e = e.replace("]", "")
e = e.replace(" ", "")
e = e.replace(",", "")
print(e)
You could make use of str.join(); map is used to convert the integers to strings:
res = ''.join(''.join(map(str, r.values())) for r in rows.values())
print(res)
Out:
00000000000000000000000000000000
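Since the goal is to avoid recalculating board values, one way to use such a string is as a lookup key. The sketch below is only an illustration: board_key and cache are hypothetical names, and the rows and columns are sorted so that the result does not depend on the order in which the dictionary was built (this means rows come out in ascending order, unlike the one-liner above). It also assumes every cell value prints as a single character; otherwise two different boards could produce the same string.

```python
def board_key(rows):
    """Flatten a nested board dict into one string of cell values."""
    return ''.join(
        str(rows[r][c])
        for r in sorted(rows)        # row numbers in ascending order
        for c in sorted(rows[r])     # column numbers in ascending order
    )

# Same layout as in the question, with a single non-zero cell for illustration.
rows = {
    4: {1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0},
    3: {1: 0, 2: 0, 3: 1, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0},
    2: {1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0},
    1: {1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0},
}

cache = {}                  # hypothetical store of already computed values
key = board_key(rows)
if key not in cache:
    cache[key] = 0          # placeholder for the real minimax evaluation
print(key)
```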

Extract a value from a JSON file (Python)

Hi, I'm not an expert, and this problem has kept me stuck for a long time. I hope someone here can help me.
I would like to extract the value "interestExpense" from the following JSON file:
{'incomeBeforeTax': 17780000000,
'minorityInterest': 103000000,
'netIncome': 17937000000,
'sellingGeneralAdministrative': 5918000000,
'grossProfit': 16507000000,
'ebit': 10589000000,
'endDate': 1640908800,
'operatingIncome': 10589000000,
'interestExpense': -1803000000,
'incomeTaxExpense': -130000000,
'totalRevenue': 136341000000,
'totalOperatingExpenses': 125752000000,
'costOfRevenue': 119834000000,
'totalOtherIncomeExpenseNet': 7191000000,
'netIncomeFromContinuingOps': 17910000000,
'netIncomeApplicableToCommonShares': 17937000000}
In this case the result should be -130000000 as a string, but I'm trying to find a way to create a list (or an array) with all those values so that I can decide which one to pick. I have no idea how to manipulate this kind of data (JSON).
For example,
print(list[0])
should return 17780000000 (the value associated with incomeBeforeTax).
Is this actually possible?
The output is generated from this code:
annual_is_stms=[]
url_financials ='https://finance.yahoo.com/quote/{}/financials?p{}'
stock= 'F'
response = requests.get(url_financials.format(stock,stock),headers=headers)
soup = BeautifulSoup(response.text,'html.parser')
pattern = re.compile(r'\s--\sData\s--\s')
script_data = soup.find('script',text=pattern).contents[0]
script_data[:500]
script_data[-500:]
start = script_data.find("context")-2
json_data =json.loads(script_data[start:-12])
json_data['context']['dispatcher']['stores']['QuoteSummaryStore'].keys()
#all data relative financials
annual_is=json_data['context']['dispatcher']['stores']['QuoteSummaryStore']['incomeStatementHistory']['incomeStatementHistory']
for s in annual_is:
    statement = {}
    for key, val in s.items():
        try:
            statement[key] = val['raw']
        except TypeError:
            continue
        except KeyError:
            continue
    annual_is_stms.append(statement)
print(annual_is_stms[0])
If you are using Python, you need to import the json module and parse the JSON text into an object:
import json
# some JSON:
x = '{ "name":"John", "age":30, "city":"New York"}'
# parse x:
y = json.loads(x)
# the result is a Python dictionary:
print(y["age"])
Regards
L.
OK, so the output snippet you posted comes from this line:
print(annual_is_stms[0])
If you want the interestExpense value, -1803000000, use:
print(annual_is_stms[0]['interestExpense'])
If you want -130000000 (incomeTaxExpense), use:
print(annual_is_stms[0]['incomeTaxExpense'])
And if you want 17780000000 (incomeBeforeTax), use:
print(annual_is_stms[0]['incomeBeforeTax'])
Copy and paste this into Python.
data = {'incomeBeforeTax': 17780000000,
'minorityInterest': 103000000,
'netIncome': 17937000000,
'sellingGeneralAdministrative': 5918000000,
'grossProfit': 16507000000,
'ebit': 10589000000,
'endDate': 1640908800,
'operatingIncome': 10589000000,
'interestExpense': -1803000000,
'incomeTaxExpense': -130000000,
'totalRevenue': 136341000000,
'totalOperatingExpenses': 125752000000,
'costOfRevenue': 119834000000,
'totalOtherIncomeExpenseNet': 7191000000,
'netIncomeFromContinuingOps': 17910000000,
'netIncomeApplicableToCommonShares': 17937000000}
print(data['interestExpense'])
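On the part of the question about picking values by position: a dictionary keeps its insertion order in Python 3.7 and later, so its values can be turned into a list. A minimal sketch, using a shortened copy of the data above (values is a name chosen for this example, to avoid shadowing the built-in list):

```python
# Shortened copy of the dictionary from the question.
data = {'incomeBeforeTax': 17780000000,
        'minorityInterest': 103000000,
        'interestExpense': -1803000000,
        'incomeTaxExpense': -130000000}

# All values, in the same order as the keys appear in the dictionary.
values = list(data.values())
first = values[0]

# A single value looked up by key and converted to a string.
interest_as_text = str(data['interestExpense'])

print(first)
print(interest_as_text)
```

Looking values up by key is usually more robust than by position, because the position depends on the order of the keys in the source data.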

How to store multiple python dictionaries or merge in a loop and return?

I am trying to get the paginated results of two pages, but return exits the loop and only the result from one page is displayed.
Is there a way to store or merge them?
def incidents():
    m = True
    limit = 50
    offset = 0
    while m == True:
        url = f"{URL}/incidents"
        params = {
            "statuses[]": "resolved",
            "include[]" : 'channel',
            "limit" : limit,
            "offset" : offset,
            "total" : "true",
        }
        r = requests.get(url, headers=headers, params=params)
        data = r.json()
        offset += 50
        print(offset, (r.text))
        more = False # Set deliberately for easier understanding
    return data
The output of print(offset, (r.text)) looks like this:
50 {"incidents":[{"incident_number":1,"title":"crit server is on fire" ....
100 {"incidents":[{"incident_number":54,"title":"ghdg","description":"ghdg",....
The return only gives the page below and not the other one. Is there a way, for example with a generator, to merge both pages and store them in the data variable so that it can be returned?
100 {"incidents":[{"incident_number":54,....
I believe you could store the results in your own list:
incidents = []
and then
data = r.json()
for element in data['incidents']:
    incidents.append(element)
Edited for clarity - that way you're gathering all incidents in a single object.
I'm not sure, because you only gave the very start of r.text (is there more than 'incidents' in the result?), but I expect the previous answer to be a bit short. I'd suggest something like
results = []
(before the while) and at the end
data = r.json()
results += data['incidents']  # inside the loop
return results  # after the loop
(By the way: in your original post, each pass through the while loop just overwrites the variable "data", so it's no wonder the return can only deal with the last part retrieved. But I guess that is just an artifact of your simplification, just as the "more = False" would prevent getting a second page at all.)
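Putting the suggestions above together, the whole loop could be sketched like this. The stopping rule is an assumption (stop when a page comes back with fewer than limit incidents), since the question does not show how the API signals the last page. fetch_page is a hypothetical stand-in for the requests.get(...).json() call, so the example runs without network access.

```python
def collect_incidents(fetch_page, limit=50):
    """Gather the 'incidents' lists of all pages into one list."""
    results = []
    offset = 0
    while True:
        data = fetch_page(limit, offset)     # one page, already parsed from JSON
        page = data.get("incidents", [])
        results += page                      # accumulate instead of overwriting
        if len(page) < limit:                # assumed end-of-data condition
            break
        offset += limit
    return results

# Fake API for demonstration: 120 incidents served page by page.
FAKE_INCIDENTS = [{"incident_number": n} for n in range(1, 121)]

def fake_fetch(limit, offset):
    return {"incidents": FAKE_INCIDENTS[offset:offset + limit]}

all_incidents = collect_incidents(fake_fetch)
print(len(all_incidents))
```

In real use, fetch_page would wrap the request from the question and return r.json().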

For Loop 60 items 10 per 10

I'm working with an API that gives me 61 items, which I include in a Discord embed in a for loop.
As all of this is planned to be included in a Discord bot using pagination from DiscordUtils, I need it to make one embed for every 10 entries, to avoid a message that is too long (over 2000 characters).
Currently, what I use for my loop is here: https://api.nepmia.fr/spc/ (I recommend using a parsing extension for your browser, or it will be a bit hard to read)
But what I want to create is something that looks like this: https://api.nepmia.fr/spc/formated/
So I can iterate over each range in a different embed and then use pagination.
I use TinyDB to generate the JSON files shown above with this script:
import urllib.request, json
from shutil import copyfile
from termcolor import colored
from tinydb import TinyDB, Query
db = TinyDB("/home/nepmia/Myazu/db/db.json")
def api_get():
    print(colored("[Myazu]","cyan"), colored("Fetching WynncraftAPI...", "white"))
    try:
        with urllib.request.urlopen("https://api.wynncraft.com/public_api.php?action=guildStats&command=Spectral%20Cabbage") as u1:
            api_1 = json.loads(u1.read().decode())
            count = 0
            if members := api_1.get("members"):
                print(colored("[Myazu]","cyan"),
                      colored("Got expecteded answer, starting saving process.", "white"))
                for member in members:
                    nick = member.get("name")
                    ur2 = f"https://api.wynncraft.com/v2/player/{nick}/stats"
                    u2 = urllib.request.urlopen(ur2)
                    api_2 = json.loads(u2.read().decode())
                    data = api_2.get("data")
                    for item in data:
                        meta = item.get("meta")
                        playtime = meta.get("playtime")
                        print(colored("[Myazu]","cyan"),
                              colored("Saving playtime for player", "white"),
                              colored(f"{nick}...","green"))
                        db.insert({"username": nick, "playtime": playtime})
                        count += 1
            else:
                print(colored("[Myazu]","cyan"),
                      colored("Unexpected answer from WynncraftAPI [ERROR 1]", "white"))
    except:
        print(colored("[Myazu]","cyan"),
              colored("Unhandled error in saving process [ERROR 2]", "white"))
    finally:
        print(colored("[Myazu]","cyan"),
              colored(f"Finished saving data for", "white"),
              colored(f"{count}", "green"),
              colored("players.", "white"))
but this will only create a range like this : https://api.nepmia.fr/spc/
what I would like is something like this : https://api.nepmia.fr/spc/formated/
Thanks for your help!
PS: Sorry for your eyes I'm still new to Python so I know I don't do stuff really properly :s
To follow up from the comments, you shouldn't store items in your database in a format that is specific to how you want to return results from the database to a different API, as it will make it more difficult to query in other contexts, among other reasons.
If you want to paginate items from a database it's better to do that when you query it.
According to the docs, you can iterate over all documents in a TinyDB database just by iterating directly over the DB like:
for doc in db:
    ...
For any iterable you can use the enumerate function to associate an index with each item like:
for idx, doc in enumerate(db):
    ...
If you want the indices to start with 1 as in your examples you would just use idx + 1.
Finally, to paginate the results, you need some function that can return items from an iterable in fixed-sized batches, such as one of the many solutions on this question or elsewhere. E.g. given a function chunked(iter, size) you could do:
pages = enumerate(chunked(enumerate(db), 10))
Then list(pages) gives a list of tuples, each holding a page number and a list of tuples, like [(page_num, [(player_num, player), ...]), ...].
The only difference between a list of lists and what you want is you seem to want a dictionary structure like
{'range1': {'1': {...}, '2': {...}, ...}, 'range2': {'11': {...}, ...}}
This is no different from a list of lists; the only difference is that you're using dictionary keys to give numerical indices to each item in a collection, rather than the indices being implicit in the list structure. There are many ways to go from a list of lists to this. The easiest, I think, is a (nested) dict comprehension:
{f'range{page_num + 1}': {str(player_num + 1): player for player_num, player in page}
for page_num, page in pages}
This will give output in exactly the format you want.
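The chunked(iter, size) helper is not defined in this answer. One possible implementation, using itertools.islice, is sketched below. A plain list of dictionaries stands in for the TinyDB database, and the player records are made up for the example.

```python
from itertools import islice

def chunked(iterable, size):
    """Yield successive lists of at most `size` items from any iterable."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:        # iterator exhausted
            return
        yield batch

# Made-up records standing in for the documents stored in TinyDB.
players = [{"username": f"player{n}", "playtime": n} for n in range(1, 26)]

pages = enumerate(chunked(enumerate(players), 10))
formatted = {
    f"range{page_num + 1}": {str(player_num + 1): player for player_num, player in page}
    for page_num, page in pages
}
print(list(formatted))
```

Note that pages is a one-shot iterator: calling list(pages) first would leave nothing for the dict comprehension to consume.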
Thanks @Iguananaut for your precious help.
In the end I made something similar to your solution using a generator.
from math import ceil
import discord
def chunker(seq, size):
    # yield consecutive slices of at most `size` items
    for i in range(0, len(seq), size):
        yield seq[i:i+size]
def embed_creator(embeds):
    pages = []
    current_page = None
    for i, chunk in enumerate(chunker(embeds, 10)):
        current_page = discord.Embed(
            title=f'**SPC** Last week online time',
            color=3903947)
        for elt in chunk:
            current_page.add_field(
                name=elt.get("username"),
                value=elt.get("play_output"),
                inline=False)
        current_page.set_footer(
            icon_url="https://cdn.discordapp.com/icons/513160124219523086/a_3dc65aae06b2cf7bddcb3c33d7a5ecef.gif?size=128",
            text=f"{i + 1} / {ceil(len(embeds) / 10)}"
        )
        pages.append(current_page)
        current_page = None
    return pages
Using embed_creator I generate a list named pages that I can simply use with DiscordUtils paginator.

Parsing Multiple json elements in python

I'm trying to build a small script that will go through the Etsy API and retrieve certain information. The API returns 25 different listings, all in JSON, and I would appreciate it if someone could help me learn how to handle them one at a time.
Here is an example of the json I'm dealing with:
{"count":50100,"results":[{"listing_id":114179207,"state":"active"},{"listing_id":11344567,"state":"active"},
and so on.
Is there a simple way to handle only one of these listings at a time to minimize the amount of calls I must make to the API?
Here is some of the code of how I'm dealing with just one when I limit the results returned to 1:
r = requests.get('http://openapi.etsy.com/v2/listings/active?api_key=key&limit=1&offset='+str(offset_param)+'&category=Clothing')
raw_json = r.json()
encoded_json = json.dumps(raw_json)
dataObject = json.loads(encoded_json)
if dataObject["results"][0]["quantity"] > 1:
    if dataObject["results"][0]["listing_id"] not in already_done:
        already_done.append(dataObject["results"][0]["listing_id"])
        s = requests.get('http://openapi.etsy.com/v2/users/'+str(dataObject["results"][0]["user_id"])+'/profile?api_key=key')
        raw_json2 = s.json()
        encoded_json2 = json.dumps(raw_json2)
        dataObject2 = json.loads(encoded_json2)
        t = requests.get('http://openapi.etsy.com/v2/users/'+str(dataObject["results"][0]["user_id"])+'?api_key=key')
        raw_json3 = t.json()
        encoded_json3 = json.dumps(raw_json3)
        dataObject3 = json.loads(encoded_json3)
Since the results field (or key) contains a list, you can simply iterate through it like this:
json_str = { ...other key-values, "results": [{"listing_id":114179207,"state":"active"},{"listing_id":11344567,"state":"active"}, ...and so on] }
results = json_str['results'] # this gives you a list of dicts
# iterate through this list
for result in results:
    if result['state'] == 'active':
        do_something_with(result['listing_id'])
    else:
        do_someotherthing_with(result['listing_id']) # or nothing at all
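As a runnable variant of the loop above: a single response fetched with a larger limit can be processed listing by listing, so no extra listings request is needed per item. The sample data is made up, and the quantity field is assumed to be present because the question's code reads it.

```python
# Made-up response in the shape shown in the question.
response = {
    "count": 50100,
    "results": [
        {"listing_id": 114179207, "state": "active", "quantity": 3},
        {"listing_id": 11344567, "state": "active", "quantity": 1},
        {"listing_id": 114179207, "state": "active", "quantity": 3},  # duplicate
    ],
}

already_done = set()    # a set makes the membership test cheap
selected = []

for listing in response["results"]:
    # Same conditions as in the question, applied to each listing in turn.
    if listing["quantity"] > 1 and listing["listing_id"] not in already_done:
        already_done.add(listing["listing_id"])
        selected.append(listing["listing_id"])

print(selected)
```

The per-user requests from the question would then go inside the if block, once per selected listing.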
