Parsing multiple JSON elements in Python

I'm trying to build a small script that will go through the Etsy API and retrieve certain information. The API returns 25 different listings, all in JSON, and I would appreciate it if someone could help me learn how to handle one at a time.
Here is an example of the json I'm dealing with:
{"count":50100,"results":[{"listing_id":114179207,"state":"active"},{"listing_id":11344567,"state":"active"},
and so on.
Is there a simple way to handle only one of these listings at a time to minimize the amount of calls I must make to the API?
Here is some of the code showing how I'm dealing with just one listing when I limit the results returned to 1:
r = requests.get('http://openapi.etsy.com/v2/listings/active?api_key=key&limit=1&offset='+str(offset_param)+'&category=Clothing')
raw_json = r.json()
encoded_json = json.dumps(raw_json)
dataObject = json.loads(encoded_json)
if dataObject["results"][0]["quantity"] > 1:
    if dataObject["results"][0]["listing_id"] not in already_done:
        already_done.append(dataObject["results"][0]["listing_id"])
        s = requests.get('http://openapi.etsy.com/v2/users/'+str(dataObject["results"][0]["user_id"])+'/profile?api_key=key')
        raw_json2 = s.json()
        encoded_json2 = json.dumps(raw_json2)
        dataObject2 = json.loads(encoded_json2)
        t = requests.get('http://openapi.etsy.com/v2/users/'+str(dataObject["results"][0]["user_id"])+'?api_key=key')
        raw_json3 = t.json()
        encoded_json3 = json.dumps(raw_json3)
        dataObject3 = json.loads(encoded_json3)

Seeing how the results field (or key) contains a list, you can simply iterate through it like the following:
# json_str is the parsed response (e.g. r.json()); it looks like
# { ...other key-values, "results": [{"listing_id": 114179207, "state": "active"}, {"listing_id": 11344567, "state": "active"}, ...and so on] }
results = json_str['results']  # this gives you a list of dicts

# iterate through this list
for result in results:
    if result['state'] == 'active':
        do_something_with(result['listing_id'])
    else:
        do_someotherthing_with(result['listing_id'])  # or none at all
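Applied to the original Etsy code, a rough sketch of that idea might look like the following. It assumes the same endpoint and placeholder api_key, requests 25 listings in one call, and iterates over them locally; note that r.json() already returns a Python dict, so the json.dumps/json.loads round trip isn't needed.
import requests

already_done = set()
offset_param = 0

r = requests.get(
    'http://openapi.etsy.com/v2/listings/active',
    params={'api_key': 'key', 'limit': 25,
            'offset': offset_param, 'category': 'Clothing'})
data = r.json()  # already a Python dict, no dumps/loads round trip needed

for listing in data['results']:
    # same checks as the original code, but one listing at a time
    if listing.get('quantity', 0) > 1 and listing['listing_id'] not in already_done:
        already_done.add(listing['listing_id'])
        profile = requests.get(
            'http://openapi.etsy.com/v2/users/{}/profile'.format(listing['user_id']),
            params={'api_key': 'key'}).json()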

Related

API call with for loop doesn't loop through a list in Python

I'm trying to make an API call to coinmarketcap to get the prices for a crypto portfolio and append the prices to a list.
The portfolio is on an excel file. I'm reading the excel with pandas, then looping through the "project" column of the dataframe and appending the names to a list called "project_list".
Then I'm making the API call, and looping through the previously created "project_list", and trying to get the price for each project, and finally appending those prices to an empty list called "price_list".
But when I run the code, it gives an IndexError: list index out of range.
But the curious thing is, if I populate the project list manually, the code works just fine.
I thought this might be because I'm appending to the list dynamically and it's not ready by the time the API call is made. But I tried doing other things with the list right after appending, and they all work fine.
Code is below and any help would be greatly appreciated!
Step 1: Read the Excel File
import pandas as pd

crypto_df = pd.read_excel(r'D:\Trading\CRYPTO\Crypto Portfolio Tracker.xlsx')
Step 2: Create project_list and append the names from the dataframe
project_list = []
for name in crypto_df['Project'].values:
    project_list.append(name)
Step 3: Making the API call
from requests import Session

class CMC:
    def __init__(self, token):
        self.apiurl = 'https://pro-api.coinmarketcap.com'
        self.headers = {'Accepts': 'application/json', 'X-CMC_PRO_API_KEY': token}
        # request through a Session instead of calling requests directly,
        # so the auth headers are sent with every call
        self.session = Session()
        self.session.headers.update(self.headers)

    def get_all_coins(self):
        url = self.apiurl + '/v1/cryptocurrency/map'
        r = self.session.get(url)
        data = r.json()['data']
        return data

    def get_price(self, symbol):
        url = self.apiurl + '/v2/cryptocurrency/quotes/latest'
        parameters = {'symbol': symbol}
        r = self.session.get(url, params=parameters)
        data = r.json()['data']
        return data
Step 4: Appending the prices to price_list
price_list = []
cmc = CMC(secrets_.API_KEY)
for crypto in project_list:
    price = cmc.get_price(crypto)
    price_list.append(price[crypto][0]['quote']['USD']['price'])
print(price_list)
And this gives the IndexError.
If I manually populate the project_list, for example project_list = ['ETH', 'BTC', 'ADA', 'LINK', etc..] it works just fine. What am I doing wrong?
Thanks for responding, guys. I figured it out: there's nothing wrong with my code. CoinMarketCap has a per-minute request limit on the basic (free) API plan, so if my list is too big it throws an error. When I reduced my list down to only a few currencies, it worked. I still can't be sure exactly how many I can add to the list; it varies with the amount of requests they are getting, I guess.
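One way to make that failure mode visible and space out the calls might be the sketch below. It assumes the CMC class and secrets_ module from the question, and that a throttled or unrecognized-symbol response simply lacks a usable entry for the symbol; the two-second pause is an arbitrary guess at a safe spacing, not a documented limit.
import time

price_list = []
cmc = CMC(secrets_.API_KEY)

for crypto in project_list:
    try:
        price = cmc.get_price(crypto)
        price_list.append(price[crypto][0]['quote']['USD']['price'])
    except (KeyError, IndexError) as exc:
        # a throttled or unrecognized-symbol response has no entry to index into
        print(f"skipping {crypto}: {exc}")
    time.sleep(2)  # crude spacing between calls to stay under the per-minute quota

print(price_list)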

Extract a value from a JSON file (Python)

Hi, I'm not an expert and this problem has kept me stuck for a long time; I hope that someone here can help me.
I would like to extract the value "interestExpense" from the following JSON file:
{'incomeBeforeTax': 17780000000,
'minorityInterest': 103000000,
'netIncome': 17937000000,
'sellingGeneralAdministrative': 5918000000,
'grossProfit': 16507000000,
'ebit': 10589000000,
'endDate': 1640908800,
'operatingIncome': 10589000000,
'interestExpense': -1803000000,
'incomeTaxExpense': -130000000,
'totalRevenue': 136341000000,
'totalOperatingExpenses': 125752000000,
'costOfRevenue': 119834000000,
'totalOtherIncomeExpenseNet': 7191000000,
'netIncomeFromContinuingOps': 17910000000,
'netIncomeApplicableToCommonShares': 17937000000}
In this case the result should be -130000000 as a string, but I'm trying to find a way to create a list (or an array) with all those floats so that I can decide which one to pick. I have no idea how to manipulate this kind of data (JSON).
For example,
print(list[0])
should return 17780000000 (the value associated with incomeBeforeTax).
Is this actually possible?
The output is generated from this code:
import re
import json
import requests
from bs4 import BeautifulSoup

annual_is_stms = []
url_financials = 'https://finance.yahoo.com/quote/{}/financials?p={}'
stock = 'F'
# headers is a dict with a User-Agent, defined elsewhere in the script
response = requests.get(url_financials.format(stock, stock), headers=headers)
soup = BeautifulSoup(response.text, 'html.parser')
pattern = re.compile(r'\s--\sData\s--\s')
script_data = soup.find('script', text=pattern).contents[0]
script_data[:500]   # peek at the start of the blob
script_data[-500:]  # peek at the end of the blob
start = script_data.find("context") - 2
json_data = json.loads(script_data[start:-12])
json_data['context']['dispatcher']['stores']['QuoteSummaryStore'].keys()
# all data relative to financials
annual_is = json_data['context']['dispatcher']['stores']['QuoteSummaryStore']['incomeStatementHistory']['incomeStatementHistory']
for s in annual_is:
    statement = {}
    for key, val in s.items():
        try:
            statement[key] = val['raw']
        except TypeError:
            continue
        except KeyError:
            continue
    annual_is_stms.append(statement)
print(annual_is_stms[0])
If you are using Python, you need to import the json module and parse the string into a dictionary:
import json
# some JSON:
x = '{ "name":"John", "age":30, "city":"New York"}'
# parse x:
y = json.loads(x)
# the result is a Python dictionary:
print(y["age"])
Ok, so the output snippet you posted comes from this line:
print(annual_is_stms[0])
If you now want the -1803000000, you should do:
print(annual_is_stms[0]['interestExpense'])
If you want the -130000000, you should do:
print(annual_is_stms[0]['incomeTaxExpense'])
And if you want the 17780000000, you should do:
print(annual_is_stms[0]['incomeBeforeTax'])
Copy and paste this into Python.
data = {'incomeBeforeTax': 17780000000,
'minorityInterest': 103000000,
'netIncome': 17937000000,
'sellingGeneralAdministrative': 5918000000,
'grossProfit': 16507000000,
'ebit': 10589000000,
'endDate': 1640908800,
'operatingIncome': 10589000000,
'interestExpense': -1803000000,
'incomeTaxExpense': -130000000,
'totalRevenue': 136341000000,
'totalOperatingExpenses': 125752000000,
'costOfRevenue': 119834000000,
'totalOtherIncomeExpenseNet': 7191000000,
'netIncomeFromContinuingOps': 17910000000,
'netIncomeApplicableToCommonShares': 17937000000}
print(data['interestExpense'])
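Since the question also asks for a positional list of all the values, so that something like list[0] works, here is a small sketch building on the data dict above (it relies on Python 3.7+ dicts keeping insertion order):
values = list(data.values())
print(values[0])                 # 17780000000, the value associated with incomeBeforeTax
print(data['interestExpense'])   # -1803000000, though lookup by key is usually clearer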

How to store or merge multiple Python dictionaries in a loop and return them?

I am trying to get the paginated results of two pages, but return exits the loop and only the result from one page is displayed.
Is there a way to store or merge them?
def incidents():
    more = True
    limit = 50
    offset = 0
    while more:
        url = f"{URL}/incidents"
        params = {
            "statuses[]": "resolved",
            "include[]": 'channel',
            "limit": limit,
            "offset": offset,
            "total": "true",
        }
        r = requests.get(url, headers=headers, params=params)
        data = r.json()
        offset += 50
        print(offset, (r.text))
        more = False  # Set deliberately for easier understanding
        return data
The offset, (r.text) output looks like this:
50 {"incidents":[{"incident_number":1,"title":"crit server is on fire" ....
100 {"incidents":[{"incident_number":54,"title":"ghdg","description":"ghdg",....
The return only displays the output below and not the other page. Is there a way, perhaps using a generator, to merge both pages and store them in the data variable so that data can be returned?
100 {"incidents":[{"incident_number":54,....
I believe you could store the results in your own list:
incidents = []
and then
data = r.json()
for element in data['incidents']:
    incidents.append(element)
Edited for clarity - that way you're gathering all incidents in a single object.
I'm not sure, because you just gave the very start of r.text (is there more than 'incidents' within the result?), but I expect the previous answer to be a bit short; I'd suggest something like
results = []
(before the while) and at the end
data = r.json()
results += data['incidents']
return results
(By the way: in your original post, each run through the while loop just sets the variable "data", so no wonder the return can only deal with the very last page retrieved. But I guess that is just an artifact of your simplification, just as the "more = False" would even prevent getting a second page.)
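Putting both suggestions together, a minimal sketch of the whole function might look like this. It assumes URL and headers as in the question, and that the API's paginated responses carry a "more" flag; if yours doesn't, you could stop when "incidents" comes back empty instead.
def incidents():
    limit = 50
    offset = 0
    results = []
    more = True
    while more:
        params = {
            "statuses[]": "resolved",
            "include[]": "channel",
            "limit": limit,
            "offset": offset,
            "total": "true",
        }
        r = requests.get(f"{URL}/incidents", headers=headers, params=params)
        data = r.json()
        results += data["incidents"]     # accumulate every page
        more = data.get("more", False)   # assumed pagination flag; adjust to your API
        offset += limit
    return results                       # return only after the loop has finished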

For loop: adding values for the same key together and producing JSON

test = []
sites = sel.css(".info")
for site in sites:
    money = site.xpath("./h2[@class='money']/text()").extract()
    people = site.xpath("//p[@class='poeple']/text()").extract()
    test.append('{"money":' + str(money[0]) + ',"people":' + str(people[0]) + '}')
My result test is:
['{"money":1,"people":23}',
'{"money":3,"people":21}',
'{"money":12,"people":82}',
'{"money":1,"people":54}' ]
I am stuck on two things:
One is that each element of test is a string when I print its type, so it is not really JSON format.
Two is that the money value 1 is duplicated, so I need to add the people counts together,
so the final format I want is:
[
{"money":1,"people":77},
{"money":3,"people":21},
{"money":12,"people":82},
]
How can I do this?
I'd collect money entries in a dict, adding up the people as values; the output to JSON should indeed be done using a json library. (I've not tested the code, but it should give you an idea of how to approach the problem.)
import json

money_map = {}
sites = sel.css(".info")
for site in sites:
    money = site.xpath("./h2[@class='money']/text()").extract()[0]
    people = int(site.xpath("//p[@class='poeple']/text()").extract()[0])
    if money not in money_map:
        money_map[money] = 0
    money_map[money] += people

output = [{'money': key, 'people': value} for key, value in money_map.items()]
json_output = json.dumps(output)
basically this:
import json
foo = ['{"money":1,"people":23}',
'{"money":3,"people":21}',
'{"money":12,"people":82}',
'{"money":1,"people":54}' ]
bar = []
for i in foo:
j = json.loads(i) # string to json/dict
# if j['money'] is not in bar:
bar.append(j)
# else:
# find index of duplicate and add j['people']
The above is an incomplete solution; you still have to implement the 'duplicate check and add', as in the sketch below.
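One way to finish that 'duplicate check and add' step, still starting from the list of JSON strings in foo above (a sketch that keys the merged entries by their money value):
merged = {}
for i in foo:
    j = json.loads(i)
    if j['money'] in merged:
        merged[j['money']]['people'] += j['people']  # duplicate money value: add the people together
    else:
        merged[j['money']] = j

result = list(merged.values())
print(json.dumps(result))
# [{"money": 1, "people": 77}, {"money": 3, "people": 21}, {"money": 12, "people": 82}]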

Python - Count JSON elements before extracting data

I use an API which gives me a JSON file structured like this:
{
  "offset": 0,
  "results": [
    {
      "source_link": "http://www.example.com/1",
      "source_link/_title": "Title example 1",
      "source_link/_source": "/1",
      "source_link/_text": "Title example 1"
    },
    {
      "source_link": "http://www.example.com/2",
      "source_link/_title": "Title example 2",
      "source_link/_source": "/2",
      "source_link/_text": "Title example 2"
    },
    ...
And I use this code in Python to extract the data I need:
import json
import urllib2

u = urllib2.urlopen('myapiurl')
z = json.load(u)
u.close()
link = z['results'][1]['source_link']
title = z['results'][1]['source_link/_title']
The problem is that to use it I have to know the number of the element from which I'm extracting the data. My results can have different length every time, so what I want to do is to count the number of elements in results at first, so I would be able to set up a loop to extract data from each element.
To check the length of the results key:
len(z["results"])
But if you're just looping over them, a for loop is perfect:
for result in z["results"]:
    print(result["source_link"])
You don't need to know the length of the results; you are fine with a for loop:
for result in z['results']:
    # process the results here
Anyway, if you want to know the length of 'results': len(z['results'])
If you want to get the length, you can try:
len(z['results'])
But in Python, what we usually do is:
for i in z['results']:
    # do whatever you like with `i`
Hope this helps.
You don't need, or likely want, to count them in order to loop over them; you could do:
import json
import urllib2

u = urllib2.urlopen('myapiurl')
z = json.load(u)
u.close()
for result in z['results']:
    link = result['source_link']
    title = result['source_link/_title']
    # do something with link/title
Or you could do:
u = urllib2.urlopen('myapiurl')
z = json.load(u)
u.close()
links = [result['source_link'] for result in z['results']]
titles = [result['source_link/_title'] for result in z['results']]
# do something with the links/titles lists
A few pointers:
There is no need to know the length of results to iterate over it. You can use for result in z['results'].
Lists start from 0.
If you do need the index, take a look at enumerate, as in the sketch below.
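A quick illustration of that last pointer (a sketch reusing the z parsed above):
# enumerate yields the position alongside each result; indices start at 0
for i, result in enumerate(z['results']):
    print(i, result['source_link'])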
Use this command to print the number of results on the terminal:
print(len(z['results']))
