Why does this Python method re-use this variable? [duplicate] - python

This question already has answers here:
"Least Astonishment" and the Mutable Default Argument
(33 answers)
Closed 9 years ago.
I am writing a wrapper for a Rest API I interact with day to day. When you make a request for something which has many results, it paginates them. I was trying to write a function to elegantly de-paginate data and ran into something unexpected -- if I try to re-use this function in the same application or IPython session, it will stack the results of the second request on top of the results of the first request. Here is the code:
class RestAPIWrapper(object):
    def __init__(self, url=url, acct=acct, token=token):
        self.url = url
        self._session = requests.Session()
        self._session.mount('https://', HTTPAdapter(max_retries=5))
        self._session.headers.update({'Accept': 'application/json', 'Content-Type': 'application/json'})
        self._session.auth = (acct, token)

    def search_api_broken(self, query, page=1, results=[]):
        r = self._session.get('{0}/search.json?page={1}&query={2}'.format(self.url, page, query))
        response = r.json()
        results.extend(response['results'])
        # returns a dictionary that has these keys: ['results', 'next_page', 'previous_page']
        if response['next_page'] is not None:
            results = self.search_api_broken(query, page=page+1, results=results)
        return results

    def search_api_works(self, query, page=1, results=[]):
        if page == 1:
            results = []
        r = self._session.get('{0}/search.json?page={1}&query={2}&sort_by={3}&sort_order={4}'.format(self.url, page, quote(query), sort_by, sort_order))
        response = r.json()
        results.extend(response['results'])
        # returns a dictionary that has these keys: ['results', 'next_page', 'previous_page']
        if response['next_page'] is not None:
            results = self.search_api_works(query, page=page+1, results=results)
        return results
In other words, if I call the method like this:
my_api_wrapper = RestAPIWrapper()
#query should return 320 results, #query2 should return 140 results
data = my_api_wrapper.search_api_broken(query)
len(data)
#outputs 320
more_data = my_api_wrapper.search_api_broken(query2)
len(more_data)
#outputs 460
The output of the second call includes the results of the first. Why does it do this, since I put results=[] in the function definition? I'm not specifying it when I call the method, so it should default to an empty list, right?

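In search_api_broken, the previous contents of results are never cleared. A default argument value is evaluated once, when the function is defined, so every call that omits results shares the same list object, and each call extends what earlier calls left in it.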
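A minimal sketch of the behaviour and the usual workaround, using a None sentinel in place of the list default. The function names collect_shared and collect_fresh are made up for illustration; the same pattern (results=None, then create the list inside the function) should carry over to search_api_broken.

```python
def collect_shared(item, results=[]):
    # The default list is created once, when the function is defined.
    # Every call that omits `results` appends to that same list.
    results.append(item)
    return results


def collect_fresh(item, results=None):
    # None is a sentinel: a new list is built on each call that omits `results`.
    if results is None:
        results = []
    results.append(item)
    return results


first = collect_shared("a")
second = collect_shared("b")   # same list object as `first`
print(second)                  # ['a', 'b']

print(collect_fresh("a"))      # ['a']
print(collect_fresh("b"))      # ['b']
```

The recursive call can still pass results=results explicitly; only the top-level call relies on the default, and with the sentinel it gets a fresh list each time.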

Related

Why json output so small?

This output should be way longer than it is in here.
I start with a GET request, I parse a JSON list and extract the id, which I then call on the second function, that will give me a second ID which then I will use to call on the 3rd function. But, I am only getting one entry whereas I should be getting way more entries.
The code is the following:
from requests.auth import HTTPBasicAuth
import requests
import json
import urllib3

urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

def countries():
    data = requests.get("https://localhost:8543/api/netim/v1/countries/", verify=False, auth=HTTPBasicAuth("admin", "admin"))
    rep = data.json()
    return [elem.get("id","") for elem in rep['items']]

def regions():
    for c in countries():
        url = requests.get("https://localhost:8543/api/netim/v1/countries/{}/regions".format(c), verify=False, auth=HTTPBasicAuth("admin", "admin"))
        response = url.json()
        return [cid.get("id","") for cid in response['items']]

def city():
    for r in regions():
        api = requests.get("https://localhost:8543/api/netim/v1/regions/{}/cities".format(r), verify=False, auth=HTTPBasicAuth("admin", "admin"))
        resolt = api.json()
        return(json.dumps([{"name":r.get("name",""),"id":r.get("id", "")} for r in resolt['items']], indent=4))

city()
print(city())
The output is the following :
[
{
"name": "Herat",
"id": "AF~HER~Herat"
}
]
I should have a huge list, so I am not sure what am I missing?
You need to go through all the iterations of your loop and collect the results, then jsonify them and return them.
def city():
    data = []
    for r in regions():
        api = requests.get("https://localhost:8543/api/netim/v1/regions/{}/cities".format(r), verify=False, auth=HTTPBasicAuth("admin", "admin"))
        resolt = api.json()
        # Accumulate the cities of every region instead of returning on the first one.
        data.extend([{"name":item.get("name",""),"id":item.get("id", "")} for item in resolt['items']])
    # Return only after the loop has visited every region.
    return json.dumps(data, indent=4)
This would be a fix for city(), but you have the same problem in regions() as well. return immediately exits the function and does nothing else, so each of those for loops effectively runs only one iteration.
I'll update my example here to give you a better idea what's occurring.
Your functions are basically this:
def test_fn():
    for i in [1,2,3,4]:
        return i
# output:
1
# We never see 2 or 3 or 4 because we return before looping on them.
What you want:
def test_fn():
    results = []
    for i in [1,2,3,4]:
        results.append(i)
    return results
# output
[1,2,3,4]
It seems like you understand that the for loop is going to take some action once for each element in the list. What you're not understanding is that return ends the function NOW. No more for loop, no more actions, and in your code, you immediately return inside the for loop, stopping any further action.
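As a rough sketch of the same accumulate-then-return pattern applied to the whole chain, here is a version that runs without the real server. FAKE_API is a made-up stand-in for the three endpoints, and every ID in it except "AF~HER~Herat" is invented for illustration.

```python
import json

# Hypothetical in-memory stand-in for the countries/regions/cities endpoints.
FAKE_API = {
    "countries": ["AF", "FR"],
    "regions": {"AF": ["AF~HER", "AF~KAB"], "FR": ["FR~IDF"]},
    "cities": {
        "AF~HER": [{"name": "Herat", "id": "AF~HER~Herat"}],
        "AF~KAB": [{"name": "Kabul", "id": "AF~KAB~Kabul"}],
        "FR~IDF": [{"name": "Paris", "id": "FR~IDF~Paris"}],
    },
}


def countries():
    return list(FAKE_API["countries"])


def regions():
    region_ids = []
    for c in countries():
        # Collect the regions of every country before returning.
        region_ids.extend(FAKE_API["regions"][c])
    return region_ids


def cities():
    found = []
    for r in regions():
        # Collect the cities of every region before returning.
        found.extend(FAKE_API["cities"][r])
    return json.dumps(found, indent=4)


print(cities())
```

With the return statements moved out of the loops, the output contains one entry per city across all regions rather than only the first region's cities.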

Can't I call function in function in python?

Below is a part of my code.
class Financial_Statements:
    def __init__(self,API_KEY,company_code,year,report_sort):
        self.API_KEY = API_KEY
        self.company_code = company_code
        self.year = year
        self.report_sort =report_sort

    def get_request(self):
        request= Request('https://opendart.fss.or.kr/api/fnlttSinglAcnt.json?crtfc_key='+self.API_KEY+'&corp_code='+self.company_code+'&bsns_year='+self.year+'&reprt_code='+self.report_sort)
        response = urlopen(request)
        elevations = response.read()
        data = json.loads(elevations)
        data = json_normalize(data['list']) ##--- json to dataframe
        data = data.loc[:,['fs_nm','sj_nm','account_nm','thstrm_dt','thstrm_amount','frmtrm_nm','frmtrm_amount','bfefrmtrm_nm','bfefrmtrm_amount']]
        return data

    def get_financial_stock_price(self,reo = 0):
        data = get_request(self)
I define def get_request to get data and use it in other functions, but when I run the code it returns 'get_request' is not defined.
Can't I use a function inside another function?
If you want to call the function inside the class, you have to call it with self. The appropriate code should be self.get_request()
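A small sketch of the difference, with the HTTP call replaced by placeholder data. The class name Report and the returned dictionary are made up for illustration.

```python
class Report:
    def __init__(self, year):
        self.year = year

    def get_request(self):
        # Stand-in for the real HTTP call; returns placeholder data.
        return {"year": self.year, "rows": [1, 2, 3]}

    def get_financial_stock_price(self):
        # Call the sibling method through self. A bare get_request(self)
        # is looked up as a module-level name and raises NameError.
        data = self.get_request()
        return len(data["rows"])


print(Report("2020").get_financial_stock_price())  # 3
```

Methods live in the class namespace, not the module namespace, which is why the bare name is not found from inside another method.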

Setting default to NULL with format() returns unexpected string

In my API call defined below to retrieve the last 24 hrs of data, the normal request url would be:
https://api.foobar.com/data
That is why I have set the next_page parameter default to NULL.
However, sometimes the API will return a unique URL at the end of the json (such as https://api.foobar.com/data?page%237hfaj39), which indicates another page exists and another get_data request needs to be made to retrieve the remainder.
In that case, the {next_page} parameter will be set to whatever this unique url returned would be.
My problem is after adding the {next_page} parameter, the default get_data url somehow gets 4 unwanted characters - %7B%7D appended so that the request looks like
https://api.foobar.com/data%7B%7D and of course the API does not respond.
%7B%7D is the percent-encoded (URL-encoded) form of the two braces {}
Why does this happen and what am I doing wrong here in terms of formatting? Using None in place of {} also does not work.
The code:
import time
import requests

def make_request(url, params={}, headers={}):
    r = requests.get(url, params=params, headers=headers)
    print(r.url)
    # Compare status codes by value (!=), not by identity (is).
    if r.status_code != 200:
        print("Error access API" + r.text)
        exit()
    return r.json()

def get_data(access_token, next_page={}):
    end_time = int(round(time.time() * 1000))
    start_time = end_time - (seconds_in_day * 1000)
    headers = {'Authorization': 'Bearer ' + access_token, 'start_time': str(start_time), 'end_time': str(end_time)}
    # next_page={} is the default this question is about.
    url = 'https://api.foobar.com/data{next_page}'.format(next_page=next_page)
    return make_request(url, headers=headers)
Note: the API call works when the next_page parameter is removed
With next_page={}, you will get unexpected formatting results. If you try the following:
>>> '{}'.format({})
'{}'
As you can see, instead of the desired '', you get a string with two brackets. This is because:
>>> str({})
'{}'
A similar thing happens with None:
>>> '{}'.format(None)
'None'
>>> str(None)
'None'
To fix this, instead of next_page={}, try next_page='', because .format() will do this:
>>> '{}'.format('')
''
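A minimal sketch of the URL-building step on its own, with no network call. build_url and pick_url are hypothetical helper names; the second one covers the case where the API hands back a complete URL for the next page, in which case choosing between the two URLs may be simpler than formatting a suffix.

```python
BASE_URL = "https://api.foobar.com/data"


def build_url(next_page=""):
    # An empty string formats to nothing, so the default URL is unchanged.
    return "{base}{next_page}".format(base=BASE_URL, next_page=next_page)


def pick_url(next_page_url=None):
    # Assumption: the API returns a full URL for the next page.
    # Fall back to the base URL when there is none.
    return next_page_url or BASE_URL


print(build_url())                   # https://api.foobar.com/data
print(build_url("?page%237hfaj39"))  # https://api.foobar.com/data?page%237hfaj39
print(pick_url())                    # https://api.foobar.com/data
```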

Python API Client Rerun function if run unsuccessfully

A Python Requests API client has a function that needs to be re-executed if it runs unsuccessfully.
class Kitten(BaseClient):
    def create(self, **params):
        uri = self.BASE_URL
        data = dict(**(params or {}))
        r = self.client.post(uri, data=json.dumps(data))
        return r
When run with
api = Kitten()
data = {"email": "bill@dow.com", "currency": "USD", "country": "US" }
r = api.create(**data)
The issue is that the first time you run it, the response always looks like a GET, even though the request is a POST: the first POST returns the list of existing entries.
From the second request onward, api.create(**data) returns the newly created entry, as it should.
There is a status_code for get and post
# GET
r.status_code == 200
# POST
r.status_code == 201
What would be a better, more Pythonic way to re-execute the request when status_code is 200, until a valid 201 is returned?
If you know for sure that the 2nd post will always return your expected value, you can use a ternary operator to perform the check a second time:
class Kitten(BaseClient):
    def create(self, **params):
        uri = self.BASE_URL
        data = dict(**(params or {}))
        r = self._get_response(uri, data)
        # Retry exactly once if the first response was not a 201.
        return r if r.status_code == 201 else self._get_response(uri, data)

    def _get_response(self, uri, data):
        return self.client.post(uri, data=json.dumps(data))
Otherwise, you can put the call in a while loop that keeps retrying until the status code is 201.
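A sketch of the loop variant, with a cap on attempts so that a server that never returns 201 cannot cause an endless loop. FakeClient and FakeResponse are made-up stand-ins that imitate the behaviour described in the question (first POST answers 200, later ones 201), and max_attempts is an assumed parameter, not part of the original client.

```python
import json


class FakeResponse:
    def __init__(self, status_code):
        self.status_code = status_code


class FakeClient:
    """Hypothetical client: the first POST returns 200, later ones 201."""

    def __init__(self):
        self.calls = 0

    def post(self, uri, data=None):
        self.calls += 1
        return FakeResponse(200 if self.calls == 1 else 201)


def create_with_retry(client, uri, params, max_attempts=3):
    payload = json.dumps(params)
    response = None
    for _ in range(max_attempts):
        response = client.post(uri, data=payload)
        if response.status_code == 201:
            break  # created, stop retrying
    # May still be a non-201 response if every attempt failed.
    return response


client = FakeClient()
r = create_with_retry(client, "https://example.invalid/kittens", {"currency": "USD"})
print(r.status_code, client.calls)  # 201 2
```

The caller should still check the returned status code, since the loop gives up after max_attempts.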

How can I implement dynamic routing in Python?

I'm attempting to implement dynamic routing for a web framework. At the moment, the goal is to pass arguments into a function by way of the url. So, if user offers a url of "/page/23", then the route function will extract the "23" which will then be used as a parameter for the page function. I am getting a "keyerror", however.
import re

routing_table = {}
url = "/page/23"

def route(url, func):
    key = url
    key = re.findall(r"(.+?)/<[a-zA-Z_][a-zA-Z0-9_]*>", url)
    if key:
        params = re.findall(r"<([a-zA-Z_][a-zA-Z0-9_]*)>", url)
        routing_table[key[0]] = [params, func]
    else:
        routing_table[url] = func

def find_path(url):
    if url in routing_table:
        return routing_table[url]
    else:
        return None

def page(page_id):
    return "this is page %d" % page_id

route("/page/<page_id>", page)
print(routing_table[url])
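When you called route, you passed a url equal to "/page/<page_id>", so the only key stored in routing_table is "/page". In the last line, url is the global variable equal to "/page/23", which was never added as a key, hence the KeyError.
It looks like there are other problems: replace your last line with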
print(routing_table)
to see what you're doing.
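One possible way to get from "/page/23" to a call of page, sketched under the assumption that each route pattern is compiled into a regular expression with named groups. This is not the only design: the dispatch function and the list-based routing_table are illustrative choices, and the static parts of the pattern are not escaped here, so it only suits simple paths.

```python
import re

routing_table = []  # list of (compiled pattern, handler) pairs


def route(pattern, func):
    # Replace each <name> placeholder with a named group that matches one
    # path segment, e.g. "/page/<page_id>" becomes "/page/(?P<page_id>[^/]+)".
    regex = re.sub(r"<([a-zA-Z_][a-zA-Z0-9_]*)>", r"(?P<\1>[^/]+)", pattern)
    routing_table.append((re.compile("^" + regex + "$"), func))


def dispatch(url):
    for compiled, func in routing_table:
        match = compiled.match(url)
        if match:
            # Pass the captured segments as keyword arguments.
            return func(**match.groupdict())
    return None


def page(page_id):
    # Captured values are strings, so convert before formatting with %d.
    return "this is page %d" % int(page_id)


route("/page/<page_id>", page)
print(dispatch("/page/23"))  # this is page 23
```

Note the int() conversion in page: the value extracted from the URL is the string "23", and "%d" % "23" would raise a TypeError.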
