I would like to send a GET request to a URL like "/api/stats?ad_ids=1,2,3&start_time=2013-09-01&end_time=2013-10-01", but I do not know how to mount my class at this URL.
I am using CherryPy's mount method and the MethodDispatcher.
So far I have managed to call the GET method from the URL /api/stats/1.
Also, what parameters should I pass to the GET method?
I would appreciate any suggestions or comments.
Here are the code samples:
cherrypy.tree.mount(
    Ads(), '/api/stats',
    {'/':
        {'request.dispatch': cherrypy.dispatch.MethodDispatcher()}
    }
)
def GET(self, ad_id=None, *args, **kwargs):
    jsonData1 = {}
    jsonData = self.readData()
    counter2 = 0
    for item in jsonData:
        index = jsonData[item][2]
        if index == ad_id:
            jsonData1[counter2] = jsonData[item]
            counter2 += 1
    print(jsonData1)
    return 'Here is the stat %s' % jsonData1
Thank you in advance!
The query string can be reached through keyword arguments of the GET method.
With your method signature you can access them via the kwargs dictionary.
cherrypy.tree.mount(
    Ads(), '/api/stats',
    {'/':
        {'request.dispatch': cherrypy.dispatch.MethodDispatcher()}
    }
)
def GET(self, ad_id=None, *args, **kwargs):
    start_time = kwargs.get('start_time', None)
    end_time = kwargs.get('end_time', None)
    # you can also use kwargs['XXX'],
    # do lookups with 'XXX' in kwargs,
    # or declare (start_time=None, end_time=None) in the signature
    # as keyword arguments.
    jsonData1 = {}
    jsonData = self.readData()
    counter2 = 0
    for item in jsonData:
        index = jsonData[item][2]
        if index == ad_id:
            jsonData1[counter2] = jsonData[item]
            counter2 += 1
    print(jsonData1)
    return 'Here is the stat %s' % jsonData1
Also, *args will contain a positional argument for each additional URL segment; for example, /api/stats/1/a/b/c will produce args = ('a', 'b', 'c').
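For completeness, here is a minimal sketch of a handler that accepts both URL forms; the readData helper and the comma-separated ad_ids format are assumptions carried over from the question. Note that with MethodDispatcher the handler class must set exposed = True.

import cherrypy

class Ads(object):
    exposed = True  # required by MethodDispatcher

    def GET(self, ad_id=None, ad_ids=None, start_time=None, end_time=None, **kwargs):
        # /api/stats/1 arrives as ad_id == '1'
        # /api/stats?ad_ids=1,2,3&start_time=...&end_time=... arrives as keyword arguments
        if ad_ids is not None:
            ad_ids = ad_ids.split(',')  # '1,2,3' -> ['1', '2', '3']
        return 'ad_id=%s ad_ids=%s start=%s end=%s' % (ad_id, ad_ids, start_time, end_time)

cherrypy.tree.mount(
    Ads(), '/api/stats',
    {'/': {'request.dispatch': cherrypy.dispatch.MethodDispatcher()}}
)
cherrypy.engine.start()
cherrypy.engine.block()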
I'm creating a web scraper that will be used to value stocks. The problem is that my code returns an object reference (not sure what it should be called) instead of the value.
import requests

class Guru():
    MedianPE = 0.0

    def __init__(self, ticket):
        self.ticket = ticket
        try:
            url = ("https://www.gurufocus.com/term/pettm/" + ticket + "/PE-Ratio-TTM/")
            response = requests.get(url)
            htmlText = response.text
            firstSplit = htmlText
            secondSplit = firstSplit.split("And the <strong>median</strong> was <strong>")[1]
            thirdSplit = secondSplit.split("</strong>")[0]
            lastSplit = float(thirdSplit)
            try:
                Guru.MedianPE = lastSplit
            except:
                print(ticket + ": Median PE N/A")
        except:
            print(ticket + ": Median PE N/A")

    def getMedianPE(self):
        return float(Guru.getMedianPE)

g1 = Guru("AAPL")
g1.getMedianPE
print("Median + " + str(g1))
If I print lastSplit inside __init__ it shows the value I want, 15.53, but when I try to get it via getMedianPE I just get Median + <__main__.Guru object at 0x0000016B0760D288>.
Thanks a lot for your time!
Looks like you are trying to cast a function object to a float. Simply change return float(Guru.getMedianPE) to return float(Guru.MedianPE)
getMedianPE is a function (called a method when it is part of a class), so you need to call it with parentheses. If you call it without parentheses, you get the method object itself rather than the result of calling it.
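A quick illustration of the difference (g1 is an instance, as in the question's code):

g1.getMedianPE    # the bound method object, e.g. <bound method Guru.getMedianPE of ...>
g1.getMedianPE()  # actually calls the method and returns its value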
The other problem is that getMedianPE returns the function Guru.getMedianPE rather than the value Guru.MedianPE. I don't think you want MedianPE to be a class variable - you probably just want to set it as a default of 0 in __init__ so that each object has its own median_PE value.
Also, it is not a good idea to include all of the scraping code in your __init__ method. That should be moved to a scrape() method (or some other name) that you call after instantiating the object.
Finally, if you are going to print an object, it is useful to have a __str__ method, so I added a basic one here.
So putting all of those comments together, here is a recommended refactor of your code.
import requests

class Guru():
    def __init__(self, ticket, median_PE=0):
        self.ticket = ticket
        self.median_PE = median_PE

    def __str__(self):
        return f'{self.ticket} {self.median_PE}'

    def scrape(self):
        try:
            url = f"https://www.gurufocus.com/term/pettm/{self.ticket}/PE-Ratio-TTM/"
            response = requests.get(url)
            htmlText = response.text
            secondSplit = htmlText.split("And the <strong>median</strong> was <strong>")[1]
            thirdSplit = secondSplit.split("</strong>")[0]
            self.median_PE = float(thirdSplit)
        except (IndexError, ValueError):
            # IndexError: the marker text was not found on the page;
            # ValueError: the extracted text was not a number
            print(f"{self.ticket}: Median PE N/A")
Then you run the code:
>>> g1 = Guru("AAPL")
>>> g1.scrape()
>>> print(g1)
AAPL 15.53
I have the following program to scrape data from a website. I want to improve the code below by using a generator with yield instead of calling generate_url and callme multiple times sequentially. The purpose of this exercise is to properly understand yield and the contexts in which it can be used.
import requests
import shutil

start_date = '03-03-1997'
end_date = '10-04-2015'
yf_base_url = 'http://real-chart.finance.yahoo.com/table.csv?s=%5E'
index_list = ['BSESN', 'NSEI']

def generate_url(index, start_date, end_date):
    s_day = start_date.split('-')[0]
    s_month = start_date.split('-')[1]
    s_year = start_date.split('-')[2]
    e_day = end_date.split('-')[0]
    e_month = end_date.split('-')[1]
    e_year = end_date.split('-')[2]
    if (index == 'BSESN') or (index == 'NSEI'):
        url = yf_base_url + index + '&a={}&b={}&c={}&d={}&e={}&f={}'.format(s_day, s_month, s_year, e_day, e_month, e_year)
        return url

def callme(url, index):
    print('URL {}'.format(url))
    r = requests.get(url, verify=False, stream=True)
    if r.status_code != 200:
        print("Failure!!")
        exit()
    else:
        r.raw.decode_content = True
        with open(index + "file.csv", 'wb') as f:
            shutil.copyfileobj(r.raw, f)
        print("Success")

if __name__ == '__main__':
    url = generate_url(index_list[0], start_date, end_date)
    callme(url, index_list[0])
    url = generate_url(index_list[1], start_date, end_date)
    callme(url, index_list[1])
There are multiple options. You could use yield to iterate over URLs, or over request objects.
If your index_list were long, I would suggest yielding URLs.
Because then you could use multiprocessing.Pool to map a function that does a request and saves the output over these URLs. That would execute them in parallel, potentially making it a lot faster (assuming that you have enough network bandwidth, and that yahoo finance doesn't throttle connections).
import multiprocessing
import shutil
import requests

yf = ('http://real-chart.finance.yahoo.com/table.csv?s=%5E'
      '{}&a={}&b={}&c={}&d={}&e={}&f={}')
index_list = ['BSESN', 'NSEI']

def genurl(symbols, start_date, end_date):
    # assemble the URLs
    s_day, s_month, s_year = start_date.split('-')
    e_day, e_month, e_year = end_date.split('-')
    for s in symbols:
        url = yf.format(s, s_day, s_month, s_year, e_day, e_month, e_year)
        yield url

def download(url):
    # do the request and save the file (mirrors callme from the question)
    symbol = url.split('%5E')[1].split('&')[0]
    r = requests.get(url, stream=True)
    r.raw.decode_content = True
    with open(symbol + 'file.csv', 'wb') as f:
        shutil.copyfileobj(r.raw, f)

if __name__ == '__main__':
    p = multiprocessing.Pool()
    rv = p.map(download, genurl(index_list, '03-03-1997', '10-04-2015'))
If I understand you correctly, what you want to know is how to change the code so that you can replace the last part with a loop like this (zip pairs each index with its generated URL, since callme needs the index as well):

if __name__ == '__main__':
    for index, url in zip(index_list, generate_url(index_list, start_date, end_date)):
        callme(url, index)
If this is correct, you need to change generate_url, but not callme. Changing generate_url is rather mechanical. Make the first parameter index_list instead of index, wrap the function body in a for index in index_list loop, and change return url to yield url.
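As a sketch, the rewritten generator (reusing yf_base_url and the other names from the question) would look like this:

def generate_url(index_list, start_date, end_date):
    s_day, s_month, s_year = start_date.split('-')
    e_day, e_month, e_year = end_date.split('-')
    for index in index_list:
        if index in ('BSESN', 'NSEI'):
            yield yf_base_url + index + '&a={}&b={}&c={}&d={}&e={}&f={}'.format(
                s_day, s_month, s_year, e_day, e_month, e_year)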
You don't need to change callme, because you would never want to write something like for call in callme(...); you only ever use it as a normal function call.
I have a URL such as: http://example.com/page/page_id
I want to know how to get the page_id part of the URL in the route. I am hoping I could devise some method such as:
@route('/page/page_id')
def page(page_id):
    pageid = page_id
It's pretty straightforward - put the path parameter between angle brackets, and be sure to give your method a parameter with the same name.
@app.route('/page/<page_id>')
def page(page_id):
    pageid = page_id
    # You might want to return some sort of response...
You should use the following syntax:
@app.route('/page/<int:page_id>')
def page(page_id):
    # Do something with page_id
    pass
You can specify the ID as an integer:
@app.route('/page/<int:page_id>')
def page(page_id):
    # Replace with your custom code or render_template method
    return f"<h1>{page_id}</h1>"
or, if you are using an alphanumeric ID:
@app.route('/page/<username>')
def page(username):
    # Replace with your custom code or render_template method
    return f"<h1>Welcome back {username}!</h1>"
It's also possible to not specify any argument in the function and still access the URL parameters:
# for a given URL such as domain.com/page?id=123
from flask import request

@app.route('/page')
def page():
    page_id = request.args.get("id")  # 123
    # Replace with your custom code or render_template method
    return f"<h1>{page_id}</h1>"
However, this approach is mostly used when you have a form with one or more parameters, for example a query like:
domain.com/page?cars_category=audi&year=2015&color=red
@app.route('/page')
def page():
    category = request.args.get("cars_category")  # audi
    year = request.args.get("year")  # 2015
    color = request.args.get("color")  # red
    # Replace with your custom code or render_template method
    pass
Good luck! :)
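For reference, here is one of these routes placed in a minimal runnable app; the module layout and port are arbitrary choices for this sketch:

from flask import Flask

app = Flask(__name__)

@app.route('/page/<int:page_id>')
def page(page_id):
    return f"<h1>{page_id}</h1>"

if __name__ == '__main__':
    app.run(port=5000, debug=True)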
I'd like to know if I can have a parameter list, with keywords given inside a string, that I can pass into a function. Basically, the parameter list may or may not contain keywords, so its entries have variable 'types'. Here's an example of what I'm trying to do:
from bs4 import BeautifulSoup
import urllib.request as urlreq
import my_parameters  # can have variable values

# my_parameters.useful_token_concept = ["h1", "class_ = some_class"]
# I want to pass these parameters into a function; "class_" is
# a keyword, but it's wrapped in a string => gives me problems

url = my_parameters.url
page = urlreq.urlopen(url)
pageHtml = page.read()
page.close()
soup = BeautifulSoup(pageHtml)

# something like the following line works:
# params = soup.find("h1", class_="some_class")
params = soup.find(*my_parameters.useful_token_concept)
# params = soup.find(my_parameters.useful_token_concept[0],
#                    my_parameters.useful_token_concept[1])

# I don't know how long the list of attributes/parameters passed to
# BeautifulSoup's find() function will be, nor do I know what keywords,
# if any, will be passed into find(), as given by a user in my_parameters.
print(params)  # should print the html the user wants to scrape
Why not just use a better representation? I.e., instead of
my_parameters.useful_token_concept = ["h1", "class_ = some_class"]
use
my_parameters.useful_token_concept = ["h1", {"class_": "some_class"}]
Since these values' representation is up to you, using a dict to represent keyword parameters is much simpler than encoding them into a string and then having to parse that string back!
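With that representation, the call site becomes a simple unpacking (a sketch, assuming the list always holds the tag name followed by the keyword dict):

name, keywords = my_parameters.useful_token_concept
params = soup.find(name, **keywords)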
You need to split your token list into a dictionary of keyword arguments, and a list of positional arguments.
kwargs = {}
args = []
for i in my_parameters.useful_token_concept:
    bits = i.split('=')
    if len(bits) > 1:
        kwargs[bits[0].strip()] = bits[1].strip()
    else:
        args.append(bits[0].strip())

params = soup.find(*args, **kwargs)
You could create a string representation of how all the arguments would be passed and use eval() to turn them into something you could actually use in a real function call. Note that eval() requires each token to be a valid Python expression, so string values need embedded quotes:
my_parameters.useful_token_concept = ["'h1'", "class_ = 'some_class'"]

def func_proxy(*args, **kwargs):
    " Just return all positional and keyword arguments. "
    return args, kwargs

calling_seq = ', '.join(my_parameters.useful_token_concept)
args, kwargs = eval('func_proxy({})'.format(calling_seq))
print('args:', args)      # -> args: ('h1',)
print('kwargs:', kwargs)  # -> kwargs: {'class_': 'some_class'}
params = soup.find(*args, **kwargs)
import json

def images_custom_list(args, producer_data):
    tenant, token, url = producer_data
    url = url.replace(".images", ".servers")
    url = url + '/' + 'detail'
    output = do_request(url, token)
    output = output[0].json()["images"]
    custom_images_list = [custom_images for custom_images in output
                          if custom_images["metadata"].get('user_id', None)]
    temp_image_list = []
    for image in custom_images_list:
        image_temp = {"status": image["status"],
                      "links": image["links"][0]["href"],
                      "id": image["id"],
                      "name": image["name"]}
        temp_image_list.append(image_temp)
    print(json.dumps(temp_image_list, indent=2))

def image_list_detail(args, producer_data):
    tenant, token, url = producer_data
    url = url.replace(".images", ".servers")
    uuid = args['uuid']
    url = url + "/" + uuid
    output = do_request(url, token)
    print(output[0])
I am trying to make the code more efficient and cleaner-looking by using Python's function decorators. Since these two functions share the same first two lines, how could I make a function decorator from those lines and have both functions decorated by it?
Here's a way to solve it:
from functools import wraps

def fix_url(function):
    @wraps(function)
    def wrapper(*args, **kwarg):
        kwarg['url'] = kwarg['url'].replace(".images", ".servers")
        return function(*args, **kwarg)
    return wrapper

@fix_url
def images_custom_list(args, tenant=None, token=None, url=None):
    url = url + '/' + 'detail'
    output = do_request(url, token)
    output = output[0].json()["images"]
    custom_images_list = [custom_images for custom_images in output
                          if custom_images["metadata"].get('user_id', None)]
    temp_image_list = []
    for image in custom_images_list:
        image_temp = {"status": image["status"],
                      "links": image["links"][0]["href"],
                      "id": image["id"],
                      "name": image["name"]}
        temp_image_list.append(image_temp)
    print(json.dumps(temp_image_list, indent=2))

@fix_url
def image_list_detail(args, tenant=None, token=None, url=None):
    uuid = args['uuid']
    url = url + "/" + uuid
    output = do_request(url, token)
    print(output[0])
Sadly, you may notice that you need to get rid of producer_data and split it into multiple arguments, because you cannot factor out that part of the code: you would need to split it again inside each function anyway. I chose keyword arguments (by giving each a default value of None), but you could use positional arguments as well; your call.
BTW, note that this does not make the code more efficient, though it does make it a bit more readable (you know that you're changing the URL the same way for both methods, and when you fix the URL-changing part, it's fixed the same way everywhere). It adds two more function calls each time you call a function, so it's in no way more "efficient".
N.B.: This is basically based on @joel-cornett's example (I wouldn't have used @wraps otherwise, just a plain old double-function decorator); I just specialized it. I don't think he deserves a -1.
Please at least +1 his answer or accept it.
But I think a simpler way to do it would be:
def fix_url(producer_data):
    return (producer_data[0], producer_data[1],
            producer_data[2].replace(".images", ".servers"))

def images_custom_list(args, producer_data):
    tenant, token, url = fix_url(producer_data)
    # stuff ...

def image_list_detail(args, producer_data):
    tenant, token, url = fix_url(producer_data)
    # stuff ...
which uses a simpler syntax (no decorator) and does only one more function call.
Like this:
from functools import wraps

def my_timesaving_decorator(function):
    @wraps(function)
    def wrapper(*args, **kwargs):
        execute_code_common_to_multiple_function()
        # Now, call the "unique" code.
        # Make sure that if you modified the function args,
        # you pass the modified args here, not the original ones.
        return function(*args, **kwargs)
    return wrapper
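A hypothetical usage of this template, where the decorated function stands in for one of yours:

@my_timesaving_decorator
def image_list_detail(args, tenant=None, token=None, url=None):
    ...  # the code unique to this function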