I'm new to python and would like some assistance.
I have a variable
q = request.GET['q']
How do I insert the variable q inside this:
url = "http://search.com/search?term="+q+"&location=sf"
I'm not sure what the convention is. I'm used to PHP and JavaScript; in Python, how do you insert a variable into a string dynamically?
Use the format method of str:
url = "http://search.com/search?term={0}&location=sf".format(q)
But of course you should URL-encode the q:
from urllib.parse import quote_plus  # on Python 2 this was urllib.quote_plus
...
qencoded = quote_plus(q)
url = "http://search.com/search?term={0}&location=sf".format(qencoded)
One way to do it is using urllib.parse.urlencode() (urllib.urlencode() on Python 2). It accepts a dictionary (or any mapping) of key-value pairs and encodes it into a query string you can append to the URL:
from urllib.parse import urlencode
myurl = "http://somewebsite.com/?"
parameter_value_pairs = {"q": "q_value", "r": "r_value"}
req_url = myurl + urlencode(parameter_value_pairs)
This will give you "http://somewebsite.com/?q=q_value&r=r_value"
q = request.GET['q']
url = "http://search.com/search?term=%s&location=sf" % q
Use this; it is concise (note that, like the other formatting approaches, %-formatting does not URL-encode q for you).
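None of the interpolation styles above escapes special characters, so here is a minimal sketch of why the encoding step matters (search.com and the term/location parameters are taken from the question; the input value is made up):

```python
from urllib.parse import urlencode

# User input containing characters that are unsafe in a raw URL
q = "coffee & tea"
url = "http://search.com/search?" + urlencode({"term": q, "location": "sf"})
print(url)  # http://search.com/search?term=coffee+%26+tea&location=sf
```

Without encoding, the literal "&" in the input would be read by the server as a parameter separator, silently truncating the search term.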
I'll try to clarify what I mean.
Let's say I have this url:
https://test-api.com/users?age=17&sex=M
There are 2 fields: age and sex. The age field is required, but the sex field is optional.
Let's say I want to make a bunch of tests and I use this code:
import requests
def testUserQuery(user_age, user_sex):
    url = f'https://test-api.com/users?age={user_age}&sex={user_sex}'
    response = requests.get(url)
test_query = testUserQuery(17)
Now, assuming that I can't go into the actual code of the API itself and change how empty fields are interpreted...
How can I make this test and leave the user_sex field blank?
In other words, is there a special universal symbol (like "&" which means "and" for every URL in the world) that I can put for user_sex that'll force the API to ignore the field and not cause errors?
Otherwise, I would have to do this:
import requests
def testUserQuery(user_age, user_sex=None):
    if user_sex is None:
        url = f'https://test-api.com/users?age={user_age}'
    else:
        url = f'https://test-api.com/users?age={user_age}&sex={user_sex}'
    response = requests.get(url)
test_query = testUserQuery(17)
Imagine if I'm dealing with 10 optional fields. I don't think it would be very efficient to make multiple elif statements to change the URL for every single possible case where an optional field is empty.
I hope I'm being clear, sorry if this sounds confusing.
Here's a simple way to do this by utilising the params parameter:
import requests
URL = 'https://test-api.com/users'
def testUserQuery(**params):
    return requests.get(URL, params=params)
testUserQuery(age=21, sex='male')
testUserQuery(age=21)
In other words, all you have to do is match the parameter names to those understood by the API. There is no need to manipulate the URL yourself; requests encodes the query string for you and omits parameters you don't pass.
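You can verify the URL requests would generate without sending anything by preparing the request first. A small sketch (test-api.com is the hypothetical endpoint from the question):

```python
import requests

# Prepare, but do not send, the request to inspect the generated URL
req = requests.Request("GET", "https://test-api.com/users",
                       params={"age": 21, "sex": "male"}).prepare()
print(req.url)  # https://test-api.com/users?age=21&sex=male
```

Leaving a key out of the params dict simply drops it from the query string, which is exactly the optional-field behaviour the question asks for.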
One way to dynamically achieve this is by changing testUserQuery to accept its arguments as **kwargs then using urllib.parse.urlencode to dynamically build the query string.
from urllib.parse import urlencode
def testUserQuery(base_url='https://test-api.com/users', **kwargs):
    params = urlencode({k: v for k, v in kwargs.items() if v is not None})
    url = f"{base_url}{'?' + params if params else ''}"
    print(url)
testUserQuery()
testUserQuery(a=1)
testUserQuery(a=1, b=2)
This outputs
https://test-api.com/users
https://test-api.com/users?a=1
https://test-api.com/users?a=1&b=2
I am trying to implement a search bar in Django. I get the search input as movie_title from the HTML form. Now, how do I include it in my API call? I tried with curly braces. Here is the code:
def searchbar(request):
    if request.method == 'GET':
        movie_title = request.GET.get('movie_title')
        searched_movie = requests.get(
            'http://www.omdbapi.com/?apikey=9a63b7fd&t={movie_title}')
You can build the URL with an f-string (note the f prefix, which your code is missing; without it the braces are kept literally) and pass it to the get() method:
url = f'http://www.omdbapi.com/?apikey=9a63b7fd&t={movie_title}'
searched_movie = requests.get(url)
Note: You don't need to create a different object and can directly use:
searched_movie = requests.get(f'http://www.omdbapi.com/?apikey=9a63b7fd&t={movie_title}')
The above approach helps with readability when there are several dynamic attributes involved.
If you want to use { and } in your query string you can simply write
searched_movie = requests.get('http://www.omdbapi.com/?apikey=9a63b7fd&t={'+movie_title+'}')
Otherwise you can write
searched_movie = requests.get('http://www.omdbapi.com/?apikey=9a63b7fd&t='+movie_title)
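One caveat with plain concatenation: movie titles often contain spaces or other characters that must be percent-encoded. A standard-library sketch (the OMDb URL and apikey value are copied from the question; the title is made up):

```python
from urllib.parse import urlencode

movie_title = "The Matrix"  # hypothetical form input containing a space
url = "http://www.omdbapi.com/?" + urlencode({"apikey": "9a63b7fd", "t": movie_title})
print(url)  # http://www.omdbapi.com/?apikey=9a63b7fd&t=The+Matrix
```

Passing params={"apikey": ..., "t": movie_title} to requests.get achieves the same encoding automatically.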
I have a requests.cookies.RequestsCookieJar object which contains multiple cookies from different domains/paths. How can I extract a cookie string for a particular domain/path, following the rules mentioned here?
For example
>>> r = requests.get("https://stackoverflow.com")
>>> print(r.cookies)
<RequestsCookieJar[<Cookie prov=4df137f9-848e-01c3-f01b-35ec61022540 for .stackoverflow.com/>]>
# the function I expect
>>> getCookies(r.cookies, "stackoverflow.com")
"prov=4df137f9-848e-01c3-f01b-35ec61022540"
>>> getCookies(r.cookies, "meta.stackoverflow.com")
"prov=4df137f9-848e-01c3-f01b-35ec61022540"
# meta.stackoverflow.com is also satisfied as it is subdomain of .stackoverflow.com
>>> getCookies(r.cookies, "google.com")
""
# r.cookies does not contain any cookie for google.com, so it returns an empty string
I think you need to work with a Python dictionary of the cookies.
def getCookies(cookie_jar, domain):
    cookie_dict = cookie_jar.get_dict(domain=domain)
    found = ['%s=%s' % (name, value) for (name, value) in cookie_dict.items()]
    return ';'.join(found)
Your example:
>>> r = requests.get("https://stackoverflow.com")
>>> getCookies(r.cookies, ".stackoverflow.com")
"prov=4df137f9-848e-01c3-f01b-35ec61022540"
NEW ANSWER
Ok, so I still don't get exactly what it is you are trying to achieve.
If you want to extract the originating url from a requests.RequestCookieJar object (so that you could then check if there is a match with a given subdomain) that is (as far as I know) impossible.
However, you could of course do something like:
#!/usr/bin/env python3
# -*- coding: UTF-8 -*-
import requests
import re
class getCookies():
    def __init__(self, url):
        self.cookiejar = requests.get(url).cookies
        self.url = url

    def check_domain(self, domain):
        try:
            # raw string avoids invalid-escape warnings for \. in newer Pythons
            base_domain = re.compile(r"(?<=\.).+\..+$").search(domain).group()
        except AttributeError:
            base_domain = domain
        if base_domain in self.url:
            print("\"prov=" + str(dict(self.cookiejar)["prov"]) + "\"")
        else:
            print("No cookies for " + domain + " in this jar!")
Then if you do:
new_instance = getCookies("https://stackoverflow.com")
You could then do:
new_instance.check_domain("meta.stackoverflow.com")
Which would give the output:
"prov=5d4fda78-d042-2ee9-9a85-f507df184094"
While:
new_instance.check_domain("google.com")
Would output:
"No cookies for google.com in this jar!"
Then, if you (if needed) fine-tune the regex & create a list of urls, you could first loop through the list to create many instances and save them in eg a list or dict. In a second loop you could check another list of urls to see if their cookies might be present in any of the instances.
OLD ANSWER
The docs you link to explain:
items()
    Dict-like items() that returns a list of name-value tuples from the jar. Allows client-code to call dict(RequestsCookieJar) and get a vanilla python dict of key value pairs.
I think what you are looking for is:
#!/usr/bin/env python3
# -*- coding: UTF-8 -*-
import requests
def getCookies(url):
    r = requests.get(url)
    print("\"prov=" + str(dict(r.cookies)["prov"]) + "\"")
Now I can run it like this:
>>> getCookies("https://stackoverflow.com")
"prov=f7712c78-b489-ee5f-5e8f-93c85ca06475"
Actually, I had the same problem. When I looked at the class definition
class RequestsCookieJar(cookielib.CookieJar, MutableMapping):
I found a method, def get_dict(self, domain=None, path=None), which does exactly this.
Alternatively, you can parse a raw cookie string with http.cookies.SimpleCookie:
from http.cookies import SimpleCookie
raw = "rawCookide"
print(len(raw))
mycookie = SimpleCookie()
mycookie.load(raw)
UCookie = {}
for key, morsel in mycookie.items():
    UCookie[key] = morsel.value
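Since get_dict is mentioned above but not demonstrated, here is a small sketch; it builds a jar offline rather than fetching anything, and the cookie names, values, and domains are made up:

```python
import requests

jar = requests.cookies.RequestsCookieJar()
# Two cookies scoped to different domains
jar.set("prov", "abc123", domain=".stackoverflow.com", path="/")
jar.set("NID", "xyz", domain=".google.com", path="/")

# get_dict filters by exact domain string
print(jar.get_dict(domain=".stackoverflow.com"))  # {'prov': 'abc123'}
```

Note that the domain argument is matched literally (".stackoverflow.com" is not the same as "stackoverflow.com"), which is why the question's subdomain-matching requirement needs extra logic.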
The following code is not promised to be "forward compatible" because I am accessing attributes of classes that were intentionally hidden (kind of) by their authors; however, if you must get into the attributes of a cookie, take a look here:
import http.cookies
import requests
import json
import sys
import os
aresponse = requests.get('https://www.att.com')
requestscookiejar = aresponse.cookies
for cdomain, cooks in requestscookiejar._cookies.items():
    for cpath, cookgrp in cooks.items():
        for cname, cattribs in cookgrp.items():
            print(cattribs.version)
            print(cattribs.name)
            print(cattribs.value)
            print(cattribs.port)
            print(cattribs.port_specified)
            print(cattribs.domain)
            print(cattribs.domain_specified)
            print(cattribs.domain_initial_dot)
            print(cattribs.path)
            print(cattribs.path_specified)
            print(cattribs.secure)
            print(cattribs.expires)
            print(cattribs.discard)
            print(cattribs.comment)
            print(cattribs.comment_url)
            print(cattribs.rfc2109)
            print(cattribs._rest)
When you only need the simple attributes of cookies, it is likely less complicated to go the following way, which avoids RequestsCookieJar entirely. Here we construct a single SimpleCookie instance by reading from the headers attribute of a response object instead of the cookies attribute. The name SimpleCookie would seem to imply a single cookie, but that isn't what it is: it holds a collection of cookies (morsels). One caveat: splitting the Set-Cookie header value on commas can misparse cookies whose expires attribute itself contains a comma. Try it out:
import http.cookies
import requests
import json
import sys
import os
def parse_cookies(http_response):
    cookie_grp = http.cookies.SimpleCookie()
    for h, v in http_response.headers.items():
        if 'set-cookie' in h.lower():
            for cook in v.split(','):
                cookie_grp.load(cook)
    return cookie_grp
aresponse = requests.get('https://www.att.com')
cookies = parse_cookies(aresponse)
print(str(cookies))
You can get the list of domains in a RequestsCookieJar and then dump the cookies for each domain with the following code:
import requests
response = requests.get("https://stackoverflow.com")
cjar = response.cookies
for domain in cjar.list_domains():
    print(f'Cookies for {domain}: {cjar.get_dict(domain=domain)}')
Outputs:
Cookies for .stackoverflow.com: {'prov': 'efe8c1b7-ddbd-4ad5-9060-89ea6c29479e'}
In this example only one domain is listed; the output would have multiple lines if the jar held cookies for multiple domains.
For many use cases, the cookie jar can be serialized by simply ignoring domains:
dCookies = cjar.get_dict()
We can easily extract cookies string for a particular domain/path using functions already available in requests lib.
import requests
from requests.models import Request
from requests.cookies import get_cookie_header
session = requests.session()
r1 = session.get("https://www.google.com")
r2 = session.get("https://stackoverflow.com")
cookie_header1 = get_cookie_header(session.cookies, Request(method="GET", url="https://www.google.com"))
# '1P_JAR=2022-02-19-18; NID=511=Hz9Mlgl7DtS4uhTqjGOEolNwzciYlUtspJYxQ0GWOfEm9u9x-_nJ1jpawixONmVuyua59DFBvpQZkPzNAeZdnJjwiB2ky4AEFYVV'
cookie_header2 = get_cookie_header(session.cookies, Request(method="GET", url="https://stackoverflow.com"))
# 'prov=883c41a4-603b-898c-1d14-26e30e3c8774'
Request is used to prepare a PreparedRequest, which is sent to the server.
What you need is the get_dict() method:
a_session = requests.Session()
a_session.get('https://google.com/')
session_cookies = a_session.cookies
cookies_dictionary = session_cookies.get_dict()
# Now just print it or convert to json
import json  # needed for json.dumps below
as_string = json.dumps(cookies_dictionary)
print(cookies_dictionary)
The working code is like this:
csrf = list(set(htmls.xpath("//input[@name='whatever']/@value")))[0]
However, I'm trying to get that input name as a parameter passed into the function, in that way I would do something like this:
tokenname = sys.argv[2]
which gives the value 'whatever', and I want to pass it something like this:
csrf = list(set(htmls.xpath("//input[@name="+tokenname+"]/@value")))[0]
But it doesn't work that way; is there any way to pass a variable into that @name value?
The full code is here:
import requests
from lxml import html
import json
import sys
session_requests = requests.session()
login_url = sys.argv[1]
tokenname = sys.argv[2]
result = session_requests.get(login_url)
htmls = html.fromstring(result.text)
csrf = list(set(htmls.xpath("//input[@name={}]/@value".format(tokenname))))[0]
print(csrf)
EDIT
Based upon the discussion, it looks like you had issues with " and escape characters.
Use following
csrf = list(set(htmls.xpath("//input[@name=\"{}\"]/@value".format(tokenname))))[0]
Old
You can use format as below
"//input[#name={}]/#value".format('whatever')
From python doc site
str.format(*args, **kwargs)
Perform a string formatting operation. The string on which this method is called can contain literal text or replacement fields delimited by braces {}. Each replacement field contains either the numeric index of a positional argument, or the name of a keyword argument. Returns a copy of the string where each replacement field is replaced with the string value of the corresponding argument.
>>> "The sum of 1 + 2 is {0}".format(1+2)
'The sum of 1 + 2 is 3'
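Replacement fields can also be named and filled with keyword arguments, which reads better once several values are involved. A short sketch applied to the XPath from the question (the field name "name" and the value "csrf_token" are arbitrary):

```python
template = '//input[@name="{name}"]/@value'
# Named fields make it obvious which value lands where
print(template.format(name="csrf_token"))  # //input[@name="csrf_token"]/@value
```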
I am using Python and Flask, and I have some YouTube URLs I need to convert to their embed versions. For example, this:
https://www.youtube.com/watch?v=X3iFhLdWjqc
has to be converted into this:
https://www.youtube.com/embed/X3iFhLdWjqc
Should I use Regexp, or is there a Flask method to convert the URLs?
Assuming your URLs are just strings, you don't need regexes or special Flask functions to do it.
This code will replace all YouTube URLs with the embedded versions, based off of how you said it should be handled:
url = "https://youtube.com/watch?v=TESTURLNOTTOBEUSED"
url = url.replace("watch?v=", "embed/")
All you have to do is replace url with whatever variable you store the URL in.
To do this for a list, use:
new_url_list = list()
for address in old_url_list:
    new_address = address.replace("watch?v=", "embed/")
    new_url_list.append(new_address)
old_url_list = new_url_list
where old_url_list is the list which your URLs are included in.
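The loop above can also be written as a list comprehension, which is the more idiomatic Python form (the example URLs are made up):

```python
old_url_list = ["https://www.youtube.com/watch?v=abc123",
                "https://www.youtube.com/watch?v=def456"]
# Same replace applied to every element in one expression
new_url_list = [address.replace("watch?v=", "embed/") for address in old_url_list]
print(new_url_list)
# ['https://www.youtube.com/embed/abc123', 'https://www.youtube.com/embed/def456']
```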
There are two types of youtube links:
http://www.youtube.com/watch?v=xxxxxxxxxxx
or
http://youtu.be/xxxxxxxxxxx
Use this function; it works with both kinds of youtube links.
import re
def embed_url(video_url):
    # https? so that plain http:// links are matched from the start as well
    regex = r"(?:https?:\/\/)?(?:www\.)?(?:youtube\.com|youtu\.be)\/(?:watch\?v=)?(.+)"
    return re.sub(regex, r"https://www.youtube.com/embed/\1", video_url)
You can try this:
import re
videoUrl = "https://www.youtube.com/watch?v=X3iFhLdWjqc"
embedUrl = re.sub(r"(?ism).*?=(.*?)$", r"https://www.youtube.com/embed/\1", videoUrl )
print (embedUrl)
Output:
https://www.youtube.com/embed/X3iFhLdWjqc