how to save search input query from url [duplicate]

how to save search input query from url [duplicate] - python

Is there a way to parse a URL (with some python library) and return a python dictionary with the keys and values of a query parameters part of the URL?
For example:
url = "http://www.example.org/default.html?ct=32&op=92&item=98"
expected return:
{'ct':32, 'op':92, 'item':98}

Use the urllib.parse library:
>>> from urllib import parse
>>> url = "http://www.example.org/default.html?ct=32&op=92&item=98"
>>> parse.urlsplit(url)
SplitResult(scheme='http', netloc='www.example.org', path='/default.html', query='ct=32&op=92&item=98', fragment='')
>>> parse.parse_qs(parse.urlsplit(url).query)
{'item': ['98'], 'op': ['92'], 'ct': ['32']}
>>> dict(parse.parse_qsl(parse.urlsplit(url).query))
{'item': '98', 'op': '92', 'ct': '32'}
The urllib.parse.parse_qs() and urllib.parse.parse_qsl() methods parse out query strings, taking into account that keys can occur more than once and that order may matter.
If you are still on Python 2, urllib.parse was called urlparse.

For Python 3, the values of the dict from parse_qs are in a list, because there might be multiple values. If you just want the first one:
>>> from urllib.parse import urlsplit, parse_qs
>>>
>>> url = "http://www.example.org/default.html?ct=32&op=92&item=98"
>>> query = urlsplit(url).query
>>> params = parse_qs(query)
>>> params
{'item': ['98'], 'op': ['92'], 'ct': ['32']}
>>> dict(params)
{'item': ['98'], 'op': ['92'], 'ct': ['32']}
>>> {k: v[0] for k, v in params.items()}
{'item': '98', 'op': '92', 'ct': '32'}

If you prefer not to use a parser:
url = "http://www.example.org/default.html?ct=32&op=92&item=98"
url = url.split("?")[1]
dict = {x[0] : x[1] for x in [x.split("=") for x in url[1:].split("&") ]}
So I won't delete what's above but it's definitely not what you should use.
I think I read a few of the answers and they looked a little complicated, incase you're like me, don't use my solution.
Use this:
from urllib import parse
params = dict(parse.parse_qsl(parse.urlsplit(url).query))
and for Python 2.X
import urlparse as parse
params = dict(parse.parse_qsl(parse.urlsplit(url).query))
I know this is the same as the accepted answer, just in a one liner that can be copied.

For python 2.7
In [14]: url = "http://www.example.org/default.html?ct=32&op=92&item=98"
In [15]: from urlparse import urlparse, parse_qsl
In [16]: parse_url = urlparse(url)
In [17]: query_dict = dict(parse_qsl(parse_url.query))
In [18]: query_dict
Out[18]: {'ct': '32', 'item': '98', 'op': '92'}

I agree about not reinventing the wheel but sometimes (while you're learning) it helps to build a wheel in order to understand a wheel. :) So, from a purely academic perspective, I offer this with the caveat that using a dictionary assumes that name value pairs are unique (that the query string does not contain multiple records).
url = 'http:/mypage.html?one=1&two=2&three=3'
page, query = url.split('?')
names_values_dict = dict(pair.split('=') for pair in query.split('&'))
names_values_list = [pair.split('=') for pair in query.split('&')]
I'm using version 3.6.5 in the Idle IDE.

from urllib.parse import splitquery, parse_qs, parse_qsl
url = "http://www.example.org/default.html?ct=32&op=92&item=98&item=99"
splitquery(url)
# ('http://www.example.org/default.html', 'ct=32&op=92&item=98&item=99')
parse_qs(splitquery(url)[1])
# {'ct': ['32'], 'op': ['92'], 'item': ['98', '99']}
dict(parse_qsl(splitquery(url)[1]))
# {'ct': '32', 'op': '92', 'item': '99'}
# also works with url w/o query
parse_qs(splitquery("http://example.org")[1])
# {}
dict(parse_qsl(splitquery("http://example.org")[1]))
# {}
Old question, thougt I'd chip in though after I came across this splitquery thingy. Not sure about Python 2 since I dont use Python 2. splitquery is a bit more than re.split(r"\?", url, 1).

For python2.7 I am using urlparse module to parse url query to dict.
import urlparse
url = "http://www.example.org/default.html?ct=32&op=92&item=98"
print urlparse.parse_qs( urlparse.urlparse(url).query )
# result: {'item': ['98'], 'op': ['92'], 'ct': ['32']}

WSGI, python 2.7
Code
import sys
import json
import cgi
import urlparse
def application(environ, start_response):
status = '200 OK'
method = environ['REQUEST_METHOD']
args = urlparse.parse_qs(environ['QUERY_STRING'])
m = args['mesg']
x = {
"input": m[0],
"result": m[0].capitalize()
}
# convert into JSON:
y = json.dumps(x)
output = y
response_headers = [('Content-type', 'application/json'),
('Content-Length', str(len(output)))]
start_response(status, response_headers)
print (sys.version_info)
return [output]
URL
http:///echo.py?mesg=hola
Response
{"input": "hola", "result": "Hola"}

You can easily parse a URL with a speciific library.
Here is my simple code to parse it without any dedicated library.
(the input url must contain a domain name,a protocol and a path.
def parseURL(url):
seg2 = url.split('/')[2] # Separating domain name
seg1 = url.split(seg2)[-2] # Deriving protocol
print('Protocol:', seg1, '\n')
print('Domain name:', seg2, '\n')
seg3 = url.split(seg2)[1] #Getting the path; if output is empty,the there is no path in URL
print('Path:', seg3, '\n')
if '#' in url: # Extracting fragment id, else None
seg4 = url.split('#')[1]
print('Fragment ID:', seg4, '\n')
else:
seg4 = 'None'
if '#' in url: # Extracting user name, else None
seg5 = url.split('/')[-1]
print('Scheme with User Name:', seg5, '\n')
else:
seg5 = 'None'
if '?' in url: # Extracting query string, else None
seg6 = url.split('?')[-1]
print('Query string:', seg6, '\n')
else:
seg6 = 'None'
print('**The dictionary is in the sequence: 0.URL 1.Protocol 2.Domain name 3.Path 4.Fragment id 5.User name 6.Query string** \n')
dictionary = {'0.URL': url, '1.Protocol': seg1, '2.Domain name': seg2, '3.Path': seg3, '4.Fragment id': seg4,
'5.User name': seg5, '6.Query string': seg6} # Printing required dictionary
print(dictionary, '\n')
print('The TLD in the given URL is following: ')
if '.com' in url: # Extracting most famous TLDs maintained by ICAAN
print('.com\n')
elif '.de' in url:
print('.de\n')
elif '.uk' in url:
print('.uk\n')
elif 'gov' in url:
print('gov\n')
elif '.org' in url:
print('.org\n')
elif '.ru' in url:
print('.ru\n')
elif '.net' in url:
print('.net\n')
elif '.info' in url:
print('.info\n')
elif '.biz' in url:
print('.biz\n')
elif '.online' in url:
print('.online\n')
elif '.in' in url:
print('.in\n')
elif '.edu' in url:
print('.edu\n')
else:
print('Other low level domain!\n')
return dictionary
if name == 'main':
url = input("Enter your URL: ")
parseURL(url)
#Sample URLS to copy
# url='https://www.facebook.com/photo.php?fbid=2068026323275211&set=a.269104153167446&type=3&theater'
# url='http://www.blog.google.uk:1000/path/to/myfile.html?key1=value1&key2=value2#InTheDocument'
# url='https://www.overleaf.com/9565720ckjijuhzpbccsd#/347876331/'

Related

Python / Django: Get var from cookies

I tried to get distinct_id with request.COOKIES.get('distinct_id'). However Mixpanel saves the data in a not extractable way for me. Anyone knows why there are all these %22%3A%20%22 and how to extraxt distinct_id?
print(request.COOKIES):
{
'djdt': 'hide',
'cookie_bar': '1',
'mp_1384c4d0e46aaaaad007e3d8b5d6eda_mixpanel': '%7B%22distinct_id%22%3A%20%22165edf326870-00fc0e7eb72ed3-34677908-fa000-165e40c268947b%22%2C%22%24initial_referrer%22%3A%20%22%24direct%22%2C%22%24initial_referring_domain%22%3A%20%22%24direct%22%2C%22__alias%22%3A%20%22maz%2B1024%40gmail.com%22%7D',
'csrftoken': 'nvWzsrp3t6Sivkrsyu0gejjjjjiTfc36ZfkH7U7fgHaI40EF',
'sessionid': '7bkel6r27ebd55x262cv9lzv61gzoemw'
}

Check this code. You can run it because use the example you shared. First you must unquote the data in the mixpanel value. I used the suffix of the cookie key to get it. Then after the unquote you must load the json to get back a dictionary.
The code here prints all the keys in the dictionary, but you can easily get the distinct_id using mixpanel_dict.get('distinct_id')
Try it.
from urllib import parse
import json
cookie = {'djdt': 'hide',
'cookie_bar': '1',
'mp_1384c4d0e46aaaaad007e3d8b5d6eda_mixpanel': '%7B%22distinct_id%22%3A%20%22165edf326870-00fc0e7eb72ed3-34677908-fa000-165e40c268947b%22%2C%22%24initial_referrer%22%3A%20%22%24direct%22%2C%22%24initial_referring_domain%22%3A%20%22%24direct%22%2C%22__alias%22%3A%20%22maz%2B1024%40gmail.com%22%7D',
'csrftoken': 'nvWzsrp3t6Sivkrsyu0gejjjjjiTfc36ZfkH7U7fgHaI40EF',
'sessionid': '7bkel6r27ebd55x262cv9lzv61gzoemw'
}
def get_value_for_mixpanel(cookie):
mixpanel_dict = {}
for key in cookie.keys():
if '_mixpanel' in key:
value = parse.unquote(cookie.get(key))
mixpanel_dict = json.loads(value)
return mixpanel_dict
if __name__ == "__main__":
mixpanel_dict = get_value_for_mixpanel(cookie) # type: dict
for key,value in mixpanel_dict.items():
print("%s:%s" %(key, value))
Result
distinct_id:165edf326870-00fc0e7eb72ed3-34677908-fa000-165e40c268947b
$initial_referrer:$direct
$initial_referring_domain:$direct
__alias:maz+1024#gmail.com

Try unquote()
>>> s = '/path/to/my/handler/?action=query&id=112&type=vca&info=ch%3D0%26type%3Devent%26ev46[sts%3Dbegin'
>>> import urllib
>>> urllib.unquote(s)
>>> '/path/to/my/handler/?action=query&id=112&type=vca&info=ch=0&type=event&ev46[sts=begin'
Credits : https://stackoverflow.com/a/11215316/5647272

Python dictionary with same key for parsing in a url so must be ordered

I have a problem trying to create a dictionary, ordering it and joining it for parsing with urllib2.
This is my code:
values = {'STR':'1',
'STR':'123',
'STR':'3456',
'BAT':'95'}
ary_ordered_names = []
ary_ordered_names.append('STR')
ary_ordered_names.append('STR')
ary_ordered_names.append('STR')
ary_ordered_names.append('BAT')
queryString = "&".join( [ item+'='+urllib.pathname2url(values[item]) for item in ary_ordered_names ] )
print queryString
url = 'url'
full_url = url + '?' + queryString
print full_url
request = urllib2.Request(url, queryString)
response = urllib2.urlopen(full_url)
html = response.read()
print html
So when I execute this script it works but only send last STR value, 3456, not the rest.
Could anyone help me with any trick for this python dictionary problem?
Thanks in advance.

Dictionaries have no ordering and keys must be unique. Instead, pass a list of (key, value) tuples to the urllib.urlencode() function if order is important:
from urllib import urlencode
params = [('STR', '1'), ('STR', '123'), ('STR', '3456'), ('BAT', '95')]
query_string = urlencode(params)
Demo:
>>> from urllib import urlencode
>>> params = [('STR', '1'), ('STR', '123'), ('STR', '3456'), ('BAT', '95')]
>>> urlencode(params)
'STR=1&STR=123&STR=3456&BAT=95'
You can also use a sequence for the values and pass in True for the doseq parameter:
params = [('STR', ['1', '123', '3456']), ('BAT', '95')]
query_string = urlencode(params, True)
This produces the same output:
>>> params = [('STR', ['1', '123', '3456']), ('BAT', '95')]
>>> urlencode(params, True)
'STR=1&STR=123&STR=3456&BAT=95'
If the order of BAT and STR relative to one another does not matter, you can still use a dictionary, but use a sequence for the STR values:
params = {'STR': ['1', '123', '3456'], 'BAT': '95'}
query_string = urlencode(params, True)
The STR values are then grouped in order, but the BAT parameter can end up after or before that group:
>>> params = {'STR': ['1', '123', '3456'], 'BAT': '95'}
>>> urlencode(params, True)
'BAT=95&STR=1&STR=123&STR=3456'

As you have discovered, the order in a python dict is not guaranteed.
One option is to use a list of (key, value) tuples.
Another is to use an OrderedDict. [Edit: This would only work in the situation where there aren't any duplicate keys.]

Extract JSON attributes using Python and filter out some data

JSON File: http://media1.clubpenguin.com/play/en/web_service/game_configs/paper_items.json
I'm using Python 2.
I am trying to extract all the 'paper_item_id' 's from the JSON file (specified above), using a loop and storing the 'paper_item_id' in a 'item_id' variable, so this variable will change each time the loop iterates to the 'paper_item_id', but also I want to have an if statement which checks if the 'paper_item_id' in the JSON file 'is_bait' is 'true' if it is true the the 'item_id' variable will not store the 'paper_item_id' which has an 'is_bait' of true and go on to the next one.
Step 1) Get JSON Data.
Step 2) Filter out 'paper_item_id' 's with the 'is_bait' to true.
Step 3) Run a loop which assigns a 'item_id' variable to the 'paper_item_id' received.
Step 4) The loop should run so all filtered 'paper_item_id' (item_id) has been passed to 'myFunction'
Sample English Like Code:
foreach ('don't know what to have for the loop cond') {
item_id = 'paper_item_id'
if (item_id[is_bait]) == true {
code which will go to the end of the loop
}
else
{
myFunction(item_id)
}
I know this has a Javascript kind-of syntax but I want it in python.
What I have now:
import json
import urllib2
url = 'http://media1.clubpenguin.com/play/en/web_service/game_configs/paper_items.json'
result = json.loads(url)
request = urllib2.Request(url)
response = urllib2.urlopen(request)
json_obj = json.load(response)
What do I do know?

I've used requests.get, and also checked for a valid HTTP response.
Here's my sample code:
import json
import requests
def myFunction(item):
# do something here
print item['paper_item_id']
pass
json_obj = requests.get('http://media1.clubpenguin.com/play/en/web_service/game_configs/paper_items.json').json()
for item in json_obj:
if 'is_bait' in item and item['is_bait'] == "1":
# item['is_bait'] == "1", in case you ever get is_bait = "0" in your json response.
print item['paper_item_id']
continue
else:
myFunction(item)

Here is the translation of what you give as a pseudo code:
import json
import urllib2
url = 'http://media1.clubpenguin.com/play/en/web_service/game_configs/paper_items.json'
result = json.loads(url)
request = urllib2.Request(url)
response = urllib2.urlopen(request)
json_obj = json.load(response)
for item in json_obj:
if "is_bait" in item and item['is_bait']:
continue
else:
# do stuff
The continue can be skipped if you reverse the condition though.

json.load will read your data into a list of dictionaries, as appropriate. After that, you can filter on whatever it is your heart desires using standard python object manipulation:
import json
import urllib2
url = 'http://media1.clubpenguin.com/play/en/web_service/game_configs/paper_items.json'
result = json.load(url)
request = urllib2.Request(url)
response = urllib2.urlopen(request)
json_obj = json.load(response)
Your issue is that you're reading the URL object as a string and that's a non-starter. I fixed your code above and it now reads in the whole thing (snippet from output below:
{u'cost': 0,
u'is_bait': u'1',
u'is_member': Terue,
u'label': u"Stompin' Bob Fowhawk Hair",
u'layer': 6000,
u'paper_item_id': 1274,
u'prompt': u"Stompin' Bob Fowhawk Hair",
u'type': 2},
{u'cost': 0,
u'is_bait': u'1',
u'is_member': True,
u'label': u'G Billy Hair and Bandana',
u'layer': 6000,
u'paper_item_id': 1275,
u'prompt': u'G Billy Hair and Bandana',
u'type': 2},
... continues for a long time...

Looking for assertURLEquals

During a unittest I would like to compare a generated URL with a static one defined in the test. For this comparison it would be good to have a TestCase.assertURLEqual or similar which would let you compare two URLs in string format and result in True if all query and fragment components were present and equal but not necessarily in order.
Before I go implement this myself, is this feature around already?

I don't know if there is something built-in, but you could simply use urlparse and check yourself for the query parameters since order is taken into account by default.
>>> import urlparse
>>> url1 = 'http://google.com/?a=1&b=2'
>>> url2 = 'http://google.com/?b=2&a=1'
>>> # parse url ignoring query params order
... def parse_url(url):
... u = urlparse.urlparse(url)
... q = u.query
... u = urlparse.urlparse(u.geturl().replace(q, ''))
... return (u, urlparse.parse_qs(q))
...
>>> parse_url(url1)
(ParseResult(scheme='http', netloc='google.com', path='/', params='', query='', fragment=''), {'a': ['1'], 'b': ['2']})
>>> def assert_url_equals(url1, url2):
... return parse_url(url1) == parse_url(url1)
...
>>> assert_url_equals(url1, url2)
True

Well this is not too hard to implement with urlparse in the standard library:
from urlparse import urlparse, parse_qs
def urlEq(url1, url2):
pr1 = urlparse(url1)
pr2 = urlparse(url2)
return (pr1.scheme == pr2.scheme and
pr1.netloc == pr2.netloc and
pr1.path == pr2.path and
parse_qs(pr1.query) == parse_qs(pr2.query))
# Prints True
print urlEq("http://foo.com/blah?bar=1&foo=2", "http://foo.com/blah?foo=2&bar=1")
# Prints False
print urlEq("http://foo.com/blah?bar=1&foo=2", "http://foo.com/blah?foo=4&bar=1")
Basically, compare everything that is parsed from the URL but use parse_qs to get a dictionary from the query string.

Add params to given URL in Python

Suppose I was given a URL.
It might already have GET parameters (e.g. http://example.com/search?q=question) or it might not (e.g. http://example.com/).
And now I need to add some parameters to it like {'lang':'en','tag':'python'}. In the first case I'm going to have http://example.com/search?q=question&lang=en&tag=python and in the second — http://example.com/search?lang=en&tag=python.
Is there any standard way to do this?

There are a couple of quirks with the urllib and urlparse modules. Here's a working example:
try:
import urlparse
from urllib import urlencode
except: # For Python 3
import urllib.parse as urlparse
from urllib.parse import urlencode
url = "http://stackoverflow.com/search?q=question"
params = {'lang':'en','tag':'python'}
url_parts = list(urlparse.urlparse(url))
query = dict(urlparse.parse_qsl(url_parts[4]))
query.update(params)
url_parts[4] = urlencode(query)
print(urlparse.urlunparse(url_parts))
ParseResult, the result of urlparse(), is read-only and we need to convert it to a list before we can attempt to modify its data.

Outsource it to the battle tested requests library.
This is how I will do it:
from requests.models import PreparedRequest
url = 'http://example.com/search?q=question'
params = {'lang':'en','tag':'python'}
req = PreparedRequest()
req.prepare_url(url, params)
print(req.url)

Why
I've been not satisfied with all the solutions on this page (come on, where is our favorite copy-paste thing?) so I wrote my own based on answers here. It tries to be complete and more Pythonic. I've added a handler for dict and bool values in arguments to be more consumer-side (JS) friendly, but they are yet optional, you can drop them.
How it works
Test 1: Adding new arguments, handling Arrays and Bool values:
url = 'http://stackoverflow.com/test'
new_params = {'answers': False, 'data': ['some','values']}
add_url_params(url, new_params) == \
'http://stackoverflow.com/test?data=some&data=values&answers=false'
Test 2: Rewriting existing args, handling DICT values:
url = 'http://stackoverflow.com/test/?question=false'
new_params = {'question': {'__X__':'__Y__'}}
add_url_params(url, new_params) == \
'http://stackoverflow.com/test/?question=%7B%22__X__%22%3A+%22__Y__%22%7D'
Talk is cheap. Show me the code.
Code itself. I've tried to describe it in details:
from json import dumps
try:
from urllib import urlencode, unquote
from urlparse import urlparse, parse_qsl, ParseResult
except ImportError:
# Python 3 fallback
from urllib.parse import (
urlencode, unquote, urlparse, parse_qsl, ParseResult
)
def add_url_params(url, params):
""" Add GET params to provided URL being aware of existing.
:param url: string of target URL
:param params: dict containing requested params to be added
:return: string with updated URL
>> url = 'http://stackoverflow.com/test?answers=true'
>> new_params = {'answers': False, 'data': ['some','values']}
>> add_url_params(url, new_params)
'http://stackoverflow.com/test?data=some&data=values&answers=false'
"""
# Unquoting URL first so we don't loose existing args
url = unquote(url)
# Extracting url info
parsed_url = urlparse(url)
# Extracting URL arguments from parsed URL
get_args = parsed_url.query
# Converting URL arguments to dict
parsed_get_args = dict(parse_qsl(get_args))
# Merging URL arguments dict with new params
parsed_get_args.update(params)
# Bool and Dict values should be converted to json-friendly values
# you may throw this part away if you don't like it :)
parsed_get_args.update(
{k: dumps(v) for k, v in parsed_get_args.items()
if isinstance(v, (bool, dict))}
)
# Converting URL argument to proper query string
encoded_get_args = urlencode(parsed_get_args, doseq=True)
# Creating new parsed result object based on provided with new
# URL arguments. Same thing happens inside of urlparse.
new_url = ParseResult(
parsed_url.scheme, parsed_url.netloc, parsed_url.path,
parsed_url.params, encoded_get_args, parsed_url.fragment
).geturl()
return new_url
Please be aware that there may be some issues, if you'll find one please let me know and we will make this thing better

You want to use URL encoding if the strings can have arbitrary data (for example, characters such as ampersands, slashes, etc. will need to be encoded).
Check out urllib.urlencode:
>>> import urllib
>>> urllib.urlencode({'lang':'en','tag':'python'})
'lang=en&tag=python'
In python3:
from urllib import parse
parse.urlencode({'lang':'en','tag':'python'})

You can also use the furl module https://github.com/gruns/furl
>>> from furl import furl
>>> print furl('http://example.com/search?q=question').add({'lang':'en','tag':'python'}).url
http://example.com/search?q=question&lang=en&tag=python

If you are using the requests lib:
import requests
...
params = {'tag': 'python'}
requests.get(url, params=params)

Based on this answer, one-liner for simple cases (Python 3 code):
from urllib.parse import urlparse, urlencode
url = "https://stackoverflow.com/search?q=question"
params = {'lang':'en','tag':'python'}
url += ('&' if urlparse(url).query else '?') + urlencode(params)
or:
url += ('&', '?')[urlparse(url).query == ''] + urlencode(params)

I find this more elegant than the two top answers:
from urllib.parse import urlencode, urlparse, parse_qs
def merge_url_query_params(url: str, additional_params: dict) -> str:
url_components = urlparse(url)
original_params = parse_qs(url_components.query)
# Before Python 3.5 you could update original_params with
# additional_params, but here all the variables are immutable.
merged_params = {**original_params, **additional_params}
updated_query = urlencode(merged_params, doseq=True)
# _replace() is how you can create a new NamedTuple with a changed field
return url_components._replace(query=updated_query).geturl()
assert merge_url_query_params(
'http://example.com/search?q=question',
{'lang':'en','tag':'python'},
) == 'http://example.com/search?q=question&lang=en&tag=python'
The most important things I dislike in the top answers (they are nevertheless good):
Łukasz: having to remember the index at which the query is in the URL components
Sapphire64: the very verbose way of creating the updated ParseResult
What's bad about my response is the magically looking dict merge using unpacking, but I prefer that to updating an already existing dictionary because of my prejudice against mutability.

Yes: use urllib.
From the examples in the documentation:
>>> import urllib
>>> params = urllib.urlencode({'spam': 1, 'eggs': 2, 'bacon': 0})
>>> f = urllib.urlopen("http://www.musi-cal.com/cgi-bin/query?%s" % params)
>>> print f.geturl() # Prints the final URL with parameters.
>>> print f.read() # Prints the contents

python3, self explanatory I guess
from urllib.parse import urlparse, urlencode, parse_qsl
url = 'https://www.linkedin.com/jobs/search?keywords=engineer'
parsed = urlparse(url)
current_params = dict(parse_qsl(parsed.query))
new_params = {'location': 'United States'}
merged_params = urlencode({**current_params, **new_params})
parsed = parsed._replace(query=merged_params)
print(parsed.geturl())
# https://www.linkedin.com/jobs/search?keywords=engineer&location=United+States

I liked Łukasz version, but since urllib and urllparse functions are somewhat awkward to use in this case, I think it's more straightforward to do something like this:
params = urllib.urlencode(params)
if urlparse.urlparse(url)[4]:
print url + '&' + params
else:
print url + '?' + params

Use the various urlparse functions to tear apart the existing URL, urllib.urlencode() on the combined dictionary, then urlparse.urlunparse() to put it all back together again.
Or just take the result of urllib.urlencode() and concatenate it to the URL appropriately.

Yet another answer:
def addGetParameters(url, newParams):
(scheme, netloc, path, params, query, fragment) = urlparse.urlparse(url)
queryList = urlparse.parse_qsl(query, keep_blank_values=True)
for key in newParams:
queryList.append((key, newParams[key]))
return urlparse.urlunparse((scheme, netloc, path, params, urllib.urlencode(queryList), fragment))

In python 2.5
import cgi
import urllib
import urlparse
def add_url_param(url, **params):
n=3
parts = list(urlparse.urlsplit(url))
d = dict(cgi.parse_qsl(parts[n])) # use cgi.parse_qs for list values
d.update(params)
parts[n]=urllib.urlencode(d)
return urlparse.urlunsplit(parts)
url = "http://stackoverflow.com/search?q=question"
add_url_param(url, lang='en') == "http://stackoverflow.com/search?q=question&lang=en"

Here is how I implemented it.
import urllib
params = urllib.urlencode({'lang':'en','tag':'python'})
url = ''
if request.GET:
url = request.url + '&' + params
else:
url = request.url + '?' + params
Worked like a charm. However, I would have liked a more cleaner way to implement this.
Another way of implementing the above is put it in a method.
import urllib
def add_url_param(request, **params):
new_url = ''
_params = dict(**params)
_params = urllib.urlencode(_params)
if _params:
if request.GET:
new_url = request.url + '&' + _params
else:
new_url = request.url + '?' + _params
else:
new_url = request.url
return new_ur

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

how to save search input query from url [duplicate] - python

Is there a way to parse a URL (with some python library) and return a python dictionary with the keys and values of a query parameters part of the URL? For example: url = "http://www.example.org/default.html?ct=32&op=92&item=98" expected return: {'ct':32, 'op':92, 'item':98}

For python 2.7 In [14]: url = "http://www.example.org/default.html?ct=32&op=92&item=98" In [15]: from urlparse import urlparse, parse_qsl In [16]: parse_url = urlparse(url) In [17]: query_dict = dict(parse_qsl(parse_url.query)) In [18]: query_dict Out[18]: {'ct': '32', 'item': '98', 'op': '92'}

For python2.7 I am using urlparse module to parse url query to dict. import urlparse url = "http://www.example.org/default.html?ct=32&op=92&item=98" print urlparse.parse_qs( urlparse.urlparse(url).query ) # result: {'item': ['98'], 'op': ['92'], 'ct': ['32']}

Related

Python / Django: Get var from cookies

Python dictionary with same key for parsing in a url so must be ordered

Extract JSON attributes using Python and filter out some data

Looking for assertURLEquals

Add params to given URL in Python

Categories

Resources