I need to get the current model name from this URL:
http://localhost:8069/web#view_type=form&model=system.audit&menu_id=221&action=229
Here the model name is system.audit.
Please Help
You can use the built-in urlparse library in Python. Try this snippet:
from urlparse import urlparse, parse_qs

url = "http://localhost:8069/web#view_type=form&model=system.audit&menu_id=221&action=229"
uri = urlparse(url)
# The parameters live in the fragment (after '#'), not in the query string,
# so parse uri.fragment rather than uri.query.
qs = uri.fragment
print parse_qs(qs).get('model', None)
# ['system.audit']
It works fine for me; if it doesn't for you, check whether the model parameter is actually present in the URL you receive at runtime.
In Python 3, try this:
from urllib import parse
result = parse.parse_qs("http://localhost:8069/web#view_type=form&model=system.audit&menu_id=221&action=229")['model'][0]
print(result)
For Python 2, use urlparse.parse_qs() instead.
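A minimal Python 2 sketch of the same one-liner (assuming the same URL as above) would be:

import urlparse

result = urlparse.parse_qs("http://localhost:8069/web#view_type=form&model=system.audit&menu_id=221&action=229")['model'][0]
print result  # system.audit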
My code:
from amazonify import amazonify
from urllib.parse import urlparse
User_url = input('Enter the link here: ')
The error I got:
File "C:\Users\jashandeep\AppData\Local\Programs\Python\Python39\lib\site-packages\amazonify.py", line 4, in <module>
    from urlparse import urlparse, urlunparse
ModuleNotFoundError: No module named 'urlparse'
You will need to update this import to add Python 3 support:
From:
from urlparse import urlparse, urlunparse, parse_qs
from urllib import urlencode
to:
from urllib.parse import urlparse, urlunparse, parse_qs, urlencode
Your options:
1. Use this package with Python 2.
2. Update the package source code, i.e. modify the import statements in amazonify.py. In some cases, changing the source code can have unexpected consequences (for example, for dependent packages and scripts), so in general I do not recommend it. But this package is pretty simple and hasn't been updated for a long time, so it seems there will be no update and you would need to do everything yourself.
3. Use it as a standalone script. The package is basically a single Python function. The author did not specify a license, but you can write your own script similar to this example:
from urllib.parse import urlparse, urlunparse, parse_qs, urlencode


def amazonify(url, affiliate_tag):
    """Generate an Amazon affiliate link given any Amazon link and affiliate
    tag.

    :param str url: The Amazon URL.
    :param str affiliate_tag: Your unique Amazon affiliate tag.

    :rtype: str or None
    :returns: An equivalent Amazon URL with the desired affiliate tag included,
        or None if the URL is invalid.
    """
    # Ensure the URL we're getting is valid:
    new_url = urlparse(url)
    if not new_url.netloc:
        return None

    # Add or replace the original affiliate tag with our affiliate tag in the
    # querystring. Leave everything else unchanged.
    query_dict = parse_qs(new_url[4])
    query_dict['tag'] = affiliate_tag
    new_url = new_url[:4] + (urlencode(query_dict, True),) + new_url[5:]

    return urlunparse(new_url)


if __name__ == '__main__':
    affiliate_tag = 'tag'
    urls = [...]
    affiliate_urls = [amazonify(u, affiliate_tag) for u in urls]
    print(affiliate_urls)
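As a quick sanity check, with a made-up ASIN and affiliate tag (both purely for illustration):

# Hypothetical product URL and tag, for illustration only:
print(amazonify('https://www.amazon.com/dp/B00EXAMPLE', 'mytag-20'))
# https://www.amazon.com/dp/B00EXAMPLE?tag=mytag-20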
I have some links stored in a file which looks like this:
http://r14---sn-p5qlsnss.googlevideo.com/videoplayback?itag=22&id=o-AOtM1kWozUiJKP2ENWH989ZIfJaZNPVvXTrBkXx40lG5&key=yt6&ip=159.253.144.86&lmt=1480060612064057&dur=1047.870&mv=m&source=youtube&ms=au&ei=DtN8WLfwFsKb1gKXho6YDw&expire=1484597102&mn=sn-p5qlsnss&mm=31&ipbits=0&nh=IgpwcjAzLmlhZDA3KgkxMjcuMC4wLjE&initcwndbps=4717500&mt=1484575249&pl=24&signature=1ECAB2B56C30CBF760721A1A26A7E80963DB36B8.6336B2C9C41DB53C8FA1D2A037793275F57C4825&ratebypass=yes&mime=video%2Fmp4&upn=tUcEt34Qe6c&sparams=dur%2Cei%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Clmt%2Cmime%2Cmm%2Cmn%2Cms%2Cmv%2Cnh%2Cpl%2Cratebypass%2Csource%2Cupn%2Cexpire&title=600ft+UFO+Crash+Site+Discovered+On+Mars%21+11%2F23%2F16
At the end of the link we have the video's title. I want to read this link from a file and get the video's title in a proper format (with those '+' and '%' signs properly resolved). How do I do that?
I cannot use raw cgi as suggested here since the link is read from a file and not submitted by a form. Any idea?
There's the super-convenient urllib.parse.parse_qs for Python 3, but if you're using Python 2, you might have to dig out the title string first.

import urllib

url = 'http://r14---sn-p5qlsnss.googlevideo.com/videoplayback?itag=22&id=o-AOtM1kWozUiJKP2ENWH989ZIfJaZNPVvXTrBkXx40lG5&key=yt6&ip=159.253.144.86&lmt=1480060612064057&dur=1047.870&mv=m&source=youtube&ms=au&ei=DtN8WLfwFsKb1gKXho6YDw&expire=1484597102&mn=sn-p5qlsnss&mm=31&ipbits=0&nh=IgpwcjAzLmlhZDA3KgkxMjcuMC4wLjE&initcwndbps=4717500&mt=1484575249&pl=24&signature=1ECAB2B56C30CBF760721A1A26A7E80963DB36B8.6336B2C9C41DB53C8FA1D2A037793275F57C4825&ratebypass=yes&mime=video%2Fmp4&upn=tUcEt34Qe6c&sparams=dur%2Cei%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Clmt%2Cmime%2Cmm%2Cmn%2Cms%2Cmv%2Cnh%2Cpl%2Cratebypass%2Csource%2Cupn%2Cexpire&title=600ft+UFO+Crash+Site+Discovered+On+Mars%21+11%2F23%2F16'
# Everything after '&title=' is the URL-encoded title; unquote_plus decodes
# the '+' signs and percent escapes.
title = url[url.rfind('&title=') + 7:]
print urllib.unquote_plus(title)
Note: thanks to bereal for pointing out that parse_qs is also available in Python 2, so you can just do:

import urlparse

print urlparse.parse_qs(url)['title'][0]
# 600ft UFO Crash Site Discovered On Mars! 11/23/16
You could use urllib.parse.parse_qs and give it the string:
In [17]: urllib.parse.parse_qs(s)
Out[17]:
{'dur': ['1047.870'],
'ei': ['DtN8WLfwFsKb1gKXho6YDw'],
'expire': ['1484597102'],
'http://r14---sn-p5qlsnss.googlevideo.com/videoplayback?itag': ['22'],
[.. and so on ..]
'source': ['youtube'],
'sparams': ['dur,ei,id,initcwndbps,ip,ipbits,itag,lmt,mime,mm,mn,ms,mv,nh,pl,ratebypass,source,upn,expire'],
'title': ['600ft UFO Crash Site Discovered On Mars! 11/23/16'],
'upn': ['tUcEt34Qe6c']}
In [18]: urllib.parse.parse_qs(s)["title"][0]
Out[18]: '600ft UFO Crash Site Discovered On Mars! 11/23/16'
Purl can fit your needs:
import purl
u = purl.URL('http://r14---sn-p5qlsnss.googlevideo.com/videoplayback?itag=22&id=o-AOtM1kWozUiJKP2ENWH989ZIfJaZNPVvXTrBkXx40lG5&key=yt6&ip=159.253.144.86&lmt=1480060612064057&dur=1047.870&mv=m&source=youtube&ms=au&ei=DtN8WLfwFsKb1gKXho6YDw&expire=1484597102&mn=sn-p5qlsnss&mm=31&ipbits=0&nh=IgpwcjAzLmlhZDA3KgkxMjcuMC4wLjE&initcwndbps=4717500&mt=1484575249&pl=24&signature=1ECAB2B56C30CBF760721A1A26A7E80963DB36B8.6336B2C9C41DB53C8FA1D2A037793275F57C4825&ratebypass=yes&mime=video%2Fmp4&upn=tUcEt34Qe6c&sparams=dur%2Cei%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Clmt%2Cmime%2Cmm%2Cmn%2Cms%2Cmv%2Cnh%2Cpl%2Cratebypass%2Csource%2Cupn%2Cexpire&title=600ft+UFO+Crash+Site+Discovered+On+Mars%21+11%2F23%2F16')
print(u.query_param('title'))
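Note that purl is a third-party package, so you would need to install it first (typically with pip install purl).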
Use urlparse.parse_qs:

try:
    import urlparse  # Python 2
except ImportError:
    from urllib import parse as urlparse  # Python 3

rv = urlparse.parse_qs(link)
title = rv['title'][0]
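For example, assuming the links sit one per line in a file called links.txt (the filename is just an assumption, adjust it to wherever your links are stored):

with open('links.txt') as f:
    link = f.readline().strip()

rv = urlparse.parse_qs(link)
print(rv['title'][0])  # 600ft UFO Crash Site Discovered On Mars! 11/23/16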
import urllib

a = "http://r14---sn-p5qlsnss.googlevideo.com/videoplayback?itag=22&id=o-AOtM1kWozUiJKP2ENWH989ZIfJaZNPVvXTrBkXx40lG5&key=yt6&ip=159.253.144.86&lmt=1480060612064057&dur=1047.870&mv=m&source=youtube&ms=au&ei=DtN8WLfwFsKb1gKXho6YDw&expire=1484597102&mn=sn-p5qlsnss&mm=31&ipbits=0&nh=IgpwcjAzLmlhZDA3KgkxMjcuMC4wLjE&initcwndbps=4717500&mt=1484575249&pl=24&signature=1ECAB2B56C30CBF760721A1A26A7E80963DB36B8.6336B2C9C41DB53C8FA1D2A037793275F57C4825&ratebypass=yes&mime=video%2Fmp4&upn=tUcEt34Qe6c&sparams=dur%2Cei%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Clmt%2Cmime%2Cmm%2Cmn%2Cms%2Cmv%2Cnh%2Cpl%2Cratebypass%2Csource%2Cupn%2Cexpire&title=600ft+UFO+Crash+Site+Discovered+On+Mars%21+11%2F23%2F16"
# The title happens to be the last parameter, so take everything after the
# final '=' and decode the '+' signs and percent escapes.
b = a.split('=')[-1]
print urllib.unquote_plus(b)
I am using a Jupyter Notebook and I am trying to get docid=PE209374738 as my output using a regex. The URL is currently stored in a dictionary in this format:
{'Url': 'https://backtoschool.com/document.php?docid=PE209374738&datasource=PHE&vid=3326&referrer=api'}.
This is my code:
results = xmldoc.getElementsByTagName("result")
dict = {}
for a in results:
    url = 'Url'
    dict[url] = a.getElementsByTagName("url")[0].childNodes[0].nodeValue
    docid = re.search(r'\?(.*?)&')
Does anyone have any suggestions on how to print that id?
The standard library already has methods for parsing URLs properly, no need for regex.
In Python 3:
from urllib.parse import urlparse, parse_qs
url = 'https://backtoschool.com/document.php?docid=PE209374738&datasource=PHE&vid=3326&referrer=api'
print(parse_qs(urlparse(url).query)['docid'][0]) # PE209374738
In Python 2 the first line is:
from urlparse import urlparse, parse_qs
@alex-hall is correct: you would probably be better off parsing this with a proper URL parser.
That said, your original question was about doing it with regexps, so here is the solution (which you had nearly nailed already):
s = 'https://backtoschool.com/document.php?docid=PE209374738&datasource=PHE&vid=3326&referrer=api'
m = re.search(r'\?docid=(.*?)&', s)
print m.groups()[0]
This will print the desired PE209374738.
In a Django 1.8 simple tag, I need to resolve the path to the HTTP_REFERER found in the context. I have a piece of code that works, but I would like to know if a more elegant solution could be implemented using Django tools.
Here is my code :
from django.core.urlresolvers import resolve, Resolver404
# [...]

@register.simple_tag(takes_context=True)
def simple_tag_example(context):
    # The referer is a full path: http://host:port/path/to/referer/
    # We only want the path: /path/to/referer/
    referer = context.request.META.get('HTTP_REFERER')
    if referer is None:
        return ''
    # Build the string http://host:port/
    prefix = '%s://%s' % (context.request.scheme, context.request.get_host())
    path = referer.replace(prefix, '')
    resolvermatch = resolve(path)
    # Do something very interesting with this resolvermatch...
So I manually construct the string 'http://sub.domain.tld:port' and then strip it from the full HTTP_REFERER found in context.request.META. It works, but it feels a bit heavy-handed to me.
I tried to build an HttpRequest from the referer without success. Is there a class or type that I can use to easily extract the path from a URL?
You can use the urlparse module to extract the path:
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2

parsed = urlparse('http://stackoverflow.com/questions/32809595')
print(parsed.path)
Output:
'/questions/32809595'
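Applied to your simple tag, a minimal sketch (reusing the register object, tag name, and resolve import from your question) could look like this:

try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2

from django.core.urlresolvers import resolve

@register.simple_tag(takes_context=True)
def simple_tag_example(context):
    referer = context.request.META.get('HTTP_REFERER')
    if referer is None:
        return ''
    # urlparse(...).path gives the path component directly, so there is no
    # need to rebuild and strip the http://host:port prefix.
    resolvermatch = resolve(urlparse(referer).path)
    # Do something very interesting with this resolvermatch...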
I'm doing this:
urlparse.urljoin('http://example.com/mypage', '?name=joe')
And I get this:
'http://example.com/?name=joe'
While I want to get this:
'http://example.com/mypage?name=joe'
What am I doing wrong?
You could use urlparse.urlunparse:
import urlparse
parsed = list(urlparse.urlparse('http://example.com/mypage'))
parsed[4] = 'name=joe'
urlparse.urlunparse(parsed)
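For this example, the rebuilt URL comes out as you wanted:

print(urlparse.urlunparse(parsed))
# http://example.com/mypage?name=joe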
You're experiencing a known bug which affects Python 2.4-2.6.
If you can't change or patch your version of Python, @jd's solution will work around the issue.
However, if you need a more generic solution that behaves like a standard urljoin, you can use a wrapper function that applies the workaround for this specific case and falls back to the standard urljoin() otherwise.
For example:
import urlparse

def myurljoin(base, url, allow_fragments=True):
    if url[0] != "?":
        return urlparse.urljoin(base, url, allow_fragments)
    if not allow_fragments:
        url = url.split("#", 1)[0]
    parsed = list(urlparse.urlparse(base))
    parsed[4] = url[1:]  # assign the query field (index 4 of the parse tuple)
    return urlparse.urlunparse(parsed)
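A quick check with the URLs from the question:

print(myurljoin('http://example.com/mypage', '?name=joe'))
# http://example.com/mypage?name=joe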
I solved it by bundling Python 2.6's urlparse module with my project. I also had to bundle namedtuple which was defined in collections, since urlparse uses it.
Are you sure? On Python 2.7:
>>> import urlparse
>>> urlparse.urljoin('http://example.com/mypage', '?name=joe')
'http://example.com/mypage?name=joe'