Unable to detect the URI-scheme of "urltest" error in Python

I am getting this error when trying to open a URL that I read from a .txt file in Python using match.group(). The code where the error comes up is below. Any help as to how this can be corrected would be very much appreciated.
with open('output.txt') as f:
    for line in f:
        match = re.search("(?P<url>https?://docs.google.com/file[^\s]+)", line)
        if match is not None:
            urltest = match.group()
            print urltest
            print "[*] Opening Map in the web browser..."
            kml_url = "urltest"
            try:
                webbrowser.get().open_new_tab(kml_url)

Since you have not shown the text you are trying to parse, I can only guess, but something like this should work for your URL:
>>> import re
>>> match = re.search('(?P<url>https:\/\/docs.google.com\/file[a-zA-Z0-9-]*)', 'https://docs.google.com/fileCharWithnumbers123')
>>> match.group("url")
'https://docs.google.com/fileCharWithnumbers123'
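The error message quotes the literal text "urltest", which suggests that kml_url = "urltest" hands the browser the string "urltest" rather than the value of the variable urltest. A minimal Python 3 sketch of passing the matched value instead; the names URL_PATTERN, find_url and open_urls are made up for illustration and are not from the question:

```python
import re
import webbrowser

# Same idea as the pattern in the question; \S+ stops at the first whitespace.
URL_PATTERN = re.compile(r"(?P<url>https?://docs\.google\.com/file\S+)")

def find_url(line):
    """Return the first matching URL in line, or None if there is none."""
    match = URL_PATTERN.search(line)
    return match.group("url") if match else None

def open_urls(path):
    """Open every matching URL found in the file at path in a browser tab."""
    with open(path) as f:
        for line in f:
            kml_url = find_url(line)
            if kml_url is not None:
                print("[*] Opening Map in the web browser...")
                # Pass the variable itself, not the string "urltest".
                webbrowser.open_new_tab(kml_url)
```

Calling open_urls('output.txt') would then open each URL that the pattern finds in that file.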

Related

How to convert python code character to human readable code

I am currently using the Google Vision API in Python to detect Chinese characters in an image, but I found that Google returns escaped byte sequences (such as \xe7\x80\x86\xe7\xab\x91) instead of a human-readable string.
How can I convert this to human-readable UTF-8 text?
Thanks for all of your answers. Maybe posting my code will make this easier for all of you.
Here is my code. Basically I try to take the whole JSON returned from Google Vision and save it in a JSON file; however, it hasn't succeeded.
try:
    code = requests.post('https://vision.googleapis.com/v1/images:annotate?key='+GOOGLE_API_KEY, data=params,headers=headers)
    resultText = code.text.encode("utf-8")
    outputFileName = image_path.split('.',1)[0]
    outputDataFile = open(outputFileName+".json", "w")
    outputDataFile.write(json.dumps(resultText))
    outputDataFile.close()
except requests.exceptions.ConnectionError:
    print('Request error')
Thank you
In Python 2, decode the byte string as UTF-8:
t = '\xe7\x80\x86\xe7\xab\x91'
t = unicode(t, 'utf8')
print t
# Output: 瀆竑
There is more detailed information about Unicode here.
I finally solved this using the code below. Thanks, all of you.
try:
    code = requests.post('https://vision.googleapis.com/v1/images:annotate?key='+GOOGLE_API_KEY, data=params,headers=headers)
    resultText = json.loads(code.text)
    outputFileName = image_path.split('.',1)[0]
    # The with statement closes the file, so no explicit close() is needed
    with open(outputFileName+".json", "w", encoding='utf8') as f:
        json.dump(resultText, f, ensure_ascii=False,indent=4)
except requests.exceptions.ConnectionError:
    print('Request error')
I assume you mean you have a literal string like \xe4\xb8\x89 and you want to convert this into the character 三.
It's very strange that there isn't a more straightforward way to do this. The best I can come up with is:
s = '\\xe4\\xb8\\x89'
print(bytes.fromhex(s.replace('\\x', '')).decode('utf-8')) # prints 三
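For comparison, here is a small Python 3 sketch of the two situations discussed above: real bytes that only need decoding, and JSON output that shows \u escapes unless ensure_ascii=False is passed. The helper name to_readable_json and the "description" key are illustrative assumptions, not part of the Vision API response.

```python
import json

# The bytes from the question: UTF-8 encoded Chinese characters.
raw = b"\xe7\x80\x86\xe7\xab\x91"
text = raw.decode("utf-8")  # bytes -> str

def to_readable_json(data):
    """Serialize data, keeping non-ASCII characters as they are."""
    return json.dumps(data, ensure_ascii=False, indent=4)

# By default json.dumps escapes every non-ASCII character as \uXXXX.
escaped = json.dumps({"description": text})
readable = to_readable_json({"description": text})

print(escaped)
print(readable)
```

When writing the readable form to disk, the file should also be opened with an explicit encoding such as utf8, as in the accepted solution above.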

Python script not downloading files

I have the code below. It prints out the URL in the console. I'm having trouble figuring out how to get it to download the file instead of just displaying the URL. I also want to be able to search for the .mov file type. I'd rather have information on how to do this than have it done for me. Any help is appreciated!
import urllib
def is_download_allowed():
    f = urllib.urlopen("http://10.1.1.27/config?action=get&paramid=eParamID_MediaState")
    response = f.read()
    if (response.find('"value":"1"') > -1):
        return True
    f = urllib.urlopen("http://10.1.1.27/config?action=set&paramid=eParamID_MediaState&value=1")
def download_clip():
    url = "http://10.1.1.27/media/SC1ATK26"
    print url
def is_not_download_allowed():
    f = urllib.urlopen("http://10.1.1.27/config?action=get&paramid=eParamID_MediaState")
    response = f.read()
    if (response.find('"value":"-1"') > 1):
        return True
    f = urllib.urlopen("http://10.1.1.27/config?action=set&paramid=eParamID_MediaState&value=1")
is_download_allowed()
download_clip()
is_not_download_allowed()
You say you don't want a full solution, so here is just a pointer:
Try urllib.urlretrieve
As already commented, your download function is just printing a string.
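If a concrete starting point helps, here is a rough Python 3 sketch, where urllib.urlretrieve lives at urllib.request.urlretrieve. The helper names is_mov and download_clip, and the idea of deriving the file name from the last URL segment, are assumptions for illustration; how the device at 10.1.1.27 lists or names its clips is not known from the question.

```python
import os
import urllib.request

def is_mov(name):
    """Return True if the clip name ends in .mov, ignoring case."""
    return name.lower().endswith(".mov")

def download_clip(url, dest_dir="."):
    """Save the resource at url into dest_dir and return the local path."""
    # Use the last path segment of the URL as the local file name.
    filename = os.path.basename(url) or "clip"
    dest_path = os.path.join(dest_dir, filename)
    urllib.request.urlretrieve(url, dest_path)
    return dest_path
```

A call such as download_clip("http://10.1.1.27/media/SC1ATK26") would write the clip next to the script, and is_mov can be used to filter a list of clip names before downloading.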

regarding file downloading in python

I wrote this code to download an .srt subtitle file, but it doesn't work. Please review it and help me with the code. I need to find out what mistake I am making. Thanks.
from urllib import request
srt_url = "https://subscene.com/subtitle/download?mac=LkM2jew_9BdbDSxdwrqLkJl7hDpIL_HnD-s4XbfdB9eqPHsbv3iDkjFTSuKH0Ee14R-e2TL8NQukWl82yNuykti8b_36IoaAuUgkWzk0WuQ3OyFyx04g_vHI_rjnb2290"
def download_srt_file(srt_url):
    response = request.urlopen(srt_url)
    srt = response.read()
    srt_str = str(srt)
    lines = srt_str.split('\\n')
    dest_url = r'srtfile.srt'
    fx = open('dest_url' , 'w')
    for line in lines:
        fx.write(line)
    fx.close()
    download_srt_file(srt_url)
A number of things are wrong or can be improved.
You call download_srt_file(srt_url) from inside the function body, so you are not actually calling it at all. You never enter the function to begin with.
dest_url is a variable, not a string, so drop the quotes: fx = open('dest_url', 'w') does not raise an error, but it writes to a file literally named dest_url instead of srtfile.srt.
To avoid handling the closing and flushing of the file you are writing, just use the with statement.
Your split('\\n') is also wrong. You are escaping the backslash like that, so you split on a literal backslash followed by n. To split on line breaks it has to be split('\n').
Finally, you should not convert srt with str(). In Python 3, response.read() returns bytes, and str() of a bytes object gives its printed representation (b'...'), not the text. Since the data is saved unchanged, you can skip both the conversion and the splitting and write the bytes in binary mode.
A return statement is optional; a function without one simply returns None.
Below is a modified and hopefully functioning version of your code with the above implemented.
from urllib import request
def download_srt_file(srt_url):
    response = request.urlopen(srt_url)
    # read() returns bytes in Python 3
    srt = response.read()
    dest_url = 'srtfile.srt'
    # 'wb' writes the bytes exactly as they were downloaded
    with open(dest_url, 'wb') as fx:
        fx.write(srt)
srt_url = "https://subscene.com/subtitle/download?mac=LkM2jew_9BdbDSxdwrqLkJl7hDpIL_HnD-s4XbfdB9eqPHsbv3iDkjFTSuKH0Ee14R-e2TL8NQukWl82yNuykti8b_36IoaAuUgkWzk0WuQ3OyFyx04g_vHI_rjnb2290"
download_srt_file(srt_url)
Tell me if it works for you.
A final remark is that you are not setting the target directory for the file you are writing. Are you sure you want to do that?
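On the target directory point, one possible way to make the destination explicit is sketched below in Python 3. The function name download_to and its target_dir and filename parameters are invented for this example; the sketch also assumes the server returns the subtitle file itself rather than an archive.

```python
from pathlib import Path
from urllib import request

def download_to(url, target_dir, filename="srtfile.srt"):
    """Download url into target_dir/filename and return the resulting Path."""
    target = Path(target_dir)
    # Create the directory (and any missing parents) if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    dest = target / filename
    with request.urlopen(url) as response:
        # Write the raw bytes unchanged, so no text encoding has to be guessed.
        dest.write_bytes(response.read())
    return dest
```

For example, download_to(srt_url, "subtitles") would place the file in a subtitles folder under the current working directory.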

Using Regex to review a Text File in Python

What I am trying to accomplish here is to have a regex return the matches I want, based on a pattern, from a text file that Python has created and written to.
Currently I am getting a TypeError: 'NoneType' object is not iterable error and I am not sure why. If you need more information, let me know.
#Opens Temp file
TrueURL = open("TrueURL_tmp.txt","w+")
#Reviews Data grabbed from BeautifulSoup and write urls to file
for link in g_data:
    TrueURL.write(link.get("href") + '\n')
#Creates Regex Pattern for TrueURL_tmp
pattern = re.compile(r'thread/.*/*apple|thread/.*/*potato')
search_pattern = re.search(pattern, str(TrueURL))
#Uses Regex Pattern against TrueURL_tmp file.
for url in search_pattern:
    print (url)
#Closes and deletes file
TrueURL.close()
os.remove("TrueURL_tmp.txt")
Your search is returning no match because you are running it on the str representation of the file object, not on the actual file content.
You are basically searching something like:
<open file 'TrueURL_tmp.txt', mode 'w+' at 0x7f2d86522390>
If you want to search the file content, close the file so the content is definitely written, then reopen it and read the lines, or maybe just search inside the for link in g_data: loop.
If you actually want to write to a temporary file, then use tempfile:
from tempfile import TemporaryFile
with TemporaryFile() as f:
    for link in g_data:
        f.write(link.get("href") + '\n')
    f.seek(0)
    #Creates Regex Pattern for TrueURL_tmp
    pattern = re.compile(r'thread/.*/*apple|thread/.*/*potato')
    search_pattern = re.search(pattern, f.read())
search_pattern is a _sre.SRE_Match object, so you would call group on it, i.e. print(search_pattern.group()), or maybe you want to use findall:
    search_pattern = re.findall(pattern, f.read())
    for url in search_pattern:
        print (url)
I still think doing the search before you write anything might be the best approach, and maybe not writing at all. I am not fully sure what you actually want to do, because I don't see how the file fits into it; concatenating to a string would achieve the same.
pattern = re.compile(r'thread/.*/*apple|thread/.*/*potato')
for link in g_data:
    match = pattern.search(link.get("href"))
    if match:
        print(match.group())
Here is the solution I found to answer my original question, although Padraic's way is correct and a less painful process.
with TemporaryFile() as f:
    for link in g_data:
        f.write(bytes(link.get("href") + '\n', 'UTF-8'))
    f.seek(0)
    #Creates Regex Pattern for TrueURL_tmp (a bytes pattern, because the file is read back as bytes)
    pattern = re.compile(rb'thread/.*/*apple|thread/.*/*potato')
    read = f.read()
    search_pattern = re.findall(pattern,read)
    #Uses Regex Pattern against TrueURL_tmp file.
    for url in search_pattern:
        print (url.decode('utf-8'))
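As an alternative to encoding and decoding by hand, the temporary file can be opened in text mode. The Python 3 sketch below assumes the hrefs have already been pulled out of g_data into a plain list of strings; find_threads and PATTERN are illustrative names.

```python
import re
from tempfile import TemporaryFile

PATTERN = re.compile(r"thread/.*/*apple|thread/.*/*potato")

def find_threads(hrefs):
    """Write the hrefs to a temporary text file, then search its content."""
    # "w+" opens the file in text mode, so str can be written and read back.
    with TemporaryFile("w+", encoding="utf-8") as f:
        for href in hrefs:
            f.write(href + "\n")
        f.seek(0)  # rewind before reading what was just written
        return PATTERN.findall(f.read())

matches = find_threads([
    "thread/123/apple-pie",
    "forum/456/banana",
    "thread/789/potato-salad",
])
print(matches)
```

Because . does not match a newline by default, each match stays within a single line of the file.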

Passing a variable in url?

So I'm new to Python and I desperately need help.
I have a text file which has a bunch of ids (integer values) written in it.
Now I need to pass each id in the file into a URL.
For example, "https://example.com/[id]"
It will be done this way:
A = json.load(urllib.urlopen("https://example.com/(the first id present in the text file)"))
print A
What this essentially does is read certain information about the id in the above URL and display it. I want this to work in a loop that reads all the ids in the text file, passes each one to the URL mentioned in 'A', and displays the values continuously. Is there a way to do this?
I'd be very grateful if someone could help me out!
Old-style string formatting with the % operator can be used:
>>> id = "3333333"
>>> url = "https://example.com/%s" % id
>>> print url
https://example.com/3333333
>>>
The new style string formatting:
>>> url = "https://example.com/{0}".format(id)
>>> print url
https://example.com/3333333
>>>
Reading from the file, as mentioned by avasal, with a small change:
f = open('file.txt', 'r')
for line in f.readlines():
    id = line.strip('\n')
    url = "https://example.com/{0}".format(id)
    urlobj = urllib.urlopen(url)
    # Read the body once; json.loads expects a string, not the response object
    body = urlobj.read()
    try:
        json_data = json.loads(body)
        print json_data
    except ValueError:
        # The response was not valid JSON, so show it as-is
        print body
lazy style:
url = "https://example.com/" + first_id
A = json.load(urllib.urlopen(url))
print A
old style:
url = "https://example.com/%s" % first_id
A = json.load(urllib.urlopen(url))
print A
new style 2.6+:
url = "https://example.com/{0}".format( first_id )
A = json.load(urllib.urlopen(url))
print A
new style 2.7+:
url = "https://example.com/{}".format( first_id )
A = json.load(urllib.urlopen(url))
print A
Python 3.6+
f-strings, a newer string formatting syntax, are supported from Python 3.6 and are a more readable way to format a string.
Here's a good article to read about them: Python 3's f-Strings
In this case, it can be formatted as
url = f"https://example.com/{id}"
Detailed example
When you want to pass multiple params to the URL it can be done as below.
name = "test_api_4"
owner = "jainik#test.com"
url = f"http://localhost:5001/files/create" \
      f"?name={name}" \
      f"&owner={owner}"
We are using multiple f-strings here, and each line except the last ends with a backslash (\) to continue the statement. Adjacent string literals are concatenated, so the result is a single string with no newline character inserted between the parts.
For values which contain spaces
For such values you should add from urllib.parse import quote to your Python file and then quote the string like: quote("firstname lastname")
This will replace the space character with %20.
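When several parameters need quoting, urllib.parse.urlencode can build the whole query string in one step. The sketch below reuses the endpoint and parameter names from the example above; build_create_url is a hypothetical helper, and quote_via=quote is chosen so that spaces become %20 rather than +.

```python
from urllib.parse import quote, urlencode

def build_create_url(name, owner, base="http://localhost:5001/files/create"):
    """Return base with name and owner appended as a percent-encoded query."""
    # quote_via=quote encodes a space as %20; the default quote_plus uses +.
    query = urlencode({"name": name, "owner": owner}, quote_via=quote)
    return f"{base}?{query}"

print(build_create_url("firstname lastname", "jainik#test.com"))
```

Encoding every value this way also takes care of characters such as # and &, which otherwise change how the URL is parsed.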
The first thing you need to do is know how to read each line from a file. First, you have to open the file; you can do this with a with statement:
with open('my-file-name.txt') as intfile:
This opens a file and stores a reference to that file in intfile, and it will automatically close the file at the end of your with block. You then need to read each line from the file; you can do that with a regular old for loop:
for line in intfile:
This will loop through each line in the file, reading them one at a time. In your loop, you can access each line as line. All that's left is to make the request to your website using the code you gave. The one bit you're missing is what's called "string interpolation", which allows you to format a string with other strings, numbers, or anything else. In your case, you'd like to put a string (the line from your file) inside another string (the URL). To do that, you use the %s flag along with the string interpolation operator, %. Each line still ends with its newline character, so strip it first:
url = 'http://example.com/?id=%s' % line.strip()
A = json.load(urllib.urlopen(url))
print A
Putting it all together, you get:
with open('my-file-name.txt') as intfile:
    for line in intfile:
        url = 'http://example.com/?id=%s' % line.strip()
        A = json.load(urllib.urlopen(url))
        print A
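The answers above use Python 2 (print A, urllib.urlopen). A rough Python 3 equivalent of the same loop is sketched here; BASE_URL, build_urls and fetch_all are illustrative names, and the sketch assumes every id maps to a URL that returns JSON.

```python
import json
from urllib.request import urlopen

BASE_URL = "https://example.com/{}"

def build_urls(lines):
    """Yield one URL per non-empty line, with surrounding whitespace removed."""
    for line in lines:
        item_id = line.strip()
        if item_id:
            yield BASE_URL.format(item_id)

def fetch_all(path):
    """Fetch and decode the JSON document for every id in the file at path."""
    with open(path) as id_file:
        for url in build_urls(id_file):
            with urlopen(url) as response:
                yield json.load(response)
```

Iterating with for data in fetch_all('my-file-name.txt'): print(data) would then display the values one id at a time.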
