urllib2.urlopen crashing for unknown reason

I have some code which is very similar to code used here:
https://github.com/jeysonmc/python-google-speech-scripts/blob/master/stt_google.py
Here is my code:
f = open(filename, 'rb')
speech = f.read()
f.close()
LANG_CODE = 'en-US' # Language to use
GOOGLE_SPEECH_URL = 'https://www.google.com/speech-api/v1/recognize?xjerr=1&client=chromium&pfilter=2&lang=%s&maxresults=6' % (LANG_CODE)
f = open(filename, 'rb')
flac_cont = f.read()
f.close()
hrs = {"User-Agent": "Mozilla/5.0 (X11; Linux i686) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.63 Safari/535.7",
'Content-type': 'audio/x-flac; rate=16000'}
req = urllib2.Request(GOOGLE_SPEECH_URL, data=flac_cont, headers=hrs)
print "Sending request to Google TTS"
p = urllib2.urlopen(req)
response = p.read()
print "response", response
res = eval(response)['hypotheses']
It seems to get stuck on the urllib2.urlopen(req) line. It gives back this error:
Traceback (most recent call last):
File "google-speech.py", line 443, in <module>
GoogleSpeech.text_from_speech(filename)
File "google-speech.py", line 274, in text_from_speech
p = urllib2.urlopen(req)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 127, in urlopen
return _opener.open(url, data, timeout)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 410, in open
response = meth(req, response)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 523, in http_response
'http', request, response, code, msg, hdrs)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 448, in error
return self._call_chain(*args)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 382, in _call_chain
result = func(*args)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 531, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 400: Bad Request
I'm not sure what the issue could be.
EDIT: Added the end of my backtrace, which was missing earlier.

If the error happens randomly, you can use a graceful retry algorithm, such as the one implemented here:
https://wiki.python.org/moin/PythonDecoratorLibrary#Retry
The idea is that if, for example, the URL is temporarily unreachable, you don't keep retrying blindly. Instead you increase the retry interval to give the target time to recover, and eventually give up if the URL cannot be opened at all.
If the error happens every time, you have a different problem and should post the complete stack trace.
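The retry idea described above can be sketched roughly as follows. This is a minimal illustration written for Python 3, not the decorator from the linked wiki page. The names `retry_with_backoff`, `max_attempts`, `base_delay` and `sleep` are assumptions made up for this example.

```python
import time


def retry_with_backoff(func, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call func(); after each failure wait base_delay, 2*base_delay, 4*base_delay, ...

    Hypothetical helper for illustration. The last failure is re-raised
    so the caller still sees the real error once the attempts run out.
    """
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: give up and surface the error
            sleep(base_delay * (2 ** attempt))  # exponential backoff
```

In the question's code, usage would look something like `retry_with_backoff(lambda: urlopen(req))`. The `sleep` parameter exists only so the waiting can be replaced in tests.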

This is what I do to overcome this problem:
while True:
    try:
        p = urllib2.urlopen(req)
        break
    except Exception as e:
        # Note: this loops forever if the error is permanent (e.g. HTTP 400)
        print(e, 'Trying again...')
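A loop like the one above only helps when the failure is temporary. One possible way to tell the cases apart, sketched for Python 3 (where `urllib2` became `urllib.request` and `urllib.error`), is to look at the exception before retrying. The helper name `should_retry` and the rule "retry on 5xx and connection errors only" are assumptions for this example, not something stated in the thread.

```python
import io
import urllib.error


def should_retry(exc):
    """Return True only for failures that a later attempt might fix."""
    if isinstance(exc, urllib.error.HTTPError):
        # 4xx means the request itself is wrong (e.g. 400 Bad Request),
        # so sending the same request again will not help.
        return exc.code >= 500
    # Connection-level problems (DNS, refused connection, ...) may be transient.
    return isinstance(exc, urllib.error.URLError)


# Offline demonstration with hand-built exceptions (no network needed).
bad_request = urllib.error.HTTPError(
    "http://example.invalid/", 400, "Bad Request", {}, io.BytesIO())
server_error = urllib.error.HTTPError(
    "http://example.invalid/", 503, "Service Unavailable", {}, io.BytesIO())
unreachable = urllib.error.URLError("connection refused")
```

With the 400 from the question, `should_retry` returns False, which suggests fixing the request rather than repeating it.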

Related

Some Image URLs working while most dont Pillow

I am trying to make some image filters for my API. Some URLs work while most do not, and I would like to know why and how to fix it. I have looked through another Stack Overflow post but have not had much luck, as I don't know what the problem is.
Here is an example of a working URL
And one that does not work
Edit Here is another URL that does not work
Here is the API I am trying to make
Here is my code
def generate_image_Wanted(imageUrl):
    with urllib.request.urlopen(imageUrl) as url:
        f = io.BytesIO(url.read())
    im1 = Image.open("images/wanted.jpg")
    im2 = Image.open(f)
    im2 = im2.resize((300, 285))
    img = im1.copy()
    img.paste(im2, (85, 230))
    d = BytesIO()
    d.seek(0)
    img.save(d, "PNG")
    d.seek(0)
    return d
Here is my error
Traceback (most recent call last):
File "c:\Users\micha\OneDrive\Desktop\MicsAPI\test.py", line 23, in <module>
generate_image_Wanted("https://cdn.discordapp.com/avatars/902240397273743361/9d7ce93e7510f47da2d8ba97ec32fc33.png")
File "c:\Users\micha\OneDrive\Desktop\MicsAPI\test.py", line 11, in generate_image_Wanted
with urllib.request.urlopen(imageUrl) as url:
File "C:\Users\micha\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 214, in urlopen
return opener.open(url, data, timeout)
File "C:\Users\micha\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 523, in open
response = meth(req, response)
File "C:\Users\micha\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 632, in http_response
response = self.parent.error(
File "C:\Users\micha\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 561, in error
return self._call_chain(*args)
File "C:\Users\micha\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 494, in _call_chain
result = func(*args)
File "C:\Users\micha\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 641, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden
Thank you for looking at this and have a good day.

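The sites you can't fetch from probably have server-side protection against known bots and spiders, and block requests that identify themselves as urllib.
You need to provide some headers, in particular a browser-like User-Agent. See the Python urllib.request documentation for more.
Working example: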
import urllib.request
hdr = { 'User-Agent' : 'Mozilla/5.0 (Windows NT 6.1; Win64; x64)' }
url = "https://cdn.discordapp.com/avatars/902240397273743361/9d7ce93e7510f47da2d8ba97ec32fc33.png"
req = urllib.request.Request(url, headers=hdr)
response = urllib.request.urlopen(req)
response.read()
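To connect this with the function from the question, the download step could be pulled into a small helper that always sends the header. This is only a sketch: `BROWSER_UA`, `build_image_request` and `fetch_image_bytes` are names invented here, and the `timeout` value is an arbitrary assumption.

```python
import io
import urllib.request

# Assumed User-Agent string, copied from the working example above.
BROWSER_UA = "Mozilla/5.0 (Windows NT 6.1; Win64; x64)"


def build_image_request(image_url):
    """Build a request that identifies itself with a browser-like User-Agent."""
    return urllib.request.Request(image_url, headers={"User-Agent": BROWSER_UA})


def fetch_image_bytes(image_url, timeout=10):
    """Download the image and return it as an in-memory file object."""
    with urllib.request.urlopen(build_image_request(image_url), timeout=timeout) as resp:
        return io.BytesIO(resp.read())
```

Inside `generate_image_Wanted`, the result of `fetch_image_bytes(imageUrl)` could then be passed to `Image.open` in place of the original `with urllib.request.urlopen(imageUrl)` block.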

HTTP Error 400: Bad Request

I am learning Python API testing using the urllib2 module. I tried to execute the code below, but it throws the following message. Can anybody help me? Thanks in advance.
code:
url = "http://localhost:8000/HPFlights_REST/FlightOrders/"
data = {"Class" : "Business","CustomerName" :"Bhavani","DepartureDate" : "2015-10-12","FlightNumber" : "1304","NumberOfTickets": "3"}
encoded_data = urllib.urlencode(data)
'''print encoded_data
print urllib2.urlopen(url, encoded_data).read()'''
request = urllib2.Request(url, encoded_data)
print request.get_method()
request.add_data(encoded_data)
response = urllib2.urlopen(request)
Error:
Traceback (most recent call last):
File "C:/Users/kanakadurga/PycharmProjects/untitled/API.py", line 44, in <module>
createFlightOrder()
File "C:/Users/kanakadurga/PycharmProjects/untitled/API.py", line 39, in createFlightOrder
response = urllib2.urlopen(request)
File "C:\Python27\lib\urllib2.py", line 154, in urlopen
return opener.open(url, data, timeout)
File "C:\Python27\lib\urllib2.py", line 437, in open
response = meth(req, response)
File "C:\Python27\lib\urllib2.py", line 550, in http_response
'http', request, response, code, msg, hdrs)
File "C:\Python27\lib\urllib2.py", line 475, in error
return self._call_chain(*args)
File "C:\Python27\lib\urllib2.py", line 409, in _call_chain
result = func(*args)
File "C:\Python27\lib\urllib2.py", line 558, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 400: Bad Request
Process finished with exit code 1

It looks like you are trying to post data to the server.
From the URL, my guess is that the server accepts the data in JSON format.
If that is the case, then you can do:
import json
import urllib2
url = "http://localhost:8000/HPFlights_REST/FlightOrders/"
data = {"Class": "Business", "CustomerName": "Bhavani", "DepartureDate": "2015-10-12", "FlightNumber": "1304", "NumberOfTickets": "3"}
encoded_data = json.dumps(data)
request = urllib2.Request(url, encoded_data, {'Content-Type': 'application/json'})
f = urllib2.urlopen(request) # issue the request
response = f.read() # read the response
f.close()
... # your next operations follow
The point is that you need to encode the data correctly (as JSON) and also set the proper Content-Type header in the HTTP POST request, which the server probably checks.
Otherwise, the default Content-Type would be application/x-www-form-urlencoded, as if the data came from a form.
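For readers on Python 3, where `urllib2` no longer exists, roughly the same request can be built with `urllib.request`. This is a sketch under the same guess that the server wants JSON. The helper name `build_json_post` is an assumption for this example, and nothing is sent until the request is passed to `urlopen`.

```python
import json
import urllib.request


def build_json_post(url, payload):
    """Build a POST request whose body is the JSON encoding of payload."""
    body = json.dumps(payload).encode("utf-8")  # Python 3 requires bytes
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"})


order = {"Class": "Business", "CustomerName": "Bhavani",
         "DepartureDate": "2015-10-12", "FlightNumber": "1304",
         "NumberOfTickets": "3"}
request = build_json_post(
    "http://localhost:8000/HPFlights_REST/FlightOrders/", order)
# To send it: urllib.request.urlopen(request).read()
```

Because `data` is supplied, the request method becomes POST automatically.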

Post multilevel dict from python application to python webservice

I am working on an automation script (that I am using to automate the process of conversion of some videos). In this script after video conversion, I am calling my web service to update the clip status in database and sending the web service a list of clips in POST request. But the problem is this request is failing and causing 500 internal server error on server side.
Here is the code I am using to call the web service with sample data I am trying with:
post_body = {
    'clips': [
        {
            'clip_id': 17555,
            'db_url': '/720p/14555.mp4'
        }
    ]
}
params = urlencode(post_body)
url = str(self.update_url)
req = urllib2.Request(url, params)
response = urllib2.urlopen(req)
res = response.read()
print res
And here is the code of my web service:
def update_conversion_clips(request):
    print "Web service is called"
    try:
        clips = request.POST.get('clips', None)
        print clips
        return HttpResponse(True)
    except:
        return HttpResponse(False)
Even the first print statement does not execute.
Here is the error stack trace on application side:
Traceback (most recent call last):
File "conversion_script.py", line 48, in <module>
conversion_script.run()
File "conversion_script.py", line 44, in run
self.clips.update_clips_info(None)
File "/home/abc/video_experiments/conversion/clips_manager.py", line 59, in update_clips_info
response = urllib2.urlopen(req)
File "/usr/lib/python2.7/urllib2.py", line 126, in urlopen
return _opener.open(url, data, timeout)
File "/usr/lib/python2.7/urllib2.py", line 406, in open
response = meth(req, response)
File "/usr/lib/python2.7/urllib2.py", line 519, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/lib/python2.7/urllib2.py", line 444, in error
return self._call_chain(*args)
File "/usr/lib/python2.7/urllib2.py", line 378, in _call_chain
result = func(*args)
File "/usr/lib/python2.7/urllib2.py", line 527, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 500: INTERNAL SERVER ERROR
and this is error on server side:
[20/Feb/2014 04:13:15] "POST /update_conversion_clips HTTP/1.1" 500 68733
According to my research this is happening due to multilevel dict that I am sending in POST. But I could not find any solution to resolve it.
New code now sending data as json (still does not work):
values = dict()
values['clips'] = [
    {
        'clip_id': 17555,
        'db_url': '/720p/14555.mp4'
    }
]
req = urllib2.Request(self.update_url)
req.add_header('Content-Type', 'application/json')
response = urllib2.urlopen(req, json.dumps(values))
res = response.read()
print res
and on server side:
try:
    data = json.loads(request.body)
    clips = data['clips']
except:
    print "Exception occured!"
HttpResponse(True)

urlencode isn't really a good format for this data. A much better one would be JSON.
req = urllib2.Request(self.update_url)
req.add_header('Content-Type', 'application/json')
response = urllib2.urlopen(req, json.dumps(data))
print response.read()
(you could make this part a lot simpler by using the third-party requests library).
And in the server:
clips = json.loads(request.body)
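Why urlencode is a poor fit for nested data can be seen offline. The sketch below uses Python 3's `urllib.parse` (the question uses the Python 2 equivalent) and the sample `post_body` from the question. The variable names are otherwise made up for this example.

```python
import json
from urllib.parse import urlencode, parse_qs

post_body = {'clips': [{'clip_id': 17555, 'db_url': '/720p/14555.mp4'}]}

# Form encoding flattens the nested list into its string representation.
form_encoded = urlencode(post_body)
form_decoded = parse_qs(form_encoded)['clips'][0]  # a plain str, not a list

# JSON keeps the structure, so the server gets a real list of dicts back.
json_decoded = json.loads(json.dumps(post_body))

print(type(form_decoded).__name__)
print(type(json_decoded['clips']).__name__)
```

After form encoding, the receiving side would have to parse a Python-style string to recover the clips, whereas the JSON round trip returns the original structure.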

python timer + urllib2 code errors

I am trying to pull information from a site every 5 seconds but it doesn't seem to be working and I get errors every time I run it.
Code below:
import urllib2, threading
def readpage():
    data = urllib2.urlopen('http://forums.zybez.net/runescape-2007-prices').read()
    for line in data:
        if 'forums.zybez.net/runescape-2007-prices/player/' in line:
            a = line.split('/runescape-2007-prices/player/'[1])
            print(a.split('">')[0])
t = threading.Timer(5.0, readpage)
t.start()
I get these errors:
Exception in thread Thread-1:
Traceback (most recent call last):
File "C:\Python27\lib\threading.py", line 808, in __bootstrap_inner
self.run()
File "C:\Python27\lib\threading.py", line 1080, in run
self.function(*self.args, **self.kwargs)
File "C:\Users\Jordan\Desktop\username.py", line 3, in readpage
data = urllib2.urlopen('http://forums.zybez.net/runescape-2007-prices').read()
File "C:\Python27\lib\urllib2.py", line 127, in urlopen
return _opener.open(url, data, timeout)
File "C:\Python27\lib\urllib2.py", line 410, in open
response = meth(req, response)
File "C:\Python27\lib\urllib2.py", line 523, in http_response
'http', request, response, code, msg, hdrs)
File "C:\Python27\lib\urllib2.py", line 448, in error
return self._call_chain(*args)
File "C:\Python27\lib\urllib2.py", line 382, in _call_chain
result = func(*args)
File "C:\Python27\lib\urllib2.py", line 531, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 403: Forbidden
Help would be appreciated, thanks!

The site is rejecting the default User-Agent reported by urllib2. You can change it for all requests in the script using install_opener.
opener = urllib2.build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0 (X11; Linux i686; rv:5.0) Gecko/20100101 Firefox/5.0')]
urllib2.install_opener(opener)
You'll also need to split the data returned by the site to read it line by line
urllib2.urlopen('http://forums.zybez.net/runescape-2007-prices').read().splitlines()
and change
line.split('/runescape-2007-prices/player/'[1])
to
line.split('/runescape-2007-prices/player/')[1]
Working:
import urllib2, threading
opener = urllib2.build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0 (X11; Linux i686; rv:5.0) Gecko/20100101 Firefox/5.0')]
urllib2.install_opener(opener)
def readpage():
    data = urllib2.urlopen('http://forums.zybez.net/runescape-2007-prices').read().splitlines()
    for line in data:
        if 'forums.zybez.net/runescape-2007-prices/player/' in line:
            a = line.split('/runescape-2007-prices/player/')[1]
            print(a.split('">')[0])
t = threading.Timer(5.0, readpage)
t.start()
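The two parsing fixes in this answer can be checked without contacting the site. The sketch below runs on Python 3 against a made-up HTML fragment. `PREFIX`, `sample_page` and `extract_players` are names invented for this illustration, and the real page markup may differ.

```python
PREFIX = '/runescape-2007-prices/player/'

# Hypothetical fragment standing in for the downloaded page.
sample_page = (
    'header\n'
    '<a href="http://forums.zybez.net/runescape-2007-prices/player/Alice">Alice</a>\n'
    'footer'
)


def extract_players(text):
    """Return the player names found in links that contain PREFIX."""
    names = []
    for line in text.splitlines():  # whole lines, not single characters
        if PREFIX in line:
            # Index the result of split(), then cut at the closing quote.
            names.append(line.split(PREFIX)[1].split('">')[0])
    return names


# The original bug: indexing the separator string picks one character.
misplaced_index = PREFIX[1]

print(extract_players(sample_page))
print(repr(misplaced_index))
```

Iterating over the raw string instead of `splitlines()` would yield one character per loop step, so the `in` test could never match.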

Did you try opening that url without the thread? The error code says 403: Forbidden, maybe you need authentication for that web page.

This has nothing to do with Python -- the server is denying your requests to that URL.
I suspect that either the URL is incorrect or you've hit some kind of rate limiting and are being blocked.
EDIT: how to make it work
The site is blocking Python's User-Agent. Try this:
import urllib2, threading
def readpage():
    headers = { 'User-Agent' : 'Mozilla/5.0' }
    req = urllib2.Request('http://forums.zybez.net/runescape-2007-prices', None, headers)
    # splitlines() so the loop sees whole lines rather than single characters
    data = urllib2.urlopen(req).read().splitlines()
    for line in data:
        if 'forums.zybez.net/runescape-2007-prices/player/' in line:
            # [1] belongs on the result of split(), not on the separator string
            a = line.split('/runescape-2007-prices/player/')[1]
            print(a.split('">')[0])

Parse.com user login - 404 error

I am fairly inexperienced with user authentication especially through restful apis. I am trying to use python to log in with a user that is set up in parse.com. The following is the code I have:
API_LOGIN_ROOT = 'https://api.parse.com/1/login'
params = {'username':username,'password':password}
encodedParams = urllib.urlencode(params)
url = API_LOGIN_ROOT + "?" + encodedParams
request = urllib2.Request(url)
request.add_header('Content-type', 'application/x-www-form-urlencoded')
# we could use urllib2's authentication system, but it seems like overkill for this
auth_header = "Basic %s" % base64.b64encode('%s:%s' % (APPLICATION_ID, MASTER_KEY))
request.add_header('Authorization', auth_header)
request.add_header('X-Parse-Application-Id', APPLICATION_ID)
request.add_header('X-Parse-REST-API-Key', MASTER_KEY)
request.get_method = lambda: http_verb
# TODO: add error handling for server response
response = urllib2.urlopen(request)
#response_body = response.read()
#response_dict = json.loads(response_body)
This is a modification of an open source library used to access the parse rest interface.
I get the following error:
Traceback (most recent call last):
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/ext/webapp/_webapp25.py", line 703, in __call__
handler.post(*groups)
File "/Users/nazbot/src/PantryPal_AppEngine/fridgepal.py", line 464, in post
url = user.login()
File "/Users/nazbot/src/PantryPal_AppEngine/fridgepal.py", line 313, in login
url = self._executeCall(self.username, self.password, 'GET', data)
File "/Users/nazbot/src/PantryPal_AppEngine/fridgepal.py", line 292, in _executeCall
response = urllib2.urlopen(request)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 126, in urlopen
return _opener.open(url, data, timeout)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 400, in open
response = meth(req, response)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 513, in http_response
'http', request, response, code, msg, hdrs)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 438, in error
return self._call_chain(*args)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 372, in _call_chain
result = func(*args)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 521, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 404: Not Found
Can someone point me to where I am screwing up? I'm not quite sure why I'm getting a 404 instead of an access denied or some other issue.

Make sure the "User" class was created on Parse.com as a special user class. When you are adding the class, make sure to change the Class Type to "User" instead of "Custom". A little user head icon will show up next to the class name on the left hand side.
This stumped me for a long time until Matt from the Parse team showed me the problem.

Please change: API_LOGIN_ROOT = 'https://api.parse.com/1/login' to the following: API_LOGIN_ROOT = 'https://api.parse.com/1/login/'
I had the same problem using PHP, and adding the / at the end fixed the 404 error.
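For comparison, here is a stripped-down Python 3 sketch of the login request from the question. It keeps only the query string and the two X-Parse headers that the question's code already sets. The function name `build_parse_login_request` and its parameters are assumptions for this example. Whether a trailing slash is needed is taken from the answer above, so the root URL is left as a parameter.

```python
import urllib.request
from urllib.parse import urlencode


def build_parse_login_request(username, password, app_id, rest_key,
                              login_root="https://api.parse.com/1/login"):
    """Build (but do not send) a GET login request with the Parse headers."""
    query = urlencode({"username": username, "password": password})
    req = urllib.request.Request(login_root + "?" + query)
    req.add_header("X-Parse-Application-Id", app_id)
    req.add_header("X-Parse-REST-API-Key", rest_key)
    return req


# Placeholder credentials, for illustration only.
login_request = build_parse_login_request("alice", "secret", "APP_ID", "REST_KEY")
```

Printing `login_request.full_url` before sending makes it easy to confirm which exact URL returned the 404.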
