I had an application that downloaded a .CSV file from a password-protected website and then processed it further.
I was using FancyURLopener and simply hardcoding the username and password. (Obviously, security is not a high priority in this particular instance.)
Since downloading Python 3.1.2, this code has stopped working. After fixing the obvious issue of it now being in the "request" namespace, it's crashing in a less obvious way.
Does anyone know of the changes that have happened to the implementation, and how to use it now? The documentation seems to be short of examples.
Here is a cut down version of the code:
import urllib.request
class TracOpener(urllib.request.FancyURLopener):
    def prompt_user_passwd(self, host, realm):
        return ('andrew_ee', '_my_unenctryped_password')
csvUrl = 'http://mysite/report/19?format=csv#USER=fred_nukre'
opener = TracOpener()
f = opener.open(csvUrl)  # This is failing!
s = f.read()
f.close()
s
For the sake of completeness, here's the entire call stack:
Traceback (most recent call last):
File "C:\reporting\download_csv_file.py", line 12, in <module>
f = opener.open(csvUrl);
File "C:\Program Files\Python31\lib\urllib\request.py", line 1454, in open
return getattr(self, name)(url)
File "C:\Program Files\Python31\lib\urllib\request.py", line 1628, in open_http
return self._open_generic_http(http.client.HTTPConnection, url, data)
File "C:\Program Files\Python31\lib\urllib\request.py", line 1624, in _open_generic_http
response.status, response.reason, response.msg, data)
File "C:\Program Files\Python31\lib\urllib\request.py", line 1640, in http_error
result = method(url, fp, errcode, errmsg, headers)
File "C:\Program Files\Python31\lib\urllib\request.py", line 1878, in http_error_401
return getattr(self,name)(url, realm)
File "C:\Program Files\Python31\lib\urllib\request.py", line 1950, in retry_http_basic_auth
return self.open(newurl)
File "C:\Program Files\Python31\lib\urllib\request.py", line 1454, in open
return getattr(self, name)(url)
File "C:\Program Files\Python31\lib\urllib\request.py", line 1628, in open_http
return self._open_generic_http(http.client.HTTPConnection, url, data)
File "C:\Program Files\Python31\lib\urllib\request.py", line 1590, in _open_generic_http
auth = base64.b64encode(user_passwd).strip()
File "C:\Program Files\Python31\lib\base64.py", line 56, in b64encode
raise TypeError("expected bytes, not %s" % s.__class__.__name__)
TypeError: expected bytes, not str
It's a known bug: http://bugs.python.org/issue8123
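Until that bug is fixed, one possible workaround is to skip FancyURLopener and use the handler-based API in urllib.request, which should avoid the failing code path. This is a minimal sketch, assuming the site uses HTTP Basic authentication; build_basic_auth_opener is a hypothetical helper name, and the URL and credentials are the placeholders from the question.

```python
import urllib.request


def build_basic_auth_opener(top_level_url, username, password):
    """Return an opener that answers HTTP Basic auth challenges."""
    # Passing None as the realm registers the credentials for any realm
    # served under top_level_url.
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    password_mgr.add_password(None, top_level_url, username, password)
    handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
    return urllib.request.build_opener(handler)


# Usage against a live site (performs a real network request):
#   opener = build_basic_auth_opener('http://mysite/', 'andrew_ee', '...')
#   f = opener.open('http://mysite/report/19?format=csv')
#   s = f.read()
#   f.close()
```

The opener only sends the credentials after the server replies with a 401 challenge, so the first request goes out unauthenticated.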
Python 3.5.2, Tweepy 3.5.0, Windows 8.1
I'm following a tutorial made by sentdex that shows how to stream data from Twitter using tweepy. (His tutorial is in Python 2 but it is pretty easy to Python 3-ify it)
However, when I run the script, it doesn't spit out any data. It hangs until I get a 3-way IncompleteRead exception, or until I do Ctrl+C.
Here is my Listener class code:
class listener(StreamListener):
    def on_date(self,data):
        try:
            print(data)
            save = open('twitDB.csv', 'a')
            save.write(data)
            save.write('\n')
            save.close()
            return True
        except BaseException as e:
            print('failed on data,',str(e))
            time.sleep(5)
    def on_error(self,status):
        print(status)
auth = OAuthHandler(ckey, csecret)
auth.set_access_token(atoken,asecret)
twitterStream = Stream(auth=auth, listener=listener())
twitterStream.filter(track=["car"])
As you can see, I have it set up to catch errors and print data out while saving it to a csv, but it doesn't really do anything, just hangs.
Also, for track, I tried something less general, but it still hung.
When KeyboardInterrupt is raised:
Traceback (most recent call last):
File "C:\Program Files\Python35\lib\site-packages\requests\packages\urllib3\contrib\pyopenssl.py", line 217, in recv_into
return self.connection.recv_into(*args, **kwargs)
File "C:\Program Files\Python35\lib\site-packages\OpenSSL\SSL.py", line 1352, in recv_into
self._raise_ssl_error(self._ssl, result)
File "C:\Program Files\Python35\lib\site-packages\OpenSSL\SSL.py", line 1167, in _raise_ssl_error
raise WantReadError()
OpenSSL.SSL.WantReadError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "twittertest.py", line 33, in <module>
twitterStream.filter(track=["car"])
File "C:\Program Files\Python35\lib\site-packages\tweepy\streaming.py", line 445, in filter
self._start(async)
File "C:\Program Files\Python35\lib\site-packages\tweepy\streaming.py", line 361, in _start
self._run()
File "C:\Program Files\Python35\lib\site-packages\tweepy\streaming.py", line 263, in _run
self._read_loop(resp)
File "C:\Program Files\Python35\lib\site-packages\tweepy\streaming.py", line 313, in _read_loop
line = buf.read_line().strip()
File "C:\Program Files\Python35\lib\site-packages\tweepy\streaming.py", line 179, in read_line
self._buffer += self._stream.read(self._chunk_size)
File "C:\Program Files\Python35\lib\site-packages\requests\packages\urllib3\response.py", line 310, in read
data = self._fp.read(amt)
File "C:\Program Files\Python35\lib\http\client.py", line 448, in read
n = self.readinto(b)
File "C:\Program Files\Python35\lib\http\client.py", line 478, in readinto
return self._readinto_chunked(b)
File "C:\Program Files\Python35\lib\http\client.py", line 573, in _readinto_chunked
chunk_left = self._get_chunk_left()
File "C:\Program Files\Python35\lib\http\client.py", line 541, in _get_chunk_left
chunk_left = self._read_next_chunk_size()
File "C:\Program Files\Python35\lib\http\client.py", line 501, in _read_next_chunk_size
line = self.fp.readline(_MAXLINE + 1)
File "C:\Program Files\Python35\lib\socket.py", line 575, in readinto
return self._sock.recv_into(b)
File "C:\Program Files\Python35\lib\site-packages\requests\packages\urllib3\contrib\pyopenssl.py", line 230, in recv_into
[self.socket], [], [], self.socket.gettimeout())
KeyboardInterrupt
It's my first time going with a social media API, so I apologize if I'm missing something obvious. Help would be appreciated, thanks.
def on_date(self,data):
This should have been
def on_data(self,data):
Never mind, when I use on_status with status.text, it works, must be something I'm missing.
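The silent hang follows from how callback-style listeners work: a misspelled method is a perfectly valid extra method, so nothing raises, and the stream keeps calling the base class's default hook instead. The sketch below illustrates this with a hypothetical stand-in base class (StreamListenerStandIn and deliver are made up for illustration and are not Tweepy's code).

```python
class StreamListenerStandIn:
    """Hypothetical stand-in for a listener base class (not Tweepy's code)."""

    def on_data(self, raw_data):
        # Default hook: accept the data and do nothing with it.
        return True


class TypoListener(StreamListenerStandIn):
    def __init__(self):
        self.seen = []

    def on_date(self, raw_data):  # misspelled, so the stream never calls it
        self.seen.append(raw_data)
        return True


class FixedListener(StreamListenerStandIn):
    def __init__(self):
        self.seen = []

    def on_data(self, raw_data):  # overrides the hook the stream calls
        self.seen.append(raw_data)
        return True


def deliver(listener, raw_data):
    """Mimic a stream that only ever calls on_data."""
    return listener.on_data(raw_data)


typo, fixed = TypoListener(), FixedListener()
deliver(typo, '{"text": "hello"}')
deliver(fixed, '{"text": "hello"}')
print(len(typo.seen), len(fixed.seen))  # prints "0 1"
```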
I am trying to import a large User Information list from a json file to the datastore using taskqueue and deferred.
A User contains the user's information, including an image URL from a different app. During the import, the image should be fetched and uploaded to the Blobstore (which works just fine when tested).
I am stuck on getting the blob_key of the uploaded image.
I think the problem only occurs inside a taskqueue/deferred task, because the same code works just fine inside a 'normal' GET request handler.
This is my handler:
class MigrationTask(BaseHandler):
    def post(self):
        if not self.request.get('file'):
            return
        json_data = open(self.request.get('file'))
        data = json.load(json_data)
        json_data.close()
        for datum in data['results']:
            deferred.defer(push_user_to_db, datum)
These are my functions:
#ndb.transactional(xg=True)
def _push_user_to_db(profilePicture=None, ...):
    if profilePicture:
        if 'url' in profilePicture:
            con = urlfetch.fetch(image_url)
            if con.status_code == 200:
                file_name = files.blobstore.create(mime_type='application/octet-stream')
                with files.open(file_name, 'a') as f:
                    f.write(con.content)
                files.finalize(file_name)
                blob_key = files.blobstore.get_blob_key(file_name) # this part is where it errs
                image_url = images.get_serving_url(file_name)
    # some codes here...
def push_user_to_db(kwargs):
    _push_user_to_db(**kwargs)
Part of the traceback:
blob_key = files.blobstore.get_blob_key(file_name)
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\api\files\blobstore.py", line 132, in get_blob_key
namespace='')])[0]
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\api\datastore.py", line 654, in Get
return GetAsync(keys, **kwargs).get_result()
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\api\datastore.py", line 629, in GetAsync
return _GetConnection().async_get(config, keys, local_extra_hook)
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\datastore\datastore_rpc.py", line 1574, in async_get
pbs = [key_to_pb(key) for key in keys]
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\ext\ndb\model.py", line 653, in key_to_pb
return key.reference()
AttributeError: 'Key' object has no attribute 'reference'
PS: I've also tried taskqueue instead of deferred.
EDIT(1):
This is the traceback:
ERROR 2015-03-03 06:32:44,720 webapp2.py:1552] 'Key' object has no attribute 'reference'
Traceback (most recent call last):
File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2-2.5.2\webapp2.py", line 1535, in __call__
rv = self.handle_exception(request, response, e)
File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2-2.5.2\webapp2.py", line 1529, in __call__
rv = self.router.dispatch(request, response)
File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2-2.5.2\webapp2.py", line 1278, in default_dispatcher
return route.handler_adapter(request, response)
File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2-2.5.2\webapp2.py", line 1102, in __call__
return handler.dispatch()
File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2-2.5.2\webapp2.py", line 572, in dispatch
return self.handle_exception(e, self.app.debug)
File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2-2.5.2\webapp2.py", line 570, in dispatch
return method(*args, **kwargs)
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\ext\deferred\deferred.py", line 310, in post
self.run_from_request()
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\ext\deferred\deferred.py", line 305, in run_from_request
run(self.request.body)
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\ext\deferred\deferred.py", line 147, in run
return func(*args, **kwds)
File "C:\project directory\migration.py", line 141, in push_user_to_db
_push_user_to_db(**kwargs)
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\ext\ndb\utils.py", line 179, in inner_wrapper
return wrapped_decorator(func, args, kwds, **options)
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\ext\ndb\model.py", line 3759, in transactional
func, args, kwds, **options).get_result()
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\ext\ndb\tasklets.py", line 325, in get_result
self.check_success()
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\ext\ndb\tasklets.py", line 371, in _help_tasklet_along
value = gen.send(val)
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\ext\ndb\context.py", line 999, in transaction
result = callback()
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\ext\ndb\model.py", line 3767, in <lambda>
return transaction_async(lambda: func(*args, **kwds), **options)
File "C:\project directory\migration.py", line 56, in _push_user_to_db
blob_key = files.blobstore.get_blob_key(file_name)
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\api\files\blobstore.py", line 132, in get_blob_key
namespace='')])[0]
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\api\datastore.py", line 654, in Get
return GetAsync(keys, **kwargs).get_result()
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\api\datastore.py", line 629, in GetAsync
return _GetConnection().async_get(config, keys, local_extra_hook)
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\datastore\datastore_rpc.py", line 1574, in async_get
pbs = [key_to_pb(key) for key in keys]
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\ext\ndb\model.py", line 653, in key_to_pb
return key.reference()
AttributeError: 'Key' object has no attribute 'reference'
Heads up! Writing files to the Blobstore using the Files API has been deprecated. I had this issue before: my code ran perfectly fine on the development server (localhost) but failed on the App Engine server. The solution is to write the files to Google Cloud Storage via the Blobstore API.
I'm in the process of switching an application over from Python 2.5 to 2.7 and have begun encountering a problem with the images service. For example, saving this entity using db.put():
from google.appengine.api import images
from google.appengine.ext import blobstore, db
class Images(db.Expando):
    ImageTitle = db.StringProperty()
    ImageFile = blobstore.BlobReferenceProperty()
    ImageReference = db.StringProperty()
    def put(self, **kwargs):
        if not self.ImageReference:
            self.ImageReference = images.get_serving_url(self.ImageFile.key())
        super(Images, self).put(**kwargs)
Now yields this error:
Traceback (most recent call last):
File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2\webapp2.py", line 1536, in __call__
rv = self.handle_exception(request, response, e)
File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2\webapp2.py", line 1530, in __call__
rv = self.router.dispatch(request, response)
File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2\webapp2.py", line 1278, in default_dispatcher
return route.handler_adapter(request, response)
File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2\webapp2.py", line 1102, in __call__
return handler.dispatch()
File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2\webapp2.py", line 572, in dispatch
return self.handle_exception(e, self.app.debug)
File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2\webapp2.py", line 570, in dispatch
return method(*args, **kwargs)
File "C:\Users\VB User\Bruha\src\handler_product_page_image.py", line 40, in post
image.put()
File "C:\Users\VB User\Bruha\src\db_models.py", line 56, in put
self.ImageReference = images.get_serving_url(self.ImageFile.key())
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\api\images\__init__.py", line 1792, in get_serving_url
rpc = get_serving_url_async(blob_key, size, crop, secure_url, filename, rpc)
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\api\images\__init__.py", line 1907, in get_serving_url_async
None)
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\api\images\__init__.py", line 1034, in _make_async_call
rpc = create_rpc()
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\api\images\__init__.py", line 1028, in create_rpc
return apiproxy_stub_map.UserRPC("images", deadline, callback)
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\api\apiproxy_stub_map.py", line 405, in __init__
self.__rpc = CreateRPC(service, stubmap)
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\api\apiproxy_stub_map.py", line 69, in CreateRPC
'a CreateRPC method.') % service)
AssertionError: The service "images" doesn't have a CreateRPC method.
Calling the 'execute_transforms' method also yields the same error.
Any help understanding what is going on would be much appreciated.
You are running the dev server, so when it starts up, check for this message:
'Could not initialize images API; you are likely missing the Python "PIL" module. ImportError: %s'
If you see this message, the images service RPC is not being registered (the RegisterStub call in dev_appserver is failing), and you will get the error you are seeing because the assertion in the CreateRPC call fails.
So check that PIL is correctly installed for Python 2.7.
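One quick way to check is to try the import with the same Python 2.7 interpreter that runs dev_appserver. This is a minimal sketch; pil_available is a hypothetical helper name, and it assumes the library is importable as the PIL package (some older standalone installs may instead expose a top-level Image module).

```python
import importlib


def pil_available():
    """Return True if the PIL package can be imported by this interpreter."""
    try:
        importlib.import_module("PIL.Image")
    except ImportError:
        return False
    return True


print("PIL importable: %s" % pil_available())
```

If this reports False under the interpreter used by the dev server, the images stub cannot be registered, which matches the startup warning quoted above.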
Thanks for your help in advance!
I want to get the contents of a website, so I use urllib.urlopen(url),
with url = 'http://localhost:8080' (a Tomcat page).
If I run the application from the Google App Engine Launcher and browse to http://localhost:8082, it works well.
But if I specify the address and port for the application:
python "D:\Program Files\Google\google_appengine\dev_appserver.py" -p 8082 -a 10.96.72.213 D:\pagedemon\videoareademo
there's something wrong:
Traceback (most recent call last):
File "D:\Program Files\Google\google_appengine\google\appengine\ext\webapp\_webapp25.py", line 701, in __call__
handler.get(*groups)
File "D:\pagedemon\videoareademo\home.py", line 76, in get
wp = urllib.urlopen(url)
File "C:\Python27\lib\urllib.py", line 84, in urlopen
return opener.open(url)
File "C:\Python27\lib\urllib.py", line 205, in open
return getattr(self, name)(url)
File "C:\Python27\lib\urllib.py", line 343, in open_http
errcode, errmsg, headers = h.getreply()
File "D:\Program Files\Google\google_appengine\google\appengine\dist\httplib.py", line 334, in getreply
response = self._conn.getresponse()
File "D:\Program Files\Google\google_appengine\google\appengine\dist\httplib.py", line 222, in getresponse
deadline=self.timeout)
File "D:\Program Files\Google\google_appengine\google\appengine\api\urlfetch.py", line 263, in fetch
return rpc.get_result()
File "D:\Program Files\Google\google_appengine\google\appengine\api\apiproxy_stub_map.py", line 592, in get_result
return self.__get_result_hook(self)
File "D:\Program Files\Google\google_appengine\google\appengine\api\urlfetch.py", line 365, in _get_fetch_result
raise DownloadError(str(err))
DownloadError: ApplicationError: 2 [Errno 11003] getaddrinfo failed
The strangest thing is that when I change the URL from "http://localhost:8080" to "http://127.0.0.1:8080", it works well!
I googled a lot, but I didn't find any good solutions. Hoping for some help!
Also, I haven't configured any proxy, and IE works well.
Your system doesn't necessarily know that localhost should resolve to 127.0.0.1. You might need to add an entry to your hosts file, such as a line reading "127.0.0.1 localhost". On Windows, the file is located at C:\Windows\System32\drivers\etc\hosts.
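To see what the machine currently does with the name, you can ask the resolver directly from Python. This is a minimal sketch; resolve is a hypothetical helper name, and it only checks IPv4 lookups.

```python
import socket


def resolve(hostname):
    """Return the IPv4 address for hostname, or None if the lookup fails."""
    try:
        return socket.gethostbyname(hostname)
    except socket.error:
        # Covers getaddrinfo-style failures such as the one in the traceback.
        return None


for name in ("localhost", "127.0.0.1"):
    print("%s -> %s" % (name, resolve(name)))
```

If "localhost" comes back as None while the literal address resolves, the missing hosts entry is a likely cause.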
I'm trying to do what should be a fairly simple url fetch with google appengine. However, it keeps failing.
result = urlfetch.fetch(url=apiurl, method=urlfetch.POST)
I also tried using httplib:
conn = httplib.HTTPConnection("api.eve-online.com")
conn.request("POST", "/char/CharacterSheet.xml.aspx", params, headers)
response = conn.getresponse()
self.response.out.write(response.read())
Both of these return very similar errors,
Traceback (most recent call last):
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\ext\webapp\__init__.py", line 515, in __call__
handler.get(*groups)
File "C:\Users\Martin\Documents\google_appengine\martindevans\eveapi.py", line 24, in get
method=urlfetch.POST)
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\api\urlfetch.py", line 241, in fetch
return rpc.get_result()
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\api\apiproxy_stub_map.py", line 530, in get_result
return self.__get_result_hook(self)
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\api\urlfetch.py", line 315, in _get_fetch_result
rpc.check_success()
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\api\apiproxy_stub_map.py", line 502, in check_success
self.__rpc.CheckSuccess()
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\api\apiproxy_rpc.py", line 149, in _WaitImpl
self.request, self.response)
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\api\apiproxy_stub.py", line 80, in MakeSyncCall
method(request, response)
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\api\urlfetch_stub.py", line 133, in _Dynamic_Fetch
deadline=deadline)
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\api\urlfetch_stub.py", line 223, in _RetrieveURL
connection.request(method, full_path, payload, adjusted_headers)
File "C:\Python27\lib\httplib.py", line 946, in request
self._send_request(method, url, body, headers)
File "C:\Python27\lib\httplib.py", line 986, in _send_request
self.putheader(hdr, value)
File "C:\Python27\lib\httplib.py", line 924, in putheader
str = '%s: %s' % (header, '\r\n\t'.join(values))
TypeError: sequence item 0: expected string, int found
I have no idea what's going on here. I'm not providing any sequences to the urlfetch method, so I'm unsure where to start debugging this.
edit: As asked, here are the headers:
headers = { "Content-type": "application/x-www-form-urlencoded" }
This can't be the problem though, because the first approach doesn't even set any headers!
Are you using Python 2.7? App Engine supports only Python 2.5.2. 2.6 mostly works, but 2.7 doesn't.
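The TypeError itself can be reproduced outside App Engine, which may explain where the "sequence" in the message comes from: the failing httplib line joins the header values with str.join, and join rejects any item that is not a string. This is a minimal sketch, assuming one header value reached putheader as an int; the traceback does not show which header, so Content-Length here is only an illustrative guess, and both function names are hypothetical.

```python
def join_header_values(header, *values):
    # Same expression as the httplib.putheader line shown in the traceback.
    return '%s: %s' % (header, '\r\n\t'.join(values))


def try_header(header, value):
    """Return the formatted header line, or the TypeError message on failure."""
    try:
        return join_header_values(header, value)
    except TypeError as exc:
        return 'TypeError: %s' % exc


print(try_header('Content-Length', '42'))  # string value: formats normally
print(try_header('Content-Length', 42))    # int value: join raises TypeError
```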