How to encode my url parameter in google app engine? - python

I do geocoding with Python and I think I need to encode the variable region with urlencode so that it works with content that has whitespace and other special characters:
url = urllib.urlencode('http://maps.googleapis.com/maps/api/geocode/json?address='+region+'&sensor=false')
logging.info('url:'+url)
result = urlfetch.fetch(url)
It generates an error log when the variable region contains whitespace:
Traceback (most recent call last):
File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 545, in dispatch
return method(*args, **kwargs)
File "/base/data/home/apps/s~montaoproject/pricehandling.355268396595012751/in.py", line 153, in get
url = urllib.urlencode('http://maps.googleapis.com/maps/api/geocode/json?address='+region+'&sensor=false')
File "/base/python27_runtime/python27_dist/lib/python2.7/urllib.py", line 1275, in urlencode
raise TypeError
TypeError: not a valid non-string sequence or mapping object
The background is another question I asked. While troubleshooting it, I found that the code works, but not for regions that are two or more words, i.e. names with whitespace.
https://stackoverflow.com/questions/8441063/how-should-i-use-urlfetch-here
On production I used another variable. I thought it did not matter that it had whitespace. When I try variables that do not contain whitespace it works.
So could you please tell me how I should encode the url variable to admit whitespace and other "special" characters?
Thank you

Just encode your querystring part
Like:
param = {"address" : region,
"sensor" : "false"
}
or
param = [("address", region), ("sensor", "false")]
then
encoded_param = urllib.urlencode(param)
url = 'http://maps.googleapis.com/maps/api/geocode/json'
url = url + '?' + encoded_param
result = urlfetch.fetch(url)
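The same idea can be sketched in a form that runs outside App Engine. This is a minimal sketch, assuming Python 3's urllib.parse with a fallback to Python 2's urllib; build_geocode_url is a hypothetical helper name, and no request is actually sent.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2

BASE_URL = 'http://maps.googleapis.com/maps/api/geocode/json'


def build_geocode_url(region):
    """Return the geocoding URL with its query string percent-encoded."""
    # urlencode expects a mapping or a sequence of pairs, not a whole URL string.
    params = [('address', region), ('sensor', 'false')]
    return BASE_URL + '?' + urlencode(params)


# The space in the region becomes '+', so the URL is safe to fetch.
print(build_geocode_url('Rio de Janeiro'))
```

Only the parameter values are encoded; the scheme, host and path stay untouched, which is why the original call on the whole URL string failed.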

Use urllib.pathname2url; it works directly on a single string value, so there is no need for a dictionary.
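For comparison, a single value can also be encoded with quote_plus, which is the function urlencode applies to each key and value. This is a sketch that swaps in quote_plus rather than pathname2url (the latter is intended for local file paths); encode_region is a hypothetical helper name.

```python
try:
    from urllib.parse import quote_plus  # Python 3
except ImportError:
    from urllib import quote_plus  # Python 2


def encode_region(region):
    """Percent-encode one query-string value; spaces become '+'."""
    return quote_plus(region)


url = ('http://maps.googleapis.com/maps/api/geocode/json?address='
       + encode_region('New York & Co') + '&sensor=false')
print(url)
```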

Related

Trying to e.164 a phone number from form input

I'm trying to take a UK mobile phone number input from a web form and use Python to clean it into an E.164 format, then validate it, before entering it into a database.
The library I'm trying to use is "Phonenumbers" and the code I'm experimenting with so far is:
def Phone():
    my_number = '+4407808765066'
    clean_phone = phonenumbers.parse(my_number, "GB")
    cleaner_phone = phonenumbers.format_number(clean_phone,
                                               phonenumbers.PhoneNumberFormat.E164)
    valid = phonenumbers.is_possible_number(cleaner_phone)
    print(cleaner_phone)
Just working through the logic, my expectation is that it should take the contents of my_number variable, format it through into the clean_phone variable, then format it to E.164 standard before passing it to the validation and return the output to valid. The print statement is for me to see the output.
Everything seems to work OK if I comment out the valid variable line. As soon as I uncomment it, I get an error (see below).
Traceback (most recent call last):
File "phone_test.py", line 14, in <module>
Phone()
File "phone_test.py", line 10, in Phone
valid = phonenumbers.is_possible_number(cleaner_phone)
File "D:\Dropbox\Coding Projects\learner_driver_app\env\lib\site-packages\phonenumbers\phonenumberutil.py", line 2257, in is_possible_number
result = is_possible_number_with_reason(numobj)
File "D:\Dropbox\Coding Projects\learner_driver_app\env\lib\site-packages\phonenumbers\phonenumberutil.py", line 2358, in is_possible_number_with_reason
return is_possible_number_for_type_with_reason(numobj, PhoneNumberType.UNKNOWN)
File "D:\Dropbox\Coding Projects\learner_driver_app\env\lib\site-packages\phonenumbers\phonenumberutil.py", line 2393, in is_possible_number_for_type_with_reason
national_number = national_significant_number(numobj)
File "D:\Dropbox\Coding Projects\learner_driver_app\env\lib\site-packages\phonenumbers\phonenumberutil.py", line 1628, in national_significant_number
if numobj.italian_leading_zero:
AttributeError: 'str' object has no attribute 'italian_leading_zero'
Where am I going wrong?
Your my_number is a variable of type str (string), hence the last line of your error. The string class does not have the attribute italian_leading_zero.
Reading through their examples on GitHub, I suspect you need to pass your string through the parse() function first before you can use functions from the library.
import phonenumbers

def Phone():
    my_number = '+4407811111111'
    number_parsed = phonenumbers.parse(my_number, None)  # this is new
    clean_phone = phonenumbers.format_number(number_parsed,
                                             phonenumbers.PhoneNumberFormat.E164)
    return clean_phone
The None in parse() may be replaced by a region code (such as "GB") if it is known. Otherwise, it will try to figure it out but may fail.
Edit to account for more information in the original question:
Apparently phonenumbers.format_number() returns a string, so you have to parse the formatted number again to get an object of type phonenumbers.phonenumber.PhoneNumber (you can check the type of objects with type(my_object)). After that, your code will return True.
You can't use the is_possible_number function with a string as argument; it expects a PhoneNumber object.
You can get one by using the parse function.
See https://github.com/daviddrysdale/python-phonenumbers/tree/dev/python#example-usage

UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode

I know many people encountered this error before but I couldn't find the solution to my problem.
I have a URL that I want to normalize:
url = u"http://www.dgzfp.de/Dienste/Fachbeitr%C3%A4ge.aspx?EntryId=267&Page=5"
scheme, host_port, path, query, fragment = urlsplit(url)
path = urllib.unquote(path)
path = urllib.quote(path,safe="%/")
This gives an error message:
/usr/lib64/python2.6/urllib.py:1236: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
res = map(safe_map.__getitem__, s)
Traceback (most recent call last):
File "url_normalization.py", line 246, in <module>
logging.info(get_canonical_url(url))
File "url_normalization.py", line 102, in get_canonical_url
path = urllib.quote(path,safe="%/")
File "/usr/lib64/python2.6/urllib.py", line 1236, in quote
res = map(safe_map.__getitem__, s)
KeyError: u'\xc3'
I tried removing the unicode indicator "u" from the URL string, and then I do not get the error message. But how can I get rid of the unicode automatically, given that I read the URL directly from a database?
urllib.quote() does not properly parse Unicode. To get around this, you can call the .encode() method on the url when reading it (or on the variable you read from the database). So run url = url.encode('utf-8'). With this you get:
import urllib
import urlparse
from urlparse import urlsplit
url = u"http://www.dgzfp.de/Dienste/Fachbeitr%C3%A4ge.aspx?EntryId=267&Page=5"
url = url.encode('utf-8')
scheme, host_port, path, query, fragment = urlsplit(url)
path = urllib.unquote(path)
path = urllib.quote(path,safe="%/")
and then your output for the path variable will be:
>>> path
'/Dienste/Fachbeitr%C3%A4ge.aspx'
Does this work?
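For readers on Python 3, the same normalization can be sketched without the explicit encode step. This assumes Python 3, where str is already Unicode and urllib.parse.quote encodes non-ASCII characters as UTF-8 by default; normalize_path is a hypothetical helper name.

```python
from urllib.parse import quote, unquote, urlsplit


def normalize_path(url):
    """Return the URL's path with consistent percent-encoding."""
    path = urlsplit(url).path
    # Decode any existing escapes first, then encode again so that
    # every non-ASCII character ends up escaped exactly once.
    return quote(unquote(path), safe="%/")


url = "http://www.dgzfp.de/Dienste/Fachbeitr%C3%A4ge.aspx?EntryId=267&Page=5"
print(normalize_path(url))
```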

Python-ldap ldap.initialize rejects a URL that ldapurl considers valid

I want to open a connection to an LDAP directory using an LDAP URL that will be given at run time. For example:
ldap://192.168.2.151/dc=directory,dc=example,dc=com
It is valid as far as I can tell. Python-ldap url parser ldapurl.LDAPUrl accepts it.
url = 'ldap://192.168.2.151/dc=directory,dc=example,dc=com'
parsed_url = ldapurl.LDAPUrl(url)
parsed_url.dn
'dc=directory,dc=example,dc=com'
But if I use it to initialize an LDAPObject, I get an ldap.LDAPError exception:
ldap.initialize(url)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.7/dist-packages/ldap/functions.py", line 91, in initialize
return LDAPObject(uri,trace_level,trace_file,trace_stack_limit)
File "/usr/lib/python2.7/dist-packages/ldap/ldapobject.py", line 70, in __init__
self._l = ldap.functions._ldap_function_call(ldap._ldap_module_lock,_ldap.initialize,uri)
File "/usr/lib/python2.7/dist-packages/ldap/functions.py", line 63, in _ldap_function_call
result = func(*args,**kwargs)
ldap.LDAPError: (0, 'Error')
I found that if I manually encode the dn part of the URL, it works:
url = 'ldap://192.168.2.151/dc=directory%2cdc=example%2cdc=com'
#url still valid
parsed_url = ldapurl.LDAPUrl(url)
parsed_url.dn
'dc=directory,dc=example,dc=com'
#and will return a valid connection
ldap.initialize(url)
<ldap.ldapobject.SimpleLDAPObject instance at 0x1400098>
How can I ensure robust URL handling in ldap.initialize without encoding parts of the URL myself (which, I'm afraid, won't be that robust anyway)?
You can programmatically encode the last part of the URL:
try:
    from urllib import quote  # works in Python 2.x
except ImportError:
    from urllib.parse import quote  # works in Python 3.x
url = 'ldap://192.168.2.151/dc=directory,dc=paralint,dc=com'
idx = url.rindex('/') + 1
url[:idx] + quote(url[idx:], '=')
=> 'ldap://192.168.2.151/dc=directory%2Cdc=paralint%2Cdc=com'
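The slicing approach can be wrapped so that it is also safe to call on a URL whose DN is already encoded. This is a minimal sketch, assuming the URL has the plain form ldap://host/dn with no ?attributes?scope?filter suffix; quote_ldap_dn is a hypothetical helper name.

```python
try:
    from urllib.parse import quote, unquote  # Python 3
except ImportError:
    from urllib import quote, unquote  # Python 2


def quote_ldap_dn(url):
    """Percent-encode the DN part of a simple ldap://host/dn URL."""
    scheme, sep, rest = url.partition('://')
    host, slash, dn = rest.partition('/')
    # Decode first so an already-encoded DN is not encoded twice,
    # then encode everything except '='.
    return scheme + sep + host + slash + quote(unquote(dn), safe='=')


print(quote_ldap_dn('ldap://192.168.2.151/dc=directory,dc=example,dc=com'))
```

Decoding before encoding makes the helper idempotent, so it can be applied to URLs of unknown origin.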
One can use the LDAPUrl.unparse() method to get a properly encoded version of the URI, like this:
>>> import ldapurl
>>> url = ldapurl.LDAPUrl('ldap://192.168.2.151/dc=directory,dc=example,dc=com')
>>> url.unparse()
'ldap://192.168.2.151/dc%3Ddirectory%2Cdc%3Dexample%2Cdc%3Dcom???'
>>> ldap.initialize(url.unparse())
<ldap.ldapobject.SimpleLDAPObject instance at 0x103d998>
And LDAPUrl.unparse() will not re-encode an already encoded URL:
>>> url = ldapurl.LDAPUrl('ldap://example.com/dc%3Dusers%2Cdc%3Dexample%2Cdc%3Dcom%2F???')
>>> url.unparse()
'ldap://example.com/dc%3Dusers%2Cdc%3Dexample%2Cdc%3Dcom%2F???'
So you can use it blindly on any ldap uri your program must handle.

Google app engine key value error

I am writing a Google App Engine app and I get this KeyError when requests come in.
From the backtrace, simply accessing this is what causes the KeyError:
self.request.headers
The entire code snippet is here; I just forward the headers unmodified:
response = fetch( "%s%s?%s" % (
self.getApiServer() ,
self.request.path.replace("/twitter/", ""),
self.request.query_string
),
self.request.body,
method,
self.request.headers,
)
and here is the get method that handles the request by calling proxy():
# handle http get
def get(self, *args):
    parameters = self.convertParameters(self.request.query_string)
    # self.prepareHeader("GET", parameters)
    self.request.query_string = "&".join("%s=%s" % (quote(key), quote(value)) for key, value in parameters.items())
    self.proxy(GET, *args)

def convertParameters(self, source):
    parameters = {}
    for pairs in source.split("&"):
        item = pairs.split("=")
        if len(item) == 2:
            parameters[item[0]] = unquote(item[1])
    return parameters
The error backtrace:
'CONTENT_TYPE'
Traceback (most recent call last):
File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/webapp/__init__.py", line 513, in __call__
handler.post(*groups)
File "/base/data/home/apps/waytosing/1.342850593213842824/com/blogspot/zizon/twitter/RestApiProxy.py", line 67, in post
self.proxy(POST, *args)
File "/base/data/home/apps/waytosing/1.342850593213842824/com/blogspot/zizon/twitter/RestApiProxy.py", line 47, in proxy
self.request.headers,
File "/base/python_runtime/python_lib/versions/1/google/appengine/api/urlfetch.py", line 240, in fetch
allow_truncated, follow_redirects)
File "/base/python_runtime/python_lib/versions/1/google/appengine/api/urlfetch.py", line 280, in make_fetch_call
for key, value in headers.iteritems():
File "/base/python_runtime/python_dist/lib/python2.5/UserDict.py", line 106, in iteritems
yield (k, self[k])
File "/base/python_runtime/python_lib/versions/1/webob/datastruct.py", line 40, in __getitem__
return self.environ[self._trans_name(item)]
KeyError: 'CONTENT_TYPE'
Any idea why it happens or is this a known bug?
This looks weird. The docs mention that response "Headers objects do not raise an error when you try to get or delete a key that isn't in the wrapped header list. Getting a nonexistent header just returns None". It's not clear from the request documentation whether request.headers are also objects of this class, but even if they were regular dictionaries, iteritems seems to be misbehaving. So this might be a bug.
It might be worth inspecting self.request.headers, before calling fetch, and see 1) its actual type, 2) its keys, and 3) if trying to get self.request.headers['CONTENT_TYPE'] raises an error then.
But, if you simply want to solve your problem and move forward, you can try to bypass it like:
if 'CONTENT_TYPE' not in self.request.headers:
    self.request.headers['CONTENT_TYPE'] = None
(I'm suggesting setting it to None, because that's what a response Header object should return on non-existing keys)
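Another way to bypass the problem, sketched here under assumptions, is to copy the headers into a plain dict before forwarding them and skip any key whose lookup fails. copy_headers is a hypothetical helper, and FlakyHeaders is a hypothetical stand-in that only imitates the behaviour seen in the traceback (a key that is listed but cannot be read); it is not the real webob class.

```python
def copy_headers(headers):
    """Copy a header mapping into a plain dict, skipping unreadable keys."""
    copied = {}
    for name in list(headers.keys()):
        try:
            copied[name] = headers[name]
        except KeyError:
            # The key is listed but has no backing value; leave it out.
            continue
    return copied


class FlakyHeaders(object):
    """Hypothetical stand-in: lists a key that it cannot actually return."""

    def __init__(self, data, missing):
        self._data = dict(data)
        self._missing = list(missing)

    def keys(self):
        return list(self._data) + self._missing

    def __getitem__(self, name):
        return self._data[name]


headers = FlakyHeaders({'Accept': '*/*'}, ['Content-Type'])
print(copy_headers(headers))
```

The resulting dict could then be passed to fetch in place of self.request.headers.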
Here's my observation about this problem:
When the content-type is application/x-www-form-urlencoded and POST data is empty (e.g. jquery.ajax GET, twitter's favorite and retweet API...), the content-type is dropped by Google appengine.
You can add:
self.request.headers.update({'content-type':'application/x-www-form-urlencoded'})
before urlfetch.
Edit: indeed, looking at the error more carefully, it doesn't seem to be related to convertParameters, as the OP points out in the comments. I'm retiring this answer.
I'm not entirely sure what you mean by "just forward the headers unmodified", but have you taken a look at self.request.query_string before and after you call convertParameters? More to the point, you're leaving out any (valid) GET parameters of the form "key=" (that is, keys with empty values).
Maybe your original query_string had a value like "CONTENT_TYPE=", and your convertParameters is stripping it out.
This is a known issue: http://code.google.com/p/googleappengine/issues/detail?id=3427. Potential workarounds are listed here: http://code.google.com/p/googleappengine/issues/detail?id=2040

ValueError: invalid literal for int() with base 10 trying to retrieve a query parameter from Google App engine

I received the following error when trying to retrieve data using Google App Engine from a single entry to a single page e.g. foobar.com/page/1 would show all the data from id 1:
ValueError: invalid literal for int() with base 10
Here are the files:
Views.py
class One(webapp.RequestHandler):
    def get(self, id):
        id = models.Page.get_by_id(int(str(self.request.get("id"))))
        page_query = models.Page.get(db.Key.from_path('Page', id))
        pages = page_query
        template_values = {
            'pages': pages,
        }
        path = os.path.join(os.path.dirname(__file__), 'template/list.html')
        self.response.out.write(template.render(path, template_values))
Urls.py:
(r'/browse/(\d+)/', One),
Error:
Traceback (most recent call last):
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/ext/webapp/__init__.py", line 501, in __call__
handler.get(*groups)
File "/Volumes/foobar/views.py", line 72, in get
id = models.Page.get_by_id(int(str(self.request.get("id"))))
ValueError: invalid literal for int() with base 10: ''
Change self.request.get("id") to simply id, which is already being passed to your get handler.
The code, as you have it, would only work for URLs like /browse/1/?id=1
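The failure itself is easy to reproduce: self.request.get("id") returns an empty string when there is no id query parameter, and int('') raises ValueError. A minimal sketch of a defensive conversion follows; parse_page_id is a hypothetical helper name and is not part of webapp.

```python
def parse_page_id(raw):
    """Return raw as a positive int, or None if it is not a usable ID."""
    try:
        value = int(raw)
    except (TypeError, ValueError):
        # '' (missing query parameter), None, or non-numeric text end up here.
        return None
    return value if value > 0 else None


print(parse_page_id('1'))  # value captured from /browse/1/
print(parse_page_id(''))   # what a missing ?id= parameter yields
```

A handler could respond with a 404 when the helper returns None instead of letting the exception propagate.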
I'm not quite sure what you're trying to achieve here. The first line:
id = models.Page.get_by_id(int(str(self.request.get("id"))))
returns a Page object with an ID fetched from the query string. To make it work with the passed in argument, change it to:
id = models.Page.get_by_id(int(id))
Odder is the second line:
page_query = models.Page.get(db.Key.from_path('Page', id))
This does not return a query - it returns a page object, and if you replace 'id' with 'int(id)' does precisely the same thing as the first line. What are you trying to achieve here?
I had a similar error in some of my code. I used a simple hack: convert the value to Decimal first and then to int: int(Decimal(str(self.request.get("id"))))
The error (as I'm sure that you know - or should know as it seems you have come very far in programming) means that you are typing/reading an invalid literal (invalid character) for something that should have base 10 value (i.e. 0-9 numbers).
