urllib request unable to handle Chinese characters - Python

I'm trying to send a curl-style request to an API, but the Chinese characters are causing an error:
UnicodeEncodeError: 'ascii' codec can't encode characters in position 96-97: ordinal not in range(128)
In PHP I can simply pass the characters straight to curl, but Python seems to behave differently. Any idea how I can do the same?
Here's the code I'm using:
import json
import urllib.parse
import urllib.request

query = urllib.parse.urlencode({'': '不存'})
url = 'https://www.googleapis.com/language/translate/v2?' + query
print(url)
search_response = urllib.request.urlopen(url)
search_results = search_response.read().decode("utf8")
results = json.loads(search_results)
data = results['responseData']
print(data)
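A minimal sketch of building an ASCII-only URL from non-ASCII text, in case it helps. urlopen needs the request line to be pure ASCII, and urllib.parse.urlencode percent-encodes the UTF-8 bytes of each value. The parameter name "q" is an assumption for illustration (the original code uses an empty key), and any other parameters the API needs would have to go into the same dict.

```python
import urllib.parse

base_url = "https://www.googleapis.com/language/translate/v2"

# "q" is an assumed parameter name, used here only for illustration.
params = {"q": "不存"}

# urlencode percent-encodes the UTF-8 bytes of each value.
query = urllib.parse.urlencode(params)

# The "?" separates the path from the query string.
url = base_url + "?" + query

# Raises UnicodeEncodeError if any non-ASCII character slipped through.
url.encode("ascii")
print(url)
```

If any part of the URL is concatenated in by hand without going through urlencode (or urllib.parse.quote for path segments), that part can still trigger the same UnicodeEncodeError.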

Related

UnicodeEncodeError : 'latin-1' codec can't encode character '\u01e2' in position 8: ordinal not in range(256). How to solve? [duplicate]

I need to access a database using the following code, provided by MobiDB, to get disorder predictions for proteins.
import urllib2
import json
# Define request
acceptHeader = 'My_File_TrEMBL.txt' # text/csv and text/plain supported
request = urllib2.Request("https://mobidb.org/ws/P04050/uniprot", headers={"Accept" : acceptHeader})
# Send request
response = urllib2.urlopen(request)
# Parse JSON response into a Python dict
data = json.load(response)
# handle data
print(data)
Since I'm not using Python 2.6 I changed the script as follows:
import urllib.request
import json
# Define request
acceptHeader = 'My_File_TrEMBL.txt'  # text/csv and text/plain supported
request = urllib.request.Request("https://mobidb.org/ws/P04050/uniprot", headers={"Accept": acceptHeader})
# Send request
response = urllib.request.urlopen(request)
# Parse JSON response into a Python dict
data = json.load(response)
# handle data
print(data)
So I am using urllib.request instead of urllib2. The problem arises when the variable request is passed to urllib.request.urlopen, which raises this error:
" 'latin-1' codec can't encode character '\u01e2' in position 8: ordinal not in range(256) "
I understand that it is related to character encoding, but I am new to Python and working against a deadline, so I would appreciate any help you can give me.
Much obliged.
Decode the response bytes as UTF-8 and parse the result with json.loads:
response = urllib.request.urlopen(request)
#get the content and decode it using utf-8
respcontent = response.read().decode('utf-8')
data = json.loads(respcontent)
print(data)
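Note that the error above is an encoding error raised while sending the request, not a decoding error on the response. One possible cause, assuming the real Accept value contained a character outside Latin-1, is that http.client encodes header values as ISO-8859-1 before sending them. The sketch below checks the header value up front. The helper name build_request and the "application/json" media type are assumptions for illustration and are not taken from the MobiDB documentation.

```python
import urllib.request


def build_request(url, accept="application/json"):
    """Build a Request, rejecting header values that cannot be sent.

    The default media type is an assumption; use whichever type the
    service actually documents.
    """
    try:
        # http.client encodes header values as ISO-8859-1 (latin-1).
        accept.encode("latin-1")
    except UnicodeEncodeError as exc:
        raise ValueError(
            f"Accept header is not Latin-1 encodable: {accept!r}"
        ) from exc
    return urllib.request.Request(url, headers={"Accept": accept})


request = build_request("https://mobidb.org/ws/P04050/uniprot")
print(request.get_header("Accept"))
```

The Accept header is meant to carry a media type rather than a file name, so a value copied from a file path is worth double-checking for stray non-Latin-1 characters.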

Unable to call an API and get respective output in Python

I am new to Python and am trying to call an API and get its output, but I am running into the following error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8f in position 2: invalid start byte
The code I have written thus far is as follows:
import requests
file = open("NLPData/testimg.png")
payload = {file}
response = requests.get("http://sampleapi.com/get", params=payload)
print(response.text)
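A likely contributor here is that the PNG file is opened in text mode, so Python tries to decode binary image data as text when the file is read. The sketch below uses only the standard library and a temporary file holding the 8-byte PNG signature to show the difference between text mode and binary mode. The temporary file stands in for NLPData/testimg.png and is an assumption made so the example is self-contained.

```python
import os
import tempfile

# The first 8 bytes of every PNG file; 0x89 is not valid UTF-8.
png_signature = b"\x89PNG\r\n\x1a\n"

# Stand-in for the real image file (assumption for illustration).
with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as tmp:
    tmp.write(png_signature)
    path = tmp.name

# Text mode: reading decodes the bytes and fails on binary data.
try:
    with open(path, encoding="utf-8") as f:
        f.read()
    text_mode_failed = False
except UnicodeDecodeError:
    text_mode_failed = True

# Binary mode: the bytes come back unchanged.
with open(path, "rb") as f:
    image_bytes = f.read()

os.remove(path)
print(text_mode_failed, image_bytes)
```

With requests, a file opened in "rb" mode is usually uploaded with a POST and the files= argument rather than passed through params= on a GET, though whether that fits depends on what the API expects.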

UnicodeDecodeError on Windows, but not when running the exact same code on Mac

I'm trying to download json data via an API. The code is as follows:
import urllib.request, ssl, json
context = ssl._create_unverified_context()
rsbURL = "https://rsbuddy.com/exchange/summary.json"
with urllib.request.urlopen(rsbURL, context=context) as url:
    data = json.loads(url.read().decode('UTF-8'))
This code works perfectly fine on my Mac, and I confirmed that data is the expected JSON content. However, when I run the exact same code on Windows, I get this error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
What is going on and how do I fix it?
The byte 0x8b at position 1 is the second byte of the gzip magic number (0x1f 0x8b), so it looks like the server is sending a compressed response for some reason (it shouldn't be doing that unless you explicitly set the Accept-Encoding header). You can adapt your code to work with compressed responses like this:
import gzip
import urllib.request, ssl, json
context = ssl._create_unverified_context()
rsbURL = "https://rsbuddy.com/exchange/summary.json"
with urllib.request.urlopen(rsbURL, context=context) as url:
    if url.info().get('Content-Encoding') == 'gzip':
        body = gzip.decompress(url.read())
    else:
        body = url.read()
data = json.loads(body)
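As a variation, here is a small sketch that detects gzip from the body itself rather than from the Content-Encoding header, which can help if the header is missing or unreliable. It runs offline on sample data; the helper name decode_json_body and the sample payload are made up for illustration.

```python
import gzip
import json


def decode_json_body(body: bytes):
    """Parse a JSON response body, decompressing it first if it is gzipped."""
    # Every gzip stream starts with the magic bytes 0x1f 0x8b.
    if body[:2] == b"\x1f\x8b":
        body = gzip.decompress(body)
    return json.loads(body.decode("utf-8"))


# Sample data standing in for a real response (hypothetical content).
payload = json.dumps({"id": 2, "name": "Cannonball"}).encode("utf-8")

print(decode_json_body(payload))
print(decode_json_body(gzip.compress(payload)))
```

In the code above, this helper would be called on the result of url.read() in place of the Content-Encoding check.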

sending binary data over json

I want to upload a file to S3 using Python. I am using the requests-aws4auth library for this:
import requests
from requests_aws4auth import AWS4Auth
# data=encode_audio(data)
endpoint = 'http://bucket.s3.amazonaws.com/testing89.mp3'
data = //some binary data(mp3) here
auth = AWS4Auth('xxxxxx', 'xxxxxx', 'eu-west-2', 's3')
response = requests.put(endpoint,data=data, auth=auth, headers={'x-amz-acl': 'public-read'})
print response.text
I tried the above code and got the following error:
'ascii' codec can't decode byte 0xff in position 0: ordinal not in range(128).
This works fine when I send text data, since that is plain ASCII, but when I send binary data I suspect the binary data is being concatenated with the auth data in a way that fails. Am I doing something wrong? Any guidance would be appreciated. Thanks.

Character encoding in a GET request with http.client lib (Python)

I am a beginner in Python and I wrote this little script to send an HTTP GET request to my local server (localhost). It works great, except that I would like to be able to send non-ASCII characters such as accented letters.
import http.client
httpMethod = "GET"
url = "localhost"
params = "Hello World"
def httpRequest(httpMethod, url, params):
    conn = http.client.HTTPConnection(url)
    conn.request(httpMethod, '/?param='+params)
    conn.getresponse().read()
    conn.close()
    return
httpRequest(httpMethod, url, params)
When I put accented words in my "params" parameter, this error message appears:
UnicodeEncodeError: 'ascii' codec can't encode character '\xe9' in position 14: ordinal not in range(128)
I don't know whether there is a solution using the http.client library, but I think there should be. When I look at the http.client documentation, I see this:
HTTPConnection.request
Strings are encoded as ISO-8859-1, the default charset for HTTP
You shouldn't construct arguments manually. Use urlencode instead:
>>> from urllib.parse import urlencode
>>> params = 'Aserejé'
>>> urlencode({'params': params})
'params=Aserej%C3%A9'
So, you can do:
conn.request(httpMethod, '/?' + urlencode({'params': params}))
Also note that your string will be encoded as UTF-8 before being URL-escaped.
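To illustrate the UTF-8 point, here is a minimal offline sketch that builds the request path with urlencode, confirms it is pure ASCII, and decodes it back the way a server typically would. The helper name build_path is hypothetical, and the key "param" follows the question's original query string.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_path(params):
    """Return a request path whose query string is safely percent-encoded."""
    return "/?" + urlencode(params)


path = build_path({"param": "Aserejé"})

# The path is now pure ASCII, so http.client can send it as-is.
path.encode("ascii")
print(path)

# Round trip: percent-decoding as UTF-8 recovers the original text.
decoded = parse_qs(urlsplit(path).query)
print(decoded)
```

The resulting path can be passed directly as the second argument of conn.request.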
