URLLIB Request unable to handle Chinese Character

I'm trying to make a curl-style request to an API, but the Chinese characters cause an error:
UnicodeEncodeError: 'ascii' codec can't encode characters in position 96-97: ordinal not in range(128)
In PHP I can simply pass the characters to curl, but Python seems to behave differently. Any idea how I can do the same?
Here's the code I'm using:
import json
import urllib.parse
import urllib.request

query = urllib.parse.urlencode({'': '不存'})
# A '?' must separate the path from the query string
url = 'https://www.googleapis.com/language/translate/v2' + '?' + query
print(url)
search_response = urllib.request.urlopen(url)
search_results = search_response.read().decode("utf8")
results = json.loads(search_results)
data = results['responseData']
print(data)
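
The two-character span in the error (position 96-97) suggests that raw Chinese text reached the URL somewhere. A minimal sketch of building an ASCII-only URL with urlencode follows; the parameter name "q" is a hypothetical stand-in, since the question leaves the key empty, and no request is actually sent:

```python
from urllib.parse import urlencode

BASE_URL = "https://www.googleapis.com/language/translate/v2"


def build_url(text):
    # "q" is a hypothetical parameter name; the question leaves the key empty.
    return BASE_URL + "?" + urlencode({"q": text})


url = build_url("不存")
print(url)

# Raises UnicodeEncodeError if any raw non-ASCII text remains in the URL.
url.encode("ascii")
```

urlencode encodes the text as UTF-8 and percent-escapes each byte, so the resulting string is safe to pass to urllib.request.urlopen.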

Related

UnicodeEncodeError : 'latin-1' codec can't encode character '\u01e2' in position 8: ordinal not in range(256). How to solve? [duplicate]

I need to access a database using this code, which is provided by MobiDB, to obtain disorder predictions for proteins.
import urllib2
import json
# Define request
acceptHeader = 'My_File_TrEMBL.txt' # text/csv and text/plain supported
request = urllib2.Request("https://mobidb.org/ws/P04050/uniprot", headers={"Accept" : acceptHeader})
# Send request
response = urllib2.urlopen(request)
# Parse JSON response into a Python dict
data = json.load(response)
# handle data
print(data)
Since I'm not using Python 2.6 I changed the script as follows:
import urllib.request
import json
# Define request
acceptHeader = 'My_File_TrEMBL.txt'  # text/csv and text/plain supported
request = urllib.request.Request("https://mobidb.org/ws/P04050/uniprot", headers={"Accept": acceptHeader})
# Send request
response = urllib.request.urlopen(request)
# Parse JSON response into a Python dict
data = json.load(response)
# handle data
print(data)
So I am using urllib.request instead of urllib2. The problem arises when the variable request is passed to urllib.request.urlopen, which raises this error:
'latin-1' codec can't encode character '\u01e2' in position 8: ordinal not in range(256)
I understand that this is something related to character encoding, but since I am new to Python and short on time given the deadline for this work, I'd appreciate any help you can give me.
Much obliged.

Decode the bytes content using UTF-8 and parse it with json.loads:
response = urllib.request.urlopen(request)
# get the content and decode it using utf-8
respcontent = response.read().decode('utf-8')
respcontent = response.read().decode('utf-8')
data = json.loads(respcontent)
print(data)
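
Note that the error in the question is an encode error, so it may originate on the request side rather than in the response. One possible source, offered as an assumption rather than a confirmed diagnosis: http.client encodes string header values as Latin-1 before sending them, so a header value containing a character such as U+01E2 cannot be sent. The helper name and the sample value below are hypothetical:

```python
def latin1_problem(value):
    """Return None if value is Latin-1 encodable, else the error message."""
    try:
        value.encode("latin-1")
    except UnicodeEncodeError as exc:
        return str(exc)
    return None


# A plain media type, as the code comment in the question suggests.
print(latin1_problem("text/csv"))

# Hypothetical header value containing U+01E2, which Latin-1 cannot represent.
print(latin1_problem("My_File_\u01e2.txt"))
```

If the check reports a problem for the Accept header, replacing the value with a media type such as text/csv or text/plain would be the first thing to try.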

Unable to call an API and get respective output in Python

I am new to Python and I am trying to call an API and get its output. However, I am running into the following error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8f in position 2: invalid start byte
The code I have written thus far is as follows:
import requests
file = open("NLPData/testimg.png")
payload = {file}
response = requests.get("http://sampleapi.com/get", params=payload)
print(response.text)
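
A likely cause, stated as an assumption since the question has no accepted answer here: open() without a mode reads the file as text and tries to decode its bytes, which fails for binary data such as an image. The sketch below reproduces this with a temporary file holding a few hypothetical bytes instead of a real PNG:

```python
import os
import tempfile

# Hypothetical stand-in for an image file: bytes that are not valid UTF-8.
raw = b"\x00\x01\x8f\xff"

with tempfile.NamedTemporaryFile(delete=False, suffix=".bin") as tmp:
    tmp.write(raw)
    path = tmp.name

try:
    # Text mode decodes the bytes while reading, which fails on binary data.
    try:
        with open(path, encoding="utf-8") as fh:
            fh.read()
        text_mode_error = None
    except UnicodeDecodeError as exc:
        text_mode_error = exc

    # Binary mode returns the bytes untouched.
    with open(path, "rb") as fh:
        binary_content = fh.read()
finally:
    os.remove(path)

print(text_mode_error)
print(binary_content == raw)
```

Opening the file with "rb" avoids the decode step entirely. How the bytes should then be sent depends on the API, which the question does not specify.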

UnicodeDecodeError on Windows, but not when running the exact same code on Mac

I'm trying to download json data via an API. The code is as follows:
import urllib.request, ssl, json
context = ssl._create_unverified_context()
rsbURL = "https://rsbuddy.com/exchange/summary.json"
with urllib.request.urlopen(rsbURL, context=context) as url:
    data = json.loads(url.read().decode('UTF-8'))
This code works perfectly fine on my Mac, and I confirmed that data contains the expected JSON content. However, when I run the exact same code on Windows, I get this error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
What is going on and how do I fix it?

Looks like the server is sending a compressed response for some reason (it shouldn't be doing that unless you explicitly set the accept-encoding header). You can adapt your code to work with compressed responses like this:
import gzip
import urllib.request, ssl, json
context = ssl._create_unverified_context()
rsbURL = "https://rsbuddy.com/exchange/summary.json"
with urllib.request.urlopen(rsbURL, context=context) as url:
    if url.info().get('Content-Encoding') == 'gzip':
        body = gzip.decompress(url.read())
    else:
        body = url.read()
data = json.loads(body)
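
The byte 0x8b at position 1 in the error is consistent with this diagnosis: every gzip stream starts with the two bytes 0x1f 0x8b. As a variation on the approach above (not the answer's method, which checks the Content-Encoding header), a sketch that detects compression from those leading bytes could look like this. The helper name and the sample payload are hypothetical, and no network access is needed:

```python
import gzip
import json

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def decode_json_body(body):
    """Parse a JSON response body, decompressing it first if it looks gzipped."""
    if body[:2] == GZIP_MAGIC:
        body = gzip.decompress(body)
    return json.loads(body.decode("utf-8"))


# Hypothetical payload standing in for the server response.
payload = {"item": "example", "price": 42}
plain = json.dumps(payload).encode("utf-8")
compressed = gzip.compress(plain)

print(decode_json_body(plain) == payload)
print(decode_json_body(compressed) == payload)
```

Checking the header is the more conventional choice; sniffing the leading bytes is only useful when the header is missing or unreliable.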

sending binary data over json

I want to upload a file to S3 using Python. I am using the requests_aws4auth library for this:
import requests
from requests_aws4auth import AWS4Auth
# data=encode_audio(data)
endpoint = 'http://bucket.s3.amazonaws.com/testing89.mp3'
data = b'...'  # placeholder: some binary data (MP3) here
auth = AWS4Auth('xxxxxx', 'xxxxxx', 'eu-west-2', 's3')
response = requests.put(endpoint, data=data, auth=auth, headers={'x-amz-acl': 'public-read'})
print(response.text)
I have tried the above code and got the following error.
'ascii' codec can't decode byte 0xff in position 0: ordinal not in range(128).
This works fine if I send text data, since that is also ASCII, but when binary data is sent I think the binary data is being concatenated incorrectly with the auth data. Am I doing something wrong? Any guidance would be appreciated. Thanks.

Character encoding in a GET request with http.client lib (Python)

I am a beginner in Python and I wrote this little script to send an HTTP GET request to my local server (localhost). It works great, except that I would like to be able to send non-ASCII Latin characters, such as accented letters.
import http.client
httpMethod = "GET"
url = "localhost"
params = "Hello World"
def httpRequest(httpMethod, url, params):
    conn = http.client.HTTPConnection(url)
    conn.request(httpMethod, '/?param='+params)
    conn.getresponse().read()
    conn.close()
    return
httpRequest(httpMethod, url, params)
When I put accented words in my parameter "params", this error message appears:
UnicodeEncodeError: 'ascii' codec can't encode character '\xe9' in position 14: ordinal not in range(128)
I don't know whether there is a solution using the http.client library, but I think so. In the http.client documentation, I can see this:
HTTPConnection.request
Strings are encoded as ISO-8859-1, the default charset for HTTP

You shouldn't construct arguments manually. Use urlencode instead:
>>> from urllib.parse import urlencode
>>> params = 'Aserejé'
>>> urlencode({'params': params})
'params=Aserej%C3%A9'
So, you can do:
conn.request(httpMethod, '/?' + urlencode({'params': params}))
Also note that your string will be encoded as UTF-8 before being URL-escaped.
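
To illustrate the UTF-8 round trip, here is a small sketch that builds the request target and then decodes it again with parse_qs, as a server-side parser typically would. The helper name build_target is hypothetical, and nothing is sent over the network:

```python
from urllib.parse import parse_qs, urlencode


def build_target(params):
    """Build the request target for conn.request() from a dict of query parameters."""
    return "/?" + urlencode(params)


target = build_target({"params": "Aserejé"})
print(target)

# The target is pure ASCII, so http.client can send it as-is.
target.encode("ascii")

# Decoding the query string recovers the original text.
decoded = parse_qs(target[2:])
print(decoded)
```

The result of build_target can be passed directly as the second argument of conn.request.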
