making a simple GET/POST with url Encoding python - python

i have a custom url of the form
http://somekey:somemorekey#host.com/getthisfile.json
i tried all the way but getting errors :
method 1 :
from httplib2 import Http
ipdb> from urllib import urlencode
h=Http()
ipdb> resp, content = h.request("3b8138fedf8:1d697a75c7e50#abc.myshopify.com/admin/shop.json")
error :
No help on =Http()
Got this method from here
method 2 :
import urllib
urllib.urlopen(url).read()
Error :
*** IOError: [Errno url error] unknown url type: '3b8108519e5378'
I guess something wrong with the encoding ..
i tried ...
ipdb> url.encode('idna')
*** UnicodeError: label empty or too long
Is there any way to make this Complex url get call easy .

You are using a PDB-based debugger instead of a interactive Python prompt. h is a command in PDB. Use ! to prevent PDB from trying to interpret the line as a command:
!h = Http()
urllib requires that you pass it a fully qualified URL; your URL is lacking a scheme:
urllib.urlopen('http://' + url).read()
Your URL does not appear to use any international characters in the domain name, so you do not need to use IDNA encoding.
You may want to look into the 3rd-party requests library; it makes interacting with HTTP servers that much easier and straightforward:
import requests
r = requests.get('http://abc.myshopify.com/admin/shop.json', auth=("3b8138fedf8", "1d697a75c7e50"))
data = r.json() # interpret the response as JSON data.

The current de facto HTTP library for Python is Requests.
import requests
response = requests.get(
"http://abc.myshopify.com/admin/shop.json",
auth=("3b8138fedf8", "1d697a75c7e50")
)
response.raise_for_status() # Raise an exception if HTTP error occurs
print response.content # Do something with the content.

Related

How can I read the contents of an URL with Transcrypt? Where is urlopen() located?

In Transcrypt I try to read JSON data from a URL, so I try:
import urllib.request
data = urllib.request.urlopen(data_url)
But I get the error "Import error, can't find [...] urllib.request". So urllib.request doesn't seem to be support; strangely though the top-level import urllib works, but with this I do not get to the urlopen() function...
Any idea where urlopen() is located in Transcrypt? Or is there another way to retrieve URLs?
I don't believe Transcrypt has the Python urllib library available. You will need to use a corresponding JavaScript library instead. I prefer axios, but you can also just use the built in XMLHttpRequest() or window.fetch()
Here is a Python function you can incorporate that uses window.fetch():
def fetch(url, callback):
def check_response(response):
if response.status != 200:
console.error('Fetch error - Status Code: ' + response.status)
return None
return response.json()
prom = window.fetch(url)
resp = prom.then(check_response)
resp.then(callback)
prom.catch(console.error)
Just call this fetch function from your Python code and pass in the URL and a callback to utilize the response after it is received.

cannot access URL due to SSL module not available Python

First time trying to use Python 3.6 requests library's get() function with data from quandl.com and load and dump json.
import json
import requests
request = requests.get("https://www.quandl.com/api/v3/datasets/CHRIS/MX_CGZ2.json?api_key=api_keyxxxxx", verify=False)
request_text=request.text()
data = json.loads(request_text)
data_serialized = json.dumps(data)
print(data_serialized)
I have an account at quandl.com to access the data. The error when python program is run in cmd line says "cannot connect to HTTPS URL because SSL mode not available."
import requests
import urllib3
urllib3.disable_warnings()
r = requests.get(
"https://www.quandl.com/api/v3/datasets/CHRIS/MX_CGZ2.json?api_key=api_keyxxxxx").json()
print(r)
Output will be the following since i don't have API Key
{'quandl_error': {'code': 'QEAx01', 'message': 'We could not recognize your API key. Please check your API key and try again.'}}
You don't need to import json module as requests already have it.
Although I have verified 'quandl_api_keys received the following error when trying to retrieve data with 'print' data 'json.dumps' function: "quandl_error": {"code": "QEAx01" ... discovered that the incorrect fonts on the quotation marks around key code in the .env file resulted in this error. Check language settings and fonts before making requests after the error mssg.

Python 3.4 HTTP Error 505 retrieving json from url

I am trying to connect to a page that takes in some values and return some data in JSON format in Python 3.4 using urllib. I want to save the values returned from the json into a csv file.
This is what I tried...
import json
import urllib.request
url = 'my_link/select?wt=json&indent=true&f=value'
response = urllib.request.Request(url)
response = urllib.request.urlopen(response)
data = response.read()
I am getting an error below:
urllib.error.HTTPError: HTTP Error 505: HTTP Version Not Supported
EDIT: Found a solution to my problem. I answered it below.
You have found a server that apparently doesn't want to talk HTTP/1.1. You could try lying to it by claiming you are using a HTTP/1.0 client instead, by patching the http.client.HTTPConnection class:
import http.client
http.client.HTTPConnection._http_vsn = 10
http.client.HTTPConnection._http_vsn_str = 'HTTP/1.0'
and re-trying your request.
I used FancyURLopener and it worked. Found this useful: docs.python.org: urllib.request
url_request = urllib.request.FancyURLopener({})
with url_request.open(url) as url_opener:
json_data = url_opener.read().decode('utf-8')
with open(file_output, 'w', encoding ='utf-8') as output:
output.write(json_data)
Hope this helps those having the same problems as mine.

"ConnectionResetError" What should I do?

# -*- coding: UTF-8 -*-
import urllib.request
import re
import os
os.system("cls")
url=input("Url Link : ")
if(url[0:8]=="https://"):
url=url[:4]+url[5:]
if(url[0:7]!="http://"):
url="http://"+url
try :
try :
value=urllib.request.urlopen(url,timeout=60).read().decode('cp949')
except UnicodeDecodeError :
value=urllib.request.urlopen(url,timeout=60).read().decode('UTF8')
par='<title>(.+?)</title>'
result=re.findall(par,value)
print(result)
except ConnectionResetError as e:
print(e)
TimeoutError is disappeared. But ConnectionResetError appear. What is this Error? Is it server problem? So it can't solve with me?
포기하지 마세요! Don't give up!
Some website require specific HTTP Header, in this case, User-agent. So you need to set this header in your request.
Change your request like this (17 - 20 line of your code)
# Make request object
request = urllib.request.Request(url, headers={"User-agent": "Python urllib test"})
# Open url using request object
response = urllib.request.urlopen(request, timeout=60)
# read response
data = response.read()
# decode your value
try:
value = data.decode('CP949')
except UnicodeDecodeError:
value = data.decode('UTF-8')
You can change "Python urllib test" to anything you want. Almost every servers use User-agent for statistical purposes.
Last, consider using appropritate whitespaces, blank lines, comments to make your code more readable. It will be good for you.
More reading:
HTTP/1.1: Header Field Definitions - to understand what is User-agent header.
21.6. urllib.request — Extensible library for opening URLs — Python 3.4.3 documentation - Always read documentation. Link to urllib.request.Request section.

How do I get HTTP header info without authentication using python?

I'm trying to write a small program that will simply display the header information of a website. Here is the code:
import urllib2
url = 'http://some.ip.add.ress/'
request = urllib2.Request(url)
try:
html = urllib2.urlopen(request)
except urllib2.URLError, e:
print e.code
else:
print html.info()
If 'some.ip.add.ress' is google.com then the header information is returned without a problem. However if it's an ip address that requires basic authentication before access then it returns a 401. Is there a way to get header (or any other) information without authentication?
I've worked it out.
After try has failed due to unauthorized access the following modification will print the header information:
print e.info()
instead of:
print e.code()
Thanks for looking :)
If you want just the headers, instead of using urllib2, you should go lower level and use httplib
import httplib
conn = httplib.HTTPConnection(host)
conn.request("HEAD", path)
print conn.getresponse().getheaders()
If all you want are HTTP headers then you should make HEAD not GET request. You can see how to do this by reading Python - HEAD request with urllib2.

Categories