python socket, HTTPS request load full html code - python

I'm learning how to use sockets to make HTTPS requests. My problem is that the request succeeds (status 200), but I only get part of the page content, and I can't understand why it is split this way.
I receive the HTTP headers along with part of the HTML. I tried at least 3 different websites (including GitHub) and always got the same result.
I can log in to a website, use my cookies to load a new page and get a status 200, and still only get part of the page, such as just the site's navbars.
Does anyone have a clue?
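import socket
import ssl
HOST = 'www.python.org'
PORT = 443
# ssl.wrap_socket() was removed in Python 3.12; use an SSLContext instead
context = ssl.create_default_context()
MySock = context.wrap_socket(socket.socket(), server_hostname=HOST)
MySock.connect((HOST, PORT))
# HTTP requires CRLF line endings and a blank line after the headers
MySock.sendall("GET / HTTP/1.1\r\nHost: {}\r\n\r\n".format(HOST).encode())
# Create a file to check the response content
with open('PythonOrg.html', 'w') as File:
    # A single recv() returns at most one chunk of the response
    print(MySock.recv(50000).decode(), file=File)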

1) I can't seem to load the whole content with one large buffer, as in MySock.recv(50000). I need to loop with a smaller buffer, like 4096, and concatenate the chunks into a variable.
2) The request needs time to receive the entire response. I used time.sleep to handle the waiting, but I'm not sure it's the best way to wait for the server with an SSL socket. If anyone has a better way to get the entire response when it's big, feel free to share.
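One way to avoid time.sleep is sketched below, under the assumption that the server honors a "Connection: close" header: the client keeps calling recv() until it returns an empty bytes object, which signals that the peer has closed the connection. The helper names build_request, recv_all, and fetch are illustrative and do not come from the original post.

```python
import socket
import ssl


def build_request(host, path="/"):
    """Build a minimal HTTP/1.1 GET request with CRLF line endings."""
    lines = [
        "GET {} HTTP/1.1".format(path),
        "Host: {}".format(host),
        "Connection: close",  # ask the server to close once the response is sent
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def recv_all(sock, bufsize=4096):
    """Read from sock until the peer closes the connection."""
    chunks = []
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:  # b"" means the peer closed the connection
            break
        chunks.append(chunk)
    return b"".join(chunks)


def fetch(host, port=443, path="/"):
    """Fetch a page over TLS and return the raw response (headers and body)."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(build_request(host, path))
            return recv_all(tls)


if __name__ == "__main__":
    response = fetch("www.python.org")
    # Write bytes as-is; the body may be chunked or compressed by the server.
    with open("PythonOrg.html", "wb") as f:
        f.write(response)
```

If the connection has to stay open (keep-alive), the end of the body would instead have to be found from the Content-Length header or the chunked transfer encoding, which this sketch does not attempt.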

Related

503 Response when trying to use python request on local website

I'm trying to scrape my own site from my local server. But when I use python requests on it, it gives me a response 503. Other ordinary sites on the web work. Any reason/solution for this?
import requests
url = 'http://127.0.0.1:8080/full_report/a1uE0000002vu2jIAA/'
r = requests.get(url)
print(r)
prints out
<Response [503]>
After further investigation, I've found a similar problem to mine.
Python requests 503 errors when trying to access localhost:8000
However, I don't think he's solved it yet. I can access the local website via the web browser but can't access using the requests.get function. I'm also using Django to host the server.
python manage.py runserver 8080
When I use:
curl -vvv http://127.0.0.1:8080
* Rebuilt URL to: http://127.0.0.1:8080/
* Trying 10.37.135.39...
* Connected to proxy.kdc.[company-name].com (10.37.135.39) port 8099 (#0)
* Proxy auth using Basic with user '[company-id]'
> GET http://127.0.0.1:8080/ HTTP/1.1
> Host: 127.0.0.1:8080
> Proxy-Authorization: Basic Y2FhNTc2OnJ2YTkxQ29kZQ==
> User-Agent: curl/7.49.0
> Accept: */*
>
< HTTP/1.1 301 Moved Permanently
< Server: BlueCoat-Security-Appliance
< Location:http://10.118.216.201
< Connection: Close
<
<HTML>
<HEAD><TITLE>Redirection</TITLE></HEAD>
<BODY><H1>Redirect</H1></BODY>
* Closing connection 0
I cannot request a local url using python requests because the company's network software won't allow it. This is a dead end and other avenues must be pursued.
EDIT: Working Solution
>>> import requests
>>> session = requests.Session()
>>> session.trust_env = False
>>> r = session.get("http://127.0.0.1:8080")
>>> r
<Response [200]>
Maybe you should disable your proxies in your requests.
import requests
proxies = {
"http": None,
"https": None,
}
requests.get("http://127.0.0.1:8080/myfunction", proxies=proxies)
ref:
https://stackoverflow.com/a/35470245/8011839
https://2.python-requests.org//en/master/user/advanced/#proxies
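The curl trace in the question shows the loopback request being sent to a corporate proxy. A small check like the one below can confirm which proxy, if any, the environment variables would select for a given host. It uses only the standard library; proxy_for is a hypothetical helper name, and the proxy URL in the usage note is a placeholder rather than the one from the question.

```python
import os
import urllib.request


def proxy_for(scheme, host):
    """Return the proxy URL the environment selects for host, or None.

    None means either no proxy is configured for the scheme or the host
    is listed in the no_proxy variable.
    """
    proxies = urllib.request.getproxies_environment()
    if urllib.request.proxy_bypass_environment(host, proxies):
        return None
    return proxies.get(scheme)


if __name__ == "__main__":
    # Placeholder values for illustration only.
    os.environ["http_proxy"] = "http://proxy.example:8099"
    os.environ["no_proxy"] = "127.0.0.1,localhost"
    print(proxy_for("http", "127.0.0.1:8080"))
    print(proxy_for("http", "example.org"))
```

Listing the loopback addresses in no_proxy is an alternative to trust_env = False when the proxy should still be used for other hosts; requests reads the same environment variables as long as trust_env is left enabled.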
HTTP Error 503 means:
The Web server (running the Web site) is currently unable to handle the HTTP request due to a temporary overloading or maintenance of the server. The implication is that this is a temporary condition which will be alleviated after some delay. Some servers in this state may also simply refuse the socket connection, in which case a different error may be generated because the socket creation timed out.
You can try the following:
Check that you are able to open the URL in the browser.
If the URL opens, check the domain in your code; it might be incorrect.
If it does not open in the browser either, your site may be overloaded or the server may lack the resources to handle the request.
The most common cause of a 503 error is that a proxy host of some form is unable to communicate with the back end. For example, if you have Varnish trying to handle a request but Apache is down.
In your case, you have Django running on port 8080. (That's what the 8080 means). When you try to get content from 127.0.0.1, though, you're going to the default HTTP port (80). This means that your default server (Apache maybe? NginX?) is trying to find a host to serve 127.0.0.1 and can't find one.
You have two choices. Either you can update your server's configuration, or you can include the port in the URL.
url = 'http://127.0.0.1:8080/full_report/a1uE0000002vu2jIAA/'

Python - Using Windows hosts file when using Python Requests / Use predefined IP Address without making a DNS request

I am trying to use Python requests to make a HTTP GET request to a domain, without using urllib3/httplib.HTTPConnection to perform a DNS request for the domain. I set the domain in the Windows hosts file, but Python requests appears to override this, so I need to define the DNS resolution for the domain in the script.
I want the script to bypass the DNS request so I can set the IP address. In the example below I've set this to 45.22.67.8, and I will change this to my public IP address later.
I tried using this 'monkey patching' technique but it doesn't work. Requests doesn't generate a DNS request in Wireshark, but it also doesn't connect to the HTTP server.
import socket
import requests
from requests.packages.urllib3.connection import HTTPConnection
socket.getaddrinfo = '45.22.67.8'
url = "http://www.randomdomain.com"
requests.get(url, timeout=10)
Error
'str' object is not callable
Thanks!
Edit: just updated the code in my example. All I want to do is override future http connections to trick the http packets to go to a different destination IP.
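The error comes from replacing socket.getaddrinfo, which is a function, with a string. A minimal sketch of the monkey-patching idea is to replace it with another function that swaps the hostname for a fixed IP and then delegates to the original. The OVERRIDES table and the function names below are illustrative, and this assumes the HTTP library resolves names through socket.getaddrinfo at call time.

```python
import socket

# Keep a reference to the real resolver so the patch can delegate to it.
_original_getaddrinfo = socket.getaddrinfo

# Hypothetical override table: hostname -> IP address to use instead.
OVERRIDES = {"www.randomdomain.com": "45.22.67.8"}


def patched_getaddrinfo(host, *args, **kwargs):
    """Resolve overridden hostnames to a fixed IP, others as usual."""
    return _original_getaddrinfo(OVERRIDES.get(host, host), *args, **kwargs)


def install_override():
    """Replace socket.getaddrinfo with the patched version."""
    socket.getaddrinfo = patched_getaddrinfo


def remove_override():
    """Restore the original resolver."""
    socket.getaddrinfo = _original_getaddrinfo
```

Because the URL passed to requests.get is unchanged, the Host header should still carry the original domain; only the destination IP differs. An IP literal needs no DNS lookup, so no query should appear in Wireshark for the overridden name.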

With Bottle, how could I just peek the head of http request instead of receiving whole http request?

I don't know if it is possible with Bottle.
My website (powered by Bottle) allow users to upload image files. But I limited the size of it to 100K. I use the following code in web server to do that.
uploadLimit = 100 # 100k
uploadLimitInByte = uploadLimit * 2**10
print("before call request.headers.get('Content-Length')")
contentLen = request.headers.get('Content-Length')
if contentLen:
    contentLen = int(contentLen)
    if contentLen > uploadLimitInByte:
        return HTTPResponse('upload limit is 100K')
But when I click the upload button in the browser to upload a file of about 2MB, the server seems to receive the whole 2MB HTTP request.
I expected the above code to receive only the HTTP headers rather than the whole request. As it stands, it does not avoid wasting time receiving unnecessary bytes.

Set port in requests

I'm attempting to make use of cgminer's API using Python. I'm particularly interested in utilizing the requests library.
I understand how to do basic things in requests, but cgminer wants to be a little more specific. I'd like to shrink
import socket
import json
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect(('localhost', 4028))
sock.send(json.dumps({'command': 'summary'}).encode())
using requests instead.
How does one specify the port using that library, and how does one send such a json request and await a response to be stored in a variable?
Requests is an HTTP library.
You can specify the port in the URL: http://example.com:4028/....
But from what I could read in a hurry, cgminer provides an RPC API (or JSON-RPC?), not an HTTP interface.
As someone who has learned some of the common pitfalls of python networking the hard way, I'm adding this answer to emphasize an important-but-easy-to-mess-up point about the 1st arg of requests.get():
localhost is an alias which your computer resolves to 127.0.0.1, the IP address of its own loopback adapter. foo.com is also an alias, just one that gets resolved further away from the host.
requests.get('foo.com:4028') #<--fails
requests.get('http://foo.com:4028') #<--works usually
& for loopbacks:
requests.get('http://127.0.0.1:4028') #<--works
requests.get('http://localhost:4028') #<--works
This one requires import socket and gives you the local IP of your host (that is, your address within your own LAN). It goes a little farther out from the host than localhost does, but not all the way out to the open internet:
requests.get('http://{}:4028'.format(socket.gethostbyname(socket.gethostname()))) #<--works
You can specify the port for the request with a colon just as you would in a browser, such as
r = requests.get('http://localhost:4028')
This will block until a response is received, or until the request times out, so you don't need to worry about awaiting a response.
You can send JSON data as a POST request using the requests.post method with the data parameter, such as
import json, requests
payload = {'command': 'summary'}
r = requests.post('http://localhost:4028', data=json.dumps(payload))
Accessing the response is then possible with r.text or r.json().
Note that requests is an HTTP library - if it's not HTTP that you want then I don't believe it's possible to use requests.
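If the API is plain JSON over TCP rather than HTTP, the socket code from the question can be wrapped in a small helper instead. The sketch below assumes the server sends one reply and then closes the connection, and that the reply may end with a NUL byte; both are assumptions about the server, and send_json_command is a hypothetical name.

```python
import json
import socket


def send_json_command(host, port, payload, timeout=5.0):
    """Send one JSON object over TCP and return the decoded JSON reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(json.dumps(payload).encode("utf-8"))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # the server closed the connection
                break
            chunks.append(chunk)
    # Assumption: the reply may be NUL-terminated, so strip that before parsing.
    raw = b"".join(chunks).rstrip(b"\x00")
    return json.loads(raw.decode("utf-8"))


if __name__ == "__main__":
    summary = send_json_command("localhost", 4028, {"command": "summary"})
    print(summary)
```

The return value is an ordinary dict, so the response ends up in a variable much as r.json() would provide with requests.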

Python Sockets, download is almost 10x the size of original file, upload is 0 bytes

Creating a mobile application with embedded Python 2.7,
using the Marmalade C++ SDK.
I'm integrating connectivity to cloud file transfer services.
FTP: file transfers work flawlessly
Dropbox: authenticates, then gives me: socket [Errno 22] Invalid argument
Google Drive: authenticates and lists metadata, but file transfers elicit some strange behavior
I've made all the bindings to the Marmalade socket subsystem (Unix-like), but some features are unimplemented. To connect to Google Drive, I initially modified httplib2/__init__.py, changing all instances of:
self.sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
#to this:
self.sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
After doing this little patch I could successfully connect and download metadata from Google Drive. However:
When I upload a 7KB file, the file appears on Google Drive, but has a file size of 0 bytes
When I download a 7KB file using urllib, I get a 54KB file back
I know it must have to do with a misconfiguration of the socket properties, but not all properties are implemented.
Here are some standard python test outputs (test_sockets , test_httplib )
Implementation here:
Marmalade /h/std/netdb.h
Are there any that I should try as a viable replacement?
I don't have a clue.
From: unix-man setsockopt(2)
SO_DEBUG enables recording of debugging information
SO_REUSEADDR enables local address reuse
SO_REUSEPORT enables duplicate address and port bindings
SO_KEEPALIVE enables keep connections alive
SO_DONTROUTE enables routing bypass for outgoing messages
SO_LINGER linger on close if data present
SO_BROADCAST enables permission to transmit broadcast messages
SO_OOBINLINE enables reception of out-of-band data in band
SO_SNDBUF set buffer size for output
SO_RCVBUF set buffer size for input
SO_SNDLOWAT set minimum count for output
SO_RCVLOWAT set minimum count for input
SO_SNDTIMEO set timeout value for output
SO_RCVTIMEO set timeout value for input
SO_ACCEPTFILTER set accept filter on listening socket
SO_TYPE get the type of the socket (get only)
SO_ERROR get and clear error on the socket (get only)
Here is my Google upload / download / listing source code
I'll brute force this until the problem is resolved, hopefully. I'll report back if I figure it out.
I figured it out. There were 2 problems with my file handling code.
In uploading:
database_file = drive.CreateFile()
database_file['title'] = packageName
# this needs to be set
database_file.SetContentFile(packageName)
#
database_file['parents']=[{"kind": "drive#fileLink" ,'id': str(cloudfolderid) }]
In downloading, I was using the wrong URL (webContentLink is for browsers only; use "downloadUrl"). I also needed to craft a header to authorize the download:
import urllib2
import json
url = 'https://doc-14-5g-docs.googleusercontent.com/docs/securesc/n4vedqgda15lkaommio7l899vgqu4k84/ugncmscf57d4r6f64b78or1g6b71168t/1409342400000/13487736009921291757/13487736009921291757/0B1umnT9WASfHUHpUaWVkc0xhNzA?h=16653014193614665626&e=download&gd=true'
#Parse saved credentials
credentialsFile = open('./configs/gcreds.dat', 'r')
rawJson = credentialsFile.read()
credentialsFile.close()
values = json.loads(rawJson)
#Header must include: {"Authorization" : "Bearer xxxxxxx SomeToken xxxxx"}
ConstructedHeader = "Bearer " + str(values["token_response"]["access_token"])
Header = {"Authorization": ConstructedHeader}
req = urllib2.Request( url, headers= Header )
response = urllib2.urlopen(req)
output = open("UploadTest.z.crypt",'wb')
output.write(response.read())
output.close()
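For readers on Python 3, where urllib2 no longer exists, the same download might look like the sketch below. It assumes the credentials file has the same "token_response" / "access_token" layout as above; the function names are illustrative, and the body is streamed in chunks rather than read in one call.

```python
import json
import urllib.request


def build_authorized_request(url, credentials_path):
    """Return a Request carrying the saved access token as a Bearer header."""
    with open(credentials_path, "r") as credentials_file:
        values = json.load(credentials_file)
    token = values["token_response"]["access_token"]
    return urllib.request.Request(url, headers={"Authorization": "Bearer " + token})


def download(url, credentials_path, output_path, chunk_size=64 * 1024):
    """Stream the response body to output_path in binary mode."""
    request = build_authorized_request(url, credentials_path)
    with urllib.request.urlopen(request) as response, open(output_path, "wb") as output:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            output.write(chunk)
```

Opening the output file in binary mode ('wb'), as the original code does, keeps the bytes unchanged on platforms that translate line endings in text mode.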
