Limit HTTP POST size for form submission - python

Let's say I have a simple HTML webpage (served using apache) as
<!DOCTYPE HTML>
<html>
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width">
<meta name="description" content="CGI script test">
<meta name="keywords" content="test">
<meta name="author" content="cgi test">
<title> CGI Script Test </title>
</head>
<body>
<form action="/cgi-bin/submit.py" method="POST">
<label for="entry">Entry name: </label>
<input type="text" id="entry" name="entryname" placeholder="placeholder" maxlength="10">
</form>
</body>
</html>
where data submitted in the form is processed using submit.py, a python script (placed in my cgi-bin directory) as
#!/usr/bin/python
import cgi,re
form = cgi.FieldStorage()
print("Content-Type: text/html\n\n")
print("<title>Hello World</title>")
print("<h1>HELLO</h1>")
text=str(form.getvalue("entryname"))
print("<p> Parsing result...</p>")
# The hyphen goes last so it is literal; '/-_' would be read as a character range
result = re.sub('[^a-zA-Z0-9:#/_,-]', ' ', text)
print("<h3> Resulting Info: </h3>")
print("<p>" + str(result) + "</p>")
I want to avoid my server getting stuffed with POSTs that are excessively long. If I load the HTML webpage above, I can use the "inspect element" tool in firefox to delete the "maxlength" requirement and stuff in as much information as I want. The python script then receives the full input text. This is my first website and I want to make sure I do this right. Is there a limit to the size of the POST sent to the server, and if not, how do I limit it to prevent abuse?

You can examine the Content-Length header (passed to a CGI script as the CONTENT_LENGTH environment variable) and compare it with a limit. cgitb.enable should help with displaying errors when the limit is reached.
import cgitb
import os
MAX_POST_BODY_SIZE = 1024
cgitb.enable(display=0)
# An unset or empty CONTENT_LENGTH is treated as 0
content_length = int(os.environ.get('CONTENT_LENGTH') or 0)
if content_length > MAX_POST_BODY_SIZE:
    raise RuntimeError('POST body too long')
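As a rough sketch of the same check wrapped in functions, the script could answer with an explicit status line instead of raising. The helper names `body_too_large` and `reject_if_too_large`, and the choice of a 413 response, are illustrative assumptions rather than part of the original script:

```python
import os

MAX_POST_BODY_SIZE = 1024  # bytes; same illustrative limit as above


def body_too_large(environ, limit=MAX_POST_BODY_SIZE):
    """Return True when the declared request body exceeds `limit` bytes."""
    raw = environ.get("CONTENT_LENGTH") or "0"
    try:
        declared = int(raw)
    except ValueError:
        # Assumption of this sketch: a malformed header counts as over the limit.
        return True
    return declared < 0 or declared > limit


def reject_if_too_large(environ=os.environ):
    """Print a 413 CGI response and return True if the request should be refused."""
    if body_too_large(environ):
        print("Status: 413 Payload Too Large")
        print("Content-Type: text/plain")
        print()
        print("POST body too long")
        return True
    return False
```

A script would call `reject_if_too_large()` before creating `cgi.FieldStorage()` and exit early when it returns True.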
Update 1
I've looked into the cgi module's code, and it seems POST body size limiting is actually implemented in FieldStorage, but it's not documented. There's a cgi.maxlen attribute:
# Maximum input we will accept when REQUEST_METHOD is POST
# 0 ==> unlimited input
maxlen = 0
Hence, it should be just:
import cgi
import cgitb
cgi.maxlen = 1024
cgitb.enable(display=0)
form = cgi.FieldStorage()
[...] would the server still have to allocate memory to receive the full post?
As far as I can see, the initialiser of FieldStorage proceeds as follows:
fp is assigned: self.fp = sys.stdin.buffer
Content-Length is validated against maxlen, and a ValueError is raised if it exceeds the limit
when the Content-Type is application/x-www-form-urlencoded, read_urlencoded is called, which reads Content-Length bytes from the instance's fp attribute
So an oversized body is rejected before FieldStorage reads it from standard input.
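The order of those steps (validate first, read afterwards) can be sketched without the cgi module. This is a simplified illustration, not the module's actual code; `read_limited` is a hypothetical helper name:

```python
import io


def read_limited(stream, declared_length, limit):
    """Validate the declared length, then read exactly that many bytes.

    `stream` is any binary file-like object; in a CGI script it would be
    sys.stdin.buffer, and `declared_length` would come from CONTENT_LENGTH.
    """
    if declared_length > limit:
        # Nothing has been read yet, so no buffer for the body is allocated here.
        raise ValueError("Maximum content length exceeded")
    # Never ask for more than the declared length.
    return stream.read(max(declared_length, 0))


# Small body: accepted and read in full.
small = read_limited(io.BytesIO(b"entryname=abc"), 13, 1024)

# Large body: refused before any read takes place.
try:
    read_limited(io.BytesIO(b"x" * 5000), 5000, 1024)
    refused = False
except ValueError:
    refused = True
```

Note that this only bounds what the script itself reads; whether the web server buffers the body before starting the script depends on the server.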
To test it with your CGI server, send a big request and look at htop or other process monitor for the CGI process' memory usage.
from urllib.request import urlopen
urlopen(
    'http://localhost:8000/cgi-bin/submit.py',
    data=b'entryname=%s' % (b'abcd' * 2 ** 24),  # 64 MiB
)

Related

Adding external javascript to page that will be sent by a server

I'm building a little server that sends a page with a JavaScript script. It works when I open the page directly in my browser, but if I request the page from the server, the page is received but the script is not, so I get these errors:
Script.js is missing from the list of files:
This looks strange, because in the network session I can see a request for the script with status 200:
The JS file I'm trying to add is Chart.js, so I can't inline it; it would become impossible to work with. For now the server is just a few lines of Python that use the SimpleHTTPRequestHandler, and I'm probably going to replace it. Could it be that the SimpleHTTPRequestHandler can't handle multiple file requests?
Here's the code. I tried to make a snippet, but it doesn't work there either (probably because I've never written a snippet before):
HTML:
<!doctype html>
<html>
<head>
<script src="script.js"></script>
</head>
<body>
<p id = "paragraph"></p>
<script>
document.getElementById('paragraph').innerHTML = sayHello();
</script>
</body>
</html>
JS:
function sayHello(){
return "HelloWorld!"
}
Here is the python server script:
from http.server import HTTPServer, BaseHTTPRequestHandler
class SimpleHTTPRequestHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        # Open in binary mode: wfile.write() expects bytes
        with open("index.html", "rb") as page:
            self.wfile.write(page.read())
httpd = HTTPServer(("192.168.1.100", 8000), SimpleHTTPRequestHandler)
httpd.serve_forever()
I think you can get an element by id, tag name, or class name and append the script file to it:
var head = document.getElementsByTagName('head')[0];
var script = document.createElement('script');
script.type = 'text/javascript';
script.onload = function() {
    callFunctionFromScript();
}
script.src = 'path/to/your-script.js';
head.appendChild(script);
also check this link
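One thing worth checking in the server from the question: its do_GET sends index.html for every path, so a request for /script.js also comes back with status 200 but with the HTML page as its body. A minimal sketch of a handler that picks the file from the request path, limited to the two files from the question (`ROUTES`, `resolve`, `PageHandler` and `serve` are illustrative names, not from the original):

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Request path -> (file name, Content-Type). Illustrative table for the two
# files in the question; anything else is answered with 404.
ROUTES = {
    "/": ("index.html", "text/html; charset=utf-8"),
    "/index.html": ("index.html", "text/html; charset=utf-8"),
    "/script.js": ("script.js", "application/javascript"),
}


def resolve(path):
    """Return (file name, content type) for a request path, or None if unknown."""
    return ROUTES.get(path.split("?", 1)[0])


class PageHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        target = resolve(self.path)
        if target is None:
            self.send_error(404, "File not found")
            return
        filename, content_type = target
        with open(filename, "rb") as f:  # bytes, as wfile.write() expects
            body = f.read()
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(host, port):
    """Start the server; blocks until interrupted."""
    HTTPServer((host, port), PageHandler).serve_forever()
```

Calling `serve("192.168.1.100", 8000)` would start it on the address used in the question. The standard library's `http.server.SimpleHTTPRequestHandler` (not the class of the same name defined in the question) already serves files from the current directory, which may be enough here.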

How to get authorization token through OAuth2 and read Outlook emails through http requests?

I've been retrieving my emails in Outlook by only using python requests to GET the endpoint https://outlook.office365.com/api/v1.0/me/messages
Example code:
import requests
requests.get('https://outlook.office365.com/api/v1.0/me/messages', auth=(email, pwd))
I would get a json object back, parse through it, and get the contents of my email. But now Microsoft has deprecated it and I have been trying to migrate over to using Microsoft Graph. My question is, how do I get an OAuth2 token without having to launch a browser using http requests?
So far I've been reading through the docs and registered my "app" (just a regular python script) in the Application Registration Portal. Every example that I come across, I always have to go visit an authorization url where I have to manually log in through a front end UI which looks like this:
I want to be able to do this through http requests/without a frontend UI but I can't seem to find any answers on how to do this.
This is the code I have so far:
outlook_test.py
import requests
authorize_url = 'https://login.microsoftonline.com/common/oauth2/v2.0/authorize'
token_url = 'https://login.microsoftonline.com/common/oauth2/v2.0/token'
payload = {
'client_id': app_client_id, # Variable exists but not exposed in this question
'response_type': 'code',
'redirect_uri': 'https://login.microsoftonline.com/common/oauth2/nativeclient',
'response_mode': 'form_post',
'scope': 'mail.read',
'state': '12345'
}
r = requests.get(authorize_url, params=payload)
print(r.status_code)
print(r.text)
This is what I get back:
200
<!DOCTYPE html>
<html dir="ltr" class="" lang="en">
<head>
<title>Sign in to your account</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=2.0, user-scalable=yes">
<meta http-equiv="Pragma" content="no-cache">
<meta http-equiv="Expires" content="-1">
<meta name="PageID" content="ConvergedSignIn" />
<meta name="SiteID" content="" />
<meta name="ReqLC" content="1033" />
<meta name="LocLC" content="en-US" />
<noscript>
<meta http-equiv="Refresh" content="0; URL=https://login.microsoftonline.com/jsdisabled" />
</noscript>
<link rel="shortcut icon" href="https://secure.aadcdn.microsoftonline-p.com/ests/2.1.8502.8/content/images/favicon_a_eupayfgghqiai7k9sol6lg2.ico" />
<meta name="robots" content="none" />
...
This is what I have in my platform settings when I registered my app if it helps:
Is there any way I can get the authorization code programmatically? I tried passing in the auth parameter as well but that didn't work.
New Findings
I recently found out that python requests handles OAuth2 but now I am getting a different error when trying to follow their examples.
This is the error that I'm getting:
File "outlook_test.py", line 31, in <module>
token = oauth.fetch_token(token_url=token_url, auth=auth)
File "C:\Users\ryee\Documents\gitLabQA\QA_BDD\outlook_env\lib\site-packages\requests_oauthlib\oauth2_session.py", line 307, in fetch_token
self._client.parse_request_body_response(r.text, scope=self.scope)
File "C:\Users\ryee\Documents\gitLabQA\QA_BDD\outlook_env\lib\site-packages\oauthlib\oauth2\rfc6749\clients\base.py", line 415, in parse_request_body_response
self.token = parse_token_response(body, scope=scope)
File "C:\Users\ryee\Documents\gitLabQA\QA_BDD\outlook_env\lib\site-packages\oauthlib\oauth2\rfc6749\parameters.py", line 425, in parse_token_response
validate_token_parameters(params)
File "C:\Users\ryee\Documents\gitLabQA\QA_BDD\outlook_env\lib\site-packages\oauthlib\oauth2\rfc6749\parameters.py", line 432, in validate_token_parameters
raise_from_error(params.get('error'), params)
File "C:\Users\ryee\Documents\gitLabQA\QA_BDD\outlook_env\lib\site-packages\oauthlib\oauth2\rfc6749\errors.py", line 405, in raise_from_error
raise cls(**kwargs)
oauthlib.oauth2.rfc6749.errors.InvalidClientIdError: (invalid_request) AADSTS90014: The required field 'scope' is missing.
Trace ID: 2359d8a6-0140-43c1-8ff5-8103045d2f00
Correlation ID: dff39a1f-ffe1-493e-aea3-1536974b777d
I tried '[Mail.Read]' as a scope but I'm getting an "Invalid Scope parameter".
My new python script:
outlook_test.py
from oauthlib.oauth2 import BackendApplicationClient
from requests.auth import HTTPBasicAuth
from requests_oauthlib import OAuth2Session
auth = HTTPBasicAuth(client_app_id, app_secret)
client = BackendApplicationClient(client_id=client_app_id)
oauth = OAuth2Session(client=client)
token = oauth.fetch_token(token_url=token_url, auth=auth)
The scope you are using, Mail.Read, requires user consent. What you want is the application permission Mail.Read, which requires admin consent (this link shows how to get admin consent).
You are trying to access a user's messages, thus the user needs to consent to that action.
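For the application-permission route, a token request without a browser might look like the sketch below. It assumes the client credentials grant on the v2.0 endpoint, which, as far as I understand the Microsoft identity platform documentation, expects the resource's `.default` scope and a specific tenant in the URL rather than `common`. The function names and the tenant, client id and secret values are placeholders, and the standard library is used here instead of requests_oauthlib so the sketch is self-contained:

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

TOKEN_URL_TEMPLATE = "https://login.microsoftonline.com/{tenant}/oauth2/v2.0/token"


def build_token_request(tenant, client_id, client_secret):
    """Build (without sending) a client-credentials token request."""
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        # Application permissions are requested through the resource's
        # ".default" scope rather than an individual scope such as Mail.Read.
        "scope": "https://graph.microsoft.com/.default",
    }
    return Request(
        TOKEN_URL_TEMPLATE.format(tenant=tenant),
        data=urlencode(form).encode("ascii"),
        method="POST",
    )


def fetch_token(request):
    """Send the request and return the parsed JSON response (needs network access)."""
    with urlopen(request) as response:
        return json.load(response)
```

The missing `scope` field in the form body matches the AADSTS90014 error in the traceback above; admin consent for the application permission still has to be granted before the token is useful.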
According to the docs:-
I found this tutorial very helpful for getting access codes:
This tutorial uses Microsoft Graph (which covers several Microsoft products including Microsoft Outlook) rather than the outlook REST API (which covers just Outlook).
https://learn.microsoft.com/en-us/outlook/rest/python-tutorial
At first, I thought setting up a Django server was overkill. Then I realized that I wanted a way for my Python instance to capture the access code after going through single sign-on. (I MUST use my browser for single sign-on because my institution uses multi-factor authentication.) Having a Django server is a natural way to do this.
So I created a new PyCharm Django project (which is straight-forward in PyCharm) and began following the tutorial.
I found it essential to keep following the tutorial all the way through displaying my emails to avoid authentication errors. When I deviated from the tutorial, I got error messages (such as this one) that were impenetrable.
(I previously posted this answer here in response to a different question.)

How to open a .html file and click submit button using python

I have a .html file where I am sending a value using the submit button as follows:
<HTML>
<HEAD>
<TITLE>XYZ Ltd.</TITLE>
</HEAD>
<BODY>
<FORM ACTION="http://192.168.2.2/cgi-bin/http_recv.cgi" METHOD="POST">
<TEXTAREA NAME="DATA_SEND" COLS='160' ROWS='40' WRAP='none'>
</TEXTAREA>
<INPUT TYPE="SUBMIT" VALUE="Send Data">
</FORM>
</BODY>
</HTML>
I looked at Selenium, and from my understanding it doesn't suit me. I would like to keep and maintain an .html file like the one above, so it has to be opened and its button clicked. I also came across a CGI/Python example, but I would go for that only if there is no other alternative.
How can I use python to:
Open the .html file and
Press the "Send Data" button
Read any response given (assuming the response maybe displayed within a HTML page or a dialog box)
Python code for sending data (a Flask view function):
from flask import render_template
def hello():
    # The keys must match the names used in the template below
    Dict = {'Title': 'This is title', 'SubTitle': 'subtitle'}
    return render_template('hello.html', Dict=Dict)
HTML template that writes the values passed from Python as a dictionary:
<form accept-charset="utf-8" class="simform" method="POST" enctype="multipart/form-data">
Title <input type="text" name="Title" value="{{ Dict.get('Title') }}" maxlength="36">
SubTitle <input type="text" name="SubTitle" value="{{ Dict.get('SubTitle') }}" maxlength="70">
<button type="submit" class="save btn btn-default">Submit</button>
</form>
I believe this is exactly what you are looking for. It's a simple Python server built on Python's BaseHTTPRequestHandler.
from http.server import BaseHTTPRequestHandler, HTTPServer
class S(BaseHTTPRequestHandler):
    def _set_headers(self):
        self.send_response(200)
        self.send_header('Content-type', 'text/html')
        self.end_headers()
    def do_GET(self):
        self._set_headers()
        # wfile.write() expects bytes under Python 3
        self.wfile.write(b"<html><body><h1>hi!</h1></body></html>")
    def do_HEAD(self):
        self._set_headers()
    def do_POST(self):
        # Doesn't do anything with posted data
        self._set_headers()
        self.wfile.write(b"<html><body><h1>POST!</h1></body></html>")
def run(server_class=HTTPServer, handler_class=S, port=80):
    server_address = ('', port)
    httpd = server_class(server_address, handler_class)
    print('Starting httpd...')
    httpd.serve_forever()
You can run the code by passing an appropriate port of your choice to run method or the default 80 will be used. To test this or to do a get or post , you could run a curl as follows:
Send a GET request:: curl http://localhost
Send a HEAD request:: curl -I http://localhost
Send a POST request:: curl -d "foo=bar&bin=baz" http://localhost
You could also create a separate index.html file and read it using Python's codecs module. Since the content is read in as a string, it can be modified before it is sent, so that the desired page is eventually displayed.
Use flask to host your HTML Page and use a POST request to send data to and from your python script.
This link should help you more :
https://www.tutorialspoint.com/flask/index.htm
"Clicking" a button is nothing more than a POST request with the form data in the body.
If you need something generic, you would have to parse the HTML, find what data the host accepts and POST it.
But if you just need this for this example, meaning you already know the data the server accepts, you can forget about the HTML and use something like requests to post the data.
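Both variants described above (parse the form first, or post known fields directly) can be sketched with the standard library alone. This is a simplified illustration under a few assumptions: the page contains a single form, the form uses POST, and every named field on the page belongs to it. `FormFinder`, `build_submission` and `submit` are hypothetical names:

```python
from html.parser import HTMLParser
from urllib.parse import urlencode
from urllib.request import Request, urlopen


class FormFinder(HTMLParser):
    """Collect the action URL, method and named fields of the first form."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.method = "GET"
        self.fields = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <FORM ACTION=...> works.
        attrs = dict(attrs)
        if tag == "form" and self.action is None:
            self.action = attrs.get("action")
            self.method = (attrs.get("method") or "GET").upper()
        elif tag in ("input", "textarea", "select") and attrs.get("name"):
            self.fields.append(attrs["name"])


def build_submission(html, values):
    """Build (without sending) the request that pressing submit would send."""
    finder = FormFinder()
    finder.feed(html)
    form_data = {name: values.get(name, "") for name in finder.fields}
    return Request(
        finder.action,
        data=urlencode(form_data).encode("utf-8"),
        method=finder.method,
    )


def submit(request):
    """Send the request and return the response body as text (needs network access)."""
    with urlopen(request) as response:
        return response.read().decode("utf-8", errors="replace")
```

For the form in the question, `build_submission(open("page.html").read(), {"DATA_SEND": "some text"})` would produce a POST to the form's ACTION URL, and `submit(...)` would return whatever the CGI script responds with (the file name `page.html` is a placeholder).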

How do I display UTF-8 characters sent through a websocket?

I'm trying to build a simple web socket server that loads a file with some tweets in it (as CSV) and then just sends the string of the tweet to a web browser through a websocket. Here is a gist with the sample that I'm using for testing. Here's the Autobahn server component (server.py):
import random
import time
from twisted.internet import reactor
from autobahn.websocket import WebSocketServerFactory, \
                               WebSocketServerProtocol, \
                               listenWS
f = open("C:/mypath/parsed_tweets_sample.csv")
class TweetStreamProtocol(WebSocketServerProtocol):
    def sendTweet(self):
        tweet = f.readline().split(",")[2]
        self.sendMessage(tweet, binary=False)
    def onMessage(self, msg, binary):
        self.sendTweet()
if __name__ == '__main__':
    factory = WebSocketServerFactory("ws://localhost:9000", debug = False)
    factory.protocol = TweetStreamProtocol
    listenWS(factory)
    reactor.run()
And here is the web component (index.html):
<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<script type="text/javascript">
var ws = new WebSocket("ws://localhost:9000");
ws.onmessage = function(e) {
document.getElementById('msg').textContent = e.data; //unescape(encodeURIComponent(e.data));
console.log("Got echo: " + e.data);
}
</script>
</head>
<body>
<h3>Twitter Stream Visualization</h3>
<div id="msg"></div>
<button onclick='ws.send("tweetme");'>
Get Tweet
</button>
</body>
</html>
When the tweet arrives in the browser, the UTF-8 characters aren't properly displayed. How can I modify these simple scripts to display the proper UTF-8 characters in the browser?
This works for me:
from autobahn.twisted.websocket import WebSocketServerProtocol, \
                                       WebSocketServerFactory
class TweetStreamProtocol(WebSocketServerProtocol):
    def sendTweets(self):
        for line in open('gistfile1.txt').readlines():
            ## decode UTF8 encoded file
            data = line.decode('utf8').split(',')
            ## now operate on data using Python string functions ..
            ## encode and send payload
            payload = data[2].encode('utf8')
            self.sendMessage(payload)
        self.sendMessage((u"\u03C0"*10).encode("utf8"))
    def onMessage(self, payload, isBinary):
        if payload == "tweetme":
            self.sendTweets()
if __name__ == '__main__':
    import sys
    from twisted.python import log
    from twisted.internet import reactor
    log.startLogging(sys.stdout)
    factory = WebSocketServerFactory("ws://localhost:9000", debug = False)
    factory.protocol = TweetStreamProtocol
    reactor.listenTCP(9000, factory)
    reactor.run()
Notes:
The above code is for Autobahn|Python 0.7 and above.
I'm not sure whether your sample Gist is a properly UTF-8 encoded file.
However, the "last" pseudo tweet is 10x "pi", and that shows properly in the browser, so it works in principle.
Also note: for reasons too long to explain here, Autobahn's sendMessage function expects payload to be already UTF8 encoded if isBinary == False. A "normal" Python string is Unicode, which needs to be encoded like above to UTF8 for sending.
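The encode step can be tried in isolation. The sketch below is written for Python 3 (the answer's code targets Python 2) and uses a hypothetical helper, `to_payload`; it shows that the ten "pi" characters become twenty UTF-8 bytes, and what happens when those bytes are decoded with the wrong codec:

```python
def to_payload(text):
    """Encode a Unicode string into the UTF-8 bytes that go on the wire."""
    return text.encode("utf-8")


pi_message = "\u03c0" * 10          # ten Greek small letter pi characters
payload = to_payload(pi_message)    # each pi takes two bytes in UTF-8

# Decoding with the matching codec restores the original text.
round_trip = payload.decode("utf-8")

# Decoding the same bytes as Latin-1 yields one wrong character per byte,
# which is the kind of garbling seen when encodings are mixed up.
garbled = payload.decode("latin-1")
```

Here `round_trip == pi_message` holds, while `garbled` is a different, twenty-character string.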
Instead of <meta http-equiv="content-type" content="text/html; charset=UTF-8">, try <meta charset="utf-8">.
If you're using XHTML, write <meta charset="utf-8" /> instead.

Python: Scrape data after uploading file

I am trying to upload a file to a site and extract the site's response, which is based on the uploaded file. The site has the following form.
<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
</head>
<body>
<form method="POST" action="http://somewebsite.com/imgdigest" enctype="multipart/form-data">
quality:<input type="text" name="quality" value="2"><br>
category:<input type="text" name="category" value="1"><br>
debug:<input type="text" name="debug" value="1"><br>
image:<input type="file" name="image"><br>
<input type="submit" value="Submit">
</form>
</body>
</html>
What I want to do is upload a file, submit the form and extract the response.
I started by looking at an example, and I think I managed to get the upload to work, because when I ran this I didn't get any errors.
import urllib2_file
import urllib2
import request
import lxml.html as lh
data = {'name': 'image',
'file': open('/user/mydir/21T03NAPE7L._AA75_.jpg')
}
urllib2.urlopen('http://localhost/imgdigestertest.html', data)
Unfortunately, I am not making a request here that gets the response back, and I am not sure how I should do that. Once I get the response, I should be able to extract the data with some pattern matching, which I am comfortable with.
Based on the answer provided, I tried the following code:
import requests
url = 'http://somesite.com:61235/imgdigest'
files = {'file': ('21e1LOPiuyL._SL160_AA115_.jpg',
open('/usr/local/21e1LOPiuyL._SL160_AA115_.jpg', 'rb'))}
other_fields = {"quality": "2",
"category": "1",
"debug": "0"
}
headers={'content-type': 'text/html; charset=ISO-8859-1'}
response = requests.post(url, data=other_fields, files=files, headers=headers)
print response.text
Now I get the following error, which tells me that somehow the image file isn't attached correctly. Do we have to specify the file type?
Image::Image(...): bufSize = 0. Can not load image data. Image size = 0. DigestServiceProvider.hpp::Handle(...) |
Use the requests library (pip install requests, if you use pip).
For their example, see here:
http://docs.python-requests.org/en/latest/user/quickstart/#post-a-multipart-encoded-file
To customize that to look like yours:
import requests
url = 'http://localhost:8080/test_meth'
files = {'file': ('21T03NAPE7L._AA75_.jpg',
open('./text.data', 'rb'))}
other_fields = {"quality": "2",
"category": "1",
"debug": "1"
}
response = requests.post(url, data=other_fields, files=files)
print response.text
On my local system, text.data contains this:
Data in a test file.
I wrote a server.py with cherrypy (pip install cherrypy) to test the client I gave above. Here is the source for the server.py:
import cherrypy
class Hello(object):
    def test_meth(self, category, debug, quality, file):
        print "Form values:", category, debug, quality
        print "File name:", file.filename
        print "File data:", file.file.read()
        return "More stuff."
    test_meth.exposed = True
cherrypy.quickstart(Hello())
When I run the above client.py, it prints:
More stuff.
which as you can see in the server.py example is what is returned.
Meanwhile, the server says:
Form values: 1 1 2
File name: 21T03NAPE7L._AA75_.jpg
File data: Data in a test file.
127.0.0.1 - - [14/Jul/2012:00:00:35] "POST /test_meth HTTP/1.1" 200 11 "" "python-requests/0.13.3 CPython/2.7.3 Linux/3.2.0-26-generic"
Thus, you can see that the client is posting the filename as described in the code and the file contents of the specified local file.
One thing to point out: at the beginning of this post I said to use the requests library. This is not to be confused with the urllib request module that you are importing in your original question.
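Regarding the follow-up error in the question ("Image size = 0"), two things may be worth checking, though this is a guess rather than something confirmed by the thread: the form's file input is named `image` while the client used the key `file`, and the hand-written `content-type` header replaces the multipart header (including its boundary) that requests would otherwise generate. The standard-library sketch below shows what a multipart body looks like, to make both points visible; `encode_multipart` is an illustrative helper, not part of requests:

```python
import uuid


def encode_multipart(fields, file_field, filename, file_bytes,
                     file_type="application/octet-stream"):
    """Return (body, content_type_header) for a multipart/form-data POST."""
    boundary = uuid.uuid4().hex
    parts = []
    # One part per ordinary text field.
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    # The file part: `name` must match the form's <input type="file" name=...>.
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"\r\n'
        f'Content-Type: {file_type}\r\n\r\n'.encode("utf-8")
        + file_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    # The server needs this header, boundary included, to split the body.
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"
```

With requests, the equivalent would be to use `'image'` as the key in the `files` dictionary and to leave out the custom `headers` argument so the library can set the multipart Content-Type itself.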
