How to open a .html file and click submit button using python - python

I have a .html file where I am sending a value using the submit button as follows:
<HTML>
<HEAD>
<TITLE>XYZ Ltd.</TITLE>
</HEAD>
<BODY>
<FORM ACTION="http://192.168.2.2/cgi-bin/http_recv.cgi" METHOD="POST">
<TEXTAREA NAME="DATA_SEND" COLS='160' ROWS='40' WRAP='none'>
</TEXTAREA>
<INPUT TYPE="SUBMIT" VALUE="Send Data">
</FORM>
</BODY>
</HTML>
I looked into Selenium, but from what I understand it doesn't suit my needs. I would like to keep and maintain a .html file like the one above, so the script has to open it and click the button. I also came across a CGI/Python example, but I would go that way only if there is no alternative.
How can I use python to:
Open the .html file and
Press the "Send Data" button
Read any response given (assuming the response may be displayed within an HTML page or a dialog box)

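Python code for sending the data:
`from flask import render_template  # assuming Flask, which provides render_template

def hello():
    # The key must be 'SubTitle' to match the lookup in the template
    Dict = {'Title': 'This is title', 'SubTitle': 'subtitle'}
    return render_template('hello.html', Dict=Dict)`
Template code that writes the values passed from Python as a dictionary into the HTML: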
`<form accept-charset="utf-8" class="simform" method="POST" enctype="multipart/form-data">
Title <input type="text" name="Title" value="{{ Dict.get('Title') }}" maxlength="36">
SubTitle <input type="text" name="SubTitle" value="{{ Dict.get('SubTitle') }}" maxlength="70">
<button type="submit" class="save btn btn-default">Submit</button>
</form>`

I believe this is exactly what you are looking for. It's a simple Python server built on Python's BaseHTTPRequestHandler.
from http.server import BaseHTTPRequestHandler, HTTPServer

class S(BaseHTTPRequestHandler):
    def _set_headers(self):
        self.send_response(200)
        self.send_header('Content-type', 'text/html')
        self.end_headers()

    def do_GET(self):
        self._set_headers()
        # wfile expects bytes in Python 3
        self.wfile.write(b"<html><body><h1>hi!</h1></body></html>")

    def do_HEAD(self):
        self._set_headers()

    def do_POST(self):
        # Doesn't do anything with posted data
        self._set_headers()
        self.wfile.write(b"<html><body><h1>POST!</h1></body></html>")

def run(server_class=HTTPServer, handler_class=S, port=80):
    server_address = ('', port)
    httpd = server_class(server_address, handler_class)
    print('Starting httpd...')
    httpd.serve_forever()

if __name__ == '__main__':
    run()
You can run the code by passing a port of your choice to the run function; otherwise the default, 80, is used. To test it with a GET or POST, you can use curl as follows:
Send a GET request: curl http://localhost
Send a HEAD request: curl -I http://localhost
Send a POST request: curl -d "foo=bar&bin=baz" http://localhost
You could also keep the page in a separate index.html file and read it with Python's codecs module. Since the contents are read in as a string, you can modify that string as needed before sending it, so that the desired page is what ends up displayed.
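The last suggestion, keeping the page in a separate index.html and adjusting the string before it is sent, might look roughly like the sketch below. It uses pathlib rather than codecs to read the file, and the names load_page, FilePageHandler and the {{greeting}} placeholder are assumptions made up for this illustration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path


def load_page(path, replacements=None):
    """Read an HTML file as text, apply plain string substitutions, return bytes."""
    html = Path(path).read_text(encoding="utf-8")
    for placeholder, value in (replacements or {}).items():
        html = html.replace(placeholder, value)
    return html.encode("utf-8")


class FilePageHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # "{{greeting}}" is a hypothetical marker inside index.html
        body = load_page("index.html", {"{{greeting}}": "hi!"})
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run(port=8000):
    """Serve index.html from the current directory until interrupted."""
    HTTPServer(("", port), FilePageHandler).serve_forever()
```

Calling run() starts the server; load_page can also be tried on its own against any HTML file.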

Use Flask to host your HTML page and use a POST request to send data to and from your Python script.
This link should help:
https://www.tutorialspoint.com/flask/index.htm

"Clicking" a button is nothing more than a POST request with the form data in the body.
If you need something generic, you would have to parse the HTML, find what data the host accepts and POST it.
But if you only need this for this example, meaning you already know the data the server accepts, you can forget about the HTML and use something like requests to post the data.
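As a rough sketch of both approaches using only the standard library (requests would work equally well): the parser below reads the form's action and field names from the .html file, and build_submit_request produces the same urlencoded POST a browser sends when the button is clicked. The names FormParser, build_submit_request and submit are made up for this illustration, and the code assumes a single form submitted with METHOD="POST".

```python
from html.parser import HTMLParser
from urllib.parse import urlencode
from urllib.request import Request, urlopen


class FormParser(HTMLParser):
    """Collect the first form's action, method and named fields."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.method = "GET"
        self.fields = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <FORM ACTION=...> works
        attrs = dict(attrs)
        if tag == "form" and self.action is None:
            self.action = attrs.get("action")
            self.method = (attrs.get("method") or "GET").upper()
        elif tag in ("input", "textarea", "select") and attrs.get("name"):
            self.fields.append(attrs["name"])


def build_submit_request(html_text, values):
    """Build the request a browser would send when the form is submitted."""
    parser = FormParser()
    parser.feed(html_text)
    # Only send values for fields the form actually declares
    data = {name: values.get(name, "") for name in parser.fields}
    body = urlencode(data).encode("ascii")
    return Request(parser.action, data=body, method=parser.method)


def submit(html_path, values):
    """Read the .html file, send the form data, and return the response text."""
    with open(html_path, encoding="utf-8") as f:
        request = build_submit_request(f.read(), values)
    with urlopen(request) as response:
        return response.read().decode("utf-8", errors="replace")
```

For the form in the question, something like submit("form.html", {"DATA_SEND": "some text"}) would post the textarea value and return whatever page the CGI script answers with; the file name here is a placeholder.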

Related

Limit HTTP POST size for form submission

Let's say I have a simple HTML webpage (served using apache) as
<!DOCTYPE HTML>
<html>
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width">
<meta name="description" content="CGI script test">
<meta name="keywords" content="test">
<meta name="author" content="cgi test">
<title> CGI Script Test </title>
</head>
<body>
<form action="/cgi-bin/submit.py" method="POST">
<label for="entry">Entry name: </label>
<input type="text" id="entry" name="entryname" placeholder="placeholder" maxlength="10">
</form>
</body>
</html>
where data submitted in the form is processed using submit.py, a python script (placed in my cgi-bin directory) as
#!/usr/bin/python
import cgi,re
form = cgi.FieldStorage()
print("Content-Type: text/html\n\n")
print("<title>Hello World</title>")
print("<h1>HELLO</h1>")
text=str(form.getvalue("entryname"))
print("<p> Parsing result...</p>")
result = re.sub('[^a-zA-Z0-9:##/-_,]', ' ', text)
print("<h3> Resulting Info: </h3>")
print("<p>" + str(result) + "</p>")
I want to avoid my server getting stuffed with POSTs that are excessively long. If I load the HTML webpage above, I can use the "inspect element" tool in firefox to delete the "maxlength" requirement and stuff in as much information as I want. The python script then receives the full input text. This is my first website and I want to make sure I do this right. Is there a limit to the size of the POST sent to the server, and if not, how do I limit it to prevent abuse?
You can examine Content-Length header and compare it with a limit. cgitb.enable should be helpful with displaying errors when the limit is reached.
import cgitb
import os
MAX_POST_BODY_SIZE = 1024
cgitb.enable(display=0)
content_length = int(os.getenv('CONTENT_LENGTH', 0))
if content_length > MAX_POST_BODY_SIZE:
    raise RuntimeError('POST body too long')
Update 1
I've looked into cgi module's code and it seems POST body size limiting is actually implemented in FieldStorage but it's not documented. There's cgi.maxlen attribute:
# Maximum input we will accept when REQUEST_METHOD is POST
# 0 ==> unlimited input
maxlen = 0
Hence, it should be just:
import cgi
import cgitb
cgi.maxlen = 1024
cgitb.enable(display=0)
form = cgi.FieldStorage()
[...] would the server still have to allocate memory to receive the full post?
As far as I can see in the initialiser of FieldStorage, the steps are as follows:
fp is assigned: self.fp = sys.stdin.buffer
Content-Length is validated
when the Content-Type is application/x-www-form-urlencoded, read_urlencoded is called, which reads Content-Length bytes from the instance's fp attribute.
To test it with your CGI server, send a big request and look at htop or other process monitor for the CGI process' memory usage.
from urllib.request import urlopen
urlopen(
    'http://localhost:8000/cgi-bin/submit.py',
    data=b'entryname=%s' % (b'abcd' * 2 ** 24),  # 64 MiB
)
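The Content-Length check discussed above can also be sketched without FieldStorage, rejecting the request before a single byte of the body is read. The names read_limited_body and BodyTooLarge are hypothetical. This only controls what the CGI script itself reads; whether the web server buffers the body first depends on the server, which usually has its own setting for this (Apache's LimitRequestBody directive, for example).

```python
MAX_POST_BODY_SIZE = 1024  # same limit as in the answer above


class BodyTooLarge(Exception):
    """Raised when the declared body size is invalid or exceeds the limit."""


def read_limited_body(environ, stream, limit=MAX_POST_BODY_SIZE):
    """Return the POST body, refusing anything larger than `limit` bytes."""
    try:
        declared = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        raise BodyTooLarge("invalid Content-Length")
    if declared < 0 or declared > limit:
        # Reject before reading, so the oversized body is never buffered here
        raise BodyTooLarge("POST body too long")
    # Read only as many bytes as were declared
    return stream.read(declared)
```

In a CGI script this would be called as read_limited_body(os.environ, sys.stdin.buffer).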

Adding external javascript to page that will be sent by a server

I'm building a little server that sends a page with a JavaScript script. It works when I open the page directly in my browser, but if I request the page from the server, the page is received but the script is not, so I get these errors:
Script.js is missing among the files:
This looks strange, because in the network tab I can see a request for the script with status 200:
The JS file I'm trying to add is Chart.js, so I can't inline it; it would become impossible to work with. For now the server is just a few lines of Python that use SimpleHTTPRequestHandler, and I'm probably going to replace it. Could the problem be that SimpleHTTPRequestHandler can't handle multiple file requests?
Here's the code. I tried to make a snippet, but it doesn't work there either (probably because I have never written a snippet before):
HTML:
<!doctype html>
<html>
<head>
<script src="script.js"></script>
</head>
<body>
<p id = "paragraph"></p>
<script>
document.getElementById('paragraph').innerHTML = sayHello();
</script>
</body>
</html>
JS:
function sayHello(){
return "HelloWorld!"
}
Here is the python server script:
from http.server import HTTPServer, BaseHTTPRequestHandler
class SimpleHTTPRequestHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        with open("index.html", "r") as page:
            self.wfile.write(page.read())
httpd = HTTPServer(("192.168.1.100", 8000), SimpleHTTPRequestHandler)
httpd.serve_forever()
I think you can get an element by id, tag name or class name and append the script file to it:
var head = document.getElementsByTagName('head')[0];
var script = document.createElement('script');
script.type = 'text/javascript';
script.onload = function() {
    callFunctionFromScript();
}
script.src = 'path/to/your-script.js';
head.appendChild(script);
also check this link
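Another thing worth checking, judging only from the server code in the question: do_GET sends index.html for every path, so the browser's request for script.js would also get the HTML page back with status 200, which would match the symptoms described. A minimal sketch that answers according to the requested path follows; the ROUTES table and the names resolve, PageHandler and run are hypothetical.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path

# Hypothetical whitelist: URL path -> (file on disk, Content-Type)
ROUTES = {
    "/": ("index.html", "text/html; charset=utf-8"),
    "/index.html": ("index.html", "text/html; charset=utf-8"),
    "/script.js": ("script.js", "application/javascript"),
}


def resolve(path, routes=ROUTES):
    """Map a request path to (filename, content_type), or None if unknown."""
    return routes.get(path.split("?", 1)[0])


class PageHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        target = resolve(self.path)
        if target is None:
            self.send_error(404, "File not found")
            return
        filename, content_type = target
        body = Path(filename).read_bytes()  # bytes, as wfile requires
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run(host="192.168.1.100", port=8000):
    """Serve the whitelisted files on the address used in the question."""
    HTTPServer((host, port), PageHandler).serve_forever()
```

The standard library's own http.server.SimpleHTTPRequestHandler, whose name the class in the question reuses, already serves files from the current directory and may be enough on its own.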

Getting <response[200]> with Python http requests instead of INT

I'm trying to write simple Python code that communicates with the 9kw.eu captcha-solving service through their API: https://www.9kw.eu/api.html#apisubmit-tab. Basically I'm sending a base64-encoded image with some key:value pairs, and the response from the server should be a number like 58952554, but I'm only getting
<response[200]>
This should mean that the server got my data, but I'm not getting anything else.
I'm able to get the right result with simple html form:
<form method="post" action="https://www.9kw.eu/index.cgi" enctype="multipart/form-data">
KEY:<br>
<input name="apikey" value="APIKEY"><br>
ACTION<br>
<input name="action" value="usercaptchaupload"><br>
FILE:<br>
<input name="file-upload-01" value="BASE64IMAGEDATAHERE"><br>
TOOL<br>
<input name="source" value="htmlskript"><br>
ROTATE<br>
<input name="rotate" value="1"><br>
Angle<br>
<input name="angle" value="40"><br>
BASE64
<input name="base64" value="1"><br>
Upload:<br>
<input type="submit" value="Upload and get ID">
</form>
This is the python code, which should do the same thing:
import base64
import requests
import time
# base64 image encoding
with open("funcaptcha1.png", "rb") as f:
    data = f.read()
# str.encode("base64") only exists in Python 2; base64.b64encode works in both
filekodovany = base64.b64encode(data)
# captcha uploader
udajepost = {'apikey':'APIKEY','action':'usercaptchaupload','file-upload-01':filekodovany,'source':'pythonator','rotate':'1','angle':'40','base64':'1'}
headers = {'Content-Type':'multipart/form-data'}
r = requests.post('https://www.9kw.eu/index.cgi', data = udajepost)
print(r)
Thanks for any help.
r = requests.post('https://www.9kw.eu/index.cgi', data = udajepost)
Here, r is the whole response object which has many attributes. I guess, you only need r.text. So, you can just use:
print(r.text)
You're looking for the response of the request:
print(r.text)
In this way you'll have the plain text response.
Get the JSON output with:
r.json()
and the status code with:
r.status_code

Django download not working with browser, but works fine in machines where Internet Download Manager(IDM) is installed

My Django project uses the following code to download a file. It works fine if the client machine has IDM installed, but fails if IDM is not installed. I couldn't find any reason for this odd behaviour.
views.py
def somefunction():
    something something
    return render(request,
                  'something/download/download.html',
                  {'pdf_file_location': pdf_file_location})

def download(request):
    if not request.user.is_authenticated():
        return render(request, 'login/login/login.html')
    else:
        filename = request.POST.get('pdf_file_location')
        if request.method == 'POST':
            while os.path.exists(filename) is False:
                time.sleep(2)
            chunk_size = 8192
            response = StreamingHttpResponse(FileWrapper(open(filename, 'rb'), chunk_size),
                                             content_type=mimetypes.guess_type(filename)[0])
            response['Content-Length'] = os.path.getsize(filename)
            response['Content-Disposition'] = "attachment; filename=%s" % filename[filename.find("UserSessionDetails-")+19:]
            return response
        return render(request, 'something/something/index.html')
download.html
<canvas id="c-timer" width="300" height="300">
<input id="pdf_file_location" type="hidden" value={{ pdf_file_location }} name="pdf_file_location"/>
</canvas>
js for the download.html
var val = document.getElementById('pdf_file_location').value
data ={"pdf_file_location": val};
something something and then finishTime is called
var finishTime = function () {
    $.post( "/book_publish/download/",data);
};
I don't have much knowledge about how IDM works, but reading this article tells me that it shouldn't give any upper hand apart from the fact that it opens multiple connections for the operation, and my code is sending data in chunks. Is it that the browser can't stitch the data when it is sent in small chunks?
PROBLEM: I was using JS to post the request for the download, and since I'm new to web development, I didn't handle the response that was sent back. Hence it got all messed up.
Somehow IDM was able to catch that response and initiate the download process.
SOLUTION: I used a plain form POST with a submit button in the HTML itself instead of issuing the POST request from JS.

Python: Scrape data after uploading file

I am trying to upload a file to a site and extract the site's response, which depends on the uploaded file. The site has the following form.
<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
</head>
<body>
<form method="POST" action="http://somewebsite.com/imgdigest" enctype="multipart/form-data">
quality:<input type="text" name="quality" value="2"><br>
category:<input type="text" name="category" value="1"><br>
debug:<input type="text" name="debug" value="1"><br>
image:<input type="file" name="image"><br>
<input type="submit" value="Submit">
</form>
</body>
</html>
What I want to do is upload a file, submit the form and extract the response.
I started by looking at an example, and I think I managed to get the upload to work, because I didn't get any errors when I ran this.
import urllib2_file
import urllib2
import request
import lxml.html as lh
data = {'name': 'image',
'file': open('/user/mydir/21T03NAPE7L._AA75_.jpg')
}
urllib2.urlopen('http://localhost/imgdigestertest.html', data)
Unfortunately I am not making a request here that gets the response back, and I am not sure how I should do that. Once I get the response, I should be able to extract the data with some pattern matching, which I am comfortable with.
Based on the answer provided, I tried the following code:
import requests
url = 'http://somesite.com:61235/imgdigest'
files = {'file': ('21e1LOPiuyL._SL160_AA115_.jpg',
open('/usr/local/21e1LOPiuyL._SL160_AA115_.jpg', 'rb'))}
other_fields = {"quality": "2",
"category": "1",
"debug": "0"
}
headers={'content-type': 'text/html; charset=ISO-8859-1'}
response = requests.post(url, data=other_fields, files=files, headers=headers)
print response.text
Now I get the following error, which tells me that somehow the image file doesn't get attached correctly. Do we have to specify the file type?
Image::Image(...): bufSize = 0. Can not load image data. Image size = 0. DigestServiceProvider.hpp::Handle(...) |
Use the requests library (pip install requests, if you use pip).
For their example, see here:
http://docs.python-requests.org/en/latest/user/quickstart/#post-a-multipart-encoded-file
To customize that to look like yours:
import requests
url = 'http://localhost:8080/test_meth'
files = {'file': ('21T03NAPE7L._AA75_.jpg',
                  open('./text.data', 'rb'))}
other_fields = {"quality": "2",
                "category": "1",
                "debug": "1"
                }
response = requests.post(url, data=other_fields, files=files)
print(response.text)
On my local system, text.data contains this:
Data in a test file.
I wrote a server.py with cherrypy (pip install cherrypy) to test the client I gave above. Here is the source for the server.py:
import cherrypy
class Hello(object):
    def test_meth(self, category, debug, quality, file):
        print "Form values:", category, debug, quality
        print "File name:", file.filename
        print "File data:", file.file.read()
        return "More stuff."
    test_meth.exposed = True
cherrypy.quickstart(Hello())
When I run the above client.py, it prints:
More stuff.
which as you can see in the server.py example is what is returned.
Meanwhile, the server says:
Form values: 1 1 2
File name: 21T03NAPE7L._AA75_.jpg
File data: Data in a test file.
127.0.0.1 - - [14/Jul/2012:00:00:35] "POST /test_meth HTTP/1.1" 200 11 "" "python-requests/0.13.3 CPython/2.7.3 Linux/3.2.0-26-generic"
Thus, you can see that the client is posting the filename as described in the code and the file contents of the specified local file.
One thing to point out: at the beginning of this post I said to use the requests library. This is not to be confused with the urllib request module that you are importing in your original question.
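To make visible what a multipart upload puts on the wire, here is a hand-rolled sketch using only the standard library. It is not how requests is implemented, and encode_multipart is a made-up name. Two details it highlights may be relevant to the "Image size = 0" error in the question, though that is a guess from the code shown: the part name has to match the form's input name (image in the question's form), and the Content-Type header has to carry the boundary, so overriding it with a fixed value breaks the body.

```python
import uuid


def encode_multipart(fields, files):
    """Encode text fields and files as a multipart/form-data body.

    fields: {name: text value}
    files:  {name: (filename, content bytes, content type)}
    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    lines = []
    for name, value in fields.items():
        lines += [
            delimiter,
            ('Content-Disposition: form-data; name="%s"' % name).encode("utf-8"),
            b"",
            str(value).encode("utf-8"),
        ]
    for name, (filename, content, content_type) in files.items():
        lines += [
            delimiter,
            ('Content-Disposition: form-data; name="%s"; filename="%s"'
             % (name, filename)).encode("utf-8"),
            ("Content-Type: %s" % content_type).encode("ascii"),
            b"",
            content,
        ]
    # Closing delimiter, followed by a final CRLF
    lines += [delimiter + b"--", b""]
    body = b"\r\n".join(lines)
    return body, "multipart/form-data; boundary=" + boundary
```

The returned header value would be sent as Content-Type together with the body, for example through urllib.request.Request(url, data=body, headers={"Content-Type": content_type}). With requests, the equivalent is to pass files={'image': ...} and leave the Content-Type header unset so the library can fill in the boundary itself.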
