I am trying to invoke Python code for screen scraping (using Beautiful Soup) from my JSP servlet. It would also work for me if it could be invoked directly from the HTML.
I've looked through a few threads but couldn't find a solution.
What I want is to pass the Python program some arguments, have it do the screen scraping, and return the result to the JSP somehow.
I assume you are talking about web scraping, which is pulling information from other websites.
You are not going to be able to do something like this in somebody's browser, because it violates JavaScript's same-origin policy, and because there is no way a browser is going to let you download and execute a script on a client's computer.
You could, however, write a Python script to do this for you and execute it yourself on your machine.
Just be sure you are not violating the website's terms of service.
EDIT:
In that case I would recommend running the script on the command line, and then using the output of the program in the servlet to generate the responses you want.
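For illustration, a minimal sketch of what the command-line side could look like, assuming the servlet passes the target URL as the first argument (e.g. via Java's ProcessBuilder) and reads whatever the script prints to stdout:

import sys
import urllib.request
from bs4 import BeautifulSoup  # Beautiful Soup, as mentioned in the question

def main():
    url = sys.argv[1]  # argument supplied by the servlet
    html = urllib.request.urlopen(url).read()
    soup = BeautifulSoup(html, 'html.parser')
    # Print the scraped result; the servlet captures this output
    print(soup.title.string)

if __name__ == '__main__':
    main()

The key point is that the script communicates only through arguments and stdout, so the servlet side stays a plain process call.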
I want to run a Python program (kind of like the one below) from a browser.
As you can see, it has a few inputs, and I would like to "translate" those into a kind of form:
def addNew():
    # Prompt for credentials and append them to the config file
    uname = input('Username: ')
    pword = input('Password: ')
    with open('dBase.cfg', 'a') as appendBase:
        appendBase.write(uname + ',' + pword + '\n')
    print('\nAdded new profile: \n' + uname + ',' + pword)
Also, I don't know how to get the print output onto the page, so it can be shown.
I've just started learning, so go easy on me, please.
It is not possible to actually run this in the browser, for various reasons:
You can't run Python in browsers, only JavaScript.
You can't open local files from a browser.
There's no command line or terminal to read input from.
Most things you see on the web have two parts:
A part that actually runs in the browser, written in HTML and JavaScript.
Another part that the browser connects to, in order to send and receive data. That part can be written in any language, including Python; however, it is not visible in the browser.
The two parts communicate using the HTTP protocol. So, start by reading a bit about HTML/JavaScript (W3Schools is an easy way to get started). When you feel comfortable with that, practice with a Python web framework (Django is the most popular, but Flask is the easiest to get started with), and see how JavaScript uses HTTP to connect to it.
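To make the two-part split concrete, here is a minimal sketch of the server-side part using Flask; the route and the form field names are illustrative, not part of the original script:

from flask import Flask, request

app = Flask(__name__)

FORM = '''
<form method="post">
  Username: <input name="uname"><br>
  Password: <input name="pword"><br>
  <input type="submit" value="Add">
</form>
'''

@app.route('/', methods=['GET', 'POST'])
def add_new():
    if request.method == 'POST':
        uname = request.form['uname']
        pword = request.form['pword']
        # The file now lives on the server, not on the visitor's machine
        with open('dBase.cfg', 'a') as f:
            f.write(uname + ',' + pword + '\n')
        # Whatever print() showed in the terminal is returned to the page
        return 'Added new profile: ' + uname + ',' + pword
    return FORM

if __name__ == '__main__':
    app.run()

The form replaces the input() calls, and the returned string replaces the print().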
I created a simple Python script which takes a URL as input; once the URL is passed, it does curl requests through multiple proxies and shows the response codes. Now I want to create a webpage that others (my colleagues) can use, as it will help them too: a simple page that lets them select a set of proxy addresses and input a URL, and upon submission runs the script on a machine (a webserver) and populates the results on the page using the Dynatable or DataTables frameworks. I'm not sure how, or whether, this is possible, as I haven't worked much with webservers, so I want to know what tools I will need and how to design it.
If the Python script can be called in a terminal (as it needs to run curl), how can I show the results on a webpage based on the script's output (which I will export to a CSV file)? What should I use: XAMPP, WAMP, LAMP, etc.?
You need a framework for this: something that will listen to requests coming from the front-end (the webpage). There are tons of Python frameworks out there; you can check the Bottle framework as a starting point.
So the flow would be something like below:
1. from the webpage, a request is sent to the backend
2. the backend receives the request and runs your logic (connecting to some server, computing, etc.)
3. once the backend process is done, the backend sends the response back to the webpage
You can either use a REST approach or use the templating functionality of the framework; a rough sketch of the REST approach is below.
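For illustration, a minimal sketch with Bottle, assuming the curl logic is wrapped in a helper function; the route name and the JSON shape are made up for the example:

import subprocess
from bottle import Bottle, request

app = Bottle()

def check_url(url, proxy):
    # Run curl through one proxy and return the HTTP response code
    out = subprocess.check_output(
        ['curl', '-x', proxy, '-o', '/dev/null', '-s',
         '-w', '%{http_code}', url])
    return out.decode().strip()

@app.route('/check', method='POST')
def check():
    url = request.forms.get('url')
    proxies = request.forms.getall('proxy')  # one entry per selected proxy
    # Bottle serializes a returned dict to JSON, which a DataTables/Dynatable
    # front-end can consume directly
    return {'results': [{'proxy': p, 'code': check_url(url, p)}
                        for p in proxies]}

if __name__ == '__main__':
    app.run(host='localhost', port=8080)

With this approach you don't need XAMPP/WAMP/LAMP at all; the framework's built-in development server (or a WSGI server in front of it) plays that role.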
You will need to make a request. You can do this in JavaScript with an AJAX request ( https://www.w3schools.com/xml/ajax_intro.asp ). There are frameworks that hook it up straight to your Python, which let you avoid writing the JavaScript yourself, and many JavaScript frameworks that will make this easier and shorter to code.
I have written my Python script to take an argument via the command line.
My script simply takes a URL (entered via the command line) and counts how many lines of HTML code are on that page. It's a very simple script.
I would like to put this script on my website. If you click a button on my webpage, the URL of my webpage is sent to the script; the script then processes the information and returns the result to my website.
How would I be able to do this? What would the back-end architecture look like? How can I increase my processing speed? Will my script be able to process multiple clicks from different users simultaneously?
There are a couple of ways you could do this. The first, and the simplest, would be CGI, the Common Gateway Interface. The second would be a Python web framework like Flask or Django, which you could deploy via WSGI so that, like you said, it runs when its URL is accessed.
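As a sketch of the CGI route, assuming the script sits in the server's cgi-bin and the button sends the page's URL as a "url" query parameter (both are assumptions for the example):

#!/usr/bin/env python3
import cgi
import urllib.request

form = cgi.FieldStorage()
url = form.getvalue('url')

html = urllib.request.urlopen(url).read().decode('utf-8', 'replace')
count = len(html.splitlines())

# CGI output: a header, a blank line, then the body sent back to the page
print('Content-Type: text/plain')
print()
print('%s has %d lines of HTML' % (url, count))

Note that CGI spawns a new process per request, so for many simultaneous users a framework behind a proper WSGI server will scale better.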
I wrote a simple Python script that authenticates to a website, gets the cookie, writes it to a file, and does some scraping on the website. I'm writing the cookie to a file so I can reuse it and don't need to authenticate over and over.
On my personal computer the script works fine, but when I upload it to my server it refuses to work.
The strangest part is that if I upload the cookie created on my personal computer to the server, it works fine. So, apparently, I have some issue in the function that saves the cookie...
As far as I know, Python would warn me if I had library issues, so I guess my problem is more complex than that.
I also tried running it as root, but no luck.
What do you think may be causing this?
BTW: all Pythons are 2.7.
Refer to the tags for more info.
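For reference, the usual cookie-persistence pattern in Python 2.7 with cookielib/urllib2 looks something like the sketch below (the URL and filename are illustrative and may differ from the actual script). One classic pitfall when a freshly saved cookie file doesn't work is that session cookies are dropped on save unless ignore_discard is set:

import cookielib
import urllib2

jar = cookielib.LWPCookieJar('cookies.txt')
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(jar))

opener.open('https://example.com/login')  # authenticate here
jar.save(ignore_discard=True)             # session cookies need this flag

# Later, or on another run: reload the saved cookies
jar.load('cookies.txt', ignore_discard=True)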
I'm attempting to automate tests of Adobe Analytics (aka Omniture) instrumentation of a web app by implementing test scripts with the Selenium Python package.
If correctly instrumented, HTTP requests are made from the browser with certain expected query parameters. Is there a Python package that would allow me to capture those outgoing HTTP requests? Right now, we do it manually with the Chrome dev tools in the Network -> Images section.
This application is also available as a native app across nearly twenty other platforms (including Smart TVs and game consoles), and I'll need to perform similar tests across those. Although, unfortunately, I won't be able to automate the script, I'd still like to capture and store the HTTP calls. I'm currently using HTTPScoop to do this manually.
I'm most comfortable with Python, but if there's a simple way of doing this in another language, I'm all ears.
I was recently working on a similar task so I can share my experience and what I've learnt on the way (rather than give you the solution).
First, you need to run a proxy on your machine (e.g. http://bmp.lightbody.net/). Then I needed to run a few commands manually ( https://github.com/lightbody/browsermob-proxy#rest-api ). Once the proxy was running, I wrote a small script following the example here: https://github.com/lightbody/browsermob-proxy#using-with-selenium. Finally, you simply loop over the HAR entries captured on the proxy and check whether an analytics request is present (you can check for URL params if needed).
I have this ready in the form of a unit test for Firefox and Chrome (for a given URL). To be able to run this test on different devices/OSes/platforms, one would probably need to run the code through Selenium Remote WebDriver ( https://code.google.com/p/selenium/wiki/RemoteWebDriver ) using a service like https://www.browserstack.com/ in the cloud. I contacted them, but they don't have any documentation ready; they suggested I refer to online resources. That's where I am now.
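A rough sketch of that script, using the browsermob-proxy Python bindings with Firefox; the binary path and the '/b/ss/' pattern (the path Adobe Analytics image requests typically contain) are assumptions to adapt:

from browsermobproxy import Server
from selenium import webdriver

server = Server('/path/to/browsermob-proxy')  # path to the proxy binary
server.start()
proxy = server.create_proxy()

profile = webdriver.FirefoxProfile()
profile.set_proxy(proxy.selenium_proxy())
driver = webdriver.Firefox(firefox_profile=profile)

proxy.new_har('analytics-test')      # start recording a fresh HAR
driver.get('https://example.com')    # the page under test

# Scan the captured entries for the analytics beacon
for entry in proxy.har['log']['entries']:
    url = entry['request']['url']
    if '/b/ss/' in url:
        print('Analytics request:', url)

driver.quit()
server.stop()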
Hope it helps