How can I execute links2 to open a web page and locate and click a text link with Python?
Is pexpect able to do it? Any examples are appreciated.
Not sure why you want to do this. If you want to grab the web link and process the page content, urllib2 together with an HTML parser (BeautifulSoup for example) may be just fine.
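To make the "fetch and parse" suggestion concrete, here is a minimal sketch of locating a link by its visible text and resolving where a click on it would lead. It swaps in the standard library's html.parser for BeautifulSoup so that it has no dependencies. The names LinkTextFinder, find_link, and SAMPLE_HTML, as well as the example.com URLs, are made up for illustration; in practice the HTML would come from the fetched page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkTextFinder(HTMLParser):
    """Collect (text, href) pairs for every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (text, href) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace so "Latest  news" matches "Latest news".
            text = " ".join("".join(self._text).split())
            self.links.append((text, self._href))
            self._href = None


def find_link(html, link_text, base_url):
    """Return the absolute URL of the first link whose text matches, else None."""
    finder = LinkTextFinder()
    finder.feed(html)
    for text, href in finder.links:
        if text.lower() == link_text.lower():
            return urljoin(base_url, href)
    return None


# Hypothetical page content standing in for a downloaded page.
SAMPLE_HTML = '<p><a href="/about">About us</a> <a href="news/today.html">Latest  news</a></p>'
print(find_link(SAMPLE_HTML, "latest news", "http://example.com/site/"))
```

"Clicking" the link then just means requesting the returned URL. This only works for plain HTML links; links created by JavaScript would not show up in the downloaded source.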
If you do want to simulate mouse clicks, you may want to use AutoPy.
Why do you want to use links2? I don't see how you could benefit from that. It is probably better to approach your problem in a different way, like with mechanize or maybe even twill.
Please describe your overall problem instead of asking only that specific question.
If you want JavaScript support, use Selenium RC with whatever language you are comfortable with.
Related
When you search for something in your browser, you get a list of websites related to the search. I was wondering if there is a way to store, print, or iterate over the list of URLs shown on that results page.
I haven't tried anything because I don't even know which Python library I should use.
Which library should I use for this purpose?
I hope that it is a valid question.
Beautiful Soup
Requests
Selenium
Pick your poison.
Read the docs.
???
Profit!
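For what it's worth, once the results page has been downloaded, pulling out the list of result URLs could look roughly like the sketch below. It uses only the standard library rather than any of the three packages above, and it assumes results are ordinary absolute links that point away from the search engine's own host. The host search.example, the SAMPLE markup, and the function names are invented; real result pages are structured differently and may need JavaScript rendering.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class HrefCollector(HTMLParser):
    """Record the href of every <a> tag in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def result_urls(html, search_host):
    """Return unique absolute http(s) links that do not point at search_host."""
    collector = HrefCollector()
    collector.feed(html)
    seen, urls = set(), []
    for href in collector.hrefs:
        parts = urlparse(href)
        if parts.scheme not in ("http", "https"):
            continue  # relative links are site navigation, not results
        if parts.netloc.endswith(search_host):
            continue  # the engine's own pages (images, settings, ...)
        if href not in seen:
            seen.add(href)
            urls.append(href)
    return urls


# Hypothetical results page; search.example is a placeholder host.
SAMPLE = """
<a href="/preferences">Settings</a>
<a href="https://search.example/images?q=python">Images</a>
<a href="https://docs.python.org/3/">Python docs</a>
<a href="https://pypi.org/">PyPI</a>
<a href="https://docs.python.org/3/">Python docs again</a>
"""
for url in result_urls(SAMPLE, "search.example"):
    print(url)
```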
I have a problem getting JavaScript-generated content into HTML so I can use it in a script. I tried several methods, such as PhantomJS and the Python Qt library, and they all fetch most of the content nicely, but the problem is that there are JavaScript buttons inside the page like this:
Pls see screenshot here
Now when I load this page from a script these buttons won't default to any value so I am getting back 0 for all SELL/NEUTRAL/BUY values below. Is there a way to set these values when you load the page from a script?
Example page with all the values is: https://www.tradingview.com/symbols/NEBLBTC/technicals/
Any help would be greatly appreciated.
If you are trying to do this with Scrapy, or with something derived from cURL or urllib, I am afraid you can't. Python has other external packages, such as Selenium, that let you interact with the page's JavaScript, but the problem with Selenium is that it is slow. If you want something closer to Scrapy, you could check how the site works (as far as I can see, it loads its data through AJAX or WebSockets) and fetch the information you want directly with urllib, as you would with an API.
Please let me know if you understand me or if I misunderstood your question.
I used Selenium, which was perfect for this job. It is indeed slow, but it fits my purpose. I also used the Selenium Firefox plugin to generate the Python script, as it was very challenging to find where exactly in the code the button I had to press was.
I need to write a Python script that accesses a webpage with an "upload" button. Normally, when you upload a photo with that button, a new page opens, and once that page opens I need to look for a string on it.
So the script should upload a photo that I provide to it, and then check the output page for a string.
I have no background in that sort of coding (I know basic Python).
Can I get a reference or some pointers on what I should read to perform that task? Thank you very much.
While this question is not specific enough to give you a good answer, I can make a couple of suggestions. I would look into using a library for sending requests to pages, such as requests. I would also look into libraries for parsing html, such as Beautiful Soup. Essentially you will need to use requests to get the page's html, and then you'll need to parse that html using Beautiful Soup to find what you're looking for on the page.
You should do some reading about these libraries and/or other similar ones and try to get a better understanding of your problem. Afterward, come back to Stack Overflow once you have more specific questions or problems you've run into.
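As a starting point, the upload-then-check flow might be sketched as below. Fetching the page alone is not enough for an upload: the file has to be sent as a multipart/form-data POST, which is what a browser's upload button does. This sketch builds that request with the standard library instead of requests (where the equivalent would be requests.post(url, files={...})). The form field name and the URL are assumptions that have to be read from the real page's form, and build_upload_request and upload_and_check are names made up here.

```python
import os
import uuid
import urllib.request


def build_upload_request(url, field_name, filename, file_bytes,
                         content_type="application/octet-stream"):
    """Build a multipart/form-data POST carrying a single file field."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        url,
        data=head + file_bytes + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


def upload_and_check(url, field_name, path, needle):
    """Upload the file at path and report whether needle appears in the response."""
    with open(path, "rb") as handle:
        request = build_upload_request(
            url, field_name, os.path.basename(path), handle.read()
        )
    with urllib.request.urlopen(request, timeout=30) as response:
        page = response.read().decode("utf-8", errors="replace")
    return needle in page
```

A call would look like upload_and_check("http://example.com/upload", "photo", "cat.jpg", "Upload complete"), where every argument is a placeholder. If the site requires a login, cookies, or hidden form fields, this sketch would need to send those too.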
I am not sure if this is possible, but I was wondering if it would be possible to write a script or program that would automatically open up my web browser, go to a certain site, fill out information, and click "send"? And if so, where would I even begin? Here's a more detailed overview of what I need:
Open browser
Go to website
Fill out a series of forms
Click OK
Fill out more forms
Click OK
Thank you all in advance.
There are a number of tools out there for this purpose. For example, Selenium, which even has a package on PyPI with Python bindings for it, will do the job.
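A rough sketch of the open / fill / click OK / repeat sequence with Selenium's Python bindings might look like this. It assumes the selenium package and a Firefox driver are installed, that each form input can be located by its name attribute, and that the OK button can be found with a CSS selector. The field names and selectors are placeholders to be taken from the real pages, and fill_and_submit and run are names invented for this example.

```python
# Locator strategy strings; these are the same values as
# selenium.webdriver.common.by.By.NAME and By.CSS_SELECTOR.
BY_NAME = "name"
BY_CSS = "css selector"


def fill_and_submit(driver, fields, submit_css):
    """Type each value into the input with the given name, then click submit."""
    for name, value in fields.items():
        box = driver.find_element(BY_NAME, name)
        box.clear()
        box.send_keys(value)
    driver.find_element(BY_CSS, submit_css).click()


def run(url, pages):
    """Open a browser, load url, and work through (fields, submit_css) steps."""
    # Imported here so fill_and_submit can be tried out without Selenium installed.
    from selenium import webdriver

    driver = webdriver.Firefox()
    try:
        driver.get(url)
        for fields, submit_css in pages:
            fill_and_submit(driver, fields, submit_css)
    finally:
        driver.quit()
```

Usage would be something like run("http://example.com/form", [({"email": "me@example.com"}, "button[type=submit]"), ({"city": "Oslo"}, "#ok")]), with all values hypothetical. If the next form is rendered by JavaScript after the click, an explicit wait before the next step is probably needed.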
Maybe you can use zope.testbrowser; it's really easy to use.
I'm a little new to web crawlers and such, though I've been programming for a year already. So please bear with me as I try to explain my problem here.
I'm parsing info from Yahoo! News, and I've managed to get most of what I want, but there's a little portion that has stumped me.
For example: http://news.yahoo.com/record-nm-blaze-test-forest-management-225730172.html
I want to get the numbers beside the thumbs up and thumbs down icons in the comments. When I use "Inspect Element" in my Chrome browser, I can clearly see the things that I have to look for - namely, an em tag under the div class 'ugccmt-rate'. However, I'm not able to find this in my python program. In trying to track down the root of the problem, I clicked to view source of the page, and it seems that this tag is not there. Do you guys know how I should approach this problem? Does this have something to do with the javascript on the page that displays the info only after it runs? I'd appreciate some pointers in the right direction.
Thanks.
The page is being generated via JavaScript.
Check if there is a mobile version of the website first. If not, check for any APIs or RSS/Atom feeds. If there's nothing else, you'll either have to manually figure out what the JavaScript is loading and from where, or use Selenium to automate a browser that renders the JavaScript for you for parsing.
Using the Web Console in Firefox you can pretty easily see what requests the page is actually making as it runs its scripts, and figure out what URI returns the data you want. Then you can request that URI directly in your Python script and tease the data out of it. It is probably in a format that Python already has a library to parse, such as JSON.
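Once the Web Console has shown which URI returns the data, requesting it directly could be sketched as follows. The payload layout here (a "comments" list with "id", "up", and "down" keys) is purely an assumption for illustration; the real response has to be inspected in the console first. fetch_json and vote_counts are made-up names.

```python
import json
import urllib.request


def fetch_json(url, headers=None):
    """GET url and decode the response body as JSON."""
    request = urllib.request.Request(url, headers=headers or {})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def vote_counts(payload):
    """Map comment id -> (thumbs up, thumbs down) from a decoded payload."""
    return {c["id"]: (c["up"], c["down"]) for c in payload["comments"]}


# Hypothetical response body; the real field names will differ.
SAMPLE_BODY = '{"comments": [{"id": "c1", "up": 12, "down": 3}, {"id": "c2", "up": 0, "down": 7}]}'
print(vote_counts(json.loads(SAMPLE_BODY)))
```

In a real script, vote_counts(fetch_json(data_url)) would replace the sample, with data_url being whatever URI the console revealed.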
Yahoo! may have some stuff on their server side to try to prevent you from accessing these data files in a script, such as checking the browser (user-agent header), cookies, or referrer. These can all be faked with enough perseverance, but you should take their existence as a sign that you should tread lightly. (They may also limit the number of requests you can make in a given time period, which is impossible to get around.)