File Downloading using Python - python

I have to download a file from a website after selecting multiple options on the page.
I have three checkboxes that all share the same name. I can select one box by its name and value like this:
urllib.urlencode({'contentPartnerIds':'67'})
I need to select another checkbox in the same group, like this:
urllib.urlencode({'contentPartnerIds':'67','contentPartnerIds':'68'})
But this does not work. Could you please help with this?

Take a look at mechanize. It is great when you need to work with forms on a page, and it is very simple to use.
http://wwwsearch.sourceforge.net/mechanize/
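A dict cannot hold two values under the same key, so in the second call the later 'contentPartnerIds' entry silently replaces the earlier one. A minimal sketch of two ways to send both values with the standard library, assuming Python 3, where urlencode lives in urllib.parse (in Python 2 it is urllib.urlencode, as in the question):

```python
from urllib.parse import urlencode

# A sequence of (name, value) pairs may repeat a name, unlike a dict.
pairs = [("contentPartnerIds", "67"), ("contentPartnerIds", "68")]
query_from_pairs = urlencode(pairs)

# Alternatively, map the name to a list and pass doseq=True so that
# each list item becomes its own name=value pair.
query_from_list = urlencode({"contentPartnerIds": ["67", "68"]}, doseq=True)

print(query_from_pairs)
print(query_from_list)
```

Both calls should produce the same query string, with contentPartnerIds appearing once per selected checkbox.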

Related

Clicking multiple <span> elements with Selenium Python

I'm new to Selenium, and I am having trouble figuring out how to click through every instance of a specific element. To clarify, I can't even get it to click one, because it behaves as a dropdown but is defined as a <span> element.
I am trying to scrape FanDuel. When you click on a specific game, you are presented with a number of main bet titles, and to get the information I need I have to click the dropdowns. There is also another dropdown labelled "See More", which is a similar problem, but I assume I will be able to figure that out once this one is fixed.
So far, I have tried to use:
find_element_by_class_name()
find_element_by_css_selector()
I have also used the find_elements variants and tried to loop through the list and click on each index, but that did not work.
Any ideas would be much appreciated.
FYI: I am using Beautiful Soup to scrape the website for the information. I figured Selenium would help make the information that isn't currently accessible, accessible.
This image shows the dropdowns that I am trying to access, in this case the dropdown 'Win Margin'. The HTML code is shown to the left of it.
It also shows that there are multiple dropdowns, and their number varies from game to game.
You can also try using action chains from Selenium:
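from selenium.webdriver import ActionChains
# Hover over the menu so the hidden submenu becomes visible, then click it
menu = driver.find_element_by_css_selector(".nav")
hidden_submenu = driver.find_element_by_css_selector(".nav #submenu1")
ActionChains(driver).move_to_element(menu).click(hidden_submenu).perform()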
Source: here
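To click every matching dropdown rather than just one, a loop over find_elements is one option. This is a minimal sketch, assuming all the dropdowns can be matched by a single CSS selector. click_all is a hypothetical helper, not part of Selenium. It re-queries the page before each click, because a click can re-render the page and invalidate element references found earlier.

```python
def click_all(driver, css_selector):
    """Click every element matching css_selector; return how many were clicked."""
    # "css selector" is the string value of Selenium's By.CSS_SELECTOR.
    by = "css selector"
    total = len(driver.find_elements(by, css_selector))
    clicked = 0
    for index in range(total):
        # Look the elements up again so we never hold a stale reference.
        elements = driver.find_elements(by, css_selector)
        if index >= len(elements):
            break  # the page now has fewer matches than when we started
        elements[index].click()
        clicked += 1
    return clicked
```

With a live browser session this would be called as click_all(driver, "span.some-dropdown-class"), where the selector is a placeholder that has to be replaced with whatever the page's HTML actually uses.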

How to fill textareas and select option (select tag) and hit submit (input tag) via python?

I work with Python and data-mine some content, which I sort into different categories.
Then I go to a specific webpage and submit the results manually.
Is there a way to automate the process? I guess this is a "form submit" question, but I haven't found any relevant module in Python. Can you suggest something?
Selenium WebDriver is the most popular way to drive web pages. Python also has BeautifulSoup, but it only parses HTML, so on its own it cannot submit a form.
If you want to automate this, you have to see which parameters the form sends and make a request with those parameters to the endpoint directly from your Python app. Alternatively, look for a package that simulates a browser and fills in the form, but I think the correct way is to make the request directly from your app.
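Making the request directly can be sketched with the standard library alone. This is only an illustration: the URL and the field names (comments, category, submit) are hypothetical and have to be replaced with the names found in the real form's HTML, and it assumes the form is submitted as an ordinary URL-encoded POST.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_form_request(url, fields):
    """Build a POST request that carries the form fields as its body."""
    body = urlencode(fields).encode("utf-8")
    request = Request(url, data=body, method="POST")
    request.add_header("Content-Type", "application/x-www-form-urlencoded")
    return request


def submit_form(url, fields, timeout=10):
    """Send the form and return the response body as text."""
    with urlopen(build_form_request(url, fields), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


# Hypothetical fields: one textarea, one <select> and the submit button.
example = build_form_request(
    "http://www.example.com/submit",
    {"comments": "text for the textarea", "category": "news", "submit": "Send"},
)
print(example.get_method(), example.data)
```

build_form_request only prepares the request, so it can be inspected without touching the network; submit_form is the part that actually sends it.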

Remember elements text clicked of names in Selenium?

As an example, let's say I wanted to record the bios of all users on SO.
Let's say I loaded the page: How to click an element in Selenium WebDriver using JavaScript
I clicked all the users: .user-details a (11 of them).
I wrote the extracted text to a CSV file.
driver.get(‘Version compatibility of Firefox and the latest Selenium IDE (2.9.1.1-signed)’)
I read the users back from the CSV.
User: Ripon Al Wasim [is present again, do not click him]. How can this be achieved, given that it is only text?
Is something like this possible, or is it a limitation of Selenium with Python?
You could click all of them, but let's say you had to scrape 200 pages and the common name Bob popped up 430 times. I feel it is unnecessary to click his name every time. Is something like this possible with Selenium?
I feel like I'm missing something and that this is achievable, but I don't know how.
You could compare against the text file: print(elem.get_attribute("href")), write that to a file, and compare them. If elements were already present, delete them, but this is text. You could (maybe) put the text in an Excel file. I'm not entirely sure this is possible, but you could write the CSS selectors individually beside the text in the Excel file, delete the rows where the strings match, and then get Selenium to load that into WebDriver.
I'm not entirely convinced even this would work.
Is there a sane way of clicking elements matched by CSS while ignoring the names, stored in a text file, that you have already clicked?
There's nothing special here about Selenium. It is your tool for interacting with the browser. It is your program that needs to decide how to do that interaction and what to do with the information it gets back.
It sounds like you want to build a database of users, so why not use a database? Something like SQLite or PostgreSQL might work nicely for you.
Among the user details, store the name as it appears in the link (assuming it is unique for each user), and index that name. When scraping a page, pull the link text, then use an SQL statement to check whether a record with that name exists. If not, click the link and add a new record.
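The lookup-then-insert idea can be sketched with the sqlite3 module from the standard library. The table layout, the helper names and the sample link texts below are assumptions made for illustration; the clicking itself is left as a comment, because it depends on the page.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the database; PRIMARY KEY gives an index on name."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users (name TEXT PRIMARY KEY, bio TEXT)"
    )
    return conn


def is_known(conn, name):
    """Return True if a record for this link text already exists."""
    row = conn.execute("SELECT 1 FROM users WHERE name = ?", (name,)).fetchone()
    return row is not None


def remember(conn, name, bio):
    """Store a newly scraped user; a repeated name is ignored."""
    conn.execute(
        "INSERT OR IGNORE INTO users (name, bio) VALUES (?, ?)", (name, bio)
    )
    conn.commit()


conn = open_store()
to_click = []
for link_text in ["Ripon Al Wasim", "Bob", "Ripon Al Wasim", "Bob"]:
    if is_known(conn, link_text):
        continue  # already scraped, so skip the click
    to_click.append(link_text)  # the real code would click and scrape here
    remember(conn, link_text, "bio text goes here")

print(to_click)
```

Passing a file path instead of ":memory:" to open_store would keep the records between runs, which is what a 200-page scrape would need.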

Python Scrapy: imitate filter click

Hi guys. I'm a newbie to Scrapy and am learning how to work with it. I've run into an issue and can't figure out what to do next; perhaps someone more experienced can help. I have a basic website with a list of items I want to parse and download. The issue is that the page has filters, and pressing one filters items out. Basically, I want to download the data after pressing one filter, but I cannot figure out how to do that. I noticed that pressing the filter does not change the page URL, but the page is reloaded, so this is not AJAX. The filter is marked up in the HTML as a link with href="javascript:qsn.set('comm','0',1);". How can I imitate this filter press with Scrapy?
Any help will be appreciated.

Parsing a Dynamic Web Page using Python

I am trying to parse a web page whose HTML source changes when I press an arrow key to open a drop-down list.
I want to parse the contents of that drop-down list. How can I do that?
Example of the problem: if you go to http://in.bookmyshow.com/hyderabad and click the arrow button on the "Select Movie" combo box, a drop-down list of movies appears. I want to get a list of these movies.
Thanks in advance.
The actual URL with the data used to populate the drop-down box is here:
http://in.bookmyshow.com/getJSData/?file=/data/js/GetEvents_MT.js&cmd=GETEVENTSWEB&et=MT&rc=HYD&=1425299159643&=1425299159643
I'd be a bit careful, though: double-check the site's terms of use and whether there is an API you could use instead.
You may want to have a look at Selenium. It lets you reproduce exactly the same steps you perform by hand, because it also drives a real browser (Firefox, Chrome, etc.).
Of course, it's not as fast as using mechanize, urllib, BeautifulSoup and the like, but it is worth a try.
You will need to dig into the JavaScript to see how that menu gets populated. If it is getting populated via AJAX, then it might be easy to get that content by re-doing a request to the same URL (e.g., do a GET to "http://www.example.com/get_dropdown_entries.php").
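Re-doing that request from Python can be sketched with the standard library. fetch_text is a hypothetical helper; it assumes the endpoint answers a plain GET without cookies or special headers, which may not hold for a given site.

```python
from urllib.request import urlopen


def fetch_text(url, timeout=10):
    """Request the URL the page's JavaScript uses and return the body as text."""
    with urlopen(url, timeout=timeout) as response:
        # Use the charset the server declares, falling back to UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

It would be called as fetch_text("http://www.example.com/get_dropdown_entries.php"), using the placeholder URL from the answer above. The returned text then still has to be parsed according to whatever format the endpoint produces.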
