Scraping Google [duplicate] - python

This question already has an answer here:
scrape google resultstats with python [closed]
(1 answer)
Closed 9 years ago.
I am attempting to scrape Google search results as the results I receive using the API are not as useful as the results from the main site.
I am using the python requests library to grab the search page. However I am receiving an error:
Instant is off due to connection speed. Press Enter to search.
Is there any way I can disable instant search?
thanks

Google already has a search API for Python; it might save you some heartache.
https://developers.google.com/appengine/docs/python/search/
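The "Instant is off" message usually means Google served its stripped-down fallback page because the request didn't look like a browser. Sending a desktop User-Agent header typically gets you the normal results page. A minimal stdlib-only sketch (the `<h3>`-based title extraction is an assumption about Google's markup, which changes often and may violate their terms of service):

```python
from urllib.parse import urlencode
from urllib.request import Request
from html.parser import HTMLParser

# Browser-like headers: without a real User-Agent, Google serves the
# lightweight fallback page that shows the "Instant is off" message.
HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
        "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36"
    ),
}

def build_search_request(query, num=10):
    """Build a Request for the classic results page (q/num/hl are standard params)."""
    url = "https://www.google.com/search?" + urlencode(
        {"q": query, "num": num, "hl": "en"}
    )
    return Request(url, headers=HEADERS)

class TitleParser(HTMLParser):
    """Collect the text of <h3> elements, which hold result titles in
    Google's markup (an assumption -- the markup changes frequently)."""
    def __init__(self):
        super().__init__()
        self._in_h3 = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        if tag == "h3":
            self._in_h3 = True
            self.titles.append("")

    def handle_endtag(self, tag):
        if tag == "h3":
            self._in_h3 = False

    def handle_data(self, data):
        if self._in_h3:
            self.titles[-1] += data

# Demo on a canned snippet so the parsing logic is visible without
# making a live request:
snippet = "<div><h3>First result</h3><h3>Second result</h3></div>"
parser = TitleParser()
parser.feed(snippet)
```

In practice you would feed `urlopen(build_search_request("...")).read().decode()` to the parser instead of the canned snippet.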

Selenium python - How to capture network traffic responses [duplicate]

This question already has answers here:
How to capture network traffic with selenium
(3 answers)
Closed 2 years ago.
The issue I seem to be having is that I cannot find any way to access the network traffic responses in Firefox using Selenium (Python). I know that solutions exist for the Chrome webdriver, but in my case I need to use the Firefox version. I've been trying to figure this out for like half a day and I'm pulling my hair out at this point. Is there any way to get these responses?
I found a solution using browsermob-proxy. It's not exactly what I wanted, but it does give all the requests and all the responses.
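browsermob-proxy hands you the captured traffic as a HAR, which is a plain JSON dict, so pulling out the responses is just dictionary walking. A sketch using a hand-made HAR dict in place of the real `proxy.har` (which you'd get after `server.create_proxy()`, `proxy.new_har(..., options={"captureContent": True})`, and pointing the Firefox driver at the proxy):

```python
def extract_responses(har):
    """Return url/status/mime/body for each captured entry.
    `har` follows the HAR 1.2 layout that browsermob-proxy produces."""
    results = []
    for entry in har["log"]["entries"]:
        request = entry["request"]
        response = entry["response"]
        content = response.get("content", {})
        results.append({
            "url": request["url"],
            "status": response["status"],
            "mime": content.get("mimeType", ""),
            # body text is only present if captureContent was enabled
            "body": content.get("text", ""),
        })
    return results

# Stand-in for `proxy.har` -- the real dict comes from browsermob-proxy.
sample_har = {
    "log": {
        "entries": [
            {
                "request": {"url": "https://example.com/api"},
                "response": {
                    "status": 200,
                    "content": {
                        "mimeType": "application/json",
                        "text": '{"ok": true}',
                    },
                },
            }
        ]
    }
}
responses = extract_responses(sample_har)
```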

Is it possible to override request payload in python? [duplicate]

This question already exists:
How to add/edit data in request-payload available in google chrome dev tools [duplicate]
Closed 3 years ago.
I've been looking for an answer to this for quite a long time, but still with no results. I'm working with Selenium and I need to override one request which is generated after the submit button is clicked. It contains data in JSON format under "Request Payload" in Chrome dev tools. I found something called selenium-wire which provides functionality for overriding requests, but I'm not sure it works the way I want. Can anyone give me a hint on where to start, or which tools are appropriate for this?
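selenium-wire does support this through a request interceptor (`driver.request_interceptor`), which lets you replace `request.body` before the request leaves the browser. The interceptor itself needs a live driver, but the body-rewriting step is plain JSON work. A sketch, with the field names and URL path made up for illustration:

```python
import json

def override_payload(body: bytes, overrides: dict) -> bytes:
    """Decode a JSON request payload, merge in `overrides`, re-encode.
    This is the transformation you'd run inside a selenium-wire
    request interceptor before re-assigning request.body."""
    data = json.loads(body.decode("utf-8"))
    data.update(overrides)
    return json.dumps(data).encode("utf-8")

# Inside selenium-wire it would be wired up roughly like this
# (hypothetical "/submit" path and "amount" field; needs a real driver):
#
# def interceptor(request):
#     if request.path.endswith("/submit"):
#         request.body = override_payload(request.body, {"amount": 999})
#         # keep Content-Length in sync with the new body
#         del request.headers["Content-Length"]
#         request.headers["Content-Length"] = str(len(request.body))
#
# driver.request_interceptor = interceptor

new_body = override_payload(b'{"amount": 1, "user": "x"}', {"amount": 999})
```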

getting full content of web page (using Python-requests) [duplicate]

This question already has answers here:
Programmatic Python Browser with JavaScript
(8 answers)
Closed 4 years ago.
I am new to this subject, so my question may be stupid; sorry in advance.
My challenge is to do web-scraping, say for this page: link (google)
I'm trying to web-scrape it using Python. My problem is that once I use Python's requests.get, I don't seem to get the full content of the page. I guess this is because the page has many resources, and Python does not fetch them all. (More than that, once I scroll in Chrome, more data is revealed, yet I can see from the source code that no more data is downloaded to show it.)
How can I get the full content of a web page? what am I missing?
thanks
requests.get will get you the web page, but only what the page decides to give a robot. If you want the full web page as you see it as a human, you need to trick it by changing your headers. If you need to scroll or click buttons in order to see the whole page, which is what I think you'll need to do, I suggest you take a look at Selenium.
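A minimal sketch of the header trick, using stdlib urllib (with requests the equivalent is `requests.get(url, headers=BROWSER_HEADERS)`). The header values below are just what a typical desktop browser sends; note that content rendered by JavaScript after page load still won't appear in the response, which is the case Selenium covers:

```python
from urllib.request import Request

# Headers a typical desktop browser sends; many sites serve a reduced
# page to clients whose User-Agent looks like a bot or script.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
    ),
    "Accept": "text/html,application/xhtml+xml",
    "Accept-Language": "en-US,en;q=0.9",
}

req = Request("https://example.com/", headers=BROWSER_HEADERS)
# urlopen(req).read() would now fetch the page with browser-like headers.
```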

Posting to friends facebook WITHOUT using Graph API - Python [duplicate]

This question already has an answer here:
Posting to friends' wall with Graph API via 'feed' connection failing since Feb 6th 2013
(1 answer)
Closed 9 years ago.
I know that using the Graph API we can no longer post on a friend's wall. Has anyone else found a way around it? I have my current application set up with access tokens and whatnot, but because the Graph API can no longer post to a friend's wall using the friend's profile ID, I am kinda lost on how to fix this. Is there a way around it, using Python?
Use the Feed Dialog instead. See here for the reasons why.
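The Feed Dialog works by sending the user's browser to a Facebook-hosted page where the user confirms the post themselves, which is why it kept working after direct Graph API wall posts were removed. From Python you just build the URL and open it (all IDs below are placeholders; Facebook has since restricted these dialogs further, so treat this as a sketch of the historical mechanism):

```python
from urllib.parse import urlencode

def feed_dialog_url(app_id, link, redirect_uri, to=None):
    """Build a Facebook Feed Dialog URL. `to` was the friend's profile
    ID; the post itself happens in the user's browser, not server-side."""
    params = {"app_id": app_id, "link": link, "redirect_uri": redirect_uri}
    if to:
        params["to"] = to
    return "https://www.facebook.com/dialog/feed?" + urlencode(params)

# Placeholder IDs -- real values come from your Facebook app settings.
url = feed_dialog_url(
    "123456", "https://example.com", "https://example.com/done", to="789"
)
# webbrowser.open(url) would hand the dialog to the user to confirm.
```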

Simple way to get an accesskey for Amazon's Product Advertising API [duplicate]

This question already has answers here:
Amazon Search API [closed]
(2 answers)
Closed 8 years ago.
I would like to search Amazon.com by means of a Python script. I understand that Amazon has a Product Advertising API for doing this, but that using it requires an access key.
I've spent about an hour following link after link after link at aws.amazon.com's documentation for its web services, but I haven't managed to find any simple, direct way to get an access key. Everything I find seems to be aimed at those with far more elaborate projects in mind.
Is there a simple way to do what I'm trying to do? In particular, is there a simple way to get the necessary credentials?
(FWIW, I'm interested only in doing with a script what I would otherwise do by entering search terms in Amazon's website, and clicking on innumerable links to see several pages worth of results. IOW, all I'm trying to do is to give myself some way to circumvent the tedium involved in using Amazon's standard (i.e. web) search interface. I'm not at all interested in, e.g. displaying these results (or transforms thereof) in a website, or managing a small business through Amazon, or anything of this sort.)
You probably should use the Product Advertising API which gives you access to search products and such.
To get the key, sign up at http://aws.amazon.com/ as Amazon has moved everything there; it seems to be free.
But you would probably spend less time if you used something like BeautifulSoup to parse a search results page fetched with urllib.
EDIT: There is a Python binding for this API called bottlenose.
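Once you have the access key and secret key, each request to the legacy Product Advertising API endpoint had to carry an HMAC-SHA256 signature over a sorted, percent-encoded query string; this is what bottlenose does for you internally. A sketch of that signing step with placeholder credentials and a fixed timestamp (normally you'd use the current UTC time):

```python
import base64
import hashlib
import hmac
from urllib.parse import quote

def sign_paapi_request(access_key, secret_key, params,
                       host="webservices.amazon.com", path="/onca/xml"):
    """Sign a legacy Product Advertising API request: sort the params,
    percent-encode them, HMAC-SHA256 the canonical string with the
    secret key, and append the base64 digest as the Signature param."""
    params = dict(params, AWSAccessKeyId=access_key)
    query = "&".join(
        f"{quote(k, safe='')}={quote(str(v), safe='')}"
        for k, v in sorted(params.items())
    )
    to_sign = "\n".join(["GET", host, path, query])
    digest = hmac.new(
        secret_key.encode(), to_sign.encode(), hashlib.sha256
    ).digest()
    signature = base64.b64encode(digest).decode()
    return f"https://{host}{path}?{query}&Signature={quote(signature, safe='')}"

# Placeholder credentials -- real ones come from your AWS account page.
url = sign_paapi_request(
    "AKIAEXAMPLE", "secret",
    {"Service": "AWSECommerceService", "Operation": "ItemSearch",
     "SearchIndex": "Books", "Keywords": "python",
     "Timestamp": "2014-01-01T00:00:00Z"},
)
```

Fetching `url` returns the search results as XML, which is the response format this endpoint used.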
