I need to scrape historical market rates of freight from differing origins and destinations. Currently, I only have interactive graphs like this available to me:
Sample Graph
You have to click on the graph to get the numbers to appear (all of them appear at once).
I have some experience with HTML web scraping through the Scrapy library, but I was wondering if something like BeautifulSoup would be capable of handling this type of problem.
In short: yes, but it depends.
Most JavaScript graphs work by embedding JSON data in <script> tags or retrieving it via AJAX requests. So the graph data is in JSON format somewhere - you just need to find it.
To find it, first open the page source and Ctrl+F for some key values you see in the graph. In your case, start with £407 - it's very likely in embedded JSON:
<script type="application/ld+json">
{"prices": ["£407", ...]}
</script>
Alternatively, the data could be retrieved via an AJAX request. For example, take this craft.co case: when you load the https://craft.co/netflix page, it makes an AJAX request for the graph data.
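Once you've located the embedded JSON, extracting it takes only a few lines. A minimal sketch - the HTML snippet, the tag's id, and the field names here are all made up for illustration:

```python
import json
import re

# Hypothetical page: many chart libraries embed their data
# in a <script> tag as a JSON blob, roughly like this.
html = """
<html><body>
<script type="application/json" id="chart-data">
{"prices": [407, 389, 412], "currency": "GBP"}
</script>
</body></html>
"""

# Pull the JSON payload out of the script tag with a regex,
# then parse it with the json module.
match = re.search(
    r'<script[^>]*type="application/json"[^>]*>(.*?)</script>',
    html,
    re.DOTALL,
)
data = json.loads(match.group(1))
print(data["prices"])  # the chart's underlying numbers
```

In practice you would fetch `html` with Requests first; the extraction step stays the same.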
I'm using Beautiful Soup to scrape a webpage.
I am trying to scrape data from https://painel-covid19.saude.ma.gov.br/vacinas, but the tags in my output come back empty. In Inspect Element I can see the data, but in the page source I cannot. You can see the code is hidden in . How can I retrieve it using Python? Can someone help me?
The issue isn't that the data is "not visible". The issue is that the data is filled in by JavaScript code: you won't see it unless you execute the JavaScript on the page. You can do that with the selenium package, which drives a copy of Chrome to do the rendering.
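A minimal Selenium sketch of that idea - this assumes a matching chromedriver is available on your PATH, and the URL below is just the one from the question:

```python
def fetch_rendered_html(url):
    """Load a page in headless Chrome and return the DOM
    after the page's JavaScript has run."""
    # Imported inside the function so the sketch can be read
    # even without selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return driver.page_source  # post-JavaScript DOM, unlike requests
    finally:
        driver.quit()

# html = fetch_rendered_html("https://painel-covid19.saude.ma.gov.br/vacinas")
# ...then parse `html` with Beautiful Soup as usual.
```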
I want to know if it's possible to use Python (or anything else) to generate an RSS feed for a website that does not provide one.
Are there any examples?
Yes. If I were building something like that, I would design it like this:
1. Write a Flask server to handle requests.
2. On every request, download the data from the target website with bs4.
3. Transform the data into XML output according to the RSS format.
It's a bit more than a few lines of code, but nothing very hard.
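Step 3 can be sketched with the standard library alone; the scraping step (Requests + bs4) would produce the `items` list. The feed title, link, and items below are made up:

```python
import xml.etree.ElementTree as ET

def build_rss(channel_title, channel_link, items):
    """Build a minimal RSS 2.0 document from (title, link) tuples."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    for title, link in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = link
    return ET.tostring(rss, encoding="unicode")

# Hypothetical items that the scraping step would have produced
feed = build_rss(
    "Example feed",
    "https://example.com",
    [("First post", "https://example.com/1")],
)
print(feed)
```

A Flask route would simply return `feed` with a `Content-Type: application/rss+xml` header.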
I am trying to write a program in Python, so that I can use a Raspberry Pi to take hardware input and then do some IoT stuff as a result.
I'm comfortable with all the hardware stuff, and am pretty sure that I'll be able to figure out how to do Facebook posting and tweeting, but I also want to submit data into a webpage.
I don't have control of the webpage, access to its code, or anything like that, and it's going to be nigh-on impossible to get access to the code, so I'm relying on Inspect Element for any data I need. I'm also not supposed to post the URL of said webpage publicly, as it requires a login which I am not at liberty to release.
I want to interact with several features on the webpage, namely:
A mouse-over drop-down menu
A text entering field
A few buttons
I think that I need to do something with 'event listeners', but I'm unsure how to go about this; I have quite a lot of Python experience but not much web development knowledge.
Thanks
I would like to scrape a table from a website and afterwards display the scraped table in a GUI created with Python. The table (and the website) is in HTML. How would you do it? If possible, I would love to display the table in my GUI formatted in exactly the same way. Hypothetically, it would even be enough to take a screenshot of the table and display that picture in the GUI.
PS: I thought of using Scrapy, but I guess in that case I'd have to build the table from scratch again in my GUI later?
Thanks for your ideas!
You don't really need Scrapy if you just need to scrape a single page and don't need crawling. As for the toolkit, BeautifulSoup + Requests is a popular choice.
As for the table content - you would most likely store it in a list of lists (or some structure - that's up to you) and then populate the table in your GUI application.
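A sketch of the "list of lists" idea, using only the standard library's html.parser for illustration (BeautifulSoup's `find_all("tr")` would be more concise); the table contents here are made up:

```python
from html.parser import HTMLParser

class TableParser(HTMLParser):
    """Collect an HTML table's cells into a list of row lists."""
    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th"):
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
        elif tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self._row.append(data.strip())

html = ("<table><tr><th>Name</th><th>Price</th></tr>"
        "<tr><td>Tea</td><td>3</td></tr></table>")
parser = TableParser()
parser.feed(html)
print(parser.rows)  # [['Name', 'Price'], ['Tea', '3']]
```

Each inner list maps directly onto one row of a GUI table widget (e.g. a Tkinter Treeview).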
If I want to give input to a website through a Python program and display the result in the terminal after the online computation, how can I do it using a Python wrapper? As I am new to Python, can anyone suggest a tutorial for this?
It all depends on the website you want to retrieve your result from, and how it accepts input from you. For example, if the webpage accepts GET or POST requests, you can send it an HTTP request and print the response to the terminal.
If the website accepts input via a submit form, on the other hand, you would have to find the URL the form submits to and send your data to that page.
There is a Python library called Requests, which you can use to send HTTP requests to a webpage and get the response. I suggest you read its documentation; it has some good examples you can base your idea on. Another option is the built-in urllib2 module, which would also work for your purposes.
The response to your request will most likely be an HTML page, so you may have to scrape your desired content out of it.
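A sketch of the form-submission idea using Python 3's urllib.request (the modern successor to urllib2); the endpoint and form fields below are made up:

```python
import urllib.parse
import urllib.request

# Encode the form fields the way a browser's submit button would.
form = urllib.parse.urlencode({"q": "freight rates", "page": "1"})
req = urllib.request.Request(
    "https://example.com/search",   # hypothetical form-submit endpoint
    data=form.encode("ascii"),      # supplying data makes this a POST
    headers={"User-Agent": "my-script/0.1"},
)
print(req.get_method())  # POST
print(req.data)

# Actually sending it and printing the response body would be:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
```

With Requests the same thing is `requests.post(url, data={...})`.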