How to get some specific real-time data from several websites continuously using Python (Django)? - python

I am working on a web application that needs to pull data from other websites continuously. For example, the price of a product.
If the price changes on the original website, my site's price will also be changed automatically.
Is it possible to scrape/pull/get data from other websites continuously like this?

You can write a background job that fetches the data from the remote host at a fixed interval (every hour, every few minutes, and so on) and updates your stored data whenever the value has changed.
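A minimal sketch of that idea, assuming the fetch step is passed in as a function. The names `update_if_changed`, `run_job`, and the dict used as a stand-in for your real storage are hypothetical, not from the original answer:

```python
import time
from typing import Callable, Dict, Optional


def update_if_changed(store: Dict[str, float], key: str, new_value: float) -> bool:
    """Write new_value into store only when it differs; return True if updated."""
    if store.get(key) == new_value:
        return False
    store[key] = new_value
    return True


def run_job(
    fetch: Callable[[], float],
    store: Dict[str, float],
    key: str,
    interval_seconds: float,
    max_runs: Optional[int] = None,
) -> None:
    """Call fetch() repeatedly, sleeping interval_seconds between calls.

    max_runs=None loops forever; a number stops after that many fetches.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        update_if_changed(store, key, fetch())
        runs += 1
        if max_runs is None or runs < max_runs:
            time.sleep(interval_seconds)
```

In a real project the dict would be replaced by a database write, and the loop would usually be replaced by a scheduler such as cron or Celery.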

Related

How to have python code get data from a website then update it on another website?

I want to have some Python code use Selenium or bs4 to check a website hourly for updates, then get that information and put it on another website, formatted however I want. I would like to know the best way to connect these two together.
I would probably set up a cron job (a scheduled task on Windows) that runs your scraping script hourly against the website you're getting data from. The logic in your script will depend on the website you're entering data into, but broadly speaking I would scrape the data, process it, then either send a POST request to the website or update the data source (a db, file, whatever) directly.
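The scrape, process, and POST steps could be sketched roughly as follows. Everything specific here is an assumption for illustration: the `<span class="price">` markup, the JSON payload shape, and the function names. A regular expression stands in for bs4 only to keep the sketch free of third-party imports:

```python
import json
import re
import urllib.request


def extract_price(html: str) -> float:
    """Pull the number out of a hypothetical <span class="price"> element."""
    match = re.search(r'<span class="price">\s*\$?([0-9]+(?:\.[0-9]+)?)', html)
    if match is None:
        raise ValueError("price element not found")
    return float(match.group(1))


def build_update_request(url: str, price: float) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request carrying the scraped price."""
    body = json.dumps({"price": price}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending the request is then a call to `urllib.request.urlopen(build_update_request(url, price))`, where `url` is whatever endpoint the target website accepts updates on.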

How to send scraped data to a page but without waiting for the page to load?

I am showing some scraped data on my Django website. The data changes a couple of times every hour, so it needs to be updated. I am using Beautiful Soup to scrape the data, then I'm sending it to the view and passing it in a context dictionary to present it on the website.
The problem is that the scraping function takes some time to run, and because of that the page does not load until the function has finished. How can I make it load faster? There is no API on the data website.
It depends on your algorithm. If it goes through a lot of elements that may or may not have been updated, you can save some attribute per element so that you can skip the ones that have not changed. That's a first thought; if you show the algorithm, maybe we can give you a better answer :)
Schedule an hourly scrape task using cron or Celery, store the result in a database, and serve your data from there.
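The store-then-serve part could look something like this sketch, which uses the standard-library `sqlite3` module in place of the Django ORM. The table name `scraped` and the function names are hypothetical:

```python
import sqlite3
import time
from typing import Optional


def init_db(conn: sqlite3.Connection) -> None:
    """Create the table that holds the most recent scrape per key."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS scraped ("
        "key TEXT PRIMARY KEY, value TEXT NOT NULL, fetched_at REAL NOT NULL)"
    )


def save_result(conn: sqlite3.Connection, key: str, value: str) -> None:
    """Called by the scheduled scrape task: overwrite the stored value."""
    conn.execute(
        "INSERT OR REPLACE INTO scraped (key, value, fetched_at) VALUES (?, ?, ?)",
        (key, value, time.time()),
    )
    conn.commit()


def load_result(conn: sqlite3.Connection, key: str) -> Optional[str]:
    """Called by the view: a fast local read, no scraping involved."""
    row = conn.execute("SELECT value FROM scraped WHERE key = ?", (key,)).fetchone()
    return None if row is None else row[0]
```

The view then only ever calls `load_result`, so page load time no longer depends on how long the scrape takes.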

Display Data In Real Time With Django

I have a simulator application that continuously spits out data, formatted in JSON, to a given host name and port number (UDP). I would like to be able to point the simulator output to a Django web application so that I can monitor/process the data as it comes in.
How do I receive and process data in real time using Django? What tools or packages are available to accomplish this? I did come across this answer: How to serve data from UDP stream over HTTP in Python?, but I don't completely understand.
Ex: Similar to this page: http://money.cnn.com/data/markets/
ALSO, I don't need to store any of the streaming data in a database. I just need to perform lookups based on the streaming data. Maybe it's not a Django issue at all?
Use JavaScript.
Create a webpage with all the results, and then use JavaScript to collect the data from that page and update it every X seconds.
Have the webpage serve the JSON data, and have the JavaScript grab it and interpret it.
See: get html code using javascript with a url
Then update the page using JavaScript. W3Schools has great JS tutorials.
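On the server side, the page that the JavaScript polls needs the latest values from the UDP stream. A rough sketch, assuming each datagram is one JSON object and that only the most recent value per field is needed (no database, as the question asks). The function names are hypothetical:

```python
import json
import socket
from typing import Any, Dict


def handle_datagram(payload: bytes, state: Dict[str, Any]) -> None:
    """Merge one JSON datagram into the in-memory 'latest values' dict."""
    message = json.loads(payload.decode("utf-8"))
    state.update(message)


def listen(host: str, port: int, state: Dict[str, Any]) -> None:
    """Blocking receive loop; run it in a background thread or separate process."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        while True:
            payload, _addr = sock.recvfrom(65535)
            handle_datagram(payload, state)


def snapshot(state: Dict[str, Any]) -> str:
    """JSON text for the polled page to return."""
    return json.dumps(state, sort_keys=True)
```

A view would return `snapshot(state)` as its response body, and the JavaScript would fetch that URL every X seconds.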

Django - How do I download data once a day using Pandas for my site?

I'm a beginner to Django and am currently building a site that displays stock prices. To do that, I need to download (or update) the stock prices once a day. I know that I can retrieve stock prices using Pandas. However, I would like my site to do it once everyday at a specific time, instead of retrieving data every time a visitor visits the view. I'm a bit stuck here and did a lot of Google searches. Can someone please point me to a link that I can read up on?
EDIT: I'm currently making this site on my own computer so I haven't uploaded my files yet.
If you are using a Linux box (like Debian[0]) and have cron[1] up and running:
Create a shell script that calls a program you will write to get the data using Pandas.
Use crontab -e to edit your crontab file and add your script so that it runs at the time you need (crontab -l only lists the current entries).
[0] https://www.debian.org/
[1] http://linux.die.net/man/1/crontab
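As a sketch of what the scheduled program might look like: the crontab line, the file paths, and the function name below are all hypothetical, and the download step itself (Pandas or otherwise) is left out, with the rows passed in as plain tuples:

```python
# Hypothetical crontab entry, running the script every day at 18:00:
#   0 18 * * * /usr/bin/python3 /path/to/update_prices.py
import csv
import datetime
from pathlib import Path
from typing import Iterable, Optional, Tuple


def write_daily_file(
    rows: Iterable[Tuple[str, float]],
    out_dir: str,
    today: Optional[datetime.date] = None,
) -> Path:
    """Write one CSV per day (prices-YYYY-MM-DD.csv) and return its path."""
    today = today or datetime.date.today()
    directory = Path(out_dir)
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"prices-{today.isoformat()}.csv"
    with path.open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["symbol", "close"])
        writer.writerows(rows)
    return path
```

The Django view then reads the most recent file (or a database table filled the same way) instead of downloading prices on every visit.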

Refresh Webpage using python

I have a webpage showing some data. I have a Python script that continuously updates the data (it fetches the data from a database and writes it to the HTML page). It takes about 5 minutes for the script to fetch the data. I have the HTML page set to refresh every 60 seconds using the meta tag. However, I want to change this and have the page refresh as soon as the Python script updates it, so basically I need to add some code to my Python script that refreshes the HTML page as soon as it's done writing to it.
Is this possible?
Without diving into complex modern things like WebSockets, there's no way for the server to 'push' a notice to a web browser. What you can do, however, is make the client check for updates in a way that is not visible to the user.
It will involve writing JavaScript and writing an extra file. When writing your main webpage, add, inside the JavaScript, a timestamp (a Unix timestamp will be easiest here). You also write that same timestamp to a file on the web server (let's call it updatetime.txt). Using an AJAX request on the page, you pull in updatetime.txt and check whether the number in the file is bigger than the number stored when the document was generated; if it is, refresh the page. You can alter how 'instantly' the changes get noticed by controlling how often you poll.
I won't go into too much detail on writing the code, but I'd probably just use $.ajax() from jQuery (even though it's sort of overkill for one function) to make the calls. The trick to putting something on a timer in JS is setInterval. You should be able to find plenty of existing documentation on using both of them.
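The Python side of this could be sketched as below. The file name `index.html`, the 5-second polling interval, and the `publish` function are assumptions for illustration, and the embedded script uses the browser's built-in `fetch` rather than the `$.ajax()` call mentioned above, purely to keep the page free of external dependencies:

```python
import time
from pathlib import Path
from typing import Optional

# __BODY__ and __TS__ are placeholders filled in by publish().
PAGE_TEMPLATE = """<!doctype html>
<html>
<body>
__BODY__
<script>
var generatedAt = __TS__;
setInterval(function () {
  fetch("updatetime.txt", {cache: "no-store"})
    .then(function (response) { return response.text(); })
    .then(function (text) {
      if (parseInt(text, 10) > generatedAt) { location.reload(); }
    });
}, 5000);
</script>
</body>
</html>
"""


def publish(body_html: str, out_dir: str, now: Optional[float] = None) -> int:
    """Write the page, then updatetime.txt, both carrying the same timestamp."""
    timestamp = int(time.time()) if now is None else int(now)
    directory = Path(out_dir)
    directory.mkdir(parents=True, exist_ok=True)
    page = PAGE_TEMPLATE.replace("__BODY__", body_html).replace("__TS__", str(timestamp))
    # The page is written first so a reload never picks up a stale document.
    (directory / "index.html").write_text(page, encoding="utf-8")
    (directory / "updatetime.txt").write_text(str(timestamp), encoding="utf-8")
    return timestamp
```

The data-fetching script would call `publish(...)` once per update cycle, and any open browser tab reloads within one polling interval.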
