I've been looking for a way to get a Steam user's status: the game they're playing and whether they're online or offline. I was originally going to scrape the profile page, but I figured there must be an easier way to do it.
I looked around for pages that might serve the user's current status as JSON, perhaps even an API, but I haven't found anything yet.
Start with the Steam Web API. I think you are looking to work with GetPlayerSummaries.
Did you see:
http://steamcommunity.com/dev
Steam Web API Documentation
Valve provides these APIs so website developers can use data from
Steam in new and interesting ways. They allow developers to query
Steam for information that they can present on their own sites. At the
moment the only APIs we offer provide item data for Team Fortress 2,
but this list will grow over time.
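As a rough sketch of what a GetPlayerSummaries call looks like (the API key and Steam ID below are placeholders; register your own key at http://steamcommunity.com/dev/apikey):

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Placeholders -- substitute your own key and the target 64-bit Steam ID
API_KEY = "YOUR_API_KEY"
STEAM_ID = "76561197960435530"

params = urlencode({"key": API_KEY, "steamids": STEAM_ID})
url = ("https://api.steampowered.com/ISteamUser/"
       "GetPlayerSummaries/v0002/?" + params)

# With a valid key, the call below returns JSON including `personastate`
# (0 = offline, 1 = online, ...) and `gameextrainfo` when in a game:
# player = json.load(urlopen(url))["response"]["players"][0]
```

The `personastate` and `gameextrainfo` fields cover exactly the status and current-game information you're after.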
Currently I'm building a social media light, which blinks when there's an update on your Facebook/Twitter/Instagram/.... Right now I'm looking at Instagram, and I want to know when someone likes one of my posts. The official API doesn't have this functionality, but I found an unofficial API, written in PHP, that implements more of the official app's functionality.
After looking for a while I couldn't find the API endpoint for accessing the likes feed. I think it's possible, since you can even request it in the web interface. Before I start on scraping it myself, are there other ways?
I also took a look at services like IFTTT, but the functionality isn't there either.
Currently I'm writing the app in Python, but I'm getting incredibly frustrated with the limited functionality the various social media platforms offer. Facebook removed its notification functionality, Instagram doesn't have one, and I haven't started on Twitter yet, but I get a feeling it will be a headache as well.
Instagram shut down its Feed API, which provided the functionality that IFTTT and others used, a little over a year ago.
Instagram Platform Update
My younger brother, who still lives in China, is a fan of Michael Phelps and wants to see his Twitter posts. Since he can't access Twitter behind the GFW, and setting up a VPN is too hard for my mom, I want to write something that grabs his tweets and sends them to my mom's email.
I use Python as my main language and I'm familiar with tweepy / requests / scrapy.
I have tried or thought about three ways of doing this:
Use the Twitter API and grab the user_timeline. However, this method loses all graphical data and returns a bunch of bare links that are only meaningful after proper rendering.
Scrape the page and save the HTML content, then send the HTML file as an attachment. However, this method still loses some graphical content and isn't user friendly for someone in her 40s. In addition, it's hard to tell how many tweets I've scraped and whether there are any updates.
Wrap the HTML content in the email itself and rely on HTML rendering within the email client. I haven't worked with this before, so I'm not exactly sure how it's going to work out.
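To make option 3 concrete, here's a minimal sketch of an HTML email using the standard library (the content, addresses and SMTP server are made-up placeholders):

```python
import smtplib
from email.mime.text import MIMEText

# Hypothetical tweet content and addresses -- replace with your own
html = "<html><body><p>New tweet from <b>@MichaelPhelps</b></p></body></html>"

msg = MIMEText(html, "html")  # "html" subtype makes the client render it
msg["Subject"] = "New tweet"
msg["From"] = "bot@example.com"
msg["To"] = "mom@example.com"

# Uncomment with a real SMTP server to actually send:
# with smtplib.SMTP("smtp.example.com") as server:
#     server.send_message(msg)
```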
I am aware that "what's the best way to do this" questions are often downvoted on SO, but I believe this problem is specific enough for a meaningful Q&A. Any suggestion will be appreciated.
Have you thought of using Selenium and taking screenshots of the browser window? Taking a screenshot with Selenium is as easy as
from selenium import webdriver

browser = webdriver.Firefox()
browser.get('https://twitter.com/MichaelPhelps')
browser.get_screenshot_as_file('twitter_screenshot.png')
You'd have to figure out a way to automate both watching for new tweets and running the Selenium script when a new tweet is found. However, in terms of preserving graphical content, taking screenshots with Selenium would be simple to implement.
Docs: http://selenium-python.readthedocs.io/api.html#selenium.webdriver.remote.webdriver.WebDriver.get_screenshot_as_file
I am making a web application that will monitor the number of members and discussions in each of the groups listed here (http://www.codecademy.com/groups#web) and display that information in nice graphs.
However, as you have already seen, it looks like I need to create an account and login with it.
Bearing in mind that my project uses Python on the server side, how do I do it? Which API is easiest (Google, Facebook or Twitter)?
I would really love if you could also provide some examples because I am really new at this (and at Python too).
A well-known Python wrapper around the Twitter API is this one. I used it and it's very easy. You should first read this page and also register an application to get OAuth keys.
Example:
import twitter

# Fill in these values from your registered application
api = twitter.Api(consumer_key="",
                  consumer_secret="",
                  access_token_key="",
                  access_token_secret="")

# Get your home timeline
print(api.GetHomeTimeline())
Hope it helps.
I've looked at a lot of questions and libraries and didn't find exactly what I wanted. Here's the thing: I'm developing an application in Python for a user to get all sorts of things from social network accounts. I'm having trouble with Facebook. I would like, if possible, a step-by-step tutorial on the code and libraries to use to get a user's information, from posts to photo metadata (including the user's login and how to handle it, because I've had a lot of problems with authentication).
Thank you
I strongly encourage you to use Facebook's own APIs.
First of all, check out documentation on Facebook's Graph API https://developers.facebook.com/docs/reference/api/. If you are not familiar with JSON, DO read a tutorial on it (for instance http://secretgeek.net/json_3mins.asp).
Once you grasp the concepts, start using this API. For Python, there are several alternatives:
facebook/python-sdk https://github.com/facebook/python-sdk
pyFaceGraph https://github.com/iplatform/pyFaceGraph/
It is also fairly trivial to write a simple HTTP client that uses the Graph API.
I suggest you check out the Python libraries, try the examples in their documentation, and see whether they work and do what you need.
Only as a last resort, would I write a scraper and try to extract data with screenscraping (it is much more painful and breaks more easily).
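To show what such a minimal HTTP client might look like (the access token is a placeholder you'd obtain through Facebook's OAuth flow, and the helper names are my own):

```python
import json
from urllib.request import urlopen

ACCESS_TOKEN = "YOUR_ACCESS_TOKEN"  # placeholder -- from the OAuth flow

def graph_url(path, token=ACCESS_TOKEN):
    """Build a Graph API URL for an object path such as 'me' or 'me/photos'."""
    return "https://graph.facebook.com/%s?access_token=%s" % (path, token)

def graph_get(path, token=ACCESS_TOKEN):
    """Fetch a Graph API object and decode the JSON response."""
    return json.load(urlopen(graph_url(path, token)))

# With a valid token: graph_get("me"), graph_get("me/posts"), ...
```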
I have not used this with Facebook, but in the past when I had to scrape a site that required login I used Mechanize to handle the login and scraping and Beautiful Soup to parse the resulting HTML.
I'm trying to scrape a YouTube page with Python, and it has a lot of AJAX in it.
I have to call the JavaScript each time to get the info, but I'm not really sure how to go about it. I'm using the urllib2 module to open URLs. Any help would be appreciated.
YouTube (and everything else Google makes) has EXTENSIVE APIs already in place, giving you access to just about any and all data you could possibly want.
Take a look at The Youtube Data API for more information.
I use urllib to make the API requests and ElementTree to parse the returned XML.
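To illustrate the ElementTree step, here's a sketch that parses a feed shaped like the Atom XML the Data API returns (the sample entries below are made up, not a real API dump):

```python
import xml.etree.ElementTree as ET

# Made-up Atom feed standing in for an API response
sample = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><title>First video</title></entry>
  <entry><title>Second video</title></entry>
</feed>"""

# Atom elements are namespaced, so queries need a namespace map
NS = {"atom": "http://www.w3.org/2005/Atom"}
root = ET.fromstring(sample)
titles = [entry.findtext("atom:title", namespaces=NS)
          for entry in root.findall("atom:entry", NS)]
print(titles)  # ['First video', 'Second video']
```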
The main problem is that you're violating the TOS (terms of service) of the YouTube site. YouTube's engineers and lawyers will do their professional best to track you down and make an example of you if you persist. If you're happy with that prospect, then on your own head be it: technically, your best bets are python-spidermonkey and selenium. I wanted to put the technical hints on record in case anybody in the future has needs like the ones your question's title indicates, without the legal issues you clearly have if you continue in this particular endeavor.
Here is how I would do it: install Firebug on Firefox, then turn on the Net panel in Firebug and click on the desired link on YouTube. Now see what happens and which pages are requested, and find the ones responsible for the AJAX part of the page. Now you can use urllib or Mechanize to fetch that link. If you CAN pull the same content this way, then you have what you are looking for; just parse the content. If you CAN'T pull the content this way, that suggests the requested page might be looking at user login credentials, session info, or other header fields such as HTTP_REFERER, etc. Then you might want to look at something more extensive like Scrapy. I would suggest you always follow the simple path first. Good luck, and happy "responsible" scraping! :)
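Once you've found the AJAX URL in Firebug's Net panel, fetching it with the same headers the browser sent is straightforward; a sketch (the endpoint below is a made-up placeholder):

```python
from urllib.request import Request, urlopen

# Hypothetical endpoint discovered in the browser's network panel
url = "https://www.youtube.com/some_ajax_endpoint"

# Mimic the browser's request headers, including the Referer,
# in case the server checks them before answering
req = Request(url, headers={
    "User-Agent": "Mozilla/5.0",
    "Referer": "https://www.youtube.com/",
})
# body = urlopen(req).read()  # uncomment to actually fetch
```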
As suggested, you should use the YouTube API to access the data made available legitimately.
Regarding the general question of scraping AJAX, you might want to consider the scrapy framework. It provides extensive support for crawling and scraping web sites and uses python-spidermonkey under the hood to access javascript links.
You could sniff the network traffic with something like Wireshark, then replay the HTTP calls via a scraping framework that is robust enough to deal with AJAX, such as Scrapy.