I have a Google Sheet that only a few people have access to; link sharing is disabled, so I can't download the document with the usual export link:
import urllib.request

def download_xlsx():
    url = 'https://docs.google.com/spreadsheets/d/spreadsheetID/export?format=xlsx'
    urllib.request.urlretrieve(url, 'table.xlsx')
I realize the problem is that the document isn't accessible by link, and I can't enable link access, since the requirements explicitly rule that out.
Has anyone already faced a similar task and can suggest a solution?
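One approach, mirroring the Authorization-header trick used in the answers further down this page: send an OAuth 2.0 Bearer token along with the export request, so the download runs as an account that can open the sheet. A minimal sketch, assuming the requests package; SPREADSHEET_ID and access_token are placeholders, not values from the original post:

import requests

access_token = '<OAuth 2.0 token with Drive scope>'  # placeholder
url = ('https://docs.google.com/spreadsheets/d/SPREADSHEET_ID'
       '/export?format=xlsx')
r = requests.get(url, headers={'Authorization': 'Bearer ' + access_token})
r.raise_for_status()  # fails if the token's account can't open the sheet
with open('table.xlsx', 'wb') as f:
    f.write(r.content)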
As the title states, I'm looking for a way to get backlinks for a given URL / website using Google APIs, since I already have an API key and I'd rather use it instead of relying on other services.
I've already tested services like Ahrefs, Majestic, Moz, Serpstat etc. and they can give me the information I need, but I was wondering if there was a way to do it with Google.
From what I've read in my past research, Google used to offer a way to do this, but it was deprecated and is no longer usable. Did they really take this feature away for good?
I've also noticed that Google offers a similar service through Google Search Console, but it can only be used for your own website; I'd like to get that kind of information for an arbitrary given URL.
I'll be using Python in my project, but I don't think there's a package that can deliver this kind of data; at least, I looked for one and didn't find anything.
Any help would be appreciated.
I'd like to make a function which converts Google Drive videos into VLC-streamable links (e.g. vlc://https://WEBSITE.com/FILE_ID.mkv).
I've tried methods which were shared on stack overflow, such as modifying the Google Drive link to:
https://drive.google.com/uc?export=download&id=FILE_ID
All the methods I've tried seem to not work anymore. Any ideas?
I've figured out the answer.
Google Drive's API has a download feature; you just need to make a request to https://www.googleapis.com/drive/v3/files/FILE_ID?alt=media&key=API_KEY
This doesn't produce a direct file path ending in .mp4 or .mkv, but VLC and PotPlayer recognize the link when it's prefixed like this:
potplayer://https://www.googleapis.com/drive/v3/files/FILE_ID?alt=media&key=API_KEY
vlc://https://www.googleapis.com/drive/v3/files/FILE_ID?alt=media&key=API_KEY
Edit: this doesn't work on its own, since Google blocks automated requests like that. To work around it, set an Authorization header on the request, e.g.:
url = "https://www.googleapis.com/drive/v3/files/FILE_ID?alt=media&key=API_KEY"
r = requests.get(url, headers={"Authorization":"Bearer " + accessToken})
You get the accessToken from the Google Drive API's OAuth 2.0 flow.
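For completeness, one common way to obtain that token (an addition, not part of the original answer) is the installed-app flow from the google-auth-oauthlib package, where client_secret.json is the OAuth client file downloaded from the Google Cloud Console:

from google_auth_oauthlib.flow import InstalledAppFlow

SCOPES = ["https://www.googleapis.com/auth/drive.readonly"]
flow = InstalledAppFlow.from_client_secrets_file("client_secret.json", SCOPES)
creds = flow.run_local_server(port=0)  # opens a browser for user consent
accessToken = creds.token  # use as the Bearer token above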
Just make the file public and copy its ID.
You can find it in the share link: /file/d/YOUR_ID/view?usp=sharing.
Copy your ID and paste it into this:
drive.google.com/uc?export=view&id=YOUR_ID
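If you're doing this in code, here's a tiny sketch of the same ID extraction (share_link is a placeholder):

import re

share_link = "https://drive.google.com/file/d/YOUR_ID/view?usp=sharing"
file_id = re.search(r"/file/d/([^/]+)", share_link).group(1)
direct_link = "https://drive.google.com/uc?export=view&id=" + file_id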
How can I download all the PDFs (or files with a specific extension, like .tif or .pdf) from a webpage that requires login? I don't want to log in every time for every PDF, so I can't use the generate-a-link-and-push-it-to-the-browser scheme.
The solution was simple; I'm just posting it since others may have the same question: embed the HTTP basic-auth credentials directly in the URL.
mydriver.get("https://username:password@www.somewebsite.com/somelink")
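A variant without a browser (my sketch, not the original poster's code): keep one authenticated requests session and reuse it for every file, assuming the site accepts HTTP basic auth; the URL and credentials are placeholders:

import os
import requests
from urllib.parse import urljoin
from bs4 import BeautifulSoup

BASE_URL = "https://www.somewebsite.com/somelink"  # placeholder
session = requests.Session()
session.auth = ("username", "password")  # sent with every request, so one login

soup = BeautifulSoup(session.get(BASE_URL).text, "html.parser")
for a in soup.find_all("a", href=True):
    if a["href"].lower().endswith((".pdf", ".tif")):
        file_url = urljoin(BASE_URL, a["href"])
        with open(os.path.basename(a["href"]), "wb") as f:
            f.write(session.get(file_url).content)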
I'm playing with Python for some crawling. I know urllib.urlopen("http://XXXX") can get me the HTML of a target website. However, the links to the original images on that page usually make the images unavailable in the backed-up copy. I'm wondering whether there is a way to also save the images locally, so the full content of the website can be read without an internet connection. It's like backing up the whole webpage, but I'm not sure whether there's any way to do that in Python. Also, if it could strip out the advertisements, that would be even better. Thanks.
If you're looking to backup a single webpage, you're well on your way.
Since you mention crawling, if you want to backup an entire website, you'll need to do some real crawling and you'll need scrapy for that.
There are several ways of downloading files off the interwebs, just see these questions:
Python File Download
How to download a file in python
Automate file download from http using python
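For the image-saving part the question asks about, here's a minimal single-page sketch (assuming the requests and beautifulsoup4 packages; the URL is a placeholder):

import os
import requests
from urllib.parse import urljoin
from bs4 import BeautifulSoup

URL = "http://XXXX"  # placeholder, as in the question
soup = BeautifulSoup(requests.get(URL).text, "html.parser")

os.makedirs("backup", exist_ok=True)
for img in soup.find_all("img", src=True):
    src = urljoin(URL, img["src"])
    name = os.path.basename(src.split("?")[0]) or "image"
    with open(os.path.join("backup", name), "wb") as f:
        f.write(requests.get(src).content)
    img["src"] = name  # point the tag at the local copy

with open(os.path.join("backup", "page.html"), "w", encoding="utf-8") as f:
    f.write(str(soup))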
Hope this helps
When I upload a PDF to Google Docs (using Python's gdata library), I get a link to the document:
>>> e.GetAlternateLink().href
Out[14]: 'http://docs.google.com/a/my.dom.ain/fileview?id=<very-long-doc-id>&hl=en'
Unfortunately, using that link in an IFRAME doesn't work for me, because the PDF viewer redirects to itself, breaking out of the IFRAME.
Looking for a solution, I found this: http://googlesystem.blogspot.com/2009/09/embeddable-google-document-viewer.html - which looks very nice, but I can't find a way to use it with a document uploaded to Google Docs. Does somebody know how to do it, or whether it's possible at all?
Just for the record: I haven't found any way to force the "internal" Google PDF viewer to stay inside the iframe. And as I mentioned in the question, I found this nice standalone viewer: https://googlesystem.blogspot.com/2009/09/embeddable-google-document-viewer.html, which can be used like this:
<iframe src="https://docs.google.com/gview?url=http://infolab.stanford.edu/pub/papers/google.pdf&embedded=true" style="width:600px; height:500px;" frameborder="0"></iframe>
-- but in order to use it you have to publish your PDF to the outside world. This wouldn't be a bad solution, because a published document has a unique id that is probably harder to guess than the password to a Google Docs account. Unfortunately, even with the newest Google Docs API (version 3), there seems to be no way to publish a PDF programmatically.
In the end, I went for a mix of the standalone PDF viewer from Google and another web service that allows uploading and publishing PDFs programmatically. A bit of a half-baked solution, but it has worked well so far.
To embed PDF files from your Google Docs into your website, use the code below:
<iframe src="http://docs.google.com/gview?a=v&pid=explorer&chrome=false&api=true&embedded=true&srcid=<id of your pdf>&hl=en&embedded=true" style="width:600px; height:500px;" frameborder="0"></iframe>
Try this!
Same as other answers above...
<iframe src="https://docs.google.com/gview?url={magical url that works}"></iframe>
except the magical url that works is https://drive.google.com/uc?id=<docId>&embedded=true.
Google Drive/Docs provides a bunch of different urls:
https://drive.google.com/open?id=<docId> Share link.
https://docs.google.com/document/d/<docId>/edit Open in Google Drive.
https://docs.google.com/document/d/<docId>/view Same as 'edit' above. I think.
https://docs.google.com/document/d/<docId>/pub?embedded=true For embedding in iframe if you File -> Publish to the web...
https://drive.google.com/uc?export=download&id=<docId> Direct download link.
I stumbled across this solution after a bunch of trial and error with different links. Hope this helps!
Embedding Google Docs in iframes via the viewer is problematic in IE8 if the document isn't already cached, and it's just not equal to Scribd's much better facility, which lets you make a simple HTML page with the document embedded via the object code they supply for it. I then use that page as the source file for my iframe. It shows a print button (and also a full-screen button) right in the embedded frame. Much friendlier and more reliable for the page's visitors.
The following worked for me:
<iframe src="https://drive.google.com/viewerng/viewer?url=url_of_pdf?pid=explorer&efh=false&a=v&chrome=false&embedded=true" embedded=true></iframe>
I spent an hour on this; the following worked:
Example:
<iframe src={`https://docs.google.com/gview?url=${encodeURIComponent('http://infolab.stanford.edu/pub/papers/google.pdf')}&embedded=true`}></iframe>
Note that encodeURIComponent was needed.
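If you're assembling that src in Python rather than JSX, urllib.parse.quote with safe="" plays the role of encodeURIComponent; a small sketch (the PDF URL is the same example as above):

from urllib.parse import quote

pdf_url = "http://infolab.stanford.edu/pub/papers/google.pdf"
src = "https://docs.google.com/gview?url=" + quote(pdf_url, safe="") + "&embedded=true"
print(src)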