I'm subscribed to Skillshare, but it turns out Skillshare's UI is a huge mess and unproductive for learning. So I am looking for a way to download an entire course (a single course) at once.
I found this GitHub repo:
https://github.com/crazygroot/skillsharedownloader
And it has a Google Colab link at the bottom as well:
https://colab.research.google.com/drive/1hUUPDDql0QLul7lB8NQNaEEq-bbayEdE#scrollTo=xunEYHutBEv%2F
I'm getting the below error:
Traceback (most recent call last):
  File "/root/Skillsharedownloader/ss.py", line 11, in <module>
    dl.download_course_by_url(course_url)
  File "/root/Skillsharedownloader/downloader.py", line 34, in download_course_by_url
    raise Exception('Failed to parse class ID from URL')
Exception: Failed to parse class ID from URL
This is the course link that I'm using:
https://www.skillshare.com/en/classes/React-JS-Learn-by-examples/1186943986/
I have encountered a similar issue. The problem was that the downloader requires the URL to match https://www.skillshare.com/classes/.*?/(\d+).
If you're copying the URL from the address bar, check it again and make sure it has that format. Your current one looks like https://www.skillshare.com/en/classes/xxxxx, so simply remove the /en.
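If you'd rather fix it in code, here is a minimal sketch (not part of the downloader itself; the helper name is mine) that strips the locale segment so the URL matches the pattern above:

import re

def normalize_skillshare_url(url):
    # Drop a locale segment such as /en/ so the URL matches
    # https://www.skillshare.com/classes/.*?/(\d+)
    return re.sub(r'skillshare\.com/[a-z]{2}(-[a-z]{2})?/classes/',
                  'skillshare.com/classes/', url)

print(normalize_skillshare_url(
    'https://www.skillshare.com/en/classes/React-JS-Learn-by-examples/1186943986/'))
# -> https://www.skillshare.com/classes/React-JS-Learn-by-examples/1186943986/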
I'm using the facebook_scraper library to get posts from a Facebook page with this code:
from facebook_scraper import get_posts

for post in get_posts('ThaiPBSFan', pages=50):
    print(post['text'][:100])
It works for a few posts, then errors like this:
Traceback (most recent call last):
  File ".\main.py", line 2, in <module>
    for post in get_posts('ThaiPBSFan', pages = 50):
  File "C:\Users\admin\AppData\Local\Programs\Python\Python37-32\lib\site-packages\facebook_scraper.py", line 75, in _get_posts
    yield _extract_post(article)
  File "C:\Users\admin\AppData\Local\Programs\Python\Python37-32\lib\site-packages\facebook_scraper.py", line 102, in _extract_post
    text, post_text, shared_text = _extract_text(article)
  File "C:\Users\admin\AppData\Local\Programs\Python\Python37-32\lib\site-packages\facebook_scraper.py", line 137, in _extract_text
    nodes = article.find('p, header')
AttributeError: 'NoneType' object has no attribute 'find'
So what's the problem, and how can I fix it?
From the traceback, it seems that facebook_scraper is not returning a valid post; this may be because there are no further posts to find on the page.
Therefore, you could use a try/except block to catch this exception, i.e.:
try:
    for post in get_posts('ThaiPBSFan', pages=50):
        print(post['text'][:100])
except AttributeError:
    print("No more posts to get")
It's not ideal, as you would preferably get a more specific exception once there are no more posts to retrieve, but it should work in your case. Be careful with the code inside your try clause: if an AttributeError is raised anywhere else, you will miss it.
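If you want to narrow the scope of the except, one option (just a sketch, not part of facebook_scraper's API) is to advance the generator manually so only the fetching step is guarded:

from facebook_scraper import get_posts

posts = get_posts('ThaiPBSFan', pages=50)
while True:
    try:
        post = next(posts)          # only the fetch is inside the try
    except (StopIteration, AttributeError):
        print("No more posts to get")
        break
    print(post['text'][:100])       # errors in your own code are not swallowed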
I had the same issue, but only when using the most recent version of the package (0.1.12). Try an older version of the package; for example, version 0.1.4 worked well for me. To install it, run:
pip install facebook_scraper==0.1.4
in your terminal.
I have already seen the examples on here of using Python's os library to get a local file's timestamp by passing it a local path (i.e. /var/www/html/etc.../filename.txt), but when I try to pass getmtime a link, it cannot process it.
Here is what the code looks like:
import os
print(os.path.getmtime('https://www.sec.gov/Archives/edgar/data/1474439/000169655519000022/xslF345X03/wf-form4_156772823294389.xml'))
Here is the error I get:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python3.7/genericpath.py", line 55, in getmtime
    return os.stat(filename).st_mtime
FileNotFoundError: [Errno 2] No such file or directory: 'https://www.sec.gov/Archives/edgar/data/1474439/000169655519000022/xslF345X03/wf-form4_156772823294389.xml'
I know that this link exists.
So it obviously doesn't like me passing it a link. Is there another function I can pass a link to in order to get the last modification time of a remote file?
A URL is not necessarily a file. You can ask the remote server to tell you about the link, and the remote server may or may not provide a Last-Modified header, at its discretion. It could also lie, if so instructed. To do this, you need to make an HTTP request; the easiest way to do that from Python is the nice requests library.
import requests
import dateutil.parser

url = 'https://www.sec.gov/Archives/edgar/data/1474439/000169655519000022/xslF345X03/wf-form4_156772823294389.xml'

# A HEAD request fetches only the headers, not the body
response = requests.head(url)
last_modified = response.headers.get('Last-Modified')
if last_modified:
    # Parse the HTTP date string into a datetime object
    last_modified = dateutil.parser.parse(last_modified)
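If you want something shaped like os.path.getmtime, here is a small wrapper (the helper name is mine, just a sketch) that returns a Unix timestamp, or None when the server doesn't send the header:

import requests
import dateutil.parser

def remote_getmtime(url):
    # Last-Modified time of a remote resource as a Unix timestamp, or None
    response = requests.head(url)
    last_modified = response.headers.get('Last-Modified')
    if not last_modified:
        return None
    return dateutil.parser.parse(last_modified).timestamp()

print(remote_getmtime('https://www.sec.gov/Archives/edgar/data/1474439/000169655519000022/xslF345X03/wf-form4_156772823294389.xml'))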
I am trying to use the Google Drive API to download publicly available files; however, whenever I try to proceed I get an error.
For reference, I have successfully set up OAuth2 such that I have a client ID, a client secret, and a redirect URI; however, when I try setting it up I get an error saying the object has no attribute urlencode.
>>> from apiclient.discovery import build
>>> from oauth2client.client import OAuth2WebServerFlow
>>> flow = OAuth2WebServerFlow(client_id='not_showing_client_id', client_secret='not_showing_secret_id', scope='https://www.googleapis.com/auth/drive', redirect_uri='https://www.example.com/oauth2callback')
>>> auth_uri = flow.step1_get_authorize_url()
>>> code = '4/E4h7XYQXXbVNMfOqA5QzF-7gGMagHSWm__KIH6GSSU4#'
>>> credentials = flow.step2_exchange(code)
And then I get the error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Python/2.7/site-packages/oauth2client/util.py", line 137, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/Library/Python/2.7/site-packages/oauth2client/client.py", line 1980, in step2_exchange
    body = urllib.parse.urlencode(post_data)
AttributeError: 'Module_six_moves_urllib_parse' object has no attribute 'urlencode'
Any help would be appreciated. Also, would someone mind enlightening me as to how I instantiate a drive_file? According to https://developers.google.com/drive/web/manage-downloads, I need to instantiate one and I am unsure of how to do so.
Edit: So I figured out why I was getting the error above. If anyone else is having the same problem, try running:
sudo pip install -I google-api-python-client==1.3.2
However, I am still unclear about the drive instance, so any help with that would be appreciated.
Edit 2: Okay, so I figured out the answer to my whole question. The drive instance is just the metadata that results when we use the API to search for a file based on its ID.
So, as I said in my edits, try the sudo pip install; a file instance is just a dictionary of metadata.
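For anyone following along, here is a rough sketch of what that looks like with the old v2 client (assuming credentials is the object returned by step2_exchange above; the file ID is a placeholder):

import httplib2
from apiclient.discovery import build

# Authorize an HTTP object with the credentials from the OAuth2 flow
http = credentials.authorize(httplib2.Http())
service = build('drive', 'v2', http=http)

file_id = '0BwwA4oUTeiV1UVNwOHItT0xfa2M'  # placeholder ID
drive_file = service.files().get(fileId=file_id).execute()  # a dict of metadata
print(drive_file.get('title'), drive_file.get('downloadUrl'))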
I keep getting this exception from TweetStream 1.1.1: AuthenticationError("Access denied") (exception.code == 404). It worked last week and now it doesn't. I have tried different usernames and passwords. I can log into Twitter with my account information. I even deleted and reinstalled the module. What gives? Thanks for the help!
I try running this...
import tweetstream

stream = tweetstream.SampleStream("MY_USERNAME", "MY_PASSWORD")
for tweet in stream:
    print tweet
The error actually looks like this:
Traceback (most recent call last):
  File "<pyshell#28>", line 1, in <module>
    for tweet in stream:
  File "C:\Python27\lib\site-packages\tweetstream-1.1.1-py2.7.egg\tweetstream\streamclasses.py", line 165, in __iter__
    self._init_conn()
  File "C:\Python27\lib\site-packages\tweetstream-1.1.1-py2.7.egg\tweetstream\streamclasses.py", line 103, in _init_conn
    raise AuthenticationError("Access denied")
AuthenticationError: Access denied
Twitter released the next version of its API (1.1), and tweetstream doesn't support it yet. See the relevant issue on the tweetstream project issue tracker.
Had the same problem here, and I could not get the patched version mentioned on the project issue tracker (linked by @alecxe) to work either.
Twitter provides a list of libraries that should work with the newer API, at https://dev.twitter.com/docs/twitter-libraries
It lists many, including several for Python.
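For example, tweepy (one of the Python libraries on that list) can read the 1.1 streaming API with OAuth credentials from your app on dev.twitter.com. A rough sketch using the older tweepy 3.x-era API, with placeholder keys:

import tweepy

class PrintListener(tweepy.StreamListener):
    def on_status(self, status):
        print(status.text)

# Placeholder credentials from your app on dev.twitter.com
auth = tweepy.OAuthHandler('CONSUMER_KEY', 'CONSUMER_SECRET')
auth.set_access_token('ACCESS_TOKEN', 'ACCESS_TOKEN_SECRET')

stream = tweepy.Stream(auth, PrintListener())
stream.sample()  # roughly what tweetstream.SampleStream provided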
Hey guys, I am a little lost on how to get the auth token. Here is the code I am using on the return from authorizing my app:
client = gdata.service.GDataService()
gdata.alt.appengine.run_on_appengine(client)
sessionToken = gdata.auth.extract_auth_sub_token_from_url(self.request.uri)
client.UpgradeToSessionToken(sessionToken)
logging.info(client.GetAuthSubToken())
What gets logged is "None", so that doesn't seem right :-(
if I use this:
temp = client.upgrade_to_session_token(sessionToken)
logging.info(dump(temp))
I get this:
{'scopes': ['http://www.google.com/calendar/feeds/'], 'auth_header': 'AuthSub token=CNKe7drpFRDzp8uVARjD-s-wAg'}
So I can see that I am getting an AuthSub token, and I guess I could just parse that and grab the token, but that doesn't seem like the way things should work.
If I try to use AuthSubTokenInfo I get this:
Traceback (most recent call last):
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/ext/webapp/__init__.py", line 507, in __call__
    handler.get(*groups)
  File "controllers/indexController.py", line 47, in get
    logging.info(client.AuthSubTokenInfo())
  File "/Users/matthusby/Dropbox/appengine/projects/FBCal/gdata/service.py", line 938, in AuthSubTokenInfo
    token = self.token_store.find_token(scopes[0])
TypeError: 'NoneType' object is unsubscriptable
So it looks like my token_store is not getting filled in correctly; is that something I should be doing myself?
Also, I am using gdata 2.0.9.
Thanks
Matt
To answer my own question:
When you get the token, just call:
client.token_store.add_token(sessionToken)
and App Engine will store it in a new entity type for you. Then, when making calls to the calendar service, just don't set the AuthSub token; the token store will take care of that for you.
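Putting it together, the handler for the AuthSub return ends up looking roughly like this (a sketch against gdata 2.0.x inside the webapp handler's get(), as in the question; the event feed call is just an example):

import gdata.calendar.service
import gdata.alt.appengine
import gdata.auth

client = gdata.calendar.service.CalendarService()
gdata.alt.appengine.run_on_appengine(client)

# Pull the single-use token out of the redirect URL, upgrade it,
# and let the App Engine-backed token_store keep it
one_time_token = gdata.auth.extract_auth_sub_token_from_url(self.request.uri)
session_token = client.upgrade_to_session_token(one_time_token)
client.token_store.add_token(session_token)

# Later calls pick the stored token up automatically
feed = client.GetCalendarEventFeed()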