I want to write a Python script that corrects spelling and grammar mistakes in a text.
I found the gingerit library, but when I try to run the code from its documentation, I get an ugly error.
This is the code:
from gingerit.gingerit import GingerIt
text = 'The smelt of fliwers bring back memories.'
parser = GingerIt()
parser.parse(text)
And this is part of the error:
JSONDecodeError Traceback (most recent call last)
During handling of the above exception, another exception occurred:
JSONDecodeError: Expecting value: line 1 column 1 (char 0)
JSONDecodeError: [Errno Expecting value] <!DOCTYPE html>
I also created a fresh, empty environment and installed only this library there to avoid conflicts, but whatever I do, I can't get this code to run.
I had the same issue and fixed it with the following solution provided by MythicalMAxX in https://github.com/Azd325/gingerit/issues/24:
"It was cloudflare antibot which was blocking request.
So you all can fix it by importing cloudscraper
and replacing line 16
session = requests.Session()
TO
session = cloudscraper.create_scraper()
In gingerit.py"
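If you prefer not to edit the installed package, the same idea can be applied from your own script by swapping the session factory before using GingerIt. This is only a sketch of that workaround; it patches requests globally and assumes gingerit still builds its session via requests.Session() as quoted above:

import cloudscraper  # pip install cloudscraper
import requests

# Make every requests.Session() call return a cloudscraper session, which can
# pass Cloudflare's antibot check; gingerit picks this up internally.
requests.Session = cloudscraper.create_scraper

from gingerit.gingerit import GingerIt

text = 'The smelt of fliwers bring back memories.'
parser = GingerIt()
print(parser.parse(text))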
I'm subscribed to Skillshare, but it turns out Skillshare's UI is a huge mess and unproductive for learning, so I'm looking for a way to download a whole course at once.
I found this GitHub repo:
https://github.com/crazygroot/skillsharedownloader
It also has a Google Colab link at the bottom:
https://colab.research.google.com/drive/1hUUPDDql0QLul7lB8NQNaEEq-bbayEdE#scrollTo=xunEYHutBEv%2F
I'm getting the below error:
Traceback (most recent call last):
File "/root/Skillsharedownloader/ss.py", line 11, in <module>
dl.download_course_by_url(course_url)
File "/root/Skillsharedownloader/downloader.py", line 34, in download_course_by_url
raise Exception('Failed to parse class ID from URL')
Exception: Failed to parse class ID from URL
This is the course link that I'm using:
https://www.skillshare.com/en/classes/React-JS-Learn-by-examples/1186943986/
I have encountered a similar issue. The problem was that the downloader requires the URL to match https://www.skillshare.com/classes/.*?/(\d+).
If you're copying the URL from the address bar, check it again and make sure it has that format. Your current one looks like https://www.skillshare.com/en/classes/xxxxx, so simply remove the /en.
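If you want to check the URL programmatically before handing it to the downloader, here is a quick sketch. The search pattern is the one from the answer above; the locale-stripping substitution is my own assumption about how the /en segment appears:

import re

url = "https://www.skillshare.com/en/classes/React-JS-Learn-by-examples/1186943986/"

# Drop a two-letter locale segment such as /en/ if present.
normalized = re.sub(r"skillshare\.com/[a-z]{2}/classes/", "skillshare.com/classes/", url)

# The format the downloader expects the URL to have.
match = re.search(r"https://www\.skillshare\.com/classes/.*?/(\d+)", normalized)
if match:
    print("class ID:", match.group(1))  # 1186943986
else:
    print("URL does not match the expected format")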
I'm trying to get some unit tests set up using the Python unittest module, but I cannot get more than one test to pass. The tests seem to run in alphabetical order, and only the first one completes successfully. I can run any one test individually, though, and it passes. I'm guessing it has something to do with not having a fresh app_context each time, but I can't figure it out.
❯ ./test_api.py
.FF
======================================================================
FAIL: test_get_category_of_questions (__main__.TriviaTestCase)
Test getting a list of trivia questions by category.
----------------------------------------------------------------------
Traceback (most recent call last):
File "./test_api.py", line 73, in test_get_category_of_questions
self.assertEqual(response.status_code, 200)
AssertionError: 404 != 200
======================================================================
FAIL: test_get_one_question (__main__.TriviaTestCase)
Test getting a specific question.
----------------------------------------------------------------------
Traceback (most recent call last):
File "./test_api.py", line 35, in test_get_one_question
self.assertEqual(response.status_code, 200)
AssertionError: 404 != 200
----------------------------------------------------------------------
Ran 3 tests in 0.343s
FAILED (failures=2)
[1] 3318 exit 1 ./test_api.py
The last two 404s should be 200s. There might be a clue in the fact that these aren't even my API's expected 404s: there should be some JSON attached to the response body, but there isn't, and the mimetype is actually ['text/html']. I'm completely confused and not sure at all how to proceed. I would switch to pytest, which I've had better luck with, but this project requires the use of unittest.
https://github.com/matthew02/FSND_TriviaAPI/blob/master/backend/test_api.py
https://github.com/matthew02/FSND_TriviaAPI
I got it to work. I had to create the Flask app only once, as a class attribute, rather than creating a new app each time in setUp(). Then I created a new test_client() in setUp() and used that client to make requests. Thanks to everyone who looked into it with me.
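For anyone hitting the same thing, here is a minimal sketch of that arrangement. It assumes a create_app() application factory as in the linked repo; the import path and the test body are illustrative:

import unittest
from flaskr import create_app  # assumed factory import; adjust to your project

class TriviaTestCase(unittest.TestCase):
    # Build the Flask app once for the whole test case, not once per test.
    app = create_app()

    def setUp(self):
        # A fresh test client for every test keeps requests isolated.
        self.client = self.app.test_client()

    def test_get_default_page_of_questions(self):
        response = self.client.get('/questions')
        self.assertEqual(response.status_code, 200)

if __name__ == '__main__':
    unittest.main()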
Looking at the documentation here:
https://flask.palletsprojects.com/en/1.1.x/testing/
https://flask.palletsprojects.com/en/1.1.x/api/
It seems that you must either close your connection or wrap your test_client() in a with block so that its cleanup is handled for you. Something like:
with self.app.test_client() as client:
    response = client.get('/questions')
    print(f'test_get_default_page_of_questions response is {response}')
    self.assertEqual(response.status_code, 200)
I'm using the facebook_scraper library to get posts from a Facebook page with this code:
from facebook_scraper import get_posts
for post in get_posts('ThaiPBSFan', pages=50):
    print(post['text'][:100])
It works for a few posts, then fails with an error like this:
Traceback (most recent call last):
File ".\main.py", line 2, in <module>
for post in get_posts('ThaiPBSFan', pages = 50):
File "C:\Users\admin\AppData\Local\Programs\Python\Python37-32\lib\site-packages\facebook_scraper.py", line 75, in _get_posts
yield _extract_post(article)
File "C:\Users\admin\AppData\Local\Programs\Python\Python37-32\lib\site-packages\facebook_scraper.py", line 102, in _extract_post
text, post_text, shared_text = _extract_text(article)
File "C:\Users\admin\AppData\Local\Programs\Python\Python37-32\lib\site-packages\facebook_scraper.py", line 137, in _extract_text
nodes = article.find('p, header')
AttributeError: 'NoneType' object has no attribute 'find'
So what's the problem, and how can I fix it?
From the traceback, it seems that facebook_scraper is not returning a valid post; this may be because there are no further posts to find on the page.
Therefore, you could use a try/except block to catch this exception, i.e.:
try:
    for post in get_posts('ThaiPBSFan', pages=50):
        print(post['text'][:100])
except AttributeError:
    print("No more posts to get")
It's not ideal, since you would preferably get a more specific exception once there are no more posts to retrieve, but it should work in your case. Be careful with the code inside your try clause: if an AttributeError is raised anywhere else, you will miss it.
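If you want to keep the try clause narrow, one option is to advance the generator manually so that an AttributeError raised by your own code in the loop body is not swallowed. A sketch:

from facebook_scraper import get_posts

posts = get_posts('ThaiPBSFan', pages=50)
while True:
    try:
        # Only the scraper call sits inside the try block.
        post = next(posts)
    except (StopIteration, AttributeError):
        # AttributeError here means the scraper ran out of parsable posts.
        break
    print(post['text'][:100])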
I had the same issue, but only when using the most recent version of the package (0.1.12). Try an older version of the package; for example, version 0.1.4 worked well for me. To install it, run:
pip install facebook_scraper==0.1.4
in your terminal.
I have an error which I have a little difficulty understanding. I have a script that uses Biopython to query a database. Sometimes Biopython can't find what we're looking for, and an HTTPError is thrown. I cannot, however, catch the HTTPError, as I get the following error message:
HTTPError: HTTP Error 404: Not Found
During handling of the above exception, another exception occurred:
NameError Traceback (most recent call last)
51 UniProt = text[index+9:index+15]
52 uniprot_IDs[bigg_ID] = UniProt
---> 53 except HTTPError:
54 if err.code == '404':
55 uniprot_IDs[biGG_ID] = None
NameError: name 'HTTPError' is not defined
How can an error which is not defined be thrown in the first place? What am I missing?
This is the relevant code:
from Bio.KEGG import REST, Enzyme
from DataTreatment import openJson, write
...
try:
    ec_number = some_string
    text = REST.kegg_get('ec:'+ec_number).read()
    ...
except HTTPError:
    if err.code == '404':
        a_dict[a_key] = None
You need to import the HTTPError class. If you have already imported it, make sure you imported the right one. You can also catch a generic Exception first and use type(ex) to find out which class it actually is, then import and catch that type.
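A quick sketch of that debugging step, reusing the names from the question (the EC number is just an illustrative value):

from Bio.KEGG import REST

ec_number = '1.1.1.1'  # illustrative EC number
try:
    text = REST.kegg_get('ec:' + ec_number).read()
except Exception as ex:
    print(type(ex))             # e.g. <class 'urllib.error.HTTPError'>
    print(type(ex).__module__)  # shows which module to import it from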
You need to import the HTTPError class; try this.
At the top of your code, add:
from urllib.error import HTTPError
Source: Entrez._HTTPError vs. Entrez.HTTPError (via Entrez.efetch)
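With the import in place, the except clause from the question also needs to bind the exception, since err is never defined otherwise. A sketch reusing the question's names, where a_dict and a_key are placeholders; note that err.code is an integer, not the string '404':

from urllib.error import HTTPError
from Bio.KEGG import REST

ec_number = '1.1.1.1'  # illustrative EC number
a_dict, a_key = {}, 'some_key'  # placeholders for the question's variables
try:
    text = REST.kegg_get('ec:' + ec_number).read()
    a_dict[a_key] = text
except HTTPError as err:  # bind the exception so err exists in this block
    if err.code == 404:  # err.code is an int
        a_dict[a_key] = None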
I tried to make a Python program that downloads a JPG file from a website.
I'm doing this for no reason at all, really; I just wanted to try it for fun.
Anyway, here is the code:
import urllib
a = 1
while a == 1:
    urllib.urlretrieve("http://lemerg.com/data/wallpapers/38/957049.jpg","D:\\Users\\Elias\\Desktop\\FolderName-957049.jpg")
So basically what I want it to do is to repeatedly download the same file until I close the program. Just don't ask why.
The error I get is:
Traceback (most recent call last):
urllib.urlretrieve("http://lemerg.com/data/wallpapers/38/957049.jpg","D:\Users\Elias\Desktop\FolderName-957049.jpg")
AttributeError: 'module' object has no attribute 'urlretrieve'
In Python 3, urlretrieve() lives in the urllib.request module. Do this:
from urllib import request
a = 1
while a == 1:
    request.urlretrieve("http://lemerg.com/data/wallpapers/38/957049.jpg","D:\\Users\\Elias\\Desktop\\FolderName-957049.jpg")