I got gensim to work in Google Colab by following this process:
!pip install gensim
from gensim.summarization import summarize
Then I was able to call summarize(some_text)
Now I'm trying to run the same thing in VS Code:
I've installed gensim:
pip3 install gensim
but when I run
from gensim.summarization import summarize
I get the error
Import "gensim.summarization" could not be resolvedPylancereportMissingImports
I've also tried from gensim.summarization.summarizer import summarize, with the same error. Either way, I haven't been able to call summarize(some_text) outside of Google Colab.
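Before anything else, it can help to confirm which interpreter and which gensim release VS Code is actually using, since (as the answer below explains) gensim 4.x no longer ships gensim.summarization. A minimal diagnostic sketch:
import sys
import gensim
# If this prints a different interpreter than the one you ran "pip3 install gensim" with,
# or a gensim version of 4.0 or newer, that explains the unresolved import.
print(sys.executable)
print(gensim.__version__)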
The summarization code was removed from Gensim 4.0. See:
https://github.com/RaRe-Technologies/gensim/wiki/Migrating-from-Gensim-3.x-to-4#12-removed-gensimsummarization
12. Removed gensim.summarization
Despite its general-sounding name, the module will not satisfy the majority of use cases in production and is likely to waste people's time. See this Github ticket for more motivation behind this.
If you need it, you could try:
installing an older gensim version (such as 3.8.3, the last official release in which it remained); or…
copying the source code out to your own local module
However, I expect you'd likely be disappointed by its inflexibility and how little it can do.
It was only extractive summarization - choosing a few key sentences from those that already exist. That only gives impressive results when the source text was already well-written in an expository style mixing high-level overview sentences with separate detail sentences. And, its method of analyzing/ranking words was very crude & hard-to-customize – totally unconnected to the more generic/configurable/swappable approaches used elsewhere in Gensim or in other text libraries.
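If you do take the pinned-version route, here is a minimal sketch of what the old API looked like, assuming gensim 3.8.3 and a longer, multi-sentence input (the sample text below is purely illustrative):
# pip3 install "gensim==3.8.3"   # last official release that still ships gensim.summarization
from gensim.summarization import summarize, keywords

# The summarizer is purely extractive and is meant for longer documents;
# gensim logs a warning for inputs shorter than about ten sentences.
some_text = (
    "Automatic summarization is the process of shortening a text document with software. "
    "Extractive methods select a subset of existing sentences to form the summary. "
    "Abstractive methods instead generate new sentences that paraphrase the source. "
    "Gensim's old summarizer implemented an extractive, TextRank-style approach. "
    "It ranked sentences by similarity and returned the highest-scoring ones verbatim. "
    "Keyword extraction in the same module worked on individual words rather than sentences."
)
print(summarize(some_text, ratio=0.4))   # keep roughly 40% of the sentences
print(keywords(some_text, words=5))      # the companion keyword extractor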
So I specifically had to install
pip3 install gensim==3.6.0
I was using gensim==4.1.0, and this function no longer works in that later version.
Related
I'm trying to build a Rubik's Cube solver using the kociemba module. I had some problems with the installation, so I downloaded it manually from GitHub: https://github.com/muodov/kociemba. Now I'm testing it, but I'm getting an error that I don't understand. Answer if you can help, thanks!
This is the test code:
import kociemba
kociemba.solve('DRLUUBFBRBLURRLRUBLRDDFDLFUFUFFDBRDUBRUFLLFDDBFLUBLRBD')
And I'm getting this output:
No module named 'kociemba.ckociembawrapper'
C:\Users\Paweł\AppData\Local\Programs\Python\Python39\lib\site-packages\kociemba\__init__.py:21: SlowContextWarning: Native version of the package is not available. We have to fallback to pure-Python implementation of the algorithm, which is usually many times slower. If performance is important to you, check official project page for a native implementation: https://github.com/muodov/kociemba
  warnings.warn("Native version of the package is not available. "
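For what it's worth, the message quoted above is a warning rather than a hard error: by its own text, the package falls back to a pure-Python implementation when the native ckociembawrapper extension is missing, so solve() should still return a result, just more slowly. A minimal sketch using the same cube string as above:
import kociemba   # emits SlowContextWarning when the compiled extension is unavailable

# The warning only means the slower pure-Python fallback is in use; the solve still runs.
solution = kociemba.solve('DRLUUBFBRBLURRLRUBLRDDFDLFUFUFFDBRDUBRUFLLFDDBFLUBLRBD')
print(solution)   # a space-separated move sequence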
I have included these 2 import statements in my views.py:
from gensim.summarization.summarizer import summarize
from gensim.summarization import keywords
However, even after I installed gensim using pip, I am getting the error:
ModuleNotFoundError: No module named 'gensim.summarization'
The summarization code was removed from Gensim 4.0. See:
https://github.com/RaRe-Technologies/gensim/wiki/Migrating-from-Gensim-3.x-to-4#12-removed-gensimsummarization
12. Removed gensim.summarization
Despite its general-sounding name, the module will not satisfy the majority of use cases in production and is likely to waste people's time. See this Github ticket for more motivation behind this.
If you need it, you could try:
installing an older gensim version; or…
copying the source code out to your own local module
However, I expect you'd likely be disappointed by its inflexibility and how little it can do.
It was only extractive summarization - choosing a few key sentences from those that already exist. That only gives impressive results when the source text was already well-written in an expository style mixing high-level overview sentences with separate detail sentences. And, its method of analyzing/ranking words was very crude & hard-to-customize – totally unconnected to the more generic/configurable/swappable approaches used elsewhere in Gensim or in other text libraries.
You can run pip freeze within your environment to confirm that gensim is actually installed there.
If it is, check the names of any modules or files in your project directory to make sure nothing is shadowing the gensim package.
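A small sketch of both checks, run from the same interpreter the project uses (purely illustrative):
import sys
import importlib.util

print(sys.executable)   # confirms which interpreter/virtualenv is actually active

spec = importlib.util.find_spec("gensim")
# If the printed path points inside your project rather than site-packages,
# a local file or folder named "gensim" is shadowing the installed package.
print(spec.origin if spec else "gensim is not installed in this environment")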
I had the same issue.
As per Gensim's GitHub changelog, the gensim.summarization module has been removed in Gensim 4.x because it was an unmaintained third-party module.
To continue using gensim.summarization, you will need to downgrade Gensim in your requirements.txt file by pinning gensim==3.8.3 or an older version.
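In other words, the pin in requirements.txt would simply look like this (3.8.3 being the last release that still includes the module):
gensim==3.8.3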
This is partially related to an already-closed question about the keras package and R in Google Colab, but I have some specific doubts about this kind of workflow.
It is known how to use R in Google Colab, and access to Colab's GPU and TPU is certainly appealing.
And although the documentation says we need to run install_keras() in R if we want to use keras on the GPU, in Google Colab it works without this step. No separate Python installation is required either.
But deep learning runs are time-consuming, and running all the code in a single notebook has its limits. Splitting the work into several notebooks, saving intermediate objects and re-using them in the next notebook, would be attractive.
This is all the more desirable because the Colab environment is ephemeral. The natural solution would be mounting Google Drive, both to read data from it and to save partial outputs to it. But mounting Google Drive appears to be restricted to Python notebooks; there are discussions proposing solutions here and here, but I was not able to implement them.
So I am really curious how R keras users (and other R users) deal with this issue when using Google Colab.
If we stick with the idea of a workflow spread across more than one notebook, a possibly related question is this one (still unanswered).
So I have tried another alternative: using a Python notebook and running R in specific cells via rpy2, as indicated here and in the other discussions I mentioned. Yes, one could ask why not just code in Python; let's set that aside and keep R.
It happens that R's keras is an API over Python's keras and needs Python to run. But I don't know why, when I try to run any keras function, even a simple
%%R
imdb<-dataset_imdb()
I get:
R[write to console]: Error: Error 1 occurred creating conda environment /root/.local/share/r-miniconda/envs/r-reticulate
I have also seen someone say that the R kernel does not see Colab's Python, like here, but I know that is not true, because R's keras works in the R kernel, and if I run the same py_config() there, I can see the Python versions.
But the point is: why, in this Python notebook using rpy2, can we not see Python?
If we run the notebook with the R kernel, all the packages requiring Python work fine without any intervention, which is strange.
I see discussions of how to install conda, like here, but I believe that should not be the way. Maybe it is related to rpy2.
I have tried a few alternatives to check for existing Python versions from inside the R cells (called with %%R), and I believe the R invoked this way cannot see Python:
%%R
library(reticulate)
py_config()
It returns the same error:
R[write to console]: Error: Error 1 occurred creating conda environment /root/.local/share/r-miniconda/envs/r-reticulate
Error: Error 1 occurred creating conda environment /root/.local/share/r-miniconda/envs/r-reticulate
So, my major question:
How do I effectively use R's keras (and other R packages that rely on Python behind the scenes) inside a Google Colab notebook with a Python kernel?
What am I missing here with rpy2?
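One thing that might be worth trying, offered here only as a hedged sketch rather than a verified fix: from the %%R cell, point reticulate at Colab's existing system Python before any keras call, so it does not try to create its own miniconda environment (the step that fails with "Error 1"). The /usr/bin/python3 path is an assumption about the Colab image:
%%R
# Untested sketch: reuse Colab's system Python instead of letting reticulate
# bootstrap a conda environment under /root/.local/share/r-miniconda.
library(reticulate)
use_python("/usr/bin/python3", required = TRUE)
py_config()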
I am trying to use the LDA Mallet model, but I am getting a "No module named 'gensim.models.wrappers'" error.
I have gensim installed, and gensim.models.LdaMulticore works properly.
The Java Development Kit (JDK) is installed.
I have already downloaded mallet-2.0.8.zip and unzipped it on the C:\ drive.
This is the code I am trying to use:
import os
from gensim.models.wrappers import LdaMallet

# Point gensim at the unzipped MALLET installation and its command-line launcher.
os.environ.update({'MALLET_HOME': r'C:/mallet-2.0.8/'})
mallet_path = r'C:/mallet-2.0.8/bin/mallet'
Does anyone know what is wrong here? Many thanks!
If you've installed the latest Gensim, 4.0.0 (as of late March, 2021), the LdaMallet model has been removed, along with a number of other tools which simply wrapped external tools/APIs.
You can see the note in the Gensim migration guide at:
https://github.com/RaRe-Technologies/gensim/wiki/Migrating-from-Gensim-3.x-to-4#15-removed-third-party-wrappers
If the use of that tool is essential to your project, you may be able to:
install an older version of Gensim, such as 3.8.3 - though of course you'd then be missing the latest fixes & optimizations on any other Gensim models you're using
extract the ldamallet.py source code from that older version & update/move it to your own code for private use - dealing with whatever issues arise
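If you go the pinned-version route, the wrapper keeps its pre-4.0 interface. A rough sketch, assuming gensim 3.8.3, the MALLET path from the question, and a toy corpus purely for illustration:
# pip3 install "gensim==3.8.3"   # last release that still includes gensim.models.wrappers
import os
from gensim.corpora import Dictionary
from gensim.models.wrappers import LdaMallet

os.environ['MALLET_HOME'] = r'C:/mallet-2.0.8/'
mallet_path = r'C:/mallet-2.0.8/bin/mallet'

# Tiny toy corpus; substitute your own tokenized documents and dictionary.
docs = [["topic", "modelling", "with", "mallet"],
        ["gensim", "wraps", "the", "mallet", "binary"]]
dictionary = Dictionary(docs)
corpus = [dictionary.doc2bow(doc) for doc in docs]

lda = LdaMallet(mallet_path, corpus=corpus, num_topics=2, id2word=dictionary)
print(lda.show_topics(num_topics=2, num_words=4))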
I had the same issue with Gensim's wrapper for MALLET but didn't want to downgrade. There is this new wrapper that seems to do the job pretty well.
https://github.com/maria-antoniak/little-mallet-wrapper/blob/master/demo.ipynb
I am writing a Python program on Google App Engine that calculates tf-idf using TfidfVectorizer from sklearn.
I have added the sklearn library and import it as:
from sklearn.feature_extraction.text import TfidfVectorizer
However, it gives me a "no module named _check_build" error, even though that module is present in the library I imported.
Note: the same code works just fine in plain Python, so there is nothing wrong with the syntax or imports; the problem only appears with GAE.
Do you know any way to solve this issue?
You can't. sklearn has a lot of C-based dependencies, and typically any module whose name starts with an underscore is a compiled binary module.
That's why you are getting the "no module named _check_build" error.
I seriously doubt you will get it to run even if you stub out some of the C libraries, unless they have pure-Python analogues.
I have done this in the past where libraries shipped C-based performance versions alongside pure-Python ones.
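For comparison, the same import plus a tiny vectorization runs fine in an ordinary local Python environment, where the compiled extensions can be loaded (a minimal sketch):
from sklearn.feature_extraction.text import TfidfVectorizer

# Works locally because scikit-learn's compiled extensions (like the one behind
# _check_build) are importable there, unlike in the classic GAE sandbox.
docs = ["the cat sat on the mat", "the dog sat on the log"]
vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(docs)
print(tfidf.shape)                      # (2, vocabulary size)
print(sorted(vectorizer.vocabulary_)[:5])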
If you are not using any GAE-specific tools, try deploying your app on Heroku.
It lets you deploy a whole virtual environment with all of your installed libraries. Specifically, scikit-learn works on Heroku just fine; check this GitHub repo for an example.