Scrapy-elasticsearch plugin issue - python

I have recently installed this plugin which is working great ...
Now my issue is that when I repopulate the ES 'index' with new data, I want to delete the existing 'index' first in ES. This is to delete old data in ES.
The above mentioned plugin contains this file scrapyelasticsearch.py where I think I can add this code
es.delete(index='my-index', doc_type='test')
to delete the index before repopulating.
The plugin will automatically recreate the index before inserting data.
Question: I couldn't find where this file (scrapyelasticsearch.py) is located. I am using Ubuntu 16.04, with ES and Scrapy also installed.
I tried this command to find this package
dpkg -l scrapyelasticsearch
but received this error
dpkg-query: no packages found matching scrapyelasticsearch
If anyone has used this plugin/package, please help me find this file scrapyelasticsearch.py
Any help is very appreciated. Thanks

The file is located in the site-packages directory of your Python installation. So if you're running the system Python (not a virtual environment), it would be something like:
/usr/lib/python3.5/site-packages/
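The exact directory differs between systems (on Debian and Ubuntu, pip often installs into a dist-packages directory instead), so it may be easier to ask Python itself. A minimal sketch, assuming the plugin's import name is scrapyelasticsearch (taken from the file name in the question); module_file is a hypothetical helper:

```python
import importlib.util


def module_file(module_name):
    """Return the file Python would load for module_name, or None if it cannot be found."""
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        return None
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    # "scrapyelasticsearch" is assumed to be the import name of the plugin.
    print(module_file("scrapyelasticsearch"))
```

Run it with the same interpreter that runs Scrapy; a result of None suggests the plugin is installed for a different Python.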
However, you should not modify files in site-packages directly!
What you should do is clone or fork the project on GitHub, make your changes to it, and install this fork on your system.
git clone https://github.com/knockrentals/scrapy-elasticsearch.git
cd scrapy-elasticsearch
your_editing_program 'scrapyelasticsearch/scrapyelasticsearch.py'
# make changes
pip uninstall scrapy-elasticsearch # uninstall old original package
pip install . # install your fork; add the -e flag so that later edits take effect without reinstalling
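As for the change itself: in elasticsearch-py, es.delete(...) removes a single document and expects an id, whereas dropping a whole index goes through the indices namespace. A hedged sketch, assuming client is an elasticsearch.Elasticsearch instance and that the index is called my-index as in the question; drop_index_if_present is a hypothetical helper, and where to call it inside scrapyelasticsearch.py depends on the plugin version:

```python
def drop_index_if_present(client, index_name):
    """Delete index_name if it exists; return True when an index was deleted."""
    # client is assumed to behave like elasticsearch.Elasticsearch:
    # client.indices.exists(index=...) and client.indices.delete(index=...).
    if client.indices.exists(index=index_name):
        client.indices.delete(index=index_name)
        return True
    return False
```

With a real client this might be called as drop_index_if_present(Elasticsearch("http://localhost:9200"), "my-index") before the spider starts writing items; the URL here is only an example.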

Related

No module named 'stix2' [duplicate]

After installing mechanize, I don't seem to be able to import it.
I have tried installing from pip, easy_install, and via python setup.py install from this repo: https://github.com/abielr/mechanize. All of this to no avail, as each time I enter my Python interactive I get:
Python 2.7.3 (default, Aug 1 2012, 05:14:39)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import mechanize
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: No module named mechanize
>>>
The installations I ran previously reported that they had completed successfully, so I expect the import to work. What could be causing this error?
In my case, it was a permission problem. The package had somehow been installed with read/write permission for root only, so other users could not access it!
I had the same problem: script with import colorama was throwing an ImportError, but sudo pip install colorama was telling me "package already installed".
My fix: run pip without sudo: pip install colorama. Then pip agreed it needed to be installed, installed it, and my script ran. Or even better, use python -m pip install <package>. The benefit of this is, since you are executing the specific version of python that you want the package in, pip will unequivocally install the package into the "right" python. Again, don't use sudo in this case... then you get the package in the right place, but possibly with (unwanted) root permissions.
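The python -m pip form can also be assembled from inside a script, so that the package goes to whichever interpreter is running that script. A minimal sketch; pip_install_command is a hypothetical helper, and colorama is just the example package from this answer:

```python
import sys


def pip_install_command(package, user=False):
    """Build a pip command that targets the interpreter running this code."""
    command = [sys.executable, "-m", "pip", "install"]
    if user:
        command.append("--user")
    command.append(package)
    return command


if __name__ == "__main__":
    # Printing instead of executing keeps the sketch free of side effects;
    # subprocess.check_call(pip_install_command("colorama")) would run it.
    print(" ".join(pip_install_command("colorama")))
```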
My environment is Ubuntu 14.04 32-bit; I think I saw this before and after I activated my virtualenv.
I was able to correct this issue with a combined approach. First, I followed Chris' advice, opened a command line and typed 'pip show packagename'.
This provided the location of the installed package.
Next, I opened Python and typed 'import sys', then 'sys.path' to show where my Python searches for any packages I import. Alas, the location shown in the first step was NOT in the list.
Final step: I typed sys.path.append('package_location_seen_in_step_1'). You can optionally repeat step two to see that the location is now in the list.
Test step: try to import the package again... it works.
The downside? It is temporary, and you need to add it to the list each time.
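The comparison made in these steps can be scripted. A minimal sketch; is_on_sys_path is a hypothetical helper, and the directory passed to it is whatever pip show printed on its Location line:

```python
import os
import sys


def is_on_sys_path(directory):
    """Return True if directory is one of the places this interpreter searches for imports."""
    target = os.path.realpath(directory)
    # An empty sys.path entry stands for the current directory.
    return any(os.path.realpath(entry or os.curdir) == target for entry in sys.path)
```

If this returns False for the location pip reports, pip and the interpreter are most likely not the same installation.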
It's a Python path problem.
In my case, Python is installed at:
/Library/Frameworks/Python.framework/Versions/2.6/bin/python
and there is no site-packages directory within that python2.6 tree.
The package (SOAPpy) I installed with pip is located in:
/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/
That site-packages directory was not on the Python path, so all I did was add it to PYTHONPATH permanently:
Open up Terminal
Type open .bash_profile
In the text file that pops up, add this line at the end:
export PYTHONPATH=$PYTHONPATH:/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/
Save the file, restart the Terminal, and you're done
The Python import mechanism works, really, so, either:
Your PYTHONPATH is wrong,
Your library is not installed where you think it is
You have another library with the same name masking this one
I have been banging my head against my monitor on this until a young-hip intern told me the secret is to "python setup.py install" inside the module directory.
For some reason, running the setup from there makes it just work.
To be clear, if your module's name is "foo":
[burnc7 (2016-06-21 15:28:49) git]# ls -l
total 1
drwxr-xr-x 7 root root 118 Jun 21 15:22 foo
[burnc7 (2016-06-21 15:28:51) git]# cd foo
[burnc7 (2016-06-21 15:28:53) foo]# ls -l
total 2
drwxr-xr-x 2 root root 93 Jun 21 15:23 foo
-rw-r--r-- 1 root root 416 May 31 12:26 setup.py
[burnc7 (2016-06-21 15:28:54) foo]# python setup.py install
<--snip-->
If you try to run setup.py from any other directory by calling out its path, you end up with a borked install.
DOES NOT WORK:
python /root/foo/setup.py install
DOES WORK:
cd /root/foo
python setup.py install
I encountered this while trying to use keyring which I installed via sudo pip install keyring. As mentioned in the other answers, it's a permissions issue in my case.
What worked for me:
Uninstalled keyring:
sudo pip uninstall keyring
I used sudo's -H option and reinstalled keyring:
sudo -H pip install keyring
In PyCharm, I fixed this issue by changing the project interpreter path.
File -> Settings -> Project -> Project Interpreter
File -> Invalidate Caches… may be required afterwards.
I couldn't get my PYTHONPATH to work properly. I realized adding export fixed the issue:
(did work)
export PYTHONPATH=$PYTHONPATH:~/test/site-packages
vs.
(did not work)
PYTHONPATH=$PYTHONPATH:~/test/site-packages
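The likely reason is that a plain assignment only sets a shell variable, which child processes such as python do not inherit unless the variable was already exported, while export places it in the environment. The effect of the environment variable can be checked from Python. A sketch, where child_sees_on_sys_path is a hypothetical helper:

```python
import os
import subprocess
import sys


def child_sees_on_sys_path(directory):
    """Start a child interpreter with PYTHONPATH set and report whether directory is on its sys.path."""
    env = dict(os.environ, PYTHONPATH=directory)
    code = "import sys; print(sys.argv[1] in sys.path)"
    result = subprocess.run(
        [sys.executable, "-c", code, directory],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip() == "True"
```

Passing an absolute path keeps the comparison simple, since the child may normalize relative entries.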
This problem can also occur with a relocated virtual environment (venv).
I had a project with a venv set up inside the root directory. Later I created a new user and decided to move the project to this user. Instead of moving only the source files and installing the dependencies freshly, I moved the entire project along with the venv folder to the new user.
After that, the dependencies that I installed were getting added to the global site-packages folder instead of the one inside the venv, so the code running inside this env was not able to access those dependencies.
To solve this problem, just remove the venv folder and recreate it again, like so:
$ deactivate
$ rm -rf venv
$ python3 -m venv venv
$ source venv/bin/activate
$ pip install -r requirements.txt
Something that worked for me was:
python -m pip install --user {package name}
The command does not require sudo. This was tested on OSX Mojave.
In my case I had run pip install Django==1.11 and it would not import from the python interpreter.
Browsing through pip's commands I found pip show which looked like this:
> pip show Django
Name: Django
Version: 1.11
...
Location: /usr/lib/python3.4/site-packages
...
Notice the location says '3.4'. I found that the python-command was linked to python2.7
/usr/bin> ls -l python
lrwxrwxrwx 1 root root 9 Mar 14 15:48 python -> python2.7
Right next to that I found a link called python3 so I used that. You could also change the link to python3.4. That would fix it, too.
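To see which interpreter a given command really starts and where that interpreter expects its packages, something like the following can be run with each candidate (python, python3, and so on). A minimal sketch; interpreter_summary is a hypothetical helper:

```python
import sys
import sysconfig


def interpreter_summary():
    """Describe the running interpreter and the directory its pure-Python packages are installed to."""
    return {
        "executable": sys.executable,
        "version": "{}.{}".format(*sys.version_info[:2]),
        "site_packages": sysconfig.get_paths()["purelib"],
    }


if __name__ == "__main__":
    for key, value in interpreter_summary().items():
        print(key, "=", value)
```

If the version printed here differs from the one in the Location line of pip show, the package was installed for another Python.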
In my case it was a problem with a missing __init__.py file in the module that I wanted to import in a Python 2.7 environment.
Python 3.3+ has implicit namespace packages, which allow it to create a package without an __init__.py file.
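For illustration, this is roughly what marking a directory as a regular package looks like. The sketch is written for Python 3 (under 2.7 the idea is the same, but os.makedirs has no exist_ok argument); make_package and the name demo_pkg are made up for the example:

```python
import importlib
import os
import sys
import tempfile


def make_package(parent_dir, name):
    """Create parent_dir/name/__init__.py so that name is a regular package."""
    package_dir = os.path.join(parent_dir, name)
    os.makedirs(package_dir, exist_ok=True)
    init_path = os.path.join(package_dir, "__init__.py")
    with open(init_path, "a"):
        pass  # an empty file is enough to mark the directory as a package
    return init_path


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    make_package(root, "demo_pkg")  # "demo_pkg" is a made-up name
    sys.path.insert(0, root)
    print(importlib.import_module("demo_pkg").__file__)
```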
I had this problem too: the package was installed for Python 3.8.0, but VS Code was running my script with an older version (3.4).
The fix, in a terminal:
py .py
Make sure you're installing the package on the right Python Version
I had colorama installed via pip and I was getting "ImportError: No module named colorama"
So I searched with "find", found the absolute path and added it in the script like this:
import sys
sys.path.append("/usr/local/lib/python3.8/dist-packages/")
import colorama
And it worked.
I had just the same problem, and updating setuptools helped:
python3 -m pip install --upgrade pip setuptools wheel
After that, reinstall the package, and it should work fine :)
The thing is, the package is built incorrectly if setuptools is old.
If the other answers mentioned do not work for you, try deleting your pip cache and reinstalling the package. My machine runs Ubuntu14.04 and it was located under ~/.cache/pip. Deleting this folder did the trick for me.
Also, make sure that you do not confuse pip3 with pip. What I found was that package installed with pip was not working with python3 and vice-versa.
I had a similar problem (on Windows), and the root cause in my case was ANTIVIRUS software! It has an "Auto-Containment" feature that wraps the running process in some kind of virtual machine.
The symptoms: pip install somemodule works fine in one cmd-line window, but import somemodule fails when executed from another process with the error
ModuleNotFoundError: No module named 'somemodule'
In my case (an Ubuntu 20.04 VM on a Windows 10 host), I had a messy setup with many versions of Python installed and libraries installed with pip in various places on the file system. I'm referring to Python 3.8.10.
After many tests, I found a suggestion through Google (sorry, I no longer have the link). This is what I did to resolve the problem:
From a shell session on the Ubuntu 20.04 VM (in my home directory, /home/hduser in my case), I started a Jupyter Notebook session with the command "jupyter notebook".
Then, with Jupyter running, I opened a .ipynb file to run commands.
First, pip list gave me the list of installed packages, and sympy was not among them (although I had installed it with "sudo pip install sympy").
Finally, running !pip3 install sympy inside the Jupyter notebook session solved the problem.
Now !pip list shows the "sympy" package, and it works.
In my case, I assumed a package was installed because it showed up in the output of pip freeze. However, just the site-packages/*.dist-info folder is enough for pip to list it as installed despite missing the actual package contents (perhaps from an accidental deletion). This happens even when all the path settings are correct, and if you try pip install <pkg> it will say "requirement already satisfied".
The solution is to manually remove the dist-info folder so that pip realizes the package contents are missing. Then, doing a fresh install should re-populate anything that was accidentally removed
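Before deleting anything by hand, it may help to confirm that the files pip recorded are really gone. A sketch using the standard importlib.metadata module (Python 3.8+); missing_recorded_files is a hypothetical helper, and reading the RECORD file directly is a choice made for this example:

```python
import csv
import importlib.metadata
import pathlib


def missing_recorded_files(dist):
    """List files named in a distribution's RECORD that no longer exist on disk."""
    record = dist.read_text("RECORD")
    if record is None:  # for example, an egg-info install without a RECORD file
        return []
    missing = []
    for row in csv.reader(record.splitlines()):
        if row and not pathlib.Path(dist.locate_file(row[0])).exists():
            missing.append(row[0])
    return missing


if __name__ == "__main__":
    try:
        # "Django" is only an example distribution name.
        print(missing_recorded_files(importlib.metadata.distribution("Django")))
    except importlib.metadata.PackageNotFoundError:
        print("pip has no record of this package at all")
```

A long list of missing files alongside a "requirement already satisfied" message would match the situation described here.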
When you install via easy_install or pip, is it completing successfully? What is the full output? Which python installation are you using? You may need to use sudo before your installation command, if you are installing modules to a system directory (if you are using the system python installation, perhaps). There's not a lot of useful information in your question to go off of, but some tools that will probably help include:
echo $PYTHONPATH and/or echo $PATH: both are lists of directories (: delimited). When importing modules, Python searches the directories in PYTHONPATH in addition to its defaults, while PATH decides which python and pip executables your shell runs. Importing problems are often due to the right directory being absent from these lists
which python, which pip, or which easy_install: these will tell you the location of each executable. It may help to know.
Use virtualenv, like @JesseBriggs suggests. It works very well with pip to help you isolate and manage the modules and environment for separate Python projects.
I had this exact problem, but none of the answers above worked. It drove me crazy until I noticed that sys.path was different after I had imported from the parent project. It turned out that I had used importlib to write a little function in order to import a file not in the project hierarchy. Bad idea: I forgot that I had done this. Even worse, the import process mucked with the sys.path--and left it that way. Very bad idea.
The solution was to stop that, and simply put the file I needed to import into the project. Another approach would have been to put the file into its own project, as it needs to be rebuilt from time to time, and the rebuild may or may not coincide with the rebuild of the main project.
I had this problem with 2.7 and 3.5 installed on my system trying to test a telegram bot with Python-Telegram-Bot.
I couldn't get it to work after installing with pip and pip3, with sudo or without. I always got:
Traceback (most recent call last):
File "telegram.py", line 2, in <module>
from telegram.ext import Updater
File "$USER/telegram.py", line 2, in <module>
from telegram.ext import Updater
ImportError: No module named 'telegram.ext'; 'telegram' is not a package
Reading the error message carefully tells me that Python is looking in the current directory for a telegram.py. And sure enough, I had a script lying there called telegram.py, and this was what Python loaded when I called import.
Conclusion: make sure you don't have a file named after the package (here, telegram.py) in your current working directory when trying to import. (And read the error message thoroughly.)
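A quick way to check for this kind of name clash before digging further. A minimal sketch for top-level module names only; shadowing_candidates is a hypothetical helper:

```python
import os


def shadowing_candidates(module_name, directory="."):
    """Return local files in directory that would be imported instead of an installed module_name."""
    candidates = [
        module_name + ".py",                        # a stray script, e.g. telegram.py
        os.path.join(module_name, "__init__.py"),   # a local package with the same name
    ]
    return [name for name in candidates if os.path.isfile(os.path.join(directory, name))]
```

For the case above, shadowing_candidates("telegram") run from the project folder would report the local telegram.py.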
I had a similar problem using Django. In my case, I could import the module from the Django shell, but not from a .py which imported the module.
The problem was that I was running the Django server (therefore, executing the .py) from a different virtualenv from which the module had been installed.
Instead, the shell instance was being run in the correct virtualenv. Hence, why it worked.
This often happens when the module is installed for an older version of Python or into another directory. No worries, the solution is simple: import the module from the directory in which it is installed.
You can do this by first importing Python's sys module, appending the installation path to sys.path, and then importing the module:
import sys
sys.path.append("directory in which module is installed")
import <module_name>
Most of the possible cases have been already covered in solutions, just sharing my case, it happened to me that I installed a package in one environment (e.g. X) and I was importing the package in another environment (e.g. Y). So, always make sure that you're importing the package from the environment in which you installed the package.
For me it was ensuring the version of the module aligned with the version of Python I was using. I built the image on a box with Python 3.6 and then injected it into a Docker image that happened to have 3.7 installed, and then kept banging my head when Python told me the module wasn't installed...
36m for Python 3.6:
bsonnumpy.cpython-36m-x86_64-linux-gnu.so
37m for Python 3.7:
bsonnumpy.cpython-37m-x86_64-linux-gnu.so
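The suffixes a given interpreter accepts for compiled modules are exposed by the standard library, which makes this kind of mismatch easy to spot. A minimal sketch; loadable_as is a hypothetical helper:

```python
import importlib.machinery


def loadable_as(module_name, filename):
    """Return True if filename is a name under which this interpreter would load module_name as a compiled extension."""
    return any(
        filename == module_name + suffix
        for suffix in importlib.machinery.EXTENSION_SUFFIXES
    )


if __name__ == "__main__":
    print(importlib.machinery.EXTENSION_SUFFIXES)
    print(loadable_as("bsonnumpy", "bsonnumpy.cpython-36m-x86_64-linux-gnu.so"))
```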
I know this is a super old post but for me, I had an issue with a 32 bit python and 64 bit python installed. Once I uninstalled the 32 bit python, everything worked as it should.
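Whether a particular interpreter is a 32-bit or 64-bit build can be checked from the interpreter itself. A small sketch based on pointer size; interpreter_bits is a hypothetical helper:

```python
import struct


def interpreter_bits():
    """Return 32 or 64 depending on the pointer size of the running interpreter."""
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    print(interpreter_bits())
```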
I solved an issue where the same libraries worked fine in one project (A) but importing them in another project (B) caused an error. I am using PyCharm as my IDE on Windows.
After trying many potential solutions without success, I did these two things (deleted the "venv" folder and reconfigured the interpreter):
1- In project (B), there was a folder named "venv", located under External Libraries/. I deleted that folder.
2- Deleting the "venv" folder causes an error in the Python interpreter configuration: a message at the top of the screen says "Invalid python interpreter selected for the project" next to a "configure python interpreter" link. Click that link and a new window opens. In the "Project Interpreter" drop-down list, a red line shows the previous, now invalid interpreter. Open the list and select a Python interpreter (in my case, Python 3.7). Press "Apply" and "OK" at the bottom and you are good to go.
Note: The likely cause was that the virtual environment of project (B) was not recognizing the already installed and working libraries.

Download Images from google Image search not working

I'm trying to scrape images from Google Images using the google_images_download library, calling it from another Python file. I used the code below about a month ago and it was fine, but this morning it threw exception errors and then finally gave me the error
Unfortunately all 100 could not be downloaded because some images were not downloadable
I checked the documentation and the Git repo and noticed there were changes made 15 days ago. Is there something I'm missing, or is the library bugged? Also, if there are better methods than this, kindly point me in the right direction. My code is below:
from google_images_download import google_images_download
response = google_images_download.googleimagesdownload()
arguments = {"keywords":"potato harvesting","limit":100,"format":"jpg","print_urls":True}
paths = response.download(arguments)
I figured out a way to fix the error by using the CLI instead of a Jupyter Notebook file. Here are the steps:
The first issue is how to uninstall the existing installation. Since I had used python setup.py install, I had to uninstall it manually; luckily I found the question "python setup.py uninstall", where the procedure for uninstalling manually is laid out (highest-voted answer).
I then cloned the repo again into a new folder using git clone https://github.com/Joeclinton1/google-images-download.git via the CLI (prefix the command with ! only when running it from a notebook cell) and then moved into it with cd google-images-download.
After that, I reinstalled the package using pip install . and NOT python setup.py install.
Use the CLI to download images by following the repo instructions at https://google-images-download.readthedocs.io/en/latest/examples.html.
This worked for me and hopefully will work for you. Note: I cloned the repo into a new folder on the desktop for ease. The downloaded images will be in the same repo folder, i.e. google-images-download, in the Download folder.

How to diagnose conan install issue

I have some installation issues with conan
After my Ubuntu 18.04 reported "Command 'conan' not found", I guessed the Python
version was wrong. So I attempted to upgrade, with the result
$ sudo apt-get install python
python is already the newest version (2.7.15~rc1-1)
However
$ locate python
/var/lib/binfmts/python2.7
/var/lib/binfmts/python3.6
When in this state I attempted to install conan
$ pip install conan
Collecting conan
...
Successfully installed Jinja2-2.10.1 MarkupSafe-1.1.1 PyJWT-1.7.1 PyYAML-5.1.2 astroid-1.6.6 attrs-19.1.0 backports.functools-lru-cache-1.5 bottle-0.12.17 certifi-2019.6.16 chardet-3.0.4 colorama-0.4.1 conan-1.18.0 configparser-3.7.4 deprecation-2.0.6 distro-1.1.0 enum34-1.1.6 fasteners-0.15 future-0.16.0 futures-3.3.0 idna-2.8 isort-4.3.21 lazy-object-proxy-1.4.1 mccabe-0.6.1 monotonic-1.5 node-semver-0.6.1 packaging-19.1 patch-1.16 pluginbase-0.7 pygments-2.4.2 pylint-1.9.5 pyparsing-2.4.2 python-dateutil-2.8.0 requests-2.22.0 singledispatch-3.4.0.3 six-1.12.0 tqdm-4.32.2 urllib3-1.25.3 wrapt-1.11.2
then 'conan' is listed as being installed but
$ conan
Command 'conan' not found, did you mean:
That is, there is no error message or warning; the command is simply not available.
I found out that the install directory was not listed in my PATH, so I added '~/.local/bin'. Now the story continues with this error message
CMake Error at CMakeLists.txt:90 (include):
include could not find load file:
Conan
I found
https://docs.conan.io/en/latest/howtos/cmake_launch.html.
OK, I inserted the following lines in my CMakeLists.txt file:
# Download automatically, you can also just copy the conan.cmake file
if(NOT EXISTS "${CMAKE_BINARY_DIR}/conan.cmake")
message(STATUS "Downloading conan.cmake from https://github.com/conan-io/cmake-conan")
file(DOWNLOAD "https://raw.githubusercontent.com/conan-io/cmake-conan/master/conan.cmake"
"${CMAKE_BINARY_DIR}/conan.cmake")
endif()
include(${CMAKE_BINARY_DIR}/conan.cmake)
conan_cmake_run(REQUIRES Catch2/2.6.0@catchorg/stable
BASIC_SETUP)
I was also advised,
Please specify in command line CMAKE_BUILD_TYPE
(-DCMAKE_BUILD_TYPE=Release)
So I use
cmake .. -DCMAKE_BUILD_TYPE=Release
rather than
cmake ..
Still, I receive
ERROR: compiler not defined for compiler.libcxx
Please define compiler value first too
FATAL_ERROR;conan install command failed.
STATUS;Conan: Compiler GCC>=5, checking major version 7
STATUS;Conan: Checking correct version: 7
About two weeks ago I could install on another system the same project flawlessly. Can I go back somehow to that state? I expected conan to be stable, rather than alpha.
Edit 2:
I issued
conan profile new default --detect --force
The reply is
Found gcc 7
gcc>=5, using the major as version
************************* WARNING: GCC OLD ABI COMPATIBILITY ***********************
Conan detected a GCC version > 5 but has adjusted the 'compiler.libcxx' setting to
'libstdc++' for backwards compatibility.
Your compiler is likely using the new CXX11 ABI by default (libstdc++11).
(I do not really know why in the case of a new project I need backward compatibility) After that,
cmake ..
finally seems to work. I am afraid I will have further issues due to the compiler standards. For example, SystemC defaults to '98, but some other library uses feature needing '14, and now conan forces to use '11. Is there a way to handle all this centrally, specific to MY system?
Concerning the two python versions: I did not install this manually, only some other install programs did so. I do not really know why and which install script causes such doubling. BTW: Ubuntu said that V2.7 is the newest version, although V3.x is also present. I am a bit confused about these version numbers.
I simply made a new install, and did not specifically verify WHEN the second version of Python appeared. I personally do not even use Python; only some install script could have installed it.
Whether my system is specific: I do not think so. I just installed Ubuntu 18.04.2, and my primary goal was to install this SystemC related stuff. I really installed ONLY what was declared as missing. (plus livetex, git, etc.)
In the meantime 'cmake ..' terminated. Apparently, the installation by conan finished OK. However, configuring my project gives messages like
CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
SCV_INCLUDE_DIRS
The missing files are installed also by conan, using
[requires]
SystemC/2.3.3@minres/stable
SystemCVerification/2.0.1@minres/stable
doxygen_installer/1.8.15@bincrafters/stable
qt/5.12.0@bincrafters/stable
gtest/1.8.1@bincrafters/stable
flex/2.6.4@bincrafters/stable
I am using literally the same files (either my old disk connected to the bus or the new one, using the same cable). The installation made about a month ago runs fine, the new one behaves as described.
It looks like installing and using conan is too complicated for me. I wanted to simplify installation rather than complicate it.
There is a bunch of cases related to installation listed here:
https://docs.conan.io/en/latest/installation.html#known-installation-issues-with-pip
I would say Conan is installed but is not listed in your PATH. You could find Conan in your Python package folder and update your PATH with conan path:
python -m site # list your package folder
find <package folder> -name conan
echo 'export PATH="$PATH:<package folder>"' >> ~/.bashrc # single quotes keep the shell from expanding $PATH or treating < > as redirection
source ~/.bashrc
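Whether the shell can find conan, and whether a given directory is on PATH at all, can also be checked from Python. A minimal sketch; directory_on_path is a hypothetical helper, and ~/.local/bin is the directory mentioned in the question, not necessarily where pip put the script on every system:

```python
import os
import shutil


def directory_on_path(directory, path_value=None):
    """Return True if directory is one of the entries of a PATH-style string."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    target = os.path.normpath(os.path.expanduser(directory))
    entries = [
        os.path.normpath(os.path.expanduser(entry))
        for entry in path_value.split(os.pathsep)
        if entry
    ]
    return target in entries


if __name__ == "__main__":
    print("conan found at:", shutil.which("conan"))
    print("~/.local/bin on PATH:", directory_on_path("~/.local/bin"))
```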

How can I make this script run

I found this script (tutorial) on GitHub (https://github.com/amyoshino/Dash_Tutorial_Series/blob/master/ex4.py) and I am trying to run it on my local machine.
Unfortunately, I am getting an error.
I would really appreciate it if anyone could help me run this script.
Perhaps this is something easy, but I am new to coding.
Thank you!
You probably just need to pip install the dash-core-components library!
Take a look at the Dash Installation documentation. It currently recommends running these commands:
pip install dash==0.38.0 # The core dash backend
pip install dash-html-components==0.13.5 # HTML components
pip install dash-core-components==0.43.1 # Supercharged components
pip install dash-table==3.5.0 # Interactive DataTable component (new!)
pip install dash-daq==0.1.0 # DAQ components (newly open-sourced!)
For more info on using pip to install Python packages, see: Installing Packages.
If you have run those commands, and Flask still throws that error, you may be having a path/environment issue, and should provide more info in your question about your Python setup.
Also, just to give you a sense of how to interpret this error message:
It's often easiest to start at the bottom and work your way up.
Here, the bottommost message is a FileNotFound error.
The program is looking for the file in your Python37/lib/site-packages folder. That tells you it's looking for a Python package. That is the directory to which Python packages get installed when you use a tool like pip.
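To see which of these libraries the interpreter can actually find, a check along these lines may help. A sketch; missing_modules is a hypothetical helper, and the import names in the example are assumed to correspond to the pip packages listed above:

```python
import importlib.util


def missing_modules(module_names):
    """Return the names from module_names that the current interpreter cannot find."""
    missing = []
    for name in module_names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumed import names for the packages installed with pip above.
    names = ["dash", "dash_html_components", "dash_core_components", "dash_table", "dash_daq"]
    print(missing_modules(names))
```

Run it with the same python that runs the script; an empty list points toward a path or environment issue rather than a missing install.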

stuck in using Megam in Python ( nltk.classify.MaxentClassifier)

I'm using Ubuntu x64. After two days of searching all over the net, I still haven't been able to install Megam.
I've read all the information on this page: http://www.cs.utah.edu/~hal/megam/
and installed the x64 version of OCaml from http://packages.ubuntu.com/precise/ocaml
but when I want to use "megam" as a classifier in Python, it says:
"NLTK was unable to find the megam file! Use software specific
configuration paramaters or set the MEGAM environment variable."
Could anybody tell me how I can install it and make use of it in Python?
I've downloaded "ocaml-3.12.1.tar.gz", but the "make" command described in its readme doesn't work.
I've downloaded "megam_i686.opt" too, but it's not executable and I cannot run it.
Any help?
Thanks in advance.
For the future users:
megam is now available on macOS through brew:
$ brew tap homebrew/science
$ brew install megam
Use config_megam() to tell NLTK where the Megam executable is located. See: http://nltk.googlecode.com/svn/trunk/doc/api/nltk.classify.megam-module.html for details and documentation.
You also need to build MEGAM with the right 32/64-bit setting for your system. "megam_i686.opt" is for x86 IIRC, so you should compile it for 64-bit. It has been a while since I did this, but a simple build on an x64 system was all I needed. "Make doesn't work" is not very useful, though: I'm sure it gave you a few error messages? Probably paths that are not set or are read-only?
Edit: Looks like the above link is currently broken. The main Megam site can be found at:
http://www.umiacs.umd.edu/~hal/megam/
although it hasn't been updated for a while.
Answer given by Hugh Perkins, helped me resolve the issue (due to low reputation can't add a comment to that answer). After downloading the zip file (from http://thinknook.com/wp-content/uploads/2012/11/MEGAM.zip), I needed to tell python where it was, and that was done by adding it in os.environ as:
os.environ["MEGAM"] = '<<Complete path followed by file name>>/megam-64'
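Since the question mentions a downloaded binary that is not executable, it may be worth checking the file before pointing NLTK at it. A minimal sketch; megam_problems is a hypothetical helper and the path in the example is made up:

```python
import os


def megam_problems(path):
    """Return a list of reasons why path cannot be used as the MEGAM executable (empty if none were found)."""
    problems = []
    if not os.path.isfile(path):
        problems.append("not a file: " + path)
    elif not os.access(path, os.X_OK):
        problems.append("not executable (try chmod +x): " + path)
    return problems


if __name__ == "__main__":
    megam_path = os.path.expanduser("~/megam/megam-64")  # hypothetical location
    issues = megam_problems(megam_path)
    if issues:
        print("\n".join(issues))
    else:
        os.environ["MEGAM"] = megam_path  # the variable named in NLTK's error message
```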
I downloaded from http://thinknook.com/wp-content/uploads/2012/11/MEGAM.zip , which was linked from http://thinknook.com/nltk-megam-maximum-entropy-library-on-64-bit-linux-2012-11-27/ This worked ok for me, on ubuntu 14.04
I managed to get megam to run on my Docker instance running Debian 9.7 by following the steps below, which are based on the macOS install steps suggested by Jack Hong here.
apt-get install make
apt-get install ocaml-nox (or apt-get install ocaml, if you want x window support)
download source from here
unzip source creating a megam_0.92 directory
Edit the Makefile in megam_0.92 and make the following changes. (The
first change was already done in my particular instance):
WITHCLIBS=-I /usr/local/lib/ocaml/caml
WITHSTR =str.cma -cclib -lcamlstr
Save the changes and run make inside your megam_0.92 directory
add nltk.config_megam('//megam_0.92/megam') to your script and all should be well.
