I'm trying to scrape images from Google Images using the google_images_download library, calling it from another Python file. I used the code below about a month ago and it worked fine, but this morning it threw exceptions and finally gave me this error:
Unfortunately all 100 could not be downloaded because some images were not downloadable
I checked the documentation and the Git repo and noticed that changes were made 15 days ago. Is there something I'm missing, or is the library bugged? Also, if there are better methods than this, kindly point me in the right direction. My code is below:
from google_images_download import google_images_download
response = google_images_download.googleimagesdownload()
arguments = {"keywords":"potato harvesting","limit":100,"format":"jpg","print_urls":True}
paths = response.download(arguments)
I figured out a way to fix the error by using the CLI instead of a Jupyter Notebook file. Here are the steps:
The first issue is how to uninstall the dependencies. Since I had installed with python setup.py install, I had to uninstall them manually. Luckily I found "python setup.py uninstall", where the highest-voted answer lays out the manual procedure.
I then cloned the repo again into a new folder using git clone https://github.com/Joeclinton1/google-images-download.git via the CLI and changed into it with cd google-images-download.
After that, I reinstalled the package using pip install . and NOT python setup.py install.
Use the CLI to download images by following the repo instructions in https://google-images-download.readthedocs.io/en/latest/examples.html.
This worked for me and will hopefully work for you. Note: I cloned the repo into a new folder on the desktop for ease. The downloaded images will be inside the same repo folder, i.e. google-images-download, under the Download folder.
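If you want to keep the arguments dict from the question but run the tool through the CLI, the sketch below turns the dict into a command line. It assumes the CLI executable is named googleimagesdownload and that each dict key maps to a long flag of the same name (--keywords, --limit, and so on); check the repo's documentation before relying on that. The helper name build_cli_command is made up for this example.

```python
import shlex


def build_cli_command(arguments, executable="googleimagesdownload"):
    """Translate an arguments dict into a CLI argument list.

    Assumption: every key corresponds to a long flag of the same name,
    and True values are switches that take no argument.
    """
    command = [executable]
    for key, value in arguments.items():
        if value is True:
            # Boolean switch, e.g. --print_urls
            command.append(f"--{key}")
        elif value is False or value is None:
            # Switch not requested, so leave it out
            continue
        else:
            command.extend([f"--{key}", str(value)])
    return command


arguments = {
    "keywords": "potato harvesting",
    "limit": 100,
    "format": "jpg",
    "print_urls": True,
}
command = build_cli_command(arguments)

# Print a shell-safe version that can be pasted into a terminal.
print(shlex.join(command))
```

The list form can also be handed to subprocess.run(command, check=True) once the package is installed and the executable is on the PATH.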
I have a problem similar to: Kaggle API issue "Could not find kaggle.json. Make sure it's located in......"
I have the same error when I type kaggle competitions download -c spaceship-titanic
But in my case the folder ".kaggle/" is actually empty. So I assume I installed the Kaggle API incorrectly. How do I install it correctly?
Things I have tried according to https://github.com/Kaggle/kaggle-api
pip install kaggle, pip install --user kaggle, sudo pip install kaggle
The first two ran successfully, but did not create the kaggle.json file.
The third did not run; it said sudo command not found.
Sample photo for finding the accounts page
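Note that pip install kaggle never creates kaggle.json. The file is an API token that you download from your Kaggle account page ("Create New API Token") and then place in the .kaggle folder yourself. The sketch below checks whether the token is where the client looks for it. It assumes the default location ~/.kaggle/kaggle.json and the two fields username and key; the function name check_kaggle_token is made up for this example.

```python
import json
from pathlib import Path


def check_kaggle_token(config_dir=None):
    """Return a short status string describing the kaggle.json token file."""
    # Assumption: the default config folder is ~/.kaggle
    config_dir = Path(config_dir) if config_dir else Path.home() / ".kaggle"
    token_path = config_dir / "kaggle.json"

    if not token_path.is_file():
        return f"missing: {token_path}"

    try:
        token = json.loads(token_path.read_text())
    except json.JSONDecodeError:
        return f"not valid JSON: {token_path}"

    if not isinstance(token, dict):
        return f"unexpected content: {token_path}"

    # The downloaded token is expected to carry these two fields.
    missing = [field for field in ("username", "key") if field not in token]
    if missing:
        return "missing fields: " + ", ".join(missing)
    return "ok"


print(check_kaggle_token())
```

If this reports the file as missing, download the token again and move it into the folder named in the message.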
I am aware of this popular topic, however I am running into a different outcome when installing a python app using pip with git+https and python setup.py
I am building a Docker image that already contains several other Python apps, and I am trying to install this custom webhook into it.
Using git+https
RUN /venv/bin/pip install git+https://github.com/alerta/alerta-contrib.git#subdirectory=webhooks/sentry
This seems to install the webhook the right way, as the relevant endpoint is later discoverable.
What is more, when I exec into the running container and search for the relevant files, I see the following
./venv/lib/python3.7/site-packages/sentry_sdk
./venv/lib/python3.7/site-packages/__pycache__/alerta_sentry.cpython-37.pyc
./venv/lib/python3.7/site-packages/sentry_sdk-0.15.1.dist-info
./venv/lib/python3.7/site-packages/alerta_sentry.py
./venv/lib/python3.7/site-packages/alerta_sentry-5.0.0-py3.7.egg-info
In my second approach I just copy this directory locally and in my Dockerfile I do
COPY sentry /app/sentry
RUN /venv/bin/python /app/sentry/setup.py install
This does not install the webhook appropriately and what is more, in the respective container I see a different file layout
./venv/lib/python3.7/site-packages/sentry_sdk
./venv/lib/python3.7/site-packages/sentry_sdk-0.15.1.dist-info
./venv/lib/python3.7/site-packages/alerta_sentry-5.0.0-py3.7.egg
./alerta_sentry.egg-info
./dist/alerta_sentry-5.0.0-py3.7.egg
(the sentry_sdk-related files should be irrelevant)
Why does the second approach fail to install the webhook appropriately?
Should these two options yield the same result?
What finally worked is the following
RUN /venv/bin/pip install /app/sentry/
I don't know the subtle differences between these two installation modes
I did notice however that /venv/bin/python /app/sentry/setup.py install did not produce an alerta_sentry.py but only the .egg file, i.e. ./venv/lib/python3.7/site-packages/alerta_sentry-5.0.0-py3.7.egg
On the other hand, /venv/bin/pip install /app/sentry/ unpacked (?) the .egg creating the ./venv/lib/python3.7/site-packages/alerta_sentry.py
I also don't know why the second installation option (i.e. the one creating the .egg file) was not working at run time.
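To compare the two layouts without eyeballing find output, something like the sketch below can be run inside each container. It only classifies what is on disk: a plain module, an .egg (zipped file or directory), or metadata. It does not explain the failure by itself, but an .egg is only importable when it is on sys.path, so it is worth checking that too. The function name classify_entries is made up for this example, and the project prefix alerta_sentry is taken from the listings above.

```python
import sysconfig
from pathlib import Path


def classify_entries(site_packages, project="alerta_sentry"):
    """Map each entry belonging to `project` to the kind of install artifact it is."""
    kinds = {}
    for entry in sorted(Path(site_packages).iterdir()):
        if not entry.name.startswith(project):
            continue
        if entry.suffix == ".egg":
            # setup.py install can leave either a zip file or a directory
            kinds[entry.name] = "egg directory" if entry.is_dir() else "zipped egg"
        elif entry.suffix in (".egg-info", ".dist-info"):
            kinds[entry.name] = "metadata"
        elif entry.suffix == ".py":
            kinds[entry.name] = "plain module"
        else:
            kinds[entry.name] = "other"
    return kinds


# Inspect the site-packages directory of the interpreter running this script.
site_packages = Path(sysconfig.get_paths()["purelib"])
if site_packages.is_dir():
    for name, kind in classify_entries(site_packages).items():
        print(f"{kind:14} {name}")
```

Run it with /venv/bin/python so that it looks at the same site-packages directory as the app.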
I am using Google's Neuroglancer, which I downloaded from GitHub, and I am trying to run one of the example scripts it provides. One of the lines is import neuroglancer, and since I cloned the whole repo there is a neuroglancer folder with all of the required files, but I am getting the following error:
ImportError: No module named 'neuroglancer'
Is there any way I could fix this? I don't see the issue since neuroglancer is in the same file path as the python script.
If you are running the example inside the neuroglancer project, try following their guide.
Otherwise, you might want to install the Neuroglancer package using something like pip and a virtual environment, so that you can import the package into your project.
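If you only want to see why the import fails, keep in mind that Python looks for a package inside the directories listed in sys.path, so the directory that contains the neuroglancer folder has to be on that list. Below is a minimal sketch of that idea. The helper name import_from_checkout and the path in the usage comment are made up, and a package that needs a build step may still fail to import this way, which is why installing it is the safer route.

```python
import importlib
import sys
from pathlib import Path


def import_from_checkout(parent_dir, module_name):
    """Import `module_name` from `parent_dir`, the folder that CONTAINS the package."""
    parent_dir = str(Path(parent_dir).resolve())
    if parent_dir not in sys.path:
        # Put the checkout first so it wins over any other copy
        sys.path.insert(0, parent_dir)
    # Make sure the import system notices directories created recently
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


# Hypothetical usage; adjust the path to wherever the package folder lives:
# neuroglancer = import_from_checkout("path/to/folder/containing/neuroglancer", "neuroglancer")
```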
I have recently installed this plugin which is working great ...
Now my issue is that when I repopulate the ES 'index' with new data, I want to delete the existing 'index' first in ES. This is to delete old data in ES.
The above mentioned plugin contains this file scrapyelasticsearch.py where I think I can add this code
es.delete(index='my-index', doc_type='test')
to delete the index before repopulating.
The plugin will automatically recreate the index before inserting data.
Question: I couldn't find where this file (scrapyelasticsearch.py) is located. I am using Ubuntu 16.04, with ES and Scrapy also installed.
I tried this command to find this package
dpkg -l scrapyelasticsearch
but received this error
dpkg-query: no packages found matching scrapyelasticsearch
If anyone has used this plugin/package, please help me find this file scrapyelasticsearch.py
Any help is very appreciated. Thanks
The file is located in the site-packages directory of your Python installation. So if you're running the system Python (not a virtual environment) it would be something like:
/usr/lib/python3.5/site-packages/
However, you should not modify files inside site-packages!
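Since the exact directory differs between systems and virtual environments, you can ask Python itself where it would load the package from. A minimal sketch, assuming the importable package name is scrapyelasticsearch (the helper name locate_module is made up for this example):

```python
import importlib.util


def locate_module(name):
    """Return the file Python would load for `name`, or None if it is not importable."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    if spec is None:
        return None
    return spec.origin


# Prints the path of the package's __init__.py, or None if it is not installed
# for the interpreter running this script.
print(locate_module("scrapyelasticsearch"))
```

Run it with the same interpreter that runs your spiders, otherwise it may report a different installation.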
What you should do is clone or fork the project on github, make your changes to it, and install this fork on your system.
git clone https://github.com/knockrentals/scrapy-elasticsearch.git
cd scrapy-elasticsearch
your_editing_program 'scrapyelasticsearch/scrapyelasticsearch.py'
# make changes
pip uninstall scrapy-elasticsearch # uninstall old original package
pip install . # install your package, you can also add -e flag for real time modifications
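As for the deletion itself: in the elasticsearch Python client, es.delete(...) removes a single document and needs an id, while index-level operations live under es.indices. Below is a minimal sketch of dropping an index before repopulating it. The function name reset_index is made up, and where you call it (inside your fork or in a small script you run before the crawl) is up to you.

```python
def reset_index(es, index_name):
    """Delete `index_name` if it exists; return True when an index was deleted.

    `es` is expected to be an elasticsearch.Elasticsearch client instance.
    """
    if es.indices.exists(index=index_name):
        es.indices.delete(index=index_name)
        return True
    return False


# Usage (requires the elasticsearch package and a running cluster):
# from elasticsearch import Elasticsearch
# es = Elasticsearch()
# reset_index(es, "my-index")
```

Calling this from a separate script before starting the crawl avoids touching the plugin at all.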
I would like to use scheduler in my python program however I haven't been able to install it.
I tried with Easy_Install and PIP (neither of which I've used before) and I can't find a link for another method. I'm using Python 2.7 on Windows Vista
Since I've never used PIP before I had to install that first. After installing pip I went to command prompt, changed to the directory with pip and typed:
C:\Python27\Scripts>pip install apscheduler
It didn't come up with an error so I assumed it installed, however when I run my python program, which includes the line: from apscheduler.scheduler import Scheduler
it states:
ImportError: No module named apscheduler.scheduler
and when I look at the list of installed modules in Idle it's not there.
It's probably something obvious since I don't have a lot of experience in programming yet.
Help would be much appreciated!
sm
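A common cause of this symptom is that pip installed the package for a different Python installation than the one IDLE runs. The sketch below prints which interpreter is running and builds a pip command that targets that same interpreter. It is written for Python 3 (importlib.util is not available on 2.7), and the helper names are made up for this example.

```python
import importlib.util
import sys


def pip_install_command(package):
    """Build a pip command that installs into the interpreter running this script."""
    return [sys.executable, "-m", "pip", "install", package]


def is_importable(name):
    """Return True if the top-level module `name` can be found by this interpreter."""
    return importlib.util.find_spec(name) is not None


print("interpreter:", sys.executable)
print("apscheduler importable:", is_importable("apscheduler"))
print("install with:", " ".join(pip_install_command("apscheduler")))
```

Run it from IDLE and from the command prompt; if the two interpreter paths differ, the package went into the other installation.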
Hi again,
I finally got it working. In the end I didn't use PIP. In case other people need help, this is what I did:
Downloaded the apscheduler tar.gz file
Downloaded 7-zip, since this can extract tar.gz files on Windows.
Extracted the tar.gz file using 7-zip. I had to do this twice: the first extraction only produced a .tar file (APScheduler-2.1.2.tar), which then had to be extracted as well.
Added C:\Python27\ to the Windows path (this is in control panel->system & maintenance->system->advanced system settings->environment variables)
(I also added C:\Python27\Scripts\ to the path, but I'm not sure whether this makes a difference.)
Opened command prompt and moved to the folder containing the extracted APScheduler files including the file named setup.py
In my case this was: C:\Python27\APScheduler\APScheduler-2.1.2\APScheduler-2.1.2\
In command prompt, typed: python setup.py install
Hopefully this was everything. Perhaps one day I'll delete everything and try again to check, but it took quite some time to get it going, so right now I think I'll leave it as is.
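The two extraction steps above can also be done without 7-zip, since Python's standard tarfile module unpacks a .tar.gz in one go. This is a minimal sketch written for Python 3. The function name extract_sdist and the file name in the usage comment are assumptions, and you should only extract archives you trust.

```python
import tarfile
from pathlib import Path


def extract_sdist(archive_path, destination):
    """Unpack a .tar.gz archive into `destination` and return the top-level names."""
    destination = Path(destination)
    destination.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions support safer extraction filters
            archive.extractall(destination, filter="data")
        else:
            archive.extractall(destination)
    return sorted(path.name for path in destination.iterdir())


# Hypothetical usage; the archive name is assumed from the version mentioned above:
# print(extract_sdist("APScheduler-2.1.2.tar.gz", "APScheduler"))
```

After that, change into the extracted folder that contains setup.py and continue with the install step.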