Requirements error while trying to deploy to Scrapy Cloud - python

I'm trying to deploy my spider to Scrapy Cloud using shub, but I keep running into the following error:
$ shub deploy
Packing version 2df64a0-master
Deploying to Scrapy Cloud project "164526"
Deploy log last 30 lines:
---> Using cache
---> 55d64858a2f3
Step 11 : RUN mkdir /app/python && chown nobody:nogroup /app/python
---> Using cache
---> 2ae4ff90489a
Step 12 : RUN sudo -u nobody -E PYTHONUSERBASE=$PYTHONUSERBASE pip install --user --no-cache-dir -r /app/requirements.txt
---> Using cache
---> 51f233d54a01
Step 13 : COPY *.egg /app/
---> e2aa1fc31f89
Removing intermediate container 5f0a6cb53597
Step 14 : RUN if [ -d "/app/addons_eggs" ]; then rm -f /app/*.dash-addon.egg; fi
---> Running in 3a2b2bbc1a73
---> af8905101e32
Removing intermediate container 3a2b2bbc1a73
Step 15 : ENV PATH /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
---> Running in ccffea3009a4
---> b4882513b76e
Removing intermediate container ccffea3009a4
Successfully built b4882513b76e
>>> Checking python dependencies
scrapinghub 1.9.0 has requirement six>=1.10.0, but you have six 1.7.3.
monkeylearn 0.3.5 has requirement requests>=2.8.1, but you have requests 2.3.0.
monkeylearn 0.3.5 has requirement six>=1.10.0, but you have six 1.7.3.
hubstorage 0.23.6 has requirement six>=1.10.0, but you have six 1.7.3.
Warning: Pip checks failed, please fix the conflicts.
Process terminated with exit code 1, signal None, status=0x0100
{"message": "Dependencies check exit code: 193", "details": "Pip checks failed, please fix the conflicts", "error": "requirements_error"}
{"message": "Requirements error", "status": "error"}
Deploy log location: /var/folders/w0/5w7rddxn28l2ywk5m6jwp7380000gn/T/shub_deploy_xi_w3xx8.log
Error: Deploy failed: b'{"message": "Requirements error", "status": "error"}'
It looks like a simple problem of an outdated package (six). However, the locally installed package actually IS up to date:
$ pip show six
Name: six
Version: 1.10.0
Summary: Python 2 and 3 compatibility utilities
Home-page: http://pypi.python.org/pypi/six/
Author: Benjamin Peterson
Author-email: benjamin@python.org
License: MIT
Location: /Users/mac/.pyenv/versions/3.6.0/lib/python3.6/site-packages
Requires:
I'm running Python 3.6 through pyenv on a Mac.
Any ideas?
EDIT:
my requirements.txt file only contains the following dependency:
newspaper==0.0.9.8
EDIT 2: scrapinghub.yml
projects:
  default: 164526
requirements_file: requirements.txt
Thanks,
Simon!

Managed to solve this (with help from scrapinghub's support forum) by adding the following code to scrapinghub.yml:
stacks:
  default: scrapy:1.3-py3
and changing requirements.txt to use the python3 branch of newspaper:
newspaper3k==0.1.9
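The "but you have six 1.7.3" lines come from the dependency check that runs after the image build in the deploy log, so they appear to describe the packages in the Scrapy Cloud image rather than the local pyenv environment. That would explain why pip show reports 1.10.0 locally. As a rough sketch (the function names are made up for illustration, and the line format is assumed to match the log above), the conflicts can be pulled out of a saved deploy log like this:

```python
import re

# Matches lines such as:
#   scrapinghub 1.9.0 has requirement six>=1.10.0, but you have six 1.7.3.
CONFLICT_RE = re.compile(
    r"^(?P<package>\S+) (?P<version>\S+) has requirement (?P<requirement>\S+), "
    r"but you have (?P<installed_name>\S+) (?P<installed_version>\S+?)\.$"
)


def parse_conflicts(log_text):
    """Return one dict per conflict line found in the deploy log text."""
    conflicts = []
    for line in log_text.splitlines():
        match = CONFLICT_RE.match(line.strip())
        if match:
            conflicts.append(match.groupdict())
    return conflicts


def outdated_packages(conflicts):
    """Group the violated specifiers by (installed package, installed version)."""
    outdated = {}
    for conflict in conflicts:
        key = (conflict["installed_name"], conflict["installed_version"])
        outdated.setdefault(key, set()).add(conflict["requirement"])
    return outdated
```

Passing the text of the shub_deploy_*.log file to parse_conflicts and then to outdated_packages gives a short list of the packages the image has pinned too low, which is easier to compare against the chosen stack than the raw log.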

Related

Torch installment fails in Elastic Beanstalk web application

I am trying to deploy a basic web application using Elastic Beanstalk from AWS.
My app is written in Python and uses the PyTorch library so that it can load an NLP model named "bart-cnn-large", which I use for text summarization.
I have a file named requirements.txt, which the EC2 instance uses to set up the virtual environment.
However, it always fails when trying to install the PyTorch library.
If I remove "torch" from requirements.txt, the installation no longer fails,
but I get this message:
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models
won't be available and only tokenizers, configuration and file/data
utilities can be used.
If I leave "torch" in requirements I get this message:
2021/08/06 10:54:23.688955 [ERROR] An error occurred during execution
of command [app-deploy] - [InstallDependency]. Stop running the
command. Error: fail to install dependencies with requirements.txt
file with error Command /bin/sh -c
/var/app/venv/staging-LQM1lest/bin/pip install -r requirements.txt
failed with error exit status 1. Stderr:ERROR: Invalid requirement:
'torch==1.9.0+cu102 torchvision==0.10.0+cu102 torchaudio===0.9.0'
(from line 1 of requirements.txt)
The requirements.txt content:
torch==1.9.0+cu102 torchvision==0.10.0+cu102 torchaudio===0.9.0 -f https://download.pytorch.org/whl/torch_stable.html
Flask~=2.0.1
Werkzeug~=2.0.1
tika~=1.24
beautifulsoup4~=4.8.2
docx2txt~=0.8
transformers~=4.8.2
clean-text
I tried several versions of torch with pip, but none seems to work.
Is this a storage problem? Why won't it install?
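The "Invalid requirement" error quotes all three packages as a single requirement from line 1. A requirements file takes one requirement per line, with options such as -f on a line of their own, so that first line cannot be parsed as written. Here is a minimal sketch of splitting such a line. The helper name is hypothetical, and it assumes that specifiers contain no spaces and that every option takes exactly one argument. Whether the resulting wheels then install on the instance is a separate question that this does not address.

```python
def split_requirement_line(line):
    """Split a whitespace-separated run of requirements into separate lines.

    Assumes specifiers contain no spaces (e.g. "torch==1.9.0+cu102") and that
    every option starting with "-" takes exactly one argument (e.g. "-f URL").
    """
    tokens = line.split()
    lines = []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token.startswith("-") and i + 1 < len(tokens):
            # Keep the option and its argument together on one line.
            lines.append(f"{token} {tokens[i + 1]}")
            i += 2
        else:
            # Anything else is treated as a single requirement.
            lines.append(token)
            i += 1
    return lines


if __name__ == "__main__":
    first_line = (
        "torch==1.9.0+cu102 torchvision==0.10.0+cu102 torchaudio===0.9.0 "
        "-f https://download.pytorch.org/whl/torch_stable.html"
    )
    print("\n".join(split_requirement_line(first_line)))
```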

scrapyd-deploy error: pkg_resources.DistributionNotFound

I have been trying for a long time to find a solution to the scrapyd error message: pkg_resources.DistributionNotFound: The 'idna<3,>=2.5' distribution was not found and is required by requests
What I have done:
$ docker pull ceroic/scrapyd
$ docker build -t scrapyd .
Dockerfile:
FROM ceroic/scrapyd
RUN pip install "idna==2.5"
$ docker build -t scrapyd .
Sending build context to Docker daemon 119.3kB
Step 1/2 : FROM ceroic/scrapyd
---> 868dca3c4d94
Step 2/2 : RUN pip install "idna==2.5"
---> Running in c0b6f6f73cf1
Downloading/unpacking idna==2.5
Installing collected packages: idna
Successfully installed idna
Cleaning up...
Removing intermediate container c0b6f6f73cf1
---> 849200286b7a
Successfully built 849200286b7a
Successfully tagged scrapyd:latest
I run the container:
$ docker run -d -p 6800:6800 scrapyd
Next:
scrapyd-deploy demo -p tutorial
And get error:
pkg_resources.DistributionNotFound: The 'idna<3,>=2.5' distribution was not found and is required by requests
I'm not a Docker expert, and I don't understand the logic. If idna==2.5 has been successfully installed inside the container, why does the error message require version 'idna<3,>=2.5'?
The answer is very simple, and my three days of torment are over. When I run
scrapyd-deploy demo -p tutorial
I am running it outside the container I created, not inside it.
The problem was solved by:
pip uninstall idna
pip install "idna == 2.5"
This had to be done on the virtual server itself, not in the container. I can't believe I didn't understand it right away.
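Since the fix depends on which environment scrapyd-deploy actually runs in, it can help to ask that environment's Python what it has installed. The sketch below uses importlib.metadata from the standard library (Python 3.8+) rather than the pkg_resources module named in the error. The helper names are made up for illustration, and the version comparison assumes plain dotted numeric versions.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string, or None if the distribution is missing."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


def version_tuple(text):
    """Turn "2.5" into (2, 5); assumes a plain dotted numeric version."""
    return tuple(int(part) for part in text.split("."))


def satisfies_idna_range(text):
    """Check the range from the error message: idna>=2.5,<3."""
    return (2, 5) <= version_tuple(text) < (3,)


if __name__ == "__main__":
    found = installed_version("idna")
    if found is None:
        print("idna is not installed in this environment")
    else:
        print(f"idna {found}, within required range: {satisfies_idna_range(found)}")
```

Run it with the same interpreter that runs scrapyd-deploy (on the host, in this case) to see whether that environment, rather than the container, has a suitable idna.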

Django App Deployment to Azure Failing Via Three Methods (problems creating/accessing tmp files)

I have been trying to upload the code from my Django app to Azure. Using multiple methods, it has been failing, with the log messages shown below. The failure seems to relate to the creation of temporary directories by Oryx build. To be transparent, I'm new to both Django and Azure.
First, I worked through the Azure tutorial (link below) and was able to get it working with minimal issues.
https://learn.microsoft.com/en-us/azure/app-service/tutorial-python-postgresql-app?tabs=bash%2Cclone
When I attempted to follow the same steps (using "webapp up") with my own app, I get the following error message:
Zip deployment failed. {'id': 'XXXXX',
'status': 3, 'status_text': '', 'author_email': 'N/A', 'author':
'N/A', 'deployer': 'Push-Deployer', 'message': 'Created via a push
deployment', 'progress': '', 'received_time':
'2020-10-21T13:59:11.0137785Z', 'start_time':
'2020-10-21T13:59:11.3572791Z', 'end_time':
'2020-10-21T13:59:34.2598809Z', 'last_success_end_time': None,
'complete': True, 'active': False, 'is_temp': False, 'is_readonly':
True, 'url':
'https://att-informativeness-task.scm.azurewebsites.net/api/deployments/latest',
'log_url':
'https://att-informativeness-task.scm.azurewebsites.net/api/deployments/latest/log',
'site_name': 'att-informativeness-task'}. Please run the command az
webapp log deployment show -n att-informativeness-task -g DjangoPostgres-attInform-rg
Running the command they recommend,
az webapp log deployment show -n att-informativeness-task -g DjangoPostgres-attInform-rg
I get:
[
  {
    "details_url": null,
    "id": "XXXXXX",
    "log_time": "2020-10-21T13:59:11.2061269Z",
    "message": "Updating submodules.",
    "type": 0
  },
  {
    "details_url": null,
    "id": "XXXXXX",
    "log_time": "2020-10-21T13:59:11.324348Z",
    "message": "Preparing deployment for commit id '53d578b78c'.",
    "type": 0
  },
  {
    "details_url": null,
    "id": "XXXXXX",
    "log_time": "2020-10-21T13:59:11.6729002Z",
    "message": "Repository path is /tmp/zipdeploy/extracted",
    "type": 0
  },
  {
    "details_url": "https://att-informativeness-task.scm.azurewebsites.net/api/deployments/53d578b78cf941b986537b13d0e6dd06/log/cb3995cc-27c2-4ca7-af9e-f0ca9f2446b7",
    "id": "XXXXXX",
    "log_time": "2020-10-21T13:59:11.8452701Z",
    "message": "Running oryx build...",
    "type": 2
  }
]
In the log file they link, it gives an error similar to that which I get from Bitbucket. As that's pasted below in more readable format, I won't paste it here.
After numerous attempts, I then tried to follow the instructions to push to an Azure directory, as described by the following link (replacing the runtime with "python:3.6").
https://learn.microsoft.com/en-us/azure/developer/javascript/tutorial-vscode-azure-cli-node-04
When running the command
git push azure master
I get the error message:
remote: ..........................
remote: Pip install requirements.
remote: Invalid requirement: 'asgiref @ file:///tmp/build/80754af9/asgiref_1602513567813/work'
remote: Traceback (most recent call last):
remote:   File "c:\home\site\wwwroot\env\lib\site-packages\pip\_internal\req\req_install.py", line 252, in from_line
remote:     req = Requirement(req)
remote:   File "c:\home\site\wwwroot\env\lib\site-packages\pip\_vendor\packaging\requirements.py", line 104, in __init__
remote:     raise InvalidRequirement("Invalid URL given")
remote: pip._vendor.packaging.requirements.InvalidRequirement: Invalid URL given
remote:
remote: You are using pip version 10.0.1, however version 20.2.4 is available.
remote: You should consider upgrading via the 'python -m pip install --upgrade pip' command.
remote: An error has occurred during web site deployment.
remote:
remote: Error - Changes committed to remote repository but deployment to website failed.
(As an aside, my pip version is 20.2.4)
Finally, I tried to link it to a Bitbucket repository, as described in the link below.
https://stories.mlh.io/deploying-a-basic-django-app-using-azure-app-services-71ec3b21db08
In the Azure deployment center, it gave a "Failed" status, with the message below.
Command: oryx build /home/site/repository -o /home/site/wwwroot --platform python --platform-version 3.7 -i /tmp/8d875cc3f8eaaa0 -p compress_virtualenv=tar-gz -p virtualenv_name=antenv --log-file /tmp/build-debug.log
Operation performed by Microsoft Oryx, https://github.com/Microsoft/Oryx
You can report issues at https://github.com/Microsoft/Oryx/issues
Oryx Version: 0.2.20200917.1, Commit: 59deb778658a124cb74ea8e2c8f39fa87abcc9d9, ReleaseTagName: 20200917.1
Build Operation ID: |65ZW7nGEe38=.2744c71e_
Repository Commit : fa4d6bc9674997d6c32b0dd6ffc32c29c4364488
Detecting platforms...
Detected following platforms:
python: 3.7.9
Warning: An outdated version of python was detected (3.7.9). Consider updating.\nVersions supported by Oryx: https://github.com/microsoft/Oryx
Using intermediate directory '/tmp/8d875cc3f8eaaa0'.
Copying files to the intermediate directory...
Done in 1 sec(s).
Source directory : /tmp/8d875cc3f8eaaa0
Destination directory: /home/site/wwwroot
Python Version: /opt/python/3.7.9/bin/python3.7
Python Virtual Environment: antenv
Creating virtual environment...
Activating virtual environment...
Running pip install...
[14:19:18+0000] Collecting appnope==0.1.0
[14:19:18+0000] Downloading appnope-0.1.0-py2.py3-none-any.whl (4.0 kB)
ERROR: Could not install packages due to an EnvironmentError: [Errno 2] No such file or directory: '/tmp/build/80754af9/asgiref_1602513567813/work'
[14:19:18+0000] Processing /tmp/build/80754af9/asgiref_1602513567813/work
WARNING: You are using pip version 20.1.1; however, version 20.2.4 is available.
You should consider upgrading via the '/tmp/8d875cc3f8eaaa0/antenv/bin/python -m pip install --upgrade pip' command.
ERROR: Could not install packages due to an EnvironmentError: [Errno 2] No such file or directory: '/tmp/build/80754af9/asgiref_1602513567813/work'\n\nWARNING: You are using pip version 20.1.1; however, version 20.2.4 is available.\nYou should consider upgrading via the '/tmp/8d875cc3f8eaaa0/antenv/bin/python -m pip install --upgrade pip' command.\n/opt/Kudu/Scripts/starter.sh oryx build /home/site/repository -o /home/site/wwwroot --platform python --platform-version 3.7 -i /tmp/8d875cc3f8eaaa0 -p compress_virtualenv=tar-gz -p virtualenv_name=antenv --log-file /tmp/build-debug.log
For all of these, it seems the issue centers on creating and accessing tmp directories. It's weird because the process worked with the Azure tutorial code. I tried to bring my own code in line with the example by the creation of files like "production.py" and "settings.txt," changing variables in my "settings.py" file, as well as following the recommendations after running,
python manage.py check --deploy
Really any suggestions would be welcome. Thanks!
Figured out what was going on. Thought I would post the answer here in case any other newbie runs into the same issue.
The problem was in the requirements.txt file. I had used the command,
pip freeze > requirements.txt
then didn't go through the results. It turns out there were a number of packages with an "@ file..." to the right of the name. Those were causing the errors. I just deleted the "@ file..." part and replaced it with a specific version for the package.
Also, make sure you specify the binary build of psycopg2, using:
psycopg2-binary==2.8.6
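pip freeze writes such entries in the form "name @ file:///path", pointing at a build directory that exists only on the machine where the package was installed. For anyone who wants to check a requirements.txt before deploying, here is a small sketch that lists those entries so they can be replaced with pinned versions. The function name is made up, and it only looks for the file:// form described here.

```python
def find_local_path_pins(requirements_text):
    """Return (line_number, package_name) for every 'name @ file://...' entry."""
    hits = []
    for number, line in enumerate(requirements_text.splitlines(), start=1):
        entry = line.strip()
        if not entry or entry.startswith("#"):
            # Skip blank lines and comments.
            continue
        name, separator, target = entry.partition("@")
        if separator and target.strip().startswith("file://"):
            hits.append((number, name.strip()))
    return hits


if __name__ == "__main__":
    with open("requirements.txt", encoding="utf-8") as handle:
        for number, name in find_local_path_pins(handle.read()):
            print(f"line {number}: {name} points at a local build path")
```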

Requirements error deploying to scrapy cloud with shub deploy

I deployed a project 3 days ago with shub deploy which ran perfectly. I just tried deploying the same code again today and it shows a requirements error like this:
Packing version c1f72fb-master
Deploying to Scrapy Cloud project "187201"
Deploy log last 30 lines:
---> 72b41733c189
Step 9 : RUN mkdir /app/python && chown nobody:nogroup /app/python
---> Using cache
---> dda1555878eb
Step 10 : RUN sudo -u nobody -E PYTHONUSERBASE=$PYTHONUSERBASE pip install --user --no-cache-dir -r /app/requirements.txt
---> Using cache
---> cccdde466280
Step 11 : COPY *.egg /app/
---> afc6b3540c92
Removing intermediate container bd3bedcee848
Step 12 : RUN if [ -d "/app/addons_eggs" ]; then rm -f /app/*.dash-addon.egg; fi
---> Running in 80461e4402dc
---> 830db9615167
Removing intermediate container 80461e4402dc
Step 13 : ENV PATH /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
---> Running in 9af6ab0fdc02
---> 0f33ba992cc7
Removing intermediate container 9af6ab0fdc02
Successfully built 0f33ba992cc7
>>> Checking python dependencies
WARNING: There're some errors when doing pip-check:
Traceback (most recent call last):
  File "/usr/local/bin/pip", line 4, in <module>
    import re
  File "/usr/local/lib/python3.6/re.py", line 142, in <module>
    class RegexFlag(enum.IntFlag):
AttributeError: module 'enum' has no attribute 'IntFlag'
{"message": "Dependencies check exit code: 1", "details": "Pip checks failed, please fix the conflicts", "error": "requirements_error"}
{"status": "error", "message": "Requirements error"}
Deploy log location: c:\users\sim04\appdata\local\temp\shub_deploy__oqwt2.log
Error: Deploy failed: {"status": "error", "message": "Requirements error"}
Make sure you followed these steps to specify your required modules correctly.
Create a file named scrapinghub.yml in your project's main folder with following contents.
projects:
  default: 111149
requirements:
  file: requirements.txt
Here 111149 is my project ID on Scrapinghub.
Create another file named requirements.txt in the same directory,
and list your required modules in it, along with the version number you are using, like so:
MySQL-python==1.2.5
PS: I was using the MySQLdb module, so that is what I listed.

Defining a correct requirements.txt file

I have developed a Flask web application that works on my local computer but which I am now trying to port onto the web (via IBM Bluemix). My first attempt to do so was unsuccessful. The error message I receive is:
Server error, status code: 400, error code: 170001, message: Staging error: no available stagers
When I check the log files with cf logs myapp --recent I find:
2015-11-08T15:34:15.92-0500 [STG/35] OUT -----> Downloaded app package (72K)
2015-11-08T15:34:19.98-0500 [STG/35] OUT -----> Downloaded app buildpack cache (39M)
2015-11-08T15:34:24.82-0500 [STG/0] OUT -------> Buildpack version 1.3.1
2015-11-08T15:34:40.57-0500 [STG/0] OUT -----> Installing dependencies with pip
2015-11-08T15:34:41.54-0500 [STG/0] OUT You are using pip version 6.1.0.dev0, however version 7.1.2 is available.
2015-11-08T15:34:41.54-0500 [STG/0] OUT You should consider upgrading via the 'pip install --upgrade pip' command.
2015-11-08T15:34:41.56-0500 [STG/0] OUT Collecting flask.ext.wtf (from -r requirements.txt (line 2))
2015-11-08T15:34:41.88-0500 [STG/0] OUT Could not find a version that satisfies the requirement flask.ext.wtf (from -r requirements.txt (line 2)) (from versions: )
2015-11-08T15:34:41.88-0500 [STG/0] OUT No matching distribution found for flask.ext.wtf (from -r requirements.txt (line 2))
2015-11-08T15:34:41.96-0500 [STG/0] OUT Staging failed: Buildpack compilation step failed
2015-11-08T15:34:41.97-0500 [STG/0] ERR
2015-11-08T15:34:42.67-0500 [API/2] ERR encountered error: App staging failed in the buildpack compile phase
2015-11-08T15:35:37.75-0500 [API/3] OUT Updated app with guid b580bb64-4415-4bb4-8fd1-1e4d3de4f7d9 ({"name"=>"cultural-insight", "memory"=>128, "environment_json"=>"PRIVATE DATA HIDDEN"})
2015-11-08T15:35:49.95-0500 [API/3] OUT Updated app with guid b580bb64-4415-4bb4-8fd1-1e4d3de4f7d9 ({"state"=>"STOPPED"})
2015-11-08T15:35:52.41-0500 [DEA/113] OUT Got staging request for app with id b580bb64-4415-4bb4-8fd1-1e4d3de4f7d9
2015-11-08T15:35:52.47-0500 [API/0] ERR exception handling first response Staging error: failed to stage application:
2015-11-08T15:35:52.47-0500 [API/0] ERR Not enough memory resources available
2015-11-08T15:50:52.42-0500 [API/0] ERR encountered error: Staging error: failed to stage application: staging had already been marked as failed, this could mean that staging took too long
The problem appears to be that pip can't find Flask-WTF, which I need for my application to work.
I installed Flask-WTF on my local machine with pip install Flask-WTF. The contents of requirements.txt, which the builder ingests while setting up, is simply:
Flask==0.10.1
Flask-WTF
In particular, I am unsure why pip is asking for flask.ext.wtf? Additionally it is troubling me that the application says it is limited to 128 MB, when I bumped it up to 512 MB.
Altogether, I'm not sure what's going on. How do I resolve this issue? The full source code is here.
Run this (edited):
pip install --upgrade pip
Then regenerate your requirements.txt file:
pip freeze > requirements.txt
