I have multiple spiders in my project folder and want to run them all at once, so I decided to run them using the scrapyd service.
I started by following the instructions here.
First, I am in the project folder.
I opened the scrapy.cfg file and uncommented the url line after
[deploy]
I ran the scrapy server command; that works fine and the scrapyd server runs.
I tried this command: scrapy deploy -l
Result: default http://localhost:6800/
When I tried the command scrapy deploy -L scrapyd, I got the following output:
Result:
Usage
=====
scrapy deploy [options] [ [target] | -l | -L <target> ]
deploy: error: Unknown target: scrapyd
When I tried to deploy the project with the command scrapy deploy scrapyd -p default, I got the following error:
Usage
=====
scrapy deploy [options] [ [target] | -l | -L <target> ]
deploy: error: Unknown target: scrapyd
I am unable to work out why scrapyd is showing the above errors. Can anyone point me to the correct way to deploy a project to scrapyd?
Thanks in advance.
Edited Code:
After seeing Peter Kirby's answer, I named the target in scrapy.cfg and ran the following command in my project folder:
command:
scrapy deploy ebsite -p ebsite
Then I got the error below:
Building egg of ebsite-1341808241
'build/lib' does not exist -- can't clean it
'build/bdist.linux-x86_64' does not exist -- can't clean it
'build/scripts-2.7' does not exist -- can't clean it
zip_safe flag not set; analyzing archive contents...
Deploying ebsite-1341808241 to http://localhost:6800/addversion.json
Deploy failed: <urlopen error [Errno 111] Connection refused>
How can I solve this?
From scrapyd service documentation: (http://scrapy.readthedocs.org/en/latest/topics/scrapyd.html?highlight=scrapyd)
You can define targets by adding them to your project’s scrapy.cfg
file... Here’s an example of defining a new target scrapyd2 with
restricted access through HTTP basic authentication:
[deploy:scrapyd2]
url = http://scrapyd.mydomain.com/api/scrapyd/
username = john
password = secret
Essentially, the error means that your target name is not correct. If I remember correctly, the scrapy.cfg file sets the initial target name to "default". What you should be typing is something like:
scrapy deploy default -p project_name
If you have no named targets and left the settings at their defaults, just type scrapy deploy.
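To see which target names a given scrapy.cfg defines, the section-naming convention described above can be sketched in Python. This is only an illustration, not Scrapy's own code: it assumes a plain [deploy] section corresponds to the target "default" and [deploy:name] to the target "name". The helper list_targets and the sample config are hypothetical.

```python
import configparser

# Hypothetical sample config combining the sections quoted in this thread.
SAMPLE_CFG = """
[settings]
default = ebsite.settings

[deploy]
url = http://localhost:6800/

[deploy:scrapyd2]
url = http://scrapyd.mydomain.com/api/scrapyd/
username = john
password = secret
"""


def list_targets(cfg_text):
    """Map deploy target names to their URLs, assuming the [deploy:<name>] convention."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    targets = {}
    for section in parser.sections():
        if section == "deploy":
            # An unnamed [deploy] section is treated as the "default" target.
            targets["default"] = parser.get(section, "url")
        elif section.startswith("deploy:"):
            targets[section.split(":", 1)[1]] = parser.get(section, "url")
    return targets


if __name__ == "__main__":
    for name, url in list_targets(SAMPLE_CFG).items():
        print(name, url)
```

With this sample, "scrapyd" is not among the names, which is consistent with the "Unknown target: scrapyd" message in the question.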
I got this error when I tried to deploy my project without scrapyd running, so simply running
scrapyd
in another terminal fixed the error.
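A quick way to confirm whether anything is listening before deploying is a plain TCP connection test. The sketch below is a generic check, not part of Scrapy or scrapyd; is_listening is a hypothetical helper, and the host and port are taken from the deploy output in the question.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "Connection refused" as well as timeouts.
        return False


if __name__ == "__main__":
    # 6800 is the port shown in the deploy output above.
    print("scrapyd reachable:", is_listening("localhost", 6800))
```

If this prints False, a "Connection refused" error from the deploy command is expected, because no server is accepting connections on that port.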
The scrapyd process does not have permission.
You need to kill the process and then, as the root user, run:
scrapy server
A new scrapyd will then start, and you can proceed as the scrapyd documentation says.
I'm trying to test a Django app in a Docker image. I followed the tutorial up to the point where it starts the Django project. When I run docker-compose run web django-admin startproject composeexample . I get
Error response from daemon: OCI runtime create failed:
container_linux.go:380: starting container process caused: exec:
"django-admin": executable file not found in $PATH: unknown
ERROR: 1
It's weird because I can run django-admin createproject ... separately. Any thoughts?
I have added ..\python39\scripts to the PATH. I also have Python 3.9.10 from the Windows Store.
I am trying to deploy a Django+Scrapy project on Ubuntu 16.04. When I run scrapyd-deploy as described in the docs, I get:
Packing version 1526639948
Deploying to project "first_scrapy" in http://my_ip/addversion.json
Deploy failed (404): <full HTML code of '404.html' page>
When I run scrapyd-deploy -l, I see:
default http://my_ip
My scrapy.cfg:
[settings]
default = first_scrapy.settings
[deploy]
url = http://my_ip
username = root
password = rootpassword
project = first_scrapy
What am I doing wrong?
UPDATE 1:
If I change the URL in my scrapy.cfg to url=http://my_ip:6800, it still throws a 404 error. Next I tried running scrapyd in a second console, and this was the first time I saw a different response - details are here.
So the question now is: how do I keep scrapyd running so that it keeps working after I close the console?
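One detail worth checking in the output above is which port the deploy request actually reaches. The following sketch only does URL arithmetic; addversion_endpoint is a hypothetical helper, and treating 6800 as scrapyd's port is an assumption taken from the other examples in this thread.

```python
from urllib.parse import urljoin, urlsplit


def addversion_endpoint(target_url):
    """Return the addversion.json URL and the TCP port a deploy request would use."""
    endpoint = urljoin(target_url.rstrip("/") + "/", "addversion.json")
    parts = urlsplit(endpoint)
    # Without an explicit port, HTTP falls back to 80 (443 for HTTPS), which is
    # typically the regular web server rather than scrapyd.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return endpoint, port


if __name__ == "__main__":
    for url in ("http://my_ip", "http://my_ip:6800"):
        print(url, "->", addversion_endpoint(url))
```

A 404 HTML page from port 80 would fit a request answered by the site's web server. As the update notes, the port alone did not resolve it here, since scrapyd also has to be running.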
Change directory into your project folder and run the scrapyd command with nohup; this ensures that scrapyd is not stopped when you disconnect from the server:
cd /path/to/your/project && nohup scrapyd >& /dev/null &
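If you would rather launch it from a script, roughly the same effect can be sketched in Python with subprocess. This is an assumption-laden illustration rather than a recommended service setup: start_detached is a hypothetical helper, start_new_session only has an effect on POSIX systems, and scrapyd is assumed to be on PATH.

```python
import subprocess
import sys


def start_detached(cmd, cwd=None):
    """Start cmd in its own session so it can outlive the launching terminal."""
    return subprocess.Popen(
        cmd,
        cwd=cwd,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,  # discard output, like redirecting to /dev/null
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # POSIX only: run the child in a new session
    )


if __name__ == "__main__":
    # Optional first argument: the project directory to run scrapyd from.
    project_dir = sys.argv[1] if len(sys.argv) > 1 else None
    proc = start_detached(["scrapyd"], cwd=project_dir)
    print("started scrapyd with pid", proc.pid)
```

The process is not restarted if it crashes or the machine reboots; a process supervisor would be needed for that.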
I have a custom user class called MyUser. It works fine locally with registrations, logins and so on. I'm trying to deploy my application to AWS Elastic Beanstalk and I'm running into problems creating my superuser.
I tried writing a script file and running it as the official AWS guide suggests. That didn't work well, so I decided to try a second method suggested here and create a custom manage.py command to create my user.
When I deploy, I get the following errors in the log:
[Instance: i-8a0a6d6e Module: AWSEBAutoScalingGroup ConfigSet: null] Command failed on instance. Return code: 1 Output: [CMD-AppDeploy/AppDeployStage0/EbExtensionPostBuild] command failed with error code 1: Error occurred during build: Command 02_createsu failed.
[2015-03-10T08:05:20.464Z] INFO [17937] : Command processor returning results:
{"status":"FAILURE","api_version":"1.0","truncated":"false","results":[{"status":"FAILURE","msg":"[CMD-AppDeploy/AppDeployStage0/EbExtensionPostBuild] command failed with error code 1: Error occurred during build: Command 02_createsu failed","returncode":1,"events":[]}]}
[2015-03-10T08:05:20.463Z] ERROR [17937] : Command execution failed: [CMD-AppDeploy/AppDeployStage0/EbExtensionPostBuild] command failed with error code 1: Error occurred during build: Command 02_createsu failed (ElasticBeanstalk::ActivityFatalError)
at /opt/elasticbeanstalk/lib/ruby/lib/ruby/gems/2.1.0/gems/beanstalk-core-1.1/lib/elasticbeanstalk/activity.rb:189:in `rescue in exec'
...
caused by: command failed with error code 1: Error occurred during build: Command 02_createsu failed (Executor::NonZeroExitStatus)
The code looks like the following:
This is my mysite.config file in .ebextensions/
01_syncdb and 03_collectstatic work fine.
container_commands:
  01_syncdb:
    command: "django-admin.py migrate --noinput"
    leader_only: true
  02_createsu:
    command: "manage.py createsu"
    leader_only: true
  03_collectstatic:
    command: "django-admin.py collectstatic --noinput"
option_settings:
  - namespace: aws:elasticbeanstalk:container:python
    option_name: WSGIPath
    value: treerating/wsgi.py
  - option_name: DJANGO_SETTINGS_MODULE
    value: treerating.settings
This is my /profiles/management/commands/createsu.py file:
from django.core.management.base import BaseCommand
from django.contrib.auth.models import User
from profiles.models import MyUser


class Command(BaseCommand):
    def handle(self, *args, **options):
        # Create the initial superuser only when no users exist yet.
        if MyUser.objects.count() == 0:
            MyUser.objects.create_superuser("admin", "treerating", "password")
And I have __init__.py files in both /management/ and /commands/ folders.
I tried this command locally from the command line and it works fine, creating the user without errors. So there shouldn't be any issue with the command itself or with MyUser.objects.create_superuser().
EDIT: I tried changing my handle() function to only set a variable to True and I still get the same errors. So it seems the problem is not related to create_superuser or to handle(), but rather to how manage.py is being invoked.
Any ideas?
EDIT 2:
I tried executing the command over SSH and it failed. I then followed the instructions in this post and set the Python paths manually with:
source /opt/python/run/venv/bin/activate
and
source /opt/python/current/env
I was then able to successfully create my user.
The official AWS Django deployment guide does not mention anything about this, but I guess you are supposed to set your Python paths in the .config file somehow. I'm not sure exactly how to do this, so if someone still wants to answer that, I will test it and accept the answer if it solves the deployment errors.
Double-check the link to your secondary method. You can set the python path in the option settings (.ebextensions/02_python.config) that you've created:
option_settings:
  "aws:elasticbeanstalk:application:environment":
    DJANGO_SETTINGS_MODULE: "iotd.settings"
    "PYTHONPATH": "/opt/python/current/app/iotd:$PYTHONPATH"
    "ALLOWED_HOSTS": ".elasticbeanstalk.com"
However, I've done this and am still experiencing the issue you've described, so you'll have to see if it fixes it.
EDIT: It turns out my issue was a file structure issue. I had the management directory in the project directory, when it should have been placed one level deeper in the directory of one of my apps.
This placed it one level deeper beneath my manage.py and settings.py than is shown in the example, but it is working fine now.
I know this may be late, but I solved this issue by placing the file /profiles/management/commands/createsu.py inside the app folder I am using.
In my case it was:
easy/easyapp/management/commands/createsu.py
where easy is my project and easyapp my app.
Another alternative that worked for me is to go directly into the config.yml file and change the WSGI path there. You can open it with the eb config command: go down 50 lines or so, make your changes, then escape and save. This is only an environment-specific solution, though.
Could you please help me figure out what I'm doing wrong? Here are the steps:
followed the portia install manual found here https://github.com/scrapinghub/portia - all ok
created a new project, entered a URL, tagged an item - all ok
clicked "continue browsing", browsed through the site, items were being extracted as expected - all ok
Next I wanted to deploy my spider:
1st try: I tried to run scrapyd-deploy your_scrapyd_target -p project_name, as the docs specify, and got an error: scrapyd wasn't installed
fix: pip install scrapyd
2nd try: I launched the scrapyd server and accessed http://localhost:6800/ - all ok
After briefly reading the scrapyd docs, I found out I had to edit the scrapy.cfg file in my project: slyd/data/projects/new_project/scrapy.cfg
I added the following:
[deploy:local]
url = http://localhost:6800/
Went back to the console and checked that all is ok:
$:> scrapyd-deploy -l
local http://localhost:6800/
$:> scrapyd-deploy -L local
default
Seemed ok, so I gave it another try:
$scrapyd-deploy local -p default
Packing version 1418722113
Deploying to project "default" in http://localhost:6800/addversion.json
Server response (200):
{"status": "error", "message": "IOError: [Errno 21] Is a directory: '/Users/Mike/www/portia/slyd/data/projects/new_project'"}
What am I missing?
For anyone who stumbles upon this issue, the fix is to deploy scrapyd in a directory other than that of the project.
See details here : https://github.com/scrapinghub/portia/issues/128
I am trying to deploy a Scrapy project, but I am getting an error.
My scrapy.cfg file is:
[settings]
default = eScraper.settings
[deploy]
url = http://localhost:8680/
project = eScraper
I used this command to deploy: scrapy deploy default -p eScraper
but got this error:
Building egg of eScraper-1369325126
'build/scripts-2.7' does not exist -- can't clean it
zip_safe flag not set; analyzing archive contents...
Deploying eScraper-1369325126 to http://localhost:8680/addversion.json
Deploy failed: <urlopen error [Errno 111] Connection refused>
I tried changing the port, but that didn't work either. I also tried running the above command with sudo, but nothing changed. Can someone help me?
If you're using the latest version of Scrapy (mine is 0.24.2), then
scrapy server
no longer exists; it was moved to a separate package called scrapyd.
Simply run
scrapyd
to start the service.
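Since which command exists depends on the installed version, a script that starts the server can pick between the two. The sketch below is only an illustration: server_command is a hypothetical helper, and the fallback to scrapy server assumes an older Scrapy where that subcommand is still available.

```python
import shutil


def server_command(which=shutil.which):
    """Prefer the standalone scrapyd executable; otherwise fall back to `scrapy server`."""
    if which("scrapyd"):
        return ["scrapyd"]
    # Assumption: an older Scrapy that still ships the `server` subcommand.
    return ["scrapy", "server"]


if __name__ == "__main__":
    print("would run:", " ".join(server_command()))
```

The which parameter exists only so the lookup can be swapped out when trying the function without either tool installed.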
Please first run the command scrapy server, and then run the deploy command in another terminal.