Unable to deploy Portia spider with scrapyd-deploy

Could you please help me figure out what I'm doing wrong? Here are the steps:
Followed the Portia install manual found here: https://github.com/scrapinghub/portia - all ok
Created a new project, entered a URL, tagged an item - all ok
Clicked "continue browsing", browsed through the site, items were being extracted as expected - all ok
Next I wanted to deploy my spider:
1st try: I ran, as the docs specified, scrapyd-deploy your_scrapyd_target -p project_name - got an error: scrapyd wasn't installed.
Fix: pip install scrapyd
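A side note for newer setups: the scrapyd-deploy command now ships in the separate scrapyd-client package, so if installing scrapyd alone doesn't provide the command, the install line would be:
# scrapyd-deploy moved to the scrapyd-client package in later releases
pip install scrapyd scrapyd-client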
2nd try: I launched the scrapyd server and accessed http://localhost:6800/ - all ok.
After a brief read of the scrapyd docs I found out I had to edit my project's scrapy.cfg file (slyd/data/projects/new_project/scrapy.cfg) and add the following:
[deploy:local]
url = http://localhost:6800/
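For reference, the name after the colon in [deploy:local] is the deploy target you pass on the command line; a second, hypothetical target would follow the same pattern:
[deploy:production]
url = http://scrapyd.example.com:6800/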
Went back to the console and checked that all is ok:
$:> scrapyd-deploy -l
local http://localhost:6800/
$:> scrapyd-deploy -L local
default
Seemed OK, so I gave it another try:
$:> scrapyd-deploy local -p default
Packing version 1418722113
Deploying to project "default" in http://localhost:6800/addversion.json
Server response (200):
{"status": "error", "message": "IOError: [Errno 21] Is a directory: '/Users/Mike/www/portia/slyd/data/projects/new_project'"}
What am I missing?

For anyone who stumbles upon this issue, the fix is to run scrapyd from a directory other than the project's own directory.
See details here: https://github.com/scrapinghub/portia/issues/128
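A minimal sketch of that fix, assuming a fresh directory for scrapyd's working files (by default it creates eggs/, dbs/, and logs/ in whatever directory it starts from):
# start scrapyd from somewhere outside the Portia project tree,
# so its working files don't collide with the project directory
mkdir -p ~/scrapyd-run && cd ~/scrapyd-run
scrapyd
Then run scrapyd-deploy from the project directory as before.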

Related

404 error when deploying Scrapy

I am trying to deploy a Django+Scrapy project on Ubuntu 16.04. When I run scrapyd-deploy, as described in the docs, I get:
Packing version 1526639948
Deploying to project "first_scrapy" in http://my_ip/addversion.json
Deploy failed (404): <full HTML code of '404.html' page>
When I run scrapyd-deploy -l, I see:
default http://my_ip
My scrapy.cfg:
[settings]
default = first_scrapy.settings
[deploy]
url = http://my_ip
username = root
password = rootpassword
project = first_scrapy
What am I doing wrong?
UPDATE 1:
If I change url to http://my_ip:6800 in my scrapy.cfg, this still throws a 404 error. Next I tried to run scrapyd in a second console, and this was the first time I saw a different response - details are here.
So the question now is: how do I run scrapyd persistently, so that it keeps working even if I close the console?
You just have to change directory into your project folder and then run the scrapyd command with nohup; that will make sure scrapyd doesn't get closed after you disconnect from the server:
cd /path/to/your/project && nohup scrapyd >& /dev/null &
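If nohup feels fragile, a systemd unit is a sturdier way to keep scrapyd running across logouts and reboots. A sketch, assuming scrapyd lives at /usr/local/bin/scrapyd and the project is in /path/to/your/project (both assumptions, adjust to your setup):
# /etc/systemd/system/scrapyd.service
[Unit]
Description=Scrapyd service
After=network.target

[Service]
WorkingDirectory=/path/to/your/project
ExecStart=/usr/local/bin/scrapyd
Restart=on-failure

[Install]
WantedBy=multi-user.target
Enable it with systemctl enable --now scrapyd.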

Running Django locally with heroku fails due to missing Procfile

I have a Django 1.11/Python 3.5 app that I built and want to run locally with Heroku. It's a simple SPA using the Heroku Django template provided on GitHub (https://github.com/heroku/heroku-django-template). I followed a Heroku tutorial (https://devcenter.heroku.com/articles/deploying-python#how-to-keep-build-artifacts-out-of-git), but I cannot seem to run the app locally using the following command:
heroku local web
Running this produces the following error:
return binding.open(pathModule._makeLong(path), stringToFlags(flags), mode);
^
Error: EACCES: permission denied, open '.env'
at Object.fs.openSync (fs.js:584:18)
at Object.fs.readFileSync (fs.js:491:33)
at loadEnvsFile (/snap/heroku/414/lib/node_modules/heroku-cli/node_modules/foreman/lib/envs.js:133:15)
at Array.map (native)
at loadEnvs (/snap/heroku/414/lib/node_modules/heroku-cli/node_modules/foreman/lib/envs.js:148:30)
at Command.<anonymous> (/snap/heroku/414/lib/node_modules/heroku-cli/node_modules/foreman/nf.js:72:16)
at Command.listener (/snap/heroku/414/lib/node_modules/heroku-cli/node_modules/commander/index.js:301:8)
at emitTwo (events.js:106:13)
at Command.emit (events.js:194:7)
at Command.parseArgs (/snap/heroku/414/lib/node_modules/heroku-cli/node_modules/commander/index.js:615:12)
My .env file looks like this:
WEB_CONCURRENCY=3
SECRET_APP_KEY="xxxxxxxxxxxxxxxxxxxx"
I ran chmod 777 on .env but I get the same error.
When I run the following command:
heroku local
I get the following error:
[WARN] EACCES: permission denied, open 'Procfile'
[FAIL] No Procfile and no package.json file found in Current Directory - See run.js --help
▸ Cannot convert undefined or null to object
My Procfile looks like this:
web: gunicorn personal_website.wsgi
Now I cannot understand why running "heroku local web" gives the previously mentioned error, especially after giving the file the necessary permissions.
Also, others have had the same error when running "heroku local", but the usual answer is to "make sure the formatting of your Procfile is correct." Well, mine is correct, and I have tried many variations of it.
I have seen others with this issue on SO, and some of those questions have been resolved; unfortunately for me, none of the fixes worked.
What could I possibly be doing wrong here?

Django and Gunicorn: 403 Forbidden

I have a Django application inside /home//my_app that I am trying to deploy using Gunicorn:
sudo gunicorn --workers=2 -b :8081 tutorial.wsgi:application
After deploying the application with the command above, I log into another ssh instance (on the same server) and run the following command:
wget 127.0.0.1:8081
This returns a 403 FORBIDDEN.
Things I have tried:
1. Tried to chmod 755, and even 777, on the app directory (Did not work)
2. Tried to move the app directory to /etc/www/myapp (Did not work)
3. Tried to run all commands using root access (Did not work)
It is worth noting that I am not that familiar with Linux and that this error is literally driving me crazy.
SOLVED IT:
After downloading cURL in order to see the HTTP headers, it turned out that the service worked but returned a 403 because of a missing authorization token. Oops.
Please make sure you have coded views.py and urls.py to serve a GET request at /.
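For reference, a quick way to see the status line and headers that revealed the 403 (assuming the app is still listening on 127.0.0.1:8081):
# -i includes the HTTP status line and response headers in the output
curl -i http://127.0.0.1:8081/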

How to deploy scrapy project

I am trying to deploy a Scrapy project, but I am getting an error.
My scrapy.cfg file is:
[settings]
default = eScraper.settings
[deploy]
url = http://localhost:8680/
project = eScraper
And I used this command to deploy: scrapy deploy default -p eScraper
But I am getting this error:
Building egg of eScraper-1369325126
'build/scripts-2.7' does not exist -- can't clean it
zip_safe flag not set; analyzing archive contents...
Deploying eScraper-1369325126 to http://localhost:8680/addversion.json
Deploy failed: <urlopen error [Errno 111] Connection refused>
I tried changing the port, but it didn't work. I also tried running the command above with sudo, but nothing. Can someone help me?
If you're using the latest version of Scrapy (mine is 0.24.2), then
scrapy server
no longer exists; it was moved to a separate package called scrapyd.
Simply run
scrapyd
to start the service.
Please first run the scrapy server command, then run the deploy command in another terminal.
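Putting the two answers together, a minimal end-to-end sketch (note that scrapyd listens on port 6800 by default, so the url in scrapy.cfg must match the port scrapyd actually serves on - the cfg above points at 8680):
# terminal 1: start the scrapyd service (serves http://localhost:6800/ by default)
scrapyd
# terminal 2, from the project directory, once scrapy.cfg's url matches:
scrapy deploy default -p eScraper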

error in deploying a project using scrapyd

I have multiple spiders in my project folder and want to run all the spiders at once, so I decided to run them using the scrapyd service.
I started doing this by following the guide here.
First of all, I am in the current project folder.
I opened the scrapy.cfg file and uncommented the url line after
[deploy]
I ran the scrapy server command; that works fine, and the scrapyd server runs.
I tried this command: scrapy deploy -l
Result: default http://localhost:6800/
When I tried this command: scrapy deploy -L scrapyd, I got the following:
Result:
Usage
=====
scrapy deploy [options] [ [target] | -l | -L <target> ]
deploy: error: Unknown target: scrapyd
When I tried to deploy the project with this command: scrapy deploy scrapyd -p default, I got the following error:
Usage
=====
scrapy deploy [options] [ [target] | -l | -L <target> ]
deploy: error: Unknown target: scrapyd
I am really unable to identify why scrapyd is showing the above errors. Can someone lead me to the correct way to deploy a project to scrapyd?
Thanks in advance.
Edited Code:
After seeing Peter Kirby's answer, I named the target in scrapy.cfg and tried the following command in my project folder,
command:
scrapy deploy ebsite -p ebsite
Then I got the error below:
Building egg of ebsite-1341808241
'build/lib' does not exist -- can't clean it
'build/bdist.linux-x86_64' does not exist -- can't clean it
'build/scripts-2.7' does not exist -- can't clean it
zip_safe flag not set; analyzing archive contents...
Deploying ebsite-1341808241 to http://localhost:6800/addversion.json
Deploy failed: <urlopen error [Errno 111] Connection refused>
How do I solve this?
From the scrapyd service documentation (http://scrapy.readthedocs.org/en/latest/topics/scrapyd.html?highlight=scrapyd):
You can define targets by adding them to your project’s scrapy.cfg
file... Here’s an example of defining a new target scrapyd2 with
restricted access through HTTP basic authentication:
[deploy:scrapyd2]
url = http://scrapyd.mydomain.com/api/scrapyd/
username = john
password = secret
Essentially what your error means is that your "target" name is not correct. If I remember correctly, the scrapy.cfg file sets the initial target name as "default". What you should be typing is something like:
scrapy deploy default -p project_name
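And if you do want a named target, as in the edited code above, the name in scrapy.cfg and the name on the command line must match. A sketch, assuming a target called ebsite:
[deploy:ebsite]
url = http://localhost:6800/
scrapy deploy ebsite -p ebsite
With that in place, a remaining Connection refused error just means scrapyd itself isn't running yet.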
Just type scrapy deploy if you have no named targets and have left the settings at their defaults!
I got this error when I tried to deploy my project without scrapyd running, so simply running
scrapyd
in another terminal fixed the error.
This happens when the scrapyd process has no permission. You need to kill the process and then, as the root user, type:
scrapy server
Then the new scrapyd will run, and you can proceed as the scrapyd documentation says.
