How to deploy scrapyd to a network


I currently have an instance of scrapyd up and running locally on my machine. This instance needs to be available to other PCs on my employer's network. I've read about scrapy-cloud (https://doc.scrapinghub.com/scrapy-cloud.html) and other cloud-based services. However, I'd much rather host scrapyd on our network, since the spiders I've built pull data from CSV files stored on our servers.
I've searched through the scrapyd documentation (https://scrapyd.readthedocs.io/en/stable/) and understand how to install and run scrapyd. I am also comfortable with uploading scrapy projects to scrapyd and running specific spiders.
What steps do I need to take to make my scrapyd instance available to other machines on our network? All of our PCs and servers run Windows.
The answer doesn't need to be a specific step-by-step guide. I'm just looking for someone to point me in the right direction, because I am unsure how to proceed.

If the machines are on the same LAN (in the same IP range), you can follow the manual and use the server's IP address instead of localhost.
First find the IP address of the machine running scrapyd:
ifconfig on Linux
ipconfig on Windows
The manual gives commands such as:
curl http://localhost:6800/addversion.json -F project=myproject -F version=r23 -F egg=@myproject.egg
Replace localhost with that IP address. For example, if the IP is 192.168.1.10, run this from another PC:
curl http://192.168.1.10:6800/addversion.json -F project=myproject -F version=r23 -F egg=@myproject.egg
If a firewall is in use, you need to open the port (6800 here). If cURL is not installed on Windows, you can download and install it; see:
How do I install/set up and use cURL on Windows?
For more information about the API, check the manual.
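The same API calls can also be made from Python instead of cURL. The following is a minimal sketch, assuming scrapyd is reachable at 192.168.1.10:6800 (the example address above) and that a project named myproject containing a spider named myspider has already been uploaded; the spider name and both helper function names are hypothetical. Depending on the scrapyd version, the server may listen only on 127.0.0.1 by default; the bind_address option in scrapyd.conf controls this, so it may need to be set to 0.0.0.0 before other machines can connect.

```python
import json
import urllib.parse
import urllib.request


def scrapyd_url(host, endpoint, port=6800):
    """Build the URL of a scrapyd JSON endpoint, e.g. schedule.json."""
    return f"http://{host}:{port}/{endpoint}"


def schedule_spider(host, project, spider, port=6800, timeout=10):
    """POST to schedule.json on a remote scrapyd and return the parsed reply."""
    # Passing data= makes urlopen send a POST with form-encoded fields.
    data = urllib.parse.urlencode({"project": project, "spider": spider}).encode()
    url = scrapyd_url(host, "schedule.json", port)
    with urllib.request.urlopen(url, data=data, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

From another PC on the network, schedule_spider("192.168.1.10", "myproject", "myspider") would then ask the server to start the spider, provided the port is open in the firewall.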

Related

URL given by docker is not loading

I am trying to run my Streamlit app via Docker. Since I want to run my code on a Linux system, I am first checking whether it runs on my Windows system.
So I started my container and ran a command which gave me two URLs, but neither of them works.
This is the terminal result
This is the result in the browser
Do I have to specify a port number? And if so, how do I find my local system's port?
Thanks in advance

I think you are using the wrong port number. Try adding -p 8501:8501 to your docker run command and then go to localhost:8501 in your browser.

First check whether you can ping the internal IP of the Docker container from the host machine.
Then publish multiple ports using the following arguments:
docker run -p <host_port1>:<container_port1> -p <host_port2>:<container_port2> <image>
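Before changing the port mapping, it can help to confirm whether anything is actually listening on the published port. This is a minimal sketch using only the standard library; the function name is hypothetical, and port 8501 in the usage note is taken from the first answer.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

If port_is_open("localhost", 8501) returns False while the container is running, the port was probably not published with -p.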

How do I build docker container behind company proxy?

I am trying to build a simple Python-based Docker container. I work at a corporation, behind a proxy, on Windows 10. Below is my Dockerfile:
FROM python:3.7.9-alpine3.11
WORKDIR ./
RUN pip install --proxy=http://XXXXXXX:8080 -r requirements.txt
COPY . /
EXPOSE 5000
CMD ["python", "application.py"]
But it gives me the following error in cmd:
"failed to solve with frontend dockerfile.v0: failed to build LLB: failed to load cache key: failed to do request: Head https://registry-1.docker.io/v2/library/python/manifests/3.7.9-alpine3.11: proxyconnect tcp: EOF"
I've tried to figure out how to configure Docker's proxy using many links, but they keep referring to a file "/etc/sysconfig/docker", which I cannot find anywhere under Windows 10; maybe I'm not looking in the right place.
Also, I'm not sure this is only a proxy issue, since I've seen people run into this error without using a proxy.
I would highly appreciate anyone's help. Working at this corporation has already made me spend more than 10 hours on something that took me 10 minutes on my Mac... :(
Thank you

You're talking about the most basic Docker functionality. Normally, Docker has to connect to Docker Hub on the internet to get base images. If you can't make this work with your proxy, you can either:
preload your local cache with the necessary images, or
set up a Docker registry inside your firewall that contains all the images you'll need.
The easiest option, probably by far, would be to figure out how to get Docker to connect to Docker Hub through your proxy.
For getting Docker on Windows to work with your proxy, this might help: https://learn.microsoft.com/en-us/virtualization/windowscontainers/manage-docker/configure-docker-daemon
Here's what it says about configuring a proxy:
To set proxy information for docker search and docker pull, create a Windows environment variable with the name HTTP_PROXY or HTTPS_PROXY, and a value of the proxy information. This can be completed with PowerShell using a command similar to this:
In PowerShell:
[Environment]::SetEnvironmentVariable("HTTP_PROXY", "http://username:password@proxy:port/", [EnvironmentVariableTarget]::Machine)
Once the variable has been set, restart the Docker service.
In PowerShell:
Restart-Service docker
For more information, see Windows Configuration File on Docker.com.
I've also seen it mentioned that Docker for Windows lets you set proxy parameters in its settings GUI.

There is no need to hard-code proxy information in the Dockerfile.
There are predefined ARGs which can be used for this purpose:
HTTP_PROXY
HTTPS_PROXY
FTP_PROXY
You can pass their values with --build-arg when building the image; see:
https://docs.docker.com/engine/reference/builder/#predefined-args
I do not see any run time dependency of your container on the Internet. So running the container will work without an issue.
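As a sketch of passing those predefined ARGs at build time, the helper below assembles a docker build command with one --build-arg per proxy variable found in a given environment mapping. The function names and the image tag are hypothetical, and the proxy URL in the usage note is a placeholder. This only affects commands run inside the build (such as pip install); pulling the base image is done by the Docker daemon, which takes its proxy settings from its own configuration.

```python
import subprocess

# The predefined proxy ARGs named in the answer above.
PROXY_VARS = ("HTTP_PROXY", "HTTPS_PROXY", "FTP_PROXY")


def docker_build_command(tag, env, context="."):
    """Return a `docker build` argument list that forwards proxy settings.

    Only variables that are present and non-empty in `env` are forwarded.
    """
    command = ["docker", "build", "-t", tag]
    for name in PROXY_VARS:
        value = env.get(name)
        if value:
            command += ["--build-arg", f"{name}={value}"]
    command.append(context)
    return command


def build_image(tag, env, context="."):
    """Run the build; raises CalledProcessError if docker exits non-zero."""
    subprocess.run(docker_build_command(tag, env, context), check=True)
```

For example, build_image("myapp", {"HTTP_PROXY": "http://proxy.example:8080"}) would run docker build -t myapp --build-arg HTTP_PROXY=http://proxy.example:8080 . in the current directory; passing os.environ instead forwards whatever proxy variables are already set.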

How to configure my docker pypi server to use pypiserver[cache]?

I'm using docker pypi server as my internal pip server. I have thousands of requests and sometimes my server fails (i.e. reaches 5 timeouts)
pypiserver specifies an option that can help with that: using cache. How can I make my docker run with this option enabled (OR is there another way to handle the request load better)?
The docker tutorial specifies a cache-related command : --cache-control AGE but it has nothing to do with the pypi caching I want.
Here is my docker run command: sudo docker run -p 80:8080 -v /home/bla/.pypi_server/packages:/data/packages pypiserver/pypiserver:latest

How to watch xvfb session that's inside a docker on remote server from my local browser?

I'm running a Docker container (that I built myself) which runs E2E tests.
The browser is up and running, but there is a nice-to-have feature I'd like to add: the ability to watch the session live.
My docker run command is:
docker run -p 4444:4444 --name ${DOCKER_TAG_NAME} \
  -e Some_ENVs \
  -v Volume:Volume \
  --privileged \
  -d "{docker-registry}" >> /dev/null 2>&1
I'm able to export screenshots, but in some cases that's not enough, and being able to watch the exact state of the test would be amazing.
I tried a lot of options but came to a dead end. Any help would be great.
My tests are in Python 2.7
My Docker base is ubuntu:14.04
My environment is in AWS (if that matters)
The Docker container runs on Ubuntu servers.
I know it's a duplicate of this, but no one answered it, so...

There is a recent tool called Selenoid. It launches browsers in Docker containers (i.e. headless, as you require). It has a standalone UI that can show the live session screen via VNC, so you can launch multiple sessions in parallel and then watch, and even intercept, actions happening in the target browser. All of this works perfectly in a cloud environment.

I have faced the same issue before with VNC. You need to find out which port your xvfb/VNC server is using, then open that port in your AWS security group. Once that is done, you should be able to connect.
In my case I was using the Selenium Docker image from https://github.com/elgalu/docker-selenium and started the container with this command:
docker run -d --name=grid -p 4444:24444 -p 5900:25900 \
  -v /dev/shm:/dev/shm -e VNC_PASSWORD=hola \
  -e SCREEN_WIDTH=1920 -e SCREEN_HEIGHT=1480 \
  elgalu/selenium
The VNC port published on the host by this command is 5900, so I opened that port in the instance's security group and connected with a VNC viewer on port 5900.
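To confirm that the VNC port is reachable after the security group change, one option is to connect and read the greeting. A VNC (RFB) server opens the handshake by sending a 12-byte protocol version string such as "RFB 003.008" followed by a newline. This is a minimal sketch, assuming the server is published on port 5900 as in the command above; the function name is hypothetical.

```python
import socket


def read_rfb_banner(host, port=5900, timeout=5.0):
    """Connect to a VNC server and return its protocol version banner.

    Returns the 12-byte greeting as text without the trailing newline,
    or None if the port is unreachable or the reply is not an RFB banner.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            data = b""
            # recv may return fewer bytes than requested, so loop to 12.
            while len(data) < 12:
                chunk = conn.recv(12 - len(data))
                if not chunk:
                    break
                data += chunk
    except OSError:
        return None
    if len(data) == 12 and data.startswith(b"RFB "):
        return data.decode("ascii", errors="replace").rstrip("\n")
    return None
```

A None result from a machine outside AWS suggests the port is still blocked or nothing is listening on it; a banner means a VNC viewer should be able to connect.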

Controlling VMs using Python scripts

I want to manage virtual machines (any flavor) using Python scripts: for example, create a VM, start it, stop it, and access my guest OS's resources.
My host machine runs Windows. I have VirtualBox installed. Guest OS: Kali Linux.
I just came across software called libvirt. Do any of you think this would help me?
Any insights on how to do this? Thanks for your help.

For AWS, use boto.
For GCE, use the Google API Python Client Library.
For OpenStack, use python-openstackclient and import its methods directly.
For VMware, google it.
For Opsware, abandon all hope: their API is undocumented and has some 12 years of accumulated abandoned methods to dig through, with an equally insane data model behind it.
For direct libvirt control, there are Python bindings for libvirt. They work very well and closely mimic the C libraries.
I could go on.

Follow the directions here to install Docker: https://docs.docker.com/windows/ (it includes Oracle VirtualBox, if you don't already have it).
# grab the image
docker pull kalilinux/kali-linux-docker
# run a specific command
docker run kalilinux/kali-linux-docker <some_command>
# open an interactive terminal in the container
docker run -t -i kalilinux/kali-linux-docker /bin/bash
If you want to mount a local volume, you can use the `-v <host_path>:<container_path>` switch in your run command; it must come before the image name.
# mount the local training/webapp directory into the Kali container at /webapp
docker run -v <absolute_path_to>/training/webapp:/webapp kalilinux/kali-linux-docker <some_command>
Note that these are run from the regular Windows prompt; to use them from Python you would need to wrap them in subprocess calls.
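Wrapping these commands in subprocess calls could look like the sketch below. It assumes the docker executable is on PATH; the helper names are hypothetical, and the volume is passed in Docker's host_path:container_path form, placed before the image name.

```python
import subprocess

IMAGE = "kalilinux/kali-linux-docker"


def docker_run_command(image, command, volume=None):
    """Build the argument list for `docker run`.

    `command` is a list of arguments to run inside the container;
    `volume` is an optional (host_path, container_path) pair.
    """
    args = ["docker", "run"]
    if volume is not None:
        host_path, container_path = volume
        args += ["-v", f"{host_path}:{container_path}"]
    args.append(image)
    args += list(command)
    return args


def run_in_container(image, command, volume=None):
    """Run a command in a new container and return its standard output."""
    result = subprocess.run(
        docker_run_command(image, command, volume),
        check=True,           # raise CalledProcessError on a non-zero exit
        capture_output=True,
        text=True,
    )
    return result.stdout
```

For example, run_in_container(IMAGE, ["uname", "-a"]) runs a single command in the Kali image and returns what it printed.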

We Keep Coding
