Python Pandas hanging in Docker container for a small volume of data

I am trying to load about 44 MB of data into a Pandas data frame. This is the code:
import pandas as pd
from sqlalchemy import create_engine  # the oracle+cx_oracle dialect also requires cx_Oracle to be installed

# Credentials and host are redacted; the URL format is user:password@host:port
engineor = create_engine('oracle+cx_oracle://xxxxx:xxx@xxx:1521/?service_name=XXX')
sql = "select * from accounts where date >= '10-MAY-18 06.00.16.170000000 PM'"
do = pd.read_sql(sql, engineor)
do.info(memory_usage='deep')
The above query returns around 70k rows, and the result is around 44 MB.
When I run this from my local machine (Windows 7) in Anaconda, the data loads into the data frame without any issues in a minute or two. However, when I run the same thing in a Docker container (Linux based), it just hangs.
I verified that the Docker container has sufficient memory, and memory usage doesn't grow over time (even though the data is quite small, ~44 MB). The query just gets submitted and hangs indefinitely; I am unable to kill it by pressing Ctrl+C or Ctrl+Z and have to disconnect from the machine and log back in.
I tried matching the Pandas version in the container to the Anaconda one on my local machine, but it didn't help; it still hangs. The only remaining differences are the Python version (3.5.3 in the Docker container, 3.6.3 locally) and the OS (Anaconda on Windows locally, Linux in the container). I am not sure whether these things make any difference.
Any suggestions on how to overcome this?

I just ran into the same problem as yours. I think it's caused by using socket communication from inside a container on the default bridge network.
Docker uses the bridge network by default, and only explicitly published ports are forwarded to the host. This usually works when you use the container as a server. But when your container runs as a client that creates sockets for communicating, there is a problem.
The client needs to open random ports toward the server and bind sockets to them. Because only the published ports are forwarded, those random ports cannot be reached from outside. The data coming back never finds the client container, and the sockets inside the container are left hanging forever.
My solution is to run the container on the host network. You can run a container using a command like
docker run --rm -d --network host --name my_nginx nginx
or using docker-compose:
version: '3.7'
services:
  my_nginx:
    ...
    network_mode: "host"
But beware that using the host network can cause port conflicts.
PS: Although using the host network solved my problem, I'm not an expert on sockets or networking. Please correct me if I'm wrong about the mechanisms. I read about sockets in this article: https://docs.oracle.com/javase/tutorial/networking/sockets/definition.html.
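If you want to confirm that container networking (rather than Pandas itself) is what blocks you, a minimal connectivity check from inside the container might look like this sketch; the listener host and port below are placeholders for your Oracle server:
import socket

# Placeholder host/port: substitute your Oracle listener's address.
# If this times out inside the container but succeeds on the host,
# the network path is the problem, not Pandas.
try:
    s = socket.create_connection(("db-host.example.com", 1521), timeout=5)
    print("listener reachable")
    s.close()
except OSError as exc:
    print("connection failed:", exc)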

Related

Docker repository server gave HTTP response to HTTPS client from Ray

I receive an error similar to this when trying to start Ray using a Docker image from a local repository.
The proposed solutions involve making some changes to the nodes' Docker config files and then restarting the Docker daemon.
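For context, the change usually suggested for this particular error is to mark the local registry as insecure in each node's Docker daemon config, typically /etc/docker/daemon.json (the registry address below is a placeholder):
{
  "insecure-registries": ["my-local-registry:5000"]
}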
However, I believe (please correct me if I'm wrong) that the setup commands in Ray config files are run within the Docker containers, rather than directly on the node machines. So I'm unsure how to apply these changes when using Ray.
How can I avoid the error?

Run python commands in a docker container from python script on host

I have a Docker image and an associated container that runs a jupyter-lab server. On this Docker image I have a very specific Python module that cannot be installed on the host. On my host, I have my whole work environment, which I don't want to run in the Docker container.
I would like to use that module from a Python script running on the host. My first idea is to use docker-py (https://github.com/docker/docker-py) on the host like this:
import docker
client = docker.from_env()
container = client.containers.run("myImage", detach=True)
container.exec_run("python -c 'import mymodule; print(something)'")  # do stuff with mymodule here
and get the output and keep working in my script.
Is there a better solution? Is there a way to connect to the Jupyter server from the script on the host, for example?
Thanks
First: as @dagnic states in his comment, there are those two modules that let you drive the Docker runtime from your Python script (there are probably more; "which one is best" would be a different question).
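For the first approach, a minimal docker-py sketch might look like this (the image and module names are placeholders from the question; exec_run returns an (exit_code, output) tuple, so you can capture the output directly):
import docker

client = docker.from_env()
# Placeholder image name from the question
container = client.containers.run("myImage", detach=True)
# exec_run returns an ExecResult of (exit_code, output bytes)
exit_code, output = container.exec_run(
    "python -c 'import mymodule; print(mymodule.__name__)'"
)
print(exit_code, output.decode())
container.stop()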
Second: without knowing anything about Jupyter, since you call it a "server", that means you can port-map that server (remember -p 8080:80 or --publish 8080:80? Yeah, that's it!). After setting a port mapping for your container, you can use, e.g., the pycurl module to "talk" to that service.
Remember, if you talk to your server on a published port, you can also combine this with docker-py.
Since you asked whether a better solution exists: these two methods are the most popular. The first is convenient inside your script; the second launches a server and lets you use pycurl from your host script, as you asked (connecting to the Jupyter server). I.e., if you launch the Jupyter server like:
docker run -p 9999:8888 -it -e JUPYTER_ENABLE_LAB=yes jupyter/base-notebook:latest
you can use pycurl like this:
import pycurl
from io import BytesIO

b_obj = BytesIO()
crl = pycurl.Curl()
# Set the URL: container port 8888 is published on host port 9999,
# and the notebook server speaks plain HTTP by default
crl.setopt(crl.URL, 'http://localhost:9999')
# Write the utf-8 encoded response bytes into the buffer
crl.setopt(crl.WRITEDATA, b_obj)
# Perform the transfer
crl.perform()
# End the curl session
crl.close()
# Get the content stored in the BytesIO object (as bytes)
get_body = b_obj.getvalue()
# Decode the bytes to HTML and print the result
print('Output of GET request:\n%s' % get_body.decode('utf8'))
Update:
So you have two questions:
1. Is there a better solution?
Basically, use the docker-py module and run the Jupyter server in a Docker container (and there are a few other options not involving Docker, I suppose).
2. Is there a way to connect to the jupyter server within the script on the host for example?
The docker run command above shows an example of how to run Jupyter in Docker.
The rest is to use pycurl from your code to talk to that Jupyter server from your host computer.

Docker image producing different results with Python script between different host OS

I have been having trouble executing my Python script on a remote server. When I build the Docker image on my local machine (macOS Mojave), the image runs my script fine. The same Docker image built on the remote server has issues running the Python script. I was under the impression that a Docker image would produce the same results regardless of its host OS. Correct me if I'm wrong, and tell me what I can do to produce the same results on the remote server VM. What exactly is the underlying problem causing this issue? Should I request a new remote VM with certain specifications?

How do I get the host address of a container using dockerpy?

I am trying to figure out where to get the hostname of a running Docker container that was started using docker-py.
Based on the presence of a DOCKER_HOST value, the started Docker container may be on a remote machine and not on localhost (the machine running the docker-py code).
I looked inside the container object and was not able to find anything useful, as 'HostIp': '0.0.0.0' refers to the remote Docker host.
I need an IP or DNS name of the remote machine.
I know that I could start parsing DOCKER_HOST myself and "guess" it, but that would not be a reliable way of doing it, especially as there are multiple protocols involved: at least ssh:// and tcp://.
I would expect there to be an API-based way of getting this information.
PS: We can assume that the Docker host does not have a firewall.
For the moment I ended up filing an issue at https://github.com/docker/docker-py/issues/2254, as I failed to find that information with the library.
The best method is probably to use a website like wtfismyip.com.
You can use
curl wtfismyip.com
to print it in the terminal, and then extract the public IP from the output.
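If you would rather stay in Python, the same lookup can be done with the requests library; this sketch assumes wtfismyip.com's plain-text endpoint:
import requests

# wtfismyip.com/text returns the caller's public IP as plain text
public_ip = requests.get("https://wtfismyip.com/text", timeout=5).text.strip()
print(public_ip)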

How to access a server (running in a Docker container) from another machine?

I'm new to Docker. I have deployed a Python server in a Docker container, and I'm able to access it from my Python application on my machine using the virtual machine IP (192.168.99.100).
Ex: http://192.168.99.100:5000
How do I access the application from another machine on the same network?
I tried giving my machine's IP, but that didn't work.
I run the application using "docker run -p 5000:80 myPythonApp".
An easy way out would be to forward the port from your host machine to the virtual machine.
The solution may differ with respect to the VM provider and host OS that you use. For example, for Vagrant you can do something as described here:
https://www.vagrantup.com/docs/networking/forwarded_ports.html
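Once the port is forwarded all the way to the host, a client on another machine in the same network should be able to reach the server. A minimal sketch of such a check (the LAN IP below is a placeholder for the Docker host machine's address):
import requests

# Placeholder LAN IP of the machine hosting the Docker VM;
# port 5000 is the one published with "docker run -p 5000:80"
resp = requests.get("http://192.168.1.50:5000", timeout=5)
print(resp.status_code)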
