Unknown disk I/O operation - Python

I have some big computation programs running that do not involve any disk operations. However, when I monitor the process in htop it shows a lot of disk wait, and CPU utilization is only 10%. I looked it up with iotop and I am not seeing any irregular disk reads/writes. What might be the possible reason for this, and how do I debug it? The program is Python 3 and C, and I am running on Ubuntu 14.04 and Ubuntu 16.04.
Sorry, the program is too big and I don't know which part is misbehaving, so I can't paste it here for testing.
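One way to narrow this down: on Linux you can watch /proc/<pid>/status, /proc/<pid>/wchan and /proc/<pid>/io to see whether the process is really sitting in uninterruptible sleep (state D) and which kernel call it is blocked in; this can happen through swapping or page faults even when iotop shows no explicit reads/writes. A rough sketch (the PID is a placeholder command-line argument):
import sys
import time

def read_proc(path):
    # return the file contents, or 'n/a' if the file is gone or unreadable
    try:
        with open(path) as f:
            return f.read().strip()
    except (IOError, OSError):
        return 'n/a'

pid = int(sys.argv[1])   # placeholder: PID of the compute process to watch

while True:
    status = read_proc('/proc/%d/status' % pid)
    state = [line for line in status.splitlines() if line.startswith('State:')]
    print(state[0] if state else 'State: n/a')              # "D" means uninterruptible (disk) wait
    print('wchan: ' + read_proc('/proc/%d/wchan' % pid))    # kernel symbol the task is blocked in
    print(read_proc('/proc/%d/io' % pid))                   # cumulative read/write counters (may need root)
    print('-' * 40)
    time.sleep(1)
If wchan keeps showing something swap- or filesystem-related while read_bytes/write_bytes barely move, the wait is more likely page faults or swapping than ordinary file I/O.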

Related

How to let a Python process use all Docker container memory without getting killed?

I have a Python process that does some heavy computations with Pandas and such - not my code, so I don't have much knowledge of it.
The situation is this: the Python code used to run perfectly fine on a server with 8 GB of RAM, maxing out all the resources available.
We moved this code to Kubernetes and we can't make it run: even increasing the allocated resources up to 40GB, this process is greedy and will inevitably try to get as much as it can until it gets over the container limit and gets killed by Kubernetes.
I know this code is probably suboptimal and needs rework on its own.
However, my question is how to get Docker on Kubernetes to mimic what Linux did on the server: give the process as many resources as it needs without killing it?
I found out that running something like this seems to work:
import resource
import os

# the cgroup (v1) memory limit for the container is exposed in this file
if os.path.isfile('/sys/fs/cgroup/memory/memory.limit_in_bytes'):
    with open('/sys/fs/cgroup/memory/memory.limit_in_bytes') as limit:
        mem = int(limit.read())
        # use it as both the soft and hard cap on the process's address space
        resource.setrlimit(resource.RLIMIT_AS, (mem, mem))
This reads the memory limit file from cgroups and sets it as both the hard and soft limit for the process's maximum address space.
You can test by running something like:
docker run -it --rm -m 1G --cpus 1 python:rc-alpine
Then try to allocate 1 GB of RAM before and after running the script above.
With the script you'll get a MemoryError; without it the container will be killed.
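The allocation test itself can be as simple as this (a rough sketch; 1 GiB to match the -m 1G limit above):
try:
    buf = bytearray(1024 * 1024 * 1024)   # try to grab ~1 GiB in one go
    print('allocation succeeded')
except MemoryError:
    print('got MemoryError - the rlimit is doing its job')
Without the rlimit the bytearray gets zero-filled, the container blows past its limit and is OOM-killed; with it, the allocation should fail cleanly inside Python.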
Using the --oom-kill-disable option together with a memory limit works for me (12 GB memory) in a Docker container. Perhaps it applies to Kubernetes as well.
docker run -dp 80:8501 --oom-kill-disable -m 12g <image_name>
Hence:
How to mimic "--oom-kill-disable=true" in Kubernetes?

Docker hyperkit process CPU usage going crazy. How to keep it under control?

Using Docker (docker-compose) on macOS. When running the Docker containers and attaching Visual Studio Code (VSCode) to the active app container, the hyperkit process can go crazy :( and the MacBook fans have to run at full speed to try to keep the temperature down.
When using VSCode on Python files I noticed that actions that scan/parse your file, such as those run by pylint, push the hyperkit CPU usage to the max and the MacBook fans to full speed :(. Hyperkit CPU usage goes down again when pylint is finished.
When using VSCode to debug my Django Python app, the hyperkit CPU usage goes to the max again. While actively debugging, hyperkit goes wild, but it does settle down again afterwards.
I'm currently switching from "bind mounts" to "volume mounts". I think I see some improvements, but I haven't done enough testing to say anything conclusive. So far I've only switched my source code to a "volume mount" instead of a "bind mount"; I will do the same for my static files and database and see if that results in improvements.
You can check out this stackoverflow post on Docker volumes for some more info on the subject.
Here are some posts I found regarding this issue:
https://code.visualstudio.com/docs/remote/containers?origin_team=TJ8BCJSSG
https://github.com/docker/for-mac/issues/1759
Any other ideas on how to keep the hyperkit process under control?
[update 27 March] Docker debug mode was set to TRUE. I've changed this to FALSE, but I have not seen any significant improvements.
[update 27 March] Using the "delegated" option for my source code (app) folder, and first impressions are positive. I'm seeing significant performance improvements; we'll have to see if it lasts 😀
FYI, the Docker docs on delegated: the container's view is authoritative (permit delays before updates on the container appear in the host).
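For reference, the option is set per bind mount; a hypothetical example (paths and image name are placeholders):
docker run -v "$(pwd)/app:/app:delegated" <image_name>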
[update 27 March] I've also reduced the number of CPU cores Docker desktop can use (settings->advanced). Hopefully this prevents the CPU from getting too hot.
I "solved" this issue by using http://docker-sync.io to create volumes that I can mount without raising my CPU usage at all. I am currently running 8 containers (6 Python and 2 node) with file watchers on and my CPU is at 25% usage.

CPU load on Raspberry Pi Zero launching a python script is always 100%

We are running a Python script that does a bit of RMS calculation and runs a TensorFlow model. As soon as I launch the script, the CPU load goes to 100% on a Raspberry Pi Zero W. For information, memory load is at 50% and disk usage at 45%.
Is there a way to find what resources are exactly taking 100% of the CPU?
Would using a faster uSD help here? (Assuming the CPU is spending lots of time reading from flash memory).
You can go to the task manager on your Raspberry Pi, which is located in the main menu under Accessories. If it is not there you can just press Control-Alt-Delete. By doing this you can see which programs are running and how much CPU they are using, as well as other information. You can also try overclocking your Pi to make the CPU faster.
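If you want to see where the Python script itself spends its time, rather than which process is busy, the standard-library cProfile module is enough; a minimal sketch, assuming the script's entry point is a function called main() defined in the same file (main() is a placeholder, not something from the question):
import cProfile
import pstats

# main() stands in for whatever function starts the RMS calculation
# and the TensorFlow inference in your script
cProfile.run('main()', 'profile.out')

stats = pstats.Stats('profile.out')
stats.sort_stats('cumulative').print_stats(20)   # 20 most expensive call paths
If most of the time lands inside TensorFlow's C extension calls, the Pi Zero's single core is simply saturated by the model, and a faster microSD card is unlikely to change much.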
I noticed the same thing, and had to do a quick search and found this page. On my Raspberry Pi Zero W, without desktop and using one SSH session to show htop and the second to run a simple Python script in the interpreter like this:
while True:
    pass
will bring the CPU usage up to 100% until I break the script, so I just assume that's the nature of the beast.

Python script gets killed in Ubuntu 12.04

I am currently trying to run a long-running Python script on Ubuntu 12.04. The machine is a Digital Ocean droplet. It has no visible memory leaks (top shows constant memory). After running without incident (there are no uncaught exceptions and the used memory does not increase) for about 12 hours, the script gets killed.
The only messages present in syslog relating to the script are
Sep 11 06:35:06 localhost kernel: [13729692.901711] select 19116 (python), adj 0, size 62408, to kill
Sep 11 06:35:06 localhost kernel: [13729692.901713] send sigkill to 19116 (python), adj 0, size 62408
I've encountered similar problems before (with other scripts) in Ubuntu 12.04 but the logs then contained the additional information that the scripts were killed by oom-killer.
Those scripts, as well as this one, occupy a maximum of 30% of available memory.
Since I can't find any problems with the actual code, could this be an OS problem? If so, how do I go about fixing it?
Your process was indeed killed by the oom-killer. The log message "select … to kill" hints at that.
Probably your script didn’t do anything wrong, but it was selected to be killed because it used the most memory.
You have to provide more free memory, by adding more (virtual) RAM if you can, by moving other services from this machine to a different one, or by trying to optimize memory usage in your script.
See e.g. Debug out-of-memory with /var/log/messages for debugging hints. You could try to spare your script from being killed: How to set OOM killer adjustments for daemons permanently? But that just makes the OOM killer pick some other process, and killing a process more or less at random may leave the whole machine in an unstable state. In the end you will have to sort out the memory requirements and then make sure enough memory for peak loads is available.
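If you want to check from inside the script how likely it is to be picked, the kernel exposes a score per process; a small sketch (the -500 value is just an example, lowering oom_score_adj normally requires root, and protecting one process only shifts the kill to another):
import os

pid = os.getpid()

with open('/proc/%d/oom_score' % pid) as f:
    print('oom_score: ' + f.read().strip())      # higher = more likely to be killed first

# oom_score_adj ranges from -1000 (never kill) to 1000 (kill first);
# writing a negative value usually needs root privileges
with open('/proc/%d/oom_score_adj' % pid, 'w') as f:
    f.write('-500')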

Python multiprocessing Pool is freezing on one computer while it works on another?

I developed a multiprocessing Python script with the Pool and map functions. It uses NumPy. Interestingly, it freezes even with a single-process Pool on my computer (on the dot product of two [20000, 36]-dimensional matrices, one of them transposed of course), while it works on my remote server without any flaw. However, the code runs fine on my computer as well if it is sequential (without the map function).
The server has very large memory compared to my computer. I thought the problem was memory, so I tried a single-process Pool only, and it froze again. In addition, as far as I can observe from the system monitor, the memory indicator did not overflow. Even the single-process Pool freezes. If I run the same function sequentially, without the Pool, it works without problems.
Any idea on how to trace the problem or what might be the reason?
My PC has Python 2.7.6 and the server has 2.7.5. If there is a known bug, please point me to it.
Both machines are on Ubuntu.
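One thing worth ruling out, since the code isn't shown: NumPy builds linked against a multi-threaded BLAS have been known to hang after fork(), and Pool forks its workers on Linux. Pinning the BLAS to a single thread before importing NumPy is a cheap test; a rough sketch of a minimal reproduction (shapes taken from the question, everything else is made up):
import os

# Pin BLAS threading *before* numpy is imported - a workaround worth trying
# if a threaded BLAS (OpenBLAS/MKL) deadlocks inside the forked workers.
os.environ.setdefault('OMP_NUM_THREADS', '1')
os.environ.setdefault('OPENBLAS_NUM_THREADS', '1')
os.environ.setdefault('MKL_NUM_THREADS', '1')

import numpy as np
from multiprocessing import Pool

def dot(seed):
    rng = np.random.RandomState(seed)
    a = rng.rand(20000, 36)              # same shape as in the question
    return np.dot(a.T, a).sum()          # 36x36 result keeps the memory footprint small

if __name__ == '__main__':
    pool = Pool(1)                       # single worker, as in the question
    print(pool.map(dot, range(4)))
    pool.close()
    pool.join()
If this version runs but the original still hangs, attaching gdb to the stuck worker (gdb -p <pid>, then bt) usually shows whether it is blocked inside the BLAS library or inside the Pool's own pipe handling.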
