I am trying to get Docker container stats using Python's docker module. The code is:
import docker

cli = docker.from_env()
for container in cli.containers.list():
    stream = container.stats()
    print(next(stream))
I run 6 Docker containers, but when I run the code it takes a few seconds to get all the containers' stats. Is there a good way to get the stats immediately?
Docker stats inherently takes a little while; a large part of this is waiting for the next value to come through the stream:
$ time docker stats 1339f13154aa --no-stream
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
...
real 0m1.556s
user 0m0.020s
sys 0m0.015s
You could reduce the time it takes to execute by running the commands in parallel, as opposed to one at a time.
To achieve this, you could use the wonderful threading or multiprocessing libraries.
Digital Ocean provides a good tutorial on how to accomplish this with a ThreadPoolExecutor:
import requests
import concurrent.futures


def get_wiki_page_existence(wiki_page_url, timeout=10):
    response = requests.get(url=wiki_page_url, timeout=timeout)

    page_status = "unknown"
    if response.status_code == 200:
        page_status = "exists"
    elif response.status_code == 404:
        page_status = "does not exist"

    return wiki_page_url + " - " + page_status


wiki_page_urls = [
    "https://en.wikipedia.org/wiki/Ocean",
    "https://en.wikipedia.org/wiki/Island",
    "https://en.wikipedia.org/wiki/this_page_does_not_exist",
    "https://en.wikipedia.org/wiki/Shark",
]

with concurrent.futures.ThreadPoolExecutor() as executor:
    futures = []
    for url in wiki_page_urls:
        futures.append(executor.submit(get_wiki_page_existence, wiki_page_url=url))
    for future in concurrent.futures.as_completed(futures):
        print(future.result())
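The same pattern can be adapted to your docker stats case. This is only a rough sketch under the assumption that each stats call is independent; the worker function and naming below are illustrative, not part of the tutorial:
import concurrent.futures

import docker

cli = docker.from_env()


def get_stats(container):
    # stream=False returns a single stats snapshot instead of a generator
    return container.name, container.stats(stream=False)


# Fetch stats for all running containers concurrently, so the slow
# per-container wait overlaps instead of adding up.
with concurrent.futures.ThreadPoolExecutor() as executor:
    futures = [executor.submit(get_stats, c) for c in cli.containers.list()]
    for future in concurrent.futures.as_completed(futures):
        name, stats = future.result()
        print(name, stats)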
This is what I use to get the stats directly for each running container (it still takes about 1 second per container, so I don't think that can be helped). The docker-py documentation at https://docker-py.readthedocs.io/en/stable/containers.html describes the available arguments. Hope it helps.
import docker

client = docker.from_env()
containers = client.containers.list()
for x in containers:
    print(x.stats(decode=None, stream=False))
As TheQueenIsDead suggested, it might require threading if you want to get it faster.
from os import getenv, listdir, path
from kubernetes import client, config
from kubernetes.client.rest import ApiException
from kubernetes.stream import stream
import constants, logging
from pprint import pprint


def listdir_fullpath(directory):
    return [path.join(directory, file) for file in listdir(directory)]


def active_context(kubeConfig, cluster):
    config.load_kube_config(config_file=kubeConfig, context=cluster)


def kube_exec(command, apiInstance, podName, namespace, container):
    response = None
    execCommand = [
        '/bin/bash',
        '-c',
        command]

    # Check that the pod exists before trying to exec into it
    try:
        response = apiInstance.read_namespaced_pod(name=podName,
                                                   namespace=namespace)
    except ApiException as e:
        if e.status != 404:
            print(f"Unknown error: {e}")
            exit(1)

    if not response:
        print("Pod does not exist")
        exit(1)

    try:
        response = stream(apiInstance.connect_get_namespaced_pod_exec,
                          podName,
                          namespace,
                          container=container,
                          command=execCommand,
                          stderr=True,
                          stdin=False,
                          stdout=True,
                          tty=False,
                          _preload_content=True)
    except Exception as e:
        print("error in executing cmd")
        exit(1)
    pprint(response)


if __name__ == '__main__':
    configPath = constants.CONFIGFILE
    kubeConfigList = listdir_fullpath(configPath)
    kubeConfig = ':'.join(kubeConfigList)
    active_context(kubeConfig, "ort.us-west-2.k8s.company-foo.net")
    apiInstance = client.CoreV1Api()
    kube_exec("whoami", apiInstance, "podname-foo", "namespace-foo", "container-foo")
I run this code, and the response I get from running whoami is: 'java\n'.
How can I run as root? Also, I can't find good documentation for this client anywhere (the docs in the Git repo are pretty horrible); if you can link me to any, that would be awesome.
EDIT: I just tried this on a couple of different pods and containers; it looks like some of them default to root. I would still like to be able to choose my user when I run a command, so the question is still relevant.
it looks like some of them default to root; I would still like to be able to choose my user when I run a command, so the question is still relevant
You have influence over the UID (not the user directly, as far as I know) when you launch the Pod, but from that point forward there is no equivalent to docker exec -u in Kubernetes: you can attach to the Pod, running as whatever UID it was launched as, but you cannot change the UID.
I would hypothesize that this is a security concern in locked-down clusters, since one would not want someone with kubectl access to be able to elevate privileges.
If you need to run as root in your container, then you should change the value of securityContext: runAsUser: 0 and then drop privileges for running your main process. That way, new commands (spawned by your exec command) will run as root, just as your initial command: does.
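For illustration only, here is a rough sketch of setting runAsUser: 0 through the same Python client the question uses; the pod name, image, and command are placeholders, not values from the question:
from kubernetes import client

# Hypothetical pod spec: run the whole pod as UID 0 so that the main process
# (and anything later started via exec) runs as root.
pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="podname-foo"),
    spec=client.V1PodSpec(
        security_context=client.V1PodSecurityContext(run_as_user=0),
        containers=[
            client.V1Container(
                name="container-foo",
                image="example/image:latest",                    # placeholder image
                command=["/bin/bash", "-c", "my-main-process"],  # placeholder command
            )
        ],
    ),
)
# apiInstance.create_namespaced_pod(namespace="namespace-foo", body=pod)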
I'm looking for the fastest pinging method via Python. I need to ping over 100,000 servers, and my current procedure below takes approximately 85 minutes to complete. I've read small snippets about scapy, as well as general ICMP and python-ping. I need to know a definitive method, or at least a solid way to test which is the fastest. I cannot test python-ping from work as it is not an approved package. I also tried a code snippet for scapy, but got an error:
OSError: Windows native L3 Raw sockets are only usable as administrator !
Install Winpcap/Npcap to workaround !
So I'm admittedly looking for code snippets I can test at home, or for ways around that error from more experienced people.
To show what I've tried, here are some related posts as well as my current code.
Current code:
import pandas as pd
import subprocess
import threading

raw_list = []
raw_list2 = []


def ping(host):
    # Ping 3 times with an 800 ms timeout and record the host with its return code
    raw_list.append(host + ' ' + str(subprocess.run('ping -n 3 -w 800 ' + host).returncode))


with open(r"FILEPATH", "r") as server_list_file:
    hosts = server_list_file.read()
    hosts_list = hosts.split('\n')

num_threads = 100
num_threads2 = 10
num_threads3 = 1
number = 0

# Launch pings in batches of 100, 10, or 1 threads depending on how many hosts remain
while number < len(hosts_list):
    print(number)
    if len(hosts_list) > number + num_threads:
        for i in range(num_threads):
            t = threading.Thread(target=ping, args=(hosts_list[number + i],))
            t.start()
        t.join()
        number = number + num_threads
    elif len(hosts_list) > (number + num_threads2):
        for i in range(num_threads2):
            t = threading.Thread(target=ping, args=(hosts_list[number + i],))
            t.start()
        t.join()
        number = number + num_threads2
    elif len(hosts_list) > (number + num_threads3 - 1):
        for i in range(num_threads3):
            t = threading.Thread(target=ping, args=(hosts_list[number + i],))
            t.start()
        t.join()
        number = number + num_threads3
    else:
        number = number + 1

# Keep only hosts whose ping return code was 0 (reachable)
for x in range(len(raw_list)):
    if raw_list[x][-1] == '0':
        raw_list2.append(raw_list[x][0:-2])

to_csv_list = pd.DataFrame(raw_list2)
to_csv_list.to_csv('ServersCsv.csv', index=False, header=False)
to_csv_list.to_csv(r'ANOTHERFILEPATH', index=False, header=False)

subprocess.call(r'C:\ProgramData\Anaconda3\python.exe "A_PROGRAM_THAT_INSERTS_INTO_SQL"')
This does exactly what I need; however, it does not do it quickly enough.
I've tried the very small snippet:
from scapy.all import *
packets = IP(dst=["www.google.com", "www.google.fr"])/ICMP()
results = sr(packets)
resulting in gaierror: [Errno 11001] getaddrinfo failed
I've also tried:
TIMEOUT = 2
conf.verb = 0
packet = IP("ASERVERNAME", ttl=20)/ICMP()
reply = sr1(packet, timeout=TIMEOUT)
if not (reply is None):
print(reply.dst + "is online")
else:
print("Timeout waiting for %s") % packet[IP].dst
resulting in:
OSError: Windows native L3 Raw sockets are only usable as administrator !
Install Winpcap/Npcap to workaround !
A few links I looked at but could not garner a solid answer from:
Ping a site in Python?
Fastest way to ping a host in python?
This only solves the Python part; the comments are quite right about the rest.
OSError: Windows native L3 Raw sockets are only usable as administrator ! Install Winpcap/Npcap to workaround !
I find this pretty damn explicit. If you follow Scapy's documentation for Windows, it says you need to install Npcap: https://nmap.org/npcap/
Other than that,
packets = IP(dst=["www.google.com", "www.google.fr"])/ICMP()
results = sr(packets)
is likely the cleanest way to go. It works on my machine; make sure you're using the latest development version from GitHub (unzip it and install it via python setup.py install).
If you are using the latest version, you might even want to turn on threaded=True in sr() to send and receive packets on two threads, as pointed out in the comments. You might also want to use prn and store=False so the answers are not stored (100k is a lot).
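A minimal sketch of that approach, assuming a recent development version of Scapy and a placeholder host list (in practice the 100k servers read from file); exact parameter availability depends on your Scapy version:
from scapy.all import IP, ICMP, sr, conf

conf.verb = 0

# Placeholder host list; substitute the servers read from the file.
hosts_list = ["www.google.com", "www.google.fr"]

# threaded=True (recent development versions) sends and receives on separate
# threads; prn/store=False can additionally process answers as they arrive
# instead of keeping all of them in memory.
answered, unanswered = sr(IP(dst=hosts_list) / ICMP(), timeout=2, threaded=True)

for sent, received in answered:
    print(sent.dst + " is online")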
I have some Python that creates multiple processes to complete a task much more quickly. When I create these processes I pass in a queue, and inside the processes I use queue.put(data) so I am able to retrieve the data outside of the processes. It works fantastically on my local machine, but when I upload the zip to an AWS Lambda function (Python 3.8) it says the Queue() function has not been implemented. The project runs great in the AWS Lambda when I simply take out the queue functionality, so I know this is the only hang-up I currently have.
I made sure to install the multiprocessing package directly into my Python project by using "pip install multiprocess -t ./" as well as "pip install boto3 -t ./".
I am new to Python, as well as AWS, but the research I have come across recently potentially points me to SQS.
Reading over these SQS docs I am not sure if this is exactly what I am looking for.
Here is the code I am running in the Lambda that works locally but not on AWS. See the *'s for important pieces:
from multiprocessing import Process, Queue
from craigslist import CraigslistForSale
import time
import math

sitesHold = ["sfbay", "seattle", "newyork", "(many more)..."]
results = []


def f(sites, category, search_keys, queue):
    local_results = []
    for site in sites:
        cl_fs = CraigslistForSale(site=site, category=category, filters={'query': search_keys})
        for result in cl_fs.get_results(sort_by='newest'):
            local_results.append(result)
    if len(local_results) > 0:
        print(local_results)
        queue.put(local_results)  # Putting data *********************************


def scan_handler(event, context):
    started_at = time.monotonic()
    queue = Queue()
    print("Running...")
    amount_of_lists = int(event['amountOfLists'])
    list_length = int(len(sitesHold) / amount_of_lists)
    extra_lists = math.ceil((len(sitesHold) - (amount_of_lists * list_length)) / list_length)
    site_list = []
    list_creator_counter = 0
    site_counter = 0
    for i in range(amount_of_lists + extra_lists):
        site_list.append(sitesHold[list_creator_counter:list_creator_counter + list_length])
        list_creator_counter += list_length
    processes = []
    for i in range(len(site_list)):
        site_counter = site_counter + len(site_list[i])
        processes.append(Process(target=f, args=(site_list[i], event['category'], event['searchQuery'], queue,)))  # Creating processes and creating queues ***************************
    for process in processes:
        process.start()  # Starting processes ***********************
    for process in processes:
        listings = queue.get()  # Getting from queue ****************************
        if len(listings) > 0:
            for listing in listings:
                results.append(listing)
    print(f"Results: {results}")
    for process in processes:
        process.join()
    total_time_took = time.monotonic() - started_at
    print(f"Sites processed: {site_counter}")
    print(f'Took {total_time_took} seconds long')
This is the error the Lambda function is giving me:
{
  "errorMessage": "[Errno 38] Function not implemented",
  "errorType": "OSError",
  "stackTrace": [
    " File \"/var/task/main.py\", line 90, in scan_handler\n queue = Queue()\n",
    " File \"/var/lang/lib/python3.8/multiprocessing/context.py\", line 103, in Queue\n return Queue(maxsize, ctx=self.get_context())\n",
    " File \"/var/lang/lib/python3.8/multiprocessing/queues.py\", line 42, in __init__\n self._rlock = ctx.Lock()\n",
    " File \"/var/lang/lib/python3.8/multiprocessing/context.py\", line 68, in Lock\n return Lock(ctx=self.get_context())\n",
    " File \"/var/lang/lib/python3.8/multiprocessing/synchronize.py\", line 162, in __init__\n SemLock.__init__(self, SEMAPHORE, 1, 1, ctx=ctx)\n",
    " File \"/var/lang/lib/python3.8/multiprocessing/synchronize.py\", line 57, in __init__\n sl = self._semlock = _multiprocessing.SemLock(\n"
  ]
}
Does Queue() work in an AWS Lambda? What is the best way to accomplish my goal?
It doesn't look like it's supported; see:
https://blog.ruanbekker.com/blog/2019/02/19/parallel-processing-on-aws-lambda-with-python-using-multiprocessing/
From the AWS docs:

If you develop a Lambda function with Python, parallelism doesn’t come by default. Lambda supports Python 2.7 and Python 3.6, both of which have multiprocessing and threading modules.

The multiprocessing module that comes with Python lets you run multiple processes in parallel. Due to the Lambda execution environment not having /dev/shm (shared memory for processes) support, you can’t use multiprocessing.Queue or multiprocessing.Pool.

On the other hand, you can use multiprocessing.Pipe instead of multiprocessing.Queue to accomplish what you need without getting any errors during the execution of the Lambda function.
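As a rough sketch (not the original project's code), the handler could swap the Queue for one Pipe per worker; the worker body here is a stand-in for the Craigslist scraping:
from multiprocessing import Process, Pipe


def worker(sites, conn):
    # Stand-in for the real per-site scraping done in f()
    local_results = [f"listing from {site}" for site in sites]
    conn.send(local_results)  # send results back through the pipe
    conn.close()


def scan_handler(event, context):
    site_batches = [["sfbay"], ["seattle"], ["newyork"]]
    parent_conns, processes = [], []
    for batch in site_batches:
        parent_conn, child_conn = Pipe()
        parent_conns.append(parent_conn)
        processes.append(Process(target=worker, args=(batch, child_conn)))
    for p in processes:
        p.start()
    results = []
    for conn in parent_conns:
        results.extend(conn.recv())  # blocks until that worker has sent
    for p in processes:
        p.join()
    return results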
I'm studying vCenter 6.5, and the community samples help a lot, but in this particular situation I can't figure out what's going on. The script:
from __future__ import with_statement
import atexit
from tools import cli
from pyVim import connect
from pyVmomi import vim, vmodl


def get_args():
    *Boring args parsing works*
    return args


def main():
    args = get_args()
    try:
        service_instance = connect.SmartConnectNoSSL(host=args.host,
                                                     user=args.user,
                                                     pwd=args.password,
                                                     port=int(args.port))
        atexit.register(connect.Disconnect, service_instance)
        content = service_instance.RetrieveContent()
        vm = content.searchIndex.FindByUuid(None, args.vm_uuid, True)
        creds = vim.vm.guest.NamePasswordAuthentication(
            username=args.vm_user, password=args.vm_pwd
        )
        try:
            pm = content.guestOperationsManager.processManager
            ps = vim.vm.guest.ProcessManager.ProgramSpec(
                programPath=args.path_to_program,
                arguments=args.program_arguments
            )
            res = pm.StartProgramInGuest(vm, creds, ps)
            if res > 0:
                print "Program executed, PID is %d" % res
        except IOError, e:
            print e
    except vmodl.MethodFault as error:
        print "Caught vmodl fault : " + error.msg
        return -1

    return 0


# Start program
if __name__ == "__main__":
    main()
When I execute it in the console, it successfully connects to the target virtual machine and prints:
Program executed, PID is 2036
In Task Manager I see a process with the mentioned PID; it was created by the correct user, but there is no GUI for the process (calc.exe). Right-clicking does not allow me to "Expand" the process.
I suppose that this process was created with special parameters, maybe in a different session.
In addition, I tried to run a batch file to check if it actually executes, but the answer is no: the batch file does not execute.
Any help, advice, or clues would be awesome.
P.S. I tried other scripts and successfully transferred a file to the VM.
P.P.S. Sorry for my English.
Update: All such processes start in session 0.
Have you tried interactiveSession?
https://github.com/vmware/pyvmomi/blob/master/docs/vim/vm/guest/GuestAuthentication.rst
This boolean argument is passed to NamePasswordAuthentication and means the following:
This is set to true if the client wants an interactive session in the guest.
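For example, a small sketch of what that could look like in the script above (the rest of the credential setup stays the same; interactiveSession is the only addition):
# Request an interactive session so the program is started in the logged-in
# user's desktop session rather than in session 0.
creds = vim.vm.guest.NamePasswordAuthentication(
    username=args.vm_user,
    password=args.vm_pwd,
    interactiveSession=True
)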
ansible-playbook has a --list-hosts CLI switch that just outputs the hosts affected by each play in a playbook. I am looking for a way to access the same information through the Python API.
The (very) basic script I am using to test right now is:
#!/usr/bin/python

import ansible.runner
import ansible.playbook
import ansible.inventory
from ansible import callbacks
from ansible import utils
import json

# hosts list
hosts = ["127.0.0.1"]
# set up the inventory; if no group is defined then the 'all' group is used by default
example_inventory = ansible.inventory.Inventory(hosts)

pm = ansible.runner.Runner(
    module_name='command',
    module_args='uname -a',
    timeout=5,
    inventory=example_inventory,
    subset='all'  # name of the hosts group
)

out = pm.run()

print json.dumps(out, sort_keys=True, indent=4, separators=(',', ': '))
I just can't figure out what to add to ansible.runner.Runner() to make it output the affected hosts and exit.
I'm not sure what you are trying to achieve, but ansible.runner.Runner is actually a single task, not a playbook.
Your script is more akin to the ansible CLI than to ansible-playbook.
And ansible doesn't have any kind of --list-hosts, while ansible-playbook does.
You can see how listhosts is done here.
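If it helps, here is a rough sketch of the inventory-level equivalent for the single-task script above; Inventory.list_hosts is assumed to exist in the same pre-2.0 API the script uses, so check it against your Ansible version:
# Ask the inventory which hosts the pattern used as subset= would match;
# this is essentially the information --list-hosts prints for each play.
matched_hosts = example_inventory.list_hosts('all')
print matched_hosts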