Subprocess bad performance in unix shell script [closed] - python

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions concerning problems with code you've written must describe the specific problem — and include valid code to reproduce it — in the question itself. See SSCCE.org for guidance.
Closed 9 years ago.
I wrote a server monitoring script that mainly tracks network drive usage and the status of cluster jobs. It's really basic and mostly uses Unix commands such as top, status, df and the like.
I rely on subprocess, which works well, but under heavy workload it gets really slow and uses a lot of CPU. The slowest part is where I grep users from status -a when they have thousands of jobs running.
The script runs in an endless while loop.
So I'm looking for more efficient ways to do this; any help or hint will be appreciated. I'm using Python 2.7.

I suggest taking a look at iotop, especially its source code, since it is written in Python.
The general idea is to not use the Unix tools (top, df...) but to parse their source of information directly, which is /proc.
Opening a file (especially on an in-memory filesystem like procfs) is much faster than forking a process to launch a Unix command.
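To illustrate the idea of reading /proc directly instead of forking, here is a minimal sketch in Python 3 syntax (the question uses 2.7, so small adjustments may be needed). The function names parse_meminfo, read_meminfo and disk_usage_percent are made up for this example; os.statvfs is a standard library call that can stand in for df on Unix without spawning a process.

```python
import os


def parse_meminfo(text):
    """Parse the contents of /proc/meminfo into {name: integer value}."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            # Most values are in kB; a few (e.g. HugePages_Total) are plain counts.
            info[key.strip()] = int(fields[0])
    return info


def read_meminfo(path="/proc/meminfo"):
    """Read memory statistics with a plain file read, no subprocess involved."""
    with open(path) as f:
        return parse_meminfo(f.read())


def disk_usage_percent(mount="/"):
    """Approximate the Use% column of df for one mount point (Unix only)."""
    st = os.statvfs(mount)
    used = st.f_blocks - st.f_bfree
    total = used + st.f_bavail
    return 100.0 * used / total if total else 0.0
```

Calling read_meminfo() or disk_usage_percent("/home") inside the monitoring loop costs a file read or a single system call, rather than a fork and exec per sample.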

Related

How are scheduled Python programs typically run? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about a specific programming problem, a software algorithm, or software tools primarily used by programmers. If you believe the question would be on-topic on another Stack Exchange site, you can leave a comment to explain where the question may be able to be answered.
Closed 6 years ago.
Let's say I want to run some function once a day at 10 am.
Do I simply keep a script running in the background forever?
What if I don't want to keep my laptop open/on for many days at a time?
Will the process eat a lot of CPU?
Are the answers to these questions different if I use cron/launchd vs scheduling programmatically? Thanks!
The answer to this question will likely depend on your platform, the available facilities and your particular project needs.
First let me address system resources. If you want to use the fewest resources, just call time.sleep(NNN), where NNN is the number of seconds until the next instance of 10 AM. time.sleep will suspend execution of your program and should consume zero (or virtually zero) resources. The Python GC may periodically wake up and do maintenance, but its work should be negligible.
If you're on Unix, cron is the typical facility for scheduling future tasks. It implements a fairly efficient Franta–Maly event list manager: based on the list of tasks, it determines which one will occur next and sleeps until then.
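As a rough sketch of the time.sleep approach, the number of seconds until the next 10 AM can be computed with datetime. The names seconds_until and run_daily are hypothetical helpers for this example, and the calculation uses naive local time, so it ignores daylight saving transitions.

```python
import time
from datetime import datetime, timedelta


def seconds_until(hour, minute=0, now=None):
    """Seconds from `now` until the next occurrence of hour:minute (local time)."""
    now = now or datetime.now()
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= now:
        # Today's slot has already passed, so aim for tomorrow.
        target += timedelta(days=1)
    return (target - now).total_seconds()


def run_daily(job, hour=10):
    """Sleep until the next `hour` o'clock, run the job, and repeat forever."""
    while True:
        time.sleep(seconds_until(hour))
        job()
```

Note that this only works while the process stays alive; if the laptop is off or asleep at 10 AM, the run is simply missed.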
On Windows, you have the Task Scheduler. It's a Frankenstein of complexity, but it's incredibly flexible and can handle running events that were missed due to power outages, laptop hibernation, etc.

Best solution for asynchronous socket programming [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 6 years ago.
Recently I've started working on a Python socket server which handles raw UTF input from Java's streams and sends the result back to all of the currently connected servers. That works fine, but I'm worried about thread usage: I'm using about 2 threads per connection, and I'm afraid the CPU will not keep up for long. So I need a better solution so that my server can handle hundreds of connections.
I have two ideas for that:
Using non-blocking IO
Having a fixed-size thread pool (i.e. FixedThreadPool, as it is called in Java)
I have no idea which one is gonna work better, so I'd appreciate your advice and ideas.
Thanks!
I would advise not reinventing the wheel and using a framework for async/streaming processing instead, for example Tornado.
Also, if you can, consider using the Go language: a lot of developers (including me) are switching from Python to Go for this kind of task. It's designed from the ground up to support async processing.
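To give a feel for the non-blocking option, here is a minimal sketch that uses the standard library's asyncio (Python 3.7+) rather than Tornado, purely to keep the example dependency-free. It assumes a line-based protocol and broadcasts every received line to all connected peers; the names clients, handle and serve, and the default port, are invented for this example.

```python
import asyncio

clients = set()  # StreamWriter objects of all currently connected peers


async def handle(reader, writer):
    """Serve one connection as a coroutine instead of a dedicated thread."""
    clients.add(writer)
    try:
        while True:
            line = await reader.readline()
            if not line:  # empty bytes means the peer closed the connection
                break
            peers = list(clients)
            for peer in peers:
                peer.write(line)
            # Flush every peer; ignore errors from peers that went away.
            await asyncio.gather(*(p.drain() for p in peers), return_exceptions=True)
    finally:
        clients.discard(writer)
        writer.close()


async def serve(host="127.0.0.1", port=8888):
    """Run the broadcast server until cancelled."""
    server = await asyncio.start_server(handle, host, port)
    async with server:
        await server.serve_forever()
```

Starting it with asyncio.run(serve()) handles all connections on a single thread, so hundreds of mostly idle connections cost little more than their socket buffers.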

Performance review based python script on yocto linux [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 7 years ago.
I need to develop a Python script driven by system performance. Here is the scenario.
I need to send logs to ELK (Elasticsearch, Logstash, Kibana) from Yocto Linux, but only when system resources are free enough.
So what I need is a Python script that continuously monitors system performance: when a resource such as CPU usage is below 50%, it starts sending the logs, and if CPU usage goes above 50% again, it pauses the logging.
I don't know whether a process can be paused from Python or not. I need this for logs, so when sending starts again it should continue from where it stopped last time.
Yes, all your requirements are possible in Python.
In fact it's possible in basically any language because you're not asking for cutting edge stuff, this is basic scripting.
Sending logs to ES/Kibana
It's possible. Kibana, ES and Splunk all have public APIs with good documentation on how to do it.
Pausing a process in Linux
Yes, also possible. If it's an external process, simply find the PID of your process and run kill -STOP <PID>, which stops the process; to resume it, run kill -CONT <PID>. If it's your own process that you want to pause, simply enter a sleep cycle in your code (simple example: while PAUSED: time.sleep(0.5)).
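The same pause/resume can be done from Python with os.kill, and the 50% CPU check can be derived from two samples of /proc/stat. This is only a sketch under the assumption of a Linux system with the usual /proc/stat layout; cpu_times, busy_fraction, pause and resume are names invented for this example.

```python
import os
import signal


def cpu_times(stat_text):
    """Return (total, idle) jiffies from the aggregate 'cpu' line of /proc/stat."""
    # Fields: user nice system idle iowait irq softirq steal (guest values are
    # already counted in user/nice, so only the first eight are summed).
    fields = [int(x) for x in stat_text.splitlines()[0].split()[1:9]]
    idle = fields[3] + fields[4]  # idle + iowait
    return sum(fields), idle


def busy_fraction(before, after):
    """CPU usage between two /proc/stat snapshots, as a value from 0.0 to 1.0."""
    total0, idle0 = cpu_times(before)
    total1, idle1 = cpu_times(after)
    total = total1 - total0
    return 1.0 - (idle1 - idle0) / total if total else 0.0


def pause(pid):
    """Equivalent of kill -STOP <PID>."""
    os.kill(pid, signal.SIGSTOP)


def resume(pid):
    """Equivalent of kill -CONT <PID>."""
    os.kill(pid, signal.SIGCONT)
```

A monitoring loop could read /proc/stat, sleep for a second, read it again, and call pause(pid) when busy_fraction(...) exceeds 0.5 or resume(pid) otherwise. A process stopped with SIGSTOP keeps its open files and file offsets, so a log shipper resumed with SIGCONT carries on from where it was stopped.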

Run Python script in background on remote server with Django view [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 8 years ago.
What I want to achieve is to run some Python script which will collect data and insert it into a DB in the background.
So basically, a person opens a Django view, clicks a button and then closes the browser, and Django launches this script on the server; the script then collects data in the background while everything else carries on as usual.
What is the best library, framework, module or package to achieve such functionality?
Celery is the most used tool for such tasks.
Celery is a good suggestion, but it's a rather heavy solution, and simpler, more straightforward options exist unless you need the full power of Celery.
So I suggest using rq and the Django integration for rq.
RQ is inspired by the good parts of Celery and Resque, and was created as a lightweight alternative to the heaviness of Celery or other AMQP-based queuing implementations.
I'd humbly recommend the standard library module multiprocessing for this. As long as the background process can run on the same server as the one processing the requests, you'll be fine.
Although I consider this to be the simplest solution, it wouldn't scale well at all, since you'd be running extra processes on your server. If you expect these things to happen only once in a while, and not to last that long, it's a good quick solution.
One thing to keep in mind though: in the newly started process, ALWAYS close your database connection before doing anything else. The forked process shares the same connection to the SQL server and might get into data races with your main Django process.

What are the proper security measures involved with opening a port with Python? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 9 years ago.
Consider a very basic python socket which opens up a port to the internet on the host computer and listens for incoming messages, which are displayed in a terminal.
Keeping a port wide open like this is considered fairly vulnerable, correct? What security features should I implement? Should incoming data be sanitized? What's the best way of going about this?
Thanks.
Why would this be vulnerable? Your program accepts connections from arbitrary people (potentially on the whole Internet), and lets them display arbitrary bytes to your terminal. There is only one attack vector here: your terminal itself. If your terminal has a bug that (for example) executes bytes instead of printing them, then the system could be compromised because of this setup.
However, that is unlikely -- in fact, one common technique for verifying that programs aren't totally broken is to pass arbitrary data into them and see if/how they explode. This is called fuzz testing, and if there was such a bug in your terminal when it was fuzz tested, the fuzz test would produce really interesting explosions, rather than just terminal garbage.
Just because something is accessible to the Internet on a port doesn't mean there's a vulnerability. You need an actual exploitable flaw, and in this case, there probably isn't one. (Although one never knows.)
What are you trying to secure? Using Python to listen on a socket isn't going to directly expose you to a vuln unless the Python interpreter has an unknown vuln.
Handling incoming messages is a different matter.
If you're writing to a terminal, does that mean the incoming data is expected to be in a specific format? How are you parsing incoming data? What happens if someone cats /dev/random into your port and leaves the connection open for a nice, long time?
Does the order or content of messages matter?
And so on. There aren't many specifics of the scenario to comment on, so the recommendations will be equally vague. As a start, take a look at OWASP secure coding principles for general concepts (they're applicable even if you're not dealing with HTTP or HTML).
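As one concrete example of handling incoming messages defensively, the sketch below caps how much is read, sets a timeout so an idle peer cannot hold the connection forever, and replaces control characters (such as terminal escape sequences) before anything is printed. The names read_message and sanitize_for_terminal and the default limits are arbitrary choices for illustration, not a complete security measure.

```python
import socket


def sanitize_for_terminal(data, max_len=1024):
    """Decode untrusted bytes and replace anything that is not plain printable text."""
    text = data[:max_len].decode("utf-8", errors="replace")
    # Keep newlines and tabs; swap other control characters (e.g. ESC) for '?'.
    return "".join(ch if ch.isprintable() or ch in "\n\t" else "?" for ch in text)


def read_message(conn, max_len=1024, timeout=10.0):
    """Read at most max_len bytes from a connected socket, giving up after timeout."""
    conn.settimeout(timeout)
    try:
        data = conn.recv(max_len)
    except socket.timeout:
        return ""
    return sanitize_for_terminal(data, max_len)
```

With this in place, someone piping /dev/random into the port produces at most max_len characters of '?' and replacement characters per read instead of raw escape sequences.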
