Store database connection in Python? [closed]

I have a script A (a Python script) which opens a database connection, executes some queries and then closes the connection.
I am not sure how long script A will run; it all depends on the load.
I have another script B (a shell script) which runs script A in a while loop, which means that script A is always running.
My database uses almost 100% of my CPU, or more. I think it is because of repeatedly opening and closing the connection.
Is there any way to improve performance?
I am using a MySQL database and planning to move to PostgreSQL.
I want to store the connection somewhere and reuse it if it is still active, or create a new one otherwise. I am not sure how to do it. Any ideas?

I think it is because of repeatedly opening and closing the connection.
Based on what evidence? Have you done any tracing or profiling to confirm it?
All the Python interpreter start-ups won't help either. Overall, this sounds very inefficient.
Personally, I recommend getting rid of the shell script wrapper and doing it all in the same Python script: connect once in an outer loop and re-use the same connection in each inner iteration.
You can't "save" the connection. When the script terminates, the connection closes.
You could use a connection pooler like PgBouncer to reduce the overhead of creating and destroying all those connections, but it won't be as good as just doing everything within a single script.
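A minimal sketch of that single-script approach, assuming the MySQLdb driver (the connection details and the query are placeholders for whatever script A actually does):

import time
import MySQLdb

# Connect once, before the loop, instead of once per run of script A.
conn = MySQLdb.connect(host="localhost", user="user", passwd="secret", db="mydb")

try:
    while True:
        cur = conn.cursor()
        cur.execute("UPDATE jobs SET processed = 1 WHERE processed = 0")  # placeholder query
        conn.commit()
        cur.close()
        time.sleep(1)  # pace the loop instead of respawning the interpreter
finally:
    conn.close()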

You can add a logical flag inside script B and not execute A unless it has finished the previous run. You can set the flag once you start script A and clear it at the end. This will prevent overlapping and executing A in parallel.
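If you keep the shell wrapper, one way to implement that flag from the Python side is an advisory lock file, so a second copy of script A simply exits if one is already running (a sketch; the lock file path is arbitrary):

import fcntl
import sys

lock_file = open("/tmp/script_a.lock", "w")
try:
    # Non-blocking exclusive lock: fails immediately if another instance holds it.
    fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
except IOError:
    sys.exit(0)  # the previous run has not finished yet, so skip this one

# ... the actual work of script A goes here ...

The lock is released automatically when the process exits, so there is no flag to clear by hand.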

Related

Python. Threading [closed]

Hi, I have a server/client model using the SocketServer module. The server's job is to receive test names from the clients and launch the tests.
The tests are launched using the subprocess module.
I would like the server to keep answering clients, with any new jobs stacked on a list or queue and launched one after the other. The only restriction I have is that the server should not launch a test unless the currently running one has completed.
Thanks
You can use the multiprocessing module to start new processes. On the server side, you would have a variable which refers to the currently running process. You can still have your SocketServer running, accepting requests and storing them in a list. Every second (or whatever interval you want), in another thread, you would check whether the current process is dead by calling is_alive(). If it is dead, simply run the next test on the list.
A better way is to have that checking thread call .join() on the process, so the next line of code only runs once the current process is dead. That way you don't have to keep polling every second or so, and it is more efficient.
What you might want to do is:
Get the test name in the server socket and put it in a Queue
In a separate thread, read test names from the Queue one by one
Execute the process and wait for it to end using communicate()
Keep polling the Queue for new tests; repeat steps 2 and 3 as test names become available
Meanwhile the server continues receiving test names and putting them in the Queue
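A compact sketch of those steps (Python 2 names to match the SocketServer question; the test command is a placeholder and error handling is omitted):

import subprocess
import threading
import Queue  # "queue" in Python 3

test_queue = Queue.Queue()

def worker():
    while True:
        test_name = test_queue.get()  # blocks until a test name is available
        proc = subprocess.Popen(["./run_test.sh", test_name],
                                stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        proc.communicate()  # waits for the current test to finish
        test_queue.task_done()

t = threading.Thread(target=worker)
t.daemon = True
t.start()

# In the SocketServer request handler, just enqueue the received name:
# test_queue.put(test_name)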

Python - Howto manage a list of threads [closed]

I am using Python 2.7.6 and the threading module.
I am fairly new to Python threading. I am trying to write a program that reads files from a filesystem and stores some hashes in my database. There are a lot of files, and I would like to do it in threads, like one thread for every folder that starts with a, one thread for every folder that starts with b, and so on. Since I want to use a database connection in the threads, I don't want to spawn 26 threads at once. So I would like to have 10 threads running, and whenever one of them finishes I want to start a new one.
The main program should hold a list of threads with a specified maximum number of threads (e.g. 10)
The main program should start 10 threads
The main program should be notified when a thread has finished
If a thread has finished, start a new one
And so on ... until the job is done and every thread has finished
I am not quite sure what the main program should look like. How can I manage this list of threads without a big overhead?
I'd like to point out that Python doesn't handle multi-threading well. As you might know (or not), Python comes with a Global Interpreter Lock (GIL) that doesn't allow real concurrency: only one thread executes Python bytecode at a time. (However, you will not see the execution as sequential, thanks to your machine's scheduler.)
Take a look here for more information : http://www.dabeaz.com/python/UnderstandingGIL.pdf
That said, if you still want to do it this way, take a look at semaphores: every thread will have to acquire it, and if you initialize the semaphore to 10, only 10 threads at a time will be able to hold it.
https://docs.python.org/2/library/threading.html#threading.Semaphore
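A small sketch of that idea, with the actual hashing work left as a comment:

import string
import threading

max_workers = threading.BoundedSemaphore(10)  # at most 10 threads do real work at once

def process_folder(letter):
    with max_workers:  # acquire; released automatically when the block exits
        # Open this thread's own DB connection here, walk the folders
        # starting with `letter`, hash the files and store the results.
        pass

threads = [threading.Thread(target=process_folder, args=(c,))
           for c in string.ascii_lowercase]
for t in threads:
    t.start()
for t in threads:
    t.join()

All 26 threads are created up front, but the semaphore keeps only 10 of them working at any moment.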
Hope it helps

What linux signals should I trap to make a good application [closed]

I'm creating a program in Python that auto-runs on an embedded system, and I want to handle program interruptions gracefully. That is, if possible, close resources and signal child processes to also exit gracefully before actually exiting. At that point my watchdog should notice this and respawn everything.
What signals can/should I expect to receive in a non-interactive program on Linux? I'm using try/except blocks to handle I/O errors.
Is the system shutting down an event that is signalled? In addition to my watchdog, I will also have an independent process monitoring a hardware line that gets set when my power supply detects a brownout. I have a supercap to provide some run-time to allow a proper shutdown.
Trap SIGINT and SIGTERM, and make sure to clean up anything like sockets, files, locks, etc.
Trap other signals based on what you are doing. For instance, if you have open pipes you might trap SIGPIPE.
Just remember that signal handling opens you up to race conditions. You probably want to use sigprocmask to block signals while handling them.
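A minimal sketch of trapping SIGINT and SIGTERM in Python; the cleanup itself is application-specific:

import signal
import sys

def shutdown(signum, frame):
    # Close sockets, files and locks here, and tell child processes to exit.
    sys.exit(0)

signal.signal(signal.SIGINT, shutdown)
signal.signal(signal.SIGTERM, shutdown)
signal.signal(signal.SIGPIPE, signal.SIG_IGN)  # optional: ignore broken pipes

# main loop of the program runs here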

Performance review based python script on yocto linux [closed]

I need to develop a performance-review-based Python script; here is the scenario.
I need to send logs to ELK (Elasticsearch, Logstash, Kibana) from Yocto Linux, but only when system resources are free enough.
So what I need here is a Python script which continuously monitors system performance, and when a resource like CPU is below 50% it starts sending the logs; if CPU goes above 50% again, it PAUSES the logging.
Now, I have no idea whether we can pause a process with Python or not. I want this for the logs, so that when it starts again it sends the logs from where it stopped last time.
Yes, all your requirements are possible in Python.
In fact, it's possible in basically any language, because you're not asking for cutting-edge stuff; this is basic scripting.
Sending logs to ES/Kibana
It's possible: Kibana, ES and Splunk all have public APIs with good documentation on how to do it.
Pausing a process in Linux
Yes, also possible. If it's an external process, simply find the PID of your process and send kill -STOP <PID>, which will stop the process; to resume it, run kill -CONT <PID>. If it's your own process that you want to pause, simply enter a sleep loop in your code (a simple example: while PAUSED: time.sleep(0.5)).
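A rough sketch of such a monitor, assuming the psutil package is available on the Yocto image and that the log shipper runs as a separate process whose PID you know (both of these are assumptions, not part of the question):

import os
import signal
import psutil  # third-party; assumed to be installed

shipper_pid = 1234  # placeholder: PID of the process that sends logs to ELK
paused = False

while True:
    cpu = psutil.cpu_percent(interval=1)  # CPU usage over the last second, in percent
    if cpu >= 50 and not paused:
        os.kill(shipper_pid, signal.SIGSTOP)  # freeze the log shipper
        paused = True
    elif cpu < 50 and paused:
        os.kill(shipper_pid, signal.SIGCONT)  # let it continue
        paused = False

Because SIGSTOP freezes the shipper in place, it naturally resumes sending from where it stopped, which covers the "continue where it left off" requirement.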

why passing Queue and Database connection as parameter in multithreading? [closed]

I was reading about the multi-threading priority queue here. In it, I don't understand why workQueue is passed as a parameter to the __init__ method in class myThread; we could have used workQueue directly instead of using self.q. So I wrote it without it and it worked, but then I tried to do the same for connecting to a database. I opened a common DB connection and allowed every thread to use it. But it did not work (my update was not reflected in the database). I thought that as the threads were pre-empting each other it was not possible for them to maintain a connection to execute the query. But then I gave every thread its own DB connection, which I initially passed to the __init__ method.
Basically, I implemented this. And to my surprise this worked. How is it different from what I was doing?
In this I don't understand why workQueue is passed as a parameter to the __init__ method in class myThread; we could have directly used workQueue instead of using self.q
In this particular example, sure, you could just reference the global workQueue variable.
But that's not a very general approach; global variables often create a mess. What if you want the object to be able to work with several different work queues for different purposes? Better to just pass the queue you want the object to work with, instead of having the object reference a global variable.
I opened a common DB connection and allowed every thread to use it.
Database connections are not thread safe, so expect random stuff to happen when you do that.
As the documentation states:
The MySQL protocol can not handle multiple threads using the same connection at once. ... The general upshot of this is: Don't share connections between threads.
So what you should be doing is use one connection per thread, which, as you discovered, works fine. This is different from how the Queue is used, which in the example code is properly locked when you access it.
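For example, each thread can open its own connection inside run(); a sketch using MySQLdb, with placeholder connection details and a placeholder query:

import threading
import Queue
import MySQLdb

class Worker(threading.Thread):
    def __init__(self, q):
        threading.Thread.__init__(self)
        self.q = q

    def run(self):
        # One connection per thread, never shared.
        conn = MySQLdb.connect(host="localhost", user="user",
                               passwd="secret", db="mydb")
        cur = conn.cursor()
        while True:
            try:
                item = self.q.get_nowait()
            except Queue.Empty:
                break
            cur.execute("UPDATE items SET done = 1 WHERE id = %s", (item,))
            conn.commit()
        cur.close()
        conn.close()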
According to the documentation:
The MySQL protocol can not handle multiple threads using the same connection at once.
That's why it doesn't work: you can't share a DB connection (at least not for MySQL) between threads.
The example you linked to is creating a connection for each thread:
for thread in range(threads):
    try:
        connections.append(MySQLdb.connect(host=mysql_host, user=mysql_user,
                                           passwd=mysql_pass, db=mysql_db,
                                           port=mysql_port))
