This may seem like a dumb question, but I haven't found an answer regarding it. I was coding a loop in Python using Sublime Text, and I accidentally set the wrong conditions, which led to an infinite loop.
After multiple bad attempts, I've noticed that my OS is slower. I was wondering whether running such loops could harm the RAM or the processor even if I force-quit the application, or whether it was just a coincidence.
Don't worry. Roughly speaking, the CPU in your computer is always running whether you're executing your application code or not, and the RAM is always powered on as long as the computer is running.
In fact, DDR memory has to be refreshed periodically in order to work at all (think of this as periodic read-write cycles, although they're carried out by the chip itself).
So no, an infinite loop will not wear out your CPU or RAM, but it could prevent some parts of them from entering low-power modes, depending on the actual hardware and OS.
After multiple bad attempts, I've noticed that my OS is slower. I was wondering whether running such loops could harm the RAM or the processor even if I force-quit the application, or whether it was just a coincidence.
No. Unless the application has left behind execution artifacts (such as other zombie processes which happen to loop too), nothing will happen (after the execution of the process stops, the operating system reclaims all the resources it held).
I was wondering whether running such loops could harm the RAM or the processor even if I force-quit the application
As far as the CPU is concerned, an infinite loop is just a set of never ending (conditional or unconditional) jumps, that may or may not be followed by other meaningful instructions. It's completely harmless.
As far as I know, yes. Your RAM will start getting used up. I've had this happen to me as well and sometimes I had no choice but to force-quit the application too.
In the case of Sublime Text, just use ctrl+break to stop the execution. It may happen in certain code you write where it isn't immediately obvious that this is happening. However, you can easily check the RAM usage and you'll see it spiking!
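If you'd rather watch that from inside Python than from the task manager, here is a tiny sketch using the third-party psutil package (the list-building loop is just a stand-in; note that a loop only drives memory up if it keeps allocating - a bare busy loop pegs a CPU core but does not consume more RAM):

    # Watch the current process's resident memory (RSS) grow while a loop runs.
    import psutil

    proc = psutil.Process()                    # the current Python process
    data = []
    for i in range(1_000_000):
        data.append(i)                         # stand-in for whatever the loop allocates
        if i % 100_000 == 0:
            rss_mb = proc.memory_info().rss / 1024 ** 2
            print(f"RSS: {rss_mb:.1f} MB")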
Once the process/thread in which the loop was executing is destroyed, then the loop execution is also terminated.
There may be cleanup of resources that still needs to occur, which could have an 'impact' depending on how those resources were managed... As far as I know, Python performs all the necessary cleanup, preventing memory leaks, etc.
I am very new to the concept of threading. I was going over the content on this site about threading and came across the claim that tasks that spend much of their time waiting for external events are generally good candidates for threading. May I know why this statement is true?
Threading allows for efficient CPU usage. Tasks that spend a lot of time waiting for other events to finish can be put to sleep (this means temporarily stopped) with Threading.
By putting a thread to sleep, the CPU it was being executed with becomes free to execute other tasks while waiting for the thread to be woken up.
The ability to sleep and wake up allows:
(1) Faster computation without much overhead
(2) A reduction in wasted computational resources
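As a rough illustration of the point above (the sleeping function is just a stand-in for a real blocking call such as a network read or disk access):

    # A worker thread blocks on a slow, I/O-like operation (simulated with
    # time.sleep) while the main thread keeps doing useful work.
    import threading
    import time

    def wait_for_external_event():
        time.sleep(2)                # stand-in for a network call, disk read, etc.
        print("external event arrived")

    worker = threading.Thread(target=wait_for_external_event)
    worker.start()

    # The main thread is free to do other work while the worker is blocked.
    for i in range(3):
        print(f"main thread still working ({i})")
        time.sleep(0.5)

    worker.join()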
Alternative viewpoint:
I don't know about Python specifically, but in many other programming languages/libraries, there will be some kind of "async I/O" mechanism, or "select" mechanism, or "event-driven" paradigm that enables a single-threaded program to actively do one thing while simultaneously waiting for multiple other things to happen.
The problem that threads solve comes from the fact that each independent source of input/events typically drives some independent "activity," and each of those activities has its own separate state. In an async/event-driven programming style, the state of each activity must be explicitly represented in some set of variables: When the program receives an event that drives activity X, it has to be able to examine those variables so that it can "pick up where it left off" last time it was working on activity X.
With threads, part or all of the state of activity X can be implicit in the X thread's context (that is, in the value of its program counter, in its registers, and in the contents of its call stack). The code for a thread that handles one particular activity can look a lot like the pure-procedural code that we all first learned to write when we were rank beginners - much more familiar looking than any system of "event handlers" and explicit state variables.
The downside of using multiple threads is that the familiar look and feel of the code can lull us into a false sense of security - we can easily overlook the possibility of deadlocks, race conditions, and other hazards to which multi-threading exposes us. Multi-threaded code can be easy to read, but it can be much harder to write without making subtle mistakes that are hard to catch in testing.
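To make the "explicit state vs implicit state" distinction above concrete, here is a tiny, made-up Python illustration. The event-handler version has to keep each activity's progress in an explicit dictionary; the threaded version keeps the same state implicitly in each thread's local variables and call stack. (All names and data are invented for illustration.)

    # Event-driven style: per-activity state lives in explicit variables.
    state = {"X": {"lines_read": 0}, "Y": {"lines_read": 0}}

    def on_line_received(activity, line):
        # must look up where this activity left off before continuing
        state[activity]["lines_read"] += 1
        print(activity, "has read", state[activity]["lines_read"], "lines")

    on_line_received("X", "a")
    on_line_received("Y", "c")
    on_line_received("X", "b")

    # Threaded style: the same state is implicit in the thread's locals.
    import threading

    def handle_activity(name, source):
        lines_read = 0                     # state lives right here, on this stack
        for line in source:
            lines_read += 1
            print(name, "has read", lines_read, "lines")

    threading.Thread(target=handle_activity, args=("X", ["a", "b"])).start()
    threading.Thread(target=handle_activity, args=("Y", ["c"])).start()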
Perhaps this is a broad question, but I haven't found an answer elsewhere, so here goes.
The Python script I'm writing needs to run constantly (in a perfect world, I recognize this may not be exactly possible) on a deployed device. I've already dedicated time to adding "try...except" statements throughout so that, should an issue arise, the script will recover and continue to work.
The issue is that I'm not sure I can (nor should) handle every single possible exception that may be thrown. As such, I've decided it may be better to allow the script to die and to use systemd to restart it.
The three options:
Making no attempt to handle any exception, and just allowing systemd to restart it whenever it dies.
Meticulously creating handlers for every possible exception to guarantee that, short of loss of power, interpreter bug, or heat death of the universe, the script will always run.
A mix of the two -- making an effort to prevent crashes in some cases while allowing them in others and letting systemd restart the script.
The third choice seems the most reasonable to me. So the question is this: What factors should be considered when optimizing between "crash-proof" code and allowing a crash and restart by systemd?
For some more application-specific information: there is a small but noticeable overhead involved in starting the script, the main portion will run between 50 and 100 times per second, it is not "mission critical" in that there will be no death or damage in the event of failure (just some data loss), and I already expect intermittent issues with the network it will be on.
All known exceptional cases should be handled. Any undefined behavior is a potential security issue.
As you suggest, it is also prudent to plan for unknown exceptions. Perhaps there's a small memory leak that will eventually cause the application to crash even when it's otherwise running correctly. So it's still prudent to have systemd automatically restart it if it fails, even when all expected failure modes have been handled.
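Here is a minimal sketch of that mixed approach (option 3): expected, recoverable errors - modelled here as OSError for the intermittent network trouble mentioned in the question - are handled inside the loop, anything unexpected is logged and allowed to kill the process, and a unit with Restart=on-failure brings it back. The names, log file, and timings are illustrative, not from the question.

    # Option 3: handle what you expect, log and crash on what you don't,
    # and rely on systemd (Restart=on-failure) to restart the process.
    import logging
    import time

    logging.basicConfig(filename="daemon.log", level=logging.INFO)

    def do_one_iteration():
        pass  # the real work, running 50-100 times per second

    def main():
        while True:
            try:
                do_one_iteration()
            except OSError as exc:          # expected: intermittent network trouble
                logging.warning("network error, retrying: %s", exc)
                time.sleep(1)
            time.sleep(0.01)

    if __name__ == "__main__":
        try:
            main()
        except Exception:
            logging.exception("unhandled error, exiting so systemd can restart us")
            raise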
I am writing a simple daemon to receive data from N many mobile devices. The device will poll the server and send the data it needs as simple JSON. In generic terms the server will receive the data and then "do stuff" with it.
I know this topic has been beaten a bunch of times but I am having a hard time understanding the pros and cons.
Would threads or events (think Twisted in Python) work better for this situation as far as concurrency and scalability is concerned? The event model seems to make more sense but I wanted to poll you guys. Data comes in -> Process data -> wait for more data. What if the "do stuff" was something very computationally intensive? What if the "do stuff" was very IO intensive (such as inserting into a database). Would this block the event loop? What are the pros and drawbacks of each approach?
I can only answer in the context of Python, since that's where most of my experience is. The answer is actually probably a little different depending on the language you choose. Python, for example, is a lot better at parallelizing I/O intensive operations than CPU intensive operations.
Asynchronous programming libraries like twisted, tornado, gevent, etc. are really good at handling lots of I/O in parallel. If your workload involves many clients connecting, doing light CPU operations and/or lots of I/O operations (like db reads/writes), or if your clients are making long-lasting connections primarily used for I/O (think WebSockets), then an asynchronous library will work really well for you. Most of the asynchronous libraries for Python have asynchronous drivers for popular DBs, so you'll be able to interact with them without blocking your event loop.
If your server is going to be doing lots of CPU-intensive work, you can still use asynchronous libraries, but you have to understand that every time you're doing CPU work, the event loop will be blocked. No other clients are going to be able to do anything at all. However, there are ways around this. You can use thread/process pools to farm the CPU work out and just wait on the response asynchronously, but obviously that complicates your implementation a little bit.
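As a rough sketch of that pool idea - using asyncio here rather than Twisted, and a made-up cpu_heavy function - the event loop awaits the result while a process pool does the number crunching:

    # Farm CPU-bound work out to a process pool so the event loop stays responsive.
    import asyncio
    from concurrent.futures import ProcessPoolExecutor

    def cpu_heavy(n):
        return sum(i * i for i in range(n))      # stand-in for real CPU-bound work

    async def handle_request(loop, pool, n):
        # The event loop is free to serve other clients while this runs elsewhere.
        result = await loop.run_in_executor(pool, cpu_heavy, n)
        print(f"result for {n}: {result}")

    async def main():
        loop = asyncio.get_running_loop()
        with ProcessPoolExecutor() as pool:
            await asyncio.gather(*(handle_request(loop, pool, n)
                                   for n in (10_000, 20_000)))

    if __name__ == "__main__":   # guard needed for ProcessPoolExecutor on some platforms
        asyncio.run(main())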
With Python, using threads instead actually doesn't buy you all that much for CPU operations, because in most cases only one thread can run at a time, so you're not really reaping the benefits of having a multi-core CPU (google "Python GIL" to learn more about this). Ignoring Python-specific issues with threads, threads will let you avoid the "blocked event loop" problem completely, and threaded code is usually easier to understand than asynchronous code, especially if you're not already familiar with how asynchronous programming works. But you also have to deal with the usual thread headaches (synchronizing shared state, etc.), and they don't scale as well as asynchronous I/O does with lots of clients (see http://en.wikipedia.org/wiki/C10k_problem).
Both approaches are used very successfully in production, so it's really up to you to decide which fits your needs/preferences better.
I think your question is in the 'it depends' category.
Different languages have different strengths and weaknesses when it comes to threads/processes/events (Python having some special weaknesses in threading tied to the global interpreter lock).
Beyond that, operating systems ALSO have different strengths and weaknesses when you look at processes vs threads vs events. What is right on Unix isn't going to be the same as on Windows.
With that said, the way that I sort out multifaceted IO projects is:
These projects are complex; no tool will simply make the complexity go away, therefore you have two choices for how to deal with it:
Have the OS deal with as much complexity as possible, making life easier for the programmers, but at the cost of machine efficiency
Have the programmer take on as much complexity as is practical so that they can optimize the design and squeeze as much performance out of the machine as possible, at the cost of more complex code that requires significantly higher-end programmers.
Option 1 is normally best accomplished by breaking apart the task into threads or processes with one blocking state-machine per thread/process
Option 2 is normally best accomplished by multiplexing all the tasks into one process and using the OS hook for an event system. (select/poll/epoll/kqueue/WaitForMultipleObjects/CoreFoundation/ libevent etc..)
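For option 2, a minimal sketch using Python's standard selectors module (which wraps select/poll/epoll/kqueue underneath); the echo behaviour and the port number are just placeholders for your "do stuff" step:

    # One process, one event loop: the OS tells us which sockets are ready.
    import selectors
    import socket

    sel = selectors.DefaultSelector()          # epoll/kqueue/poll/select underneath

    def accept(server):
        conn, _ = server.accept()
        conn.setblocking(False)
        sel.register(conn, selectors.EVENT_READ, echo)

    def echo(conn):
        data = conn.recv(4096)
        if data:
            conn.sendall(data)                 # echo back; real code would "do stuff"
        else:
            sel.unregister(conn)
            conn.close()

    server = socket.socket()
    server.bind(("0.0.0.0", 9000))
    server.listen()
    server.setblocking(False)
    sel.register(server, selectors.EVENT_READ, accept)

    while True:
        for key, _ in sel.select():
            key.data(key.fileobj)              # dispatch to accept() or echo()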
In my experience, the project's framework/internal architecture often comes down to the skills of the programmers at hand (and the budget the project has for hardware).
If you have programmers with a background in OS internals: Twisted will work great for Python, Node.js will work great for JavaScript, libevent/libev will work great for C or C++. You'll also end up with super efficient code that can scale easily, though you'll have a nightmare trying to hire more programmers.
If you have newbie programmers and you can dump money into lots of cloud services, then breaking the project into many threads or processes will give you the best chance of getting something working, though scaling will eventually become a problem.
All in all, I would say the sanest pattern for a project with multiple iterations is to prototype in simple blocking tools (Flask) and then rewrite into something harder/more scalable (Twisted); otherwise you're falling into the classic "premature optimization is the root of all evil" trap.
The connection scheme is also important in the choice. How many concurrent connections do you expect ? How long will a client stay connected ?
If each connection is tied to a thread, many concurrent connections or very long lasting connections ( as with websockets ) will choke the system. For these scenarios an event loop based solution will be better.
When the connections are short and the heavy processing happens after the disconnection, the two models are roughly on par.
I have a couple of Python/NumPy programs that tend to cause the PC to freeze/run very slowly when they use too much memory. I can't even stop the scripts or move the cursor anymore when a script uses too much memory (e.g. 3.8/4 GB).
Therefore, I would like to quit the program automatically when it hits a critical limit of memory usage, e.g. 3GB.
I could not find a solution yet. Is there a Pythonic way to deal with this, since I run my scripts on both Windows and Linux machines?
You could limit the process's memory usage, but that is OS-specific.
Another solution would be checking the value of psutil.virtual_memory() and exiting your program if it reaches some threshold.
Though OS-independent, the second solution is not Pythonic at all. Memory management is one of the things we have operating systems for.
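That said, if you do go that route, a minimal sketch of the psutil suggestion (psutil is a third-party package; the 3 GB threshold simply mirrors the number in the question):

    # Exit the program once system-wide memory usage crosses a chosen threshold.
    import sys
    import psutil

    LIMIT_BYTES = 3 * 1024 ** 3                 # ~3 GB, as in the question

    def check_memory():
        used = psutil.virtual_memory().used     # system-wide used memory, in bytes
        if used > LIMIT_BYTES:
            sys.exit(f"aborting: {used / 1024**3:.1f} GB used, over the limit")

    check_memory()   # call this periodically from the main computation loop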
I'd agree that in general you want to do this from within the operating system - only because there's a reliability factor in having "possibly runaway code check itself for possibly runaway behavior"
If a hard and fast requirement is to do this WITHIN the script, then I think we'd need to know more about what you're actually doing. If you have a single large data structure that's consuming the majority of the memory, you can use sys.getsizeof to identify how large that structure is, and throw/catch an error if it gets larger than you want.
But without knowing at least a little more about the program structure, I think it'll be hard to help...
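If you do go the in-script route with a single large NumPy array, a minimal sketch of the sys.getsizeof check might look like this (the 3 GB limit and the array are made-up placeholders; note that for plain containers getsizeof does not count the referenced objects, whereas a NumPy array that owns its data should report its whole buffer):

    # Raise an error if a large array grows past a chosen limit.
    import sys
    import numpy as np

    LIMIT_BYTES = 3 * 1024 ** 3                 # hypothetical 3 GB ceiling

    big_array = np.zeros((1000, 1000))          # stand-in for the real data structure

    if sys.getsizeof(big_array) > LIMIT_BYTES:
        raise MemoryError("big_array exceeds the configured memory limit")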
So I am an inexperienced Python coder, with what I have gathered might be a rather complicated need. I am a cognitive scientist and I need precise stimulus display and button-press detection. I have been told that the best way to do this is by using a real-time operating system, but have no idea how to go about this. Ideally, with each trial, the program would operate in real time, and then once the trial is over, the OS can go back to not being as meticulous. There would be around 56 trials. Might there be a way to code this from my Python script?
(Then again, all I need to know is when a stimulus is actually displayed. The real-time method would assure me that the stimulus is displayed when I want it to be, a top-down approach. On the other hand, I could take a more bottom-up approach if it is easier to just know to record when the computer actually got a chance to display it.)
When people talk about real-time computing, what they mean is that the latency from an interrupt (most commonly set off by a timer) to application code handling that interrupt being run, is both small and predictable. This then means that a control process can be run repeatedly at very precise time intervals or, as in your case, external events can be timed very precisely. The variation in latency is usually called "jitter" - 1ms maximum jitter means that an interrupt arriving repeatedly will have a response latency that varies by at most 1ms.
"Small" and "predictable" are both relative terms and when people talk about real-time performance they might mean 1μs maximum jitter (people building inverters for power transmission care about this sort of performance, for instance) or they might mean a couple of milliseconds maximum jitter. It all depends on the requirements of the application.
At any rate, Python is not likely to be the right tool for this job, for a few reasons:
Python runs mostly on desktop operating systems. Desktop operating systems impose a lower limit on the maximum jitter; in the case of Windows, it is several seconds. Multiple-second events don't happen very often, every day or two, and you'd be unlucky to have one coincide with the thing you're trying to measure, but sooner or later it will happen; jitter in the several-hundred-milliseconds region happens more often, perhaps every hour, and jitter in the tens-of-milliseconds region is fairly frequent. The numbers for desktop Linux are probably similar, though you can apply different compile-time options and patch sets to the Linux kernel to improve the situation - Google PREEMPT_RT_FULL.
Python's stop-the-world garbage collector makes latency non-deterministic. When Python decides it needs to run the garbage collector, your program gets stopped until it finishes. You may be able to avoid this through careful memory management and by carefully setting the garbage collector parameters, but depending on what libraries you are using, you may not be able to.
Other features of Python's memory management make deterministic latency difficult. Most real-time systems avoid heap allocation (ie C's malloc or C++'s new) because the amount of time they take is not predictable. Python neatly hides this from you, making it very difficult to control latency. Again, using lots of those nice off-the-shelf libraries only makes the situation worse.
In the same vein, it is essential that real-time processes have all their memory kept in physical RAM and not paged out to swap. There is no good way of controlling this in Python, especially running on Windows (on Linux you might be able to fit a call to mlockall in somewhere, but any new allocation will upset things).
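For what it's worth, that mlockall call can be reached from Python via ctypes on Linux; a hedged sketch (the constants are the standard Linux values for MCL_CURRENT and MCL_FUTURE, and this needs suitable privileges/ulimits and is not portable):

    # Lock the process's current and future memory into physical RAM (Linux only).
    import ctypes
    import os

    MCL_CURRENT = 1
    MCL_FUTURE = 2

    libc = ctypes.CDLL("libc.so.6", use_errno=True)
    if libc.mlockall(MCL_CURRENT | MCL_FUTURE) != 0:
        errno = ctypes.get_errno()
        raise OSError(errno, os.strerror(errno), "mlockall failed")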
I have a more basic question though. You don't say whether your button is a physical button or one on the screen. If it's one on the screen, the operating system will impose an unpredictable amount of latency between the physical mouse button press and the event arriving at your Python application. How will you account for this? Without a more accurate way of measuring it, how will you even know whether it is there?
Python is not, by a purist's standards, a real-time language - it has too many libraries and functions to be bare-bones fast. If you're already going through an OS, though, as opposed to an embedded system, you've already lost a lot of true real-time capability. (When I hear "real time" I think of the time it takes VHDL code to flow through the wires of an FPGA. Other people use it to mean "I hit a button and it does something that is, from my slow human perspective, instantaneous". I'll assume you're using the latter interpretation of real time.)
By stimulus display and button press detection I assume you mean you have something (for example) like a trial where you show a person an image and have them click a button to identify the image or confirm that they've seen it- perhaps to test reaction speed. Unless you're worried about accuracy down to the millisecond (which should be negligible compared to the time for a human reaction) you would be able to do a test like this using python. To work on the GUI, look into Tkinter: http://www.pythonware.com/library/tkinter/introduction/. To work on the timing between stimulus and a button press, look at the time docs: http://docs.python.org/library/time.html
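A minimal sketch of that Tkinter-plus-time approach (the stimulus text, the 500 ms delay, and the space-bar binding are all arbitrary choices for illustration):

    # Show a stimulus with Tkinter and time the interval until a key press.
    import time
    import tkinter as tk

    root = tk.Tk()
    label = tk.Label(root, text="", font=("Arial", 48), width=10, height=3)
    label.pack(padx=40, pady=40)

    shown_at = None                         # timestamp of when the stimulus was drawn

    def show_stimulus():
        global shown_at
        label.config(text="X")              # present the (placeholder) stimulus
        root.update_idletasks()             # flush the redraw before timestamping
        shown_at = time.perf_counter()

    def on_space(_event):
        if shown_at is not None:
            reaction_ms = (time.perf_counter() - shown_at) * 1000
            print(f"reaction time: {reaction_ms:.1f} ms")
            root.destroy()

    root.bind("<space>", on_space)
    root.after(500, show_stimulus)          # show the stimulus after 500 ms
    root.mainloop()

Note that this measures from when the draw call returns, not from when the pixels actually change on the screen, so the vsync caveat raised in another answer still applies.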
Good luck!
Because you are trying to get a scientific measurement on a time delay in millisecond precision, I cannot recommend any process that is subject to time slicing on a general purpose computer. Whether implemented in C, or Java, or Python, if it runs in a time-shared mode, then how can the result be verifiable? You could be challenged to prove that the CPU never interrupted the process during a measurement, thereby distorting the results.
It sounds like you may need to construct a dedicated device for this purpose, with a clock circuit that ticks at a known rate and can measure the discrete number of ticks that occur between stimulus and response. That device can then be controlled by software that has no such timing constraints. Maybe you should post this question to the Electrical Engineering exchange.
Without a dedicated device, you will have to develop truly real-time software that, in terms of modern operating systems, runs within the kernel and is not subject to task switching. This is not easy to do, and it takes a lot of effort to get it right. More time, I would guess, than you would spend building a dedicated software-controllable device for your purpose.
Most common operating systems' interrupts are variable enough to ruin the timing in your experiment regardless of your programming language, and Python adds its own unreliability. Windows interrupts are especially bad: most are serviced in about 4 milliseconds, but occasionally an interrupt lasts longer than 35 milliseconds (Windows 7).
I would recommend trying the PsychoPy application to see if it will work for you. It approaches the problem by trying to make the graphics card do the work in OpenGL, but some of its code still runs outside the graphics card and is subject to the operating system's interrupts. Your existing Python code may not be compatible with PsychoPy, but at least you would stay in Python. PsychoPy is especially good at showing visual stimuli without timing issues. See this page in their documentation to see how you would handle a button press: http://www.psychopy.org/api/event.html
To solve your problem the right way, you need a real-time operating system, such as LinuxRT or QNX. You could try your Python application in one of those to see if running Python in a real-time environment is good enough, but even Python introduces variability: if Python decides to garbage collect, you will have a glitch. Python itself isn't real time.
National Instruments sells a setup that allows you to program in real time in a very easy-to-use programming language called LabviewRT. LabviewRT pushes your code onto an FPGA daughter card that operates in real time. It's expensive.
I strongly suggest you don't just minimize this problem but solve it; otherwise, your reviewers will be uncomfortable.
If you are running the Python code on Linux machine, make the kernel low latency (preemptive).
There is a flag for it when you compile the kernel.
Make sure that other processes running on the machine are kept to a minimum so they do not interrupt the kernel.
Assign higher task priority to your Python script.
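For that last suggestion, one hedged way to do it from within the script on Linux (this is Linux-only, needs root or CAP_SYS_NICE, and the priority value 50 is arbitrary):

    # Give the current process a real-time FIFO scheduling class on Linux.
    import os

    param = os.sched_param(50)                      # static priority 1-99 for SCHED_FIFO
    os.sched_setscheduler(0, os.SCHED_FIFO, param)  # pid 0 means "this process"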
Run the Python interpreter on a real-time operating system or a tweaked Linux kernel.
Shut down the garbage collector during the experiments and turn it back on afterward.
Maybe actively trigger a garbage collection round after the end of an experiment.
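A minimal sketch of that garbage-collector suggestion using the standard gc module (run_trial is a placeholder for your actual trial code):

    # Disable automatic garbage collection during a trial, clean up between trials.
    import gc

    def run_trial():
        pass                # placeholder: present stimulus, record the response

    gc.disable()            # no automatic collection passes during the trial
    run_trial()
    gc.enable()
    gc.collect()            # do the deferred cleanup between trials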
Additionally, keep in mind that showing an image is not instantaneous. You must synchronize your experiment with your monitor's vertical retrace phase (the pause between transmitting the last line of a frame of the display's content and the first line of the next frame).
I would start the timer at the beginning of the vsync phase after transmission of the frame containing whatever the participants are supposed to react to.
And keep in mind that the image is going to be at least partially visible a bit earlier than that, which matters if you want absolute reaction times as opposed to merely comparable results with ~half a frame of offset due to the non-instantaneous appearance of the monitor's contents (~10 ms @ 60 Hz).