Starting an IPython cluster from the notebook with a delay - python

Our SGE cluster setup requires there to be a delay between the controller and the engines starting. If this delay is not there, some of the servers use "old" ipcontroller-client.json files and attempt to connect to previous (and no longer running) controllers. This is an NFS "feature", so to remedy it I set c.IPClusterStart.delay = 30 in the ipcluster_config.py file and things work well: the controller gets submitted to SGE, has enough time to start and write its JSON files, and then the engines can connect correctly to the newly running controller.
However, I'd also like to be able to start the cluster from the notebook. Unfortunately, it appears that this delay is not used there; the controller and engines start up at the same time (as seen with watch qstat), some of the engines connect (because they pick up the new settings from the JSON file) and some do not (because of NFS).
I ran an strace on the notebook and saw that it's using sge_controller and sge_engines scripts (created by the notebook when you press start) to start these processes.
I'm wondering if there's any way to implement a delay here, as well. It's starting the controller and engines the right way (SGE) so I know it's reading the ipcluster_config.py.
I've Googled around and searched this site, with no luck. Hoping maybe someone can shed some light on the deeper workings of this behavior.
Thanks,
Chris

Well, this is probably too late for the OP, but hopefully it helps someone.
If it is a timeout issue, just set c.EngineFactory.timeout and c.IPEngineApp.wait_for_url_file to some larger times.
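For example, in the relevant profile's ipcluster_config.py those two settings might look like this (a minimal sketch; the 60-second values are only illustrative):

c = get_config()
c.EngineFactory.timeout = 60           # seconds an engine waits for the controller to answer registration
c.IPEngineApp.wait_for_url_file = 60   # seconds an engine waits for the connection (url) file to appear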
If it is due to failure after the first run, it is probably due to lingering security files ( ipcontroller-engine.json and ipcontroller-client.json ), which should be deleted from the relevant IPython profile; use IPython.utils.path.get_security_file to get their full paths. To automate this and make it somewhat less painful, the deletion step can be tacked on to the beginning of that profile's ipcluster_config.py.
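A sketch of what that could look like at the top of ipcluster_config.py (the profile name 'sge' is a placeholder for whichever profile you actually use):

import os
from IPython.utils.path import get_security_file

# remove stale connection files from the previous run before the new controller starts
for name in ('ipcontroller-engine.json', 'ipcontroller-client.json'):
    try:
        path = get_security_file(name, profile='sge')
    except IOError:
        continue   # nothing to clean up
    if os.path.exists(path):
        os.remove(path)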
These changes alone were enough for me to get the cluster running with the notebook easily.
If neither of these solve the problem, there are some other thoughts ( http://mail.scipy.org/pipermail/ipython-user/2011-November/008741.html ).

Related

How to resync time automatically with python on a windows server?

I've been searching the net dry to find a solution to this and I hope you can help me.
The main goal is that my client interacts with the Bybit API servers, which enforce a strict time-window offset that my requests need to stay within. To do that I chose the method of resyncing my time, since that worked, but other options might be available, so feel free to let me know if you have other suggestions.
What I am looking for is a way to tell the Python script to resync my time.
It could be something like w32tm /resync; however, all that I've found out after a lot of testing is that a deployed script, even using administrator shell commands, cannot execute w32tm commands unless a password is typed in, and even faking the typing automatically with a typing emulator didn't work.
So is there another way for me to force a /resync of the time?
I'm looking forward to hearing your answers and hopefully you can steer me in the right direction.
Best regards.
Mathias.
Easier: install NTP software on the machines and it runs as a service.
NTP keeps the clocks accurate.
As suggested, you let one machine in the network sync its time using the Meinberg NTP software and the rest of the network gets its time from that NTP server.
https://www.meinbergglobal.com/english/sw/ntp.htm
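If you want to verify the drift from Python before and after setting this up, a small sketch using the third-party ntplib package (an assumption on my part, not part of the original answer) looks like this:

import ntplib

client = ntplib.NTPClient()
response = client.request('pool.ntp.org', version=3)
print('local clock is off by %.3f seconds' % response.offset)

If the reported offset stays large, the NTP service on the machine is not doing its job.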

Save a Script Variables inside code and reset them after reboot

On my VPS I run 4 Python scripts, and it's been 60 days since I last rebooted it. Now I have to reboot, but if I do, my Python variables and data will be lost, because I don't store them in a file; they only live in variables inside the Python scripts.
My OS is Ubuntu Server 16.04 LTS and I run my Python scripts with the nohup command so that they keep running in the background.
Now I need a way to stop my scripts without losing their variables, and to start them again with the same variables and data after I reboot my VPS.
Is there any way that I can do this?
In addition, I'm sorry for the writing mistakes in my question.
Python doesn't provide any way of doing this.
But you might be able to use CRIU, or a similar tool, to freeze and snapshot the interpreter process. Then, after restart, you can resume the snapshot into a new process that just picks up exactly where you left off.
It may not work.[1] But there's a good chance it will. This is essentially the same thing as a Live Migration in the CRIU docs, except that you're not migrating to a new computer/container/etc., just to the future of the same computer. So start reading with that page, and follow the links from there.
You should probably test before you commit to it.
* Try it (obviously don't include the system restart, just kill -9 the executable) on a Python script that doesn't do anything important (maybe one that increments a counter, prints it out, sleeps for a second, and repeats; see the sketch after this list).
* Maybe try it on a script that does similar kinds of stuff to what yours are doing.
* If it's safe to have two copies of one of your programs running at the same time (they're not going to stomp all over each other writing to the same file, or fight over the same socket, or whatever), start a second copy and test dump/kill/resume that.
* Try it on one of your real processes, still without restart.
* Try it on all four.
* Cross your fingers, sacrifice a chicken, and do it for real.
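For the first bullet, the throwaway script could be as simple as this sketch:

import time

counter = 0
while True:
    counter += 1
    print(counter)
    time.sleep(1)

Dump it with CRIU, kill -9 it, restore it, and check that the counter continues from where it left off.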
If that doesn't pan out, the only option I can think of is to go through your scripts, manually figure out everything that needs to be saved and how it could be accessed from the top-level globals, and do that in the debugger.
Ideally, you'll write a script that will automate accessing and saving all that stuff—plus another one to feed it into a new instance at restart. Then you just pdb the live interpreters and start dumping everything.
This is guaranteed to be a whole lot of work, and not much fun. On the plus side, it is guaranteed to work if you do it right. On the third hand, it's pretty easy to not do it right.
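As a very rough sketch of the "dump everything from pdb" idea (the file path is just a placeholder, and in practice you will want to be far more selective about which names to keep):

import pickle

def dump_globals(path='/tmp/script_state.pkl'):
    # keep only top-level names whose values survive a pickle round trip;
    # skip private names and anything that fails to pickle
    state = {}
    for name, value in globals().items():
        if name.startswith('_'):
            continue
        try:
            pickle.dumps(value)
        except Exception:
            continue
        state[name] = value
    with open(path, 'wb') as f:
        pickle.dump(state, f)

The companion restore script would pickle.load the same file in the new process and feed the dictionary back in, e.g. with globals().update(state).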
[1] If you rely on open files, pipes, sockets, etc., CRIU does about as much as you could do, which is more than you might expect at first, but still not everything you could possibly want… Also, if you're using almost all of your RAM, it can be hard to wedge things back into exactly the same state. And there are probably other possible issues.

Python: Running Daemon Processes in Windows7

I have a program that scrapes certain data from certain web pages and, when the web pages change, acts accordingly.
How would one set up the program so it continues to run in the background?
I don't need any specifics
I'm just really confused on this concept and would appreciate whatever help anybody has to offer.
start path-to-pythonw.exe your-code.py
pythonw means "run without a console window".
start means "start it in the background".
If your Python is installed system-wide, you can probably just run start your-code.pyw, since .pyw files are associated with pythonw.exe.
Remember you cannot use print (to stdout) in this case.
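Since there is no console under pythonw, a common workaround (a sketch; the log file name is arbitrary) is to write to a log file instead of printing:

import logging

logging.basicConfig(filename='scraper.log', level=logging.INFO,
                    format='%(asctime)s %(levelname)s %(message)s')
logging.info('started in the background')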
If you want to be able to just start your process and have it background itself and do a few more typical things that "daemon" processes do in Unix, look here: How do you create a daemon in Python?
There is no concept of "background" in Windows. But the UNIX shell concept of a background process can be reasonably emulated by running your Python script as a Windows service. There are a couple of suggestions in this question: Is it possible to run a Python script as a service in Windows? If possible, how?
For casual use, I suggest that you learn how to use srvany from the second answer.
You simply need to leave your program running! Please google "python daemon" and see how to implement a persistent background process in Python.
Now, you cannot know when a website changes unless you poll it. If the website is well designed, the page you are trying to poll will have a Last-Modified header; you can make a HEAD request every so often (be nice: don't poll like crazy) and act when Last-Modified is later than the one on record. If the site is not well designed, it will not have a reliable Last-Modified or ETag header, in which case you will have to fetch and parse the page yourself and check for changes.
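A minimal polling sketch along those lines (the URL and the 5-minute interval are placeholders, it assumes the third-party requests package, and it assumes the page really does send Last-Modified):

import time
import requests

URL = 'http://example.com/page-to-watch'
last_seen = None

while True:
    response = requests.head(URL)
    modified = response.headers.get('Last-Modified')
    if modified and modified != last_seen:
        last_seen = modified
        print('page changed:', modified)   # act on the change here
    time.sleep(300)   # be nice: don't poll like crazy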
Cheers.

Python watching for process start up?

Is there any way to watch for a new process with name 'X' starting, in Python (ideally) or bash? I know that I can look at running processes, but that is not fast enough for my needs. The only thing that I can think of is somehow hooking into new process creation and registering it, but how?
More background: I am part of a CCDC team (http://www.nationalccdc.org/) and am on the blue team. The premise of the competition is to give students a network to defend against professional pen testers, to help the next generation of security experts be better. What I want to do is load this Python script on the Linux boxes and watch for certain commands being run that would likely only be used by the red team, for example the chattr command. Ideally I would like to be able to provide the script with a list of processes to watch. I can figure out that part but do not know how to watch for a process spawning.
Any direction is appreciated. Thank you.
I know of no way for a process which does not have root privileges to be notified when a process is started via any means on a fully-running Linux system. If polling isn't fast enough, you're going to have to do some serious hackery.
If you've got root, this is possible. If not, I can't see it.
With root, you could set a system-wide replacement of the fork and exec system calls which provides you with your desired notification. This could be in the kernel, or it could be an LD_PRELOAD hack.
This applies not just to Python; even with a C program, I don't know of an "inotify for process creation".
I have not tested this idea, but on Linux each process is given a directory under /proc/<its process id>/. If you opened an inotify watch on directory creation in /proc, you might be able to track the creation of process directories and then see if /proc/<pid>/cmdline matches the process you're looking for. This is just a thought; hope it helps!
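An untested sketch in the spirit of that idea; note that procfs may not deliver inotify events reliably, so this version simply rescans /proc in a loop and checks each new PID's cmdline against a watch list (chattr is just the example command from the question):

import os
import time

WATCHED = {'chattr'}
seen = set()

while True:
    for entry in os.listdir('/proc'):
        if not entry.isdigit() or entry in seen:
            continue
        seen.add(entry)   # note: PID reuse is not handled in this sketch
        try:
            with open('/proc/%s/cmdline' % entry, 'rb') as f:
                argv = f.read().split(b'\0')
        except IOError:
            continue   # the process already exited
        if argv and os.path.basename(argv[0].decode(errors='replace')) in WATCHED:
            print('watched process started: pid', entry, argv)
    time.sleep(0.1)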

Testing for mysterious load errors in python/django

This is related to this Configure Apache to recover from mod_python errors, although I've since stopped assuming that this has anything to do with mod_python. Essentially, I have a problem that I wasn't able to reproduce consistently and I wanted some feedback on whether the proposed solution seems likely and some potential ways to try and reproduce this problem.
The setup: a Django-powered site would begin throwing errors after a few days of use. They were always ImportErrors or ImproperlyConfigured errors, which amount to the same thing, since the message always specified trouble loading some module referenced in the settings.py file. It was not generally the same class that failed to load. I am using preforked Apache with 8 forked children, and whenever this problem would come up, one process would be broken and seven would be fine. Once broken, the broken process would display the same trace (with Debug On in the Apache conf) for every request it served, even if the failed import was not relevant to that particular request. An httpd restart always made the problem go away in the short run.
Noted problems: installation and updates are performed via svn with some post-update scripts. A few .pyc files accidentally were checked into the repository. Additionally, the project itself was owned by one user (not apache, although apache had permissions on the project) and there was a persistent plugin that ended up getting backgrounded as root. I call these noted problems because they would be wrong whether or not I noticed this error, and hence I have fixed them. The project is owned by apache and the plugin is backgrounded as apache. All .pyc files are out of the repository, and they are all force-recompiled after each checkout while the server and plugin have been stopped.
What I want to know is:
1. Do these configuration disasters seem like a likely explanation for sporadic ImportErrors?
2. If there is still a problem somewhere else in my code, how would I best reproduce it?
As for 2, my approach thus far has been to write some stress tests that repeatedly request the same page so as to execute common code paths.
Incidentally, this has been running without incident for about 2 days since the fix, but the problem was observed with 1 to 10 day intervals between.
"Do these configuration disasters seem like a likely explanation for sporadic ImportErrors"
Yes. An old .pyc file is a disaster of the first magnitude.
We develop on Windows, but run production on Red Hat Linux. An accidentally moved .pyc file is an absolute mystery to debug because (1) it usually runs and (2) it has a Windows filename for the original source, making the traceback error absolutely senseless. I spent hours staring at logs -- on linux -- wondering why the file was "C:\This\N\That".
"If there is still a problem somewhere else in my code, how would I best reproduce it?"
Before reproducing errors, you should try to prevent them.
First, create unit tests to exercise everything.
Start with Django's tests.py testing. Then expand to unittest for all non-Django components. Then write yourself a "run_tests" script that runs every test you own. Run this periodically. Daily isn't often enough.
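A rough sketch of such a run_tests script (the non-Django test module names here are just placeholders):

import subprocess
import sys
import unittest

# run the Django app tests first
django_status = subprocess.call([sys.executable, 'manage.py', 'test'])

# then the plain unittest suites for everything that isn't a Django app
suite = unittest.defaultTestLoader.loadTestsFromNames(
    ['tests.test_utils', 'tests.test_parsers'])
result = unittest.TextTestRunner(verbosity=2).run(suite)

sys.exit(1 if django_status or not result.wasSuccessful() else 0)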
Second, be sure you're using logging. Heavily.
Third, wrap anything that uses external resources in generic exception-logging blocks like this.
try:
    some_external_resource_processing()
except Exception, e:
    logger.exception( e )
    raise
This will help you pinpoint problems with external resources. Files and databases are often the source of bad behavior due to permission or access problems.
At this point, you have prevented a large number of errors. If you want to run cyclic load testing, that's not a bad idea either. Use unittest for this.
import unittest
import urllib2

class SomeLoadtest( unittest.TestCase ):
    def test_something( self ):
        self.connection = urllib2.urlopen( "http://localhost:8000/some/path" )
        results = self.connection.read()
This isn't the best way to do things, but it shows one approach. You might want to start using Selenium to test the web site "from the outside" as a complement to your unittests.
