My plan is that whenever a script is run, it takes the current time and creates a folder for the log files it generates. Each step of the script makes a new log file in this folder.
The issue is that I'm trying to figure out how to use the same folder for the duration of the script. What's happening now is that each time a logger is created it fetches the current time at that moment, making a new folder each time.
Here's what the module looks like:
logger.py
...
CURRENT_TIME = # get current time
LOG_FOLDER = "logs/%s/" % CURRENT_TIME
...
def get_logger(name):
# Create a log file with the given name in the LOG_FOLDER
What happens is that each time I import logger.py it recalculates CURRENT_TIME. I figured the way to avoid this is to do from logger import *, which executes the module code only once per interpreter, so all the modules share the same log folder.
The main issue is that the script calls other Python scripts, spawning new processes and the like. When these new processes import logger, it hasn't been imported in them yet, so they regenerate CURRENT_TIME.
So what's a good way to fix this? One solution I thought of is to have the logger use a constant folder called temp, and when the main script finishes, rename it to the current time (or to the time stored at the beginning of the script). I don't really like this solution: if there is an exception or error and the script dies, the folder will remain temp, when in that case I want it to be the time of the script so I know when it failed.
The classic solution would be to use an environment variable which will be shared between your process and all children.
Demo (save as x.py, warning: will call itself over and over again):
import os
import random
from subprocess import call

def get_logger():
    return os.environ.setdefault("LOG_DIR", logger_name())

def logger_name():
    return "log" + str(random.random())

print "Process " + str(os.getpid()) + " uses " + get_logger()
call(["python", "x.py"])
References: os.environ docs, the concept of environment variables
A better approach would be to use a RotatingFileHandler and call doRollover() at application startup.
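A minimal sketch of that idea; the file name and backup count are assumptions, not from the original post:

import logging
from logging.handlers import RotatingFileHandler

logger = logging.getLogger("app")
handler = RotatingFileHandler("app.log", backupCount=10)   # hypothetical file name
logger.addHandler(handler)

# At startup, rotate the previous run's log out of the way, so this run
# writes a fresh app.log and older runs are kept as app.log.1, app.log.2, ...
handler.doRollover()

logger.warning("new run started")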
When you first start your script, write the start time into a file in a temp directory.
When the logger runs, it reads the time value from that file.
If your script crashed, then when you start the main script again it will replace the previous value, so the logs for this run will be in the correct location.
You could even have a cleanup script remove the start-time file from the temp directory.
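A minimal sketch of that idea; the file location and the timestamp format are assumptions:

import datetime
import os

START_FILE = "/tmp/myscript_start_time"   # hypothetical location

def write_start_time():
    # Called once by the main script at startup.
    with open(START_FILE, "w") as f:
        f.write(datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S"))

def read_start_time():
    # Called by every process that wants the shared log folder.
    with open(START_FILE) as f:
        return f.read().strip()

def get_log_folder():
    folder = os.path.join("logs", read_start_time())
    if not os.path.isdir(folder):
        os.makedirs(folder)
    return folder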
Related
I have 4 Python files in one project folder: main.py, first.py, second.py and variables.py.
I only run main.py. This file sequentially calls first.py, then second.py, and main.py, first.py and second.py all import variables.py.
The content of variables.py is simply the declaration of a "shared" variable across the three.
I wanted first.py to modify this shared variable, and I want this change to carry over when the process goes back to main.py (after returning from first.py) and when second.py is finally called.
I initially thought I would be able to do this since the variable is declared in a separate .py file, but it's not working.
My understanding of what's happening is:
first.py imports variables.py. This action causes the variable to be declared with its value set to the initial value.
first.py modifies this shared variable.
first.py execution ends and control goes back to main.py. At this point, I see that the value of the shared variable is back to the initial value. Why is that? Is it because first.py's execution ends? But why does it happen even though the shared variable is declared in another Python file?
I would appreciate anyone who can enlighten me on what's happening (how the shared variable is stored in memory, which script determines its lifetime, which script's ending ends this variable, etc.). I would also appreciate suggestions on how to go about this. At this point, I am already considering simply writing the modified shared-variable value (in first.py) to an external text file and then reading it back to re-initialize the variable when second.py is called later.
My code is below. To run the project, simply run main.py.
main.py
import subprocess
import os
import variables

programs = ['first.py', 'second.py']
path = os.getcwd() + '\\running multiple py with shared variables\\'
for program in programs:
    subprocess.run(['python', path + program])
    print('running main.py')
    print(variables.shared_variable)
first.py
import variables
print('running first.py')
variables.shared_variable = 'First modification'
print(variables.shared_variable)
second.py
import variables
print('running second.py')
print(variables.shared_variable)
variables.py
shared_variable = "Initial value"
Output of program on terminal:
running first.py
First modification
running main.py
Initial value       -> I really want this to be "First modification"
running second.py
Initial value       -> I really want this to be "First modification"
running main.py
Initial value       -> I don't really care here but I honestly expected this to be "First modification" as well
There's no shared memory (shmem) going on here.

for program in programs:
    subprocess.run(['python', path + program])

You spawned a pair of child processes, each of which computed a result and then called exit(), discarding the result. If the child doesn't serialize a result which the parent parses, the result is gone forever. The symptoms you report are exactly what is expected.

The good news is that you have excellent skills for thinking about this problem analytically and for reporting to others what happened, so I am confident you will soon implement a satisfactory solution.
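For completeness, a minimal sketch of one way to do that serialization, assuming the child simply prints its result and the parent captures stdout; the file names here are illustrative, not from the original post:

# child.py: compute a value and serialize it on stdout
result = 'First modification'
print(result)

# parent.py: capture the child's stdout and parse the result
import subprocess
import sys

proc = subprocess.run([sys.executable, 'child.py'],
                      capture_output=True, text=True, check=True)
shared_variable = proc.stdout.strip()
print(shared_variable)   # 'First modification'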
Credits to @J_H for mentioning the shared memory concept.
I was able to solve my problem using the SharedMemory class from the multiprocessing.shared_memory module.
I am sharing the modified code. With this implementation, the file variables.py is not even needed, since you only need to track the chosen name of the shared memory block (shared_memory1 here). If you don't specify it, a name will be auto-generated, which you can check in the object's attributes.
main.py
import subprocess
import os
import variables
from multiprocessing import shared_memory

shm = shared_memory.SharedMemory(create=True, size=500, name='shared_memory1')

programs = ['first.py', 'second.py']
path = os.getcwd() + '\\running multiple py with shared variables\\'
for program in programs:
    subprocess.run(['python', path + program])
    print('running main.py')
    ctr = 0
    retrieved_val = ''
    while shm.buf[ctr] != 0:
        retrieved_val = retrieved_val + chr(shm.buf[ctr])
        ctr = ctr + 1
    variables.shared_variable = retrieved_val
    print(variables.shared_variable)

shm.close()
shm.unlink()
first.py
import variables
from multiprocessing import shared_memory

shm = shared_memory.SharedMemory(name='shared_memory1')

print('running first.py')
variables.shared_variable = 'First modification'
ctr = 0
for val in variables.shared_variable:
    shm.buf[ctr] = ord(val)
    ctr = ctr + 1
print(variables.shared_variable)
shm.close()
second.py
import variables
from multiprocessing import shared_memory

shm = shared_memory.SharedMemory(name='shared_memory1')

ctr = 0
retrieved_val = ''
while shm.buf[ctr] != 0:
    retrieved_val = retrieved_val + chr(shm.buf[ctr])
    ctr = ctr + 1
print('running second.py')
variables.shared_variable = retrieved_val
print(variables.shared_variable)
shm.close()
variables.py
shared_variable = "Initial value"
Terminal output after running main.py:
running first.py
First modification
running main.py
First modification
running second.py
First modification
running main.py
First modification
I have two Python files (main.py and main_test.py). The file main_test.py is executed within main.py. When I do not use a log file this is what gets printed out:
Main file: 17:41:18
Executed file: 17:41:18
Executed file: 17:41:19
Executed file: 17:41:20
When I use a log file and run python3 main.py > log, I get the following:
Executed file: 17:41:18
Executed file: 17:41:19
Executed file: 17:41:20
Main file: 17:41:18
Also, when I use python3 main.py | tee log to print the output and log it at the same time, it waits and prints everything only after finishing. In addition, the problem of the reversed order remains.
Questions
How can I fix the reversed print out?
How can I print out results simultaneously in terminal and log them in a correct order?
Python files for replication
main.py
import os
import time
import datetime
import pytz
python_file_name = 'main_test'+'.py'
time_zone = pytz.timezone('US/Eastern') # Eastern-Time-Zone
curr_time = datetime.datetime.now().replace(microsecond=0).astimezone(time_zone).time()
print(f'Main file: {curr_time}')
cwd = os.path.join(os.getcwd(), python_file_name)
os.system(f'python3 {cwd}')
main_test.py
import pytz
import datetime
import time

time_zone = pytz.timezone('US/Eastern')  # Eastern-Time-Zone
for i in range(3):
    curr_time = datetime.datetime.now().replace(microsecond=0).astimezone(time_zone).time()
    print(f'Executed file: {curr_time}')
    time.sleep(1)
When you run a script like this:
python main.py>log
The shell redirects output from the script to a file called log. However, if the script launches other scripts in their own subshell (which is what os.system() does), the output of that does not get captured.
What is surprising about your example is that you'd see anything at all when redirecting, since the output should have been redirected and no longer echo - so perhaps there's something you're leaving out here.
Also, tee waits for EOF on standard in, or for some error to occur, so the behaviour you're seeing there makes sense. This is intended behaviour.
Why bother with shells at all though? Why not write a few functions to call, and import the other Python module to call its functions? Or, if you need things to run in parallel (which they didn't in your example), look at multiprocessing.
In direct response to your questions:
"How can I fix the reversed print out?"
Don't use redirection and write to the file directly from the script; or ensure you use the same redirection when calling other scripts from the first (that will get messy); or capture the output from the subprocesses in the subshell and pipe it to the standard out of your main script.
"How can I print out results simultaneously in terminal and log them in a correct order?"
You should probably just do it in the script, otherwise this is not a really a Python question and you should try SuperUser or similar sites to see if there's some way to have tee or similar tools write through live.
In general though, unless you have really strong reasons to have the other functionality running in other shells, you should look at solving your problems within the Python script. And if you can't, you can use something like Popen or derivatives to capture the subscript's output and do what you need, instead of relying on tools that may or may not be available on the host OS running your script.
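A minimal sketch of that last suggestion, capturing the child's output and writing it to both the terminal and a log file; the child script name and log file name come from the question, the rest is an assumption:

import subprocess
import sys

with open('log', 'w') as log_file:
    # -u makes the child unbuffered so its lines arrive as they are printed.
    proc = subprocess.Popen([sys.executable, '-u', 'main_test.py'],
                            stdout=subprocess.PIPE, text=True)
    for line in proc.stdout:
        # Echo each child line to the terminal and to the log as it arrives,
        # so the ordering is preserved.
        sys.stdout.write(line)
        log_file.write(line)
    proc.wait()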
I want to run multiple Python scripts (let's say they all start from a main script called main.py) which write to one log file whose name contains the date and time of creation; I also need the log to be written to both the console and the file. Python 2.
I tried many different ways with no success.
Example: main.py runs the scripts python1.py and python2.py one after the other, and all three Python scripts write to the same log file, which has the date and time in its name, and the log is shown on the console while running.
Also, can something like this be done through a Python script which is separate from these files? For example, a 4th file called log_to_one_file.py?
If somebody knows how to make it happen, I will be glad to know...
From the docs:
Multiple calls to logging.getLogger('someLogger') return a reference to the same logger object. This is true not only within the same module, but also across modules as long as it is in the same Python interpreter process. It is true for references to the same object;
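A minimal sketch of how that property can be used here, assuming a shared setup module; the module name, file name format and levels are assumptions, and note that the sharing only holds within one interpreter process, so separate child processes would need the file name passed to them (for example via an environment variable, as in an earlier answer):

# log_to_one_file.py -- one possible shared setup module using logging.getLogger
import datetime
import logging
import sys

LOG_FILE = "run_%s.log" % datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S")

def get_logger(name):
    logger = logging.getLogger(name)
    if not logger.handlers:
        logger.setLevel(logging.INFO)
        # One handler for the file, one for the console.
        for handler in (logging.FileHandler(LOG_FILE),
                        logging.StreamHandler(sys.stdout)):
            handler.setFormatter(logging.Formatter("%(asctime)s %(name)s: %(message)s"))
            logger.addHandler(handler)
    return logger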
That sounds like a homework exercise which assumes you can't use ready-made Python modules.
In that case it would be better to look at Python's file operations and create a function that receives the log content and the log file path as arguments and appends the content to that file after prefixing it with the date and time. Then you can create a file in the same folder called log_to_one_file.py, place the function there, and import it into any file that needs logging using from log_to_one_file import function.
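A minimal sketch of such a helper, with illustrative function and path names:

# log_to_one_file.py -- plain file-operations version, no logging module
import datetime
import sys

def log_to_file(content, log_path):
    line = "%s %s\n" % (datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"), content)
    # Append to the shared log file and echo the same line to the console.
    with open(log_path, "a") as f:
        f.write(line)
    sys.stdout.write(line)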
In case a more plug-in solution is needed, a more advanced answer is that you could overwrite the default sys.stdout and attach a file streamer to it, like this:

import sys

class Logger(object):
    def __init__(self, path):
        self.terminal = sys.stdout
        self.log = open(path, "a")

    def write(self, message):
        self.terminal.write(message)
        self.log.write(message)

    def flush(self):
        pass

sys.stdout = Logger(path)
In the above solution you can modify the message with a date and time prefix inside the write method of the Logger class.
Then, after this piece of code has run, whenever you call print in your program, it will automatically write to the file and log to your console, using the formatting you have set.
You can read up on Python context managers if you want to apply the above solution in a more limited scope.
Is it possible to have a script run on a file when it's created if it has a specific extension?
Let's call that extension "bar".
If I create the file "foo.bar", then my script will run with that file as its input.
Every time this file is saved, it would also run on the file.
Can I do that? If yes, how? If not, why?
note: If there is some technicality of why this is impossible, but I can do very close, that works too!
If you are using Linux, use pyinotify, described on its website as follows: "Pyinotify: monitor filesystem events with Python under Linux."
If you also want it to work using Mac OS X and Windows, you can have a look at this answer or this library.
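A minimal sketch with pyinotify; the watched directory, the extension and the handler body are assumptions:

import pyinotify

WATCHED_DIR = '.'      # hypothetical: the directory to watch
EXTENSION = '.bar'

class Handler(pyinotify.ProcessEvent):
    def process_IN_CLOSE_WRITE(self, event):
        # Fires when a file is written and closed, i.e. created or saved.
        if event.pathname.endswith(EXTENSION):
            run_my_script(event.pathname)  # hypothetical: your processing function

wm = pyinotify.WatchManager()
wm.add_watch(WATCHED_DIR, pyinotify.IN_CLOSE_WRITE)
notifier = pyinotify.Notifier(wm, Handler())
notifier.loop()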
You could do what Werkzeug does (this code copied directly from the link):
def _reloader_stat_loop(extra_files=None, interval=1):
    """When this function is run from the main thread, it will force other
    threads to exit when any modules currently loaded change.

    Copyright notice.  This function is based on the autoreload.py from
    the CherryPy trac which originated from WSGIKit which is now dead.

    :param extra_files: a list of additional files it should watch.
    """
    from itertools import chain
    mtimes = {}
    while 1:
        for filename in chain(_iter_module_files(), extra_files or ()):
            try:
                mtime = os.stat(filename).st_mtime
            except OSError:
                continue

            old_time = mtimes.get(filename)
            if old_time is None:
                mtimes[filename] = mtime
                continue
            elif mtime > old_time:
                _log('info', ' * Detected change in %r, reloading' % filename)
                sys.exit(3)
        time.sleep(interval)
You'll have to spawn off a separate thread (which is what Werkzeug does), but that should work for you if you don't want to add pyinotify.
I need to create a folder that I use only once, but need to have it exist until the next run. It seems like I should be using the tempfile module in the standard library, but I'm not sure how to get the behavior that I want.
Currently, I'm doing the following to create the directory:
randName = "temp" + str(random.randint(1000, 9999))
os.makedirs(randName)
And when I want to delete the directory, I just look for a directory with "temp" in it.
This seems like a dirty hack, but I'm not sure of a better way at the moment.
Incidentally, the reason that I need the folder around is that I start a process that uses the folder with the following:
subprocess.Popen([command], shell=True).pid
and then quit my script to let the other process finish the work.
Creating the folder with a 4-digit random number is insecure, and you also need to worry about collisions with other instances of your program.
A much better way is to create the folder using tempfile.mkdtemp, which does exactly what you want (i.e. the folder is not deleted when your script exits). You would then pass the folder name to the second Popen'ed script as an argument, and it would be responsible for deleting it.
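A minimal sketch of that flow; the worker script name and the argument handling are assumptions:

# parent script: create the directory and hand it to the child
import subprocess
import sys
import tempfile

work_dir = tempfile.mkdtemp(prefix='myapp-')   # not deleted when this script exits
subprocess.Popen([sys.executable, 'worker.py', work_dir])   # hypothetical child script

# worker.py: do the work, then clean up the directory it was given
import shutil
import sys

work_dir = sys.argv[1]
try:
    pass  # ... do the real work inside work_dir ...
finally:
    shutil.rmtree(work_dir)   # the child is responsible for cleanup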
What you've suggested is dangerous. You may have race conditions if anyone else is trying to create those directories -- including other instances of your application. Also, deleting anything containing "temp" may result in deleting more than you intended. As others have mentioned, tempfile.mkdtemp is probably the safest way to go. Here is an example of what you've described, including launching a subprocess to use the new directory.
import tempfile
import shutil
import subprocess
d = tempfile.mkdtemp(prefix='tmp')
try:
    subprocess.check_call(['/bin/echo', 'Directory:', d])
finally:
    shutil.rmtree(d)
"I need to create a folder that I use only once, but need to have it exist until the next run."
"Incidentally, the reason that I need the folder around is that I start a process ..."
Not incidental, at all. Crucial.
It appears you have the following design pattern.
mkdir someDirectory
proc1 -o someDirectory   # Write to the directory
proc2 -i someDirectory   # Read from the directory
if [ $? == 0 ]
then
    rm -rf someDirectory
fi
Is that the kind of thing you'd write at the shell level?
If so, consider breaking your Python application into several parts:
The parts that do the real work ("proc1" and "proc2").
A shell which manages the resources and processes; essentially a Python replacement for a bash script.
A temporary file is something that lasts for a single program run.
What you need is not, therefore, a temporary file.
Also, beware of multiple users on a single machine - just deleting anything with the 'temp' pattern could be anti-social, doubly so if the directory is not located securely out of the way.
Also, remember that on some machines, the /tmp file system is rebuilt when the machine reboots.
You can also automatically register a function to completely remove the temporary directory on any exit (with or without error) by doing:

import atexit
import shutil
import subprocess
import tempfile

# create your temporary directory
d = tempfile.mkdtemp()

# remove it automatically when Python exits
atexit.register(lambda: shutil.rmtree(d))

# do your stuff...
subprocess.Popen([command], shell=True).pid
tempfile is just fine, but to be on the safe side you'd need to save the directory name somewhere until the next run, for example pickle it, then read it in the next run and delete the directory. And you are not required to have /tmp as the root; tempfile.mkdtemp has an optional dir parameter for that. By and large, though, it won't be different from what you're doing at the moment.
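A minimal sketch of that bookkeeping; the state-file location is an assumption:

import os
import pickle
import shutil
import tempfile

STATE_FILE = os.path.expanduser('~/.myapp_last_dir')   # hypothetical location

# On startup: remove the directory left over from the previous run, if any.
if os.path.exists(STATE_FILE):
    with open(STATE_FILE, 'rb') as f:
        old_dir = pickle.load(f)
    shutil.rmtree(old_dir, ignore_errors=True)

# Create this run's directory and remember it for the next run.
work_dir = tempfile.mkdtemp(dir=os.getcwd())
with open(STATE_FILE, 'wb') as f:
    pickle.dump(work_dir, f)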
The best way of creating the temporary file is to use tempfile.TemporaryFile(mode='w+b', suffix='.tmp', prefix='someRandomNumber', dir=None),
or you can use the tempfile.mktemp() function.
The mktemp() function will not actually create any file, but will provide a unique filename (the name does not actually contain the PID).