Python: test if an element is in a dictionary

I am trying to make a Python program call notepad.exe if it is not already running. Right now, I'm using psutil with
pinfo = proc.as_dict(attrs=['name'])
if ("{'name': %s}" % SomeProcess) != str(pinfo):
    subprocess.call("%s" % SomeProcess, shell=True)
However, this won't work, because subprocess.call fires for every process name in the list other than the one it is looking for.
Knowing how to use subprocess and some of psutil, I feel I should know this, but is there any way to see if a dictionary contains a given string with one line of code? Something like
pinfo = proc.as_dict(attrs=['name'])
if pinfo contains "somename":
    do something
If this is possible (99% sure it is), can it be done without a loop? (I want to update the process list every second or so.)
Thanks!
Edit:
Okay, probably should have given slightly more code, as that would have been relevant.
proc.as_dict()
by itself won't do anything, so I have this in a "for" loop.
for proc in psutil.process_iter():
    pinfo = proc.as_dict(attrs=['name'])
How could I change that to output a dictionary, rather than a single line*, and would pinfo.get('somename') work on that created dictionary?
*If I use print(pinfo) it outputs one line. Something like {'name': 'pythonw.exe'}.

I think, you want to do something like that:
names = set()
for proc in psutil.process_iter():
    names.add(proc.name())
if SomeProcess not in names:
    subprocess.call("%s" % SomeProcess, shell=True)
It does not make much sense to create a dictionary with one entry; what you really wanted to do was scan all processes on your system (I guess). You already mentioned process_iter, only your question was a little unclear. Using the set (names) makes the membership test a little more performant. You could also use a list here.
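If you want the whole check in a single expression, any() with a generator does it without building the set first; it still iterates under the hood, but there is no explicit loop to write. A minimal sketch, assuming SomeProcess holds the executable name (e.g. "notepad.exe"):
import subprocess
import psutil

# Start the process only if no running process already has that name.
if not any(proc.name() == SomeProcess for proc in psutil.process_iter()):
    subprocess.call(SomeProcess, shell=True)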

Related

Confusion about returning values when files finish getting executed

So I have 2 files, fish_life_simulator.py and menu.py. fish_life_simulator.py is the main file and executes other files like menu.py depending on what happens. So here is the code and how it should work:
import os
os.chdir(os.path.dirname(__file__))
result = exec(open(r'menu.py', encoding='utf-8').read())
print(result)
So at first, when the code arrives at result = exec(open(r'menu.py', encoding='utf-8').read()), it executes menu.py and all is fine, but it could stop for several reasons:
The player exited the game
The player entered settings
The player pressed play
So what I decided to do is have menu.py return a value when it stops running, like 1, 2 or 3, so I tried several methods that are covered here:
Best way to return a value from a python script
like using return or sys.exit("some value here"). But even though I did the part inside menu.py, neither of them worked: when I tried return, the result from result = exec(open(r'menu.py', encoding='utf-8').read()) was always None for some reason, and when I tried sys.exit(1), for example, result didn't get printed at all. So I was just wondering whether it was something I was missing inside fish_life_simulator.py, because the part that sends the value should be fine; it's the part that receives it that is problematic.
Just define a function in menu.py:
def do_stuff_in_menu():
    ...
    return result
and in fish_life_simulator.py you just call that function:
import menu
result = menu.do_stuff_in_menu()
print(result)
I agree with everyone who says exec() is not the best way to do this; however, since that's not your question, here's an answer for you.
The exec() function always returns None (see docs). If you need the return code, you could use os.system() or one of the various methods from the subprocess library. Unlike exec(), however, both of these alternatives would create a child process.
That said, I personally would not use any of those methods, but would instead modify menu.py to allow you to import it. It's much more natural and direct.
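For completeness, a minimal sketch of the child-process route mentioned above, assuming menu.py signals its result with sys.exit(1), sys.exit(2) or sys.exit(3) as the question describes:
import subprocess
import sys

# Run menu.py as a child process; the value passed to sys.exit() in
# menu.py comes back as the child's return code.
completed = subprocess.run([sys.executable, "menu.py"])
print(completed.returncode)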

I want to update a list that I'm using in a while loop, while the loop is still running. Is there a way to do this?

I have a function that uses a while loop, which I ideally want to set up and then run in the background. In this while loop, I use a list. What I want to do is if I think of something else to put in this list, I can simply edit the list, and then the next time the loop begins the updated list is used. At the moment, I can't seem to find a way to do this.
I tried to define the list in a separate programme, and then import it at the start of each loop. I have then updated the list in the separate programme, but this hasn't been reflected in the output.
import time

while True:
    from list_test import sample_list
    print(sample_list)
    time.sleep(30)
When I update sample_list, the output doesn't change. Does anyone know why this is? Apologies if the solution is simple, I'm quite new to programming in general!
As already stated in the comments, it is generally not advised to update a list you are iterating over (even though you are just printing it in your example). That said, you could use importlib with the reload method, for example like this:
import time
import importlib

list_module = importlib.import_module("list_test")
sample_list = list_module.sample_list

while True:
    sample_list = importlib.reload(list_module).sample_list
    print(sample_list)
    time.sleep(5)
Note that you have to update the list by hand (editing list_test.py) to see the changes; updating the list from another program at runtime will not work.
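If the list genuinely has to be updated by another program while the loop runs, one common workaround (not part of the answer above, just a sketch) is to keep the list in a data file and re-read it every pass. This assumes a hypothetical sample_list.json containing a JSON array:
import json
import time

while True:
    # Re-read the file on every pass; any other process may rewrite it in between.
    with open("sample_list.json", encoding="utf-8") as fp:
        sample_list = json.load(fp)
    print(sample_list)
    time.sleep(5)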

RAM is not freed after a Python function is invoked

I'm using an in-house Python library for scientific computing. I need to consecutively copy an object, modify it, and then delete it. The object is huge which causes my machine to run out of memory after a few cycles.
The first problem is that I use python's del to delete the object, which apparently only dereferences the object, rather than freeing up RAM.
The second problem is that even when I encapsulate the whole process in a function, after the function is invoked, the RAM is still not freed up. Here's a code snippet to better explain the issue.
ws = op.core.Workspace()
net = op.network.Cubic(shape=[100, 100, 100], spacing=1e-6)
proj = net.project

def f():
    for i in range(5):
        clone = ws.copy_project(proj)
        result = do_something_with(clone)
        del clone

f()
gc.collect()
>>> ws
{'sim_01': [<openpnm.network.Cubic object at 0x7fed1c417780>],
'sim_02': [<openpnm.network.Cubic object at 0x7fed1c417888>],
'sim_03': [<openpnm.network.Cubic object at 0x7fed1c417938>],
'sim_04': [<openpnm.network.Cubic object at 0x7fed1c417990>],
'sim_05': [<openpnm.network.Cubic object at 0x7fed1c4179e8>],
'sim_06': [<openpnm.network.Cubic object at 0x7fed1c417a40>]}
My question is how do I completely delete a Python object?
Thanks!
PS. In the code snippet, each time ws.copy_project is called, a copy of proj is stored in the ws dictionary.
There are some really smart python people on here. They may be able to tell you better ways to keep your memory clear, but I have used leaky libraries before, and found one (so-far) foolproof way to guarantee that your memory gets cleared after use: execute the memory hog in another process.
To do this, you'd need to arrange for an easy way to make your long calculation be executable separately. I have done this by adding special flags to my existing python script that tells it just to run that function; you may find it easier to put that function in a separate .py file, e.g.:
do_something_with.py:
import sys

def run_one_cycle(i):
    # Your example is still too vague. Clearly, something differentiates
    # each cycle, otherwise you're just taking the same inputs 5 times over.
    # Whatever the difference is, pass it in as an argument.
    ws = op.core.Workspace()
    net = op.network.Cubic(shape=[100, 100, 100], spacing=1e-6)
    proj = net.project
    # You may not even need to clone anymore?
    clone = ws.copy_project(proj)
    result = do_something_with(clone)  # the question's in-house routine

# Whatever arg(s) you need to get to the function, just pass them in on the command line
if __name__ == "__main__":
    sys.exit(run_one_cycle(sys.argv[1:]))
You can do this using any of the python tools that handle subprocesses. In python 3.5+, the recommended way to do this is subprocess.run. You could change your bigger function to something like this:
import subprocess

def invoke_do_something(i):
    completed_args = subprocess.run(["python", "do_something_with.py", str(i)], check=False)
    return completed_args.returncode

results = map(invoke_do_something, range(5))
You'll obviously need to tailor this to fit your own situation, but by running in a subprocess, you're guaranteed not to have to worry about the memory getting cleaned up. As an added bonus, you could potentially use multiprocessing.Pool.map to use multiple processors at one time. (I deliberately coded this to use map to make such a transition simple. You could still use your for loop if you prefer, and then you don't need the invoke... function.) Multiprocessing could speed up your processing, but since you're already worried about memory, it is almost certainly a bad idea: with multiple processes of the big memory hog, your system itself will likely quickly run out of memory and kill your process.
Your example is fairly vague, so I've written this at a high level. I can answer some questions if you need.
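For reference, the same isolation can be had without a separate script file by spawning each cycle with the multiprocessing module. This is just a sketch of that variant, where run_one_cycle is a hypothetical stand-in for the question's copy/modify/delete work:
import multiprocessing

def run_one_cycle(i):
    # The question's work goes here, e.g.:
    # clone = ws.copy_project(proj); result = do_something_with(clone)
    pass

if __name__ == "__main__":
    for i in range(5):
        p = multiprocessing.Process(target=run_one_cycle, args=(i,))
        p.start()
        p.join()  # when the child exits, the OS reclaims all of its memory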

Why does my generator hang instead of throwing exception?

I have a generator that returns lines from a number of files, through a filter. It looks like this:
def line_generator(self):
    # Find the relevant files
    files = self.get_files()
    # Read lines
    input_object = fileinput.input(files)
    for line in input_object:
        # Apply filter and yield if it is not *None*
        filtered = self.__line_filter(input_object.filename(), line)
        if filtered is not None:
            yield filtered
    input_object.close()
The method self.get_files() returns a list of file paths or an empty list.
I have tried to do s = fileinput.input([]) and then call s.next(). This is where it hangs, and I cannot understand why. I'm trying to be pythonic and not handle all errors myself, but I guess this is one where there is no way around it. Or is there?
Unfortunately I have no means of testing this on Linux right now, but could someone please try the following on Linux, and comment what they get?
import fileinput
s = fileinput.input([])
s.next()
I'm on Windows with Python 2.7.5 (64 bit).
All in all, I'd really like to know:
Is this a bug in Python, or me that is doing something wrong?
Shouldn't .next() always return something, or raise a StopIteration?
fileinput defaults to stdin if the list is empty, so it's just waiting for you to type something.
An obvious fix would be to get rid of fileinput (it's not terribly useful anyway) and to be explicit, as the Zen of Python suggests:
for path in self.get_files():
    with open(path) as fp:
        for line in fp:
            # etc.
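Putting that together with the question's code, the generator could drop fileinput entirely. A minimal sketch, assuming it stays a method of the same class so self.get_files() and self.__line_filter resolve as before:
def line_generator(self):
    # An empty list from get_files() now simply yields nothing,
    # instead of falling back to stdin as fileinput does.
    for path in self.get_files():
        with open(path) as fp:
            for line in fp:
                filtered = self.__line_filter(path, line)
                if filtered is not None:
                    yield filtered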
As others have already answered the main question, I'll try to answer one specific sub-item:
Shouldn't .next() always return something, or raise a StopIteration?
Yes, but it is not specified when this return is supposed to happen: within some milliseconds, seconds or even longer.
If you have a blocking iterator, you can define some wrapper around it so that it runs inside a different thread, filling a list or something, and the originating thread gets an interface to determine if there are data, if there are currently no data or if the source is exhausted.
I can elaborate on this even more if needed.
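A rough sketch of that wrapper idea (Python 3 naming; the queue and the sentinel are my own choices, not from the question): the blocking source is drained in a daemon thread, and the caller polls with a timeout instead of blocking forever:
import queue
import threading

class ThreadedIterator:
    """Drain a possibly-blocking iterable in a background thread."""

    _DONE = object()  # sentinel marking exhaustion of the source

    def __init__(self, blocking_iterable):
        self._queue = queue.Queue()
        self.exhausted = False
        worker = threading.Thread(target=self._fill, args=(blocking_iterable,))
        worker.daemon = True  # don't keep the process alive for a stuck source
        worker.start()

    def _fill(self, iterable):
        for item in iterable:
            self._queue.put(item)
        self._queue.put(self._DONE)

    def get(self, timeout=0.1):
        """Return the next item, or None if nothing arrived within timeout."""
        if self.exhausted:
            return None
        try:
            item = self._queue.get(timeout=timeout)
        except queue.Empty:
            return None
        if item is self._DONE:
            self.exhausted = True
            return None
        return item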

Python Mock Process for Unit Testing

Background:
I am currently writing a process monitoring tool (Windows and Linux) in Python and implementing unit test coverage. The process monitor hooks into the Windows API function EnumProcesses on Windows and monitors the /proc directory on Linux to find current processes. The process names and process IDs are then written to a log which is accessible to the unit tests.
Question:
When I unit test the monitoring behavior, I need a process to start and terminate. I would love it if there were a (cross-platform?) way to start and terminate a fake system process that I could uniquely name (and whose creation I could track in a unit test).
Initial ideas:
I could use subprocess.Popen() to open any system process but this runs into some issues. The unit tests could falsely pass if the process I'm using to test is run by the system as well. Also, the unit tests are run from the command line and any Linux process I can think of suspends the terminal (nano, etc.).
I could start a process and track it by its process ID but I'm not exactly sure how to do this without suspending the terminal.
These are just thoughts and observations from initial testing and I would love it if someone could prove me wrong on either of these points.
I am using Python 2.6.6.
Edit:
Get all Linux process IDs:
try:
    processDirectories = os.listdir(self.PROCESS_DIRECTORY)
except IOError:
    return []
return [pid for pid in processDirectories if pid.isdigit()]
Get all Windows process IDs:
import ctypes, ctypes.wintypes

Psapi = ctypes.WinDLL('Psapi.dll')
EnumProcesses = self.Psapi.EnumProcesses
EnumProcesses.restype = ctypes.wintypes.BOOL

count = 50
while True:
    # Build arguments to EnumProcesses
    processIds = (ctypes.wintypes.DWORD * count)()
    size = ctypes.sizeof(processIds)
    bytes_returned = ctypes.wintypes.DWORD()
    # Call EnumProcesses to find all processes
    if self.EnumProcesses(ctypes.byref(processIds), size, ctypes.byref(bytes_returned)):
        if bytes_returned.value < size:
            return processIds
        else:
            # We weren't able to get all the processes, so double our size and try again
            count *= 2
    else:
        print "EnumProcesses failed"
        sys.exit()
Windows code is from here
edit: this answer is getting long :), but some of my original answer still applies, so I leave it in :)
Your code is not so different from my original answer. Some of my ideas still apply.
When you are writing Unit Test, you want to only test your logic. When you use code that interacts with the operating system, you usually want to mock that part out. The reason being that you don't have much control over the output of those libraries, as you found out. So it's easier to mock those calls.
In this case, there are two libraries that are interacting with the system: os.listdir and EnumProcesses. Since you didn't write them, we can easily fake them to return what we need, which in this case is a list.
But wait, in your comment you mentioned:
"The issue I'm having with it however is that it really doesn't test
that my code is seeing new processes on the system but rather that the
code is correctly monitoring new items in a list."
The thing is, we don't need to test the code that actually monitors the processes on the system, because it's third-party code. What we need to test is that your code logic handles the returned processes, because that's the code you wrote. The reason we test against a list is that that's what your logic acts on: os.listdir and EnumProcesses return a list of pids (numeric strings and integers, respectively), and your code acts on that list.
I'm assuming your code is inside a class (you are using self in your code). I'm also assuming that the functions are isolated inside their own methods (you are using return). So this will be sort of what I suggested originally, except with actual code :) I don't know if they are in the same class or different classes, but it doesn't really matter.
Linux method
Now, testing your Linux process function is not that difficult. You can patch os.listdir to return a list of pids.
def getLinuxProcess(self):
    try:
        processDirectories = os.listdir(self.PROCESS_DIRECTORY)
    except IOError:
        return []
    return [pid for pid in processDirectories if pid.isdigit()]
Now for the test.
import unittest
from fudge import patched_context
import os
import LinuxProcessClass  # class that contains getLinuxProcess method

def raise_ioerror(*args):
    # A lambda cannot contain a raise statement, so we use a helper function
    raise IOError

def test_LinuxProcess(self):
    """Test the logic of our getLinuxProcess.

    We patch os.listdir and return our own list, because os.listdir
    returns a list. We do this so that we can control the output
    (we test *our* logic, not a built-in library's functionality).
    """
    # Test we can parse our pids
    fakeProcessIds = ['1', '2', '3']
    with patched_context(os, 'listdir', lambda x: fakeProcessIds):
        myClass = LinuxProcessClass()
        ....
        result = myClass.getLinuxProcess()
        expected = ['1', '2', '3']
        self.assertEqual(result, expected)

    # Test we can handle IOError
    with patched_context(os, 'listdir', raise_ioerror):
        myClass = LinuxProcessClass()
        ....
        result = myClass.getLinuxProcess()
        expected = []
        self.assertEqual(result, expected)

    # Test we only get pids
    fakeProcessIds = ['1', '2', '3', 'do', 'not', 'parse']
    .....
Windows method
Testing your Window's method is a little trickier. What I would do is the following:
def prepareWindowsObjects(self):
    """Create and set up objects needed to get the windows process"""
    ...
    Psapi = ctypes.WinDLL('Psapi.dll')
    EnumProcesses = Psapi.EnumProcesses
    EnumProcesses.restype = ctypes.wintypes.BOOL
    self.EnumProcesses = EnumProcesses
    ...

def getWindowsProcess(self):
    count = 50
    while True:
        .... # Build arguments to EnumProcesses and call EnumProcesses
        if self.EnumProcesses(ctypes.byref(processIds), ...
            ..
        else:
            return []
I separated the code into two methods to make it easier to read (I believe you are already doing this). Here is the tricky part: EnumProcesses is using pointers, and they are not easy to play with. Another thing is that I don't know how to work with pointers in Python, so I couldn't tell you of an easy way to mock that out =P
What I can tell you is to simply not test it. Your logic there is very minimal. Besides increasing the size of count, everything else in that function is creating the space EnumProcesses pointers will use. Maybe you can add a limit to the count size but other than that, this method is short and sweet. It returns the windows processes and nothing more. Just what I was asking for in my original comment :)
So leave that method alone. Don't test it. Make sure, though, that anything that uses getWindowsProcess and getLinuxProcess gets mocked out as per my original suggestion.
Hopefully this makes more sense :) If it doesn't let me know and maybe we can have a chat session or do a video call or something.
original answer
I'm not exactly sure how to do what you are asking, but whenever I need to test code that depends on some outside force (external libraries, popen or in this case processes) I mock out those parts.
Now, I don't know how your code is structured, but maybe you can do something like this:
def getWindowsProcesses(self, ...):
    '''Call Windows API function EnumProcesses and
    return the list of processes
    '''
    # ... call EnumProcesses ...
    return listOfProcesses

def getLinuxProcesses(self, ...):
    '''Look in /proc dir and return list of processes'''
    # ... look in /proc ...
    return listOfProcesses
These two methods only do one thing, get the list of processes. For Windows, it might just be a call to that API and for Linux just reading the /proc dir. That's all, nothing more. The logic for handling the processes will go somewhere else. This makes these methods extremely easy to mock out since their implementations are just API calls that return a list.
Your code can then easy call them:
def getProcesses(...):
    '''Get the processes running.'''
    isLinux = # ... logic for determining OS ...
    if isLinux:
        processes = getLinuxProcesses(...)
    else:
        processes = getWindowsProcesses(...)
    # ... do something with processes, write to log file, etc ...
In your test, you can then use a mocking library such as Fudge. You mock out these two methods to return what you expect them to return.
This way you'll be testing your logic since you can control what the result will be.
from fudge import patched_context
...

def test_getProcesses(self, ...):
    monitor = MonitorTool(..)
    # Patch the method that gets the processes. Whenever it gets called, return
    # our predetermined list.
    originalProcesses = [....pids...]
    with patched_context(monitor, "getLinuxProcesses", lambda x: originalProcesses):
        monitor.getProcesses()
        # ... assert logic is right ...

    # Let's "add" some new processes and test that our logic realizes new
    # processes were added.
    newProcesses = [...]
    updatedProcesses = originalProcesses + newProcesses
    with patched_context(monitor, "getLinuxProcesses", lambda x: updatedProcesses):
        monitor.getProcesses()
        # ... assert logic caught new processes ...

    # Let's "kill" our new processes and test that our logic can handle it
    with patched_context(monitor, "getLinuxProcesses", lambda x: originalProcesses):
        monitor.getProcesses()
        # ... assert logic caught processes were 'killed' ...
Keep in mind that if you test your code this way, you won't get 100% code coverage (since your mocked methods won't be run), but this is fine. You're testing your code and not third party's, which is what matters.
Hopefully this might be able to help you. I know it doesn't answer your question, but maybe you can use this to figure out the best way to test your code.
Your original idea of using subprocess is a good one. Just create your own executable and name it something that identifies it as a testing thing. Maybe make it do something like sleep for a while.
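A minimal sketch of that idea, assuming a hypothetical helper script sleeper_for_tests.py whose whole body is import time; time.sleep(30):
import subprocess

# Start the uniquely named fake process without suspending the terminal,
# let the monitor observe it, then clean it up.
proc = subprocess.Popen(["python", "sleeper_for_tests.py"])
# ... assert that the monitor's log now contains proc.pid ...
proc.terminate()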
Alternatively, you could actually use the multiprocessing module. I've not used Python on Windows much, but you should be able to get process-identifying data out of the Process object you create:
import multiprocessing
import time

p = multiprocessing.Process(target=time.sleep, args=(30,))
p.start()
pid = p.pid  # Process objects expose the child's pid as an attribute
