Global Reliance Making Unit Testing Impossible

Global Reliance Making Unit Testing Impossible - python

Background:
I am a hobbyist Python coder whose first and only project is a text-based Adventure Game Engine (https://github.com/zig13/scenzig).
An Adventure Game is a common first project but I realised I didn't have the creativity to write an adventure so decided instead to write a framework so someone else could do the adventure bit.
It started out as a monolithic script with no functions - at one point 712 lines long - but as I have learnt Python it has been repeatedly rewritten.
I recently learned of the concepts of unit testing and logging and I think they would be really helpful to me going forward but I have hit a snag.
The majority of my functions need access to the Adventure files and/or Character file to function.
These are accessed as a dictionary (thanks to ConfigObj) and a reference to the dictionaries is passed among the modules like this:
def GiveAdv(a) :
global adv
adv = a
efunc.GiveAdv(a)
def GiveChar(c) :
global char
char = c
argsolve.GiveChar(char)
efunc.GiveChar(c)
i.e. After importing another module, a module passes on it's reference to the adventure files and character files. Later in the same module adv.f['Actions'][str(action)] is used to get the information on a specific action from the Actions file for example.
I've never liked this but it has been fine until now but is a problem now because:
I am trying to merge two modules rewriting much from scratch in the process and I have no way of testing as I go as everything is so inter-connected
Unit testing relies on a function taking arguments and producing outputs but most of mine also expect to have access to information in addition to their arguments
As I see it, Logging (using the logging module) would require somehow passing a reference to the log file around and I just can't believe that people would resort to my horrible "Give" function solution so there must be a better way.
What is the Pythonic way of giving disparate modules access to the same information, class instances and loggers?

Related

Splitting code into modules (conventions)

So I've been searching for a bit and couldn't find anything on Google or PEP discussing this.
I am doing a project with tkinter and I had a file, that is part of a project, that was only 200 lines of code (excluding all the commented out code). While the entire file was related to the GUI portion of the project, it felt a bit long and a bit broad to me.
I ended up splitting the file into 4 different files that each has its own portion of the GUI.
Basically, the directory looks like this:
project/
guiclasses/
statisticsframe.py
textframes.py
windowclass.py
main_gui.py
...
statisticsframe has a class of a frame that shows statistics about stuff.
textframes holds 3 classes of frames holding textareas, one of them inherits Frame, the others inherit the first one.
windowclass basically creates the root of the window and all the general initialization for a tkinter GUI.
main_gui isn't actually the name but it simply combines all the above three and runs the mainloop()
Overall, each file is now 40-60 lines of code.
I am wondering if there are any conventions regarding this. The rule of thumb in most languages is that if you can reuse the functions/ classes elsewhere then you should split, though in Python it is less of a problem since you can import specific classes and functions from modules.
Sorry if it isn't coherent enough, nearly 3AM here and it is simply sitting in the back of my head.

I'm not familiar with tkinter, so my advice would be rather broad.
You can use any split into modules which you feel is better, but
as readability counts try making names coherent and do not repeat yourself: guiclasses - your enire progarm is about GUI, and there obviously classes somewhere, why repeath that in a name? imagine typing all that in in import, make it meaningful to type
flat structure is better than nested, three modules do not have to go to submodule
best split is across layers of abstraction (this is probably hardest and specific to tkinter)
anything in a module shoudl be rather self-sufficient and quite isolated from other parts of the program
modules should make good entitites for unit testing (eg share same fixtures)
can you write an understandable docstring for a module? then it's a good one.
try learning by example, I often seek wisdom for naming and package structure in Barry Warsaw mailman, maybe you can try finding some reputable repo with tkinter to follow (eg IDLE?).
From purely syntatic view I would have named the modules as:
- <package_name>
- baseframe
- textframe
- window
- main

Tracking changes in python source files?

I'm learning python and came into a situation where I need to change the behvaviour of a function. I'm initially a java programmer so in the Java world a change in a function would let Eclipse shows that a lot of source files in Java has errors. That way I can know which files need to get modified. But how would one do such a thing in python considering there are no types?! I'm using TextMate2 for python coding.
Currently I'm doing the brute-force way. Opening every python script file and check where I'm using that function and then modify. But I'm sure this is not the way to deal with large projects!!!
Edit: as an example I define a class called Graph in a python script file. Graph has two objects variables. I created many objects (each with different name!!!) of this class in many script files and then decided that I want to change the name of the object variables! Now I'm going through each file and reading my code again in order to change the names again :(. PLEASE help!
Example: File A has objects x,y,z of class C. File B has objects xx,yy,zz of class C. Class C has two instance variables names that should be changed Foo to Poo and Foo1 to Poo1. Also consider many files like A and B. What would you do to solve this? Are you serisouly going to open each file and search for x,y,z,xx,yy,zz and then change the names individually?!!!

Sounds like you can only code inside an IDE!
Two steps to free yourself from your IDE and become a better programmer.
Write unit tests for your code.
Learn how to use grep
Unit tests will exercise your code and provide reassurance that it is always doing what you wanted it to do. They make refactoring MUCH easier.
grep, what a wonderful tool grep -R 'my_function_name' src will find every reference to your function in files under the directory src.
Also, see this rather wonderful blog post: Unix as an IDE.

Whoa, slow down. The coding process you described is not scalable.
How exactly did you change the behavior of the function? Give specifics, please.
UPDATE: This all sounds like you're trying to implement a class and its methods by cobbling together a motley patchwork of functions and local variables - like I wrongly did when I first learned OO coding in Python. The code smell is that when the type/class of some class internal changes, it should generally not affect the class methods. If you're refactoring all your code every 10 mins, you're doing something seriously wrong. Step back and think about clean decomposition into objects, methods and data members.
(Please give more specifics if you want a more useful answer.)
If you were only changing input types, there might be no need to change the calling code.
(Unless the new fn does something very different to the old one, in which case what was the argument against calling it a different name?)
If you changed the return type, and you can't find a common ancestor type or container (tuple, sequence etc.) to put the return values in, then yes you need to change its caller code. However...
...however if the function should really be a method of a class, declare that class and the method already. The previous paragraph was a code smell that your function really should have been a method, specifically a polymorphic method.
Read about code smells, anti-patterns and When do you know you're dealing with an anti-pattern?. There e.g. you will find a recommendation for the video "Recovery from Addiction - A taste of the Python programming language's concision and elegance from someone who once suffered an addiction to the Java programming language." - Sean Kelly
Also, sounds like you want to use Test-Driven Design and add some unittests.
If you give us the specifics we can critique it better.

You won't get this functionality in a text editor. I use sublime text 3, and I love it, but it doesn't have this functionality. It does however jump to files and functions via its 'Goto Anything' (Ctrl+P) functionality, and its Multiple Selections / Multi Edit is great for small refactoring tasks.
However, when it comes to IDEs, JetBrains pycharm has some of the amazing re-factoring tools that you might be looking for.
The also free Python Tools for Visual Studio (see free install options here which can use the free VS shell) has some excellent Refactoring capabilities and a superb REPL to boot.
I use all three. I spend most of my time in sublime text, I like pycharm for refactoring, and I find PT4VS excellent for very involved prototyping.
Despite python being a dynamically typed language, IDEs can still introspect to a reasonable degree. But, of course, it won't approach the level of Java or C# IDEs. Incidentally, if you are coming over from Java, you may have come across JetBrains IntelliJ, which PyCharm will feel almost identical to.
One's programming style is certainly different between a statically typed language like C# and a dynamic language like python. I find myself doing things in smaller, testable modules. The iteration speed is faster. And in a dynamic language one relies less on IDE tools and more on unit tests that cover the key functionality. If you don't have these you will break things when you refactor.

One answer only specific to your edit:
if your old code was working and does not need to be modified, you could just keep old names as alias of the new ones, resulting in your old code not to be broken. Example:
class MyClass(object):
def __init__(self):
self.t = time.time()
# creating new names
def new_foo(self, arg):
return 'new_foo', arg
def new_bar(self, arg):
return 'new_bar', arg
# now creating functions aliases
foo = new_foo
bar = new_bar
if your code need rework, rewrite your common code, execute everything, and correct any failure. You could also look for any import/instantiation of your class.

One of the tradeoffs between statically and dynamically typed languages is that the latter require less scaffolding in the form of type declarations, but also provide less help with refactoring tools and compile-time error detection. Some Python IDEs do offer a certain level of type inference and help with refactoring, but even the best of them will not be able to match the tools developed for statically typed languages.
Dynamic language programmers typically ensure correctness while refactoring in one or more of the following ways:
Use grep to look for function invocation sites, and fix them. (You would have to do that in languages like Java as well if you wanted to handle reflection.)
Start the application and see what goes wrong.
Write unit tests, if you don't already have them, use a coverage tool to make sure that they cover your whole program, and run the test suite after each change to check that everything still works.

Understand programmatically a python code without executing it

I am implementing a workflow management system, where the workflow developer overloads a little process function and inherits from a Workflow class. The class offers a method named add_component in order to add a component to the workflow (a component is the execution of a software or can be more complex).
My Workflow class in order to display status needs to know what components have been added to the workflow. To do so I tried 2 things:
execute the process function 2 times, the first time allow to gather all components required, the second one is for the real execution. The problem is, if the workflow developer do something else than adding components (add element in a databases, create a file) this will be done twice!
parse the python code of the function to extract only the add_component lines, this works but if some components are in a if / else statement and the component should not be executed, the component apears in the monitoring!
I'm wondering if there is other solution (I thought about making my workflow being an XML or something to parse easier but this is less flexible).

You cannot know what a program does without "executing" it (could be in some context where you mock things you don't want to be modified but it look like shooting at a moving target).
If you do a handmade parsing there will always be some issues you miss.
You should break the code in two functions :
a first one where the code can only add_component(s) without any side
effects, but with the possibility to run real code to check the
environment etc. to know which components to add.
a second one that
can have side effects and rely on the added components.
Using an XML (or any static format) is similar except :
you are certain there are no side effects (don't need to rely on the programmer respecting the documentation)
much less flexibility but be sure you need it.

When should a Python script be split into multiple files/modules?

In Java, this question is easy (if a little tedious) - every class requires its own file. So the number of .java files in a project is the number of classes (not counting anonymous/nested classes).
In Python, though, I can define multiple classes in the same file, and I'm not quite sure how to find the point at which I split things up. It seems wrong to make a file for every class, but it also feels wrong just to leave everything in the same file by default. How do I know where to break a program up?

Remember that in Python, a file is a module that you will most likely import in order to use the classes contained therein. Also remember one of the basic principles of software development "the unit of packaging is the unit of reuse", which basically means:
If classes are most likely used together, or if using one class leads to using another, they belong in a common package.

As I see it, this is really a question about reuse and abstraction. If you have a problem that you can solve in a very general way, so that the resulting code would be useful in many other programs, put it in its own module.
For example: a while ago I wrote a (bad) mpd client. I wanted to make configuration file and option parsing easy, so I created a class that combined ConfigParser and optparse functionality in a way I thought was sensible. It needed a couple of support classes, so I put them all together in a module. I never use the client, but I've reused the configuration module in other projects.
EDIT: Also, a more cynical answer just occurred to me: if you can only solve a problem in a really ugly way, hide the ugliness in a module. :)

In Java ... every class requires its own file.
On the flipside, sometimes a Java file, also, will include enums or subclasses or interfaces, within the main class because they are "closely related."
not counting anonymous/nested classes
Anonymous classes shouldn't be counted, but I think tasteful use of nested classes is a choice much like the one you're asking about Python.
(Occasionally a Java file will have two classes, not nested, which is allowed, but yuck don't do it.)

Python actually gives you the choice to package your code in the way you see fit.
The analogy between Python and Java is that a file i.e., the .py file in Python is
equivalent to a package in Java as in it can contain many related classes and functions.
For good examples, have a look in the Python built-in modules.
Just download the source and check them out, the rule of thumb I follow is
when you have very tightly coupled classes or functions you keep them in a single file
else you break them up.

Testing GUI code: should I use a mocking library?

Recently I've been experimenting with TDD while developing a GUI application in Python. I find it very reassuring to have tests that verify the functionality of my code, but it's been tricky to follow some of the recommened practices of TDD. Namely, writing tests first has been hard. And I'm finding it difficult to make my tests readable (due to extensive use of a mocking library).
I chose a mocking library called mocker. I use it a lot since much of the code I'm testing makes calls to (a) other methods in my application that depend on system state or (b) ObjC/Cocoa objects that cannot exist without an event loop, etc.
Anyway, I've got a lot of tests that look like this:
def test_current_window_controller():
def test(config):
ac = AppController()
m = Mocker()
ac.iter_window_controllers = iwc = m.replace(ac.iter_window_controllers)
expect(iwc()).result(iter(config))
with m:
result = ac.current_window_controller()
assert result == (config[0] if config else None)
yield test, []
yield test, [0]
yield test, [1, 0]
Notice that this is actually three tests; all use the same parameterized test function. Here's the code that is being tested:
def current_window_controller(self):
try:
# iter_window_controllers() iterates in z-order starting
# with the controller of the top-most window
# assumption: the top-most window is the "current" one
wc = self.iter_window_controllers().next()
except StopIteration:
return None
return wc
One of the things I've noticed with using mocker is that it's easier to write the application code first and then go back and write the tests second, since most of the time I'm mocking many method calls and the syntax to write the mocked calls is much more verbose (thus harder to write) than the application code. It's easier to write the app code and then model the test code off of that.
I find that with this testing method (and a bit of discipline) I can easily write code with 100% test coverage.
I'm wondering if these tests are good tests? Will I regret doing it this way down the road when I finally discover the secret to writing good tests?
Am I violating the core principles of TDD so much that my testing is in vain?

If you are writing your tests after you've written your code and making them pass, you are not doing TDD (nor are you getting any benefits of Test-First or Test-Driven development.. check out SO questions for definitive books on TDD)
One of the things I've noticed with
using mocker is that it's easier to
write the application code first and
then go back and write the tests
second, since most of the time I'm
mocking many method calls and the
syntax to write the mocked calls is
much more verbose (thus harder to
write) than the application code. It's
easier to write the app code and then
model the test code off of that.
Of course, its easier because you are just testing that the sky is orange after you made it orange by painting it with a specific kind of brush.
This is retrofitting tests (for self-assurance). Mocks are good but you should know how and when to use them - Like the saying goes 'When you have a hammer everything looks like a nail' It's also easy to write a whole load of unreadable and not-as-helpful-as-can-be tests. The time spent understanding what the test is about is time lost that can be used to fix broken ones.
And the point is:
Read Mocks aren't stubs - Martin Fowler if you haven't already. Google out some documented instances of good ModelViewPresenter patterned GUIs (Fake/Mock out the UIs if necessary).
Study your options and choose wisely. I'll play the guy with the halo on your left shoulder in white saying 'Don't do it.' Read this question as to my reasons - St. Justin is on your right shoulder. I believe he has also something to say:)

Unit tests are really useful when you refactor your code (ie. completely rewrite or move a module). As long as you have unit tests before you do the big changes, you'll have confidence that you havent forgotten to move or include something when you finish.

Please remember that TDD is not a panaceum. It's hard, it's supposed to be hard, and it's especially hard to write mocking tests "in advance".
So I would say - do what works for you. Even it's not "certified TDD". I do basically the same thing.
You may want to provide your own API for GUI that would sit between controller code and GUI library code. That could be easier to mock, or you can even add some testing hooks to it.
Last but not least, your code doesn't look too unreadable to me. Code using mocks is generally harder to understand. Fortunately in Python mocking is much easier and cleaner than i n other languages.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.