Protection from accidentally misnaming object attributes in Python?

Protection from accidentally misnaming object attributes in Python? - python

A friend was "burned" when starting to learn Python, and now sees the language as perhaps fatally flawed.
He was using a library and changed the value of an object's attribute (the class being in the library), but he used the wrong abbreviation for the attribute name. It took him "forever" to figure out what was wrong. His objection to Python is thus that it allows one to accidentally add attributes to an object.
Unit tests don't provide a solution to this. One doesn't write unit tests against an API being used. One may have a mock for the class, but the mock could have the same typo or incorrect assumption about the attribute name.
It's possible to use __setattr__() to guard against this, but (as far as I know), no one does.
The only thing I've been able to tell my friend is that after several years of writing Python code full-time, I don't recall ever being burned by this. What else can I tell him?

"changed the value of an object's attribute" Can lead to problems. This is pretty well known. You know it, now, also. That doesn't indict the language. It simply says that you've learned an important lesson in dynamic language programming.
Unit testing absolutely discovers this. You are not forced to mock all library classes. Some folks say it's only a unit test when it's tested in complete isolation. This is silly. You have to trust the library modules -- it's a feature of your architecture. Rather than mock them, just use them. (It is important to write mocks for your own newly-developed libraries. It's also important to mock libraries that make expensive API calls.)
In most cases, you can (and should) test your classes with the real library modules. This will find the misspelled attribute name.
Also, now that you know that attributes are dynamic, it's really easy to verify that the attribute exists. How?
Use interactive Python to explore the classes before writing too much code.
Remember, Python is not Java and it's not C. You can execute Python interactively and determine immediately if you've spelled something wrong. Writing a lot of code without doing any interactive confirmation is -- simply -- the wrong way to use Python.
A little interactive exploration will find misspelled attribute names.
Finally -- for your own classes -- you can wrap updatable attributes as properties. This makes it easier to debug any misspelled attribute names. Again, you know to check for this. You can use interactive development to confirm the attribute names.
Fussing around with __setattr__ creates problems. In some cases, we actually need to add attributes to an object. Why? It's simpler than creating a whole subclass for one special case where we have to maintain more state information.
Other things you can say:
I was burned by a C program that absolutely could not be made to work because of ______. [Insert any known C-language problem you want here. No array bounds checking, for example] Does that make C fatally flawed?
I was burned by a DBA who changed a column name and all the SQL broke. It's painful to unit test all of it. Does that make the relational database fatally flawed?
I was burned by a sys admin who changed a directory's permissions and my application broke. It was nearly impossible to find. Does that make the OS fatally flawed?
I was burned by a COBOL program where someone changed the copybook, forgot to recompile the program, and we couldn't debug it because the source looked perfect. COBOL, however, actually is fatally flawed, so this isn't a good example.

There are code analyzers like pylint that will warn you if you add a attribute outside of __init__. PyDev has nice support for it. Such errors are very easy to find with a debugger too.

If the possibility to make mistakes is enough for him to consider a language "fatally flawed", I don't think you can convince him otherwise. The more you can do with a language, the more you can do wrong with the language. It's a caveat of flexibility—but that's true for any language.

You can use the __slots__ class attribute to limit the attributes that instances have. Attempting to set an attribute that's not expliticly listed will raise an AttributeError. There are some complications that arise with subclassing. See the Python data model reference for details.

A tool like pylint or pychecker may be able to detect this.

He's effectively ruling out an entire class of programming languages -- dynamically-typed languages -- because of one hard lesson learned. He can use only statically-typed languages if he wishes and still have a very productive career as a programmer, but he is certainly going to have deep frustrations with them as well. Will he then conclude that they are fatally-flawed?

I think your friend has misplaced his frustration in the language. His real problem is lack of debugging techniques. teach him how to break down a program into small pieces to examine the output. like a manual unit test, this way any inconsistency is found and any assumptions are proven or discarded.

I had a similar bad experience with Python when I first started ... took me 3 months to get over it. Having a tool which warns would be nice back then ...

Related

How to apply Closed-Open and Inversion of Control principles in Python?

Building out a new application now and struggling a lot with the implementation part of "Closed-Open" and "Inversion of Control" principles I following after reading Clean Architecture book by Uncle Bob.
How can I implement them in Python?
Usually, these two principles coming hand in hand and depicted in the UML as an Interface reversing control from module/package A to B.
I'm confused because:
Python does not possess Interfaces as Java and C++ do. Yes, there are ABC and #abstractmethod, but it is not a Pythonic style and redundant from my point of view if you are not developing a framework
Passing a class to the method of another one (I understood that it is a way to implement open-closed principle) is a little bit dangerous in Python, since it does not have a compiler which is catching issues may (and will) happen if one of two loosely coupled objects change
After neglecting interfaces and passing a top-level class to lower-level ones... I still need to import everything somewhere at the top module. And by that, the whole thing is violated.
So, as you can see I'm super confused and having a hard time programming according to my design. I came up with. Can you help me, please?

You just pass an object that implements the methods you need it to implement.
True, there is no "Interface" to define what those methods have to be, but that's just the way it is in python.
You pass around arguments all the time that have to be lists, maps, tuples, or whatever, and none of these are type-checked. You can write code that calls whatever you want on these things and python will not notice any kind of problem until that code is actually executed.
It's exactly the same when you need those arguments to implement whatever IoC interface you're using. Make sure you detail the requirements in comments.
Yes, this is all pretty dangerous. That's why we prefer statically typed languages for large systems that have complex interfaces.

What does R0902 of Pylint mean? Why do we have this limit?

What is behind the R0902 limit? Will too many instance attributes slow down the python interpreter or is it simply because too many instance attributes will make the class harder to understand?

From the (DOCS):
too-many-instance-attributes (R0902):
Too many instance attributes (%s/%s) Used when class has too many instance attributes, try to reduce this to get a simpler (and so easier to use) class.
And then from the (Tutorial):
... but I have run into error messages that left me with no clue about what went wrong, simply because I was unfamiliar with the underlying mechanism of code theory. One error that puzzled my newbie mind was:
:too-many-instance-attributes (R0902): *Too many instance attributes (%s/%s)*
I get it now thanks to Pylint pointing it out to me. If you don’t get that one, pour a fresh cup of coffee and look into it - let your programmer mind grow!
So, yes that is not so clear.
But, as stated in the description above, with methods, classes and modules, smaller is generally better for understand-ability, re-usability and therefore manageability. When things get too large, it is often a sign that things could be refactored, to make them smaller. There is really no way to place a hard limit on this, and as disccussed in this question about how to turn this message off, pylint should not have the last word.
So, just take the message as a hint to study that section of code a bit more.

Tracking changes in python source files?

I'm learning python and came into a situation where I need to change the behvaviour of a function. I'm initially a java programmer so in the Java world a change in a function would let Eclipse shows that a lot of source files in Java has errors. That way I can know which files need to get modified. But how would one do such a thing in python considering there are no types?! I'm using TextMate2 for python coding.
Currently I'm doing the brute-force way. Opening every python script file and check where I'm using that function and then modify. But I'm sure this is not the way to deal with large projects!!!
Edit: as an example I define a class called Graph in a python script file. Graph has two objects variables. I created many objects (each with different name!!!) of this class in many script files and then decided that I want to change the name of the object variables! Now I'm going through each file and reading my code again in order to change the names again :(. PLEASE help!
Example: File A has objects x,y,z of class C. File B has objects xx,yy,zz of class C. Class C has two instance variables names that should be changed Foo to Poo and Foo1 to Poo1. Also consider many files like A and B. What would you do to solve this? Are you serisouly going to open each file and search for x,y,z,xx,yy,zz and then change the names individually?!!!

Sounds like you can only code inside an IDE!
Two steps to free yourself from your IDE and become a better programmer.
Write unit tests for your code.
Learn how to use grep
Unit tests will exercise your code and provide reassurance that it is always doing what you wanted it to do. They make refactoring MUCH easier.
grep, what a wonderful tool grep -R 'my_function_name' src will find every reference to your function in files under the directory src.
Also, see this rather wonderful blog post: Unix as an IDE.

Whoa, slow down. The coding process you described is not scalable.
How exactly did you change the behavior of the function? Give specifics, please.
UPDATE: This all sounds like you're trying to implement a class and its methods by cobbling together a motley patchwork of functions and local variables - like I wrongly did when I first learned OO coding in Python. The code smell is that when the type/class of some class internal changes, it should generally not affect the class methods. If you're refactoring all your code every 10 mins, you're doing something seriously wrong. Step back and think about clean decomposition into objects, methods and data members.
(Please give more specifics if you want a more useful answer.)
If you were only changing input types, there might be no need to change the calling code.
(Unless the new fn does something very different to the old one, in which case what was the argument against calling it a different name?)
If you changed the return type, and you can't find a common ancestor type or container (tuple, sequence etc.) to put the return values in, then yes you need to change its caller code. However...
...however if the function should really be a method of a class, declare that class and the method already. The previous paragraph was a code smell that your function really should have been a method, specifically a polymorphic method.
Read about code smells, anti-patterns and When do you know you're dealing with an anti-pattern?. There e.g. you will find a recommendation for the video "Recovery from Addiction - A taste of the Python programming language's concision and elegance from someone who once suffered an addiction to the Java programming language." - Sean Kelly
Also, sounds like you want to use Test-Driven Design and add some unittests.
If you give us the specifics we can critique it better.

You won't get this functionality in a text editor. I use sublime text 3, and I love it, but it doesn't have this functionality. It does however jump to files and functions via its 'Goto Anything' (Ctrl+P) functionality, and its Multiple Selections / Multi Edit is great for small refactoring tasks.
However, when it comes to IDEs, JetBrains pycharm has some of the amazing re-factoring tools that you might be looking for.
The also free Python Tools for Visual Studio (see free install options here which can use the free VS shell) has some excellent Refactoring capabilities and a superb REPL to boot.
I use all three. I spend most of my time in sublime text, I like pycharm for refactoring, and I find PT4VS excellent for very involved prototyping.
Despite python being a dynamically typed language, IDEs can still introspect to a reasonable degree. But, of course, it won't approach the level of Java or C# IDEs. Incidentally, if you are coming over from Java, you may have come across JetBrains IntelliJ, which PyCharm will feel almost identical to.
One's programming style is certainly different between a statically typed language like C# and a dynamic language like python. I find myself doing things in smaller, testable modules. The iteration speed is faster. And in a dynamic language one relies less on IDE tools and more on unit tests that cover the key functionality. If you don't have these you will break things when you refactor.

One answer only specific to your edit:
if your old code was working and does not need to be modified, you could just keep old names as alias of the new ones, resulting in your old code not to be broken. Example:
class MyClass(object):
def __init__(self):
self.t = time.time()
# creating new names
def new_foo(self, arg):
return 'new_foo', arg
def new_bar(self, arg):
return 'new_bar', arg
# now creating functions aliases
foo = new_foo
bar = new_bar
if your code need rework, rewrite your common code, execute everything, and correct any failure. You could also look for any import/instantiation of your class.

One of the tradeoffs between statically and dynamically typed languages is that the latter require less scaffolding in the form of type declarations, but also provide less help with refactoring tools and compile-time error detection. Some Python IDEs do offer a certain level of type inference and help with refactoring, but even the best of them will not be able to match the tools developed for statically typed languages.
Dynamic language programmers typically ensure correctness while refactoring in one or more of the following ways:
Use grep to look for function invocation sites, and fix them. (You would have to do that in languages like Java as well if you wanted to handle reflection.)
Start the application and see what goes wrong.
Write unit tests, if you don't already have them, use a coverage tool to make sure that they cover your whole program, and run the test suite after each change to check that everything still works.

When coding in Python, how do I achieve guarantees of correctness similar to those I get with Haskell's type system?

Using Haskell's type system I know that at some point in the program, a variable must contain say an Int of a list of strings. For code that compiles, the type checker offers certain guarantees that for instance I'm not trying to add an Int and a String.
Are there any tools to provide similar guarantees for Python code?
I know about and practice TDD.

The quick answer is "not really". While tools like PyLint (which is very good BTW) will give you a lot of help and good advice on what constitutes good Python style, that isn't exactly what you're looking for and it certainly isn't a real substitute for things like HM type inference.
There are some interesting research projects in this area, notably Gradual Typing by Jeremy Siek and colleagues and some really interesting ideas like the blame calculus of Wadler and Findler.
Practically speaking, I think the best you can achieve is by using some sensibly chosen runtime methods. Use the inspect module to test the type of an object (but remember to be true to Python's duck typing and so on). Use assert statements liberally. Or (possible 'And') use something like Design by Contract using decorators. There are lots of ways to implement these idioms, but this is typically done on a per-project basis. You may want to think about whether and how such methods affect the performance and resource usage of your programs, if this is critical for you. There have, however, been some efforts to standardise techniques like DBC for Python, but these haven't (yet) been pushed into the cPython trunk. Here's hoping though :)

Python is dynamic and strongly typed programming language. What that means is that you can define a variable without explicitly stating its type, but when you first use that variable it becomes bound to a certain type.
For example,
x = 5 is an integer, and so now you cannot concatenate it with string, e.g. x+"hello"

Is it OK to inspect properties beginning with underscore?

I've been working on a very simple crud generator for pylons. I came up with something that inspects
SomeClass._sa_class_manager.mapper.c
Is it ok to inspect this (or to call methods begining with underscore)? I always kind of assumed this is legal though frowned upon as it relies heavily on the internal structure of a class/object. But hey, since python does not really have interfaces in the Java sense maybe it is OK.

It is intentional (in Python) that there are no "private" scopes. It is a convention that anything that starts with an underscore should not ideally be used, and hence you may not complain if its behavior or definition changes in a next version.

In general, this usually indicates that the method is effectively internal, rather than part of the documented interface, and should not be relied on. Future versions of the library are free to rename or remove such methods, so if you care about future compatability without having to rewrite, avoid doing it.

If it works, why not? You could have problems though when _sa_class_manager gets restructured, binding yourself to this specific version of SQLAlchemy, or creating more work to track the changes. As SQLAlchemy is a fast moving target, you may be there in a year already.
The preferable way would be to integrate your desired API into SQLAlchemy itself.

It's generally not a good idea, for reasons already mentioned. However, Python deliberately allows this behaviour in case there is no other way of doing something.
For example, if you have a closed-source compiled Python library where the author didn't think you'd need direct access to a certain object's internal state—but you really do—you can still get at the information you need. You have the same problems mentioned before of keeping up with different versions (if you're lucky enough that it's still maintained) but at least you can actually do what you wanted to do.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.