How to read Python code better? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
Without explicit type declarations I struggle to figure out how things work. Are there some good rules of thumb or tips that you may have for reading Python code better? Thanks!

In spite of the first impression this question gives, I think it is really an intelligent one, because it shows that you are subconsciously aware of something that should interest any Python developer but that I find much neglected in general, and in explanations in particular, if not misunderstood.
I mean that, IMO, the basis of Python is terrifically quaint and intelligent: it is the data model on which it was conceived.
In Python's data model there are no variables in the sense of "chunks of memory whose contents can change", contrary to other languages; we do not manage that precise kind of variable in Python.
More precisely, everything is an object in Python, and every object is named and designated by an identifier, but neither the object nor the identifier is a 'variable' in that sense.
That doesn't mean that there are no little boxes, called variables in other languages, temporarily hosting values that go in and out of them, in the depths of the implementation.
.
Say an object is designated by the identifier XYA2.
Personally, I use this appearance of letters to designate any identifier. An identifier is nothing more than a word written in code. It is what appears in the code.
Note that this appearance of letters is the one used by this stackoverflow.com site to represent a code sample inside text, by clicking on the button {}. That's easy to remember.
Now, the object whose name is XYA2 is a real thing, a concrete set of bits lying in the memory of the computer to represent the desired conceptual value that it stands for.
This set is defined in the C language, in which the reference implementation of Python (CPython) is written.
Personally, I bold the letters when I want to designate an object.
So the object named XYA2 is, for me, referred to by XYA2.
The identifier is XYA2.
It is linked to an underlying and inaccessible pointer that points to the object.
This link is made by means of the symbol table. You will see very few references or allusions to the symbol table in general, here on stackoverflow or elsewhere. However, it's very important, I think.
The pointer linked to the identifier XYA2 points to the object XYA2.
So, XYA2 is directly linked to the pointer and indirectly linked to the object.
Instead of saying "indirectly linked", we say "assigned". An object and its identifier are reciprocally assigned to each other, but the medium of this link is the underlying pointer.
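To make the identifier/object distinction concrete, here is a minimal sketch. The names first_name and second_name are made up for illustration; the point is only that assignment binds a second identifier to the same object and copies nothing.

```python
# Two identifiers bound to one list object; `is` compares object identity.
first_name = [1, 2, 3]
second_name = first_name                  # binds a second identifier, copies nothing

same_object = second_name is first_name   # True: one object, two names
second_name.append(4)                     # mutates the single shared object

# Rebinding makes `second_name` refer to a different object;
# the list still reachable through `first_name` is untouched by this.
second_name = "now a string"

print(same_object, first_name, second_name)
```

After the rebinding, only the identifier changed its target; the list object itself was never replaced.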
.
And now, something important.
Strictly speaking, a variable is a "chunk of memory whose content can change".
I personally make an effort never to use the word 'variable' in any sense other than this one.
The problem is that, because of the use of the word 'variable' in mathematics, this word is very often used indiscriminately and thrown to the four winds by many developers (not all) even when it isn't justified.
Thereby, it is commonly used by nearly everybody to designate the names, aka the identifiers, in code. But this practice is horribly confusing. It should be carefully avoided.
That said, an object in Python is not only an instance of some class, it is above all a concrete set of bits; a set which IS NOT, as far as I know, a variable in the sense of "chunk of memory whose content can change".
Hence my opinion that there aren't variables in Python, since the only entities we can access and manipulate are identifiers and objects.
However, the processes under the hood in an executing Python program use large numbers of pointers that are, as far as I know, real variables in the strict sense of the word.
So, in a sense, it could be said that my assertion 'There are no variables in Python' is false.
It's a matter of point of view.
As a developer in Python, conceptually speaking, I don't manage variables. When I think about an algorithm, I don't think at the level of the pointers, even if I know they exist and that it's very important to know they exist. Being not at the level of the variables but at the level of Python's data model, I don't see why I should accept that there are variables in a Python program. There are variables at the low level of the machine, and Python is a very-high-level language.
.
Why did I write all this?
1)
Because the nature of Python's data model has many consequences that can't be understood if the data model isn't known. Among these consequences, some are interesting because they give incredible possibilities, others are traps (a well-known example: modifying an element in a copied list also modifies the element in the original list). That's why it's of prime importance to learn about this data model.
For that, I recommend that you read these parts of the documentation:
3.1 of objects-values-and-types
4.1 of naming-and-binding
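The copied-list trap mentioned above can be reproduced in a few lines. This is only a sketch with made-up names (original, shallow, deep); it assumes the "copy" in question is a shallow one, which is what list(...) and slicing produce.

```python
import copy

original = [[1, 2], [3, 4]]
shallow = list(original)          # new outer list, same inner list objects
deep = copy.deepcopy(original)    # new outer list and new inner lists

shallow[0].append(99)             # mutates an inner list shared with `original`
deep[1].append(-1)                # only touches the deep copy

print(original)   # the first inner list now also contains 99
print(deep)
```

The outer lists are distinct objects, but the shallow copy's elements are the very same objects as the original's, which is exactly what the data model predicts.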
.
2)
To justify my answer to your perplexity: don't struggle over what happens under the hood:
there's a garbage collector, a reference counter, wagonloads of underlying dictionary-like entities, a thunderous ballet of values behind the underlying pointers, many checks made by the interpreter... When something doesn't fit, a warning is given in the form of an exception message.
Python has all the machinery under control.
The only concern you must have is to think about the algorithm you want to achieve, and for that, knowing the data model is essential.
Welcome to the Python universe.
Warning
I don't consider myself a very skilled Python developer; I'm just an amateur who had a lot of problems before understanding some essential things about Python.
All of the above is my personal view of Python's data model. If any point in it is incorrect, I will be happy to learn more, provided the teaching comes with well-developed arguments.
But I underline the fact that this view of things allows me to understand and answer a lot of tough problems and to achieve some tricky mechanisms that Python is capable of. So not everything in the above description can be false.

You should take a look at the PEP 8 documentation. It describes Python formatting and style.

Read up on Duck Typing. One of the purposes of Duck Typing is that you shouldn't be thinking too much about the type of something anyway. What really concerns you is that the variable can be used the way that you want it.
In Python, you don't need a type declaration because the name you assign is just a pointer to an object, and furthermore it can change at any time.
def my_function(): # placeholder, defined only so the lines below run
    return 42
class MyClass: # placeholder class
    pass
a = None
a = 1+5
a = my_function() # calls my function and assigns the return object to a
a = my_function # Assigns the function itself to a. You could actually pass it as a parameter
a = MyClass() # Creates an instance (running __init__()) and assigns the new instance to a
a = MyClass # Assigns the class itself to a.
This is all valid Python. You could run this sequentially, although changing the type like this is frowned upon unless it's totally clear why.
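As a hedged illustration of duck typing, the sketch below uses two invented classes (Duck and Robot) and an invented function announce. Nothing checks types up front; the call works for any object that has a speak() method and fails with an AttributeError otherwise.

```python
class Duck:
    def speak(self):
        return "Quack"

class Robot:
    def speak(self):
        return "Beep"

def announce(thing):
    # No type check: any object with a .speak() method works.
    return thing.speak().upper()

results = [announce(Duck()), announce(Robot())]

try:
    announce(42)                      # an int has no .speak()
except AttributeError as exc:
    error_name = type(exc).__name__   # the failure shows up only at run time

print(results, error_name)
```

When reading such code, the useful question is "what does this function do with its argument?" rather than "what is the declared type?".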

If you know C++11, this is similar to the auto type.
The variable's type is decided on the basis of what is assigned to it.

Related

Why true immutability is impossible in Python?

I was reading the documentation for attrs. It says:
Please note that true immutability is impossible in Python
I am wondering what is the reason for that. Why someone cannot have an immutable list in Python while it is possible in C++? What is the main difference here?

TLDR; "True Immutability" is only possible on an impervious stone tablet, but it's counter-productive to the discussion of mutability, and why it is used / is important. It's not worth being technically correct at the expense of being practically wrong.
This is a bad argument of semantics. Python allows re-defining variable names with different types in an otherwise strongly typed language, which is where some of the confusion comes from, but to be clear the object a variable name refers to can very much be properly immutable.
Take for instance a tuple with a few numbers in it:
>>> tup_A = (1,2,3)
It is not possible to change the values of any of the objects in the tuple:
>>> tup_A[0] = 10
TypeError: 'tuple' object does not support item assignment
It is possible to overwrite the variable name tup_A with some other value, but then it will be a different object entirely even if it was related to the original. For example a slice of a tuple creates an entirely new object rather than a view of the original:
>>> id(tup_A)
2887473131072
>>> tup_A = tup_A[:1]
>>> id(tup_A)
2887473037616
I believe the article mentioned may also be somewhat referring to the possibility of creating custom immutable types (classes). This is also a bad argument because there are plenty of mechanisms to enforce immutability. In particular, the tools for customizing attribute access and the property function can be used to great effect for this. Once these methods are used to implement immutability, one would have to intentionally break the class to mutate data which was not meant to be mutated. This is of course possible because Python is primarily distributed as source code, but the same could theoretically be said for the Python C API. Tuples don't have to be immutable if you rewrite Python, but that's so far beyond the point, it's fair to say it's just wrong.
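A minimal sketch of the property approach described above, using a hypothetical Point class. It also shows the "intentionally break the class" escape hatch: the underscore-prefixed attribute is still writable if someone chooses to bypass the convention.

```python
class Point:
    """Hypothetical class whose coordinates are read-only by design."""

    def __init__(self, x, y):
        self._x = x
        self._y = y

    @property
    def x(self):          # getter only: no setter is defined
        return self._x

    @property
    def y(self):
        return self._y

p = Point(1, 2)
try:
    p.x = 10              # normal assignment is rejected
    assignment_blocked = False
except AttributeError:
    assignment_blocked = True

p._x = 10                 # deliberately reaching past the property still works
print(assignment_blocked, p.x)
```

The accidental slip-up produces an error; only a deliberate act changes the value.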
Immutability is a tool with a specific purpose. It is a good idea to use it whenever possible so an accidental slip-up will produce an error message rather than a silent bug. If you encounter errors like this, you should never ask "how can I mutate this value which was intended to be immutable?", but rather ask "why is this value not meant to be mutated, and how am I intended to utilize it?"
P.S. You could probably even mutate a tuple without editing cpython using the ctypes library by getting the actual memory locations of the objects contained within it, and overwriting the pointers, but this would break lots of things (like garbage collection ref counting). Don't do this. It's another one of those "so far beyond the point" things.

Other languages like C++ treat variables as storage containers, but Python treats them as references to objects in memory. Lists can be modified in place, i.e. in the same memory location. But we also have tuples, whose values can't be updated in place.
I don't know exactly why Python treats variables this way, but I think it is necessary for dynamic typing. The statement that true immutability isn't possible may refer to this dynamic typing feature. You can Google to learn more.
Please let me know if this is what you wanted.

True immutability is impossible if your memory is mutable.
Think of immutability as checks, but not a hard guarantee that everything will stay the same.

Why doesn't Python support numeric constant? [duplicate]

Closed 9 years ago.
I come from C background and am learning Python. The lack of explicit type-safety is disturbing, but I am getting used to it. The lack of built-in contract-based programming (pure abstract classes, interfaces) is something to get used to, in the face of all the advantages of a dynamic language.
However, the inability to request const-correctness is driving me crazy! Why are there no constants in Python? Why are class-level constants discouraged?

C and Python belong to two different classes of languages.
The former is statically typed. The latter is dynamically typed.
In a statically typed language, the type checker is able to infer the type of each expression and check whether it matches the given declaration during the "compilation" phase.
In a dynamically typed language, the required type information is not available until run time, and the type of an expression may vary from one run to another. Of course, you could add type checking during program execution. This is not the choice made in Python. The advantage is that it allows "duck typing". The drawback is that the interpreter is not able to check for type correctness.
Concerning the const keyword: it is a type modifier that restricts the allowed uses of a variable (and sometimes changes which compiler optimizations are allowed). It seems quite inefficient to check that at run time in a dynamic language. At first analysis, that would imply checking whether a variable is const on every assignment. This could be optimized, but even so, is it worth the cost?
Beyond technical aspects, don't forget that each language has its own philosophy. In Python the usual choice is to favor "convention" over "restriction". As an example, constants should be spelled in all caps. There is no technical enforcement of that. It is just a convention. If you follow it, your program will behave as expected by "other programmers". If you decide to modify a "constant", Python won't complain. But you should feel like you are doing "something wrong". You are breaking a convention. Maybe you have your reasons for doing so. Maybe you shouldn't have. It is your responsibility.
As a final note, in dynamic languages the "correctness" of a program is much more the responsibility of your unit tests than of the compiler. If you really have difficulty taking that step, you will find some "code checkers" around, such as PyLint, PyChecker, PyFlakes...

I don't know why this design decision was made but my personal guess is that there's no explicit const keyword because the key benefits of constants are already available:
Constants are good for documentation purposes. If you see a constant, you know that you can't change it. This is also possible by naming conventions.
Constants are useful for function calls. If you pass a constant as a parameter to a function, you can be sure that it isn't changed. In Python functions are "call-by-value", but since Python variables are references you effectively pass a copy of a reference. Inside the function you can mutate the object that the reference points to, but if you reassign the parameter, the change does not persist outside of the function scope. Therefore, if you pass a number as a variable, it is actually passed "like" a constant. You can assign a new value to the variable, but outside of the function you still have the old number.
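The difference between reassigning a parameter and mutating the object it refers to can be sketched as follows. The function names rebind and mutate are invented for this example.

```python
def rebind(number):
    number = number + 1       # rebinds the local name only
    return number

def mutate(items):
    items.append("added")     # changes the caller's list object in place

count = 10
rebound = rebind(count)       # `count` itself is unaffected

names = ["a"]
mutate(names)                 # `names` now has a second element

print(count, rebound, names)
```

So a number passed to a function behaves "like" a constant from the caller's point of view, while a list does not.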
Moreover if there was a const keyword, it would create an asymmetry: variables are declared without keyword but consts are declared with a keyword. The logical consequence would be to create a second keyword named var. This is probably a matter of taste. Personally I prefer the minimalistic approach to variable declarations.
You can probably achieve a little more type safety, if you work with immutable data structures like tuples. Be careful however, the tuple itself can not be modified. But if it contains references to mutable objects, these are still mutable even if they belong to a tuple.
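A short sketch of that caveat, with a made-up settings tuple: the tuple refuses item assignment, yet a list stored inside it can still be changed.

```python
settings = ("prod", [8080, 8081])

try:
    settings[0] = "dev"             # the tuple itself cannot be modified
    item_assignment_allowed = True
except TypeError:
    item_assignment_allowed = False

settings[1].append(9090)            # the list inside the tuple is still mutable

print(item_assignment_allowed, settings)
```

The tuple still holds the same two objects as before; it is one of those objects that changed.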
Finally you might want to take a look at this snippet: http://code.activestate.com/recipes/65207-constants-in-python/?in=user-97991 I'm not sure if this is an implementation of "class-level constants". But I thought it might be useful.

Why has Python decided against constant references?

Note: I'm not talking about preventing the rebinding of a variable. I'm talking about preventing the modification of the memory that the variable refers to, and of any memory that can be reached from there by following the nested containers.
I have a large data structure, and I want to expose it to other modules, on a read-only basis. The only way to do that in Python is to deep-copy the particular pieces I'd like to expose - prohibitively expensive in my case.
I am sure this is a very common problem, and it seems like a constant reference would be the perfect solution. But I must be missing something. Perhaps constant references are hard to implement in Python. Perhaps they don't quite do what I think they do.
Any insights would be appreciated.
While the answers are helpful, I haven't seen a single reason why const would be either hard to implement or unworkable in Python. I guess "un-Pythonic" would also count as a valid reason, but is it really? Python does do scrambling of private instance variables (starting with __) to avoid accidental bugs, and const doesn't seem to be that different in spirit.
EDIT: I just offered a very modest bounty. I am looking for a bit more detail about why Python ended up without const. I suspect the reason is that it's really hard to implement to work perfectly; I would like to understand why it's so hard.

It's the same as with private methods: as consenting adults, authors of code should agree on an interface without the need for force. Because really, really enforcing the contract is hard, and doing it the half-assed way leads to hackish code in abundance.
Use get-only descriptors, and state clearly in your documentation that this data is meant to be read-only. After all, a determined coder could probably find a way to use your code in ways you never thought of anyway.
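One way to read "get-only descriptor" is sketched below. The ReadOnly and Config classes are hypothetical, and this is an illustration under that assumption rather than a hardened implementation.

```python
class ReadOnly:
    """Hypothetical data descriptor: readable, but refuses assignment via instances."""

    def __init__(self, value):
        self._value = value

    def __get__(self, instance, owner=None):
        return self._value

    def __set__(self, instance, value):
        raise AttributeError("this attribute is read-only")

class Config:
    max_retries = ReadOnly(3)

cfg = Config()
try:
    cfg.max_retries = 5       # goes through ReadOnly.__set__ and is rejected
    write_blocked = False
except AttributeError:
    write_blocked = True

print(cfg.max_retries, write_blocked)
```

Note the limit of the approach: assigning to Config.max_retries on the class itself replaces the descriptor, so a determined coder can still get around it, which is the point the answer makes.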

In PEP 351, Barry Warsaw proposed a protocol for "freezing" any mutable data structure, analogous to the way that frozenset makes an immutable set. Frozen data structures would be hashable and so capable of being used as keys in dictionaries.
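The frozenset analogy can be seen directly with the built-in types. The variable names below are made up; the behaviour shown is that of the standard set and frozenset.

```python
mutable = {1, 2}
frozen = frozenset(mutable)             # immutable snapshot of the set

lookup = {frozen: "found"}              # hashable, so usable as a dict key
value = lookup[frozenset([2, 1])]       # equal frozensets hash alike

try:
    {mutable: "nope"}                   # a plain set is unhashable
    set_is_hashable = True
except TypeError:
    set_is_hashable = False

print(value, set_is_hashable, hasattr(frozen, "add"))
```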
The proposal was discussed on python-dev, with Raymond Hettinger's criticism the most detailed.
It's not quite what you're after, but it's the closest I can find, and should give you some idea of the thinking of the Python developers on this subject.

There are many design questions about any language, the answer to most of which is "just because". It's pretty clear that constants like this would go against the ideology of Python.
You can make a read-only class attribute, though, using descriptors. It's not trivial, but it's not very hard. The way it works is that you can make properties (things that look like attributes but call a method on access) using the property decorator; if you make a getter but not a setter property then you will get a read-only attribute. The reason for the metaclass programming is that since __init__ receives a fully-formed instance of the class, you actually can't set the attributes to what you want at this stage! Instead, you have to set them on creation of the class, which means you need a metaclass.
Code adapted from this recipe (updated here to Python 3 syntax):
# simple read only attributes with meta-class programming

# method factory for an attribute get method
def getmethod(attrname):
    def _getmethod(self):
        return self.__readonly__[attrname]
    return _getmethod

class metaClass(type):
    def __new__(cls, classname, bases, classdict):
        readonly = classdict.get('__readonly__', {})
        # turn every entry of __readonly__ into a getter-only property
        for name, default in readonly.items():
            classdict[name] = property(getmethod(name))
        return type.__new__(cls, classname, bases, classdict)

class ROClass(metaclass=metaClass):  # Python 2 spelled this as __metaclass__ = metaClass
    __readonly__ = {'a': 1, 'b': 'text'}

if __name__ == '__main__':
    def test1():
        t = ROClass()
        print(t.a)
        print(t.b)
    def test2():
        t = ROClass()
        t.a = 2  # raises AttributeError: the property has no setter
    test1()

While one programmer writing code is a consenting adult, two programmers working on the same code seldom are. All the more so if they do not value the beauty of the code but rather their deadlines or research funds.
For such adults there is some type safety, provided by Enthought's Traits.
You could look into Constant and ReadOnly traits.

For some additional thoughts, there is a similar question posed about Java here:
Why is there no Constant feature in Java?
When asking why Python has decided against constant references, I think it's helpful to think of how they would be implemented in the language. Should Python have some sort of special declaration, const, to create variable references that can't be changed? Why not allow variables to be declared a float/int/whatever then... these would surely help prevent programming bugs as well. While we're at it, adding class and method modifiers like protected/private/public/etc. would help enforce compile-time checking against illegal uses of these classes. ...Pretty soon, we've lost the beauty, simplicity, and elegance that is Python, and we're writing code in some sort of bastard child of C++/Java.
Python also currently passes everything by reference. This would be some sort of special pass-by-reference-but-flag-it-to-prevent-modification... a pretty special case (and, as the Zen of Python indicates, just "un-Pythonic").
As mentioned before, without actually changing the language, this type of behaviour can be implemented via classes & descriptors. It may not prevent modification from a determined hacker, but we are consenting adults. Python didn't necessarily decide against providing this as an included module ("batteries included") - there was just never enough demand for it.

Python syntax reasoning (why not fall back for . the way django template syntax does?)

My karate instructor is fond of saying, "a block is a lock is a throw is a blow." What he means is this: When we come to a technique in a form, although it might seem to look like a block, a little creativity and examination shows that it can also be seen as some kind of joint lock, or some kind of throw, or some kind of blow.
So it is with the way the django template syntax uses the dot (".") character. It perceives it first as a dictionary lookup, but it will also treat it as a class attribute, a method, or list index - in that order. The assumption seems to be that, one way or another, we are looking for a piece of knowledge. Whatever means may be employed to store that knowledge, we'll treat it in such a way as to get it into the template.
Why doesn't python do the same? If there's a case where I might have assigned a dictionary term spam['eggs'], but know for sure that spam has an attribute eggs, why not let me just write spam.eggs and sort it out the way django templates do?
Otherwise, I have to except an AttributeError and add three additional lines of code.
I'm particularly interested in the philosophy that drives this setup. Is it regarded as part of strong typing?
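To make the question concrete, here is a rough sketch of the fallback order described above (key, then attribute or method, then list index). The helper lenient_lookup is hypothetical and is not Django's implementation; it only imitates the behaviour as the question describes it.

```python
from types import SimpleNamespace

def lenient_lookup(obj, name):
    """Hypothetical helper: try key, then attribute or method, then list index."""
    try:
        return obj[name]                       # 1. dictionary-style lookup
    except (TypeError, KeyError, IndexError):
        pass
    try:
        value = getattr(obj, name)             # 2. attribute or method
    except AttributeError:
        pass
    else:
        return value() if callable(value) else value
    try:
        return obj[int(name)]                  # 3. numeric index
    except (TypeError, ValueError, KeyError, IndexError):
        raise LookupError(f"cannot resolve {name!r}") from None

from_dict = lenient_lookup({"eggs": 1}, "eggs")
from_attr = lenient_lookup(SimpleNamespace(eggs=2), "eggs")
from_method = lenient_lookup("spam", "upper")
from_index = lenient_lookup(["a", "b"], "1")

print(from_dict, from_attr, from_method, from_index)
```

Every lookup may run through several failed attempts before succeeding, and the reader of the calling code cannot tell which of the three mechanisms was used.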

Django templates and Python are two unrelated languages. They also have different target audiences.
In Django templates, the target audience is designers, who probably don't want to learn 4 different ways of doing roughly the same thing (a dictionary lookup). Thus there is a single syntax in Django templates that performs the lookup in several possible ways.
Python has quite a different audience. Developers actually make use of the many different ways of doing similar things, and overload each with distinct meaning. When one fails it should fail, because that is what the developer means for it to do.

JUST MY correct OPINION's opinion is indeed correct. I can't say why Guido did it this way but I can say why I'm glad that he did.
I can look at code and know right away whether some expression is accessing the 'b' key in a dict-like object a, the 'b' attribute on the object a, a method b being called on a, or the b index into the sequence a.
Python doesn't have to try all of the above options every time there is an attribute lookup. Imagine if every time one indexed into a list, Python had to try three other options first. List intensive programs would drag. Python is slow enough!
It means that when I'm writing code, I have to know what I'm doing. I can't just toss objects around and hope that I'll get the information somewhere somehow. I have to know that I want to lookup a key, access an attribute, index a list or call a method. I like it that way because it helps me think clearly about the code that I'm writing. I know what the identifiers are referencing and what attributes and methods I'm expecting the object of those references to support.
Of course Guido van Rossum might have just flipped a coin for all I know (he probably didn't), so you would have to ask him yourself if you really want to know.
As for your comment about having to surround these things with try blocks, it probably means that you're not writing very robust code. Generally, you want your code to expect to get some piece of information from a dict-like object, list-like object or a regular object. You should know which way it's going to do it and let anything else raise an exception.
The exception to this is that it's OK to conflate attribute access and method calls using the property decorator and more general descriptors. This is only good if the method doesn't take arguments.

The different methods of accessing attributes do different things. If you have a function foo, the two lines of code
a = foo
a = foo()
do two very different things. Without distinct syntax to reference and call functions there would be no way for Python to know whether the variable should be a reference to foo or the result of running foo. The () syntax removes the ambiguity.
Lists and dictionaries are two very different data structures. One of the things that determine which one is appropriate in a given situation is how its contents can be accessed (key Vs index). Having separate syntax for both of them reinforces the notion that these two things are not the same and neither one is always appropriate.
It makes sense for these distinctions to be ignored in a template language: the person writing the HTML doesn't care, and the template language doesn't have function pointers, so it knows you don't want one. Programmers who write the Python that drives the template, however, do care about these distinctions.

In addition to the points already posted, consider this. Python uses special member variables and functions to provide metadata about the object. Both the interpreter and programmers make heavy use of these. For example, both dicts and lists have a __len__ member function. Now, if a dict's data were accessed by using the . operator, a potential ambiguity arises if the dict has a key called __len__. You could special-case these, but many objects have a __dict__ attribute which is a mapping of member names and values. If that object happened to be a container, which also defined a __len__ attribute, you would end up with an utter mess.
Problems like this would end up turning Python into a mishmash of special cases that the programmer would have to constantly be aware of. This would detract from the reason why many people use Python in the first place, i.e., its elegant simplicity.
Now, consider that new users often shadow built-ins (if the code in SO questions is any indication) and having something like this starts to look like a really bad idea, since it would exacerbate the problem many-fold.

In addition to the responses above, it's not practical to merge dictionary lookup and object lookup in general because of the restrictions on object members.
What if your key has whitespace? What if it's an int, or a frozenset, etc.? Dot notation can't account for these discrepancies, so while it's an acceptable tradeoff for a templating language, it's unacceptable for a general-purpose programming language like Python.
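Both objections, keys that are not valid identifiers and keys that collide with existing member names, show up with an ordinary dict. The menu dict below is invented for illustration.

```python
menu = {
    "spam eggs": 1,             # whitespace: not a valid identifier
    42: "answer",               # an int key
    frozenset({"a"}): "set",    # any hashable object can be a key
    "items": "shadowed?",       # same name as the dict.items method
}

by_space = menu["spam eggs"]
by_int = menu[42]
by_frozenset = menu[frozenset({"a"})]

# With separate syntax there is no ambiguity between the key and the method.
key_value = menu["items"]
method_is_callable = callable(menu.items)

print(by_space, by_int, by_frozenset, key_value, method_is_callable)
```

With a merged dot syntax, none of the first three keys could be written at all, and menu.items would have to mean one thing or the other.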

Is late binding consistent with the philosophy of "readability counts"?

I am sorry all - I am not here to blame Python. This is just a reflection on whether what I believe is right. Being a Python devotee for two years, I have been writing only small apps and singing Python's praises wherever I go. I recently had the chance to read Django's code, and have started wondering if Python really follows its "readability counts" philosophy. For example,
class A:
    a = 10
    b = "Madhu"
    def somemethod(self, arg1):
        self.c = 20.22
        d = "some local variable"
        # do something
        ....
        ...
    def somemethod2 (self, arg2):
        self.c = "Changed the variable"
        # do something 2
        ...
It's difficult to track the flow of code in situations where the instance variables are created upon use (i.e. self.c in the above snippet). It's not possible to see which instance variables are defined when reading a substantial amount of code written in this manner. It becomes very frustrating even when reading a class with just 6-8 methods and not more than 100-150 lines of code.
I am interested in knowing if my reading of this code is skewed by C++/Java style, since most other languages follow the same approach as them. Is there a Pythonic way of reading this code more fluently? What made Python developers adopt this strategy keeping "readability counts" in mind?

The code fragment you present is fairly atypical (which might also be because you probably made it up):
you wouldn't normally have an instance variable (self.c) that is a floating point number at some point, and a string at a different point. It should be either a number or a string all the time.
you normally don't bring instance variables to life in an arbitrary method. Instead, you typically have a constructor (__init__) that initializes all variables.
you typically don't have instance variables named a, b, c. Instead, they have descriptive names.
With these fixed, your example would be much more readable.
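For comparison, here is one possible shape of a class that follows those three points. The Account class and its attribute names are invented for this sketch; it is not a rewrite of any real code.

```python
class Account:
    """Hypothetical example: every instance attribute is introduced in __init__."""

    interest_rate = 0.02              # class-level attribute shared by instances

    def __init__(self, owner):
        self.owner = owner
        self.balance = 0.0            # always a float
        self.last_action = "opened"   # always a short string

    def deposit(self, amount):
        self.balance += amount
        self.last_action = "deposit"

    def add_interest(self):
        self.balance += self.balance * self.interest_rate
        self.last_action = "interest"

account = Account("Madhu")
account.deposit(100.0)
attribute_names = sorted(vars(account))   # everything an instance carries

print(attribute_names, account.balance, account.last_action)
```

A reader only has to look at __init__ to learn which instance attributes exist and what kind of value each one holds.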

A sufficiently talented miscreant can write unreadable code in any language. Python attempts to impose some rules on structure and naming to nudge coders in the right direction, but there's no way to force such a thing.
For what it's worth, I try to limit the scope of local variables to the area where they're used in every language that I use. For me, not having to maintain a huge mental dictionary makes re-familiarizing myself with a bit of code much, much easier.

I agree that what you have seen can be confusing and ought to be accompanied by documentation. But confusing things can happen in any language.
In your own code, you should apply whatever conventions make things easiest for you to maintain the code. With respect to this particular issue, there are a number of possible things that can help.
Using something like Epydoc, you can specify all the instance variables a class will have. Be scrupulous about documenting your code, and be equally scrupulous about ensuring that your code and your documentation remain in sync.
Adopt coding conventions that encourage the kind of code you find easiest to maintain. There's nothing better than setting a good example.
Keep your classes and functions small and well-defined. If they get too big, break them up. It's easier to figure out what's going on that way.
If you really want to insist that instance variables be declared before referenced, there are some metaclass tricks you can use. e.g., You can create a common base class that, using metaclass logic, enforces the convention that only variables that are declared when the subclass is declared can later be set.
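The answer above mentions metaclass tricks. A lighter-weight standard mechanism, swapped in here instead of a metaclass, is __slots__, which restricts instances to a declared set of attribute names. The Sensor class is made up for illustration.

```python
class Sensor:
    __slots__ = ("name", "reading")   # the only instance attributes allowed

    def __init__(self, name):
        self.name = name
        self.reading = 0.0

sensor = Sensor("thermo")
try:
    sensor.raeding = 21.5             # typo: not a declared attribute
    typo_caught = False
except AttributeError:
    typo_caught = True

print(typo_caught, sensor.reading)
```

This is a sketch of the idea only; __slots__ has other effects (for example, instances have no __dict__), so whether it fits depends on the code base.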

This problem is easily solved by specifying coding standards such as declaring all instance variables in the __init__ method of your object. This isn't really a problem with Python so much as with the programmer.

If what the code is doing becomes mysterious for some reason, there should either be comments or the function names should make it obvious.
This is just my opinion though.

I personally think not having to declare variables is one of the dangerous things in Python, especially when doing classes. It is all too easy to accidentally create a variable by simple mistyping and then boggle at the code at length, unable to find the mistake.

Adding a property just before you need it will prevent you from using it before it's got a value. Personally, I always find classes hard to follow just from reading source - I read the documentation and find out what it's supposed to do, and then it usually makes sense when I read the source again.

The fact that such things are allowed is only occasionally useful, for prototyping. JavaScript tends to allow anything, and there such an example might be considered normal (I don't really know). In Python this is mostly a negative byproduct of omitting type declarations. That omission can speed up development: if at some point you change your mind about the type of a variable, fixing type declarations can in some cases take more time than fixing the actual code. This includes renaming a type, but also cases where you switch to a different type with some similar methods and no superclass/subclass relationship.
