As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 9 years ago.
I come from C background and am learning Python. The lack of explicit type-safety is disturbing, but I am getting used to it. The lack of built-in contract-based programming (pure abstract classes, interfaces) is something to get used to, in the face of all the advantages of a dynamic language.
However, the inability to request const-correctness is driving me crazy! Why are there no constants in Python? Why are class-level constants discouraged?
C and Python belong to two different classes of languages.
The former is statically typed. The latter is dynamic.
In a statically typed language, the type checker is able to infer the type of each expression and check whether it matches the given declaration during the "compilation" phase.
In a dynamically typed language, the required type information is not available until run-time, and the type of an expression may vary from one run to another. Of course, you could add type checking during program execution, but this is not the choice made in Python. The advantage is that it allows "duck typing". The drawback is that the interpreter is not able to check for type correctness ahead of time.
Concerning the const keyword: it is a type modifier that restricts the allowed uses of a variable (and sometimes changes which compiler optimizations are allowed). It seems quite inefficient to check that at run-time in a dynamic language. At first analysis, it would imply checking whether a variable is const or not on each assignment. This could be optimized, but even so, is it worth the cost?
Beyond technical aspects, don't forget that each language has its own philosophy. In Python the usual choice is to favor "convention" over "restriction". As an example, constants should be spelled in all caps. There is no technical enforcement of that. It is just a convention. If you follow it, your program will behave as expected by "other programmers". If you decide to modify a "constant", Python won't complain. But you should feel like you are doing "something wrong". You break a convention. Maybe you have your reasons for doing so. Maybe you shouldn't have. Your responsibility.
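A minimal sketch of that convention in practice. The names MAX_RETRIES and retry_budget are made up for illustration; the point is only that the all-caps spelling signals intent while the interpreter accepts a rebinding without complaint.

```python
# Hypothetical module-level "constant": the ALL_CAPS name is only a convention.
MAX_RETRIES = 3


def retry_budget():
    """Return the current value bound to the module-level name."""
    return MAX_RETRIES


before = retry_budget()

# Nothing stops a rebinding: no error and no warning from the interpreter.
MAX_RETRIES = 10
after = retry_budget()

print(before, after)
```

A linter or a code reviewer may flag the second assignment, but that is a tooling and team decision rather than something the language enforces.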
As a final note, in dynamic languages, the "correctness" of a program is much more the responsibility of your unit tests than of the compiler. If you really have difficulty making the step, you will find some "code checkers" around, such as PyLint, PyChecker and PyFlakes.
I don't know why this design decision was made but my personal guess is that there's no explicit const keyword because the key benefits of constants are already available:
Constants are good for documentation purposes. If you see a constant, you know that you can't change it. This is also possible by naming conventions.
Constants are useful for function calls. If you pass a constant as a parameter to a function, you can be sure that it isn't changed. In Python functions are "call-by-value" but since Python variables are references you effectively pass a copy of a reference. Inside the function you can mutate the object the reference points to, but if you reassign the name, the change does not persist outside of the function scope. Therefore, if you pass a number as a variable, it is actually passed "like" a constant. You can assign a new value to the variable, but outside of the function you still have the old number.
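To make the difference between reassigning and mutating concrete, here is a small sketch. The function names rebind and mutate are illustrative only.

```python
def rebind(n):
    # Rebinding the local name does not affect the caller's name.
    n = n + 1
    return n


def mutate(items):
    # Mutating the object the reference points to is visible to the caller.
    items.append(99)


count = 5
rebind(count)        # return value deliberately ignored

values = [1, 2]
mutate(values)

print(count, values)
```

The integer behaves "like" a constant from the caller's point of view, while the list does not, because the function received a copy of the reference to the very same list object.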
Moreover if there was a const keyword, it would create an asymmetry: variables are declared without keyword but consts are declared with a keyword. The logical consequence would be to create a second keyword named var. This is probably a matter of taste. Personally I prefer the minimalistic approach to variable declarations.
You can probably achieve a little more type safety, if you work with immutable data structures like tuples. Be careful however, the tuple itself can not be modified. But if it contains references to mutable objects, these are still mutable even if they belong to a tuple.
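The caveat about tuples holding mutable objects can be sketched like this (the variable names are arbitrary):

```python
point = (1, [2, 3])

try:
    point[0] = 10            # the tuple itself rejects item assignment
    tuple_error = None
except TypeError as exc:
    tuple_error = str(exc)

point[1].append(4)           # but the list stored inside it can still change

print(point)
```

So a tuple only freezes which objects it refers to, not the state of those objects.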
Finally you might want to take a look at this snippet: http://code.activestate.com/recipes/65207-constants-in-python/?in=user-97991 I'm not sure if this is an implementation of "class-level constants". But I thought it might be useful.
Closed 10 years ago.
Without explicit (type) declarations I struggle to figure out how things work. Are there some good rules of thumb or tips that you may have for reading Python code better? Thanks!
In spite of the first impression that this question gives, I think it is indeed really intelligent because it reveals that you are subconsciously aware of something that should interest any Python developer but that I find very neglected in general and in explanations in particular, if not misunderstood.
I mean that IMO the base of Python is terrifically quaint and intelligent: it's the data model on which it has been conceived.
In Python's data model, there are no variables in the sense of "chunks of memory whose contents can change", contrary to other languages; we don't manage this precise kind of variable in Python.
More precisely, everything is an object in Python, and every object is named and designated by an identifier, but neither the object nor the identifier is a 'variable' in the said sense.
That doesn't mean that there are no little boxes, the so-called variables of other languages, temporarily hosting values that go in and out of them, in the depths of the implementation.
.
Say an object is designated by the identifier `XYA2`.
Personally I use this appearance of letters to designate any identifier. An identifier is nothing else than a word written in code. It is what appears in the code.
Note that this appearance of letters is the one used by this stackoverflow.com site to represent a code sample inside text, by clicking on the button {}. That's easy to remember.
Now, the object whose name is `XYA2` is a real thing, a concrete set of bits lying in the memory of the computer to represent the desired conceptual value that it stands for.
This set is defined in the C language, in which Python is implemented.
Personally, I bold the letters when I want to designate an object.
Then the object of name `XYA2` is, for me, referred to by **XYA2**.
The identifier is `XYA2`.
It is linked to an underlying and inaccessible pointer that points to the object.
This link is made by means of the symbol table. You will see very few references or allusions to the symbol table in general, here on stackoverflow or elsewhere. However it's very important, I think.
The pointer linked to the identifier `XYA2` points to the object **XYA2**.
So, `XYA2` is directly linked to the pointer and indirectly linked to the object.
Instead of saying "indirectly linked", we say "assigned". An object and its identifier are reciprocally assigned one to the other, but the medium of this link is the underlying pointer.
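As a rough illustration of identifiers being bound to objects rather than being boxes that hold values, the sketch below uses the built-in id() and the is operator. The lowercase name xya2 simply mirrors the identifier used above.

```python
xya2 = [1, 2, 3]      # the identifier xya2 is bound to a list object
other = xya2          # a second identifier bound to the same object

same_object = other is xya2            # identity, not just equality
same_id = id(other) == id(xya2)

xya2 = "now a string"                  # rebinding xya2 leaves the list untouched

print(same_object, same_id, other)
```

After the rebinding, the list is still alive and reachable through other; only the link between the identifier xya2 and the object has changed.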
.
And now, something important.
Strictly speaking, a variable is a "chunk of memory whose content can change".
I personally make an effort never to use the word 'variable' in any sense other than this one.
The problem is that, because of the use of the word 'variable' in mathematics, this word is very often used indiscriminately and thrown in every direction by many developers (not all) even when it isn't justified.
As a result, it is commonly used by nearly everybody to designate the names, aka the identifiers, in code. But this practice is horribly confusing. It should be carefully avoided.
That said, an object in Python is not only an instance of some class, it is above all a concrete set of bits, a set which IS NOT, as far as I know, a variable in the sense of "chunk of memory whose content can change".
Hence my opinion that there aren't variables in Python, since the only entities we can access and manipulate are identifiers and objects.
However, the processes under the hood in an executed Python program use quantities of pointers that are, as far as I know, real variables in the strict sense of this word.
So, in a sense, it could be said that my affirmation "There are no variables in Python" is false.
It's a matter of point of view.
As a developer in Python, conceptually speaking, I don't manage variables. When I think about an algorithm, I don't think at the level of the pointers, even if I know they exist and that it's very important to know they exist. Being not at the level of the variables, but at the level of Python's data model, I don't see why I should accept to believe that there are variables in a Python program. There are variables at the low level of the machine, and Python is a very-high-level language.
.
Why did I write all this ?
1)
because the nature of Python's data model has many consequences that can't be understood if this data model isn't known. Among these consequences, some are interesting because they give incredible possibilities, others are traps (a well-known example: modifying a nested element through a shallow copy of a list also modifies it in the original list). That's why it's of first importance to learn about this data model.
For that, I recommend that you read these parts of the documentation:
Section 3.1, "Objects, values and types"
Section 4.1, "Naming and binding"
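The copied-list trap mentioned in point 1 can be reproduced in a few lines. This is only a sketch using the standard copy module; the nested list is an arbitrary example.

```python
import copy

original = [[1, 2], [3, 4]]
shallow = list(original)            # new outer list, same inner lists
deep = copy.deepcopy(original)      # inner lists are copied as well

shallow[0][0] = "changed"

print(original[0][0])   # the original sees the change made via the shallow copy
print(deep[0][0])       # the deep copy does not
```

The shallow copy creates a new outer object whose elements are bound to the same inner objects, which is exactly what the data model predicts.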
.
2)
To justify my answer to your perplexity: don't struggle over what happens under the hood:
there's a garbage collector, a reference counter, wagonloads of underlying dictionary-like entities, a thunderous ballet of values in the secrecy of the underlying pointers, many verifications made by the interpreter... When something doesn't fit well, a warning is given in the form of an exception message.
Python has all the machinery under control.
The only concern you must have is to think about the algorithm you want to achieve, and for that, knowing the data model is essential.
Welcome to the Python universe.
Warning
I don't consider myself a very skilled Python developer, I'm just an amateur who had a lot of problems before understanding some essential things about Python.
All the above description is my personal view of the data model of Python. If any point is incorrect in this description, I will be happy to learn more about it if the teaching is done with developed argumentation.
But I underline the fact that this vision of things allows me to understand and answer a lot of tough problems and to achieve some tricky mechanisms that Python is capable of. So not everything in the above description can be false.
You should take a look at the PEP 8 documentation. It describes Python formatting and style.
Read up on Duck Typing. One of the purposes of Duck Typing is that you shouldn't be thinking too much about the type of something anyway. What really concerns you is that the variable can be used the way that you want.
In Python, you don't need a type declaration because the name you assign is just a pointer to an object, and furthermore it can change at any time.
a = None
a = 1+5
a = my_function() # calls my function and assigns the return object to a
a = my_function # Assigns the function itself to a. You could actually pass it as a parameter
a = MyClass() # Creates an instance, runs __init__() on it, and assigns the new instance to a
a = MyClass # Assigns the class itself to a.
This is all valid Python. You could run this sequentially, although changing the type like this is frowned upon unless it's totally clear why.
If you know C++11, it is similar to the auto type.
The variable type is decided on the basis of its assignment.
Using Haskell's type system I know that at some point in the program, a variable must contain say an Int or a list of strings. For code that compiles, the type checker offers certain guarantees that for instance I'm not trying to add an Int and a String.
Are there any tools to provide similar guarantees for Python code?
I know about and practice TDD.
The quick answer is "not really". While tools like PyLint (which is very good BTW) will give you a lot of help and good advice on what constitutes good Python style, that isn't exactly what you're looking for and it certainly isn't a real substitute for things like HM type inference.
There are some interesting research projects in this area, notably Gradual Typing by Jeremy Siek and colleagues and some really interesting ideas like the blame calculus of Wadler and Findler.
Practically speaking, I think the best you can achieve is by using some sensibly chosen runtime methods. Use the inspect module to test the type of an object (but remember to be true to Python's duck typing and so on). Use assert statements liberally. Or (possibly 'and') use something like Design by Contract using decorators. There are lots of ways to implement these idioms, but this is typically done on a per-project basis. You may want to think about whether and how such methods affect the performance and resource usage of your programs, if this is critical for you. There have, however, been some efforts to standardise techniques like DBC for Python, but these haven't (yet) been pushed into the CPython trunk. Here's hoping though :)
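One possible shape for such a runtime check, as a sketch only: the decorator name requires is hypothetical and not part of any library, and it uses isinstance with assert rather than the inspect module to keep the example short.

```python
import functools


def requires(*types):
    """Hypothetical decorator: check positional argument types at call time."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args):
            for value, expected in zip(args, types):
                assert isinstance(value, expected), (
                    f"{func.__name__}: expected {expected.__name__}, "
                    f"got {type(value).__name__}"
                )
            return func(*args)
        return wrapper
    return decorate


@requires(int, int)
def add(a, b):
    return a + b


total = add(2, 3)

try:
    add(2, "3")
    failed = False
except AssertionError:
    failed = True
```

Because the check relies on assert, it disappears when Python runs with the -O flag, which is one way to limit the performance cost mentioned above. It also only catches the error when the call actually happens, so it is not a substitute for a static type checker.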
Python is a dynamic and strongly typed programming language. What that means is that you can define a variable without explicitly stating its type, but the object it is bound to always has a definite type, and Python will not silently convert between unrelated types.
For example,
x = 5 binds x to an integer, so you cannot concatenate it with a string: x + "hello" raises a TypeError.
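A short sketch of that behaviour. Note that it is the object, not the name, that carries the type, so the name can later be rebound to a string.

```python
x = 5
try:
    x + "hello"              # int + str is rejected at run-time
    message = None
except TypeError as exc:
    message = str(exc)

x = "now a string"           # rebinding the name to another type is allowed
greeting = x + ", hello"

print(message)
print(greeting)
```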
Closed 11 years ago.
Possible Duplicate:
Python (and Python C API): __new__ versus __init__
I'm at college just now and the lecturer was using the terms constructors and initializers interchangeably. I'm pretty sure that this is wrong though.
I've tried googling the answer but not found the answer I'm looking for.
In most OO languages, they are the same step, so he's not wrong for things like Java, C++, etc. In Python they are done in two steps: __new__ is the constructor; __init__ is the initializer.
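The two steps can be observed directly. In this sketch the class name Tracked and the calls list are made up for the demonstration.

```python
class Tracked:
    calls = []

    def __new__(cls, *args, **kwargs):
        # Creates and returns the new, still uninitialized instance.
        cls.calls.append("__new__")
        return super().__new__(cls)

    def __init__(self, value):
        # Receives the instance that __new__ returned and fills it in.
        self.calls.append("__init__")
        self.value = value


obj = Tracked(42)
print(Tracked.calls, obj.value)
```

Calling Tracked(42) runs __new__ first to obtain the object and then __init__ on that object, which is why most classes only ever need to define __init__.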
Here is another answer that goes into more detail about the differences between them.
In almost all usual cases, Python does not have constructors in the same sense used by other OO languages because manually managing memory is generally discouraged. Instead, what you should usually do is define an __init__ method on the class. This method is called to initialize the new instance object automatically, first thing after it is constructed. Thus, it is not really a constructor, and talking about it as a constructor might confuse some people.
Of course some people want to call it a constructor because it is used a little bit like a constructor - fundamentally you can call it whatever you want as long as everyone understands what you are actually referring to. But in general, to be explicit and make yourself understood, call it an init method or something other than a constructor. Fundamentally, different languages just come with somewhat different terminology and speaking very clearly will always require adjustment to your subject matter and audience.
In Python it is possible to manage instance creation and destruction at a finer granularity, though you won't want to unless you know what you're doing. This is done by defining __new__ and __del__ methods to hook object instantiation and del statements. Whether these qualify as constructors and destructors precisely is a little more debatable (Python docs call the del method a destructor, but tend to be vaguer on what constitutes a constructor, e.g. including many functions which return object instances). I'd still encourage you to use the specific terminology for the language at hand, and in comparative discussions to define your terms up front. As always, your choice of terms while speaking involves tradeoffs between the audience being able to easily follow you and the audience potentially being led into confusion: if you are talking about memory management probably be as specific as possible, but if you are talking loosely then just use some word your audience understands and be ready to clarify.
Your instructor is being unclear at worst. I'm not aware of any one canonical definition of these terms, but they might cause confusion for people who have learned very specific definitions from other languages.
http://docs.python.org/reference/datamodel.html#basic-customization
__new__ - constructor.
__init__ - initializer.
Note: I'm not talking about preventing the rebinding of a variable. I'm talking about preventing the modification of the memory that the variable refers to, and of any memory that can be reached from there by following the nested containers.
I have a large data structure, and I want to expose it to other modules, on a read-only basis. The only way to do that in Python is to deep-copy the particular pieces I'd like to expose - prohibitively expensive in my case.
I am sure this is a very common problem, and it seems like a constant reference would be the perfect solution. But I must be missing something. Perhaps constant references are hard to implement in Python. Perhaps they don't quite do what I think they do.
Any insights would be appreciated.
While the answers are helpful, I haven't seen a single reason why const would be either hard to implement or unworkable in Python. I guess "un-Pythonic" would also count as a valid reason, but is it really? Python does do scrambling of private instance variables (starting with __) to avoid accidental bugs, and const doesn't seem to be that different in spirit.
EDIT: I just offered a very modest bounty. I am looking for a bit more detail about why Python ended up without const. I suspect the reason is that it's really hard to implement to work perfectly; I would like to understand why it's so hard.
It's the same as with private methods: as consenting adults authors of code should agree on an interface without need of force. Because really really enforcing the contract is hard, and doing it the half-assed way leads to hackish code in abundance.
Use get-only descriptors, and state clearly in your documentation that this data is meant to be read-only. After all, a determined coder could probably find a way to use your code in ways you didn't think of anyway.
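A get-only descriptor is most easily written with the built-in property. The class name Config and its attribute names below are invented for the sketch.

```python
class Config:
    def __init__(self, data):
        self._data = tuple(data)    # leading underscore: private by convention

    @property
    def data(self):
        """Read-only access: no setter is defined for this property."""
        return self._data


cfg = Config([1, 2, 3])

try:
    cfg.data = (4, 5)
    blocked = False
except AttributeError:
    blocked = True

print(cfg.data, blocked)
```

This blocks accidental assignment through the public name, but cfg._data is still reachable, and any mutable objects stored inside would remain mutable. It documents intent rather than enforcing it.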
In PEP 351, Barry Warsaw proposed a protocol for "freezing" any mutable data structure, analogous to the way that frozenset makes an immutable set. Frozen data structures would be hashable and so capable of being used as keys in dictionaries.
The proposal was discussed on python-dev, with Raymond Hettinger's criticism the most detailed.
It's not quite what you're after, but it's the closest I can find, and should give you some idea of the thinking of the Python developers on this subject.
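The frozenset analogy can be tried out directly. This sketch only shows the existing built-in, not the protocol the PEP proposed.

```python
mutable = {1, 2, 3}
frozen = frozenset(mutable)

lookup = {frozen: "ok"}             # hashable, so usable as a dict key

try:
    {mutable: "fails"}              # a plain set is unhashable
    unhashable = False
except TypeError:
    unhashable = True

has_add = hasattr(frozen, "add")    # frozenset offers no mutating methods

print(lookup[frozenset({1, 2, 3})], unhashable, has_add)
```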
There are many design questions about any language, the answer to most of which is "just because". It's pretty clear that constants like this would go against the ideology of Python.
You can make a read-only class attribute, though, using descriptors. It's not trivial, but it's not very hard. The way it works is that you can make properties (things that look like attributes but call a method on access) using the property decorator; if you make a getter but not a setter property then you will get a read-only attribute. The reason for the metaclass programming is that since __init__ receives a fully-formed instance of the class, you actually can't set the attributes to what you want at this stage! Instead, you have to set them on creation of the class, which means you need a metaclass.
Code from this recipe:
# simple read only attributes with meta-class programming

# method factory for an attribute get method
def getmethod(attrname):
    def _getmethod(self):
        return self.__readonly__[attrname]
    return _getmethod

class metaClass(type):
    def __new__(cls,classname,bases,classdict):
        readonly = classdict.get('__readonly__',{})
        for name,default in readonly.items():
            classdict[name] = property(getmethod(name))
        return type.__new__(cls,classname,bases,classdict)

class ROClass(object):
    __metaclass__ = metaClass
    __readonly__ = {'a':1,'b':'text'}

if __name__ == '__main__':
    def test1():
        t = ROClass()
        print t.a
        print t.b

    def test2():
        t = ROClass()
        t.a = 2

    test1()
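The recipe above is written in Python 2 syntax (the __metaclass__ attribute and print statements). A rough Python 3 equivalent might look like the sketch below; the names make_getter and ReadOnlyMeta are renamed for readability and are not part of the original recipe.

```python
def make_getter(attrname):
    # Factory for a getter that reads from the class-level __readonly__ dict.
    def getter(self):
        return self.__readonly__[attrname]
    return getter


class ReadOnlyMeta(type):
    def __new__(mcls, classname, bases, classdict):
        # Turn every key of __readonly__ into a getter-only property.
        for name in classdict.get("__readonly__", {}):
            classdict[name] = property(make_getter(name))
        return super().__new__(mcls, classname, bases, classdict)


class ROClass(metaclass=ReadOnlyMeta):
    __readonly__ = {"a": 1, "b": "text"}


t = ROClass()
print(t.a, t.b)

try:
    t.a = 2                 # no setter, so assignment is rejected
    rejected = False
except AttributeError:
    rejected = True
```

As with the original, this only guards instance attribute assignment. The underlying ROClass.__readonly__ dictionary is still an ordinary mutable dict.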
While one programmer writing code is a consenting adult, two programmers working on the same code seldom are consenting adults. More so if they value not the beauty of the code but their deadlines or research funds.
For such adults there is some type safety, provided by Enthought's Traits.
You could look into Constant and ReadOnly traits.
For some additional thoughts, there is a similar question posed about Java here:
Why is there no Constant feature in Java?
When asking why Python has decided against constant references, I think it's helpful to think of how they would be implemented in the language. Should Python have some sort of special declaration, const, to create variable references that can't be changed? Why not allow variables to be declared a float/int/whatever then...these would surely help prevent programming bugs as well. While we're at it, adding class and method modifiers like protected/private/public/etc. would help enforce compile-time checking against illegal uses of these classes. ...pretty soon, we've lost the beauty, simplicity, and elegance that is Python, and we're writing code in some sort of bastard child of C++/Java.
Python also currently passes everything by reference. This would be some sort of special pass-by-reference-but-flag-it-to-prevent-modification...a pretty special case (and as the Zen of Python indicates, just "un-Pythonic").
As mentioned before, without actually changing the language, this type of behaviour can be implemented via classes & descriptors. It may not prevent modification from a determined hacker, but we are consenting adults. Python didn't necessarily decide against providing this as an included module ("batteries included") - there was just never enough demand for it.
Closed 11 years ago.
A while ago, when I was learning Javascript, I studied Javascript: the good parts, and I particularly enjoyed the chapters on the bad and the ugly parts. Of course, I did not agree with everything, as summing up the design defects of a programming language is to a certain extent subjective - although, for instance, I guess everyone would agree that the keyword with was a mistake in Javascript. Nevertheless, I find it useful to read such reviews: even if one does not agree, there is a lot to learn.
Is there a blog entry or some book describing design mistakes for Python? For instance I guess some people would count the lack of tail call optimization a mistake; there may be other issues (or non-issues) which are worth learning about.
You asked for a link or other source, but there really isn't one. The information is spread over many different places. What really constitutes a design mistake, and do you count just syntactic and semantic issues in the language definition, or do you include pragmatic things like platform and standard library issues and specific implementation issues? You could say that Python's dynamism is a design mistake from a performance perspective, because it makes it hard to make a straightforward efficient implementation, and it makes it hard (I didn't say completely impossible) to make an IDE with code completion, refactoring, and other nice things. At the same time, you could argue for the pros of dynamic languages.
Maybe one approach to start thinking about this is to look at the language changes from Python 2.x to 3.x. Some people would of course argue that print being a function is inconvenient, while others think it's an improvement. Overall, there are not that many changes, and most of them are quite small and subtle. For example, map() and filter() return iterators instead of lists, range() behaves like xrange() used to, and dict methods like dict.keys() return views instead of lists. Then there are some changes related to integers, and one of the big changes is binary/string data handling. It's now text and data, and text is always Unicode. There are several syntactic changes, but they are more about consistency than revamping the whole language.
From this perspective, it appears that Python has been pretty well designed on the language (syntax and semantics) level since at least 2.x. You can always argue about indentation-based block syntax, but we all know that doesn't lead anywhere... ;-)
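A few of the Python 3 behaviours listed above can be checked in a handful of lines. This is just an illustrative sketch with arbitrary sample values.

```python
squares = map(lambda n: n * n, [1, 2, 3])    # a lazy iterator, not a list
first_pass = list(squares)
second_pass = list(squares)                  # the iterator is already exhausted

ages = {"ann": 30}
keys = ages.keys()                           # a live view, not a snapshot list
ages["bob"] = 25
keys_now = sorted(keys)

text = "héllo"
data = text.encode("utf-8")                  # text (str) and data (bytes) are distinct types

print(first_pass, second_pass, keys_now, type(data).__name__)
```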
Another approach is to look at what alternative Python implementations are trying to address. Most of them address performance in some way, some address platform issues, and some add or make changes to the language itself to more efficiently solve certain kinds of tasks. Unladen swallow wants to make Python significantly faster by optimizing the runtime byte-compilation and execution stages. Stackless adds functionality for efficient, heavily threaded applications by adding constructs like microthreads and tasklets, channels to allow bidirectional tasklet communication, scheduling to run tasklets cooperatively or preemptively, and serialisation to suspend and resume tasklet execution. Jython allows using Python on the Java platform and IronPython on the .Net platform. Cython is a Python dialect which allows calling C functions and declaring C types, allowing the compiler to generate efficient C code from Cython code. Shed Skin brings implicit static typing into Python and generates C++ for standalone programs or extension modules. PyPy implements Python in a subset of Python, and changes some implementation details like adding garbage collection instead of reference counting. The purpose is to allow Python language and implementation development to become more efficient due to the higher-level language. Py V8 bridges Python and JavaScript through the V8 JavaScript engine – you could say it's solving a platform issue. Psyco is a special kind of JIT that dynamically generates special versions of the running code for the data that is currently being handled, which can give speedups for your Python code without having to write optimised C modules.
Of these, something can be said about the current state of Python by looking at PEP-3146 which outlines how Unladen Swallow would be merged into CPython. This PEP is accepted and is thus the Python developers' judgement of what is the most feasible direction to take at the moment. Note it addresses performance, not the language per se.
So really I would say that Python's main design problems are in the performance domain – but these are basically the same challenges that any dynamic language has to face, and the Python family of languages and implementations are trying to address the issues. As for outright design mistakes like the ones listed in Javascript: the good parts, I think the meaning of "mistake" needs to be more explicitly defined, but you may want to check out the following for thoughts and opinions:
FLOSS Weekly 11: Guido van Rossum (podcast August 4th, 2006)
The History of Python blog
Is there a blog entry or some book describing design mistakes for Python?
Yes.
It's called the Py3K list of backwards-incompatible changes.
Start here: http://docs.python.org/release/3.0.1/whatsnew/3.0.html
Read all the Python 3.x release notes for additional details on the mistakes in Python 2.
My biggest peeve with Python - and one which was not really addressed in the move to 3.x - is the lack of proper naming conventions in the standard library.
Why, for example, does the datetime module contain a class itself called datetime? (To say nothing of why we have separate datetime and time modules, but also a datetime.time class!) Why is datetime.datetime in lower case, but decimal.Decimal is upper case? And please, tell me why we have that terrible mess under the xml namespace: xml.sax, but xml.etree.ElementTree - what is going on there?
Try these links:
http://c2.com/cgi/wiki?PythonLanguage
http://c2.com/cgi/wiki?PythonProblems
Things that frequently surprise inexperienced developers are candidate mistakes. Here is one, default arguments:
http://www.deadlybloodyserious.com/2008/05/default-argument-blunders/
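For readers who have not met the default-argument surprise, a minimal sketch follows. The function names are invented, and the None sentinel shown is one common idiom rather than the only fix.

```python
def append_bad(item, bucket=[]):
    # The default list is created once, when the function is defined,
    # and then shared by every call that omits the argument.
    bucket.append(item)
    return bucket


def append_good(item, bucket=None):
    # Use None as a sentinel and build a fresh list per call.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


bad_first = list(append_bad(1))
bad_second = list(append_bad(2))     # still contains the 1 from the first call
good_first = append_good(1)
good_second = append_good(2)

print(bad_first, bad_second, good_first, good_second)
```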
A personal language peeve of mine is name binding for lambdas / local functions:
fns = []
for i in range(10):
    fns.append(lambda: i)
for fn in fns:
    print(fn()) # !!! always 9 - not what I'd naively expect
IMO, I'd much prefer looking up the names referenced in a lambda at declaration time. I understand the reasons for why it works the way it does, but still...
You currently have to work around it by binding i to a new name whose value doesn't change, using a function closure.
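The closure workaround might look like this, where make_fn is a hypothetical helper name:

```python
def make_fn(value):
    # Each call creates a new scope, so every lambda sees its own `value`.
    return lambda: value


fns = [make_fn(i) for i in range(10)]
results = [fn() for fn in fns]

print(results)
```

Each lambda now closes over the value parameter of a separate make_fn call instead of the single loop variable i.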
This is more of a minor problem with the language, rather than a fundamental mistake, but: Property overriding. If you override a property (using getters and setters), there is no easy way of getting the parent class' property.
Yeah, it's strange but I guess that's what you get for having mutable variables.
I think the reason is that the "i" refers to a box which has a mutable value and the "for" loop will change that value over time, so reading the box value later gets you the only value there is left.
I don't know how one would fix that short of making it a functional programming language without mutable variables (at least without unchecked mutable variables).
The workaround I use is creating a new variable with a default value (default values being evaluated at DEFINITION time in Python, which is annoying at other times) which causes copying of the value to the new box:
fns = []
for i in range(10):
    fns.append(lambda j=i: j)
for fn in fns:
    print(fn()) # works
I find it surprising that nobody mentioned the global interpreter lock.
One of the things I find most annoying in Python is using writelines() and readlines() on a file. readlines() not only returns a list of lines, but it also still has the \n characters at the end of each line, so you have to always end up doing something like this to strip them:
lines = [l.replace("\n", "").replace("\r", "") for l in f.readlines()]
And when you want to use writelines() to write lines to a file, you have to add \n at the end of every line in the list before you write them, sort of like this:
f.writelines([l + "\n" for l in lines])
writelines() and readlines() should take care of endline characters in an OS independent way, so you don't have to deal with it yourself.
You should just be able to go:
lines = f.readlines()
and it should return a list of lines, without \n or \r characters at the end of the lines.
Likewise, you should just be able to go:
f.writelines(lines)
to write a list of lines to a file, and it should use the operating system's preferred end-of-line characters when writing the file. You shouldn't need to add them to the list yourself first.
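In the meantime, a somewhat tidier workaround is str.splitlines() for reading and "\n".join() for writing. The sketch below writes to a scratch file created with tempfile; the file name example.txt is arbitrary.

```python
import os
import tempfile

lines = ["first", "second", "third"]

# Hypothetical scratch file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "example.txt")

with open(path, "w") as f:
    # In text mode, "\n" is translated to the platform's line separator on write.
    f.write("\n".join(lines) + "\n")

with open(path) as f:
    # str.splitlines() drops the line endings, whichever kind they are.
    read_back = f.read().splitlines()

print(read_back)
```

This reads the whole file into memory at once, so for very large files iterating over the file object and stripping each line may be preferable.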
My biggest dislike is range(), because it doesn't do what you'd expect, e.g.:
>>> for i in range(1,10): print i,
1 2 3 4 5 6 7 8 9
A naive user coming from another language would expect 10 to be printed as well.
You asked for links; I wrote a document on that topic some time ago: http://segfaulthunter.github.com/articles/biggestsurprise/
I think there's a lot of weird stuff in python in the way they handle builtins/constants. Like the following:
True = "hello"
False = "hello"
print True == False
That prints True...
def sorted(x):
    print "Haha, pwned"

sorted([4, 3, 2, 1])
Lolwut? sorted is a builtin global function. The worst example in practice is list, which people tend to use as a convenient name for a local variable and end up clobbering the global builtin.
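The snippets above use Python 2 syntax; in Python 3, True and False are keywords, so assigning to them is a syntax error, but ordinary builtins such as list and sorted can still be shadowed. A small Python 3 sketch of the list case, with made-up function names:

```python
import builtins


def count_items(list):
    # The parameter named `list` hides the builtin inside this function,
    # so list("abc") here tries to call the argument instead.
    try:
        return len(list("abc"))
    except TypeError:
        return None


def count_items_fixed(items):
    # A non-clashing parameter name leaves the builtin usable.
    return len(list("abc")) + len(items)


shadowed_result = count_items([1, 2, 3])
fixed_result = count_items_fixed([1, 2, 3])
still_builtin = builtins.list is list     # the module-level name is untouched

print(shadowed_result, fixed_result, still_builtin)
```

The builtin is never destroyed, only hidden by a nearer scope, and it stays reachable through the builtins module.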