Python - mechanism of () and [] - python

It looks like e for e in [1, 2, 3, 4, 5] is a generator expression and (e for e in [1, 2, 3, 4, 5]) is evaluated as a generator object. Hence, I think (...) means evaluation in Python.
I suppose list(e for e in [1, 2, 3, 4, 5]) is telling the Python runtime to evaluate the iterable expression, generate its object(s), and call the list function to invoke yield until it runs out of elements.
print(list(e for e in [1, 2, 3, 4, 5]))
---
[1, 2, 3, 4, 5]
Question
What actually is [...] in the code below, and what is its mechanism? [ e for e in [1, 2, 3, 4, 5] ] generates a list object, hence I suppose it is a combination of evaluating e for e in [1, 2, 3, 4, 5] to create a generator object and invoking a call on that generator object. Is it an alias of a function call to list(...)?
print([ e for e in [1, 2, 3, 4, 5] ])
---
[1, 2, 3, 4, 5]
For the list access with a slice object, I suppose [1:3] is telling Python to evaluate the 1:3 expression to generate a slice object.
print([1,2,3][1:3])
print([1,2,3][slice(1,3,1)])
---
[2, 3]
[2, 3]
Does [(1:3)] fail because it tries to evaluate an already evaluated 1:3?
print([1,2,3][(1:3)])
---
File "<ipython-input-167-c20e211025dc>", line 1
print([1,2,3][(1:3)])
^
SyntaxError: invalid syntax

[1, 2, 3, 4, 5] is a list literal.
value for item in iterable is a generator comprehension. list(value for item in iterable) would be calling the list() constructor with a generator, which of course just produces a list. For the purpose of reducing ambiguity, the generator comprehension cannot be used naked. But it can be used either inside a set of parentheses, or inside another expression, such as a parameter in a function call. A similar limitation applies to the := (walrus) operator added in Python 3.8.
[value for item in iterable] is a list comprehension. Note that Python treats this as an entirely separate syntactic construct.
The implementations are probably about the same, but as far as I'm aware the Python compiler detects the generator comprehension and the list comprehension separately while it processes the code, and the list comprehension is not defined as a subset/special case of either a generator comprehension or a list literal.
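One way to check the claim that these are separate syntactic constructs is to ask the standard-library ast module what each expression parses to. This is only a small sketch for Python 3; node_type is a helper name made up for this example.

```python
import ast

def node_type(source):
    """Return the name of the AST node that an expression parses to."""
    return type(ast.parse(source, mode="eval").body).__name__

print(node_type("[1, 2, 3, 4, 5]"))                   # List
print(node_type("[e for e in [1, 2, 3, 4, 5]]"))      # ListComp
print(node_type("(e for e in [1, 2, 3, 4, 5])"))      # GeneratorExp
print(node_type("list(e for e in [1, 2, 3, 4, 5])"))  # Call
```

The list comprehension gets its own node type rather than being parsed as a call to list() wrapped around a generator expression.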
I'm pretty sure a similar thing applies to slice syntax - it's defined as its own syntax, specifically in the context of subscripting (indexing with square brackets), and in no other context. lst[1:3] getting compiled into the equivalent of lst.__getitem__(slice(1, 3)) is part of the compilation process, and is not a general thing for the syntax 1:3 (as that's ambiguous).
In other words, if I remember correctly, lst[x:y:z] is a different syntactic construct from lst[x], as far as the Python compiler is concerned.
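To see what the subscript syntax actually hands to the object, a tiny probe class can return whatever key it receives. This is an illustrative sketch; ShowKey and probe are names invented for the example.

```python
class ShowKey:
    """Return the key passed to __getitem__ so it can be inspected."""
    def __getitem__(self, key):
        return key

probe = ShowKey()
print(probe[2])        # 2
print(probe[1:3])      # slice(1, 3, None)
print(probe[1:3:2])    # slice(1, 3, 2)
print(probe[1:3, 4])   # (slice(1, 3, None), 4)

# Outside square brackets, 1:3 is not an expression at all.
try:
    compile("(1:3)", "<example>", "eval")
except SyntaxError:
    print("(1:3) is a syntax error")
```

The colon form only exists inside the brackets; everywhere else a slice has to be spelled slice(1, 3).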
*The information in this post is based on my understanding and prior interaction with various methods in the CPython source code. I'm drawing some conclusions between the syntax, the compiler, and the compiled code that may not be valid.

Related

Why sort and sorted functions are showing different results? [duplicate]

I am trying to sort a list by frequency of its elements.
>>> a = [5, 5, 4, 4, 4, 1, 2, 2]
>>> a.sort(key = a.count)
>>> a
[5, 5, 4, 4, 4, 1, 2, 2]
a is unchanged. However:
>>> sorted(a, key = a.count)
[1, 5, 5, 2, 2, 4, 4, 4]
Why does this method not work for .sort()?
What you see is the result of a certain CPython implementation detail of list.sort. Try this again, but create a copy of a first:
a.sort(key=a.copy().count)
a
# [1, 5, 5, 2, 2, 4, 4, 4]
.sort modifies a internally, so a.count is going to produce unpredictable results. This is documented as an implementation detail.
What the copy call does is create a copy of a and use that list's count method as the key. You can see what happens with some debug statements:
def count(x):
    print(a)
    return a.count(x)
a.sort(key=count)
[]
[]
[]
...
a turns up as an empty list when accessed inside .sort, and [].count(anything) will be 0. This explains why the output is the same as the input - the keys are all the same (0), and the sort is stable, so the original order is kept.
OTOH, sorted creates a new list, so it doesn't have this problem.
If you really want to sort by frequency counts, the idiomatic method is to use a Counter:
from collections import Counter
a.sort(key=Counter(a).get)
a
# [1, 5, 5, 2, 2, 4, 4, 4]
It doesn't work with the list.sort method because CPython decides to "empty the list" temporarily (the other answer already presents this). This is mentioned in the documentation as an implementation detail:
CPython implementation detail: While a list is being sorted, the effect of attempting to mutate, or even inspect, the list is undefined. The C implementation of Python makes the list appear empty for the duration, and raises ValueError if it can detect that the list has been mutated during a sort.
The source code contains a similar comment with a bit more explanation:
/* The list is temporarily made empty, so that mutations performed
* by comparison functions can't affect the slice of memory we're
* sorting (allowing mutations during sorting is a core-dump
* factory, since ob_item may change).
*/
The explanation isn't straightforward, but the problem is that the key function and the comparisons could change the list instance during sorting, which is very likely to result in undefined behavior in the C code (which may crash the interpreter). To prevent that, the list is emptied during the sort, so that even if someone changes the instance it won't result in an interpreter crash.
This doesn't happen with sorted because sorted copies the list and simply sorts the copy. The copy is still emptied during the sorting but there's no way to access it, so it isn't visible.
However you really shouldn't sort like this to get a frequency sort. That's because for each item you call the key function once. And list.count iterates over each item, so you effectively iterate the whole list for each element (what is called O(n**2) complexity). A better way would be to calculate the frequency once for each element (can be done in O(n)) and then just access that in the key.
However, since Python's standard library has a Counter class that also supports most_common, you could really just use that:
>>> from collections import Counter
>>> [item for item, count in reversed(Counter(a).most_common()) for _ in range(count)]
[1, 2, 2, 5, 5, 4, 4, 4]
This may change the order of the elements with equal counts, but since you're doing a frequency sort that shouldn't matter too much.
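The "count once, then look up" idea described above can be sketched as follows. sort_by_frequency is a name chosen for this example, and it returns a new list rather than sorting in place, which is an assumption about what is wanted.

```python
from collections import Counter

def sort_by_frequency(items):
    """Return a new list ordered by how often each element occurs."""
    counts = Counter(items)  # a single O(n) pass over the data
    # sorted() is stable, so elements with equal counts keep their order.
    return sorted(items, key=counts.__getitem__)

a = [5, 5, 4, 4, 4, 1, 2, 2]
print(sort_by_frequency(a))  # [1, 5, 5, 2, 2, 4, 4, 4]
print(a)                     # [5, 5, 4, 4, 4, 1, 2, 2] (unchanged)
```

Each key lookup is a dictionary access, so the overall cost is dominated by the sort itself rather than by repeated list.count calls.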

Why is it that modifying lists in functions changes the original list, but declaring them in a function creates a new object?

I'm new to writing code, and I understand the behavior of lists to an extent. Whenever a list is modified within a function's scope, the one in global scope changes too.
For example,
def modify_list(lst):
    lst.append(5)
lst = [1, 2, 3, 4]
#This output is [1, 2, 3, 4]
print(lst)
modify_list(lst)
#This output is [1, 2, 3, 4, 5] because of the function.
print(lst)
I don't understand why this example won't work:
def modify_list(lst):
    lst = [1, 2, 3, 4, 5]
lst = [1, 2, 3, 4]
#Output is [1, 2, 3, 4]
print(lst)
modify_list(lst)
#Output is [1, 2, 3, 4]
print(lst)
Why doesn't lst get modified in the second example? Is it because I'm creating a new object within the function's scope? Using the global keyword works instead of passing a parameter, but I want to avoid using global unless absolutely necessary.
I'm using this in an initialization function and want to revert the list back to its original state whenever the function is called. Again, using global works, I'm just wondering why this doesn't work.
Thanks! (Sorry if I'm not good at explaining things well)
The id function comes in handy here. Basically, the id function returns an integer that is guaranteed to be unique and constant for the lifetime of whatever object it was called on. In fact, in CPython (regular Python), id returns the address of the object in memory.
So if you run your code, printing the id of lst before and after running your modify_list, you'll find that the id changes when you assign to lst but not when you append.
Calling append on lst won't change the id of lst because lists in Python are mutable, and appending simply mutates the list. But when you assign [1, 2, 3, 4, 5] to lst, you are creating a brand-new list object and binding the local name lst to it. This is not a mutation, and doesn't change the original in any way. In general in Python, you can mutate arguments within a function to modify the original object, but assigning a new object to the parameter name is not a mutation and won't change the original object.
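The id comparison described above can be written out as a short sketch. The function names append_item and rebind are made up for this example; they mirror the two versions of modify_list from the question.

```python
def append_item(lst):
    lst.append(5)          # mutates the object the caller passed in
    return id(lst)

def rebind(lst):
    lst = [1, 2, 3, 4, 5]  # binds the local name to a new object
    return id(lst)

original = [1, 2, 3, 4]
original_id = id(original)

print(append_item(original) == original_id)  # True: same object
print(rebind(original) == original_id)       # False: a different object
print(original)                              # [1, 2, 3, 4, 5], from the append only
```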
In
def modify_list(lst):
    lst = [1, 2, 3, 4, 5]
you make the local variable lst point to a completely different object than the one you passed as the argument in the modify_list(lst) call. Maybe this article will help you understand: https://medium.com/school-of-code/passing-by-assignment-in-python-7c829a2df10a
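If the goal is to reset the caller's list without global, one option not mentioned in the answers here is slice assignment, which replaces the contents of the existing object instead of rebinding the name. A minimal sketch, with reset_list as a hypothetical name:

```python
def reset_list(lst):
    # Slice assignment mutates the existing list in place.
    lst[:] = [1, 2, 3, 4]

state = [9, 9, 9]
alias = state          # a second reference to the same list
reset_list(state)

print(state)           # [1, 2, 3, 4]
print(alias is state)  # True: still the same object
print(alias)           # [1, 2, 3, 4]
```

Returning the new list and reassigning it at the call site (lst = make_initial_list(), for some such function) is another common choice.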

Usage of python function

I'm a new learner of Python/programming. Here is a question off the top of my head about the use of functions in Python.
If I had a list called myList.
(a) If I were to sort it, I would use myList.sort()
(b) If I were to sort it temporarily, I would use sorted(myList)
Note the difference between the two: one applies the function to myList, the other passes myList as a parameter to the function.
My question is, each time I use a function:
How do I know if the function should be used as an "action" applied to an object (as in (a)), or
if the object should be passed to the function as a parameter (as in (b))?
I have been confused by this for quite a long time. I'd appreciate any explanations.
Thanks.
There are two big differences between list.sort and sorted(list):
list.sort() sorts the list in place, which means it modifies the list. The sorted function does not modify the original list but returns a new sorted list.
list.sort() only applies to lists (it is a method), but the built-in sorted function can take any iterable object.
Please go through this useful documentation.
Only sorted is a function - list.sort is a method of the list type.
Functions such as sorted are applicable to more than one specific type. For example, you can sort a list, a set, or even a temporary generator. Only the output is concrete (you always get a new list) but not the input.
Methods such as sort are applicable only to the type that holds them. For example, there is a list.sort method but not a dict.sort method. Even for types whose methods have the same name, switching them is not sensible - for example, set.copy cannot be used to copy a dict.
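A quick sketch of both points, in Python 3: sorted accepts any iterable, while methods are tied to the type that defines them.

```python
# sorted() accepts any iterable and always returns a new list.
print(sorted({3, 1, 2}))                 # [1, 2, 3]
print(sorted(n * n for n in (3, 1, 2)))  # [1, 4, 9]
print(sorted("bca"))                     # ['a', 'b', 'c']

# Methods belong to their type: dict has no sort method ...
print(hasattr(dict, "sort"))             # False

# ... and set.copy refuses to operate on a dict.
try:
    set.copy({"a": 1})
except TypeError as exc:
    print("TypeError:", exc)
```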
An easy way to distinguish the two is that functions live in regular namespaces, such as modules. On the other hand, methods only live inside classes and their instances.
sorted # function
list.sort # method
import math
math.sqrt # function
math.pi.as_integer_ratio # method
By convention, Python usually uses functions for non-mutating actions and methods for mutating actions. For example, sorted provides a new sorted list, leaving the old one untouched; my_list.sort() sorts the existing list, providing no new one.
my_list = [4, 2, 3, 1]
print(sorted(my_list)) # prints [1, 2, 3, 4]
print(my_list) # prints [4, 2, 3, 1] - unchanged by sorted
print(my_list.sort()) # prints None - no new list produced
print(my_list) # prints [1, 2, 3, 4] - changed by sort
sort() is an in-place method, whereas sorted() returns a new sorted list and does not alter your variable. The following demonstrates the difference:
l = [1, 2, 1, 3, 2, 4]
l.sort()
print(l)      # prints [1, 1, 2, 2, 3, 4]
l = [1, 2, 1, 3, 2, 4]
new_l = sorted(l)
print(new_l)  # prints [1, 1, 2, 2, 3, 4]
print(l)      # prints [1, 2, 1, 3, 2, 4]
If you want to maintain the original order of your list use sorted, otherwise you can use sort().

recursive function python, create function that generates all numbers that have same sum N

I am trying to code a recursive function in Python that generates all the lists of numbers < N whose sum equals N.
This is the code I wrote :
def fn(v,n):
    N=5
    global vvi
    v.append(n) ;
    if(len(v)>N):
        return
    if(sum(v)>=5):
        if(sum(v)==5): vvi.append(v)
    else:
        for i in range(n,N+1):
            fn(v,i)
this is the output I get
vvi
Out[170]: [[1, 1, 1, 1, 1, 2, 3, 4, 5, 2, 3, 4, 5, 2, 3, 4, 5, 2, 3, 4, 5]]
I tried the same thing in C++ and it worked fine.
What you need to do is formulate it as a recursive description and implement it. You want to prepend each singleton [j] to each of the lists with sum N-j, unless N-j=0, in which case you yield the singleton itself. Translated into Python this would be
def glist(listsum, minelm=1):
    for j in range(minelm, listsum+1):
        if listsum-j > 0:
            for l in glist(listsum-j, minelm=j):
                yield [j]+l
        else:
            yield [j]
for l in glist(5):
    print(l)
The solution contains a mechanism that excludes permuted solutions by requiring the lists to be non-decreasing. This is done via the minelm argument, which limits the values in the rest of the list. If you want to include permuted lists, you can disable the minelm mechanism by replacing the recursive call with glist(listsum-j).
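The variant without the minelm restriction, as described above, might look like the sketch below. The name compositions is chosen for this example only; the body is the same generator with the minimum-element argument dropped.

```python
def compositions(total):
    """Yield every ordered list of positive integers that sums to total."""
    for first in range(1, total + 1):
        if total - first > 0:
            # No lower bound on the rest, so permutations are included.
            for rest in compositions(total - first):
                yield [first] + rest
        else:
            yield [first]

print(list(compositions(3)))       # [[1, 1, 1], [1, 2], [2, 1], [3]]
print(len(list(compositions(5))))  # 16
```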
As for your code, I don't really follow what you're trying to do. I'm sorry, but your code is not very clear (and that's not a bad thing only in Python, it's actually more so in C).
First of all, it's a bad idea to return the result from a function via a global variable. Returning a result is what return is for, and in Python you also have yield, which is nice if you want to return multiple elements as you go. For a recursive function it's even worse to return via a global variable (or even use one), since you are running many nested invocations of the function but have only one global variable.
Also, you call the function fn and its arguments v and n. What does that actually tell you about the function and its arguments? At most that it's a function, and probably that one of the arguments should be a number. Not very useful if somebody (else) is to read and understand the code.
If you want a more elaborate answer about what's formally wrong with your code, you should probably include a minimal, complete, verifiable example including the expected output (and perhaps the observed output).
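The advice about returning results instead of writing to a global can be sketched like this. It is not a repair of the code in the question, just one possible shape; sums_to and its parameter names are invented for the example. Each call builds a fresh list for its candidate, so no list object is shared between recursive invocations.

```python
def sums_to(target, start=1, prefix=()):
    """Return all non-decreasing lists of positive integers summing to target."""
    results = []
    for i in range(start, target + 1):
        candidate = list(prefix) + [i]  # new list, never shared
        if i == target:
            results.append(candidate)
        else:
            # Recurse on what is left; later elements must be >= i.
            results.extend(sums_to(target - i, i, candidate))
    return results

for combo in sums_to(5):
    print(combo)
```

Here target always holds the amount still to be filled, so reaching i == target means the candidate is complete.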
You may want to reconsider the recursive solution and consider a dynamic programming approach:
def fn(N):
    ways = {0:[[]]}
    for n in range(1, N+1):
        for i, x in enumerate(range(n, N+1)):
            for v in ways[i]:
                ways.setdefault(x, []).append(v+[n])
    return ways[N]
>>> fn(5)
[[1, 1, 1, 1, 1], [1, 1, 1, 2], [1, 2, 2], [1, 1, 3], [2, 3], [1, 4], [5]]
>>> fn(3)
[[1, 1, 1], [1, 2], [3]]
Using global variables and side effects on input parameters is generally considered bad practice, and you should look to avoid it.

python: print i for i in list

When I printed a list, I got a syntax error when using this method:
print i for i in [1,2,3]
I knew this is ok when using this method:
for i in [1, 2, 3]:
    print i
and I knew that
(i for i in [1, 2, 3])
is a generator object, but I just don't get why
print i for i in [1, 2, 3]
doesn't work. Can anyone give me a clue?
The list comprehension syntax x for x in ... requires brackets around it. That's a syntactic rule of Python, just like requiring indentation, or requiring a colon after if or whatever. i for i in [1, 2, 3] by itself is not valid syntax. You need either [i for i in [1, 2, 3]] (for a list comprehension) or (i for i in [1, 2, 3]) (for a generator comprehension).
In Python 2, print is a statement, not an expression or a function, so you can't use it directly in a comprehension. Use this trick:
def f(x): print x
[f(i) for i in [1,2,3]]
Note that (f(i)...) doesn't work because this just creates a generator which would call f() if you iterated over it. The list comprehension [] actually invokes f().
[EDIT] If you use Python >= 2.6, you can achieve the same using
from __future__ import print_function
[print(i) for i in [1, 2, 3]]
Note the () around the argument to print.
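The difference between the lazy generator form and the eager list comprehension noted above can be observed directly. This sketch is written for Python 3, and record is a stand-in for f that logs its calls instead of printing.

```python
calls = []

def record(x):
    calls.append(x)  # remember that we were called
    return x

lazy = (record(i) for i in [1, 2, 3])
print(calls)  # []: building the generator calls nothing

eager = [record(i) for i in [1, 2, 3]]
print(calls)  # [1, 2, 3]: the list comprehension ran immediately

list(lazy)    # consuming the generator finally makes the calls
print(calls)  # [1, 2, 3, 1, 2, 3]
```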
The list comprehension syntax ([expression for loop]) is a shorthand loop syntax for producing a list.
You are not producing a list, you want to print items in a loop. Since you are not producing a python list, you have to use a regular loop.
Alternatively, since all you are doing is printing the items on separate lines, just add the newlines yourself:
print '\n'.join(str(i) for i in [1, 2, 3])
This produces the same output as:
for i in [1, 2, 3]:
    print i
If you use Python 3, or use from __future__ import print_function at the top of your module and so use the print() function, you can send all values to the function in one call, and tell print() to use newlines in between:
values = [1, 2, 3]
print(*values, sep="\n")
As an expression (in the grammar):
[i for i in [1, 2, 3]] is a list comprehension.
(i for i in [1, 2, 3]) is a generator expression.
But i for i in [1, 2, 3] by itself is a syntax error, and that's just the way it is. There must be something surrounding it. Unless you have ( or [ around it, it's not a valid expression, because the for keyword is not valid at that point.
Inside the print statement, it wants an expression.
(As a red herring, func(i for i in [1, 2, 3]) is permitted as an expression, being a function call with the first argument being a generator expression.)
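These grammar rules can be checked without running anything by asking the built-in compile function whether a string parses as an expression. The sketch below targets Python 3 (the question itself uses Python 2 print syntax), and is_valid_expression is a helper name invented here.

```python
def is_valid_expression(source):
    """Report whether source compiles as a single Python expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True

print(is_valid_expression("i for i in [1, 2, 3]"))            # False
print(is_valid_expression("(i for i in [1, 2, 3])"))          # True
print(is_valid_expression("[i for i in [1, 2, 3]]"))          # True
print(is_valid_expression("sum(i for i in [1, 2, 3])"))       # True
# The bare form is only allowed as the sole argument of a call.
print(is_valid_expression("sum(i for i in [1, 2, 3], 0)"))    # False
print(is_valid_expression("sum((i for i in [1, 2, 3]), 0)"))  # True
```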
In Python 2, print is not a function, it's a statement, and you can't have statements in expressions. Just use a regular loop, since you don't want to produce a list, which is what a list comprehension does. In theory you can do this (not that you should, at all):
from __future__ import print_function
[print(my_item) for my_item in [1,2,3,4]]
1
2
3
4
Out[26]:
[None, None, None, None]
This is invalid Python syntax. The i for i in [1, 2, 3] is only valid in a list or generator comprehension, i.e. surrounded by [] or () respectively.
You'll want to use:
print '\n'.join(str(i) for i in [1, 2, 3])
