I recently started coding in Python 2.7. I'm a molecular biologist.
I'm writing a script that involves creating lists like this one:
mylist = [[0, 4, 6, 1], 102]
These lists are incremented by adding an item to mylist[0] and summing a value to mylist[1].
To do this, I use the code:
def addres(oldpep, res):
    return [oldpep[0] + res[0], oldpep[1] + res[1]]
Which works well. Since mylist[0] can become a bit long, and I have millions of these lists to take care of, I thought that using append or extend might make my code faster, so I tried:
def addres(pep, res):
    pep[0].extend(res[0])
    pep[1] += res[1]
    return pep
Which in my mind should give the same result. It does give the same result when I try it on an arbitrary list. But when I feed it the millions of lists, it gives me a very different result. So... what's the difference between the two? All the rest of the script is exactly the same.
Thank you!
Roberto
The difference is that the second version of addres modifies the list that you passed in as pep, where the first version returns a new one.
>>> mylist = [[0, 4, 6, 1], 102]
>>> list2 = [[3, 1, 2], 205]
>>> addres(mylist, list2)
[[0, 4, 6, 1, 3, 1, 2], 307]
>>> mylist
[[0, 4, 6, 1, 3, 1, 2], 307]
If you need to avoid modifying the original lists, I don't think you're really going to get a faster Python implementation of addres than the first one you wrote. You might be able to deal with the modification, though, or come up with a somewhat different approach to speed up your code if that's the problem you're facing.
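To make the difference concrete, here is a minimal sketch. The names addres_copy and addres_inplace are made up here to tell the two versions apart, and the loop that extends one starting list with several residues is only an assumption about how the rest of the script might call addres.

```python
def addres_copy(oldpep, res):
    # Builds and returns a brand-new list; oldpep is left untouched.
    return [oldpep[0] + res[0], oldpep[1] + res[1]]


def addres_inplace(pep, res):
    # Mutates pep itself and returns the very same object.
    pep[0].extend(res[0])
    pep[1] += res[1]
    return pep


residues = [[[3], 10], [[7], 20]]  # hypothetical residues

start = [[0, 4], 102]
copies = [addres_copy(start, res) for res in residues]
# start is unchanged, so each result extends the original list.
print(copies)    # [[[0, 4, 3], 112], [[0, 4, 7], 122]]

start = [[0, 4], 102]
in_place = [addres_inplace(start, res) for res in residues]
# Both entries are the same object as start, which has grown twice.
print(in_place)  # [[[0, 4, 3, 7], 132], [[0, 4, 3, 7], 132]]
```

If the script reuses one starting list for several extensions like this, the in-place version would explain the different results.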
Lists are objects in Python, and they are passed by reference.
a=list()
This doesn't mean that a is the list; a is a name pointing to the list that was just created.
In the first example, you use the list's elements to create a new list, which is another object, while in the second one you modify the content of the list itself.
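A small sketch of that idea, using the is operator to check whether two names point to the same object (the names b and c are only for illustration):

```python
a = list()
b = a            # b is a second name for the same list object
b.append(1)
print(a)         # [1]
print(a is b)    # True

c = a + [2]      # + builds a new list; a is not modified
print(a)         # [1]
print(c)         # [1, 2]
print(c is a)    # False
```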
Related
I am trying to sort a list by frequency of its elements.
>>> a = [5, 5, 4, 4, 4, 1, 2, 2]
>>> a.sort(key = a.count)
>>> a
[5, 5, 4, 4, 4, 1, 2, 2]
a is unchanged. However:
>>> sorted(a, key = a.count)
[1, 5, 5, 2, 2, 4, 4, 4]
Why does this method not work for .sort()?
What you see is the result of a certain CPython implementation detail of list.sort. Try this again, but create a copy of a first:
a.sort(key=a.copy().count)
a
# [1, 5, 5, 2, 2, 4, 4, 4]
.sort modifies a internally, so a.count is going to produce unpredictable results. This is documented as an implementation detail.
The copy call creates a copy of a, and that copy's count method is used as the key. You can see what happens with some debug statements:
def count(x):
    print(a)
    return a.count(x)
a.sort(key=count)
[]
[]
[]
...
a turns up as an empty list when accessed inside .sort, and [].count(anything) will be 0. This explains why the output is the same as the input: the keys are all the same (0).
OTOH, sorted creates a new list, so it doesn't have this problem.
If you really want to sort by frequency counts, the idiomatic method is to use a Counter:
from collections import Counter
a.sort(key=Counter(a).get)
a
# [1, 5, 5, 2, 2, 4, 4, 4]
It doesn't work with the list.sort method because CPython decides to "empty the list" temporarily (the other answer already presents this). This is mentioned in the documentation as implementation detail:
CPython implementation detail: While a list is being sorted, the effect of attempting to mutate, or even inspect, the list is undefined. The C implementation of Python makes the list appear empty for the duration, and raises ValueError if it can detect that the list has been mutated during a sort.
The source code contains a similar comment with a bit more explanation:
/* The list is temporarily made empty, so that mutations performed
* by comparison functions can't affect the slice of memory we're
* sorting (allowing mutations during sorting is a core-dump
* factory, since ob_item may change).
*/
The explanation isn't straightforward, but the problem is that the key function and the comparisons could change the list instance during sorting, which is very likely to result in undefined behavior of the C code (and may crash the interpreter). To prevent that, the list is emptied during the sorting, so that even if someone changes the instance it won't result in an interpreter crash.
This doesn't happen with sorted because sorted copies the list and simply sorts the copy. The copy is still emptied during the sorting but there's no way to access it, so it isn't visible.
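As a quick sketch of the two safe variants: sorted reads the key from the untouched original, and an in-place sort can take its key from a snapshot (the name snapshot is just for illustration).

```python
a = [5, 5, 4, 4, 4, 1, 2, 2]

# sorted() works on its own internal copy, so a stays readable
# while the key function runs.
by_frequency = sorted(a, key=a.count)
print(by_frequency)  # [1, 5, 5, 2, 2, 4, 4, 4]
print(a)             # [5, 5, 4, 4, 4, 1, 2, 2]

# For an in-place sort, take the key from a snapshot instead.
snapshot = list(a)
a.sort(key=snapshot.count)
print(a)             # [1, 5, 5, 2, 2, 4, 4, 4]
```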
However you really shouldn't sort like this to get a frequency sort. That's because for each item you call the key function once. And list.count iterates over each item, so you effectively iterate the whole list for each element (what is called O(n**2) complexity). A better way would be to calculate the frequency once for each element (can be done in O(n)) and then just access that in the key.
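A rough sketch of that idea with a plain dictionary (the name frequency is made up for this example): count everything in one pass, then let the key function do a dictionary lookup instead of a full scan.

```python
a = [5, 5, 4, 4, 4, 1, 2, 2]

# One pass over the list to count every value: O(n).
frequency = {}
for item in a:
    frequency[item] = frequency.get(item, 0) + 1

# Each key lookup is now a dictionary access, not a scan of the list.
a.sort(key=frequency.get)
print(a)  # [1, 5, 5, 2, 2, 4, 4, 4]
```

Because the key comes from a separate dictionary, it does not matter that a appears empty during the sort.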
However, since Python's collections module has a Counter class that also supports most_common, you could really just use that:
>>> from collections import Counter
>>> [item for item, count in reversed(Counter(a).most_common()) for _ in range(count)]
[1, 2, 2, 5, 5, 4, 4, 4]
This may change the order of the elements with equal counts, but since you're doing a frequency count that shouldn't matter too much.
I am trying to code a recursive function that generates all the lists of numbers < N whose sum equals N in Python.
This is the code I wrote:
def fn(v,n):
    N=5
    global vvi
    v.append(n) ;
    if(len(v)>N):
        return
    if(sum(v)>=5):
        if(sum(v)==5): vvi.append(v)
    else:
        for i in range(n,N+1):
            fn(v,i)
This is the output I get:
vvi
Out[170]: [[1, 1, 1, 1, 1, 2, 3, 4, 5, 2, 3, 4, 5, 2, 3, 4, 5, 2, 3, 4, 5]]
I tried the same thing in C++ and it worked fine.
What you need to do is formulate the problem as a recursive description and implement that. For each possible first element j, you prepend the singleton [j] to each of the lists with sum N-j; when N-j = 0, the singleton [j] itself is the result. Translated into Python this would be:
def glist(listsum, minelm=1):
    for j in range(minelm, listsum+1):
        if listsum-j > 0:
            for l in glist(listsum-j, minelm=j):
                yield [j]+l
        else:
            yield [j]
for l in glist(5):
    print(l)
The solution excludes permuted duplicates by requiring the lists to be non-decreasing. This is done via the minelm argument, which sets a lower bound on the values in the rest of the list. If you want to include permuted lists, you can disable the minelm mechanism by replacing the recursive call with glist(listsum-j).
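For illustration, the variant without the lower bound might look like the sketch below. The name compositions is made up here; it is the same generator with the minelm argument dropped, so permuted lists show up as separate results.

```python
def compositions(total):
    # Same recursion as glist, but every first element from 1 upward
    # is allowed at each level, so order matters.
    for first in range(1, total + 1):
        if total - first > 0:
            for rest in compositions(total - first):
                yield [first] + rest
        else:
            yield [first]


print(list(compositions(3)))  # [[1, 1, 1], [1, 2], [2, 1], [3]]
```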
As for your code I don't really follow what you're trying to do. I'm sorry, but your code is not very clear (and that's not a bad thing only in python, it's actually more so in C).
First of all, it's a bad idea to return the result from a function via a global variable. Returning a result is what return is for, and Python also has yield, which is nice if you want to return multiple elements as you go. For a recursive function it's even worse to return via a global variable (or even to use one), since you are running many nested invocations of the function but have only one global variable.
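One possible way to restructure the question's function along those lines is sketched below. The name partitions and its parameters are invented for this example. Every call receives a fresh list (prefix + [n] builds a new one) instead of appending to one shared list, and the results travel back through return rather than a global.

```python
def partitions(total, smallest=1, prefix=None):
    # prefix holds the numbers chosen so far; it is never mutated.
    prefix = [] if prefix is None else prefix
    if total == 0:
        return [prefix]
    results = []
    for n in range(smallest, total + 1):
        # prefix + [n] creates a new list for the nested call.
        results.extend(partitions(total - n, n, prefix + [n]))
    return results


print(partitions(5))
# [[1, 1, 1, 1, 1], [1, 1, 1, 2], [1, 1, 3], [1, 2, 2], [1, 4], [2, 3], [5]]
```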
Also, you call the function fn and its arguments v and n. What does that actually tell you about the function and its arguments? At most that it's a function, and probably that one of the arguments should be a number. That's not very useful if somebody (else) is to read and understand the code.
If you want a more elaborate answer about what's formally wrong with your code, you should probably include a minimal, complete, verifiable example including the expected output (and perhaps the observed output).
You may want to reconsider the recursive solution and consider a dynamic programming approach:
def fn(N):
    ways = {0:[[]]}
    for n in range(1, N+1):
        for i, x in enumerate(range(n, N+1)):
            for v in ways[i]:
                ways.setdefault(x, []).append(v+[n])
    return ways[N]
>>> fn(5)
[[1, 1, 1, 1, 1], [1, 1, 1, 2], [1, 2, 2], [1, 1, 3], [2, 3], [1, 4], [5]]
>>> fn(3)
[[1, 1, 1], [1, 2], [3]]
Using global variables and side effects on input parameters is generally considered bad practice, and you should look to avoid both.
I am new to programming and am having a problem with an algorithm I am writing in Python. It first assembles a list from a sequence of variables (which each contain a list), and I need it to be able to call the same variable multiple times within a sequence. It then processes the list like this:
a = [1, 2, 3]
b = [4, 5, 6]
c = [a, b]
def function(input):
    return output
so that function(c) returns [[1, 2, 3, output_a1], [4,5,6, output_b1]]
If anyone would like to know more about the function, I will gladly provide more information, but my troubleshooting this far has led me to believe that the source problem is rather simple. The problem I am having is that if I call the same variable multiple times in my master list like so: c = [a, b, a], I would like function(c) to return:
[[1, 2, 3, output_a1],
[4, 5, 6, output_b1],
[1, 2, 3, output_a2]]
However, function() processes all instances of a when it encounters just one, so that I get:
[[1, 2, 3, output_a1, output_a3],
[4, 5, 6, output_b1],
[1, 2, 3, output_a1, output_a3]]
I have found two ways to fix this, but I am really not happy with them and I suspect there is a better way. In the first way, I print c and copy and paste it into the function:
function_a([[1, 2, 3], [4, 5, 6], [1, 2, 3]])
and this returns the desired output. Additionally, I can create another variable with the same contents as a, d = [1, 2, 3], and have c = [a, b, d], and once again, function_a(c) will return the desired output. I have tried a variety of things, but it seems that if any element in c is linked to another through variables, then I encounter this error. Since I will be running this algorithm with fairly lengthy sequences that may contain several instances of the same element, I would really like a clean way to fix this error. Any advice is much appreciated, and I will provide more details about the function if need be. Thanks for reading!
In case anyone is reading this with the same problem, Roberto responded with this link which provided all the information I needed to solve the problem. Little did I know, if I am working with a nested list that contains multiple copies from the same variable, I need to make a deep copy of it to prevent any part of the list from being modified.
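A minimal sketch of what that can look like, assuming the processing step appends to each inner list (process below is a hypothetical stand-in for the real function). One detail worth knowing: copy.deepcopy applied to the whole outer list keeps repeated entries shared with each other, so this sketch copies each element separately.

```python
import copy

a = [1, 2, 3]
b = [4, 5, 6]
c = [a, b, a]          # c[0] and c[2] are the same object


def process(rows):
    # Hypothetical stand-in: appends a marker to every inner list.
    for index, row in enumerate(rows):
        row.append("out%d" % index)
    return rows


# Copy each element on its own so repeated entries become independent.
independent = [copy.deepcopy(row) for row in c]
print(process(independent))
# [[1, 2, 3, 'out0'], [4, 5, 6, 'out1'], [1, 2, 3, 'out2']]
print(a)  # [1, 2, 3]

# deepcopy of the outer list preserves the sharing between entries.
whole = copy.deepcopy(c)
print(whole[0] is whole[2])  # True
```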
I need some help because I think I'm lost. I've searched this site before and of course I've Googled it, but believe me, if it were that simple for me I wouldn't have asked at all, so please be kind to me.
I'm new to python and coding isn't that easy to me.
Anyway, take a look at my code:
def coin_problem(price, cash_coins):
    if (price < 0):
        return []
    if (price == 0):
        return [[]]
    options = []
    for a_coin in cash_coins:
        coins_minus_coin = cash_coins[:]
        coins_minus_coin.remove(a_coin)
        sub_coin_problem = coin_problem (price - a_coin, cash_coins)
        for coin in sub_coin_problem:
            coin.append(a_coin)
        options.extend(sub_coin_problem)
    return options
print coin_problem(4, [1, 2])
As you can see, I've tried to deal with the famous coin problem by recursion (and as I wrote before, I know many have already asked about this; I read their questions and the answers but I still couldn't fully understand the solutions).
This code was made by me, all of it. And now I'm stuck and confused. When the value of "price" is 4 and the value of "cash_coins" is [1,2] instead of returning something like this:
[1,1,1,1]
[2,2]
[2,1,1]
I get something more like:
[[1, 1, 1, 1], [2, 1, 1], [1, 2, 1], [1, 1, 2], [2, 2]]
The combination of one "2" and two "1"s appears 3 times instead of once.
I don't know what should I do in order to fix the problem or to improve my code so it will work better.
When you want to add a single item to a list, use append. When you want to combine two lists together, use extend.
Tips:
coins_minus_coin = cash_coins[:]
coins_minus_coin.remove(coin)
You never use this variable.
for i in sub_coins:
    i.append(coin)
cash_coins_options.append(sub_coins)
You never use i either. I'd guess you meant:
for i in sub_coins:
    i.append(coin)
    cash_coins_options.append(i)
That solves the problem of strange results, but your solution will still only find []. Why? Your recursion can only stop on a return []; it can't handle another fundamental case when you can tell the price using a single coin. Try adding at the top this simple condition:
# actually I changed my mind-
# I assume you're learning so try this one yourself :-)
This will cause your function to behave much better:
>>> print coin_problem(4, [1,2])
[[2, 1, 1], [1, 2, 1], [2, 2]]
which manages to produce correct answers (even though it duplicates some of them).
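For the duplicates, one possible approach (certainly not the only one) is to decide for each coin whether it is used at all, so every combination is generated in a single order. The function name coin_combinations is made up for this sketch, and it assumes the coin values are positive integers:

```python
def coin_combinations(price, coins):
    if price == 0:
        return [[]]          # exactly one way to pay nothing
    if price < 0 or not coins:
        return []            # overshot, or no coins left to try
    first, rest = coins[0], coins[1:]
    # Either use the first coin (and possibly use it again later) ...
    with_first = [[first] + combo
                  for combo in coin_combinations(price - first, coins)]
    # ... or never use it, and continue with the remaining coins.
    without_first = coin_combinations(price, rest)
    return with_first + without_first


print(coin_combinations(4, [1, 2]))  # [[1, 1, 1, 1], [1, 1, 2], [2, 2]]
```

Each result is built with [first] + combo, which creates a new list, so no list is shared between answers.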
Today I spent about 20 minutes trying to figure out why
this worked as expected:
users_stories_dict[a] = s + [b]
but this would have a None value:
users_stories_dict[a] = s.append(b)
Anyone know why the append function does not return the new list? I'm looking for some sort of sensible reason this decision was made; it looks like a Python novice gotcha to me right now.
append works by actually modifying a list, and so all the magic is in side-effects. Accordingly, the result returned by append is None. In other words, what one wants is:
s.append(b)
and then:
users_stories_dict[a] = s
But, you've already figured that much out. As to why it was done this way, while I don't really know, my guess is that it might have something to do with a 0 (or false) exit value indicating that an operation proceeded normally, and by returning None for functions whose role is to modify their arguments in-place you report that the modification succeeded.
But I agree that it would be nice if it returned the modified list back. At least, Python's behavior is consistent across all such functions.
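A short sketch of the behavior described above; stories is a hypothetical stand-in for users_stories_dict:

```python
s = [1, 2]
result = s.append(3)     # modifies s in place
print(result)            # None
print(s)                 # [1, 2, 3]

stories = {}             # hypothetical stand-in for users_stories_dict
stories["new"] = s + [4] # + builds a new list and leaves s alone
s.append(5)
stories["same"] = s      # stores the very list that was modified
print(stories["new"])    # [1, 2, 3, 4]
print(stories["same"])   # [1, 2, 3, 5]
```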
The append() method returns None because it modifies the list itself, adding the appended object as an element, while the + operator concatenates the two lists and returns the resulting list.
eg:
a = [1,2,3,4,5]
b = [6,7,8,9,0]
print a+b # returns a list made by concatenating the lists a and b
>>> [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
print a.append(b) # Adds the list b as element at the end of the list a and returns None
>>> None
print a # the list a was modified during the last append call and has the list b as last element
>>> [1, 2, 3, 4, 5, [6, 7, 8, 9, 0]]
So, as you can see, the easiest way is just to add the two lists together: even if you append the list b to a using append(), you will not get the result you want without additional work.
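For comparison, here is a small sketch of the three options side by side. Like append, extend changes the list in place and returns None, but it adds the elements of b one by one instead of nesting b as a single element. The names combined and nested are only for illustration.

```python
a = [1, 2, 3]
b = [6, 7]

combined = a + b       # new list; a and b are unchanged
print(combined)        # [1, 2, 3, 6, 7]

nested = list(a)
nested.append(b)       # b becomes one nested element
print(nested)          # [1, 2, 3, [6, 7]]

a.extend(b)            # in place, element by element; returns None
print(a)               # [1, 2, 3, 6, 7]
```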