Output List Duplicating Values - python

My function same_num takes values that are common to both sorted lists and appends them onto 'result'. It's using recursion and two offsets, pos1 and pos2 that are always initially set to 0, to compare values in the list. When running the function, it works fine the first time, however if I run the function a second time, the original result is appended with the answer I got from running it initially. Where am I going wrong?
result=[]
def same_num(list1,list2,pos1,pos2):
    list1=sorted(list1)
    list2=sorted(list2)
    if pos1==len(list1) or pos2==len(list2):
        return result
    if list1[pos1]==list2[pos2]:
        result.append(list1[pos1])
        return same_num(list1,list2,pos1+1,pos2+1)
    if list1[pos1]>list2[pos2]:
        return same_num(list1,list2,pos1,pos2+1)
    if list1[pos1]<list2[pos2]:
        return same_num(list1,list2,pos1+1,pos2)
For example:
same_num([3,1,2,4],[3,1,2,4,5,6],0,0)=>[1,2,3,4]
Rerunning the previous example in the shell produces:
same_num([3,1,2,4],[3,1,2,4,5,6],0,0)=>[1, 2, 3, 4, 1, 2, 3, 4]
when it should still produce:
[1,2,3,4]

The problem is that result is a global variable. Globals are bad! You are adding stuff to result (result.append(...)) but never clearing it out after the first invocation of the same_num function.
(Although I can see why you are taking this approach, because conceptually it is often easier to approach recursive functions using global variables.)
If you make result a parameter of same_num and pass it along to each recursive invocation, the issue goes away.
def same_num(list1,list2,pos1,pos2,init_result=None):
    # IMPORTANT: see remark below on why init_result=[]
    # would not do what you expect
    result = init_result if init_result is not None else []
    list1=sorted(list1)
    list2=sorted(list2)
    if pos1==len(list1) or pos2==len(list2):
        return result
    if list1[pos1]==list2[pos2]:
        result.append(list1[pos1])
        return same_num(list1,list2,pos1+1,pos2+1,result)
    if list1[pos1]>list2[pos2]:
        return same_num(list1,list2,pos1,pos2+1,result)
    if list1[pos1]<list2[pos2]:
        return same_num(list1,list2,pos1+1,pos2,result)
# multiple invocations will return the same (expected) result
print( same_num([3,1,2,4],[3,1,2,4,5,6],0,0) )
print( same_num([3,1,2,4],[3,1,2,4,5,6],0,0) )
By the way, see "Common Python Gotchas: Mutable default arguments" for why I used init_result=None as the default, rather than init_result=[].
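To make the gotcha concrete, here is a minimal sketch. The names append_shared and append_fresh are hypothetical and not part of the code above. A default value is evaluated once, when the def statement runs, so a mutable default ends up shared by every call that omits the argument.

```python
def append_shared(item, bucket=[]):
    # The default list is created once, at definition time,
    # and reused by every call that omits `bucket`.
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    # A new list is created on each call that omits `bucket`.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


shared_first = list(append_shared(1))    # copy of the shared list after call 1
shared_second = list(append_shared(2))   # copy after call 2: the 1 is still there
fresh_first = append_fresh(1)
fresh_second = append_fresh(2)

print(shared_first, shared_second)  # [1] [1, 2]
print(fresh_first, fresh_second)    # [1] [2]
```

This is the same symptom as the global result list: state that outlives a single call.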

When running the function, it works fine the first time, however if I
run the function a second time, the original result is appended with
the answer I got from running it initially.
That is the exact issue. You are not emptying the previous result before you call it again. result still contains the values from the first time you ran the function.
For example, try running it like this instead:
output = same_num([3,1,2,4],[3,1,2,4,5,6],0,0)
print(output)
result = []  # reset the global list before the second call
output = same_num([3,1,2,4],[3,1,2,4,5,6],0,0)
print(output)
Both outputs will be [1,2,3,4].
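For comparison, here is a sketch of the same two-offset comparison written as a loop, with the result list created inside the function so nothing survives between calls. The name common_sorted is hypothetical; this is one possible rewrite, not the asker's code.

```python
def common_sorted(list1, list2):
    # Sort once, up front, instead of on every recursive call.
    a, b = sorted(list1), sorted(list2)
    result = []          # local: a fresh list on every call
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            result.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return result


first = common_sorted([3, 1, 2, 4], [3, 1, 2, 4, 5, 6])
second = common_sorted([3, 1, 2, 4], [3, 1, 2, 4, 5, 6])
print(first, second)  # [1, 2, 3, 4] [1, 2, 3, 4]
```

Because there are no offsets to pass in, callers no longer need the trailing 0, 0 arguments.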

Related

Python - why isnt my function outputting 'a' ten times? Beginner question

def testfunction():
    for i in range(10):
        return('a')
print(testfunction())
I want 'a' output 10 times in one line. If I use print instead of return, it gives me 10 'a's but each on a new line. Can you help?
return terminates the current function, while print is a call to another function (at least in Python 3).
Any code after a return statement will not be run.
Python's way of printing 10 a's would be:
print('a' * 10)
In your case it would look like the following:
def testfunction():
    return 'a' * 10
print(testfunction())
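To see why the loop in the question yields a single 'a', here is a small sketch with two hypothetical functions: one returns inside the loop, the other collects values and returns once after the loop has finished.

```python
def first_only():
    for i in range(10):
        return 'a'          # leaves the function on the first iteration


def collected():
    parts = []
    for i in range(10):
        parts.append('a')   # keep looping, remember each piece
    return ''.join(parts)   # return once, after the loop


print(first_only())  # a
print(collected())   # aaaaaaaaaa
```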
The reason it's only printing once is that the return statement ends the function, and with it the loop, on the first iteration.
In order to print 'a' 10 times you want to do the following:
def testfunction():
    for i in range(10):
        print('a')
testfunction()
If you want "a" printed 10 times in one single line then you can simply go for:
def TestCode():
    print("a"*10)
There's no need to use a for loop. A for loop will just print "a" 10 times, each time on a new line.
You can also take in a function argument and get "a" printed as many times as desired.
Such as:
def TestCode(times):
    t = "a"*times
    print(t)
Test:
TestCode(5)
>>> aaaaa
TestCode(7)
>>> aaaaaaa
print and return get mixed up when starting Python.
A function can return anything, but that doesn't mean the value will be printed for you to see. A function can even return another function (such functions are called higher-order functions, a staple of functional programming).
The function below is adapted from your question and it returns a string object. When you call the function, the returned string is stored in the variable ten_a_letters. That contains all of the info you wanted and you can print it to the console.
You could have also used yield or print in your for loop but that may be outside of the scope.
def test_function(item:str="a", n:int=10):
    line = item*n # this will be a string object
    return line
ten_a_letters = test_function()
print(ten_a_letters)
# prints: aaaaaaaaaa
two_b_letters = test_function("b",2)
print(two_b_letters)
# prints: bb
I want 'a' outputed 10 times in one line. If I use print instead of
return, it gives me 10 'a's but each on a new line.
If you want to use print, then you need to pass the end keyword argument as follows:
def testfunction():
    for i in range(10):
        print('a', end='')
However, I think the pythonic way would be to do the following:
def testfunction():
    print('a' * 10)
When you use return you end the execution of the function immediately and only one value is returned.
Other answers here provide an easier way to solve your problem (which is great), but I would like to suggest a different approach using yield (instead of return) and create a generator (which might be an overkill but a valid alternative nonetheless):
def testfunction():
    for i in range(10):
        yield 'a'
print(''.join(testfunction()))
1. What does "yield" keyword do?
def test():
    print('a' * 10)
test()
Output will be 'aaaaaaaaaa'.
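Several of the answers above turn on the difference between printing a value and returning it. A short sketch of that difference, using two hypothetical functions: a function that only prints hands back None, while a function that returns leaves it to the caller to decide what to do with the value.

```python
def show_letters():
    print('a' * 10)      # writes to the console; no return statement


def build_letters():
    return 'a' * 10      # hands the string back to the caller


shown = show_letters()   # prints the letters, but `shown` is None
built = build_letters()  # prints nothing; `built` holds the string
print(shown)             # None
print(built)             # aaaaaaaaaa
```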

Pythonic way to efficiently handle variable number of return args

So I have a function that can either work quietly or verbosely. In quiet mode it produces an output. In verbose mode it also saves intermediate calculations to a list, though doing so takes extra computation in itself.
Before you ask, yes, this is an identified bottleneck for optimization, and the verbose output is rarely needed so that's fine.
So the question is, what's the most pythonic way to efficiently handle a function which may or may not return a second value? I suspect a pythonic way would be named tuples or dictionary output, e.g.
def f(x,verbose=False):
    result = 0
    verbosity = []
    for _ in x:
        foo = # something quick to calculate
        result += foo
        if verbose:
            verbosity += # something slow to calculate based on foo
    return {"result":result, "verbosity":verbosity}
But that requires constructing a dict when it's not needed.
Some alternatives are:
# "verbose" changes syntax of return value, yuck!
return (result,verbosity) if verbose else result
or using a mutable argument
def f(x,verbosity=None):
    if verbosity:
        assert verbosity==[[]]
    result = 0
    for _ in x:
        foo = # something quick to calculate
        result += foo
        if verbosity:
            # hard coded value, yuck
            verbosity[0] += # something slow to calculate based on foo
    return result
# for verbose results call as
verbosity = [[]]
f(x,verbosity)
Any better ideas?
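Don't return verbosity. Make it an optional function argument, passed in by the caller and mutated in the function only if one was supplied. Test for that with "is not None" rather than truthiness, since an empty list is falsy.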
The non-pythonic part of some answers is the need to test the structure of the return value. Passing mutable arguments for optional processing avoids this ugliness.
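A minimal sketch of that suggestion, with stand-in arithmetic because the question elides the real calculations. The names running_total and trace are hypothetical. The caller opts in by passing a list; otherwise no extra work is done and the return type never changes.

```python
def running_total(values, trace=None):
    total = 0
    for v in values:
        total += v                 # stand-in for the quick calculation
        if trace is not None:      # an empty list still counts as "supplied"
            trace.append(total)    # stand-in for the slow, optional work
    return total


quiet = running_total([1, 2, 3])

steps = []
verbose = running_total([1, 2, 3], steps)
print(quiet, verbose, steps)  # 6 6 [1, 3, 6]
```

Compared with the [[]] trick in the question, a plain list plus an "is not None" check avoids both the assert and the hard-coded index.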
I like the first option, but instead of passing a verbose parameter in the function call, return a tuple of a quick result and a lazily-evaluated function:
import time
def getResult(x):
    quickResult = x * 2
    def verboseResult():
        time.sleep(5)
        return quickResult * 2
    return (quickResult, verboseResult)
# Returns immediately
(quickResult, verboseResult) = getResult(2)
print(quickResult) # Prints immediately
print(verboseResult()) # Prints after running the long-running function

Understanding a Python function

I need some help understanding a function that I want to use, but I'm not entirely sure what some parts of it do. I understand that the function is creating dictionaries from reads out of a FASTA file. From what I understand this is supposed to generate prefix and suffix dictionaries for ultimately extending contigs (overlapping DNA sequences).
The code:
def makeSuffixDict(reads, lenSuffix = 20, verbose = True):
    lenKeys = len(reads[0]) - lenSuffix
    dict = {}
    multipleKeys = []
    i = 1
    for read in reads:
        if read[0:lenKeys] in dict:
            multipleKeys.append(read[0:lenKeys])
        else:
            dict[read[0:lenKeys]] = read[lenKeys:]
        if verbose:
            print("\rChecking suffix", i, "of", len(reads), end = "", flush = True)
            i += 1
    for key in set(multipleKeys):
        del(dict[key])
    if verbose:
        print("\nCreated", len(dict), "suffixes with length", lenSuffix, \
            "from", len(reads), "Reads. (", len(reads) - len(dict), \
            "unambigous)")
    return(dict)
Additional Information: reads = readFasta("smallReads.fna", verbose = True)
This is how the function is called:
if __name__ == "__main__":
    reads = readFasta("smallReads.fna", verbose = True)
    suffixDicts = makeSuffixDicts(reads, 10)
The smallReads.fna file contains strings of bases (DNA):
> read 1
TTATGAATATTACGCAATGGACGTCCAAGGTACAGCGTATTTGTACGCTA
> read 2
AACTGCTATCTTTCTTGTCCACTCGAAAATCCATAACGTAGCCCATAACG
> read 3
TCAGTTATCCTATATACTGGATCCCGACTTTAATCGGCGTCGGAATTACT
Here are the parts I don't understand:
lenKeys = len(reads[0]) - lenSuffix
What does the value [0] mean? From what I understand "len" returns the number of elements in a list.
Why is "reads" automatically a list? edit: It seems a Fasta-file can be declared as a List. Can anybody confirm that?
if read[0:lenKeys] in dict:
Does this mean "from 0 to 'lenKeys'"? Still confused about the value.
In another function there is a similar line: if read[-lenKeys:] in dict:
What does the "-" do?
def makeSuffixDict(reads, lenSuffix = 20, verbose = True):
Here I don't understand the parameters: How can reads be a parameter? What is lenSuffix = 20 in the context of this function other than a value subtracted from len(reads[0])?
What is verbose? I have read about a "verbose mode" that ignores whitespace, but I have never seen it used as a parameter and later as a variable.
The tone of your question makes me feel like you're confusing things like program features (len, functions, etc) with things that were defined by the original programmer (the type of reads, verbose, etc).
def some_function(these, are, arbitrary, parameters):
    pass
This function defines a bunch of parameters. They don't mean anything at all, other than the value I give to them implicitly. For example if I do:
def reverse_string(s):
    pass
s is probably a string, right? In your example we have:
def makeSuffixDict(reads, lenSuffix = 20, verbose = True):
    lenKeys = len(reads[0]) - lenSuffix
    ...
From these two lines we can infer a few things:
the function will probably return a dictionary (from its name)
lenSuffix is an int, and verbose is a bool (from their default parameters)
reads can be indexed (string? list? tuple?)
the items inside reads have length (string? list? tuple?)
Since Python is dynamically typed, this is ALL WE CAN KNOW about the function so far. The rest would be explained by its documentation or the way it's called.
That said: let me cover all your questions in order:
What does the value [0] mean?
some_object[0] grabs the first item in a container. [1,2,3][0] == 1, "Hello, World!"[0] == "H". This is called indexing, and is governed by the __getitem__ magic method.
From what I understand "len" returns the number of elements in a list.
len is a built-in function that returns the length of an object. It is governed by the __len__ magic method. len('abc') == 3, also len([1, 2, 3]) == 3. Note that len(['abc']) == 1, since it is measuring the length of the list, not the string inside it.
Why is "reads" automatically a list?
reads is a parameter. It is whatever the calling scope passes to it. It does appear that it expects a list, but that's not a hard and fast rule!
(various questions about slicing)
Slicing is some_container[start_idx : end_idx [ : step_size]]. It does pretty much what you'd expect: "0123456"[0:3] == "012". Slice indexes are zero-based and lie between the elements, so [0:1] selects the same element as [0], except that a slice returns a container of the same type rather than the element itself (so [1, 2, 3][0] == 1 but [1, 2, 3][0:1] == [1]; for strings both 'abc'[0] and 'abc'[0:1] are 'a', because a single character is itself a string). If you omit either the start or end index, it is treated as the beginning or end of the sequence respectively. I won't go into step size here.
Negative indexes count from the back, so '0123456'[-3:] == '456'. Note that [-0] is not the last value, [-1] is (since -0 == 0, [-0] is the same as [0]). This contrasts with [0] being the first value.
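Applied to the question's data, the indexing and slicing rules look like this. The sketch uses the first 20 bases of read 1 and an arbitrarily chosen suffix length of 5; the snake_case variable names are illustrative stand-ins for lenSuffix and lenKeys.

```python
read = "TTATGAATATTACGCAATGG"   # first 20 bases of read 1 from the question
len_suffix = 5                  # arbitrary value chosen for this example
len_keys = len(read) - len_suffix

key = read[0:len_keys]          # everything except the last 5 bases
suffix = read[len_keys:]        # the last 5 bases
tail = read[-len_suffix:]       # same 5 bases, counted from the end

print(len_keys)                 # 15
print(key, suffix, tail)        # TTATGAATATTACGC AATGG AATGG
print(read[0], read[-1])        # T G

# A slice keeps the container type; an index returns the element.
print([1, 2, 3][0], [1, 2, 3][0:1])   # 1 [1]
print('abc'[0], 'abc'[0:1])           # a a
```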
How can reads be a parameter?
Because the function is defined as makeSuffixDict(reads, ...). That's what a parameter is.
What is lenSuffix = 20 in the context of this function
Looks like it's the length of the expected suffix!
What is verbose?
verbose has no meaning on its own. It's just another parameter. Looks like the author included the verbose flag so you could get output while the function ran. Notice all the if verbose blocks seem to do nothing, just provide feedback to the user.
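Putting the pieces together, here is a stripped-down sketch of what the function appears to do, run on made-up six-base reads. The name suffix_dict_sketch and the toy reads are hypothetical, and the verbose output is left out. Each read is split into a key (everything but the last len_suffix bases) and a suffix, and any key that occurs more than once is dropped as ambiguous.

```python
def suffix_dict_sketch(reads, len_suffix=2):
    # All reads are assumed to have the same length as the first one.
    len_keys = len(reads[0]) - len_suffix
    table = {}
    ambiguous = set()
    for read in reads:
        key, suffix = read[:len_keys], read[len_keys:]
        if key in table:
            ambiguous.add(key)      # second sighting: key is not unique
        else:
            table[key] = suffix
    for key in ambiguous:
        del table[key]              # keep only keys seen exactly once
    return table


toy_reads = ["AAAACC", "AAAAGG", "CCCCTT"]   # made-up data
print(suffix_dict_sketch(toy_reads))          # {'CCCC': 'TT'}
```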

2 inputs to a function?

So I've been given the following code in a kind-of-sort-of Python class. It's really a discrete math class, but he uses Python to demonstrate everything. This code is supposed to demonstrate a multiplexer and building an XOR gate with it.
def mux41(i0,i1,i2,i3):
    return lambda s1,s0:{(0,0):i0,(0,1):i1,(1,0):i2,(1,1):i3}[(s1,s0)]
def xor2(a,b):
    return mux41(0,1,1,0)(a,b)
In the xor2 function I don't understand the syntax behind return mux41(0,1,1,0)(a,b). The 1's and 0's are the input to the mux function, but what is the (a,b) doing?
The (a, b) is actually the input to the lambda function that you return in the mux41 function.
Your mux41 function returns a lambda function which looks like it returns a value in a dictionary based on the input to the mux41 function. You need the second input to say which value you want to return.
It is directly equivalent to:
def xor2(a,b):
    f = mux41(0,1,1,0)
    return f(a,b)
That is fairly advanced code to throw at Python beginners, so don't feel bad it wasn't obvious to you. I also think it is rather trickier than it needs to be.
def mux41(i0,i1,i2,i3):
    return lambda s1,s0:{(0,0):i0,(0,1):i1,(1,0):i2,(1,1):i3}[(s1,s0)]
This defines a function object that returns a value based on two inputs. The two inputs are s1 and s0. The function object builds a dictionary that is pre-populated with the four values passed in to mux41(), and it uses s0 and s1 to select one of those four values.
Dictionaries use keys to look up values. In this case, the keys are Python tuples: (0, 0), (0, 1), (1, 0), and (1,1). The expression (s1,s0) is building a tuple from the arguments s0 and s1. This tuple is used as the key to lookup a value from the dictionary.
def xor2(a,b):
    return mux41(0,1,1,0)(a,b)
So, mux41() returns a function object that does the stuff I just discussed. xor2() calls mux41() and gets a function object; then it immediately calls that returned function object, passing in a and b as arguments. Finally it returns the answer.
The function object created by mux41() is not saved anywhere. So, every single time you call xor2(), you are creating a function object, which is then garbage collected. When the function object runs, it builds a dictionary object, and this too is garbage collected after each single use. This is possibly the most complicated XOR function I have ever seen.
Here is a rewrite that might make this a bit clearer. Instead of using lambda to create an un-named function object, I'll just use def to create a named function.
def mux41(i0,i1,i2,i3):
    def mux_fn(s1, s0):
        d = {
            (0,0):i0,
            (0,1):i1,
            (1,0):i2,
            (1,1):i3
        }
        tup = (s1, s0)
        return d[tup]
    return mux_fn
def xor2(a,b):
    mux_fn = mux41(0,1,1,0)
    return mux_fn(a,b)
EDIT: Here is what I would have written if I wanted to make a table-lookup XOR in Python.
_d_xor2 = {
    (0,0) : 0,
    (0,1) : 1,
    (1,0) : 1,
    (1,1) : 0
}
def xor2(a,b):
    tup = (a, b)
    return _d_xor2[tup]
We build the lookup dictionary once, then use it directly from xor2(). It's not really necessary to make an explicit temp variable in xor2() but it might be a bit clearer. You could just do this:
def xor2(a,b):
    return _d_xor2[(a, b)]
Which do you prefer?
And of course, since Python has an XOR operator built-in, you could write it like this:
def xor2(a,b):
    return a ^ b
If I were writing this for real I would probably add error handling and/or make it operate on bool values.
def xor2(a,b):
    return bool(a) ^ bool(b)
EDIT: One more thing just occurred to me. In Python, the rule is "the comma makes the tuple". The parentheses around a tuple are sometimes optional. I just checked, and it works just fine to leave off the parentheses in a dictionary lookup. So you can do this:
def xor2(a,b):
    return _d_xor2[a, b]
And it works fine. This is perhaps a bit too tricky? If I saw this in someone else's code, it would surprise me.
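One way to convince yourself that the multiplexer version and the built-in operator agree is to run both over the whole truth table. The sketch below repeats mux41 from the question; xor_mux, and_mux, and truth_table are hypothetical names added for the check. It also shows that the same multiplexer yields a different gate when given different data inputs.

```python
def mux41(i0, i1, i2, i3):
    # Returns a selector function; (s1, s0) picks one of the four inputs.
    return lambda s1, s0: {(0, 0): i0, (0, 1): i1, (1, 0): i2, (1, 1): i3}[(s1, s0)]


xor_mux = mux41(0, 1, 1, 0)   # build the selector once and reuse it
and_mux = mux41(0, 0, 0, 1)   # same multiplexer, different inputs: AND

truth_table = [(a, b, xor_mux(a, b), a ^ b) for a in (0, 1) for b in (0, 1)]
for a, b, via_mux, via_operator in truth_table:
    print(a, b, via_mux, via_operator)
```

Keeping xor_mux in a variable also avoids rebuilding the function object on every call, which is the inefficiency pointed out above.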

Is there a way in Python to return a value via an output parameter?

Some languages, like C#, can also return values through parameters.
Let’s take a look at an example:
class OutClass
{
    static void OutMethod(out int age)
    {
        age = 26;
    }
    static void Main()
    {
        int value;
        OutMethod(out value);
        // value is now 26
    }
}
So is there anything similar in Python to get a value using parameter, too?
Python can return a tuple of multiple items:
def func():
    return 1,2,3
a,b,c = func()
But you can also pass a mutable parameter, and return values via mutation of the object as well:
def func(a):
    a.append(1)
    a.append(2)
    a.append(3)
L=[]
func(L)
print(L) # [1,2,3]
You mean like passing by reference?
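Python passes object references by value (sometimes called "call by sharing"): the function receives a reference to the caller's object. Mutating that object is visible to the caller, but rebinding the parameter name inside the function only changes the local name and does not affect the caller's variable.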
For example:
def addToList(theList): # yes, the caller's list can be appended
    theList.append(3)
    theList.append(4)
def addToNewList(theList): # no, the caller's list cannot be reassigned
    theList = list()
    theList.append(5)
    theList.append(6)
myList = list()
myList.append(1)
myList.append(2)
addToList(myList)
print(myList) # [1, 2, 3, 4]
addToNewList(myList)
print(myList) # [1, 2, 3, 4]
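If the goal really is to replace the caller's list contents wholesale, slice assignment mutates the existing object instead of rebinding the name. A small sketch, with hypothetical function names:

```python
def replace_contents(the_list):
    # Slice assignment changes the object the caller also holds.
    the_list[:] = [5, 6]


def rebind(the_list):
    # Plain assignment only points the local name at a new list.
    the_list = [7, 8]


my_list = [1, 2]
alias = my_list            # second name for the same object

replace_contents(my_list)
print(my_list)             # [5, 6]
print(alias is my_list)    # True: still the same object

rebind(my_list)
print(my_list)             # [5, 6]: unchanged by the rebinding
```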
Pass a list or something like that and put the return value in there.
In addition, if you feel like reading some code, I think that pywin32 has a way to handle output parameters.
In the Windows API it's common practice to rely heavily on output parameters, so I figure they must have dealt with it in some way.
You can do that with mutable objects, but in most cases it does not make sense because you can return multiple values (or a dictionary if you want to change a function's return value without breaking existing calls to it).
I can only think of one case where you might need it - that is threading, or more exactly, passing a value between threads.
import threading
def outer():
    class ReturnValue:
        val = None
    ret = ReturnValue()
    def t():
        # ret = 5 won't work obviously because that will set
        # the local name "ret" in the "t" function. But you
        # can change the attributes of "ret":
        ret.val = 5
    threading.Thread(target = t).start()
    # Later, you can get the return value out of "ret.val" in the outer function
Adding to Tark-Tolonen's answer:
Please absolutely avoid altering the object reference of the output argument in your function, otherwise the output argument won't work. For instance, I wish to pass an ndarray into a function my_fun and modify it
import numpy as np
def my_fun(out_arr):
    out_arr = np.ones_like(out_arr)
    print(out_arr) # prints 1, 1, 1, ......
    print(id(out_arr))
a = np.zeros(100)
my_fun(a)
print(a) # prints 0, 0, 0, ....
print(id(a))
After calling my_fun, array a still remains all zeros, because np.ones_like returns a new array full of ones, and the assignment rebinds the local name out_arr to that new array instead of modifying the array that was passed in. Running this code you will find that the two print(id()) calls give different values.
Also, beware of numpy's array operators: they usually return a new array if you write something like this
def my_fun(arr_a, arr_b, out_arr):
    out_arr = arr_a - arr_b
Using the - and = operators this way causes the same problem. To keep out_arr referring to the caller's array, you can use the numpy functions that perform exactly the same operations but have an out parameter built in. The preceding code should be rewritten as
def my_fun(arr_a, arr_b, out_arr):
    np.subtract(arr_a, arr_b, out = out_arr)
Now out_arr refers to the same array before and after calling my_fun, while its values are modified successfully.
