Error handling while searching Python lists - python

I'm trying to calculate a few metrics around an orderbook, and looking for how deep I need to go in to fill an order of $x. I have created a list of lists that shows the price and cumulative asks, and then am finding how deep I need to go under cumulative asks >= x. However, depending on the orderbook and x, sometimes I don't have enough data and end up with a list index out of range error. How can I set some sort of default, None, N/A, whatever, when this is the case.
Note: I feel like there may be a better way than creating a list with the >= and then filtering for first elements of that list, so if that makes solving the index out of range error easier, that'd be great, too.
cum_ob = [[1.54325, 4296.408], [1.5449, 5862.9366], [1.54495, 7679.7978], [1.545, 7695.2478], [1.5467, 7696.7945]]
usd_5k = ([sublist for sublist in cum_ob if sublist[-1] >= 5000][0][0])
usd_10k = ([sublist for sublist in cum_ob if sublist[-1] >= 10000][0][0])
So usd_5k would return 1.5449, while usd_10k throws the error since I've exhausted these levels without reaching 10000.

Maybe you should write a function to do it rather than using list comprehension that it not so flexible for your case :
def search_for_amount(cumulative_list, threshold):
for order in cumulative_list:
if order[1] >= threshold:
return order # You return a tuple with (element, cumulative sum)
# If nothing was found, return None
return None
Then you would be able to get your previous values as follow :
usd_5k = search_for_amount(cum_ob, 5000) # Returns (1.5449, 5862)
usd_10k = search_for_amount(cum_ob, 10000) # Returns None
One advantage of this technique compared to yours is that you do not need to go through the whole list each time. You just iterate until you find a value that matches the threshold condition. With your technique, you have to iterate over every elements of the list, every time, even though the first element matches the condition.

You can use the more_itertools recipe first_true, pasted below. It takes 3 arguments: the iterable (your list), a default for what should be returned if the condition is not met and your list is exhausted; and a predicate (condition) which will be the >=5/10k check.
Since your list has pairs of values, I've set the default to [None, None] so when you index [0] of the returned value, you'll get None, or the price.
usd_5k = first_true(cum_ob, default=[None, None], pred=lambda sub: sub[1] >= 5_000)[0]
print(usd_5k)
# 1.5449
usd_10k = first_true(cum_ob, default=[None, None], pred=lambda sub: sub[1] >= 10_000)[0]
print(usd_10k)
# None
For the first_true code, you can either copy-paste the code or install more_itertools - the body is just one line.
# from https://docs.python.org/3/library/itertools.html#itertools-recipes
def first_true(iterable, default=False, pred=None):
"""Returns the first true value in the iterable.
If no true value is found, returns *default*
If *pred* is not None, returns the first item
for which pred(item) is true.
"""
# first_true([a,b,c], x) --> a or b or c or x
# first_true([a,b], x, f) --> a if f(a) else b if f(b) else x
return next(filter(pred, iterable), default)

Related

Built in (remove) function not working with function variable

Have a good day everyone, pardon my lack of understanding, but I can't seem to figure out why python built in function does not work when being called with another function variable and it just doesn't do what I want at all. Here is the code
def ignoreten(h):
ignoring = False
for i in range (1,len(h)-2):
if ignoring == True and h[i]==10:
h.remove(10)
if ignoring == False and h[i] ==10:
ignoring = True
The basic idea of this is just to decided the first 10 in a list, keep it, continue iterating until you faced another 10, then just remove that 10 to avoid replication, I had searched around but can't seem to find any solution and that's why I have to bring it up here. Thank you
The code you listed
def ignoreten(h):
ignoring = False
for i in range (1,len(h)-2):
if ignoring == True and h[i]==10:
h.remove(10)
if ignoring == False and h[i] ==10:
ignoring = True
Will actually do almost the exact opposite of what you want. It'll iterate over h (sort of, see [1]), and if it finds 10 twice, it'll remove the first occurrence from the list. (And, if it finds 10 three times, it'll remove the first two occurrences from the list.)
Note that list.remove will:
Remove the first item from the list whose value is equal to x. It
raises a ValueError if there is no such item.
Also note that you're mutating the list you're iterating over, so there's some additional weirdness here which may be confusing you, depending on your input.
From your follow-up comment to my question, it looks like you want to remove only the second occurrence of 10, not the first and not any subsequent occurrences.
Here are a few ways:
Iterate, store index, use del
def ignoreten(h):
index = None
found_first = False
for i,v in enumerate(h):
if v == 10:
if not found_first:
found_first = True
else:
index = i
break
if index is not None:
del h[index]
A little more verbose than necessary, but explicit, safe, and modifiable without much fear.
Alternatively, you could delete inside the loop but you want to make sure you immediately break:
def ignoreten(h):
found_first = False
for i,v in enumerate(h):
if v == 10:
if not found_first:
found_first = True
else:
del h[i]
break
Collect indices of 10s, remove second
def ignoreten(h):
indices = [i for (i,v) in enumerate(h) if v == 10]
if len(indices) > 1:
del h[indices[1]] # The second index of 10 is at indices[1]
Clean, but will unnecessarily iterate past the second 10 and collect as many indices of 10s are there are. Not likely a huge issue, but worth pointing out.
Collect indices of 10s, remove second (v2, from comments)
def ignoreten(h):
indices = [i for (i,v) in enumerate(h) if v == 10]
for i in reversed(indices[1:]):
del h[i]
From your comment asking about removing all non-initial occurrences of 10, if you're looking for in-place modification of h, then you almost definitely want something like this.
The first line collects all the indices of 10 into a list.
The second line is a bit tricky, but working inside-out it:
[1:] "throws out" the first element of that list (since you want to keep the first occurrence of 10)
reversed iterates over that list backwards
del h[i] removes the values at those indices.
The reason we iterate backwards is because doing so won't invalidate the rest of our indices that we've yet to delete.
In other words, if the list h was [1, 10, 2, 10, 3, 10], our indices list would be [1, 3, 5].
In both cases we skip 1, fine.
But if we iterate forwards, once we delete 3, and our list shrinks to 5 elements, when we go to delete 5 we get an IndexError.
Even if we didn't go out of bounds to cause an IndexError, our elements would shift and we'd be deleting incorrect values.
So instead, we iterate backwards over our indices, delete 5, the list shrinks to 5 elements, and index 3 is still valid (and still 10).
With list.index
def ignoreten(h):
try:
second_ten = h.index(10, h.index(10)+1)
del h[second_ten]
except ValueError:
pass
The inner .index call finds the first occurrence, the second uses the optional begin parameter to start searching after that. Wrapped in try/except in case there are less than two occurrences.
⇒ Personally, I'd prefer these in the opposite order of how they're listed.
[1] You're iterating over a weird subset of the list with your arguments to range. You're skipping (not applying your "is 10" logic to) the first and last two elements this way.
Bonus: Walrus abuse
(don't do this)
def ignoreten(h):
x = 0
return [v for v in h if v != 10 or (x := x + 1) != 1]
(unlike the previous versions that operated on h in-place, this creates a new list without the second occurrence of 10)
But the walrus operator is contentious enough already, please don't let this code out in the wild. Really.

Write an generator/iterator expression for this sequence

I have this exercise that I fail to understand
Suppose we are given a list X of integers. We need to construct a sequence of indices (positions) of the elements in this list equal to the maximal element. The indicies in the sequence are in the ascending order.
Hint use the enumerator function
from typing import Iterator
X = [1,10,3,4,10,5]
S : Iterator[int] = YOUR_EXPRESSION
assert list(S)==[1,4]
This is the only thing I could come up with, but for sure it does not return [1,4]
If you wondering what I don't understand, it is not clear from reading the description how it could return [1,4].
Maybe you want to try to explain that to me first...
This is my (wrong) solution
my_enumerate=enumerate (X)
my_enumerate=(list(my_enumerate))
my_enumerate.sort(reverse=True)
So you have the list X containing [1,10,3,4,10,5]. The maximal, or largest, element is 10. Which means we should return a list of all the indices where we find 10. There are two 10s at index 1 and 4 respectively.
Using enumerate you get at each iteration the index and element. You can use that to filter out the elements you don't need. List comprehensions are useful in this case, allowing for filtering with the if syntax i.e. [val for val in items if some_condition]
you can use a generator like this
max_val=max(X)
s = (i for i, v in enumerate(X) if v==max_val)
This is my solution
( x[0] for x in enumerate (X) if x[1] == max(X) )
this is the book solution
(i for (i, n) in enumerate(X) if n == max(X))
This requires two steps:
Determine the maximum value with max
Iterate the indices of your list and retain those that have this maximum value
To avoid a bad time complexity, it is necessary to not repeat the first step:
S : Iterator[int] = (lambda mx:
(i for i, x in enumerate(X) if x == mx)
)(max(X))
The reason for presenting the code in such ugly expression, is that in the question it seems a requirement to follow the template, and only alter the part that is marked with "YOUR_EXPRESSION".
This is not how you would write it without such artificial constraints. You would just do mx = max(X) and then assign the iterator to S in the next statement without the need for this inline lambda.

In this short recursive function `list_sum(aList)`, the finish condition is `if not aList: return 0`. I see no logic in why this condition works

I am learning the recursive functions. I completed an exercise, but in a different way than proposed.
"Write a recursive function which takes a list argument and returns the sum of its integers."
L = [0, 1, 2, 3, 4] # The sum of elements will be 10
My solution is:
def list_sum(aList):
count = len(aList)
if count == 0:
return 0
count -= 1
return aList[0] + list_sum(aList[1:])
The proposed solution is:
def proposed_sum(aList):
if not aList:
return 0
return aList[0] + proposed_sum(aList[1:])
My solution is very clear in how it works.
The proposed solution is shorter, but it is not clear for me why does the function work. How does if not aList even happen? I mean, how would the rest of the code fulfill a not aList, if not aList means it checks for True/False, but how is it True/False here?
I understand that return 0 causes the recursion to stop.
As a side note, executing without if not aList throws IndexError: list index out of range.
Also, timeit-1million says my function is slower. It takes 3.32 seconds while the proposed takes 2.26. Which means I gotta understand the proposed solution.
On the call of the function, aList will have no elements. Or in other words, the only element it has is null. A list is like a string or array. When you create a variable you reserve some space in the memory for it. Lists and such have a null on the very last position which marks the end so nothing can be stored after that point. You keep cutting the first element in the list, so the only thing left is the null. When you reach it you know you're done.
If you don't use that condition the function will try to take a number that doesn't exist, so it throws that error.
You are counting the items in the list, and the proposed one check if it's empty with if not aList this is equals to len(aList) == 0, so both of you use the same logic.
But, you're doing count -= 1, this has no sense since when you use recursion, you pass the list quiting one element, so here you lose some time.
According to PEP 8, this is the proper way:
• For sequences, (strings, lists, tuples), use the fact that empty
sequences are false.
Yes: if not seq:
if seq:
No: if len(seq)
if not len(seq)
Here is my amateur thougts about why:
This implicit check will be faster than calling len, since len is a function to get the length of a collection, it works by calling an object's __len__ method. This will find up there is no item to check __len__.
So both will find up there is no item there, but one does it directly.
not aList
return True if there is no elements in aList. That if statement in the solution covers edge case and checks if input parameter is not empty list.
For understand this function, let's run it step by step :
step 0 :
L=[0,1,2,3,4]
proposed_sum([0,1,2,3,4])
L != []
return l[0] + proposed_sum([1,2,3,4])
step 1 calcul proposed_sum([1,2,3,4]):
proposed_sum([1,2,3,4])
L != []
return l[0] + sum([2,3,4])
step 2 calcul proposed_sum([2,3,4]):
proposed_sum([2,3,4])
L != []
return l[0] + sum([3,4])
step 3 calcul proposed_sum([3,4]):
proposed_sum([3,4])
L != []
return l[0] + sum([4])
step 4 calcul proposed_sum([4]):
proposed_sum([4])
L != []
return l[0] + sum([])
step 5 calcul proposed_sum([]):
proposed_sum([])
L == []
return 0
step 6 replace:
proposed_sum([0,1,2,3,4])
By
proposed_sum([]) + proposed_sum([4]) + proposed_sum([3,4]) + proposed_sum([2,3,4]) + proposed_sum([1,2,3,4])+ proposed_sum([0,1,2,3,4])
=
(0) + 4 + 3 + 2 + 1 + 0
Python considers as False multiple values:
False (of course)
0
None
empty collections (dictionaries, lists, tuples)
empty strings ('', "", '''''', """""", r'', u"", etc...)
any other object whose __nonzero__ method returns False
in your case, the list is evaluated as a boolean. If it is empty, it is considered as False, else it is considered as True. This is just a shorter way to write if len(aList) == 0:
in addition, concerning your new question in the comments, consider the last line of your function:
return aList[0] + proposed_sum(aList[1:])
This line call a new "instance" of the function but with a subset of the original list (the original list minus the first element). At each recursion, the list passed in argument looses an element and after a certain amount of recursions, the passed list is empty.

Returning the index of the largest element in an array in Python

I'm trying to create a function that returns the largest element of an array, I feel I have the correct code but my syntax is in the wrong order, I'm trying to use a for/while loop in order to do so. So far I have the following:
def manindex(arg):
ans = 0
while True:
for i in range (len(arg)):
if arg[i] > arg[ans]:
pass
ans = i
return ans
Not sure where I'm going wrong if anyone could provide some guidance, thanks
EDIT: So it's been pointing out I'm causing an infinite loop so if I take out the while statement I'm left with
def manindex(arg):
ans = 0
for i in range (len(arg)):
if arg[i] > arg[ans]:
ans = i
return ans
But I have a feeling it's still not correct
When you say array I think you mean list in Python, you don't need a for/loop or while/loop to achieve this at all.
You can also use index with max, like so:
xs.index(max(xs))
sample:
xs = [1,123,12,234,34,23,42,34]
xs.index(max(xs))
3
You could use max with the key parameter set to seq.__getitem__:
def argmax(seq):
return max(range(len(seq)), key=seq.__getitem__)
print(argmax([0,1,2,3,100,4,5]))
yields
4
The idea behind finding the largest index is always the same, iterating over the elements of the array, compare to the max value we have at the moment, if it's better, the index of the current element is the maximum now, if it's not, we keep looking for it.
enumerate approach:
def max_element_index(items):
max_index, max_value = None, None
for index, item in enumerate(items):
if item > max_value:
max_index, max_value = index, item
return max_index
functional approach:
def max_element_index(items):
return reduce(lambda x,y: x[1] > y[1] and x or y,
enumerate(items), (None, None))[0]
At the risk of looking cryptic, the functional approach uses the reduce function which takes two elements and decides what is the reduction. Those elements are tuples (index, element), which are the result of the enumerate function.
The reduce function, defined on the lambda body takes two elements and return the tuple of the largest. As the reduce function reduces until only one element in the result is encountered, the champion is the tuple containing the index of the largest and the largest element, so we only need to access the 0-index of the tuple to get the element.
On the other hand if the list is empty, None object is returned, which is granted on the third parameter of the reduce function.
Before I write a long winded explanation, let me give you the solution:
index, value = max(enumerate(list1), key=lambda x: x[1])
One line, efficient (single pass O(n)), and readable (I think).
Explanation
In general, it's a good idea to use as much of python's incredibly powerful built-in functions as possible.
In this instance, the two key functions are enumerate() and max().
enumerate() converts a list (or actually any iterable) into a sequence of indices and values. e.g.
>>> list1 = ['apple', 'banana', 'cherry']
>>> for tup in enumerate(list1):
... print tup
...
(0, 'apple')
(1, 'banana')
(2, 'cherry')
max() takes an iterable and returns the maximum element. Unfortunately, max(enumerate(list1)) doesn't work, because max() will sort based on the first element of the tuple created by enumerate(), which sadly is the index.
One lesser-known feature of max() is that it can take a second argument in the form max(list1, key=something). The key is a function that can be applied to each value in the list, and the output of that function is what gets used to determine the maximum. We can use this feature to tell max() that it should be ranking items by the second item of each tuple, which is the value contained in the list.
Combining enumerate() and max() with key (plus a little help from lambda to create a function that returns the second element of a tuple) gives you this solution.
index, value = max(enumerate(list1), key=lambda x: x[1])
I came up with this recently (and am sprinkling it everywhere in my code) after watching Raymond Hettinger's talk on Transforming Code into Beautiful, Idiomatic Python, where he suggests exorcising the for i in xrange(len(list1)): pattern from your code.
Alternatively, without resorting to lambda (Thanks #sweeneyrod!):
from operator import itemgetter
index, value = max(enumerate(list1), key=itemgetter(1))
I believe if you change your for loop to....
for i in range (len(arg)):
if arg[i] > ans:
ans = arg[i]
it should work.
You could try something like this. If the list is empty, then the function will return an error.
m is set to the first element of the list, we then iterate over the list comparing the value at ever step.
def findMax(xs):
m = xs[0]
for x in xs:
if x > m:
m = x
return m
findMax([]) # error
findMax([1]) # 1
findMax([2,1]) # 2
if you wanted to use a for loop and make it more generic, then:
def findGeneric(pred, xs):
m = xs[0]
for x in xs:
if pred(x,m):
m = x
return m
findGeneric(lambda a,b: len(a) > len(b), [[1],[1,1,1,1],[1,1]]) # [1,1,1,1]

In Python, how can I find the index of the first item in a list that is NOT some value?

Python's list type has an index(x) method. It takes a single parameter x, and returns the (integer) index of the first item in the list that has the value x.
Basically, I need to invert the index(x) method. I need to get the index of the first value in a list that does NOT have the value x. I would probably be able to even just use a function that returns the index of the first item with a value != None.
I can think of a 'for' loop implementation with an incrementing counter variable, but I feel like I'm missing something. Is there an existing method, or a one-line Python construction that can handle this?
In my program, the situation comes up when I'm handling lists returned from complex regex matches. All but one item in each list have a value of None. If I just needed the matched string, I could use a list comprehension like '[x for x in [my_list] if x is not None]', but I need the index in order to figure out which capture group in my regex actually caused the match.
Exiting at the first match is really easy: instead of computing a full list comprehension (then tossing away everything except the first item), use next over a genexp. Assuming for example that you want -1 when no item satisfies the condition of being != x,
return next((i for i, v in enumerate(L) if v != x), -1)
This is Python 2.6 syntax; if you're stuck with 2.5 or earlier, .next() is a method of the genexp (or other iterator) and doesn't accept a default value like the -1 above (so if you don't want to see a StopIteration exception you'll have to use a try/except). But then, there is a reason more releases were made after 2.5 -- continuous improvement of the language and its built-ins!-)
Using a list comprehension when you only need the first just feels slimy (to me). Use a for-loop and exit early.
>>> lst = [None, None, None, "foo", None]
>>> for i, item in enumerate(lst):
... if item: break
... else:
... print "not found"
...
>>> i
3
enumerate() returns an iterator that yields a tuple of the current index of the iterable as well as the item itself.
[i for i, x in enumerate(my_list) if x != value][0]
If you're not sure whether there's a non-matching item, use this instead:
match = [i for i, x in enumerate(my_list) if x != value]
if match:
i = match[0]
# i is your number.
You can make this even more "functional" with itertools, but you will soon reach the point where a simple for loop is better. Even the above solutions aren't as efficient as a for loop, since they construct a list of all non-matching indices before you pull the one of interest.
A silly itertools-based solution:)
import itertools as it, operator as op, functools as ft
def index_ne(item, sequence):
sequence= iter(sequence)
counter= it.count(-1) # start counting at -1
pairs= it.izip(sequence, counter) # pair them
get_1st= it.imap(op.itemgetter(0), pairs) # drop the used counter value
ne_scanner= it.ifilter(ft.partial(op.ne, item), get_1st) # get only not-equals
try:
ne_scanner.next() # this should be the first not equal
except StopIteration:
return None # or raise some exception, all items equal to item
else:
return counter.next() # should be the index of the not-equal item
if __name__ == "__main__":
import random
test_data= [0]*20
print "failure", index_ne(0, test_data)
index= random.randrange(len(test_data))
test_data[index]= 1
print "success:", index_ne(0, test_data), "should be", index
All this just to take advantage of the itertools.count counting :)

Categories