Python: Why do lists not have a find method?

I was trying to write an answer to this question and was quite surprised to find out that lists have no find method; they have only the index method (strings have both find and index).
Can anyone tell me the rationale behind that?
Why do strings have both?

I don't know why (maybe it is buried in some PEP somewhere), but I do know two very basic ways to "find" something in a list: list.index() and the in operator. You can always use these two to find your items. (Also the re module, etc.)

I think the rationale for not having separate 'find' and 'index' methods is they're not different enough. Both would return the same thing in the case the sought item exists in the list (this is true of the two string methods); they differ in case the sought item is not in the list/string; however you can trivially build either one of find/index from the other. If you're coming from other languages, it may seem bad manners to raise and catch exceptions for a non-error condition that you could easily test for, but in Python, it's often considered more pythonic to shoot first and ask questions later, er, to use exception handling instead of tests like this (example: Better to 'try' something and catch the exception or test if its possible first to avoid an exception?).
I don't think it's a good idea to build 'find' out of 'index' and 'in', like
if foo in my_list:
    foo_index = my_list.index(foo)
else:
    foo_index = -1 # or do whatever else you want
because both in and index will require an O(n) pass over the list.
Better to build 'find' out of 'index' and try/except, like:
try:
    foo_index = my_list.index(foo)
except ValueError:
    foo_index = -1 # or do whatever else you want
Now, as to why list was built this way (with only index), and string was built the other way (with separate index and find)... I can't say.

The "find" method for lists is index.
I do consider the inconsistency between string.find and list.index to be unfortunate, both in name and behavior: string.find returns -1 when no match is found, where list.index raises ValueError. This could have been designed more consistently. The only irreconcilable difference between these operations is that string.find searches for a string of items, where list.index searches for exactly one item (which, alone, doesn't justify using different names).
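To make the difference concrete, here is a minimal sketch of the behaviours described above. The helper name find_in_list is made up for illustration and is not part of the standard library; it simply gives a list the -1-on-miss behaviour of str.find.

```python
def find_in_list(items, target):
    """Return the index of target in items, or -1 if absent (mirrors str.find)."""
    try:
        return items.index(target)
    except ValueError:
        return -1


text = "hello"
letters = list(text)

# Strings offer both flavours of search.
str_find_miss = text.find("z")  # no exception on a miss
try:
    text.index("z")
    str_index_raised = False
except ValueError:
    str_index_raised = True

# Lists only offer index(), so the helper wraps it.
list_hit = find_in_list(letters, "l")
list_miss = find_in_list(letters, "z")
```

Because the helper does a single index() call, it makes only one pass over the list, unlike the in-then-index version.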

Related

Idiomatic way to match against list of munch.munch objects?

I am using the openstack shade library to manage our openstack stacks. One task is to list all stacks owned by a user (for example to then allow for deletion of them).
The shade library call list_stacks() returns a list of munch.Munch objects, and basically I want to identify that stack object that has either an 'id' or 'name' matching some user provided input.
I came up with this code here:
def __find_stack(self, connection, stack_info):
    stacks = connection.list_stacks()
    for stack in stacks:
        if stack_info in stack.values():
            return stack
    return None
But it feels clumsy, and I am wondering if there is a more idiomatic way to solve this in python? (stack_info is a simple string, either the "name" or "id", in other words: it might match this or that entry within the "dict" values of the munched stack objects)

As my comment suggests, I don't really think there is something to improve.
However, performance-wise, you could use filter to push the loop down to C level which may be beneficial if there are a lot of stacks.
Readability-wise, I don't think that you would gain much.
def __find_stack(self, connection, stack_info):
    stacks = connection.list_stacks()
    return list(filter(lambda stack: stack_info in stack.values(), stacks))
However this approach is not "short-circuited". Your original code stops when it finds a match, and this one will not, so in theory you will get more than one match if they exist (or an empty list in case there is no match).
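If the short-circuit behaviour matters, one possible middle ground is next() over a generator expression, which stops at the first match and falls back to a default. This is only a sketch: plain dicts stand in for the munch.Munch objects (Munch is a dict subclass), and find_stack is a free function rather than the method from the question.

```python
def find_stack(stacks, stack_info):
    """Return the first stack whose values contain stack_info, else None."""
    return next((stack for stack in stacks if stack_info in stack.values()), None)


# Made-up sample data standing in for connection.list_stacks().
stacks = [
    {"id": "a1", "name": "web"},
    {"id": "b2", "name": "db"},
]
match = find_stack(stacks, "db")
missing = find_stack(stacks, "nope")
```

The generator is consumed lazily, so stacks after the first match are never inspected.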

Dynamically nesting a list, and related comprehension/mapping to find indices of string match

The context of what I'm doing: I'm translating if/then/else statements between 2 languages via a Python script (Python 2.x for now, but I may eventually upgrade to 3.x). I have a function that takes the if/then/else statement from the original language and breaks it into a list of [if_clause,then_clause,else_clause]. The thing is, there may be (and often are) nested if statements in the then and/or else clauses. For example, I would pass a string like...
if (sim_time<=1242) then (new_tmaxF0740) else if (sim_time<=2338) then (new_tmaxF4170) else (new_tmaxF7100)
...to my function, and it would return the list...
['(sim_time<=1242)','(new_tmaxF0740)','if (sim_time<=2338) then (new_tmaxF4170) else (new_tmaxF7100)']
So, as you can see, in this case the else clause needs to be further broken up by running it again through the same function I used to generate the list, this time only passing the last list element to that function. I am going about this by testing the original string to see if there are more than 1 if statements contained (I already have the regex for this) and my thought is to use a loop to create nested lists within the original list, that might then look like...
[if_clause,then_clause,[if_clause, then_clause, else_clause]]
These can be nested any number of times/to any dimension. My plan so far is to write a loop that looks for the next nested if statement (using a regex), and reassigns the list index where the if statement is found to the resultant list from applying my if_extract() function to break up the statement.
I feel like list comprehension may not do this, because to find the indices, it seems like the list comprehension statement might have to dynamically change. Maybe better suited for map, but I'm not sure how to apply? I ultimately want to iterate through the loop to return the index of the next (however deeply nested) if statement so I can continue breaking them apart with my function.

If I understand correctly, you could call your function recursively.
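def split_if_then_else(text):
    if check_if_if_in_string_function(text):
        if_clause, then_clause, else_clause = split_str_core_function(text)
        # Recurse into each branch, since either one may hold a nested if
        then_clause = split_if_then_else(then_clause)
        else_clause = split_if_then_else(else_clause)
        return [if_clause, then_clause, else_clause]
    else:
        return text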
I didn't test it since I don't know what functions you are using exactly, but I think something like this should work
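For a runnable illustration of the recursive idea, here is a sketch that swaps in a simple regex for the asker's own helper functions. The pattern is an assumption, not the asker's code: it only handles if- and then-clauses that are single parenthesized expressions without nested parentheses, so in practice only the else clause can nest.

```python
import re

# Hypothetical pattern: "if (...) then (...) else <rest>".
IF_PATTERN = re.compile(r"^if\s*(\(.*?\))\s*then\s*(\(.*?\))\s*else\s*(.*)$")


def split_if_then_else(text):
    """Recursively split 'if (..) then (..) else ..' into nested lists."""
    text = text.strip()
    match = IF_PATTERN.match(text)
    if match is None:
        return text  # base case: nothing left to split
    if_clause, then_clause, else_clause = match.groups()
    # The then clause cannot nest under this simple pattern, but recursing
    # on both branches keeps the shape general.
    return [if_clause,
            split_if_then_else(then_clause),
            split_if_then_else(else_clause)]


statement = ("if (sim_time<=1242) then (new_tmaxF0740) "
             "else if (sim_time<=2338) then (new_tmaxF4170) else (new_tmaxF7100)")
nested = split_if_then_else(statement)
```

Each recursive call returns either a plain string or a three-element list, so the nesting depth follows the input without any index bookkeeping.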

How to access an attribute of an arbitrary element of a set

I have a non-empty set S and every s in S has an attribute s.x which I know is independent of the choice of s. I'd like to extract this common value a=s.x from S. There is surely something better than
s=S.pop()
a=s.x
S.add(s)
-- maybe that code is fast but surely I shouldn't be changing S?
Clarification: some answers and comments suggest iterating over all of S. The reason I want to avoid this is that S might be huge; my method above will I think run quickly however large S is; my only issue with it is that S changes, and I see no reason that I need to change S.

This is almost but not quite the same as this question on getting access to an element of a set when there's only one-- there are solutions which apply there which won't work here, and others which work but are inefficient. But the general trick of using next(iter(something_iterable)) to nondestructively get an element still applies:
>>> S = {1+2j, 2+2j, 3+2j}
>>> next(iter(S))
(2+2j) # Note: could have been any element
>>> next(iter(S)).imag
2.0
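Applied to the attribute case from the question, the same trick might look like the sketch below. The Item class and its values are invented for illustration; the only assumption carried over from the question is that every element has the same x.

```python
class Item:
    """Hypothetical element type; every instance in the set shares the same x."""

    def __init__(self, name, x):
        self.name = name
        self.x = x


S = {Item("a", 42), Item("b", 42), Item("c", 42)}
size_before = len(S)

# iter(S) creates an iterator without copying S; next() pulls one element.
a = next(iter(S)).x
```

Only one element is ever touched, so the cost does not grow with the size of S, and S itself is left unchanged.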

How to delete elements in a list based on another list?

Suppose I have a list called icecream_flavours, and two lists called new_flavours and unavailable. I want to remove the elements in icecream_flavours that appear in unavailable, and add those in new_flavours to the original one. I wrote the following program:
for i in unavailable:
    icecream_flavours.remove(i)
for j in new_flavours:
    icecream_flavours.append(j)
The append part is fine, but it keeps showing 'ValueError: list.remove(x): x not in list' for the first part of the program. What's the problem?
thanks

There are two possibilities here.
First, maybe there should never be anything in unavailable that wasn't in icecream_flavours, but, because of some bug elsewhere in your program, that isn't true. In that case, you're going to need to debug where things first go wrong, whether by running under the debugger or by adding print calls all over the code. At any rate, since the problem is most likely in code that you haven't shown us here, we can't help if that's the problem.
Alternatively, maybe it's completely reasonable for something to appear in unavailable even though it's not in icecream_flavours, and in that case you just want to ignore it.
That's easy to do, you just need to write the code that does it. As the docs for list.remove explain, it:
raises ValueError when x is not found in s.
So, if you want to ignore cases when i is not found in icecream_flavours, just use a try/except:
for i in unavailable:
    try:
        icecream_flavours.remove(i)
    except ValueError:
        # We already didn't have that one... which is fine
        pass
That being said, there are better ways to organize your code.
First, using the right data structure always makes things easier. Assuming you don't want duplicate flavors, and the order of flavors doesn't matter, what you really want here is sets, not lists. And if you had sets, this would be trivial:
icecream_flavours -= unavailable
icecream_flavours |= new_flavours
Even if you can't do that, it's usually simpler to create a new list than to mutate one in-place:
unavailable_set = set(unavailable)  # build the set once, not once per element
icecream_flavours = [flavour for flavour in icecream_flavours
                     if flavour not in unavailable_set]
(Notice that I converted unavailable to a set, so we don't have to brute-force search for each flavor in a list.)
Either one of these changes makes the code shorter, and makes it more efficient. But, more importantly, they both make the code easier to reason about, and eliminate the possibility of bugs like the one you're trying to fix.
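As a rough end-to-end sketch of both suggestions, with made-up sample flavours (none of the data below comes from the question), note how an unavailable flavour that was never in stock causes no error in either version:

```python
icecream_flavours = ["vanilla", "chocolate", "mint"]
unavailable = ["mint", "pistachio"]  # "pistachio" was never in stock
new_flavours = ["mango"]

# Set version: order is lost and duplicates collapse.
flavour_set = set(icecream_flavours)
flavour_set -= set(unavailable)  # absent items are silently ignored
flavour_set |= set(new_flavours)

# List version: keeps the original order.
unavailable_set = set(unavailable)
flavour_list = [f for f in icecream_flavours if f not in unavailable_set]
flavour_list += new_flavours
```

The set version is the shortest, while the list version is the one to pick when the order of flavours matters.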

To add all the new_flavours that are not unavailable, you can use a list comprehension, then use the += operator to add it to the existing flavors.
icecream_flavours += [i for i in new_flavours if i not in unavailable]
If there are already flavors in the original list you want to remove, you can remove them in the same way
icecream_flavours = [i for i in icecream_flavours if i not in unavailable]

If you first want to remove all the unavailable flavours from icecream_flavours and then add the new flavours, you can use this list comprehension:
icecream_flavours = [i for i in icecream_flavours if i not in unavailable] + new_flavours

Your error is caused because unavailable contains flavours that are not in icecream_flavours.
Unless order is important, you could use set instead of list, as sets have operations for differences and unions and you don't need to worry about duplicates.
If you must use lists, a list comprehension is a better way to filter the list:
icecream_flavours = [x for x in icecream_flavours if x not in unavailable]
You can extend the list of flavours like this
icecream_flavours += new_flavours
assuming there are no duplicates.

Python Extension Returned Object Etiquette

I am writing a python extension to provide access to Solaris kstat data ( in the same spirit as the shipping perl library Sun::Solaris::Kstat ) and I have a question about conditionally returning a list or a single object. The python use case would look something like:
cpu_stats = cKstats.lookup(module='cpu_stat')
cpu_stat0 = cKstats.lookup('cpu_stat',0,'cpu_stat0')
As it's currently implemented, lookup() returns a list of all kstat objects which match. The first case would result in a list of objects ( as many as there are CPUs ) and the second call specifies a single kstat completely and would return a list containing one kstat.
My question: is it poor form to return a single object when there is only one match, and a list when there are many?
Thank you for the thoughtful answer! My python-fu is weak but growing stronger due to folks like you.

"My question is it poor form to return a single object when there is only one match, and a list when there are many?"
It's poor form to return inconsistent types.
Return a consistent type: List of kstat.
Most Pythonistas don't like using type(result) to determine if it's a kstat or a list of kstats.
We'd rather check the length of the list in a simple, consistent way.
Also, if the length depends on a piece of system information, perhaps an API method could provide this metadata.
Look at DB-API PEP for advice and ideas on how to handle query-like things.
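A pure-Python sketch of the "always return a list" advice might look like this. Everything here is a hypothetical stand-in: the Kstat class, the sample data, and the lookup signature only imitate the calls shown in the question and are not the real extension.

```python
class Kstat:
    """Hypothetical stand-in for a kstat object."""

    def __init__(self, module, instance, name):
        self.module = module
        self.instance = instance
        self.name = name


# Made-up sample data; a real extension would read this from the kernel.
_ALL_KSTATS = [
    Kstat("cpu_stat", 0, "cpu_stat0"),
    Kstat("cpu_stat", 1, "cpu_stat1"),
]


def lookup(module=None, instance=None, name=None):
    """Always return a list of matches, even when there are zero or one."""
    return [
        k for k in _ALL_KSTATS
        if (module is None or k.module == module)
        and (instance is None or k.instance == instance)
        and (name is None or k.name == name)
    ]


cpu_stats = lookup(module="cpu_stat")
cpu_stat0 = lookup("cpu_stat", 0, "cpu_stat0")
```

Callers can then treat every result the same way, for example by looping over it or checking len(result), without inspecting its type first.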
