Detect empty list vs list of empty strings

I am new to Python, and I have searched for a solution for detecting an empty list. Say we have an empty string, like:
a = ''
if not a:
    print('a is empty')
This works fine. However, my problem arises when:
a = ['','','']
if not a:
    print('a is empty')
This one does not work. What is the Pythonic way of detecting a, which I guess is a list containing empty strings in the above case?
Thanks in advance for your comments and suggestions.

A list is only empty if it has no members. If it has members, but they're empty, it still has members, so it's still truthy.
If you want to test whether all of the members of a list are empty, you can write it like this:
if all(not element for element in a):
    print('a is made of empty things')
The all function should be pretty obvious. The argument to it might not be. It's a generator expression—if you've never seen one, first read about list comprehensions, then iterators and the following two sections.
Or you can turn it around and test whether not any of the members are not empty:
if not any(a):
    print('a is made of empty things')
The double negative seems a bit harder to understand. But on the other hand, it means you don't need to understand a generator expression (because (element for element in a) is just the same thing as a, so we can keep it simple).
(And, as PM 2Ring points out, this one is probably a little faster, although probably not enough to matter for most uses.)
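To see the two checks side by side, here is a minimal sketch. The helper names all_empty and none_truthy are made up for illustration. Note that both checks also report True for a list with no members at all, so combine them with a plain truthiness test if that case has to be told apart.

```python
def all_empty(items):
    """True when every member is falsy (also True for a list with no members)."""
    return all(not element for element in items)


def none_truthy(items):
    """The same test phrased as a double negative: no member is truthy."""
    return not any(items)


samples = [['', '', ''], ['', 'x', ''], []]
for sample in samples:
    # Columns: the list, its own truthiness, and the two member checks.
    print(sample, bool(sample), all_empty(sample), none_truthy(sample))
```

A list such as ['', '', ''] is truthy itself, yet both helpers return True for it.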

Related

Zen of Python 'Explicit is better than implicit'

I'm trying to understand what 'implicit' and 'explicit' really means in the context of Python.
a = []
# my understanding is that this is implicit
if not a:
    print("list is empty")
# my understanding is that this is explicit
if len(a) == 0:
    print("list is empty")
I'm trying to follow the Zen of Python rules, but I'm curious to know if this applies in this situation or if I am over-thinking it?

The two statements have very different semantics. Remember that Python is dynamically typed.
For the case where a = [], both not a and len(a) == 0 are equivalent. A valid alternative might be to check not len(a). In some cases, you may even want to check for both emptiness and listness by doing a == [].
But a can be anything. For example, a = None. The check not a is fine, and will return True. But len(a) == 0 will not be fine at all. Instead you will get TypeError: object of type 'NoneType' has no len(). This is a totally valid option, but the if statements do very different things and you have to pick which one you want.
(Almost) everything in Python has a truth value (via __bool__, or __len__ as a fallback), but not everything has __len__. You have to decide which one to use based on the situation. Things to consider are:
Have you already verified whether a is a sequence?
Do you need to?
Do you mind if your if statement crashes on non-sequences?
Do you want to handle other falsy objects as if they were empty lists?
Remember that making the code look pretty takes second place to getting the job done correctly.
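The difference between the two checks is easy to see by running both against a few values. This is only a sketch; the function name describe is an assumption made for the example.

```python
def describe(value):
    """Report how the two emptiness checks behave for one value."""
    truthiness = not value  # works for any object
    try:
        length_check = len(value) == 0  # needs the object to define __len__
    except TypeError as exc:
        length_check = "TypeError: {}".format(exc)
    return truthiness, length_check


for candidate in ([], [0], None, 0, ""):
    print(repr(candidate), describe(candidate))
```

For None and 0, not value is True while the len() check raises TypeError, which is exactly the choice described above.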

Though this question is old, I'd like to offer a perspective.
In a dynamic language, my preference would be to always describe the expected type and purpose of a variable, to make its intent easier to understand. Then use your knowledge of the language to be succinct and increase readability where possible (in Python, an empty list is falsy). Thus the code:
lst_colours = []
if not lst_colours:
    print("list is empty")
Even better for conveying meaning is to use a variable for very specific checks.
lst_colours = []
b_is_list_empty = not lst_colours
if b_is_list_empty:
    print("list is empty")
Checking whether a list is empty is a common thing to do several times in a code base, so it is even better to put such checks in a separate helper-function library, isolating common checks and reducing code duplication.
lst_colours = []
if b_is_list_empty(lst_colours):
    print("list is empty")
def b_is_list_empty (lst):
    ......
Most importantly, add as much meaning as possible, and have an agreed company standard for how to tackle the simple things, like variable naming and implicit/explicit code choices.

Try to think of:
if not a:
    ...
as shorthand for:
if len(a) == 0:
    ...
I don't think this is a good example of a gotcha with the Zen of Python's "explicit is better than implicit" rule. The short form is used mostly for readability. It's not that the second one is bad and the first is good; the first is just more idiomatic. If you understand the boolean nature of lists in Python, I think you'll find the first more readable, and readability counts in Python.

Class wordplay with four methods

I have the following problem and two very important questions.
Write a class called Wordplay. It should have a field that holds a list of words. The user of the class should pass the list of words they want to use to the class. There should be the following methods:
words_with_length(length) — returns a list of all the words of length length
starts_with(s) — returns a list of all the words that start with s
ends_with(s) — returns a list of all the words that end with s
palindromes() — returns a list of all the palindromes in the list
First problem: when I run my program, the methods starts_with_s and ends_with_s return the same word.
Next problem: in this case I have created a list of three names. But what if I wanted to ask for a list size and then loop that many times, asking the user to input a word each time? How can I implement that idea?
class Wordplay:
  def __init__(self):
    self.words_list=[]
  def words_with_lenght(self,lenght):
    for i in range(0,len(self.words_list)-1):
      if len(self.words_list[i])==lenght:
        return self.words_list[i]
  def starts_with_s(self,s):
    for i in range(0,len(self.words_list)-1):
      if s.startswith('s')==True:
        return self.words_list[i]
  def ends_with_s(self,s):
    for i in range(0,len(self.words_list)-1):
      if s.endswith('s')==True:
        return self.words_list[i]
  def palindromes(self):
    for i in range(0,len(self.words_list)-1):
      normal_word=self.words_list[i]
      reversed_word=normal_word[::-1]
      if reversed_word==normal_word:
        return reversed_word
verification=Wordplay()
verification.words_list=['sandro','abba','luis']
lenght=int(input('Digit size you want to compare\n'))
s='s'
print(verification.words_with_lenght(lenght))
print(verification.starts_with_s(s))
print(verification.ends_with_s(s))
print(verification.palindromes())
If I input, for example, size 4, I expect the result to be:
abba,luis ; sandro ; luis ; abba
and not:
abba ; sandro ; sandro ; abba

In the line if s.startswith('s')==True:, you've passed the string "s" into the function resulting in
if 's'.startswith('s')==True:
#  ^^^
    return self.words_list[i]
This conditional is always true. You probably don't need a parameter here at all since the assignment asks you to hard code "s". You can use:
if self.words_list[i].startswith('s'):
    return self.words_list[i]
Notice the above example uses a return as soon as a match is found. This is a problem. The loops in this program break early, returning from the function as soon as a single match is located. You may have intended to append each successful match to a list and return the resulting list or use the yield keyword to return a generator (but the caller would need to use list() if they want a persistent list from the generator). Using a list to build a result would look like:
result = []
for i in range(len(self.words_list)):
    if self.words_list[i].startswith('s'):
        result.append(self.words_list[i])
return result
Another issue: the loops in this program don't iterate all the way through their respective lists. The range() function is inclusive of the start and exclusive of the end, so you likely intended range(len(self.words_list)) instead of range(0, len(self.words_list) - 1).
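The off-by-one is easy to confirm with the three-word list from the question. A small sketch (the variable names short_indexes and full_indexes are only for illustration):

```python
words = ['sandro', 'abba', 'luis']

# The original bound stops one short, so index 2 ('luis') is never visited.
short_indexes = list(range(0, len(words) - 1))
full_indexes = list(range(len(words)))

print(short_indexes)  # [0, 1]
print(full_indexes)   # [0, 1, 2]
print([words[i] for i in short_indexes])  # ['sandro', 'abba']
```

This is why 'luis' could never show up in any of the results, regardless of the other bugs.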
Beyond that, there are a number of design and style points I'd like to suggest:
Use horizontal space between operators and use vertical whitespace around blocks.
foo=bar.corge(a,b,c)
if foo==baz:
    return quux
is clearer as
foo = bar.corge(a, b, c)
if foo == baz:
    return quux
Use 4 spaces to indent instead of 2, which makes it easier to quickly determine which code is in which block.
Prefer for element in my_list instead of for i in range(len(my_list)). If you need the index, in most cases you can use for i, elem in enumerate(my_list). Better yet, use list comprehensions to perform filtering operations, which is most of this logic.
There's no need to use if condition == True. if condition is sufficient. You can simplify confusing and inaccurate logic like:
def palindromes(self):
  for i in range(0,len(self.words_list)-1):
    normal_word=self.words_list[i]
    reversed_word=normal_word[::-1]
    if reversed_word==normal_word:
      return reversed_word
to, for example:
def palindromes(self):
    return [word for word in self.words_list if word[::-1] == word]
that is, avoid intermediate variables and indexes whenever possible.
I realize you're probably tied down to the design, but this strikes me as a strange way to write a utility class. It'd be more flexible as static methods that operate on iterables. Typical usage might be like:
from Wordplay import is_palindrome
is_palindrome(some_iterable)
instead of:
wordplay = Wordplay(some_iterable)
wordplay.palindromes()
My rationale is that this class is basically stateless, so it seems odd to impose state when none is needed. This is a bit subjective, but worth noting (if you've ever used the math or random modules, it's the same idea).
The lack of a parameter in the constructor is even weirder; the client of the class has to magically "know" somehow that words_list is the internal variable name they need to assign to in order to populate class state. This variable name should be an implementation detail that the client has no idea about. Failing to provide a parameter in the initialization function, there should be a setter for this field (or just skip internal state entirely).
ends_with_s(self, s) is a silly function; it seems the designer is confused between wanting to write ends_with(self, letter) and ends_with_s(self) (the former is far preferable). What if you want a new letter? Do you need to write dozens of functions for each possible ending character ends_with_a, ends_with_b, ends_with_c, etc? I realize it's just a contrived assignment, but the class still exhibits poor design.
Spelling error: words_with_lenght -> words_with_length.
Here's a general tip on how to build the skill at locating these problems: work in very small chunks and run your program often. It appears that these four functions were written all in one go without testing each function along the way to make sure it worked first. This is apparent because the same mistakes were repeated in all four functions.
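Pulling these points together, one possible shape for the class is sketched below. It assumes the word list is passed to the constructor and that starts_with and ends_with take the prefix or suffix as a parameter, as the assignment text describes; the attribute name words is an arbitrary choice, not part of the assignment.

```python
class Wordplay:
    """Sketch of the assignment's class; the word list is passed to the constructor."""

    def __init__(self, words):
        # Copy the input so later changes to the caller's list don't leak in.
        self.words = list(words)

    def words_with_length(self, length):
        return [word for word in self.words if len(word) == length]

    def starts_with(self, s):
        return [word for word in self.words if word.startswith(s)]

    def ends_with(self, s):
        return [word for word in self.words if word.endswith(s)]

    def palindromes(self):
        return [word for word in self.words if word == word[::-1]]


wordplay = Wordplay(['sandro', 'abba', 'luis'])
print(wordplay.words_with_length(4))  # ['abba', 'luis']
print(wordplay.starts_with('s'))      # ['sandro']
print(wordplay.ends_with('s'))        # ['luis']
print(wordplay.palindromes())         # ['abba']
```

Each method returns a list of every match, which gives the output the question expected for size 4.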

s.endswith('s') compares your input string s ("s") with "s". "s" ends in "s", so it always returns your first entry. Change it to if self.words_list[i].startswith('s'): (same for endswith).
I would recommend changing your for loops to iterate over the words themselves though:
def ends_with_s(self, s):
    for word in self.words_list:
        if word.endswith('s'):
            return word
Entering a list of values as you described:
amount = int(input("How many words? "))
words = [input("Word {}".format(i + 1)) for i in range(amount)]
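The same idea can be wrapped in a small function so it can be tried without typing at a prompt. This is a sketch: the name read_words and its read parameter are assumptions for the example. In real use you would leave read at its default, the built-in input.

```python
def read_words(count, read=input):
    """Collect `count` words, prompting once per word."""
    return [read("Word {}: ".format(i + 1)) for i in range(count)]


# Demo with canned answers instead of keyboard input.
canned = iter(['sandro', 'abba', 'luis'])
words = read_words(3, read=lambda prompt: next(canned))
print(words)  # ['sandro', 'abba', 'luis']
```

The resulting list can then be handed to the class instead of a hard-coded one.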

Pythonic way to add to a set and care about if it worked?

Oftentimes I find that, when working with Python sets, the Pythonic way seems to be absent.
For example, doing something like Dijkstra or A*:
openSet, closedSet = set(nodes), set(nodes)
while openSet:
    walkSet, openSet = openSet, set()
    for node in walkSet:
        for dest in node.destinations():
            if dest.weight() < constraint:
                if dest not in closedSet:
                    closedSet.add(dest)
                    openSet.add(dest)
This is a weakly contrived example; the focus is the last three lines:
if not value in someSet:
    someSet.add(value)
    doAdditionalThings()
Given the Python way of working with, for example, accessing/using values of a dict, I would have expected to be able to do:
try:
    someSet.add(value)
except KeyError:
    continue # well, that's ok then.
doAdditionalThings()
As a C++ programmer, my skin crawls a bit that I can't even do:
if someSet.add(value):
    # add wasn't blocked by the value already being present
    doAdditionalThings()
Is there a more Pythonic (and if possible more efficient) way to work with this sort of set-as-guard usage?

The add operation is not supposed to also tell you if the item was already in the set; it just makes sure it is in there after you add it. Or put another way, what you want is not "add an item and check if it worked"; you want to first check if the item is there, and if not, then do some special stuff. If all you wanted to do was add the item, you wouldn't do the check at all. There is nothing unpythonic about this pattern:
if item not in someSet:
    someSet.add(item)
    doStuff()
else:
    doOtherStuff()
It is true that the API could have been designed so that .add returned whether the item was already in there, but in my experience that's not a particularly common use case. Part of the point of sets is that you can freely add items without worrying about whether they were already in there (since adding an already-included item has no effect). Also, having .add return None is consistent with the general convention for Python builtin types that methods that mutate their arguments return None. It is really things like dict.setdefault (which gets an item but first adds it if it isn't there) that are the unusual case.
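If the check-then-add pattern comes up often, it can be wrapped once in a small helper. This is only a sketch; add_if_new is a hypothetical name, not something the set type provides.

```python
def add_if_new(target, value):
    """Add value to the set; return True only if it was not already present."""
    if value in target:
        return False
    target.add(value)
    return True


seen = set()
for node in ['a', 'b', 'a', 'c', 'b']:
    if add_if_new(seen, node):
        print('first visit:', node)
```

The helper gives the boolean result the question asked for while leaving set.add itself untouched.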

"for i in True,False:" Python, explanation needed

I hope I can ask here for an explanation, not just a solution to a problem.
So I know how this works:
for i in range(10):
    ...  # block of code
It goes from i = 0 all the way up to i = 9, so the block of code executes 10 times. My question is what does this do:
for i in True,False:
    ...  # block of code
Does this run just once? Or two times? Or does the block of code use i as True/False or 1/0?
I hope someone can clarify this for me. Thanks!

The True,False is a tuple, equivalent to (True, False). That tuple has a length of two, so the block of code runs twice.
As for whether it runs as booleans or integers, that depends on how you use i. bool is a subclass of int in Python, so it will normally act as a boolean, but you will be able to do mathematical operations with it as it is basically just another representation of an integer.
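Both points can be checked directly. A short sketch (the list name values is just for the example):

```python
values = []
for i in True, False:  # the bare comma-separated pair is a tuple
    values.append(i)

print(values)                 # [True, False]
print(isinstance(True, int))  # True
print(True + True)            # 2
print(sum(values))            # 1
```

The loop body runs twice, and the booleans behave like 1 and 0 in arithmetic.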

In Python, the for keyword is really a "foreach". It iterates over the objects you give to it.
range() returns a list (in Python 2.x), so for i in range(3): iterates over the integers in the list. In Python 3.x range() returns a lazy range object rather than a list, so for i in range(3): iterates over the integers it produces on demand. Either way, i is set to integers from the range, one at a time.
Python has tuples, which are usually written like this: (True, False)
That's a tuple with two elements. The first is True and the second is False.
But in Python, you don't actually need the parentheses for a tuple; a series of values separated by commas is also a tuple. Thus this is a tuple equivalent to the first one: True, False
It's tricky to make a length-1 tuple in Python. You still need the comma, so it looks weird. Here's a length-1 tuple: 0,
This looks weird but it's legal: a loop that will run exactly once, because we pass a length-1 tuple to for:
for i in 0,:
    print(i)
This will print 0 and terminate.

for ... in ... loops basically cycle through every element in what's called an iterable object. Iterable objects include lists, dicts, tuples, etc. In Python 2, range(x) returns the list [0,1,2,3,...,(x-1)] (in Python 3 it returns a range object that yields the same values), so
for i in range(10):
    ...  # block of code
is really the same thing as
for i in [0,1,2,3,4,5,6,7,8,9]:
    ...  # block of code
Hence, you can think of
for i in True,False:
    ...  # block of code
as being interpreted as
for i in [True,False]:
    ...  # block of code

Short answer: it runs two times. The first time, i==True, and the second time, i==False.
You need to know how for loops work and what a tuple is for this to make sense. A for loop just... well... loops over an iterable. You could rewrite what you have a couple different ways:
# The parentheses here don't do anything different from what you had, actually.
# But it makes it more clear that you're making a tuple and iterating over it.
for i in (True, False):
    ...  # block of code
Equivalently, you can loop over a list:
for i in [True, False]:
    ...  # block of code
You'll get exactly the same results this way, you're just looping through a list instead of a tuple.

Why is there no explicit emptyness check (for example `is Empty`) in Python

The Zen of Python says "Explicit is better than implicit". Yet the "pythonic" way to check for emptiness is using implicit booleanness:
if not some_sequence:
    some_sequence.fill_sequence()
This will be true if some_sequence is an empty sequence, but also if it is None or 0.
Compare with a theoretical explicit emptiness check:
if some_sequence is Empty:
    some_sequence.fill_sequence()
With an unfavorably chosen variable name, using implicit booleanness to check for emptiness gets even more confusing:
if saved:
    mess_up()
Compare with:
if saved is not Empty:
    mess_up()
See also: "Python: What is the best way to check if a list is empty?". I find it ironic that the most voted answer claims that implicit is pythonic.
So is there a higher reason why there is no explicit emptiness check, like for example is Empty in Python?

Polymorphism in if foo: and if not foo: isn't a violation of "implicit vs explicit": it explicitly delegates to the object being checked the task of knowing whether it's true or false. What that means (and how best to check it) obviously does and must depend on the object's type, so the style guide mandates the delegation -- having application-level code arrogantly assert that it knows better than the object would be the height of folly.
Moreover, X is Whatever always, invariably means that X is exactly the same object as Whatever. Making a totally unique exception for Empty or any other specific value of Whatever would be absurd -- hard to imagine a more unPythonic approach. And "being exactly the same object" is obviously transitive -- so you could never any more have distinct empty lists, empty sets, empty dicts... congratulations, you've just designed a completely unusable and useless language, where every empty container crazily "collapses" to a single empty container object (just imagine the fun when somebody tries to mutate an empty container...?!).

The reason why there is no is Empty is astoundingly simple once you understand what the is operator does.
From the python manual:
The operators is and is not test for object identity: x is y is true
if and only if x and y are the same object. x is not y yields the
inverse truth value.
That means some_sequence is Empty checks whether some_sequence is the same object as Empty. That cannot work the way you suggested.
Consider the following example:
>>> a = []
>>> b = {}
Now let's pretend there is this is Empty construct in python:
>>> a is Empty
True
>>> b is Empty
True
But since the is operator does identity check that means that a and b are identical to Empty. That in turn must mean that a and b are identical, but they are not:
>>> a is b
False
So to answer your question "why is there no is Empty in python?": because is does identity check.
In order to have the is Empty construct you must either hack the is operator to mean something else or create some magical Empty object which somehow detects empty collections and then be identical to them.
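The identity argument can be verified with two ordinary empty lists. A minimal sketch:

```python
a = []
b = []

print(a == b)  # True: equal contents
print(a is b)  # False: two separate list objects

a.append(1)
print(b)       # []: mutating one does not touch the other
```

If every empty container were the same object, the append above would have to change b as well.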
Rather than asking why there is no is Empty you should ask why there is no builtin function isempty() which calls the special method __isempty__().
So instead of using implicit booleanness:
if saved:
    mess_up()
we have an explicit empty check:
if not isempty(saved):
    mess_up()
where the class of saved has an __isempty__() method implemented with some sane logic.
I find that far better than using implicit booleanness for an emptiness check.
Of course you can easily define your own isempty() function:
def isempty(collection):
    try:
        return collection.__isempty__()
    except AttributeError:
        # fall back to implicit booleanness but check for common pitfalls
        if collection is None:
            raise TypeError('None cannot be empty')
        if collection is False:
            raise TypeError('False cannot be empty')
        if collection == 0:
            raise TypeError('0 cannot be empty')
        # an empty collection is falsy, so "is empty" is the negation
        return not collection
and then define an __isempty__() method which returns a boolean for all your collection classes.

I agree that sometimes if foo: isn't explicit for me when I really want to tell the reader of the code that it's emptiness I'm testing. In those cases, I use if len(foo):. Explicit enough.
I 100% agree with Alex w.r.t is Empty being unpythonic.

Consider that Lisp has for many years used the empty list () or its symbol NIL as False, and T or anything that is not NIL as True; generally, the computation of truth already produces a useful result that need not be recomputed when it is needed. Look also at the partition method of strings, where the middle result works very nicely as a while-loop control under the "non-empty is True" convention.
I generally try to avoid using len, as it can be expensive in tight loops. It is often worthwhile to update a length value for the result in the program logic instead of recalculating the length.
Personally, I would prefer Python to have () or [] as False instead of 0, but that is how it is. Then it would be more natural to use not [] to mean "not empty". But as it stands, () is not [] is True, so you could use:
emptyset = set([])
if myset == emptyset:
if you want to be explicit about the empty-set case (not myset is set([])).
I myself quite like the if not myset form, as my commenter does.
It now occurs to me that maybe this is the closest to an explicit not_empty:
if any(x in myset for x in myset): print("set is not empty")
and so "is empty" would be:
if not any(x in myset for x in myset): print("set is empty")

There is an explicit emptiness check for iterables in Python. It is spelled not. What's implicit there? not gives True when the iterable is empty, and gives False when it is nonempty.
What exactly do you object to? A name? As others have told you, it's certainly better than is Empty. And it's not so ungrammatical: considering how things are usually named in Python, we might imagine a sequence called widgets, containing, surprisingly, some widgets. Then,
if not widgets:
can be read as "if there are no widgets...".
Or do you object to the length? Explicit doesn't mean verbose; those are two different concepts. Python does not have an addition method, it has the + operator, which is completely explicit if you know the type you're applying it to. The same goes for not.
