Is there a way I can implement the code block below using map or list comprehension or any other faster way, keeping it functionally the same?
def name_check(names, n_val):
    lower_names = names.lower()
    for item in set(n_val):
        if item in lower_names:
            return True
    return False
Any help here is appreciated
A simple implementation would be
return any(character in names_lower for character in n_val)
A naive guess at the complexity would be O(K*2*N) where K is the number of characters in names and N is the number of characters in n_val. We need one "loop" for the call to lower*, one for the inner comprehension, and one for any. Since any is a built-in function and we're using a generator expression, I would expect it to be faster than your manual loop, but as always, profile to be sure.
To be clear, any short-circuits, so that behaviour is preserved
Notes on Your Implementation
On using a set: Your intuition to use a set to reduce the number of checks is a good one (you could add it to my form above, also), but it's a trade-off. In the case that the first element short circuits, the extra call to set is an additional N steps to produce the set expression. In the case where you wind up checking each item, it will save you some time. It depends on your expected inputs. If n_val was originally an iterable, you've lost that benefit and allocated all the memory up front. If you control the input to the function, why not just recommend it's called using lists that don't have duplicates (i.e., call set() on its input), and leave the function general?
* @Neopolitan pointed out that names_lower = names.lower() should be called outside the loop, as in your original implementation; otherwise it is evaluated again for every item in the generator expression.
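Putting the pieces above together, a minimal sketch of the whole function might look like the following. The lowercase conversion stays outside the generator expression, and the last lines illustrate the short-circuit by passing an iterator and checking what is left unconsumed. The sample inputs are made up for illustration and are not from the question.

```python
def name_check(names, n_val):
    # Lowercase once, outside the generator expression.
    names_lower = names.lower()
    # any() stops at the first item found inside names_lower.
    return any(item in names_lower for item in n_val)


# Illustrative inputs (assumed, not from the original question).
print(name_check("Alice Bob", ["zed", "bob"]))   # True
print(name_check("Alice Bob", ["zed", "carl"]))  # False

# Short-circuit check: only the first candidate is consumed.
candidates = iter(["ali", "zzz"])
name_check("Alice", candidates)
print(list(candidates))  # ['zzz']
```

Whether this beats the explicit loop in practice depends on the inputs, so it is still worth profiling.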
Related
I have the following problem and two very important questions.
Write a class called Wordplay. It should have a field that holds a list of words. The user
of the class should pass the list of words they want to use to the class. There should be the
following methods:
words_with_length(length) — returns a list of all the words of length length
starts_with(s) — returns a list of all the words that start with s
ends_with(s) — returns a list of all the words that end with s
palindromes() — returns a list of all the palindromes in the list
First problem: when I run my program, the methods starts_with_s and ends_with_s return the same word.
Next problem: in this case I have created a list of three names. But what if I wanted to ask for a list size and then loop that many times, asking the user to input a word on each pass? How can I implement that idea?
class Wordplay:
  def __init__(self):
    self.words_list=[]
  def words_with_lenght(self,lenght):
    for i in range(0,len(self.words_list)-1):
      if len(self.words_list[i])==lenght:
        return self.words_list[i]
  def starts_with_s(self,s):
    for i in range(0,len(self.words_list)-1):
      if s.startswith('s')==True:
        return self.words_list[i]
  def ends_with_s(self,s):
    for i in range(0,len(self.words_list)-1):
      if s.endswith('s')==True:
        return self.words_list[i]
  def palindromes(self):
    for i in range(0,len(self.words_list)-1):
      normal_word=self.words_list[i]
      reversed_word=normal_word[::-1]
      if reversed_word==normal_word:
        return reversed_word
verification=Wordplay()
verification.words_list=['sandro','abba','luis']
lenght=int(input('Digit size you want to compare\n'))
s='s'
print(verification.words_with_lenght(lenght))
print(verification.starts_with_s(s))
print(verification.ends_with_s(s))
print(verification.palindromes())
If I input, for example, size 4, I expect the result to be:
abba,luis ; sandro ; luis ; abba
and not:
abba; sandro ; sandro ; abba
In the line if s.startswith('s')==True:, you've passed the string "s" into the function resulting in
if 's'.startswith('s')==True:
#  ^^^
    return self.words_list[i]
This conditional is always true. You probably don't need a parameter here at all since the assignment asks you to hard code "s". You can use:
if self.words_list[i].startswith('s'):
    return self.words_list[i]
Notice the above example uses a return as soon as a match is found. This is a problem. The loops in this program break early, returning from the function as soon as a single match is located. You may have intended to append each successful match to a list and return the resulting list or use the yield keyword to return a generator (but the caller would need to use list() if they want a persistent list from the generator). Using a list to build a result would look like:
result = []
for i in range(len(self.words_list)):
    if self.words_list[i].startswith('s'):
        result.append(self.words_list[i])
return result
Another issue: the loops in this program don't iterate all the way through their respective lists. The range() function is inclusive of the start and exclusive of the end, so you likely intended range(len(self.words_list)) instead of range(0, len(self.words_list) - 1).
Beyond that, there are a number of design and style points I'd like to suggest:
Use horizontal space between operators and use vertical whitespace around blocks.
foo=bar.corge(a,b,c)
if foo==baz:
  return quux
is clearer as
foo = bar.corge(a, b, c)

if foo == baz:
    return quux
Use 4 spaces to indent instead of 2, which makes it easier to quickly determine which code is in which block.
Prefer for element in my_list instead of for i in range(len(my_list)). If you need the index, in most cases you can use for i, elem in enumerate(my_list). Better yet, use list comprehensions to perform filtering operations, which is most of this logic.
There's no need to use if condition == True. if condition is sufficient. You can simplify confusing and inaccurate logic like:
def palindromes(self):
  for i in range(0,len(self.words_list)-1):
    normal_word=self.words_list[i]
    reversed_word=normal_word[::-1]
    if reversed_word==normal_word:
      return reversed_word
to, for example:
def palindromes(self):
    return [word for word in self.words_list if word[::-1] == word]
that is, avoid intermediate variables and indexes whenever possible.
I realize you're probably tied down to the design, but this strikes me as a strange way to write a utility class. It'd be more flexible as static methods that operate on iterables. Typical usage might be like:
from Wordplay import is_palindrome
is_palindrome(some_iterable)
instead of:
wordplay = Wordplay(some_iterable)
wordplay.palindromes()
My rationale is that this class is basically stateless, so it seems odd to impose state when none is needed. This is a bit subjective, but worth noting (if you've ever used the math or random modules, it's the same idea).
The lack of parameter in the constructor is even weirder; the client of the class has to magically "know" somehow that words_list is the internal variable name they need to make an assignment to in order to populate class state. This variable name should be an implementation detail that the client has no idea about. Failing providing a parameter in the initialization function, there should be a setter for this field (or just skip internal state entirely).
ends_with_s(self, s) is a silly function; it seems the designer is confused between wanting to write ends_with(self, letter) and ends_with_s(self) (the former is far preferable). What if you want a new letter? Do you need to write dozens of functions for each possible ending character ends_with_a, ends_with_b, ends_with_c, etc? I realize it's just a contrived assignment, but the class still exhibits poor design.
Spelling error: words_with_lenght -> words_with_length.
Here's a general tip on how to build the skill at locating these problems: work in very small chunks and run your program often. It appears that these four functions were written all in one go without testing each function along the way to make sure it worked first. This is apparent because the same mistakes were repeated in all four functions.
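Pulling the points above together, one possible shape for the class is sketched below. It assumes the word list is passed to the constructor and that every method returns a list, as the assignment text asks; the attribute name words is just an illustrative choice, not something the assignment prescribes.

```python
class Wordplay:
    def __init__(self, words):
        # Copy the caller's words so later changes to their list do not leak in.
        self.words = list(words)

    def words_with_length(self, length):
        return [word for word in self.words if len(word) == length]

    def starts_with(self, s):
        return [word for word in self.words if word.startswith(s)]

    def ends_with(self, s):
        return [word for word in self.words if word.endswith(s)]

    def palindromes(self):
        return [word for word in self.words if word == word[::-1]]


wordplay = Wordplay(['sandro', 'abba', 'luis'])
print(wordplay.words_with_length(4))  # ['abba', 'luis']
print(wordplay.starts_with('s'))      # ['sandro']
print(wordplay.ends_with('s'))        # ['luis']
print(wordplay.palindromes())         # ['abba']
```

Each method is a single filter over the whole list, so none of them can return early or stop one element short.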
s.endswith('s') compares your input string s ("s") with "s". "s" ends in "s", so the condition is always true and the method returns your first entry. Change it to if self.words_list[i].endswith('s'): (and likewise use self.words_list[i].startswith('s') in starts_with_s).
I would recommend changing your for loops to iterate over the words themselves though:
def ends_with_s(self, s):
    for word in self.words_list:
        if word.endswith('s'):
            return word
Entering a list of values as you described:
amount = int(input("How many words? "))
words = [input("Word {}".format(i + 1)) for i in range(amount)]
In python 2, I used map to apply a function to several items, for instance, to remove all items matching a pattern:
map(os.remove,glob.glob("*.pyc"))
Of course I ignore the return code of os.remove, I just want all files to be deleted. It created a temp instance of a list for nothing, but it worked.
With Python 3, as map returns an iterator and not a list, the above code does nothing.
I found a workaround, since os.remove returns None, I use any to force iteration on the full list, without creating a list (better performance)
any(map(os.remove,glob.glob("*.pyc")))
But it seems a bit hazardous, especially when applying it to functions that return something. Is there another way to do that with a one-liner, without creating an unnecessary list?
The change from map() (and many other functions from 2.7 to 3.x) returning an iterator instead of a list is a memory-saving technique. For most cases, there is no performance penalty to writing out the loop more formally (it may even be preferred for readability).
I would provide an example, but @vaultah nailed it in the comments: still a one-liner:
for x in glob.glob("*.pyc"): os.remove(x)
for i in vr_world.getNodeNames():
    if i != "_error_":
        World[i] = vr_world.getChild(i)
vr_world.getNodeNames() returns me a gigantic list, vr_world.getChild(i) returns a specific type of object.
This is taking a long time to run. Is there any way to make it more efficient? I have seen one-liner for loops before that are supposed to be faster. Ideas?
kaloyan suggests using a generator. Here's why that may help.
If getNodeNames() builds a list, then your loop is basically going over the list twice: once to build it, and once when you iterate over the list.
If getNodeNames() is a generator, then your loop doesn't ever build the list; instead of creating the item and adding it to the list, it creates the item and yields it to the caller.
Whether or not this helps is contingent on a couple of things. First, it has to be possible to implement getNodeNames() as a generator. We don't know anything about the implementation details of that function, so it's not possible to say if that's the case. Next, the number of items you're iterating over needs to be pretty big.
Of course, none of this will have any effect at all if it turns out that the time-consuming operation in all of this is vr_world.getChild(). That's why you need to profile your code.
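To make the list-versus-generator distinction concrete, here is a small sketch. The two functions are hypothetical stand-ins, since nothing is known about how getNodeNames() is really implemented.

```python
def node_names_as_list(count):
    # Builds the whole list before the caller sees the first name.
    names = []
    for i in range(count):
        names.append("node%d" % i)
    return names


def node_names_as_generator(count):
    # Hands each name to the caller as soon as it is produced.
    for i in range(count):
        yield "node%d" % i


# Same values either way.
print(list(node_names_as_generator(3)) == node_names_as_list(3))  # True

# The generator yields its first item without producing the rest.
print(next(node_names_as_generator(10**9)))  # node0
```

The generator version saves the intermediate list, but it does nothing for the cost of each getChild() call, which is why profiling comes first.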
I don't think you can make it faster than what you have there. Yes, you can put the whole thing on one line, but that will not make it any faster. The bottleneck is most likely getNodeNames(). If you can make it a generator, you will start populating the World dict with results sooner (if that matters to you), and if you make it filter out the "_error_" values, you will not have to deal with that at a later stage.
World = dict((i, vr_world.getChild(i)) for i in vr_world.getNodeNames() if i != "_error_")
This is a one-liner, but not necessarily much faster than your solution...
Maybe you can use a filter and a map, however I don't know if this would be any faster:
valid = list(filter(lambda i: i != "_error_", vr_world.getNodeNames()))
World = dict(zip(valid, map(vr_world.getChild, valid)))
Also, as you'll see a lot around here, profile first, and then optimize, otherwise you may be wasting time. You have two functions there, maybe they are the slow parts, not the iteration.
More and more features of Python are becoming lazily evaluated, like generator expressions and other kinds of iterators.
Sometimes, however, I see myself wanting to roll a one liner "for" loop, just to perform some action.
What would be the most pythonic thing to get the loop actually executed?
For example:
a = open("numbers.txt", "w")
(a.write ("%d " % i) for i in xrange(100))
a.close()
Not actual code, but you see what I mean. If I use a list comprehension instead, I have the side effect of creating an N-length list filled with Nones.
Currently what I do is to use the expression as the argument in a call to "any" or to "all". But I would like to find a way that would not depend on the result of the expression performed in the loop - both "any" and "all" can stop depending on the expression evaluated.
To be clear, these are ways to do it that I already know about, and each one has its drawbacks:
[a.write ("%d " % i) for i in xrange(100)]
any((a.write ("%d " % i) for i in xrange(100)))
for item in (a.write ("%d " % i) for i in xrange(100)): pass
There is one obvious way to do it, and that is the way you should do it. There is no excuse for doing it a clever way.
a = open("numbers.txt", "w")
for i in xrange(100):
    a.write("%d " % i)
a.close()
Lazy execution gives you a serious benefit: It allows you to pass a sequence to another piece of code without having to hold the entire thing in memory. It is for the creation of efficient sequences as data types.
In this case, you do not want lazy execution. You want execution. You can just ... execute. With a for loop.
If I wanted to do this specific example, I'd write
for i in xrange(100): a.write('%d ' % i)
If I often needed to consume an iterator for its effect, I'd define
def for_effect(iterable):
    for _ in iterable:
        pass
There are many accumulators which have the effect of consuming the whole iterable they're given, such as min or max -- but even they don't entirely ignore the results yielded in the process (min and max, for example, will raise an exception if some of the results are complex numbers). I don't think there's a built-in accumulator that does exactly what you want -- you'll have to write (and add to your personal stash of tiny utility functions) a tiny utility function such as
def consume(iterable):
    for item in iterable: pass
The main reason, I guess, is that Python has a for statement and you're supposed to use it when it fits like a glove (i.e., for the cases you'd want consume for;-).
BTW, a.write returns None, which is falsish, so any will actually consume it (and a.writelines will do even better!). But I realize you were just giving that as an example;-).
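As a rough illustration of the writelines remark, the sketch below feeds a generator expression straight to writelines, which consumes it without building a list. It uses io.StringIO as a stand-in for the real file, and range in place of xrange so it runs on Python 3; both are assumptions made for the example.

```python
import io

# io.StringIO stands in for the real file so the sketch runs anywhere.
a = io.StringIO()

# writelines accepts any iterable of strings and consumes it fully.
a.writelines("%d " % i for i in range(5))

written = a.getvalue()
print(repr(written))  # '0 1 2 3 4 '
a.close()
```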
It is 2019, and this is a question from 2010 that keeps showing up. A recent thread on one of Python's mailing lists ran to over 70 e-mails on this subject, and they refused again to add a consume call to the language.
On that thread, the most efficient way to do that actually showed up, and it is far from being obvious, so I am posting it as the answer here:
from collections import deque
consume = deque(maxlen=0).extend
And then use the consume callable to process generator expressions.
It turns out the deque native code in CPython is actually optimized for the maxlen=0 case, and will just consume the iterable.
The any and all calls I mentioned in the question should be equally as efficient, but one has to worry about the expression truthiness in order for the iterable to be consumed.
I see this may still be controversial; after all, an explicit two-line for loop can handle this. I remembered this question because I just made a commit where I create some threads, start them, and join them back. Without a consume callable, that is 4 lines of mostly boilerplate, and without the benefit of cycling through the iterable in native code:
https://github.com/jsbueno/extracontext/blob/a5d24be882f9aa18eb19effe3c2cf20c42135ed8/tests/test_thread.py#L27
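A short sketch of how the consume callable behaves in use is shown below. The squares and the "or True" expression are made-up examples, chosen only to show that truthy results do not cut the iteration short the way they would with any().

```python
from collections import deque

# A deque with maxlen=0 keeps nothing, so extend() just drains the iterable.
consume = deque(maxlen=0).extend

results = []
consume(results.append(i * i) for i in range(5))
print(results)  # [0, 1, 4, 9, 16]

# Unlike any(), truthy values do not stop the iteration early.
seen = []
consume(seen.append(i) or True for i in range(3))
print(seen)  # [0, 1, 2]
```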
When should I use a while loop or a for loop in Python? It looks like people prefer using a for loop (for brevity?). Is there any specific situation in which I should use one or the other? Is it a matter of personal preference? The code I have read so far made me think there are big differences between them.
Yes, there is a huge difference between while and for.
The for statement iterates through a collection or iterable object or generator function.
The while statement simply loops until a condition is False.
It isn't preference. It's a question of what your data structures are.
Often, we represent the values we want to process as a range (an actual list), or xrange (which generates the values) (Edit: In Python 3, range is a lazy sequence object that behaves much like the old xrange function. xrange has been removed from Python 3). This gives us a data structure tailor-made for the for statement.
Generally, however, we have a ready-made collection: a set, tuple, list, map or even a string is already an iterable collection, so we simply use a for loop.
In a few cases, we might want some functional-programming processing done for us, in which case we can apply that transformation as part of iteration. The sorted and enumerate functions apply a transformation on an iterable that fits naturally with the for statement.
If you don't have a tidy data structure to iterate through, or you don't have a generator function that drives your processing, you must use while.
while is useful in scenarios where the break condition doesn't logically depend on any kind of sequence. For example, consider unpredictable interactions:
while user_is_sleeping():
    wait()
Of course, you could write an appropriate iterator to encapsulate that action and make it accessible via for – but how would that serve readability?¹
In all other cases in Python, use for (or an appropriate higher-order function which encapsulate the loop).
¹ assuming the user_is_sleeping function returns False when false, the example code could be rewritten as the following for loop:
for _ in iter(user_is_sleeping, False):
    wait()
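For anyone who has not met the two-argument form of iter, the following sketch shows it running end to end. The bodies of user_is_sleeping and wait are hypothetical stand-ins invented for the example; only the loop itself comes from the footnote.

```python
# Hypothetical stand-ins for user_is_sleeping() and wait().
remaining_naps = [True, True, True, False]
waits = []


def user_is_sleeping():
    # Pretend the user wakes up on the fourth check.
    return remaining_naps.pop(0)


def wait():
    waits.append("waited")


# iter(callable, sentinel) calls the function until it returns the sentinel.
for _ in iter(user_is_sleeping, False):
    wait()

print(len(waits))  # 3
```

The sentinel is compared with ==, so a function returning 0 would also end this loop.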
The for is the more pythonic choice for iterating a list since it is simpler and easier to read.
For example this:
for i in range(11):
    print(i)
is much simpler and easier to read than this:
i = 0
while i <= 10:
    print(i)
    i = i + 1
for loops are used when you have definite iteration (the number of iterations is known).
Examples of use:
Iterate through a loop with a definite range: for i in range(23):.
Iterate through collections (string, list, set, tuple, dictionary): for book in books:.
A while loop is indefinite iteration, used when a loop repeats an unknown number of times and ends when some condition is met.
Note that in the case of a while loop, the indented body of the loop should modify at least one variable in the test condition; otherwise the result is an infinite loop.
Examples of use:
The execution of the block of code requires that the user enter a specified input: while input == specified_input:.
When you have a condition with comparison operators: while count < limit and stop != False:.
References: For Loops Vs. While Loops, Udacity Data Science, Python.org.
First of all there are differences between the for loop in python and in other languages.
While in python it iterates over a list of values (eg: for value in [4,3,2,7]), in most other languages (C/C++, Java, PHP etc) it acts as a while loop, but easier to read.
For loops are generally used when the number of iterations is known (the length of an array, for example), and while loops are used when you don't know how long it will take (for example, the bubble sort algorithm, which loops as long as the values aren't sorted).
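The bubble sort mentioned above happens to use both kinds of loop, which makes it a handy illustration. This is a plain textbook sketch written for this example, not code from any particular source.

```python
def bubble_sort(values):
    items = list(values)
    swapped = True
    # Unknown number of passes: keep going until a pass makes no swap.
    while swapped:
        swapped = False
        # Known number of steps within one pass: a for loop fits.
        for i in range(len(items) - 1):
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
                swapped = True
    return items


print(bubble_sort([4, 3, 2, 7]))  # [2, 3, 4, 7]
```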
Consider processing iterables. You can do it with a for loop:
for i in mylist:
    print(i)
Or, you can do it with a while loop:
it = iter(mylist)
while True:
    try:
        print(next(it))
    except StopIteration:
        break
Both of those blocks of code do fundamentally the same thing in fundamentally the same way. But the for loop hides the creation of the iterator and the handling of the StopIteration exception so that you don't need to deal with them yourself.
The only time I can think of that you'd use a while loop to handle an iterable would be if you needed to access the iterator directly for some reason, e.g. you needed to skip over items in the list under some circumstances.
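As one possible illustration of that case, the sketch below drives the iterator by hand so it can drop the element that follows a marker. The function name and the marker convention are invented for the example.

```python
def skip_after_marker(items, marker):
    it = iter(items)
    kept = []
    while True:
        try:
            item = next(it)
        except StopIteration:
            break
        if item == marker:
            # Advance the iterator by hand to drop the item after the marker.
            next(it, None)
            continue
        kept.append(item)
    return kept


print(skip_after_marker([1, "skip", 2, 3], "skip"))  # [1, 3]
```

The same effect is possible with a for loop over an explicit iterator, so even here the while form is a matter of taste.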
For loops usually make it clearer what the iteration is doing. You can't always use them directly, but most of the times the iteration logic with the while loop can be wrapped inside a generator func. For example:
def path_to_root(node):
    while node is not None:
        yield node
        node = node.parent
for parent in path_to_root(node):
    ...
Instead of
parent = node
while parent is not None:
    ...
    parent = parent.parent
A while loop is better for normal loops.
A for loop is much better than a while loop when working with sequences such as lists, strings, etc.
If your data is dirty and it won't work with a for loop, you need to clean your data.
For me, if your problem demands multiple pointers to keep track of some boundary, I would always prefer a while loop.
In other cases it's simply a for loop.
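A small example of the multiple-pointer situation is a palindrome check with two indices closing in from both ends. It is only a sketch chosen to illustrate the point; the function name is hypothetical.

```python
def is_palindrome(text):
    left, right = 0, len(text) - 1
    # Two indices move toward each other until they meet or cross.
    while left < right:
        if text[left] != text[right]:
            return False
        left += 1
        right -= 1
    return True


print(is_palindrome("abba"))    # True
print(is_palindrome("sandro"))  # False
```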