What is best practice to use generators for counting purposes - python

Let's say I have a list like:
my_list = range(10)
And I want to count how many even numbers there are in the list. Note that I am not interested in the values, I just want the count of them. So I can:
len( [0 for i in my_list if i % 2 == 0] ) # Method 1
len( [i for i in my_list if i % 2 == 0] ) # Method 2
len( [_ for i in my_list if i % 2 == 0] ) # Method 3
Is any of the above methods better than others from the speed or memory perspectives?
Actually I don't even need to construct the list, but I don't want to:
counter = 0
for item in my_list:
    if item % 2 == 0:
        counter += 1
So, which one is a good way of counting with generators?
PS: The list in my case has more memory-heavy items, that is why I want to optimize if possible.

Use none of the above. Use sum() and a generator expression:
sum(i % 2 == 0 for i in my_list)
In Python, the bool boolean type is a subclass of int and True has an integer value of 1, False has 0, so you can sum a series of True and False results.
sum() with a generator expression only has to keep one boolean in memory at a time; no intermediary list has to be produced and kept around just to calculate a length.
Alternatively, stick to filtering and sum 1 literals:
sum(1 for i in my_list if i % 2 == 0)
This results in fewer objects needing to be added up still.
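To see both variants side by side, here is a minimal sketch using the question's my_list. The helper name count_even is made up for illustration and is not part of the original answer.

```python
my_list = range(10)

# bool is a subclass of int, so True and False add up as 1 and 0
assert issubclass(bool, int)
assert True + True == 2


def count_even(items):
    """Count even items without building an intermediate list (hypothetical helper)."""
    return sum(1 for i in items if i % 2 == 0)


bool_total = sum(i % 2 == 0 for i in my_list)  # adds up ten booleans
ones_total = count_even(my_list)               # adds up five 1 literals

print(bool_total, ones_total)
```

Both totals agree; the second form simply hands fewer values to sum().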


determine if list is periodic python

I am looking for a function that checks whether a given list is periodic and returns the periodic elements. The lists are not loaded up front; their elements are generated and added on the fly, in case that makes the algorithm any easier.
For example, if the input to the function is [1,2,1,2,1,2,1,2], the output shall be (1,2).
I am looking for some tips and hints on the easier methods to achieve this.
Thanks in advance,
This problem can be solved with the Knuth-Morris-Pratt algorithm for string matching. Please get familiar with the way the fail-links are calculated before you proceed.
Let's consider the list as a sequence of values (like a string). Let the size of the list/sequence be n.
Then, you can:
Find the length of the longest proper prefix of your list which is also a suffix. Call this length len.
If len > 0 and n is divisible by n - len, then the list is periodic and the period is of size n - len. In this case you can print the first n - len values.
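The steps above can be sketched roughly as follows. This is only an illustration, not the answerer's code: the function names prefix_function and find_period are made up, and the candidate period is taken to be n minus the length of the longest proper prefix that is also a suffix.

```python
def prefix_function(seq):
    """fail[i] = length of the longest proper prefix of seq[:i+1] that is also its suffix."""
    fail = [0] * len(seq)
    k = 0
    for i in range(1, len(seq)):
        # fall back along the fail-links until the next element can extend a prefix
        while k > 0 and seq[i] != seq[k]:
            k = fail[k - 1]
        if seq[i] == seq[k]:
            k += 1
        fail[i] = k
    return fail


def find_period(seq):
    """Return the repeating unit as a tuple, or None if seq is not periodic."""
    n = len(seq)
    if n == 0:
        return None
    border = prefix_function(seq)[-1]
    period = n - border
    # border > 0 excludes the trivial case where the "period" is the whole list
    if border > 0 and n % period == 0:
        return tuple(seq[:period])
    return None


print(find_period([1, 2, 1, 2, 1, 2, 1, 2]))
```

The whole check runs in linear time, since the fail-link table is built in a single pass.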
More info:
GeeksForGeeks article.
Knuth-Morris-Pratt algorithm
NOTE: the original question had python and python-3.x tags, they were edited not by OP, that's why my answer is in python.
I use itertools.cycle and zip to determine if the list is k-periodic for a given k, then just iterate all possible k values (up to half the length of the list).
try this:
from itertools import cycle
def is_k_periodic(lst, k):
    if len(lst) < k * 2:  # we want the returned part to repeat at least twice... otherwise every list is periodic (1 period of its full self)
        return False
    return all(x == y for x, y in zip(lst, cycle(lst[:k])))
def is_periodic(lst):
    for k in range(1, (len(lst) // 2) + 1):
        if is_k_periodic(lst, k):
            return tuple(lst[:k])
    return None
print(is_periodic([1, 2, 1, 2, 1, 2, 1, 2]))
Output:
(1, 2)
Thank you all for answering my question. Nevertheless, I came up with an implementation that suits my needs.
I will share it here and look forward to your input on optimizing it for better performance.
The algorithm is:
assume the input list is periodic.
initialize a pattern list.
go over the list up to its half, for each element i in this first half:
add the element to the pattern list.
check if the pattern is matched throughout the list.
if it matches, declare success and return the pattern list.
else break and start the loop again adding the next element to the pattern list.
If a pattern list is found, check the last k elements of the list against the first k elements of the pattern list, where k is len(list) modulo the length of the pattern list. If they match, return the pattern list, else declare failure.
The code in python:
def check_pattern(nums):
    p = []
    i = 0
    pattern = True
    while i < len(nums)//2:
        p.append(nums[i])
        # compare every complete chunk of len(p) elements against the pattern
        for j in range(0, len(nums)-(len(nums) % len(p)), len(p)):
            if nums[j:j+len(p)] != p:
                pattern = False
                break
            else:
                pattern = True
        # print(nums[-(len(nums) % len(p)):], p[:(len(nums) % len(p))])
        # the trailing incomplete chunk (if any) must match the start of the pattern
        if pattern and nums[-(len(nums) % len(p)) if (len(nums) % len(p)) > 0 else -len(p):] ==\
                p[:(len(nums) % len(p)) if (len(nums) % len(p)) > 0 else len(p)]:
            return p
        i += 1
    return 0
This algorithm might be inefficient in terms of performance but it checks the list even if the last elements did not form a complete period.
Any hints or suggestions are highly appreciated.
Thanks in advance,,,
Let L be the list. The classic method is: use your favorite algorithm to search for the second occurrence of the sublist L in the list L+L. If the list is present at index k, then the period is L[:k]:
       L                 L
1 2 1 2 1 2 1 2 | 1 2 1 2 1 2 1 2
    1 2 1 2 1 2 1 2
(This is conceptually identical to @KonstantinYovkov's answer.) In Python, an example with strings (because Python has no builtin sublist search method):
>>> L = "12121212"
>>> k = (L+L).find(L, 1) # skip the first occurrence
>>> L[:k]
'12'
But:
>>> L = "12121"
>>> k = (L+L).find(L, 1)
>>> L[:k] # k == len(L): no shorter period => return the whole list
'12121'
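For actual lists rather than strings, the same idea can be sketched with slice comparisons. This is a naive quadratic illustration, assuming the input is a list; the name smallest_period is made up here.

```python
def smallest_period(lst):
    """Return the shortest prefix found via the second occurrence of lst in lst + lst."""
    n = len(lst)
    doubled = lst + lst
    # start at offset 1 to skip the trivial occurrence at index 0;
    # offset n always matches, so a non-empty list returns inside the loop
    for k in range(1, n + 1):
        if doubled[k:k + n] == lst:
            return lst[:k]
    return lst  # only reached for an empty list


print(smallest_period([1, 2, 1, 2, 1, 2, 1, 2]))
print(smallest_period([1, 2, 1, 2, 1]))
```

As with the string version, a list without a shorter period comes back whole.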

Multiply odd indices by 2?

So I'm writing a function that is going to multiply each number at an odd index in a list by 2. I'm stuck though, as I really don't know how to approach it.
This is my code.
def produkt(pnr):
    for i in pnr:
        if i % 2 != 0:
            i = i * 2
    return pnr
If I, for example, type produkt([1,2,3]) I get [1,2,3] back but I would want it to be [2,2,6].
Note that modifying i in your example does not change the value in the input list (integers are immutable, and i is just a name bound to each value in turn). You're also mixing up the values with their positions.
Also, since indices start at 0 in Python, you got it the wrong way round.
In those cases, a simple list comprehension with a ternary expression will do, using enumerate to be able to get hold of the indices (making it start at 1 to match your case, you can adjust at will):
[p*2 if i%2 else p for i,p in enumerate(pnr,1)]
(note that if i%2 is shorter than if i%2 != 0)
using list comprehensions:
multiply odd numbers by 2:
[x*2 if x%2 else x for x in pnr]
After clarification of question wording:
multiply numbers at odd indices by 2:
[x*2 if i%2 else x for i,x in enumerate(pnr)]
Consider using list comprehensions:
def produkt(pnr):
    return [k * 2 if k % 2 else k for k in pnr]
Doing i = i * 2 you just override a local variable.
UPDATE (question was changed):
def produkt(pnr):
    return [k * 2 if i % 2 else k for i, k in enumerate(pnr, 1)]
You can get the indices using enumerate, however that starts by default with index 0 (not 1) but it accepts a start argument to override that default.
The problem with your approach is that you don't change the actual list contents, you just assign a different value to the name i (which represented a list element until you assigned a different value to it with i = i*2). If you want it to work in-place you would need to modify the list itself: e.g. pnr[idx] *= 2 or pnr[idx] = pnr[idx] * 2.
However, it's generally easier to just create a new list instead of modifying an existing one.
For example:
def produkt(pnr):
    newpnr = []  # create a new list
    for idx, value in enumerate(pnr, 1):
        # If you're testing for not-zero you can omit the "!=0" because every
        # non-zero number is "truthy".
        if idx % 2:
            newpnr.append(value * 2)  # append to the new list
        else:
            newpnr.append(value)  # append to the new list
    return newpnr  # return the new list
>>> produkt([1,2,3])
[2, 2, 6]
Or even better: use a generator function instead of using all these appends:
def produkt(pnr):
    for idx, value in enumerate(pnr, 1):
        if idx % 2:
            yield value * 2
        else:
            yield value
>>> list(produkt([1,2,3])) # generators should be consumed, for example by "list"
[2, 2, 6]
Of course you could also just use a list comprehension:
def produkt(pnr):
    return [value * 2 if idx % 2 else value for idx, value in enumerate(pnr, 1)]
>>> produkt([1,2,3])
[2, 2, 6]
Try this:
def produkt(pnr):
    return [2*x if i % 2 == 0 else x for i, x in enumerate(pnr)]
It will double every element at an even 0-based index, i.e. the first, third, fifth, ... element of your list.
>>> produkt([1,2,3])
[2, 2, 6]
Your code does not work, as i is no reference to the value inside the list, but just its value.
You have to store the new value in the list again.
def produkt(pnr):
    for i in range(len(pnr)):
        if pnr[i] % 2 != 0:
            pnr[i] *= 2
    return pnr
or use this more convenient solution:
def produkt(pnr):
    return [x * 2 if x % 2 != 0 else x for x in pnr]
Edit: As the question has been changed (completely) you should use this code:
def produkt(pnr):
    return [x * 2 if ind % 2 else x for ind, x in enumerate(pnr)]
The first examples multiply each odd number by 2, and the latter code multiplies the numbers at odd indices by 2.
Your problem is that i is a copy of the values in the pnr list, not the value in the list itself. So, you are not changing the list when doing i = i * 2.
The existing answers are already good and show the idiomatic way to achieve your goal. However, here is the minimum change to make it work as expected, for learning purposes.
def produkt(pnr):
    new_pnr = list(pnr)
    for ix in range(len(new_pnr)):
        if new_pnr[ix] % 2 != 0:
            new_pnr[ix] *= 2
    return new_pnr
Without new_pnr you'd be changing the list in place and then you wouldn't need to return it.
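For comparison, an in-place variant might look like the sketch below. It assumes the behaviour the question asked for (double the first, third, fifth, ... element), and the name produkt_in_place is made up. Following the usual Python convention for mutating functions, it returns nothing.

```python
def produkt_in_place(pnr):
    """Double the values at 0-based indices 0, 2, 4, ... directly in the given list."""
    for idx in range(0, len(pnr), 2):
        pnr[idx] *= 2  # assigning through the index changes the list itself


numbers = [1, 2, 3]
produkt_in_place(numbers)
print(numbers)  # [2, 2, 6]
```

The caller's list is changed, so there is no new list to return.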

Not comprehending list comprehension in python

While doing some list comprehension exercises, I accidentally wrote the code below. This ended up printing True/False for all 16 entries in the list.
threes_and_fives =[x % 3 == 0 or x % 5 == 0 for x in range(16)]
print threes_and_fives
After I played with it, I was able to get the outcome that I wanted, where it printed the numbers from that list that are divisible by 3 or 5.
threes_and_fives =[x for x in range(16) if x % 3 == 0 or x % 5 == 0]
print threes_and_fives
My question is: why did the first code evaluate to True or False while the other one didn't? I'm trying to get a grasp of Python, so the more explanations the better :) Thanks!
What you may be missing is that there is nothing special about relational operators in Python, they are expressions like any others, ones that happen to produce Boolean values. To take some examples:
>>> 1 + 1 == 2
True
>>> 2 + 2 == 5
False
>>> [1 + 1 == 2, 2 + 2 == 5]
[True, False]
A list comprehension simply collects expressions involving elements of an iterable sequence into a list:
>>> [x for x in xrange(5)] # numbers 0 through 4
[0, 1, 2, 3, 4]
>>> [x**2 for x in xrange(5)] # squares of 0 through 4
[0, 1, 4, 9, 16]
Your first expression worked just like that, but with the expression producing Booleans: it told Python to assemble a list of Boolean values corresponding to whether the matching ordinal is divisible by 3 or 5.
What you actually wanted was a list of numbers, filtered by the specified condition. Python list comprehensions support this via an optional if clause, which takes an expression and restricts the resulting list to those items for which the Boolean expression returns a true value. That is why your second expression works correctly.
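The two forms can be put side by side as in the sketch below. It is written for Python 3 (range and print as a function) rather than the Python 2 used in the question, and the variable name flags is made up for illustration.

```python
numbers = range(16)

# expression position: one result per input item, here a Boolean
flags = [x % 3 == 0 or x % 5 == 0 for x in numbers]

# if clause: keeps only the items for which the condition is true
threes_and_fives = [x for x in numbers if x % 3 == 0 or x % 5 == 0]

print(len(flags))        # 16
print(threes_and_fives)  # [0, 3, 5, 6, 9, 10, 12, 15]
```

The condition is identical in both lines; only its position decides whether it produces the values or filters them.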
In the following code:
[x % 3 == 0 or x % 5 == 0 for x in range(16)]
The list comprehension is returning the result of x % 3 == 0 or x % 5 == 0 for every value in range(16), which is a boolean value. For instance, if you set x equal to 0, you can see what is happening at every iteration of the loop:
x = 0
x % 3 == 0 or x % 5 == 0
# True
Hope this helps, and happy FizzBuzzing
In your first code sample, you are putting the value of
x % 3 == 0 or x % 5 == 0
into your list. As that expression is evaluated as true or false, you will end up with boolean values in the list.
In the second example, your condition is the condition for including x in the list, so you get the list of numbers which are divisible by 3 or 5. So, the list comprehension statement has to be read as
Include value x from the set {0,1,...,15} where condition (x is divisible by 3 or 5) is met.
EDIT: Fixed the set according to @user4815162342's comment.
The following line is a condition returning True or False:
x % 3 == 0 or x % 5 == 0
So in your first attempt you have put it in your list
This is simply the allowed expression syntax in Python. For a list comprehension, you need ...
Item to appear in the final list
Source of items
Restrictions
The first example you gave evaluates to a Boolean value (True / False).
The second one says to put the value of x into the list, with the restriction of divisibility.

How does Python handle multiple conditions in a list comprehension?

I was trying to create a list comprehension from a function that I had and I came across an unexpected behavior. Just for a better understanding, my function gets an integer and checks which of its digits divides the integer exactly:
# Full function
divs = list()
for i in str(number):
    digit = int(i)
    if digit > 0 and number % digit == 0:
        divs.append(digit)
return len(divs)
# List comprehension
return len([x for x in str(number) if x > 0 and number % int(x) == 0])
The problem is that, if I give a 1012 as an input, the full function returns 3, which is the expected result. The list comprehension returns a ZeroDivisionError: integer division or modulo by zero instead. I understand that it is because of this condition:
if x > 0 and number % int(x) == 0
In the full function, the multiple condition is handled from the left to the right, so it is fine. In the list comprehension, I do not really know, but I was guessing that it was not handled in the same way.
Until I tried with a simpler function:
# Full function
positives = list()
for i in numbers:
    if i > 0 and 20 % i == 0:
        positives.append(i)
return positives
# List comprehension
return [i for i in numbers if i > 0 and 20 % i == 0]
Both of them worked. So I am thinking that maybe it has something to do with the number % int(x)? I'm just curious how this really works. Any ideas?
The list comprehension is different, because you compare x > 0 without converting x to int. In Py2, mismatched types will compare in an arbitrary and stupid but consistent way, which in this case sees all strs (the type of x) as greater than all int (the type of 0) meaning that the x > 0 test is always True and the second test always executes (see Footnote below for details of this nonsense). Change the list comprehension to:
[x for x in str(number) if int(x) > 0 and number % int(x) == 0]
and it will work.
Note that you could simplify a bit further (and limit redundant work and memory consumption) by importing a Py3 version of map at the top of your code (from future_builtins import map), and using a generator expression with sum, instead of a list comprehension with len:
return sum(1 for i in map(int, str(number)) if i > 0 and number % i == 0)
That only calls int once per digit, and constructs no intermediate list.
Footnote: 0 is a numeric type, and all numeric types are "smaller" than everything except None, so a str is always greater than 0. In non-numeric cases, it would be comparing the string type names, so dict < frozenset < list < set < str < tuple, except oops, frozenset and set compare "naturally" to each other, so you can have non-transitive relationships; frozenset() < [] is true, [] < set() is true, but frozenset() < set() is false, because the type specific comparator gets invoked in the final version. Like I said, arbitrary and confusing; it was removed from Python 3 for a reason.
You should say int(x) > 0 in the list comprehension
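For reference, here is a Python 3 sketch of the corrected logic. It assumes number is a non-negative integer (a minus sign would not convert with int), and the name count_dividing_digits is made up. In Python 3 the original str-versus-int comparison raises a TypeError instead of silently evaluating to True.

```python
def count_dividing_digits(number):
    """Count the digits of number that divide it exactly, skipping zeros."""
    # "d > 0 and ..." short-circuits, so the modulo never sees a zero digit
    return sum(1 for d in map(int, str(number)) if d > 0 and number % d == 0)


print(count_dividing_digits(1012))  # 3

# Python 3 refuses to order a str against an int
try:
    "1" > 0
except TypeError as exc:
    print("TypeError:", exc)
```

Converting each digit once with map(int, ...) means the comparison and the modulo both operate on integers.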

Check if all numbers in a list are same sign in Python?

How can I tell if a list (or iterable) of numbers all have the same sign?
Here's my first (naive) draft:
def all_same_sign(list):
    negative_count = 0
    for x in list:
        if x < 0:
            negative_count += 1
    return negative_count == 0 or negative_count == len(list)
Is there a more pythonic and/or correct way of doing this? First thing that comes to mind is to stop iterating once you have opposite signs.
Update
I like the answers so far although I wonder about performance. I'm not a performance junkie but I think when dealing with lists it's reasonable to consider the performance. For my particular use-case I don't think it will be a big deal but for completeness of this question I think it's good to address it. My understanding is the min and max functions have O(n) performance. The two suggested answers so far have O(2n) performance whereas my above routine adding a short circuit to quit once an opposite sign is detected will have at worst O(n) performance. Thoughts?
You can make use of the all function:
>>> x = [1, 2, 3, 4, 5]
>>> all(item >= 0 for item in x) or all(item < 0 for item in x)
True
Don't know whether it's the most pythonic way.
How about:
same_sign = not min(l) < 0 < max(l)
Basically, this checks whether the smallest element of l and the largest element straddle zero.
This doesn't short-circuit, but does avoid Python loops. Only benchmarking can tell whether this is a good tradeoff for your data (and whether the performance of this piece even matters).
Instead of all you could use any, as it short-circuits on the first true item as well:
same = lambda s: any(i >= 0 for i in s) ^ any(i < 0 for i in s)
Similarly to using all, you can use any, which has the benefit of better performance, as it will break the loop on the first occurrence of different sign:
def all_same_sign(lst):
    if lst[0] >= 0:
        return not any(i < 0 for i in lst)
    else:
        return not any(i >= 0 for i in lst)
It would be a little tricky if you want to consider 0, as belonging to both groups:
def all_same_sign(lst):
    first = 0
    idx = 0
    # skip leading zeros to find the first element that has a sign
    while first == 0 and idx < len(lst):
        first = lst[idx]
        idx += 1
    if first > 0:
        return not any(i < 0 for i in lst)
    else:
        return not any(i > 0 for i in lst)
In any case, you iterate the list once instead of twice as in other answers. Your code has the drawback of iterating the loop in Python, which is much less efficient than using built-in functions.
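If the input may be an arbitrary iterable (for example a generator, which cannot be indexed or traversed twice), a single-pass variant could look like the sketch below. It treats zero as compatible with either sign, which is an assumption, and it does loop in Python, so it trades the speed of the built-ins for generality.

```python
def all_same_sign(numbers):
    """Return True if no two items have opposite signs; zeros count as neutral."""
    seen_positive = seen_negative = False
    for x in numbers:
        if x > 0:
            seen_positive = True
        elif x < 0:
            seen_negative = True
        if seen_positive and seen_negative:
            return False  # stop at the first sign conflict
    return True


print(all_same_sign(x for x in [3, 0, 5]))
print(all_same_sign([1, -1, 2]))
```

The early return means the rest of the input is never consumed once a conflict is found.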
