Compare different elements in two different lists - python

I need to check whether two conditions match at the same position in two different lists.
I have these two lists and I need to count the number of babies with:
first_name_baby = S AND age_baby = 1
age_baby = [ 2, 1, 3, 1, 4, 2, 4, 1, 1, 3, 4, 2, 2, 3]
first_name_baby = [ T, S, R, T, O, A, L, S, F, S, Z, U, S, P]
There are actually 2 positions where first_name_baby = S AND age_baby = 1, but I need to write a Python program that finds this.

Use zip to combine corresponding list entries and then .count
>>> age_baby = [ 2, 1, 3, 1, 4, 2, 4, 1, 1, 3, 4, 2, 2, 3]
>>> first_name_baby = "T, S, R, T, O, A, L, S, F, S, Z, U, S, P".split(', ')
>>> list(zip(first_name_baby, age_baby)).count(('S', 1))
2
Alternatively, you could use numpy. This would allow a solution very similar to what you have tried:
>>> import numpy as np
>>>
>>> age_baby = np.array(age_baby)
>>> first_name_baby = np.array(first_name_baby)
>>>
>>> np.count_nonzero((first_name_baby == 'S') & (age_baby == 1))
2

You can just sum 1 whenever the conditions match, iterating over both lists simultaneously using zip:
# need to make sense of the names
T, S, R, O, A, L, F, Z, U, S, P = 'T, S, R, O, A, L, F, Z, U, S, P'.split(', ')
age_baby = [2, 1, 3, 1, 4, 2, 4, 1, 1, 3, 4, 2, 2, 3]
first_name_baby = [T, S, R, T, O, A, L, S, F, S, Z, U, S, P]
sum(1 for age, name in zip(age_baby, first_name_baby)
    if age == 1 and name == S)
Thanks to Austin, a more elegant version of this:
sum(age == 1 and name == S for age, name in zip(age_baby, first_name_baby))
This works because bools in Python are a subclass of int: True is basically 1 (with overloaded __str__ and __repr__) and False is 0. The booleans can therefore just be summed, and the result is the number of True comparisons.
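The bool-as-int behaviour described above can be checked directly. This is a small sketch on shortened lists (the variable names ages, names and matches are only for illustration):

```python
# bool is a subclass of int, so True and False behave as 1 and 0 in arithmetic.
print(isinstance(True, int))   # True
print(True + True + False)     # 2

ages = [2, 1, 3, 1]
names = ['T', 'S', 'R', 'S']

# Each comparison yields a bool; summing the bools counts the matches.
matches = [age == 1 and name == 'S' for age, name in zip(ages, names)]
print(matches)       # [False, True, False, True]
print(sum(matches))  # 2
```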

Try this:
>>> count = 0
>>>
>>>
>>> for i in range(len(first_name_baby)):
...     if first_name_baby[i] == 'S' and age_baby[i] == 1:
...         count += 1
...
>>> count
2

x = len([item for idx, item in enumerate(age_baby) if item == 1 and first_name_baby[idx] == 'S'])
2
Expanded:
l = []
for idx, item in enumerate(age_baby):
    if item == 1 and first_name_baby[idx] == 'S':
        l.append(item)
x = len(l)

Related

Python: vertical binning of two lists

I have two lists of the same size:
A = [1, 1, 2, 2, 3, 3, 4, 5]
B = [a, b, c, d, e, f, g, h] # numeric values
How do I do a vertical binning?
Output desired:
C = [ 1, 2, 3, 4, 5] # len = 5
D = [a + b, c + d, e + f, g, h] # len = 5
i.e. a mapping of A list to its cumulative sum (vertical binning?) where it occurs in list B.
I assume a, b, ... are numeric variables:
bins = dict()
for b, x in zip(A,B):
    bins[b] = bins.setdefault(b, 0) + x
C = [key for key in bins]
D = [bins[key] for key in bins]
If a, b, ... are of another type, you would have to adjust the default value in bins.setdefault(b, ...).
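One way to make the default value adjustable is collections.defaultdict. This is only a sketch: the helper name bin_values and its default_factory parameter are made up for illustration, and it relies on dicts preserving insertion order (Python 3.7+).

```python
from collections import defaultdict

def bin_values(keys, values, default_factory=int):
    """Accumulate values that share a key; default_factory supplies the empty value."""
    bins = defaultdict(default_factory)
    for key, value in zip(keys, values):
        bins[key] += value
    # Keys come back in first-seen order, values in the same order.
    return list(bins), list(bins.values())

A = [1, 1, 2, 2, 3, 3, 4, 5]
print(bin_values(A, [1, 3, 4, 6, 7, 7, 8, 8]))
# ([1, 2, 3, 4, 5], [4, 10, 14, 8, 8])
print(bin_values(A, list("abcdefgh"), default_factory=str))
# ([1, 2, 3, 4, 5], ['ab', 'cd', 'ef', 'g', 'h'])
```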
This is a perfect case for the use of itertools.groupby:
from itertools import groupby
from operator import itemgetter
fst = itemgetter(0)
A = [1,1,2,2,3,3,4,5]
B = [1,3,4,6,7,7,8,8]
C = []
D = []
for k, v in groupby(zip(A, B), key=fst):
    C.append(k)
    D.append(sum(item[-1] for item in v))
C
>>[1, 2, 3, 4, 5]
D
>>[4, 10, 14, 8, 8]
If B is a list of strings then your summation operation becomes:
D.append(''.join(item[-1] for item in v))
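One caveat: groupby only merges equal keys that are adjacent, so it works on the A above because equal values sit next to each other. If that is not guaranteed, the pairs can be sorted by key first. A sketch with made-up data:

```python
from itertools import groupby
from operator import itemgetter

A = [2, 1, 2, 1, 3]          # equal keys are NOT adjacent here
B = [10, 1, 20, 2, 5]

# Sort the pairs by key so that groupby sees each key as a single run.
pairs = sorted(zip(A, B), key=itemgetter(0))

C, D = [], []
for key, group in groupby(pairs, key=itemgetter(0)):
    C.append(key)
    D.append(sum(value for _, value in group))

print(C)  # [1, 2, 3]
print(D)  # [3, 30, 5]
```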
You can use a dictionary and since Python 3.6 the order is preserved, therefore you get your C as the keys and D as values:
A = [1,1,2,2,3,3,4,5]
B = ["a","b","c","d","e","f","g","h"]
from random import randint
rename_to_B_for_numeric = [randint(0, 255) for _ in A]
result = {}
for idx, item in enumerate(A):
    if item not in result:
        # not sure about the type, so...
        result[item] = "" if isinstance(B[idx], str) else 0
    result[item] += B[idx]
print(result)
# {1: 'ab', 2: 'cd', 3: 'ef', 4: 'g', 5: 'h'}
print(list(result.keys()))
# [1, 2, 3, 4, 5]
print(list(result.values()))
# ['ab', 'cd', 'ef', 'g', 'h']
Obviously, if the items in B are neither strings nor numbers (int in this case), you'll need to modify the code a little to get a suitable default value. Or just use else inside the loop:
if item not in result:
    result[item] = B[idx]
else:
    result[item] += B[idx]
Here, C is the unique values of A:
C = sorted(set(A))
gives:
[1, 2, 3, 4, 5]
Now, D is the vertical binning of B w.r.t A (if B's elements are alpha):
D = [''.join(B[i] for i in range(len(B)) if A[i] == j) for j in C]
if B's elements are num:
D = [sum(B[i] for i in range(len(B)) if A[i] == j) for j in C]
gives:
['ab', 'cd', 'ef', 'g', 'h']
Note:
A = [1,1,2,2,3,3,4,5]
B = ['a','b','c','d','e','f','g','h']
Here a,b,c,... if numeric, go for the second eqn :)

Python: Unpacking arrays of tuples or of arrays - unpacking more than 2 elements per array or tuple

My aim is to get a more elegant unpacking of a sub-tuple or sub-list for longer tuples or longer lists.
For example, I have an array with sub-arrays
s = [['yellow', 1,5,6], ['blue', 2,8,3], ['yellow', 3,4,7], ['blue',4,9,1], ['red', 1,8,2,11]]
Experimenting with an array of sub-tuples or sub-lists with 2 elements each, I have the following:
s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
OR
s = [['yellow', 1], ['blue', 2], ['yellow', 3], ['blue', 4], ['red', 1]]
I can unpack 's' whether it has tuples or lists:
for k, v in s:
    print('k = {0}, v = {1}'.format(k,v))
Produces the result
k = yellow, v = 1
k = blue, v = 2
k = yellow, v = 3
k = blue, v = 4
k = red, v = 1
Suppose I have the following array with sub-arrays of four elements each:
bongo = [[1, 2, 3, 4], [6, 3, 2, 3], [5, 7, 11, 15], [2, 4, 7, 8]]
I can unpack 'bongo' using the variables a,b,c,d
for a,b,c,d in bongo:
    print('a = {0}, b = {1}, c={2}, d={3}'.format(a,b,c,d))
a = 1, b = 2, c=3, d=4
a = 6, b = 3, c=2, d=3
a = 5, b = 7, c=11, d=15
a = 2, b = 4, c=7, d=8
Despite being able to unpack those sub-arrays, I seem to have a problem unpacking a mixed 'chr' and number sub-list (or sub-tuple; not shown, but I get the same result):
s = [['yellow', 1,5,6], ['blue', 2,8,3], ['yellow', 3,4,7], ['blue', 4,9,1], ['red', 1,8,2,11]]
That is, when unpacking I get the desired result for the first four rows, followed by an error:
for a,b,c,d in s:
    print('a = {0}, b = {1}, c = {2}, d = {3} '.format(a,b,c,d))
a = yellow, b = 1, c = 5, d = 6
a = blue, b = 2, c = 8, d = 3
a = yellow, b = 3, c = 4, d = 7
a = blue, b = 4, c = 9, d = 1
Traceback (most recent call last):
File "<pyshell#288>", line 1, in <module>
for a,b,c,d in s:
ValueError: too many values to unpack (expected 4)
My question: Is there a more elegant way of unpacking, such that I would like to get the first element, say as a key, and the rest?
To illustrate with pseudo-code - it does not work directly in python:
for k[0][0], v[0][1:4] in s:
    print('k[0][0] = {0}, v[0][1:4] = {1}'.format(k[0][0],v[0][1:4]))
Such as to get the following output:
a = yellow, b = 1, c = 5, d = 6
a = blue, b = 2, c = 8, d = 3
a = yellow, b = 3, c = 4, d = 7
a = blue, b = 4, c = 9, d = 1
Inspiration:
Experimenting with the defaultdict at para 3.4.1 https://docs.python.org/3/library/collections.html#collections.defaultdict particularly the unpacking of an array with a sub-tuple.
Thank you,
Anthony of Sydney
You can convert to your desired format first (note that a dict keeps only the last row for each repeated key):
>>> ss = {x[0]: x[1:] for x in s}
>>> ss
{'blue': [4, 9, 1], 'red': [1, 8, 2, 11], 'yellow': [3, 4, 7]}
>>> for s, v in ss.items():
...     print("a = {0} b = {1} c = {2} d = {3}".format(s, *v))
...
a = blue b = 4 c = 9 d = 1
a = red b = 1 c = 8 d = 2
a = yellow b = 3 c = 4 d = 7
>>>
Further to Mr Azim's answer: in the 5th line he used *v. This inspired me to experiment with applying it to an array/tuple/list instead of a dictionary.
This code produces the same result:
s = [('yellow', 1, 5, 6), ('blue', 2, 8, 3), ('green', 4, 9, 1), ('red', 1, 8, 2)]
for x, *y in s:
    temparray = [b for b in y]  # note we don't use *y
    print('x = {0}, temparray = {1}'.format(x, temparray))
as
for x, *y in s:
    print('x = {0}, y = {1}'.format(x,y))  # note we don't use *y
x = yellow, y = [1, 5, 6]
x = blue, y = [2, 8, 3]
x = green, y = [4, 9, 1]
x = red, y = [1, 8, 2]
type(y)
<class 'list'>
Conclusion: the * operator can be applied not only when formatting the values of a dictionary, but also when unpacking arrays/tuples/lists. When applied in a 'for' loop, as in
for var1, *var2 in aListorTupleorArray:
    # var1 gets the first element of each sub-list or sub-tuple
    # var2 gets the remaining elements, collected in a list
    print('var1 = {0}, var2 = {1}'.format(var1,var2))  # Note we don't use the * in *var2, just var2
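As a sketch of the same idea applied to rows of different lengths, such as the five-element 'red' row from the question (the names key, rest, first, middle and last are only for illustration; starred targets require Python 3):

```python
s = [['yellow', 1, 5, 6], ['blue', 2, 8, 3], ['red', 1, 8, 2, 11]]

# key takes the first element; rest collects whatever remains, as a list.
for key, *rest in s:
    print('key = {0}, rest = {1}'.format(key, rest))
# key = yellow, rest = [1, 5, 6]
# key = blue, rest = [2, 8, 3]
# key = red, rest = [1, 8, 2, 11]

# The starred name may also sit in the middle of the target list.
first, *middle, last = s[2]
print(first, middle, last)  # red [1, 8, 2] 11
```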
Thanks,
Anthony of exciting Sydney
Here is a subtle difference between printing *v and v.
#printing v in the loop
for s,v in ss.items():
    print("s = {0}, v = {1}".format(s,v))  #printing s & v
s = yellow, v = [3, 4, 7]
s = blue, v = [4, 9, 1]
s = red, v = [1, 8, 2]
Then we have
#printing *v in the loop
for s,v in ss.items():
    print("s = {0}, *v = {1}".format(s,*v))  #printing s & *v
s = yellow, *v = 3
s = blue, *v = 4
s = red, *v = 1
Here *v passes the elements of v to format() as separate arguments, and since the format string only refers to {0} and {1}, only the first element of v is printed.
Note the subtlety here: if we use *v in the 'for' loop itself, s gets the key and v becomes a list wrapping the one remaining item (the value), so printing v shows a nested list, while printing *v unwraps it again:
#printing v in the loop
for s,*v in ss.items():
    print("s = {0}, v = {1}".format(s,v))  #printing s & v
produces
s = yellow, v = [[3, 4, 7]]
s = blue, v = [[4, 9, 1]]
s = red, v = [[1, 8, 2]]
whereas
#printing *v in the loop
for s,*v in ss.items():
    print("s = {0}, v = {1}".format(s,*v))  #printing s & *v
produces
s = yellow, v = [3, 4, 7]
s = blue, v = [4, 9, 1]
s = red, v = [1, 8, 2]
Thank you,
Anthony of Sydney

Number of pairs

I am trying to write code that takes
- a, a list of integers
- b, an integer
and returns the number of pairs (m,n) with m,n in a such that |m-n|<=b.
So far, I've got this
def nearest_pairs(a, b):
    m= []
    n= int
    num_pairs = 0
    return num_pairs
def main():
    # The nearest pairs are (1,2), (2,1), (2,5) and (5,2)
    x = nearest_pairs( [1,2,5] , 3 )
    print( "nearest_pairs([1, 2, 5], 3) = " , nearest_pairs([1, 2, 5], 3) )
    # The nearest pairs are (1,2) and (2,1)
    y = nearest_pairs( [1, 2, 5] , 2 )
    print( "nearest_pairs([1, 2, 5], 2) = " , nearest_pairs([1, 2, 5], 2) )
if __name__ == '__main__':
    main()
The desired output should look like
>>> nearest_pairs([1,2,5],3) = 4
where 4 is the number of close pairs according to the restrictions. However, I get an error. Could anyone point me in the right direction?
Yours doesn't make sense. No idea what you're trying with len(a, b), but it's not even allowed, since len takes only one argument. And returning something just when you found the first counting pair? Here's a fix:
from itertools import permutations
def close_pairs(l, d):
    ctr = 0
    for a,b in permutations(l, 2):
        if (a - b) <= d and (b - a) <= d:
            ctr += 1
    return ctr
And here's how I'd do it:
def close_pairs(l, d):
    return sum(abs(a-b) <= d for a, b in permutations(l, 2))
from itertools import permutations
def nearest_pairs(a, b):
    for m, n in permutations(a, 2):
        if abs(m - n) <= b:
            yield (m, n)
>>> list(nearest_pairs([1, 2, 5], 3))
[(1, 2), (2, 1), (2, 5), (5, 2)]
>>> list(nearest_pairs([1, 2, 5], 2))
[(1, 2), (2, 1)]
If you just want the count:
def nearest_pairs_count(a, b):
    c, l = 0, len(a)
    for i in range(l):
        for j in range(i + 1, l):
            if abs(a[i] - a[j]) <= b:
                c += 2
    return c
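The same count can also be sketched with itertools.combinations, which visits each unordered pair of positions once; doubling the result accounts for both (m, n) and (n, m). This is an alternative formulation of the loop above, not a different algorithm:

```python
from itertools import combinations

def nearest_pairs_count(a, b):
    """Count ordered pairs (m, n) taken from distinct positions with |m - n| <= b."""
    unordered = sum(abs(m - n) <= b for m, n in combinations(a, 2))
    return 2 * unordered  # each unordered pair counts as (m, n) and (n, m)

print(nearest_pairs_count([1, 2, 5], 3))  # 4
print(nearest_pairs_count([1, 2, 5], 2))  # 2
```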

Finding the amount of equal elements in the beginning of a list

Given a list in python, I would like to find how many equal elements are in the beginning of the list.
Example input:
x1 = ['a','a','b','c','a','a','a','c']
x2 = [1, 1, 1, 3, 1, 1, 1, 8]
x3 = ['foo','bar','foobar']
Some magical function (or a one liner) would output:
f(x1) = 2 # There are 2 'a' values in the beginning.
f(x2) = 3 # There are 3 1-values in the beginning.
f(x3) = 1 # Only 1 'foo' in beginning.
If I do:
sum([1 if x=='a' else 0 for x in x1])
I just get the number of occurrences of 'a' in x1, not the number of leading values in a row. Would be nice to have a one liner which doesn't need to know the first value.
itertools.groupby can help ...
from itertools import groupby
def f(lst):
    if_empty = ('ignored_key', ())
    k, v = next(groupby(lst), if_empty)
    return sum(1 for _ in v)
And of course we can turn this into a 1-liner (sans the import):
sum(1 for _ in next(groupby(lst), ('ignored', ()))[1])
But I wouldn't really recommend it.
demo:
>>> from itertools import groupby
>>>
>>> def f(lst):
...     if_empty = ('ignored_key', ())
...     k, v = next(groupby(lst), if_empty)
...     return sum(1 for _ in v)
...
>>> f(x1)
2
>>> f(x2)
3
>>> f(x3)
1
>>> f([])
0
You can use takewhile.
import itertools
xs = [1, 1, 1, 3, 1, 1, 1, 8]
sum(1 for _ in itertools.takewhile(lambda x: x == xs[0], xs))
In a function:
def count_first(iterable):
    i = iter(iterable)
    first = next(i)
    return 1 + sum(1 for _ in itertools.takewhile(lambda x: x == first, i))
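Note that count_first raises StopIteration when the iterable is empty, because next(i) has nothing to return. A possible variant that returns 0 instead is sketched below; the name count_leading and the sentinel approach are illustrative choices, not part of the answer above.

```python
from itertools import takewhile

def count_leading(iterable):
    """Length of the run of equal elements at the start; 0 for empty input."""
    it = iter(iterable)
    sentinel = object()
    first = next(it, sentinel)   # the default avoids StopIteration on empty input
    if first is sentinel:
        return 0
    return 1 + sum(1 for _ in takewhile(lambda x: x == first, it))

print(count_leading(['a', 'a', 'b', 'c', 'a']))  # 2
print(count_leading([]))                          # 0
```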
Maybe it is better to look for the first occurrence of something that is not equal to the first value:
x1 = ['a','a','b','c','a','a','a','c']
x2 = [1, 1, 1, 3, 1, 1, 1, 8]
x3 = ['foo','bar','foobar']
x4 = []
x5 = [1,1,1,1,1,1]
def f(x):
    pos = -1
    for pos,a in enumerate(x):
        if a!=x[0]:
            return pos
    return pos+1
print(f(x1))
print(f(x2))
print(f(x3))
print(f(x4))
print(f(x5))
2
3
1
0
6

How do you calculate the greatest number of repetitions in a list?

If I have a list in Python like
[1, 2, 2, 2, 2, 1, 1, 1, 2, 2, 1, 1]
How do I calculate the greatest number of repeats for any element? In this case 2 is repeated a maximum of 4 times and 1 is repeated a maximum of 3 times.
Is there a way to do this but also record the index at which the longest run began?
Use groupby; it groups consecutive equal elements:
from itertools import groupby
group = groupby([1, 2, 2, 2, 2, 1, 1, 1, 2, 2, 1, 1])
print(max(group, key=lambda k: len(list(k[1]))))
And here is the code in action:
>>> group = groupby([1, 2, 2, 2, 2, 1, 1, 1, 2, 2, 1, 1])
>>> print(max(group, key=lambda k: len(list(k[1]))))
(2, <itertools._grouper object at 0xb779f1cc>)
>>> group = groupby([1, 2, 2, 2, 2, 1, 1, 1, 2, 2, 1, 1, 3, 3, 3, 3, 3])
>>> print(max(group, key=lambda k: len(list(k[1]))))
(3, <itertools._grouper object at 0xb7df95ec>)
From the Python documentation:
The operation of groupby() is similar to the uniq filter in Unix. It generates a break or new group every time the value of the key function changes
# [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
# [list(g) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
If you also want the index of the longest run you can do the following:
group = groupby([1, 2, 2, 2, 2, 1, 1, 1, 2, 2, 1, 1, 3, 3, 3, 3, 3])
result = []
index = 0
for k, g in group:
    length = len(list(g))
    result.append((k, length, index))
    index += length
print(max(result, key=lambda a:a[1]))
Loop through the list, keep track of the current number and how many times it has been repeated, and compare that to the most times you've seen that number repeated.
Counts={}
Current=0
Current_Count=0
LIST = [1, 2, 2, 2, 2, 1, 1, 1, 2, 2, 1, 1]
for i in LIST:
    if Current == i:
        Current_Count += 1
    else:
        Current_Count=1
        Current=i
    # .get avoids a KeyError the first time a value is seen
    if Current_Count > Counts.get(i, 0):
        Counts[i]=Current_Count
print(Counts)
If you want it for just any element (i.e. the element with the most repetitions), you could use:
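from functools import reduce  # reduce is a builtin only in Python 2
def f(acc, x):
    v, l, m = acc  # previous value, current run length, longest run so far
    nl = l+1 if x==v else 1
    return (x, nl, max(m,nl))
maxrep = reduce(f, l, (0,0,0))[2]
This only counts continuous repetitions (Result for [1,2,2,2,1,2] would be 3) and only records the element with the maximum number.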
Edit: Made definition of f a bit shorter ...
This is my solution:
def longest_repetition(l):
    if l == []:
        return None
    element = l[0]
    new = []
    lar = []
    for e in l:
        if e == element:
            new.append(e)
        else:
            if len(new) > len(lar):
                lar = new
            new = []
            new.append(e)
            element = e
    if len(new) > len(lar):
        lar = new
    return lar[0]
- You can make a new copy of the list with only the unique values, plus a corresponding list of hits (this counts all occurrences, not only consecutive runs).
- Then take the max of the hits list and use its index to get your most repeated item.
oldlist = ["A", "B", "E", "C","A", "C","D","A", "E"]
newlist=[]
hits=[]
for i in range(len(oldlist)):
    if oldlist[i] in newlist:
        hits[newlist.index(oldlist[i])]+= 1
    else:
        newlist.append(oldlist[i])
        hits.append(1)
#find the most repeated item
temp_max_hits=max(hits)
temp_max_hits_index=hits.index(temp_max_hits)
print(newlist[temp_max_hits_index])
print(temp_max_hits)
But I don't know whether this is the fastest way to do it or whether there are faster solutions.
If you think there is a faster or more efficient solution, kindly let us know.
I'd use a hashmap of item to counter.
Every time you see a 'key' succession, increment its counter value. If you hit a new element, set the counter to 1 and keep going. At the end of this linear search, you should have the maximum succession count for each number.
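A minimal sketch of that idea, assuming "succession" means a run of consecutive equal items (the function name longest_runs and the sentinel are made up for illustration):

```python
def longest_runs(items):
    """Map each value to the length of its longest consecutive run."""
    best = {}
    previous = object()  # sentinel that compares unequal to every item
    run = 0
    for item in items:
        # Extend the current run, or start a new one when the value changes.
        run = run + 1 if item == previous else 1
        previous = item
        if run > best.get(item, 0):
            best[item] = run
    return best

print(longest_runs([1, 2, 2, 2, 2, 1, 1, 1, 2, 2, 1, 1]))  # {1: 3, 2: 4}
```

The overall longest run can then be read off with max(best.items(), key=lambda kv: kv[1]).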
This code seems to work:
l = [1, 2, 2, 2, 2, 1, 1, 1, 2, 2, 1, 1]
previous = None
# value/repetition pair
greatest = (-1, -1)
# start at 0 so the placeholder run of None never ties with a real run of length 1
reps = 0
for e in l:
    if e == previous:
        reps += 1
    else:
        if reps > greatest[1]:
            greatest = (previous, reps)
        previous = e
        reps = 1
if reps > greatest[1]:
    greatest = (previous, reps)
print(greatest)
I wrote this code and it works easily (note that it counts all occurrences of a value, not only consecutive ones):
lst = [4,7,2,7,7,7,3,12,57]
maximum=0
for i in lst:
    count = lst.count(i)
    if count>maximum:
        maximum=count
        indexx = lst.index(i)
print(lst[indexx])
