Python - double loop for element-wise string concatenation

My question is very simple:
I have A = ['AA','BB'], B = ['CC','DD']
How do I get AB = ['AACC','AADD','BBCC','BBDD']?
Thank you!

You can use itertools.product:
>>> from itertools import product
>>> A = ['AA','BB']
>>> B = ['CC','DD']
>>> AB = [''.join(p) for p in product(A, B)]
>>> AB
['AACC', 'AADD', 'BBCC', 'BBDD']
This has the benefit of working with any number of iterables.

It's easier to see what's happening in an explicit loop. Here i takes each value in a (AA, then BB) and j takes each value in b (CC, then DD). On the first iteration we combine AA + CC and append the result to the new list; next comes AA + DD, then we move on to BB and the process repeats.
a = ['AA','BB']
b = ['CC','DD']
res = []
for i in a:
    for j in b:
        x = i + j
        res.append(x)
print(res)
# ['AACC', 'AADD', 'BBCC', 'BBDD']
Once you understand this, you can skip the explicit loops and use a list comprehension, which produces an identical result.
res = [i + j for i in a for j in b]
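As a quick sanity check, the sketch below runs the explicit loop, the list comprehension, and itertools.product side by side and confirms that they build the same list. The variable names nested_loop, comprehension and via_product are made up for this illustration.

```python
from itertools import product

a = ['AA', 'BB']
b = ['CC', 'DD']

# Explicit double loop: the outer loop walks a, the inner loop walks b.
nested_loop = []
for i in a:
    for j in b:
        nested_loop.append(i + j)

# The same iteration order written as a list comprehension.
comprehension = [i + j for i in a for j in b]

# The same pairs generated by itertools.product, then joined.
via_product = [''.join(pair) for pair in product(a, b)]

print(nested_loop == comprehension == via_product)  # True
print(comprehension)  # ['AACC', 'AADD', 'BBCC', 'BBDD']
```

In the comprehension the for clauses are read left to right, so the leftmost one corresponds to the outer loop.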

With a list comprehension:
AB = [x + y for x in A for y in B]
We thus iterate over the elements in A, and for each element x in A we iterate over B and add x + y to the list.
Or, for a variable number of lists, lazily with map (which returns an iterator in Python 3):
from itertools import product
map(''.join, product(A, B))
This extends easily to any number of lists, for example:
>>> A = ['AA','BB']; B = ['CC','DD']; C = ['EE', 'FF']
>>> list(map(''.join, product(A, B, C)))
['AACCEE', 'AACCFF', 'AADDEE', 'AADDFF', 'BBCCEE', 'BBCCFF', 'BBDDEE', 'BBDDFF']

Related

Python list comprehension or lambda possible for my given code snippet?

Edit: Removed undefined variable.
My code checks whether each value of one list is present in another. If so, the value is appended to a third list; if not, it is appended to a fourth list. What is the most efficient and readable way to do this? Example of my code:
a = [1,2,3]
b = [2,3,4,5,6,7]
c = []
d = []
for ele in a:
    if ele in b:
        c.append(ele)
    else:
        d.append(ele)
a=[2,3,4,5]
b=[3,5,7,9]
c = [value for value in a if value in b]
d = [value for value in a if value not in b]
print(f'Present in B: {c}')
print(f"Not present in B: {d}")
c = [i for i in a if i in b]
d = [i for i in a if i not in b]
If element order and duplicates do not matter, sets are a good way to solve this:
import random
a = [random.randint(1, 15) for _ in range(5)]
b = [random.randint(1, 15) for _ in range(7)]
print(a)
print(b)
set_a = set(a)
set_b = set(b)
set_intersection = set_a.intersection(set_b)
set_diff = set_a.difference(set_b)
print(list(set_intersection))
print(list(set_diff))
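Converting to sets discards the original order and any duplicates. If those matter, one possible middle ground is to keep the list comprehensions but test membership against a set. The helper name split_by_membership is hypothetical, and the sketch assumes the items are hashable.

```python
def split_by_membership(values, reference):
    """Return (present, missing), both in the original order of values."""
    lookup = set(reference)  # average O(1) membership tests; items must be hashable
    present = [v for v in values if v in lookup]
    missing = [v for v in values if v not in lookup]
    return present, missing


a = [1, 2, 3]
b = [2, 3, 4, 5, 6, 7]
c, d = split_by_membership(a, b)
print(c)  # [2, 3]
print(d)  # [1]
```

For lists as short as these the difference is negligible; the set only pays off when the reference list is large.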

Creating a list with a specified length by combining other lists

I am trying to generate a list that combines elements of two other lists, one holding numbers and one holding strings.
I've tried keeping two separate lists and using join and append to combine the elements at a certain stage.
To match the length of list d to list a, I've used a while loop as a counter.
a=7*[1]
b=[1,2,3,4,5]
c=['a','b','c']
d=[]
The outcome I'm trying to achieve is that list d has the same length as list a and is a combination of list b and list c:
d=['1a','1b','1c','2a','2b','2c','3a']
I can only think of a naive solution for now:
def create(bk, ck, len_required):
    dk = []
    for bitem in bk:
        for citem in ck:
            dk.append(str(bitem) + citem)
            if len(dk) == len_required:
                return dk
    return dk  # fewer than len_required combinations exist
len_required = len(a)
b = [1, 2, 3, 4, 5]
c = ['a', 'b', 'c']
d = create(b, c, len_required)
result = [str(b[(i // len(c)) % len(b)]) + str(c[i % len(c)]) for i in range(len(a))]
This iterates i from 0 to len(a) - 1 and concatenates b[(i // len(c)) % len(b)] and c[i % len(c)] for each i.
You could do it with a list comprehension:
d = [str(v)+L for v in b*len(a) for L in c][:len(a)]
or, if you're allowed to use itertools:
from itertools import cycle
cycleA = cycle(str(v)+L for v in b for L in c)
d = [ next(cycleA) for _ in a ]
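A variation on the itertools idea, offered only as a sketch: product generates the pairs, cycle repeats them when more items are needed than there are combinations, and islice cuts the stream at the requested length. The function name combine_to_length is made up for this example.

```python
from itertools import cycle, islice, product


def combine_to_length(numbers, letters, length):
    """Build 'number + letter' strings, cycling through the pairs until length is reached."""
    pairs = (str(n) + s for n, s in product(numbers, letters))
    # cycle() replays the pairs once they are exhausted; islice() stops after `length` items.
    return list(islice(cycle(pairs), length))


a = 7 * [1]
b = [1, 2, 3, 4, 5]
c = ['a', 'b', 'c']
d = combine_to_length(b, c, len(a))
print(d)  # ['1a', '1b', '1c', '2a', '2b', '2c', '3a']
```

Because islice stops early, only as many pairs are generated as are actually needed.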

How to compare two lists to keep matching substrings?

As best I can describe it, I have two lists of strings and I want to return all results from list A that contain any of the strings in list B. Here are details:
A = ['dataFile1999', 'dataFile2000', 'dataFile2001', 'dataFile2002']
B = ['2000', '2001']
How do I return
C = ['dataFile2000', 'dataFile2001']?
I've been looking into list comprehensions, doing something like below
C=[x for x in A if B in A]
but I can't seem to make it work. Am I on the right track?
You were close, use any:
C=[x for x in A if any(b in x for b in B)]
More detailed:
A = ['dataFile1999', 'dataFile2000', 'dataFile2001', 'dataFile2002']
B = ['2000', '2001']
C = [x for x in A if any(b in x for b in B)]
print(C)
Output
['dataFile2000', 'dataFile2001']
You can use any() to check if any element of your list B is in x:
A = ['dataFile1999', 'dataFile2000', 'dataFile2001', 'dataFile2002']
B = ['2000', '2001']
c = [x for x in A if any(k in x for k in B)]
print(c)
Output:
['dataFile2000', 'dataFile2001']
First, I would construct a set of the years for O(1) lookup time. [1]
>>> A = ['dataFile1999', 'dataFile2000', 'dataFile2001', 'dataFile2002']
>>> B = ['2000', '2001']
>>>
>>> years = set(B)
Now, keep only the elements of A that end with an element of years.
>>> [file for file in A if file[-4:] in years]
['dataFile2000', 'dataFile2001']
[1] If you have very small lists (two elements certainly qualify), keep the lists. Sets have O(1) lookup, but the hashing still introduces overhead.
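If the year is always a suffix, as it is in the sample data, another option is str.endswith, which accepts a tuple of candidate suffixes. This is only a sketch under that assumption; it will not find a year that appears in the middle of a name.

```python
A = ['dataFile1999', 'dataFile2000', 'dataFile2001', 'dataFile2002']
B = ['2000', '2001']

suffixes = tuple(B)  # endswith() takes a str or a tuple of str, not a list
C = [name for name in A if name.endswith(suffixes)]
print(C)  # ['dataFile2000', 'dataFile2001']
```

Unlike the file[-4:] slice, this does not require every suffix to have the same length.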

What is the best way to split a variable length string into variables in Python?

Suppose I have a string of integers separated by commas of variable length. What is the best way to split the string and store the integers into variables?
Currently, I have the following.
input = sys.argv[1]
mylist = [int(x) for x in input.split(',')]
if len(mylist) == 2: a, b = mylist
else: a, b, c = mylist
Is there a more efficient way of doing this?
Add sentinels, then limit the list to 3 elements:
a, b, c = (mylist + [None] * 3)[:3]
Now a, b and c are at the very least set to None, and if the number of items is more than three only the first three values are used.
Demo:
>>> mylist = [1, 2]
>>> a, b, c = (mylist + [None] * 3)[:3]
>>> print(a, b, c)
1 2 None
>>> mylist = [1, 2, 3, 4]
>>> a, b, c = (mylist + [None] * 3)[:3]
>>> print(a, b, c)
1 2 3
If you need at least 2 elements, use fewer None values and catch ValueError:
try:
    a, b, c = (mylist + [None])[:3]
except ValueError:
    print("You must specify at least 2 values")
    sys.exit(1)
Just an addendum to Martjin's answer: I turned it into a function to show why you might use it. You can pad with a dynamic number of sentinels using:
def store(mylist, expsiz=10, dflt=None):
    return mylist + [dflt] * (expsiz - len(mylist))
>>> mylist = [1, 2, 5]
>>> fixedlen = store(mylist)
>>> print(fixedlen)
[1, 2, 5, None, None, None, None, None, None, None]
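In Python 3, extended unpacking offers another way to handle "two required values plus an optional third" without padding the list. The sketch below is one possible arrangement; the function name parse_two_or_three and its default parameter are assumptions made for illustration.

```python
def parse_two_or_three(text, default=None):
    """Parse 'a,b' or 'a,b,c,...' into a tuple (a, b, c), filling c with default if absent."""
    values = [int(x) for x in text.split(',')]
    if len(values) < 2:
        raise ValueError("You must specify at least 2 values")
    a, b, *rest = values              # rest collects everything after the first two
    c = rest[0] if rest else default  # extra values beyond the third are ignored
    return a, b, c


print(parse_two_or_three("1,2"))      # (1, 2, None)
print(parse_two_or_three("1,2,3,4"))  # (1, 2, 3)
```

The starred name always receives a list, possibly empty, so the unpacking itself only fails when fewer than two values are present.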

Cross-list comprehension in Python

Let's say I have two lists of strings:
a = ['####/boo', '####/baa', '####/bee', '####/bii', '####/buu']
where #### represents 4-digit random number. And
b = ['boo', 'aaa', 'bii']
I need to know which string entries in list a contain any given entry in b. I was able to accomplish this with a couple of nested loops, using the in operator to check whether the string contains the current entry in b. But, being relatively new to Python, I'm almost positive this was not the most Pythonic or elegant way to write it. Is there an idiom that would simplify my solution?
The following code gives you a list with the indexes of a where the part after the slash is an element of b.
a_sep = [x.split('/')[1] for x in a]
idxs = [i for i, x in enumerate(a_sep) if x in b]
To improve performance, make b a set instead of a list.
Demo:
>>> a = ['####/boo', '####/baa', '####/bee', '####/bii', '####/buu']
>>> b = ['boo', 'aaa', 'bii']
>>> a_sep = [x.split('/')[1] for x in a]
>>> idxs = [i for i, x in enumerate(a_sep) if x in b]
>>> idxs
[0, 3]
>>> [a[i] for i in idxs]
['####/boo', '####/bii']
If you prefer to get the elements directly instead of the indexes:
>>> a = ['####/boo', '####/baa', '####/bee', '####/bii', '####/buu']
>>> b = ['boo', 'aaa', 'bii']
>>> [x for x in a if x.split('/')[1] in b]
['####/boo', '####/bii']
ThiefMaster's answer is good, and mine will be quite similar, but if you don't need to know the indexes, you can take a shortcut:
>>> a = ['####/boo', '####/baa', '####/bee', '####/bii', '####/buu']
>>> b = ['boo', 'aaa', 'bii']
>>> [x for x in a if x.split('/')[1] in b]
['####/boo', '####/bii']
Again, if b is a set, that will improve performance for large numbers of elements.
import random
a=[str(random.randint(1000,9999))+'/'+e for e in ['boo','baa','bee','bii','buu']]
b = ['boo', 'aaa', 'bii']
c=[x.split('/')[-1] for x in a if x.split('/')[-1] in b]
print(c)
prints:
['boo', 'bii']
Or, if you want the entire entry:
print([x for x in a if x.split('/')[-1] in b])
prints:
['3768/boo', '9110/bii']
>>> [i for i in a for j in b if j in i]
['####/boo', '####/bii']
This should do what you want; note that an entry of a is repeated once for every entry of b it contains.
As other answers have indicated, you can use set operations to make this faster. Here's a way to do this:
>>> a_dict = dict((item.split('/')[1], item) for item in a)
>>> common = set(a_dict) & set(b)
>>> [a_dict[i] for i in common]
['####/boo', '####/bii']
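Iterating over a set does not guarantee any particular order, and a dict keyed on the suffix keeps only one entry per suffix. If the original order of a matters, a sketch along the following lines may be preferable. The four-digit prefixes are invented sample data, and wanted and matches are illustrative names.

```python
a = ['1234/boo', '5678/baa', '9012/bee', '3456/bii', '7890/buu']  # made-up prefixes
b = ['boo', 'aaa', 'bii']

wanted = set(b)  # set membership tests are O(1) on average
# rpartition('/') returns (head, sep, tail); index 2 is the text after the last slash,
# or the whole string when there is no slash at all.
matches = [entry for entry in a if entry.rpartition('/')[2] in wanted]
print(matches)  # ['1234/boo', '3456/bii']
```

Using rpartition instead of split('/')[1] also avoids an IndexError for entries that contain no slash.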
