Generate a sequence of numbers and alternating strings in Python

Aim
I would like to generate a sequence as list in python, such as:
['s1a', 's1b', 's2a', 's2b', ..., 's10a', 's10b']
Properties:
items contain a single prefix
numbers are sorted numerically
suffix is alternating per number
Approach
To get this, I applied the following code, using xrange and a list comprehension:
# prefix
p = 's'
# suffix
s = ['a', 'b']
# numbers
n = [ i + 1 for i in list(xrange(10))]
# result
[ p + str(i) + j for i, j in zip(sorted(n * len(s)), s * len(n)) ]
Question
Is there a more simple syntax to obtain the results, e.g. using itertools?
Similar to this question?

A doubled-for list comprehension can accomplish this:
['s'+str(x)+y for x in range(1,11) for y in 'ab']

itertools.product might be your friend:
all_combos = ["".join(map(str, x)) for x in itertools.product(p, n, s)]
returns:
['s1a', 's1b', 's2a', 's2b', 's3a', 's3b', 's4a', 's4b', 's5a', 's5b', 's6a', 's6b', 's7a', 's7b', 's8a', 's8b', 's9a', 's9b', 's10a', 's10b']
EDIT: as a one-liner:
all_combos = ["".join(map(str,x)) for x in itertools.product(['s'], range(1, 11), ['a', 'b'])]
EDIT 2: as pointed out in James' answer, we can change our listed string element in the product call to just strings, and itertools will still be able to iterate over them, selecting characters from each:
all_combos = ["".join(map(str,x)) for x in itertools.product('s', range(1, 11), 'ab')]

How about:
def func(prefix, suffixes, size):
    k = len(suffixes)
    # integer division (//) keeps the number an int on Python 3
    return [prefix + str(n // k + 1) + suffixes[n % k] for n in range(size * k)]
# usage example:
print(func('s', ['a', 'b'], 10))
This way you can alternate as many suffixes as you want.
And of course, each one of the suffixes can be as long as you want.

You can use a double-list comprehension, where you iterate on number and suffix. You don't need to load any
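Below is a lambda function that takes 3 parameters: a prefix, a number of iterations, and a list of suffixes. The number is the outer loop and starts at 1, so the suffixes alternate per number as in the question.
foo = lambda prefix, n, suffix: [prefix + str(i) + s for i in range(1, n + 1) for s in suffix]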
You can use it like this
foo('p',10,'abc')
Or like that, if your suffixes have more than one letter
foo('p',10,('a','bc','de'))

For maximum versatility I would do this as a generator. That way you can either create a list, or just produce the sequence items as they are needed.
Here's code that runs on Python 2 or Python 3.
def psrange(prefix, suffix, high):
    return ('%s%d%s' % (prefix, i, s) for i in range(1, 1 + high) for s in suffix)
res = list(psrange('s', ('a', 'b'), 10))
print(res)
for s in psrange('x', 'abc', 3):
    print(s)
output
['s1a', 's1b', 's2a', 's2b', 's3a', 's3b', 's4a', 's4b', 's5a', 's5b', 's6a', 's6b', 's7a', 's7b', 's8a', 's8b', 's9a', 's9b', 's10a', 's10b']
x1a
x1b
x1c
x2a
x2b
x2c
x3a
x3b
x3c
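To illustrate the "as they are needed" point, here is a minimal sketch that pulls only the first few items from a very long sequence. It repeats the psrange generator from above; the use of itertools.islice and the name first_five are my own additions for illustration, and the sketch assumes Python 3, where range is lazy.

```python
from itertools import islice

def psrange(prefix, suffix, high):
    # Same generator as above: yields prefix + number + suffix lazily.
    return ('%s%d%s' % (prefix, i, s)
            for i in range(1, 1 + high)
            for s in suffix)

# Take only the first five items of a sequence with two million entries;
# the remaining items are never built.
first_five = list(islice(psrange('s', 'ab', 10**6), 5))
print(first_five)  # ['s1a', 's1b', 's2a', 's2b', 's3a']
```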

Related

Deleting list from a list of lists

I need to delete the lists inside a list that contain the / symbol.
List for example:
X = [['a/','$1'], ["c","d"]]
so X[0] should be deleted. The actual lists are much longer and contain more instances of this condition.
I tried to use something like:
print([l for l in X if l.count("/") <1])
But if I understand correctly, because the / is attached to another symbol, it is not counted.
Should I convert this list of lists to a string, separate the / from the other characters, and then use the count function, or is there a better solution?
One way to search "/" in each item in the sublists is to wrap a generator expression with any. Since you don't want sublists with "/" in it, the condition should be not any():
out = [lst for lst in X if not any('/' in x for x in lst)]
Output:
[['c', 'd']]
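A small sketch of why the count attempt in the question finds nothing: list.count compares whole elements, while the in operator on a string searches for a substring. The variable names whole_matches and substring_hits are made up for this illustration.

```python
X = [['a/', '$1'], ['c', 'd']]

# list.count compares whole elements, so 'a/' is not counted as '/'.
whole_matches = X[0].count('/')

# The `in` operator on a string looks for a substring instead.
substring_hits = ['/' in item for item in X[0]]

print(whole_matches)   # 0
print(substring_hits)  # [True, False]
```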
The call to filter() applies that lambda function to every list in X and filters out list with '/'.
result = list(filter(lambda l: not any('/' in s for s in l), X))
counter = 0
while counter < len(X):
    removed = False
    for i in X[counter]:
        if '/' in i:
            X.pop(counter)
            removed = True
            break
    if not removed:
        counter += 1
Given:
X = [['a/','$1'], ["c","d"]]
You can convert the sub lists to their repr string representations and detect the / in that string:
new_x = [sl for sl in X if '/' not in repr(sl)]
Or, you can use next with a default, which stops at the first item that contains a /:
new_x = [sl for sl in X if not next((True for s in sl if '/' in s), False)]
Either:
>>> new_x
[['c', 'd']]

Filtering for tuples from another list and extracting values

I am working on handling two lists of tuples and deducing results.
For example:
A = [('Hi','NNG'),('Good','VV'),...n]
B = [('Happy','VA',1.0),('Hi','NNG',0.5)...n]
First, I'd like to match the words between A and B.
like 'Hi'='Happy' or 'Hi'='Hi'
Second, if they are same and match, then match word class.
whether 'NNG'='NNG' or 'NNG'='VV'
Third, if all these steps match, then extract the number!
like if A=[('Hi','NNG')] and B=[('Hi','NNG',0.5)]
Extract 0.5
Lastly, I want to multiply all numbers from extraction.
There are more than 1,000 tuples in each of A and B, so a for loop will probably be necessary for this process.
How can I do this in Python?
Try something like this:
A = [('Hi', 'NNG'), ('Good', 'VV')]
B = [('Happy', 'VA', 1.0), ('Hi', 'NNG', 0.5)]
print(', '.join(repr(j[2]) for i in A for j in B if i[0] == j[0] and i[1] == j[1]))
# 0.5
One way is to use a set and (optionally) a dictionary. The benefit of this method is you also keep the key data to know where your values originated.
A = [('Hi','NNG'),('Good','VV')]
B = [('Happy','VA',1.0),('Hi','NNG',0.5)]
A_set = set(A)
res = {(i[0], i[1]): i[2] for i in B if (i[0], i[1]) in A_set}
res = list(res.values())
# [0.5]
To multiply all results in the list, see How can I multiply all items in a list together with Python?
Explanation
Use a dictionary comprehension with for i in B. This iterates through each element of B, and each element i is a tuple.
For example, when iterating the first element, you will find i[0] = 'Happy', i[1] = 'VA', i[2] = 1.0.
Since we loop through the whole list, we construct a dictionary of results with tuple keys from the first 2 elements.
Additionally, we add the criterion (i[0], i[1]) in A_set to filter as per required logic.
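For the final multiplication step, a minimal sketch could look like the following. The third entry of B is hypothetical and only added so that more than one value gets multiplied; math.prod requires Python 3.8 or later, and the reduce variant is shown as a fallback for older versions.

```python
import math
from functools import reduce
from operator import mul

A = [('Hi', 'NNG'), ('Good', 'VV')]
# The third entry of B is hypothetical, added so there is more than one match.
B = [('Happy', 'VA', 1.0), ('Hi', 'NNG', 0.5), ('Good', 'VV', 0.25)]

A_set = set(A)
# Keep the number from every B tuple whose (word, class) pair appears in A.
values = [b[2] for b in B if b[:2] in A_set]

total = math.prod(values)               # Python 3.8+
total_reduce = reduce(mul, values, 1)   # works on older versions too

print(values)  # [0.5, 0.25]
print(total)   # 0.125
```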
Python is so high level that it feels like English. So, the following working solution can be written very easily with minimum experience:
A = [('Hi','NNG'),('Good','VV')]
B = [('Happy','VA',1.0),('Hi','NNG',0.5)]
tot = 1
for ia in A:
    for ib in B:
        if ia == ib[:2]:
            tot *= ib[2]
            break # remove this line if multiple successful checks are possible
print(tot) # -> 0.5
zip() is your friend:
for tupA, tupB in zip(A, B):
    if tupA[:2] == tupB[:2]:
        print(tupB[2])
To use a fancy Pythonic list comprehension:
results = [tupB[2] for tupA, tupB in zip(A, B) if tupA[:2] == tupB[:2]]
But... why do I have a sneaky feeling this isn't what you want to do?

loop function 2 variables [duplicate]

How can I include two variables in the same for loop?
t1 = [a list of integers, strings and lists]
t2 = [another list of integers, strings and lists]
def f(t): #a function that will read lists "t1" and "t2" and return all elements that are identical
for i in range(len(t1)) and for j in range(len(t2)):
...
If you want the effect of a nested for loop, use:
import itertools
for i, j in itertools.product(range(x), range(y)):
    pass  # Stuff...
If you just want to loop simultaneously, use:
for i, j in zip(range(x), range(y)):
    pass  # Stuff...
Note that if x and y are not equal, zip will stop at the end of the shorter range. As @abarnert pointed out, if you don't want to truncate to the shorter one, you could use itertools.zip_longest.
UPDATE
Based on the request for "a function that will read lists "t1" and "t2" and return all elements that are identical", I don't think the OP wants zip or product. I think they want a set:
def equal_elements(t1, t2):
    return list(set(t1).intersection(set(t2)))
    # You could also do
    # return list(set(t1) & set(t2))
The intersection method of a set will return all the elements common to it and another set (Note that if your lists contains other lists, you might want to convert the inner lists to tuples first so that they are hashable; otherwise the call to set will fail.). The list function then turns the set back into a list.
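The tuple conversion mentioned above could be sketched as follows. The helper to_hashable is a hypothetical name introduced here, and the sketch assumes the only unhashable items are (possibly nested) lists.

```python
def to_hashable(item):
    # Recursively turn lists into tuples so they can be stored in a set.
    if isinstance(item, list):
        return tuple(to_hashable(sub) for sub in item)
    return item

def equal_elements(t1, t2):
    # Intersect the two lists after making every element hashable.
    return list(set(map(to_hashable, t1)) & set(map(to_hashable, t2)))

t1 = [1, 'a', [2, 3], [4, [5]]]
t2 = ['a', [2, 3], 7, [4, [5]]]
common = equal_elements(t1, t2)

# Sets are unordered, so sort by repr to get a stable printout.
print(sorted(common, key=repr))  # ['a', (2, 3), (4, (5,))]
```

Note that the inner lists come back as tuples; convert them again if you need lists in the result.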
UPDATE 2
OR, the OP might want elements that are identical in the same position in the lists. In this case, zip would be most appropriate, and the fact that it truncates to the shortest list is what you would want (since it is impossible for there to be the same element at index 9 when one of the lists is only 5 elements long). If that is what you want, go with this:
def equal_elements(t1, t2):
    return [x for x, y in zip(t1, t2) if x == y]
This will return a list containing only the elements that are the same and in the same position in the lists.
There are two possible questions here: how can you iterate over those variables simultaneously, or how can you loop over their combination.
Fortunately, there are simple answers to both. In the first case, you want to use zip.
x = [1, 2, 3]
y = [4, 5, 6]
for i, j in zip(x, y):
    print(str(i) + " / " + str(j))
will output
1 / 4
2 / 5
3 / 6
Remember that you can put any iterable in zip, so you could just as easily write your example like:
for i, j in zip(range(x), range(y)):
    pass  # do work here.
Actually, just realised that won't work. It would only iterate until the smaller range ran out. In which case, it sounds like you want to iterate over the combination of loops.
In the other case, you just want a nested loop.
for i in x:
    for j in y:
        print(str(i) + " / " + str(j))
gives you
1 / 4
1 / 5
1 / 6
2 / 4
2 / 5
...
You can also do this as a list comprehension.
[str(i) + " / " + str(j) for i in x for j in y]
Any reason you can't use a nested for loop?
for i in range(x):
    for j in range(y):
        pass  # code that uses i and j
for (i, j) in [(i, j) for i in range(x) for j in range(y)]:
should do it.
If you really just have lock-step iteration over a range, you can do it one of several ways:
for i in range(x):
    j = i
    …
# or
for i, j in enumerate(range(x)):
    …
# or
for i, j in ((i,i) for i in range(x)):
    …
All of the above are equivalent to for i, j in zip(range(x), range(y)) if x <= y.
If you want a nested loop and you only have two iterables, just use a nested loop:
for i in range(x):
    for j in range(y):
        …
If you have more than two iterables, use itertools.product.
Finally, if you want lock-step iteration up to x and then to continue to y, you have to decide what the rest of the x values should be.
for i, j in itertools.zip_longest(range(x), range(y), fillvalue=float('nan')):
    …
# or
for i in range(min(x,y)):
    j = i
    …
for i in range(min(x,y), max(x,y)):
    j = float('nan')
    …
"Python 3."
Add two variables in a for loop using zip and range, returning a list.
Note: this will only run until the shorter range ends.
>>> a = [g + h for g, h in zip(range(10), range(10))]
>>> a
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
For your use case, it may be easier to utilize a while loop.
t1 = [137, 42]
t2 = ["Hello", "world"]
i = 0
j = 0
while i < len(t1) and j < len(t2):
    print(t1[i], t2[j])
    i += 1
    j += 1
# 137 Hello
# 42 world
As a caveat, this approach will truncate to the length of your shortest list.
I think you are looking for nested loops.
Example (based on your edit):
t1=[1,2,'Hello',(1,2),999,1.23]
t2=[1,'Hello',(1,2),999]
t3=[]
for it1, e1 in enumerate(t1):
    for it2, e2 in enumerate(t2):
        if e1==e2:
            t3.append((it1,it2,e1))
# t3=[(0, 0, 1), (2, 1, 'Hello'), (3, 2, (1, 2)), (4, 3, 999)]
Which can be reduced to a single comprehension:
[(it1,it2,e1) for it1, e1 in enumerate(t1) for it2, e2 in enumerate(t2) if e1==e2]
But to find the common elements, you can just do:
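print(set(t1) & set(t2))
# {(1, 2), 1, 'Hello', 999} (set order may vary)
If your list contains non-hashable objects (like other lists, dicts) use a frozen set:
from collections.abc import Iterable  # collections.Iterable was removed in Python 3.10
# strings are iterable too, so they are excluded here and kept as they are
s1 = set(frozenset(e1) if isinstance(e1, Iterable) and not isinstance(e1, str) else e1 for e1 in t1)
s2 = set(frozenset(e2) if isinstance(e2, Iterable) and not isinstance(e2, str) else e2 for e2 in t2)
print(s1 & s2)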

removing duplicate tuples in python

I have a list of 50 numbers, [0,1,2,...49] and I would like to create a list of tuples without duplicates, where i define (a,b) to be a duplicate of (b,a). Similarly, I do not want tuples of the form (a,a).
I have this:
pairs = set([])
mylist = range(0,50)
for i in mylist:
    for j in mylist:
        pairs.update([(i,j)])
set((a,b) if a<=b else (b,a) for a,b in pairs)
print len(pairs)
>>> 2500
I get 2500 whereas I expect to get, i believe, 1225 (n(n-1)/2).
What is wrong?
You want all combinations. Python provides a module, itertools, with all sorts of combinatorial utilities like this. Where you can, I would stick with using itertools; it is almost certainly faster and more memory efficient than anything you would cook up yourself. It is also battle-tested. You should not reinvent the wheel.
>>> import itertools
>>> combs = list(itertools.combinations(range(50),2))
>>> len(combs)
1225
>>>
However, as others have noted, in the case where you have a sequence (i.e. something indexable) such as a list, and you want N choose k, where k=2 the above could simply be implemented by a nested for-loop over the indices, taking care to generate your indices intelligently:
>>> numbers = list(range(50))
>>> result = []
>>> for i in range(len(numbers)):
...     for j in range(i + 1, len(numbers)):
...         result.append((numbers[i], numbers[j]))
...
>>> len(result)
1225
However, itertools.combinations takes any iterable, and also takes a second argument, r, which deals with cases where k can be something like 7 (and you don't want to write a staircase of nested loops).
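As a quick illustration of the r argument, here is a sketch that builds the 2-element pairs from the question and, for comparison, 3-element combinations of a short string. The string 'abcde' is just sample data, and math.comb (Python 3.8+) is used only to cross-check the counts.

```python
import itertools
import math

# Pairs from 50 numbers, as in the question: n(n-1)/2 of them.
pairs = list(itertools.combinations(range(50), 2))
print(len(pairs), math.comb(50, 2))  # 1225 1225

# The same call with r=3; no third nested loop is needed.
triples = list(itertools.combinations('abcde', 3))
print(len(triples))   # 10
print(triples[:3])    # [('a', 'b', 'c'), ('a', 'b', 'd'), ('a', 'b', 'e')]
```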
Your approach essentially takes the cartesian product, and then filters. This is inefficient, but if you wanted to do that, the best way is to use frozensets:
>>> combinations = set()
>>> for i in numbers:
...     for j in numbers:
...         if i != j:
...             combinations.add(frozenset([i,j]))
...
>>> len(combinations)
1225
And one more pass to make things tuples:
>>> combinations = [tuple(fz) for fz in combinations]
Try this:
pairs = set()
mylist = range(0, 50)
for i in mylist:
    for j in mylist:
        if i < j:
            pairs.add((i, j))  # a set has add(), not append()
print(len(pairs))
The problem in your code snippet is that you filter out unwanted values but you don't assign the result back to pairs, so the length stays the same. Also, this formula yields the wrong result because it considers (20,20) as valid, for instance.
But you should just create the proper list at once:
pairs = set()
for i in range(0,50):
    for j in range(i+1,50):
        pairs.add((i,j))
print (len(pairs))
result:
1225
With that method you don't even need a set since it's guaranteed that you don't have duplicates in the first place:
pairs = []
for i in range(0,50):
    for j in range(i+1,50):
        pairs.append((i,j))
or using list comprehension:
pairs = [(i,j) for i in range(0,50) for j in range(i+1,50)]

More elegant way to implement regexp-like quantifiers

I'm writing a simple string parser which allows regexp-like quantifiers. An input string might look like this:
s = "x y{1,2} z"
My parser function translates this string to a list of tuples:
list_of_tuples = [("x", 1, 1), ("y", 1, 2), ("z", 1, 1)]
Now, the tricky bit is that I need a list of all valid combinations that are specified by the quantification. The combinations all have to have the same number of elements, and the value None is used for padding. For the given example, the expected output is
[["x", "y", None, "z"], ["x", "y", "y", "z"]]
I do have a working solution, but I'm not really happy with it: it uses two nested for loops, and I find the code somewhat obscure, so there's something generally awkward and clumsy about it:
import itertools
def permute_input(lot):
    outer = []
    # is there something that replaces these nested loops?
    for val, start, end in lot:
        inner = []
        # For each tuple, create a list of constant length
        # Each element contains a different number of
        # repetitions of the value of the tuple, padded
        # by the value None if needed.
        for i in range(start, end + 1):
            x = [val] * i + [None] * (end - i)
            inner.append(x)
        outer.append(inner)
    # Outer is now a list of lists.
    final = []
    # use itertools.product to combine the elements in the
    # list of lists:
    for combination in itertools.product(*outer):
        # flatten the elements in the current combination,
        # and append them to the final list:
        final.append([x for x
                      in itertools.chain.from_iterable(combination)])
    return final
print(permute_input([("x", 1, 1), ("y", 1, 2), ("z", 1, 1)]))
[['x', 'y', None, 'z'], ['x', 'y', 'y', 'z']]
I suspect that there's a much more elegant way of doing this, possibly hidden somewhere in the itertools module?
One alternative way to approach the problem is to use pyparsing and this example regex parser that would expand a regular expression to possible matching strings. For your x y{1,2} z sample string it would generate two possible strings expanding the quantifier:
$ python -i regex_invert.py
>>> s = "x y{1,2} z"
>>> for item in invert(s):
... print(item)
...
x y z
x yy z
The repetition itself supports both an open-ended range and a closed range and is defined as:
repetition = (
    (lbrace + Word(nums).setResultsName("count") + rbrace) |
    (lbrace + Word(nums).setResultsName("minCount") + "," + Word(nums).setResultsName("maxCount") + rbrace) |
    oneOf(list("*+?"))
)
To get to the desired result, we should modify the way the results are yielded from the recurseList generator and return lists instead of strings:
for s in elist[0].makeGenerator()():
    for s2 in recurseList(elist[1:]):
        yield [s] + [s2] # instead of yield s + s2
Then, we need to only flatten the result:
$ ipython3 -i regex_invert.py
In [1]: import collections.abc
In [2]: def flatten(l):
   ...:     for el in l:
   ...:         if isinstance(el, collections.abc.Iterable) and not isinstance(el, (str, bytes)):
   ...:             yield from flatten(el)
   ...:         else:
   ...:             yield el
   ...:
In [3]: s = "x y{1,2} z"
In [4]: for option in invert(s):
   ...:     print(list(flatten(option)))
   ...:
['x', ' ', 'y', None, ' ', 'z']
['x', ' ', 'y', 'y', ' ', 'z']
Then, if needed, you can filter the whitespace characters:
In [5]: for option in invert(s):
   ...:     print([item for item in flatten(option) if item != ' '])
   ...:
['x', 'y', None, 'z']
['x', 'y', 'y', 'z']
Recursive solution (simple, good for up to few thousand tuples):
def permutations(lot):
    if not lot:
        yield []
    else:
        item, start, end = lot[0]
        for prefix_length in range(start, end+1):
            for perm in permutations(lot[1:]):
                yield [item]*prefix_length + [None] * (end - prefix_length) + perm
It is limited by the recursion depth (~1000). If that is not enough, there is a simple optimization for start == end cases. Depending on the expected size of list_of_tuples, it might be enough.
Test:
>>> list(permutations(list_of_tuples)) # list() because it's an iterator
[['x', 'y', None, 'z'], ['x', 'y', 'y', 'z']]
Without recursion (universal but less elegant):
def permutations(lot):
    source = []
    cnum = 1 # number of possible combinations
    for item, start, end in lot: # create full list without Nones
        source += [item] * end # each item occupies `end` slots
        cnum *= (end-start+1)
    for i in range(cnum):
        bitmask = [True] * len(source)
        state = i
        pos = 0
        for _, start, end in lot:
            state, m = divmod(state, end-start+1) # m - number of Nones to insert
            pos += end # move past this item's slots
            bitmask[pos-m:pos] = [None] * m
        yield [bitmask[k] and c for k, c in enumerate(source)]
The idea behind this solution: we are essentially looking at the full string (xyyz) through a glass that masks a certain number of positions with None. We can count the number of possible combinations by calculating the product of all (end-start+1). Then, we can just number all iterations (a simple range loop) and reconstruct this mask from the iteration number. Here we reconstruct the mask by iteratively using divmod on the state number and using the remainder as the number of Nones at the symbol position.
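The divmod step can be looked at in isolation: it reads the iteration number as a mixed-radix number with one digit per tuple. The sketch below uses a hypothetical helper decode_state and made-up radices; it only demonstrates the decoding, not the full mask construction.

```python
def decode_state(state, radices):
    # Split an iteration number into one digit per position (mixed radix).
    digits = []
    for radix in radices:
        state, digit = divmod(state, radix)
        digits.append(digit)
    return digits

# For "x y{1,2} z" the radices (end - start + 1) are 1, 2 and 1,
# so there are 1 * 2 * 1 = 2 iteration numbers to decode.
for state in range(2):
    print(state, decode_state(state, [1, 2, 1]))
# 0 [0, 0, 0]
# 1 [0, 1, 0]
```

Each digit is the number of None values to place at the end of the corresponding item's slots.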
The part generating the different lists based on the tuple can be written using list comprehension:
outer = []
for val, start, end in lot:
    # For each tuple, create a list of constant length
    # Each element contains a different number of
    # repetitions of the value of the tuple, padded
    # by the value None if needed.
    outer.append([[val] * i + [None] * (end - i) for i in range(start, end + 1)])
(The whole thing could again be written with a list comprehension, but it makes the code harder to read IMHO.)
On the other hand, the list comprehension in [x for x in itertools.chain.from_iterable(combination)] could be written in a more concise way. Indeed, the whole point is to build an actual list out of an iterable. This could be done with list(itertools.chain.from_iterable(combination)). An alternative would be to use the sum builtin. I am not sure which is better.
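A small sketch comparing the two flattening options on one sample combination (the tuple below is illustrative data in the shape that itertools.product yields here). Both give the same list; the practical difference is that sum creates a new intermediate list at every step, so its cost grows quadratically with the number of sublists, while chain.from_iterable makes a single pass.

```python
import itertools

# One combination as produced by itertools.product(*outer).
combination = (['x'], ['y', None], ['z'])

flat_chain = list(itertools.chain.from_iterable(combination))
flat_sum = sum(combination, [])

print(flat_chain)              # ['x', 'y', None, 'z']
print(flat_chain == flat_sum)  # True
```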
Finally, the final.append part could be written with a list comprehension.
# use itertools.product to combine the elements in the list of lists:
# flatten the elements in the current combination,
return [sum(combination, []) for combination in itertools.product(*outer)]
The final code is just based on the code you've written slightly re-organised:
import itertools
def permute_input(lot):
    outer = []
    for val, start, end in lot:
        # For each tuple, create a list of constant length
        # Each element contains a different number of
        # repetitions of the value of the tuple, padded
        # by the value None if needed.
        outer.append([[val] * i + [None] * (end - i) for i in range(start, end + 1)])
    # use itertools.product to combine the elements in the list of lists:
    # flatten the elements in the current combination,
    return [sum(combination, []) for combination in itertools.product(*outer)]
