Insert an item into a Python list without using insert()

How do you add an item at a specific position in a list? When you have an empty list and want to put 'z' at index 3, insert() only places it at the end of the list:
l.insert(3,'z')
l
['z']
I want the output to be
[None, None, None, 'z']
or
['','','','z']

Try this method using a list comprehension -
n = 5
s = 'z'
out = [None if i!=n-1 else s for i in range(n)]
print(out)
[None, None, None, None, 'z']
If you want to insert the string somewhere in the middle, then a more general way is to define m and n separately where n is length of the list and m is the position -
n = 5
m = 3
s = 'z'
out = [None if i!=m-1 else s for i in range(n)]
print(out)
[None, None, 'z', None, None]

Assuming you want to have it in the Nth index:
l = l[:N] + ['z'] + l[N:]
If you start with an empty list and want it padded with None before the item, maybe this will help you (N is the number of None items you want):
l = [None] * N
l = l[:N] + ['z'] + l[N:]
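The padding idea from the answers above can be wrapped in a small helper. This is only a sketch: the name set_at and its fill parameter are made up for illustration, and it assigns at the index (overwriting any existing value) rather than shifting elements the way insert() does.

```python
def set_at(lst, index, value, fill=None):
    """Return a copy of lst with value stored at index.

    Hypothetical helper: pads with `fill` when lst is too short,
    and overwrites the existing element otherwise.
    """
    # [fill] * n is an empty list when n <= 0, so no padding is added
    # if the list is already long enough.
    padded = lst + [fill] * (index + 1 - len(lst))
    padded[index] = value
    return padded

print(set_at([], 3, 'z'))      # [None, None, None, 'z']
print(set_at([], 3, 'z', ''))  # ['', '', '', 'z']
```

Returning a new list keeps the original untouched; mutating in place with lst.extend(...) would work just as well if that is preferred.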

Related

Modifying a string under a specific condition

In a Python program, I am trying to modify a string under a specific condition:
X = ('4c0')
Sig = ['a', 'b', 'c', 'e']
Sig is a list. Additionally, I have a tuple:
T = (4,'d',5)
If the second element (T[1]) is not in Sig, I must create another string, starting from X:
as T[1] ('d') is not in Sig, T[2] must replace T[0] in X ('5' replacing '4');
the last element in X must be added by 1 ('1' replacing '0').
In this case, the desired result should be:
Y = ('5c1')
I wrote this code, but it does not add any string to Y:
Y = []
for i in TT: # TT has the tuple T
    i = list(i)
    if i[1] not in Sig:
        for j in TT:
            if type(j[2]) == str:
                if i[1] == j[1]:
                    Y.append(j[2][0]+i[1]+str(int(j[2][2]+1)))
Any ideas how I could solve this problem?
You need to be clearer about what the conditions are. If this is a one-time requirement, you can use:
i = 1
if T[i] not in Sig:
    Y = list(X) # convert the string to a list so it can be modified
    Y[i-1] = T[i+1] # operation 1: replace the element one position before the tuple index being checked
    Y[-1] = int(Y[-1])+1 # operation 2: add 1 to the last element
    print(''.join(map(str, Y)))
5c1
There are a lot of missing conditions in your problem statement (e.g. what to do if 'd' was in Sig, How to handle values past 9, what if X[0] is not the same as T[0], what if X[2] is equal to T[2])
For the example given a simple string format should suffice:
X = ('4c0')
Sig = ['a', 'b', 'c', 'e']
T = (4,'d',5)
if T[1] not in Sig:
    Y = f"{T[2]}{X[1]}{int(X[2])+1}"
    print(Y) # 5c1
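If the same rule has to be applied repeatedly, it can be wrapped in a function. The sketch below fills the gaps mentioned above with explicit assumptions that are not part of the question: X is returned unchanged when T[1] is in Sig, the first character of X is always replaced by T[2], and the last character of X is a digit that is simply incremented (so '9' becomes '10'). The name update_code is hypothetical.

```python
def update_code(x, t, sig):
    """Hypothetical helper applying the two replacements from the question."""
    if t[1] in sig:
        # Assumption: nothing changes when the symbol is already known.
        return x
    # Replace the first character with t[2], keep the middle,
    # and add 1 to the trailing digit.
    return f"{t[2]}{x[1:-1]}{int(x[-1]) + 1}"

Sig = ['a', 'b', 'c', 'e']
print(update_code('4c0', (4, 'd', 5), Sig))  # 5c1
print(update_code('4c0', (4, 'c', 5), Sig))  # 4c0
```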

How to make a list comprehension in python with unequal sublists

I have a list of unequal lists.
I would like to generate a new list with list comprehension from the sublists.
s = [['a','b','c','d'],['e','f','g'],['h','i'],['j','k','l','m']]
I am trying the following code, but it keeps raising an IndexError:
new_s = []
for i in range(len(s)):
    new_s.append((i,[t[i] for t in s if t[i]]))
The expected output would be:
new_s = [(0,['a','e','h','j']),(1,['b','f','i','k']),(2,['c','g','l']),(3,['d','m'])]
Any ideas how to get this to work?
You can use itertools.zip_longest to iterate over each sublist elementwise, while using None as the fill value for the shorter sublists.
Then use filter to remove the None values that were used from padding.
So all together in a list comprehension:
>>> from itertools import zip_longest
>>> [(i, list(filter(None, j))) for i, j in enumerate(zip_longest(*s))]
[(0, ['a', 'e', 'h', 'j']), (1, ['b', 'f', 'i', 'k']), (2, ['c', 'g', 'l']), (3, ['d', 'm'])]
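One caveat with filter(None, ...): it drops every falsy value, so real data such as 0 or '' would disappear along with the padding. A sketch that avoids this uses a private sentinel as the fill value; the names _MISSING and columns are made up for this example.

```python
from itertools import zip_longest

_MISSING = object()  # hypothetical sentinel that cannot collide with real data

def columns(rows):
    """Group the i-th elements of each row, skipping rows that are too short."""
    return [
        (i, [value for value in col if value is not _MISSING])
        for i, col in enumerate(zip_longest(*rows, fillvalue=_MISSING))
    ]

s = [['a', 'b', 'c', 'd'], ['e', 'f', 'g'], ['h', 'i'], ['j', 'k', 'l', 'm']]
print(columns(s))
print(columns([[0, 1], [2]]))  # the 0 is kept: [(0, [0, 2]), (1, [1])]
```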
One without itertools, but modifying the original list:
def until_depleted():
    while any(sl for sl in s):
        yield 1
list(enumerate(list(s_l.pop(0) for s_l in s if s_l) for _ in until_depleted()))
One without modifying the original, but with a counter:
idx = 0
max_idx = max(len(_) for _ in s)
def until_maxidx():
    global idx
    while idx < max_idx:
        yield 1
        idx += 1
list(enumerate(list(s_l[idx] for s_l in s if idx < len(s_l)) for _ in until_maxidx()))
A more explicit one, without the inner comprehension or generator calls:
ret = []
idx = 0
max_idx = max(len(_) for _ in s)
while idx < max_idx:
    ret.append(list(s_l[idx] for s_l in s if idx < len(s_l)))
    idx += 1
print(list(enumerate(ret)))
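This is without itertools, but also without a comprehension, so I don't know if it counts as a solution.
s = [['a','b','c','d'],['e','f','g'],['h','i'],['j','k','l','m']]
new_s = []
# iterate up to the longest sublist, not the number of sublists
for i in range(max(len(item) for item in s)):
    tmp = []
    for item in s:
        # slicing never raises IndexError; it yields [] past the end
        tmp.extend(item[i:i+1])
    new_s.append((i, tmp))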

More elegant way to implement regexp-like quantifiers

I'm writing a simple string parser which allows regexp-like quantifiers. An input string might look like this:
s = "x y{1,2} z"
My parser function translates this string to a list of tuples:
list_of_tuples = [("x", 1, 1), ("y", 1, 2), ("z", 1, 1)]
Now, the tricky bit is that I need a list of all valid combinations that are specified by the quantification. The combinations all have to have the same number of elements, and the value None is used for padding. For the given example, the expected output is
[["x", "y", None, "z"], ["x", "y", "y", "z"]]
I do have a working solution, but I'm not really happy with it: it uses two nested for loops, and I find the code somewhat obscure, so there's something generally awkward and clumsy about it:
import itertools
def permute_input(lot):
    outer = []
    # is there something that replaces these nested loops?
    for val, start, end in lot:
        inner = []
        # For each tuple, create a list of constant length
        # Each element contains a different number of
        # repetitions of the value of the tuple, padded
        # by the value None if needed.
        for i in range(start, end + 1):
            x = [val] * i + [None] * (end - i)
            inner.append(x)
        outer.append(inner)
    # Outer is now a list of lists.
    final = []
    # use itertools.product to combine the elements in the
    # list of lists:
    for combination in itertools.product(*outer):
        # flatten the elements in the current combination,
        # and append them to the final list:
        final.append([x for x
                      in itertools.chain.from_iterable(combination)])
    return final
print(permute_input([("x", 1, 1), ("y", 1, 2), ("z", 1, 1)]))
[['x', 'y', None, 'z'], ['x', 'y', 'y', 'z']]
I suspect that there's a much more elegant way of doing this, possibly hidden somewhere in the itertools module?
One alternative way to approach the problem is to use pyparsing and this example regex parser that would expand a regular expression to possible matching strings. For your x y{1,2} z sample string it would generate two possible strings expanding the quantifier:
$ python -i regex_invert.py
>>> s = "x y{1,2} z"
>>> for item in invert(s):
... print(item)
...
x y z
x yy z
The repetition itself supports both an open-ended range and a closed range and is defined as:
repetition = (
(lbrace + Word(nums).setResultsName("count") + rbrace) |
(lbrace + Word(nums).setResultsName("minCount") + "," + Word(nums).setResultsName("maxCount") + rbrace) |
oneOf(list("*+?"))
)
To get to the desired result, we should modify the way the results are yielded from the recurseList generator and return lists instead of strings:
for s in elist[0].makeGenerator()():
    for s2 in recurseList(elist[1:]):
        yield [s] + [s2] # instead of yield s + s2
Then, we need to only flatten the result:
$ ipython3 -i regex_invert.py
In [1]: import collections.abc
In [2]: def flatten(l):
   ...:     for el in l:
   ...:         if isinstance(el, collections.abc.Iterable) and not isinstance(el, (str, bytes)):
   ...:             yield from flatten(el)
   ...:         else:
   ...:             yield el
   ...:
In [3]: s = "x y{1,2} z"
In [4]: for option in invert(s):
   ...:     print(list(flatten(option)))
   ...:
['x', ' ', 'y', None, ' ', 'z']
['x', ' ', 'y', 'y', ' ', 'z']
Then, if needed, you can filter the whitespace characters:
In [5]: for option in invert(s):
   ...:     print([item for item in flatten(option) if item != ' '])
   ...:
['x', 'y', None, 'z']
['x', 'y', 'y', 'z']
Recursive solution (simple, good for up to about a thousand tuples):
def permutations(lot):
    if not lot:
        yield []
    else:
        item, start, end = lot[0]
        for prefix_length in range(start, end+1):
            for perm in permutations(lot[1:]):
                yield [item]*prefix_length + [None] * (end - prefix_length) + perm
It is limited by the recursion depth (~1000 by default). If that is not enough, there is a simple optimization for start == end cases. Depending on the expected size of list_of_tuples, it might be enough.
Test:
>>> list(permutations(list_of_tuples)) # list() because it's an iterator
[['x', 'y', None, 'z'], ['x', 'y', 'y', 'z']]
Without recursion (universal but less elegant):
def permutations(lot):
    source = []
    cnum = 1 # number of possible combinations
    for item, start, end in lot: # create full list without Nones
        source += [item] * end # the longest variant repeats the item `end` times
        cnum *= (end-start+1)
    for i in range(cnum):
        bitmask = [True] * len(source)
        state = i
        pos = 0
        for _, start, end in lot:
            state, m = divmod(state, end-start+1) # m - number of Nones to insert
            pos += end # advance to the end of this item's block
            bitmask[pos-m:pos] = [None] * m
        yield [bitmask[i] and c for i, c in enumerate(source)]
The idea behind this solution: we are effectively looking at the full string (xyyz) through a mask that replaces a certain number of symbols with None. The number of possible combinations is the product of all (end-start+1). We can then simply number all iterations (a plain range loop) and reconstruct the mask from the iteration number, by repeatedly applying divmod to the state number and using the remainder as the number of Nones at that symbol's position.
The part generating the different lists based on the tuple can be written using list comprehension:
outer = []
for val, start, end in lot:
    # For each tuple, create a list of constant length
    # Each element contains a different number of
    # repetitions of the value of the tuple, padded
    # by the value None if needed.
    outer.append([[val] * i + [None] * (end - i) for i in range(start, end + 1)])
(The whole thing could again be written with a list comprehension, but it makes the code harder to read IMHO.)
On the other hand, the list comprehension in [x for x in itertools.chain.from_iterable(combination)] could be written in a more concise way. Indeed, the whole point is to build an actual list out of an iterable. This could be done with list(itertools.chain.from_iterable(combination)). An alternative would be to use the sum builtin. I am not sure which is better.
Finally, the final.append part could be written with a list comprehension.
# use itertools.product to combine the elements in the list of lists:
# flatten the elements in the current combination,
return [sum(combination, []) for combination in itertools.product(*outer)]
The final code is just based on the code you've written slightly re-organised:
import itertools
def permute_input(lot):
    outer = []
    for val, start, end in lot:
        # For each tuple, create a list of constant length
        # Each element contains a different number of
        # repetitions of the value of the tuple, padded
        # by the value None if needed.
        outer.append([[val] * i + [None] * (end - i) for i in range(start, end + 1)])
    # use itertools.product to combine the elements in the list of lists:
    # flatten the elements in the current combination,
    return [sum(combination, []) for combination in itertools.product(*outer)]
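For comparison, here is a sketch of the fully comprehension-based variant mentioned above. It keeps the question's function name permute_input and swaps sum(combination, []) for itertools.chain.from_iterable, which avoids building intermediate lists repeatedly when a combination has many parts.

```python
import itertools

def permute_input(lot):
    # One list of padded alternatives per (value, start, end) tuple.
    outer = [
        [[val] * i + [None] * (end - i) for i in range(start, end + 1)]
        for val, start, end in lot
    ]
    # Combine one alternative per tuple and flatten each combination.
    return [
        list(itertools.chain.from_iterable(combination))
        for combination in itertools.product(*outer)
    ]

print(permute_input([("x", 1, 1), ("y", 1, 2), ("z", 1, 1)]))
# [['x', 'y', None, 'z'], ['x', 'y', 'y', 'z']]
```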

python list with fixed number of elements

I want to create a fixed-size list of 6 tuples in Python. Note that in the code below I am always re-initializing the values in the outer for loop in order to reset the previously created list, which was already added to globalList. Here is a snippet:
for i,j in dictCaseId.iteritems():
    listItems=[None]*6
    for x,y in j:
        if x=='cond':
            tuppo = y
            listItems.insert(0,tuppo)
        if x=='act':
            tuppo = y
            listItems.insert(1,tuppo)
        if x=='correc':
            tuppo = y
            listItems.insert(2,tuppo)
        ...
        ...
    globalList.append(listItems)
But when I run the above (only a snippet is shown), the list grows. The values are added, but the list ends up with more than 6 elements. I don't want the list size to increase; it should remain a list of 6 tuples.
For example:
Initially: [None,None,None,None,None,None]
What I desire: [Mark,None,Jon,None,None,None]
What I get: [Mark,None,Jon,None,None,None,None,None]
Instead of inserting you should assign those values. list.insert inserts a new element at the index passed to it, so length of list increases by 1 after each insert operation.
On the other hand assignment modifies the value at a particular index, so length remains constant.
for i,j in dictCaseId.iteritems():
    listItems=[None]*6
    for x,y in j:
        if x=='cond':
            tuppo = y
            listItems[0]=tuppo
        if x=='act':
            tuppo = y
            listItems[1]=tuppo
        if x=='correc':
            tuppo = y
            listItems[2]=tuppo
Example:
>>> lis = [None]*6
>>> lis[1] = "foo" #change the 2nd element
>>> lis[4] = "bar" #change the fifth element
>>> lis
[None, 'foo', None, None, 'bar', None]
Update:
>>> lis = [[] for _ in xrange(6)] # don't use [[]]*6
>>> lis[1].append("foo")
>>> lis[4].append("bar")
>>> lis[1].append("py")
>>> lis
[[], ['foo', 'py'], [], [], ['bar'], []]
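The "don't use [[]]*6" comment deserves a short illustration. Multiplying a list copies references, so every slot points at the same inner list. A minimal sketch (written for Python 3, hence range rather than xrange; the variable names are only for this example):

```python
# Three references to the SAME inner list.
shared = [[]] * 3
shared[0].append("foo")
print(shared)       # [['foo'], ['foo'], ['foo']]

# Three separate inner lists.
independent = [[] for _ in range(3)]
independent[0].append("foo")
print(independent)  # [['foo'], [], []]
```

With [None]*6 this problem does not arise, because assigning to an index rebinds that slot instead of mutating a shared object.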
Ashwini's answer fixes your main issue; however, I will make a suggestion here.
Because (from what you have shown us) you are simply assigning an element to a specific index based on a condition, it would be better to do something like this:
for i, j in dictCaseId.iteritems():
    listItems = [None] * 6
    lst = ['cond', 'act', 'correc', ...]
    for x, y in j:
        idx = lst.index(x)
        listItems[idx] = y
    globalList.append(listItems)
Or with a list of lists:
for i, j in dictCaseId.iteritems():
    listItems = [[] for _ in xrange(6)]
    lst = ['cond', 'act', 'correc', ...]
    for x, y in j:
        idx = lst.index(x)
        listItems[idx].append(y)
    globalList.append(listItems)
This allows each of the conditions to be dealt with in one go, and it condenses your code significantly.
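A variation on the same idea uses a dict for the key-to-index lookup, which avoids the linear search of lst.index and makes unknown keys easy to skip. This is a sketch only: SLOT_INDEX and build_row are hypothetical names, and the mapping contains just the three keys shown in the question, since the remaining ones are elided there.

```python
# Only the keys visible in the question; the others are unknown here.
SLOT_INDEX = {'cond': 0, 'act': 1, 'correc': 2}

def build_row(pairs, size=6):
    """Build a fixed-size row from (key, value) pairs by assignment."""
    row = [None] * size
    for key, value in pairs:
        idx = SLOT_INDEX.get(key)
        if idx is not None:  # assumption: silently ignore unknown keys
            row[idx] = value
    return row

print(build_row([('cond', 'Mark'), ('correc', 'Jon')]))
# ['Mark', None, 'Jon', None, None, None]
```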
Instead of listItems.insert(0, tuppo), do listItems[0] = tuppo etc.
That's because you insert the names instead of changing the values.
Do it like this instead:
for i,j in dictCaseId.iteritems():
    listItems=[None]*6
    for x,y in j:
        if x=='cond':
            tuppo = y
            listItems[0] = tuppo
        if x=='act':
            tuppo = y
            listItems[1] = tuppo
        if x=='correc':
            tuppo = y
            listItems[2] = tuppo
        ...
        ...
    globalList.append(listItems)

Given a list of slices, how do I split a sequence by them?

Given a list of slices, how do I separate a sequence based on them?
I have long amino-acid strings that I would like to split based on start-stop values in a list. An example is probably the most clear way of explaining it:
str = "MSEPAGDVRQNPCGSKAC"
split_points = [[1,3], [7,10], [12,13]]
output >> ['M', '(SEP)', 'AGD', '(VRQN)', 'P', '(CG)', 'SKAC']
The extra parentheses are to show which elements were selected from the split_points list. I don't expect the start-stop points to ever overlap.
I have a bunch of ideas that would work, but seem terribly inefficient (code-length wise), and it seems like there must be a nice pythonic way of doing this.
Strange way to split strings you have there:
def splitter( s, points ):
    c = 0
    for x,y in points:
        yield s[c:x]
        yield "(%s)" % s[x:y+1]
        c=y+1
    yield s[c:]
print list(splitter(str, split_points))
# => ['M', '(SEP)', 'AGD', '(VRQN)', 'P', '(CG)', 'SKAC']
# if some start and endpoints are the same remove empty strings.
print list(x for x in splitter(str, split_points) if x != '')
Here is a simple solution to grab each of the subsequences specified by the points:
In[4]: [str[p[0]:p[1]+1] for p in split_points]
Out[4]: ['SEP', 'VRQN', 'CG']
To get the parenthesis:
In[5]: ['(' + str[p[0]:p[1]+1] + ')' for p in split_points]
Out[5]: ['(SEP)', '(VRQN)', '(CG)']
Here is a cleaner way to do the whole thing (string is the input sequence):
results = []
for i in range(len(split_points)):
    start, stop = split_points[i]
    stop += 1
    last_stop = split_points[i-1][1] + 1 if i > 0 else 0
    results.append(string[last_stop:start])
    results.append('(' + string[start:stop] + ')')
results.append(string[split_points[-1][1]+1:])
All of the solutions below are bad and are more for fun than anything else; do not use them!
This is more of a WTF solution, but I figured I'd post it since it was asked for in the comments:
split_points = [(x, y+1) for x, y in split_points]
split_points = [((split_points[i-1][1] if i > 0 else 0, p[0]), p) for i, p in zip(range(len(split_points)), split_points)]
results = [string[n[0]:n[1]] + '\n(' + string[m[0]:m[1]] + ')' for n, m in split_points] + [string[split_points[-1][1][1]:]]
results = '\n'.join(results).split()
Still trying to figure out the one-liner; here's a two-liner:
split_points = [((split_points[i-1][1]+1 if i > 0 else 0, p[0]), (p[0], p[1]+1)) for i, p in zip(range(len(split_points)), split_points)]
print '\n'.join([string[n[0]:n[1]] + '\n(' + string[m[0]:m[1]] + ')' for n, m in split_points] + [string[split_points[-1][1][1]:]]).split()
And the one liner that should never be used:
print '\n'.join([string[n[0]:n[1]] + '\n(' + string[m[0]:m[1]] + ')' for n, m in (((split_points[i-1][1]+1 if i > 0 else 0, p[0]), (p[0], p[1]+1)) for i, p in zip(range(len(split_points)), split_points))] + [string[split_points[-1][1]+1:]]).split()
Here's some code that will work.
result = []
last_end = 0
for sp in split_points:
    result.append(str[last_end:sp[0]])
    result.append('(' + str[sp[0]:sp[1]+1] + ')')
    last_end = sp[1]+1
result.append(str[last_end:])
print result
If you just want the parts in the parenthesis it becomes a little simpler:
result = [str[sp[0]:sp[1]+1] for sp in split_points]
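The answers in this thread are written for Python 2 (print statements, list-returning map). For Python 3, the loop-based approach could be sketched as a generator like the one below. The name split_by_ranges is made up, the ranges are assumed to be sorted, non-overlapping, inclusive [start, stop] pairs as in the question, and the sequence is called seq to avoid shadowing the built-in str.

```python
def split_by_ranges(seq, ranges):
    """Yield (piece, selected) pairs covering seq from left to right."""
    pos = 0
    for start, stop in ranges:
        if start > pos:
            # Unselected gap before this range.
            yield seq[pos:start], False
        # Selected range; stop is inclusive, hence the + 1.
        yield seq[start:stop + 1], True
        pos = stop + 1
    if pos < len(seq):
        # Unselected tail after the last range.
        yield seq[pos:], False

seq = "MSEPAGDVRQNPCGSKAC"
split_points = [[1, 3], [7, 10], [12, 13]]
pieces = ["(%s)" % p if selected else p
          for p, selected in split_by_ranges(seq, split_points)]
print(pieces)
# ['M', '(SEP)', 'AGD', '(VRQN)', 'P', '(CG)', 'SKAC']
```

Yielding a flag instead of pre-formatted strings leaves the choice of markup to the caller, and skipping empty gaps means adjacent ranges do not produce empty strings.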
Here's a solution that converts your split_points to regular string slices and then prints out the appropriate slices:
str = "MSEPAGDVRQNPCGSKAC"
split_points = [[1, 3], [7, 10], [12, 13]]
adjust = [s for sp in [[x, y + 1] for x, y in split_points] for s in sp]
zipped = zip([None] + adjust, adjust + [None])
out = [('(%s)' if i % 2 else '%s') % str[x:y] for i, (x, y) in
       enumerate(zipped)]
print out
>>> ['M', '(SEP)', 'AGD', '(VRQN)', 'P', '(CG)', 'SKAC']
>>> str = "MSEPAGDVRQNPCGSKAC"
>>> split_points = [[1,3], [7,10], [12,13]]
>>>
>>> all_points = sum(split_points, [0]) + [len(str)-1]
>>> map(lambda i,j: str[i:j+1], all_points[:-1], all_points[1:])
['MS', 'SEP', 'PAGDV', 'VRQN', 'NPC', 'CG', 'GSKAC']
>>>
>>> str_out = map(lambda i,j: str[i:j+1], all_points[:-1:2], all_points[1::2])
>>> str_in = map(lambda i,j: str[i:j+1], all_points[1:-1:2], all_points[2::2])
>>> sum(map(list, zip(['(%s)' % s for s in str_in], str_out[1:])), [str_out[0]])
['MS', '(SEP)', 'PAGDV', '(VRQN)', 'NPC', '(CG)', 'GSKAC']
Probably not for elegance, but just because I can do it in a one-liner :)
>>> reduce(lambda a,ij:a[:-1]+[str[a[-1]:ij[0]],'('+str[ij[0]:ij[1]+1]+')',
...        ij[1]+1], split_points, [0])[:-1] + [str[split_points[-1][-1]+1:]]
['M', '(SEP)', 'AGD', '(VRQN)', 'P', '(CG)', 'SKAC']
Maybe you like it. Here is some explanation:
In your question you pass one set of slices, and implicitly you want to have the complement set of slices as well (to generate the un-parenthesized [is that English?] slices). So basically, each slice [i,j] lacks the previous j, e.g. [7,10] lacks the 3 and [1,3] lacks the 0.
reduce processes lists and at each step passes the output so far (a) plus the next input element (ij). The trick is that apart from producing the plain output, we add each time an extra variable (a sort of memory) which is retrieved in the next step as a[-1]. In this particular example we store the position just after the last j (that is, j+1), and hence at all times we have the full information to provide both the unparenthesized and the parenthesized substring.
Finally, the memory is stripped with [:-1] and replaced by the remainder of the original str in [str[split_points[-1][-1]+1:]].
