Modifying a string under a specific condition

In a Python program, I am trying to modify a string under a specific condition:
X = ('4c0')
Sig = ['a', 'b', 'c', 'e']
Sig is a list. Additionally, I have a tuple:
T = (4,'d',5)
If the second element (T[1]) is not in Sig, I must create another string, starting from X:
as T[1] ('d') is not in Sig, T[2] must replace T[0] in X ('5' replacing '4');
the last element in X must be incremented by 1 ('1' replacing '0').
In this case, the desired result should be:
Y = ('5c1')
I wrote this code, but it does not add any string to Y:
Y = []
for i in TT:  # TT is a list that contains the tuple T
    i = list(i)
    if i[1] not in Sig:
        for j in TT:
            if type(j[2]) == str:
                if i[1] == j[1]:
                    Y.append(j[2][0]+i[1]+str(int(j[2][2]+1)))
Any ideas how I could solve this problem?

You need to be clearer about what the conditions are. If this is a one-time requirement, you can use:
i = 1
if T[i] not in Sig:
    Y = list(X)  # convert the string to a list so its characters can be replaced
    Y[i-1] = T[i+1]  # operation 1: replace the element one position before the checked tuple element
    Y[-1] = int(Y[-1])+1  # operation 2: add 1 to the last element
    print(''.join(map(str, Y)))
5c1

There are a lot of missing conditions in your problem statement (e.g. what to do if 'd' is in Sig, how to handle values past 9, what if X[0] is not the same as T[0], what if X[2] is equal to T[2]).
For the example given, a simple string format should suffice:
X = ('4c0')
Sig = ['a', 'b', 'c', 'e']
T = (4,'d',5)
if T[1] not in Sig:
    Y = f"{T[2]}{X[1]}{int(X[2])+1}"
    print(Y) # 5c1
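To make the open questions above concrete, here is a minimal sketch that wraps the same rule in a function. The name update_code and every choice it makes (return X unchanged when T[1] is in Sig, raise when X does not start with T[0], allow the trailing number to grow past 9) are assumptions for illustration; the question does not specify any of them.

```python
def update_code(x, t, sig):
    """Hypothetical helper: apply the rule from the question to one string."""
    old, key, new = t
    if key in sig:
        # Assumption: nothing changes when the key is already in sig.
        return x
    prefix = str(old)
    if not x.startswith(prefix):
        # Assumption: a mismatch between X and T[0] is treated as an error.
        raise ValueError(f"{x!r} does not start with {prefix!r}")
    # Split what follows the prefix into a middle part and trailing digits.
    rest = x[len(prefix):]
    middle = rest.rstrip("0123456789")
    counter = rest[len(middle):]
    if not counter:
        raise ValueError(f"{x!r} does not end in a number")
    return f"{new}{middle}{int(counter) + 1}"


print(update_code("4c0", (4, "d", 5), ["a", "b", "c", "e"]))  # 5c1
```

Treating the trailing digits as one number means '4c9' becomes '5c10' rather than wrapping around, which is only one of several reasonable readings.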

Related

Insert an item to a Python list without using insert()

How do you add an item at a specific position in a list? When you have an empty list and want to add 'z' at the 3rd position, insert() only inserts it at the last position:
l.insert(3,'z')
l
['z']
I want the output to be
[None, None, None, 'z']
or
['','','','z']

Try this method using a list comprehension -
n = 5
s = 'z'
out = [None if i!=n-1 else s for i in range(n)]
print(out)
[None, None, None, None, 'z']
If you want to insert the string somewhere in the middle, then a more general way is to define m and n separately where n is length of the list and m is the position -
n = 5
m = 3
s = 'z'
out = [None if i!=m-1 else s for i in range(n)]
print(out)
[None, None, 'z', None, None]

Assuming you want to have it in the Nth index:
l = l[:N] + ['z'] + l[N:]
If you start with an empty list and want it padded with None before the new item, maybe this will help you (N is the number of None items you want):
l = [None] * N
l = l[:N] + ['z'] + l[N:]
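The padding idea from these answers can be folded into one small function. This is a sketch only: set_at and its fill parameter are hypothetical names, and it assumes that an index already inside the list should be overwritten rather than shifted.

```python
def set_at(lst, index, value, fill=None):
    """Hypothetical helper: return a copy of lst with value stored at index."""
    # Pad with `fill` until the index exists (a negative count adds nothing),
    # then overwrite that slot.
    padded = lst + [fill] * (index + 1 - len(lst))
    padded[index] = value
    return padded


print(set_at([], 3, 'z'))           # [None, None, None, 'z']
print(set_at([], 3, 'z', fill=''))  # ['', '', '', 'z']
```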

More elegant way to implement regexp-like quantifiers

I'm writing a simple string parser which allows regexp-like quantifiers. An input string might look like this:
s = "x y{1,2} z"
My parser function translates this string to a list of tuples:
list_of_tuples = [("x", 1, 1), ("y", 1, 2), ("z", 1, 1)]
Now, the tricky bit is that I need a list of all valid combinations that are specified by the quantification. The combinations all have to have the same number of elements, and the value None is used for padding. For the given example, the expected output is
[["x", "y", None, "z"], ["x", "y", "y", "z"]]
I do have a working solution, but I'm not really happy with it: it uses two nested for loops, and I find the code somewhat obscure, so there's something generally awkward and clumsy about it:
import itertools
def permute_input(lot):
    outer = []
    # is there something that replaces these nested loops?
    for val, start, end in lot:
        inner = []
        # For each tuple, create a list of constant length
        # Each element contains a different number of
        # repetitions of the value of the tuple, padded
        # by the value None if needed.
        for i in range(start, end + 1):
            x = [val] * i + [None] * (end - i)
            inner.append(x)
        outer.append(inner)
    # Outer is now a list of lists.
    final = []
    # use itertools.product to combine the elements in the
    # list of lists:
    for combination in itertools.product(*outer):
        # flatten the elements in the current combination,
        # and append them to the final list:
        final.append([x for x
                      in itertools.chain.from_iterable(combination)])
    return final
print(permute_input([("x", 1, 1), ("y", 1, 2), ("z", 1, 1)]))
[['x', 'y', None, 'z'], ['x', 'y', 'y', 'z']]
I suspect that there's a much more elegant way of doing this, possibly hidden somewhere in the itertools module?

One alternative way to approach the problem is to use pyparsing and this example regex parser that would expand a regular expression to possible matching strings. For your x y{1,2} z sample string it would generate two possible strings expanding the quantifier:
$ python -i regex_invert.py
>>> s = "x y{1,2} z"
>>> for item in invert(s):
...     print(item)
...
x y z
x yy z
The repetition itself supports both an open-ended range and a closed range and is defined as:
repetition = (
    (lbrace + Word(nums).setResultsName("count") + rbrace) |
    (lbrace + Word(nums).setResultsName("minCount") + "," + Word(nums).setResultsName("maxCount") + rbrace) |
    oneOf(list("*+?"))
)
To get to the desired result, we should modify the way the results are yielded from the recurseList generator and return lists instead of strings:
for s in elist[0].makeGenerator()():
    for s2 in recurseList(elist[1:]):
        yield [s] + [s2] # instead of yield s + s2
Then we only need to flatten the result:
$ ipython3 -i regex_invert.py
In [1]: import collections.abc
In [2]: def flatten(l):
...:     for el in l:
...:         # collections.Iterable was removed in Python 3.10; use collections.abc
...:         if isinstance(el, collections.abc.Iterable) and not isinstance(el, (str, bytes)):
...:             yield from flatten(el)
...:         else:
...:             yield el
...:
In [3]: s = "x y{1,2} z"
In [4]: for option in invert(s):
...:     print(list(flatten(option)))
...:
['x', ' ', 'y', None, ' ', 'z']
['x', ' ', 'y', 'y', ' ', 'z']
Then, if needed, you can filter the whitespace characters:
In [5]: for option in invert(s):
...:     print([item for item in flatten(option) if item != ' '])
...:
['x', 'y', None, 'z']
['x', 'y', 'y', 'z']

Recursive solution (simple, good for up to few thousand tuples):
def permutations(lot):
    if not lot:
        yield []
    else:
        item, start, end = lot[0]
        for prefix_length in range(start, end+1):
            for perm in permutations(lot[1:]):
                yield [item]*prefix_length + [None] * (end - prefix_length) + perm
It is limited by the recursion depth (~1000). If that is not enough, there is a simple optimization for start == end cases. Depending on the expected size of list_of_tuples, it might be enough.
Test:
>>> list(permutations(list_of_tuples)) # list() because it's an iterator
[['x', 'y', None, 'z'], ['x', 'y', 'y', 'z']]
Without recursion (universal but less elegant):
def permutations(lot):
    source = []
    cnum = 1 # number of possible combinations
    for item, start, end in lot: # create full list without Nones
        source += [item] * end # a full group holds `end` copies (also correct when start > 1)
        cnum *= (end-start+1)
    for i in range(cnum):
        bitmask = [True] * len(source)
        state = i
        pos = 0
        for _, start, end in lot:
            state, m = divmod(state, end-start+1) # m - number of Nones to insert
            pos += end # move to the end of this item's group
            bitmask[pos-m:pos] = [None] * m
        yield [bitmask[i] and c for i, c in enumerate(source)]
The idea behind this solution: we are effectively looking at the full string (xyyz) through a glass which adds a certain number of Nones. We can count the number of possible combinations by calculating the product of all (end-start+1). Then we can simply number all iterations (a simple range loop) and reconstruct this mask from the iteration number. Here we reconstruct the mask by iteratively using divmod on the state number and using the remainder as the number of Nones at the symbol position.
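The divmod step described above is a mixed-radix decoding: each tuple contributes one "digit" whose base is end-start+1. A minimal sketch of just that step, with decode as a hypothetical helper name:

```python
def decode(number, radices):
    """Split number into one digit per radix (mixed-radix decoding)."""
    digits = []
    for radix in radices:
        number, digit = divmod(number, radix)
        digits.append(digit)
    return digits


lot = [("x", 1, 1), ("y", 1, 2), ("z", 1, 1)]
radices = [end - start + 1 for _, start, end in lot]  # [1, 2, 1]
for n in range(2):  # 1 * 2 * 1 = 2 combinations in total
    print(n, decode(n, radices))
# 0 [0, 0, 0]
# 1 [0, 1, 0]
```

Each digit is the number of Nones to place at the end of the corresponding group, so iteration 1 pads the 'y' group with one None.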

The part generating the different lists based on the tuple can be written using list comprehension:
outer = []
for val, start, end in lot:
    # For each tuple, create a list of constant length
    # Each element contains a different number of
    # repetitions of the value of the tuple, padded
    # by the value None if needed.
    outer.append([[val] * i + [None] * (end - i) for i in range(start, end + 1)])
(The whole thing could again be written as a list comprehension, but that makes the code harder to read IMHO.)
On the other hand, the list comprehension in [x for x in itertools.chain.from_iterable(combination)] could be written more concisely. Indeed, the whole point is to build an actual list out of an iterable. This could be done with list(itertools.chain.from_iterable(combination)). An alternative would be to use the sum builtin. I am not sure which is better.
Finally, the final.append part could be written with a list comprehension:
# use itertools.product to combine the elements in the list of lists:
# flatten the elements in the current combination,
return [sum(combination, []) for combination in itertools.product(*outer)]
The final code is based on the code you've written, slightly reorganised:
import itertools
def permute_input(lot):
    outer = []
    for val, start, end in lot:
        # For each tuple, create a list of constant length
        # Each element contains a different number of
        # repetitions of the value of the tuple, padded
        # by the value None if needed.
        outer.append([[val] * i + [None] * (end - i) for i in range(start, end + 1)])
    # use itertools.product to combine the elements in the list of lists:
    # flatten the elements in the current combination,
    return [sum(combination, []) for combination in itertools.product(*outer)]
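On the sum versus itertools.chain.from_iterable question raised in this answer: both give the same flattened list, so the difference is cost rather than result. sum(parts, []) builds a new list for every addition, so the work grows roughly quadratically with the number of parts, while chain.from_iterable visits each element once. For the short combinations here either is fine. A small sketch with made-up sample data:

```python
import itertools

# One combination as itertools.product would yield it (sample data).
parts = (["x"], ["y", None], ["z"])

flat_sum = sum(parts, [])
flat_chain = list(itertools.chain.from_iterable(parts))

print(flat_sum)                # ['x', 'y', None, 'z']
print(flat_sum == flat_chain)  # True
```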

Function that takes a list of strings and returns another specific list

I need to create a function that takes a list of words, checks all the strings in that list, and returns another list containing the strings whose first and last characters are the same.
For example, given input_list = ['a','aa','aba','abb'] output should be ['aa','aba'].

Try the following:
def myfunc(lst):
    return [item for item in lst if len(item) > 1 and item[0] == item[-1]]
>>> myfunc(['a','aa','aba','abb'])
['aa', 'aba']
>>>

Just check the length is > 1, and see if the first char x[0] is equal to the last char x[-1]:
print(list(filter(lambda x: len(x) > 1 and x[0] == x[-1], lst)))
['aa', 'aba']
Or if you want a function:
f = lambda l:list(filter(lambda x: len(x) > 1 and x[0] == x[-1], l))
print(f(lst))

The typical way to approach this is to filter the list, rather than treating the result as a different list, and to define a function (either regular or lambda) that expresses what should be kept. That way your code is clear and easy to test and maintain:
filteredList = filter(lambda x: len(x) > 1 and x[0] == x[-1], myList)
#or:
def starts_and_ends_with_same_char(subject):
    return len(subject) > 1 and subject[0] == subject[-1]
filteredList = filter(starts_and_ends_with_same_char, myList)
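As a quick illustration of the "easy to test" point, a named predicate can be exercised on its own, including edge cases such as the empty string. This is a sketch; the sample list is an assumption extended from the question's example.

```python
def starts_and_ends_with_same_char(subject):
    # Strings shorter than two characters are excluded, as in the answers above.
    return len(subject) > 1 and subject[0] == subject[-1]


words = ['a', 'aa', 'aba', 'abb', '']
# On Python 3, filter() returns a lazy iterator, so wrap it in list() to see the result.
print(list(filter(starts_and_ends_with_same_char, words)))  # ['aa', 'aba']
```

Because the length check runs first, the empty string never reaches the indexing and no IndexError is raised.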

Golfing a little:
>>> [s for s in lst if s[1:] and s[0] == s[-1]]
['aa', 'aba']

Python: finding lowest integer

I have the following code:
l = ['-1.2', '0.0', '1']
x = 100.0
for i in l:
    if i < x:
        x = i
print(x)
The code should find the lowest value in my list (-1.2), but when I print x its value is still 100.0.
Where is my code going wrong?

To find the minimum value of a list, you might just as well use min:
x = min(float(s) for s in l) # min of a generator
Or, if you want the result as a string, rather than a float, use a key function:
x = min(l, key=float)

You aren't comparing integers, you're comparing strings. Strings compare lexicographically -- meaning character by character -- instead of (as you seem to want) by converting the value to a float. Make your list hold numbers (floats or integers, depending on what you want), or convert the strings to floats or integers in your loop, before you compare them.
You may also be interested in the min builtin function, which already does what your current loop does (without the converting, that is.)
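A short sketch of what lexicographic comparison means in practice, and how the key argument of min changes it. The sample values are made up for illustration.

```python
values = ['10', '9', '-1.2']

print('10' < '9')                   # True: '1' sorts before '9'
print(min(values))                  # -1.2, but only because '-' sorts before the digits
print(min(['10', '9']))             # 10, which is wrong numerically
print(min(values, key=float))       # -1.2
print(min(['10', '9'], key=float))  # 9
```

With key=float the comparison is numeric, while the value returned is still the original string.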

It looks like you want to convert the list to a list of numbers
>>> foo = ['-1.2', '0.0', '1']
>>> bar = list(map(float, foo))
>>> bar
[-1.2, 0.0, 1.0]
>>> min(bar)
-1.2
or if it really is strings you want, that you want to use min's key argument
>>> foo = ['-1.2', '0.0', '1']
>>> min(foo, key=float)
'-1.2'

Python has a built in min function to help you with finding the smallest.
However, you need to convert your list items to numbers before you can find the lowest integer (what, isn't that a float?):
min(float(i) for i in l)

l is a list of strings. When you put numbers between single quotes like that, you are creating strings, which are just a sequence of characters. To make your code work properly, you would have to do this:
l = [-1.2, 0.0, 1] # no quotation marks
x = 100.0
for i in l:
    if i < x:
        x = i
print(x)
If you must use a list of strings, you can try to let Python try to make a number out of each string. This is similar to Justin's answer, except it understands floating-point (decimal) numbers correctly.
l = ['-1.2', '0.0', '1']
x = 100.0
for i in l:
    inum = float(i)
    if inum < x:
        x = inum
print(x)
I hope that this is code that you are writing to learn either Python or programming in general. If this is the case, great. However, if this is production code, consider using Python's built-in functions.
l = ['-1.2', '0.0', '1']
lnums = map(float, l) # turn strings to numbers
x = min(lnums) # find minimum value
print(x)

number_list = [99.5,1.2,-0.3]
number_list.sort()
print(number_list[0])

Cast the variable to a float before doing the comparison:
if float(i) < float(x):
The problem is that you are comparing strings to floats, which will not work.

list1 = [10,-4,5,2,33,4,7,8,2,3,5,8,99,-34]
print(list1)
max_v=list1[0]
min_v=list1[0]
for x in list1:
    if x>max_v:
        max_v=x
    print('X is {0} and max is {1}'.format(x,max_v))
for x in list1:
    if x<min_v:
        min_v=x
    print('X is {0} and min is {1}'.format(x,min_v))
print('Max values is ' + str(max_v))
print('Min values is ' + str(min_v))

Or no float conversion at all by just specifying floats in the list.
l = [-1.2, 0.0, 1]
x = min(l)
or
l = min([-1.2, 0.0, 1])

You have to start somewhere. The correct code to return the minimum value should be:
l = [ '0.0', '1','-1.2']
x = l[0]
for i in l:
    if float(i) < float(x):  # compare numerically, not as strings
        x = i
print(x)
But again, it's better to use numbers directly instead of putting them in quotation marks:
l = [ 0.0, 1,-1.2]
x = l[0]
for i in l:
    if i < x:
        x = i
print(x)

l = [-1.2, 0.0, 1]
x = 100.0
for i in l:
    if i < x:
        x = i
print(x)
This is the answer. I needed this for my homework: I took your code and deleted the quotation marks around the numbers, and it then worked. I hope this helps.

You have strings in the list and you are comparing them with the number 100.0.
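The result in the question (x stays at 100.0) is what Python 2 does, because there a string never compares lower than a number. On Python 3 the same comparison raises TypeError instead, which makes the mistake easier to spot. A minimal sketch, assuming Python 3:

```python
l = ['-1.2', '0.0', '1']
x = 100.0

try:
    for i in l:
        if i < x:  # str < float is not defined on Python 3
            x = i
except TypeError as exc:
    print("comparison failed:", exc)

# Converting each item first makes the comparison numeric.
x = 100.0
for i in l:
    x = min(x, float(i))
print(x)  # -1.2
```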

'''Functions'''
import math
#functions
def min3(x1,x2,x3):
    if x1<= x2 and x1<= x3:
        return x1
    elif x2<= x1 and x2<= x3:
        return x2
    elif x3<= x2 and x3<= x1:
        return x3
print(min3(4, 7, 5))
print(min3(4, 5, 5))
print(min3(4, 4, 4))
print(min3(-2, -6, -100))
print(min3("Z", "B", "A"))

Given a list of slices, how do I split a sequence by them?

Given a list of slices, how do I separate a sequence based on them?
I have long amino-acid strings that I would like to split based on start-stop values in a list. An example is probably the most clear way of explaining it:
str = "MSEPAGDVRQNPCGSKAC"
split_points = [[1,3], [7,10], [12,13]]
output >> ['M', '(SEP)', 'AGD', '(VRQN)', 'P', '(CG)', 'SKAC']
The extra parentheses are to show which elements were selected from the split_points list. I don't expect the start-stop points to ever overlap.
I have a bunch of ideas that would work, but seem terribly inefficient (code-length wise), and it seems like there must be a nice pythonic way of doing this.

Strange way to split strings you have there:
def splitter( s, points ):
    c = 0
    for x,y in points:
        yield s[c:x]
        yield "(%s)" % s[x:y+1]
        c=y+1
    yield s[c:]
print(list(splitter(str, split_points)))
# => ['M', '(SEP)', 'AGD', '(VRQN)', 'P', '(CG)', 'SKAC']
# if some start and endpoints are the same remove empty strings.
print(list(x for x in splitter(str, split_points) if x != ''))

Here is a simple solution to grab each of the pieces specified by the points:
In[4]: [str[p[0]:p[1]+1] for p in split_points]
Out[4]: ['SEP', 'VRQN', 'CG']
To get the parenthesis:
In[5]: ['(' + str[p[0]:p[1]+1] + ')' for p in split_points]
Out[5]: ['(SEP)', '(VRQN)', '(CG)']
Here is a cleaner way to do the whole thing:
results = []
# string is the sequence being split (named str in the question)
for i in range(len(split_points)):
    start, stop = split_points[i]
    stop += 1
    last_stop = split_points[i-1][1] + 1 if i > 0 else 0
    results.append(string[last_stop:start])
    results.append('(' + string[start:stop] + ')')
results.append(string[split_points[-1][1]+1:])
All of the below solutions are bad, and more for fun than anything else, do not use them!
This is more of a WTF solution, but I figured I'd post it since it was asked for in comments:
split_points = [(x, y+1) for x, y in split_points]
split_points = [((split_points[i-1][1] if i > 0 else 0, p[0]), p) for i, p in zip(range(len(split_points)), split_points)]
results = [string[n[0]:n[1]] + '\n(' + string[m[0]:m[1]] + ')' for n, m in split_points] + [string[split_points[-1][1][1]:]]
results = '\n'.join(results).split()
still trying to figure out the one liner, here's a two:
split_points = [((split_points[i-1][1]+1 if i > 0 else 0, p[0]), (p[0], p[1]+1)) for i, p in zip(range(len(split_points)), split_points)]
print '\n'.join([string[n[0]:n[1]] + '\n(' + string[m[0]:m[1]] + ')' for n, m in split_points] + [string[split_points[-1][1][1]:]]).split()
And the one liner that should never be used:
print '\n'.join([string[n[0]:n[1]] + '\n(' + string[m[0]:m[1]] + ')' for n, m in (((split_points[i-1][1]+1 if i > 0 else 0, p[0]), (p[0], p[1]+1)) for i, p in zip(range(len(split_points)), split_points))] + [string[split_points[-1][1]:]]).split()

Here's some code that will work.
result = []
last_end = 0
for sp in split_points:
    result.append(str[last_end:sp[0]])
    result.append('(' + str[sp[0]:sp[1]+1] + ')')
    last_end = sp[1]+1
result.append(str[last_end:])
print(result)
If you just want the parts in the parenthesis it becomes a little simpler:
result = [str[sp[0]:sp[1]+1] for sp in split_points]

Here's a solution that converts your split_points to regular string slices and then prints out the appropriate slices:
str = "MSEPAGDVRQNPCGSKAC"
split_points = [[1, 3], [7, 10], [12, 13]]
adjust = [s for sp in [[x, y + 1] for x, y in split_points] for s in sp]
zipped = zip([None] + adjust, adjust + [None])
out = [('(%s)' if i % 2 else '%s') % str[x:y] for i, (x, y) in
       enumerate(zipped)]
print(out)
# => ['M', '(SEP)', 'AGD', '(VRQN)', 'P', '(CG)', 'SKAC']

>>> str = "MSEPAGDVRQNPCGSKAC"
>>> split_points = [[1,3], [7,10], [12,13]]
>>>
>>> all_points = sum(split_points, [0]) + [len(str)-1]
>>> map(lambda i,j: str[i:j+1], all_points[:-1], all_points[1:])
['MS', 'SEP', 'PAGDV', 'VRQN', 'NPC', 'CG', 'GSKAC']
>>>
>>> str_out = map(lambda i,j: str[i:j+1], all_points[:-1:2], all_points[1::2])
>>> str_in = map(lambda i,j: str[i:j+1], all_points[1:-1:2], all_points[2::2])
>>> sum(map(list, zip(['(%s)' % s for s in str_in], str_out[1:])), [str_out[0]])
['MS', '(SEP)', 'PAGDV', '(VRQN)', 'NPC', '(CG)', 'GSKAC']

Probably not for elegance, but just because I can do it in a oneliner :)
>>> from functools import reduce  # needed on Python 3
>>> reduce(lambda a,ij:a[:-1]+[str[a[-1]:ij[0]],'('+str[ij[0]:ij[1]+1]+')',
... ij[1]+1], split_points, [0])[:-1] + [str[split_points[-1][-1]+1:]]
['M', '(SEP)', 'AGD', '(VRQN)', 'P', '(CG)', 'SKAC']
Maybe you like it. Here is some explanation:
In your question you pass one set of slices, and implicitly you want to have the complement set of slices as well (to generate the un-parenthesized [is that English?] slices). So basically, each slice [i,j] lacks the previous j. e.g. [7,10] lacks the 3 and [1,3] lacks the 0.
reduce processes lists and at each step passes the output so far (a) plus the next input element (ij). The trick is that apart from producing the plain output, we add each time an extra variable --- a sort of memory --- which is in the next step retrieved in a[-1]. In this particular example we store the position just after the last j (j+1, since the slices are inclusive), and hence at all times we have the full information to provide both the unparenthesized and the parenthesized substring.
Finally, the memory is stripped with [:-1] and replaced by the remainder of the original str in [str[split_points[-1][-1]+1:]].
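The same "carry a memory through reduce" idea may be easier to follow when the memory is kept in a separate slot of the accumulator instead of at the end of the output list. A sketch for Python 3, where step and seq are hypothetical names (seq avoids shadowing the builtin str):

```python
from functools import reduce  # reduce is not a builtin on Python 3

seq = "MSEPAGDVRQNPCGSKAC"
split_points = [[1, 3], [7, 10], [12, 13]]


def step(state, point):
    pieces, last_end = state  # last_end is the carried "memory"
    start, stop = point
    # Emit the gap before this slice, then the slice itself in parentheses.
    pieces = pieces + [seq[last_end:start], '(' + seq[start:stop + 1] + ')']
    return pieces, stop + 1


pieces, last_end = reduce(step, split_points, ([], 0))
result = pieces + [seq[last_end:]]
print(result)
# ['M', '(SEP)', 'AGD', '(VRQN)', 'P', '(CG)', 'SKAC']
```

Keeping the memory in a tuple means nothing has to be stripped from the output afterwards; the remainder of the sequence is simply appended at the end.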
