How to remove character from tuples in list? - python

How to remove "(" and ")" from
[('(10', '40)'), ('(40', '30)'), ('(20', '20)')]
in Python?

A straightforward way is to use a list comprehension with literal_eval.
>>> from ast import literal_eval
>>> tuple_list = [('(10', '40)'), ('(40', '30)'), ('(20', '20)')]
>>> [literal_eval(','.join(i)) for i in tuple_list]
[(10, 40), (40, 30), (20, 20)]
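To see why this works, it may help to look at the intermediate strings that the join produces. A minimal sketch, using the same data as above (the names joined and result are only illustrative):

```python
from ast import literal_eval

tuple_list = [('(10', '40)'), ('(40', '30)'), ('(20', '20)')]

# Joining each pair with a comma rebuilds the text of a tuple literal.
joined = [','.join(pair) for pair in tuple_list]
print(joined)  # ['(10,40)', '(40,30)', '(20,20)']

# literal_eval then parses each string as a Python literal.
result = [literal_eval(text) for text in joined]
print(result)  # [(10, 40), (40, 30), (20, 20)]
```

This relies on every pair already looking like the two halves of a tuple literal; strings in another shape would likely make literal_eval raise an error.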

Depending on how you are currently storing the list:
def to_int(s):
    s = ''.join(ch for ch in s if ch.isdigit())
    return int(s)
lst = [('(10', '40)'), ('(40', '30)'), ('(20', '20)')]
lst = [(to_int(a), to_int(b)) for a,b in lst] # => [(10, 40), (40, 30), (20, 20)]
or
import ast
s = "[('(10', '40)'), ('(40', '30)'), ('(20', '20)')]"
s = s.replace("'(", "'").replace(")'", "'")
lst = ast.literal_eval(s) # => [('10', '40'), ('40', '30'), ('20', '20')]
lst = [(int(a), int(b)) for a,b in lst] # => [(10, 40), (40, 30), (20, 20)]

>>> L = [('(10', '40)'), ('(40', '30)'), ('(20', '20)')]
>>> [tuple((subl[0].lstrip("("), subl[1].rstrip(")"))) for subl in L]
[('10', '40'), ('40', '30'), ('20', '20')]
Or if you want the numbers in your tuples to eventually be ints:
>>> [tuple((int(subl[0].lstrip("(")), int(subl[1].rstrip(")")))) for subl in L]
[(10, 40), (40, 30), (20, 20)]

You can call .strip('()') on the individual items (if they are strings, as in your example) to strip leading and trailing ( and ).
There are multiple ways to apply that to each element:
list comprehension (most pythonic)
a = [tuple(x.strip('()') for x in y) for y in a]
map and lambda (interesting to see)
Python 3:
def cleanup(a: "list<tuple<str>>") -> "list<tuple<str>>":
    return list(map(lambda y: tuple(map(lambda x: x.strip('()'), y)), a))
a = cleanup(a)
Python 2:
def cleanup(a):
    return map(lambda y: tuple(map(lambda x: x.strip('()'), y)), a)
a = cleanup(a)
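The variants above leave the cleaned items as strings. If integers are wanted, the strip and the conversion can be combined in one comprehension. A minimal sketch, where cleanup_to_int is a hypothetical helper name and every item is assumed to contain only digits once the brackets are stripped:

```python
def cleanup_to_int(pairs):
    """Strip '(' and ')' from both ends of every string, then convert to int."""
    return [tuple(int(item.strip('()')) for item in pair) for pair in pairs]

a = [('(10', '40)'), ('(40', '30)'), ('(20', '20)')]
print(cleanup_to_int(a))  # [(10, 40), (40, 30), (20, 20)]
```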

Process the original string instead. Let's call it a.
On a='((10 40), (40 30), (20 20), (30 10))' , you can call
[tuple(x[1:-1].split(' ')) for x in a[1:-1].split(', ')]
The [1:-1] slices trim the outer brackets from the string, and the split calls break the strings into lists of strings.
The for clause makes it a list comprehension.

s = "((10 40), (40 30), (20 20), (30 10))"
print([[int(x) for x in inner.strip(' ()').split()] for inner in s.split(',')])
# or if you actually need tuples:
tuple([tuple([int(x) for x in inner.strip(' ()').split()]) for inner in s.split(',')])

Related

Slice array of tuples by its elements

For example, I have an array of tuples.
l=[(456,33,1),
(556,22,1),
(123,33,2),
(557,32,2),
(435,21,2)]
I want to group the tuples into lists by the last element of each tuple, so that
l=[[(456,33),
(556,22)],
[(123,33),
(557,32),
(435,21)]]
How could I do that? Is there any elegant way?
l = [(456, 33, 1), (556, 22, 1), (123, 33, 2), (557, 32, 2), (435, 21, 2)]
out = {}
for v in l:
    out.setdefault(v[-1], []).append(v[:-1])
out = list(out.values())
print(out)
Prints:
[[(456, 33), (556, 22)],
[(123, 33), (557, 32), (435, 21)]]
Another way of doing this would be with itertools.groupby.
from itertools import groupby
from operator import itemgetter
res = [[x[:-1] for x in g] for _, g in groupby(sorted(l, key=itemgetter(2)), itemgetter(2))]
which produces the desired:
[[(456, 33), (556, 22)], [(123, 33), (557, 32), (435, 21)]]
Note that if you can guarantee that the original list l is sorted by the 3rd item in the tuples, you can replace sorted(l, key=itemgetter(2)) with just l.
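The reason the sort matters is that groupby only merges neighbouring items that share a key. A small sketch of the difference, using a deliberately shuffled copy of the data (unsorted_l and group_by_last are illustrative names, not part of the answer above):

```python
from itertools import groupby
from operator import itemgetter

# Same tuples as above, but deliberately not sorted by the last element.
unsorted_l = [(456, 33, 1), (123, 33, 2), (556, 22, 1), (557, 32, 2), (435, 21, 2)]

def group_by_last(items):
    # Group consecutive items by their third element and drop that element.
    return [[x[:-1] for x in g] for _, g in groupby(items, itemgetter(2))]

without_sort = group_by_last(unsorted_l)
with_sort = group_by_last(sorted(unsorted_l, key=itemgetter(2)))

print(without_sort)  # [[(456, 33)], [(123, 33)], [(556, 22)], [(557, 32), (435, 21)]]
print(with_sort)     # [[(456, 33), (556, 22)], [(123, 33), (557, 32), (435, 21)]]
```

Because sorted is stable, the tuples keep their original relative order inside each group.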
An alternative attempt is using this:
l=[(456,33,1),
(556,22,1),
(123,33,2),
(557,32,2),
(435,21,2)]
dict_ = {}
for v in l:
    val1, val2, id_ = v
    dict_[id_] = [[val1, val2]] if id_ not in dict_ else dict_[id_] + [[val1, val2]]
l = [dict_[x] for x in dict_]

convert and process a dictionary to matrix in python

I have a big list in Python like the following small example:
['GAATTCCTTGAGGCCTAAATGCATCGGGGTGCTCTGGTTTTGTTGTTGTTATTTCTGAATGACATTTACTTTGGTGCTCTTTATTTTGCGTATTTAAAAC', 'TAAGTCCCTAAGCATATATATAATCATGAGTAGTTGTGGGGAAAATAACACCATTAAATGTACCAAAACAAAAGACCGATCACAAACACTGCCGATGTTTCTCTGGCTTAAATTAAATGTATATACAACTTATATGATAAAATACTGGGC']
I want to build a new list in which every string is converted to a list of tuples. In effect, I want to split the length of each string into steps of 10: the 1st tuple would be (1, 10), the 2nd tuple (10, 20), and so on until the end, depending on the length of the string. In the end, every string becomes a list of tuples, so I would have a list of lists.
In the small example the 1st string has 100 characters and the 2nd string has 150 characters.
For example, the expected output for the small example would be:
new_list = [[(1, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60), (60, 70), (70, 80), (80, 90), (90, 100)], [(1, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60), (60, 70), (70, 80), (80, 90), (90, 100), (100, 110), (110, 120), (120, 130), (130, 140), (140, 150)]]
To make such a list I wrote the following code, but it does not return what I expect. Do you know how to fix it?
mylist = []
valTup = list()
for count, char in enumerate(mylist):
    if count % 10 == 0 and count > 0:
        valTup.append(count)
    else:
        new_list.append(tuple(valTup))
I recommend using the boltons package, specifically boltons.iterutils.
boltons.iterutils.chunked_iter(src, size) returns pieces of the source iterable in size-sized chunks (this example was copied from the docs):
>>> list(chunked_iter(range(10), 3))
[[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]
Example:
from boltons.iterutils import chunked_iter
adn = [
'GAATTCCTTGAGGCCTAAATGCATCGGGGTGCTCTGGTTTTGTTGTTGTTATTTCTGAATGACATTTACTTTGGTGCTCTTTATTTTGCGTATTTAAAAC',
'TAAGTCCCTAAGCATATATATAATCATGAGTAGTTGTGGGGAAAATAACACCATTAAATGTACCAAAACAAAAGACCGATCACAAACACTGCCGATGTTTCTCTGGCTTAAATTAAATGTATATACAACTTATATGATAAAATACTGGGC'
]
result = []
for s in adn:
    result.append(list(chunked_iter(list(s), 10)))
print(result)
I suggest the following solutions: the first is based on your code, the second is a one-liner, and the third, which I prefer, is based on range(), zip() and slicing:
mylist = ['GAATTCCTTGAGGCCTAAATGCATCGGGGTGCTCTGGTTTTGTTGTTGTTATTTCTGAATGACATTTACTTTGGTGCTCTTTATTTTGCGTATTTAAAAC',
'TAAGTCCCTAAGCATATATATAATCATGAGTAGTTGTGGGGAAAATAACACCATTAAATGTACCAAAACAAAAGACCGATCACAAACACTGCCGATGTTTCTCTGGCTTAAATTAAATGTATATACAACTTATATGATAAAATACTGGGC']
# Here is the solution based on your code
resultlist = []
for s in mylist:
    valTup = []
    for count, char in enumerate(s, 1):
        if count % 10 == 0:
            valTup.append((count-10, count))
    resultlist.append(valTup)
print(resultlist)
# Here is the one-line style solution
resultlist = [[(n-10, n) for n,char in enumerate(s, 1) if n % 10 == 0] for s in mylist]
print(resultlist)
# Here is my preferred solution
resultlist = []
for s in mylist:
    temp = range(1+len(s))[::10]
    resultlist.append(list(zip(temp[:-1], temp[1:])))
print(resultlist)
Are you looking for something like this?
mylist = ['GAATTCCTTGAGGCCTAAATGCATCGGGGTGCTCTGGTTTTGTTGTTGTTATTTCTGAATGACATTTACTTTGGTGCTCTTTATTTTGCGTATTTAAAAC', 'TAAGTCCCTAAGCATATATATAATCATGAGTAGTTGTGGGGAAAATAACACCATTAAATGTACCAAAACAAAAGACCGATCACAAACACTGCCGATGTTTCTCTGGCTTAAATTAAATGTATATACAACTTATATGATAAAATACTGGGC']
new_list1 = list()
new_list2 = list()
for i in range(len(mylist[0]) // 10):  # // keeps the count an integer on Python 3
    if(10+i*10 <= len(mylist[0])):
        new_list1.append(mylist[0][0+i*10:10+i*10])
    else:
        new_list1.append(mylist[0][0+i*10:])
for i in range(len(mylist[1]) // 10):
    if(10+i*10 <= len(mylist[1])):
        new_list2.append(mylist[1][0+i*10:10+i*10])
    else:
        new_list2.append(mylist[1][0+i*10:])
new_list = [new_list1,new_list2]
[['GAATTCCTTG', 'AGGCCTAAAT', 'GCATCGGGGT', 'GCTCTGGTTT',
'TGTTGTTGTT', 'ATTTCTGAAT', 'GACATTTACT', 'TTGGTGCTCT',
'TTATTTTGCG', 'TATTTAAAAC'], ['TAAGTCCCTA', 'AGCATATATA',
'TAATCATGAG', 'TAGTTGTGGG', 'GAAAATAACA', 'CCATTAAATG',
'TACCAAAACA', 'AAAGACCGAT', 'CACAAACACT', 'GCCGATGTTT',
'CTCTGGCTTA', 'AATTAAATGT', 'ATATACAACT', 'TATATGATAA',
'AATACTGGGC']]
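For comparison, the question's expected output starts the first tuple at 1 rather than 0, which none of the snippets above produces exactly. A minimal sketch that follows that convention, assuming a trailing partial block (when the length is not a multiple of the step) should simply be dropped; block_ranges is a hypothetical helper name and the two strings are stand-ins with the same lengths as the question's data:

```python
def block_ranges(s, step=10):
    """Return (start, end) pairs covering len(s) in blocks of `step`.

    The first pair starts at 1, as in the question's expected output.
    """
    ends = range(step, len(s) + 1, step)
    return [(max(end - step, 1), end) for end in ends]

# Stand-in strings of length 100 and 150 (only the lengths matter here).
mylist = ['A' * 100, 'C' * 150]
new_list = [block_ranges(s) for s in mylist]
print(new_list[0][:3])  # [(1, 10), (10, 20), (20, 30)]
print(new_list[1][-1])  # (140, 150)
```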

How to find top n overlapping items in two lists of tuples (Python)

Given
a = [('AB', 11), ('CD', 12), ('EF', 13), ('GG', 1332)]
and
b = [('AB', 411), ('XX', 132), ('EF', 113), ('AF', 113), ('FF', 113)]
If n = 3, I want to consider only the top 3 elements in each list and return the tuples that have the same first element (the string).
For example, I want to return ['AB','EF'] in this case.
How can I do this?
You could use Counter for this like:
Code:
a = [('AB', 11), ('CD', 12), ('EF', 13), ('GG', 1332)]
b = [('AB', 411), ('XX', 132), ('EF', 113), ('AF', 113), ('FF', 113)]
from collections import Counter
counts = Counter(x[0] for x in a[:3] + b[:3])
print([x for x, c in counts.items() if c == 2])
And without any imports, use a set:
print(set((x[0] for x in a[:3])).intersection(set((x[0] for x in b[:3]))))
Results:
['AB', 'EF']
{'AB', 'EF'}
Do you mean like this?
def overlapping(n, tups_a, tups_b):
    overlapping = set(map(lambda x: x[0], tups_a[:n])).intersection(set(map(lambda x: x[0], tups_b[:n])))
    return list(overlapping)
overlap = overlapping(3, a, b)
['AB', 'EF']
Using set intersection (better complexity than list membership tests with in):
def overlapping(x,y, topn=3):
    return {i[0] for i in x[:topn]} & {i[0] for i in y[:topn]}
overlapping(a,b)
outputs:
{'AB', 'EF'}
Explanation:
{i[0] for i in x[:topn]}
set comprehensions, equivalent to set(i[0] for i in x[:topn])
{...} & {...}
set intersection, equivalent to set(..).intersection(set(...))
Well, first we can start off with a for loop. We want to loop from 0 to n - 1, look at the tuples of a and b at each index, and check whether their first elements match.
matches = [a[index][0] for index in range(n) if a[index][0] == b[index][0]]
Which does the same thing as:
matches = []
for index in range(n):
    if a[index][0] == b[index][0]:
        matches.append(a[index][0])
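One caveat with comparing index by index: it only finds keys that sit at the same position in both lists, and it raises IndexError if either list has fewer than n items. A sketch that avoids both issues while keeping the order of a, assuming that a key anywhere within the top n of b should count as a match (top_n_overlap is a hypothetical name):

```python
def top_n_overlap(a, b, n=3):
    """Return first elements found in the top n of both lists, in a's order."""
    keys_b = {key for key, _ in b[:n]}  # set gives fast membership tests
    return [key for key, _ in a[:n] if key in keys_b]

a = [('AB', 11), ('CD', 12), ('EF', 13), ('GG', 1332)]
b = [('AB', 411), ('XX', 132), ('EF', 113), ('AF', 113), ('FF', 113)]
print(top_n_overlap(a, b, 3))  # ['AB', 'EF']

# A key at a different position still counts as long as it is within the top n.
b_shuffled = [('EF', 113), ('AB', 411), ('XX', 132)]
print(top_n_overlap(a, b_shuffled, 3))  # ['AB', 'EF']
```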

Python. Best way to "zip" char and list

I want to "zip" char and list in Python:
An example:
char = '<'
list = [3, 23, 67]
"zip"(char, list)
>>> [('<', 3), ('<', 23), ('<', 67)]
How I'm using itertools.repeat():
itertools.izip(itertools.repeat(char, len(list)), list)
>>>[('<', 3), ('<', 23), ('<', 67)]
It works, but I would be interested in a more Pythonic solution.
You don't need itertools here.
Using list comprehension:
>>> char = '<'
>>> lst = [3, 23, 67]
>>> [(char, n) for n in lst]
[('<', 3), ('<', 23), ('<', 67)]
BTW, don't use list as a variable name. It shadows the built-in list type.
[(char, i) for i in list]
Naming your list "list" is probably not a good idea, by the way, as this shadows the built-in list type.
If you want something equivalent to your use of itertools, that is, lazy generation during iteration, you can use a generator expression. The syntax is pretty much the same as for a list comprehension, except that you enclose the expression in parentheses.
>>> c = '<'
>>> l = [3, 23, 67]
>>> my_gen = ((c, item) for item in l)
>>> for item in my_gen:
...     print(item)
...
('<', 3)
('<', 23)
('<', 67)
For more info, here's the PEP that explains it: http://www.python.org/dev/peps/pep-0289/
If char is only ever going to be reused for all pairings, just use a list comprehension:
>>> [(char, i) for i in lst]
[('<', 3), ('<', 23), ('<', 67)]
If char is a string of characters and you want to cycle through them when pairing (plain zip() stops at the end of the shortest sequence), use itertools.cycle():
>>> from itertools import cycle
>>> chars = 'fizz'
>>> lst = range(6)
>>> list(zip(chars, lst))
[('f', 0), ('i', 1), ('z', 2), ('z', 3)]
>>> list(zip(cycle(chars), lst))
[('f', 0), ('i', 1), ('z', 2), ('z', 3), ('f', 4), ('i', 5)]
Note how the characters of the string 'fizz' are reused to pair up with the numbers 4 and 5; they'll continue to be cycled to match any length list (which must be finite).
If you really want to use zip, here is how:
l = [3, 23, 67]
list(zip('<' * len(l), l))
[('<', 3), ('<', 23), ('<', 67)]
In more detail, itertools.repeat(char, len(l)) gives much the same result as '<' * 3 here. Both work with zip, so you could also write zip(itertools.repeat(char, len(l)), l).
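The length argument to repeat is optional when the result is fed to zip, because zip stops as soon as its shortest input runs out. A minimal sketch, written for Python 3 (where zip is lazy and needs list() to show the pairs):

```python
from itertools import repeat

char = '<'
lst = [3, 23, 67]

# repeat(char) is endless, but zip stops at the end of the shorter input (lst).
pairs = list(zip(repeat(char), lst))
print(pairs)  # [('<', 3), ('<', 23), ('<', 67)]
```

Unlike '<' * len(lst), this also works when the repeated value is not a single-character string.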

removing something from a list of tuples

Say I have a list:
[(12,34,1),(123,34,1),(21,23,1)]
I want to remove the 1 from each tuple in the list so it becomes
[(12,34),(123,34),(21,23)]
You want to truncate your tuples; use a list comprehension:
[t[:-1] for t in listoftuples]
or, as a simple demonstration:
>>> listoftuples = [(12,34,1),(123,34,1),(21,23,1)]
>>> [t[:-1] for t in listoftuples]
[(12, 34), (123, 34), (21, 23)]
Tuples are immutable, so you cannot remove an item. However, you can create a new tuple from the old one that leaves out the elements you do not want. So, to delete an arbitrary value from each tuple in a list of tuples, you can do:
def deleteItem(lst, toDel):
    return [tuple(x for x in y if x != toDel) for y in lst]
Result:
>>> lst = [(12,34,1),(123,34,1),(21,23,1)]
>>> deleteItem(lst, 1)
[(12, 34), (123, 34), (21, 23)]
>>> a=[(12, 34, 1), (123, 34, 1), (21, 23, 1)]
>>> [tuple(filter(lambda v: v != 1, x)) for x in a]
[(12, 34), (123, 34), (21, 23)]
This will remove every 1 from each tuple, irrespective of its index.
Since you can't change tuples (as they are immutable), I suggest using a list of lists:
my_list = [[12,34,1],[123,34,1],[21,23,1]]
for i in my_list:
    i.remove(1)
print(my_list)
This prints: [[12, 34], [123, 34], [21, 23]].
In Python 3.2:
1. [(i, v) for i, v, c in list1]
2. list(map(lambda x: x[:2], list1))
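The answers above fall into two groups that only agree when the unwanted value appears solely in the last position: slicing removes by position, while filtering removes by value. A short sketch of the difference, using made-up data in which 1 also appears elsewhere (the data and variable names are illustrative):

```python
# Illustrative data: the value 1 also occurs outside the last position.
data = [(1, 34, 1), (123, 1, 1), (21, 23, 1)]

# Removing by position: always drops the last element, whatever it is.
by_position = [t[:-1] for t in data]

# Removing by value: drops every element equal to 1, wherever it is.
by_value = [tuple(x for x in t if x != 1) for t in data]

print(by_position)  # [(1, 34), (123, 1), (21, 23)]
print(by_value)     # [(34,), (123,), (21, 23)]
```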
