Creating a new list based on lists of tuples - python

Let's assume there is a list of tuples:
for something in x.something():
    print(something)
and it returns
('a', 'b')
('c', 'd')
('e', 'f')
('g', 'h')
('i', 'j')
And I have created two other lists containing certain elements from the x.something():
y = [('a', 'b'), ('c', 'd')]
z = [('e', 'f'), ('g', 'h')]
So I want to build a new list that labels each tuple from x.something() according to whether it appears in y or z:
newlist = []
for something in x.something():
    if something in 'y':
        newlist.append('color1')
    elif something in 'z':
        newlist.append('color2')
    else:
        newlist.append('color3')
What I would like newlist to look like is:
['color1', 'color1', 'color2', 'color2', 'color3']
But I get:
TypeError: 'in <string>' requires string as left operand, not tuple
What went wrong and how to fix it?

I think you want if something in y instead of if something in 'y', because y and z are two separate lists, not strings:
newlist = []
for something in x.something():
    if something in y:
        newlist.append('color1')
    elif something in z:
        newlist.append('color2')
    else:
        newlist.append('color3')

Remove the quotes in if something in 'y': with the quotes, Python checks whether something is contained in the string 'y', not in the list y. The same applies to z.
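A minimal sketch of the difference, reusing y from the question (the names in_list and error_message are only for illustration). Membership in a list compares the tuple against each element, while membership in a string is a substring test and therefore rejects a tuple on the left:

```python
y = [('a', 'b'), ('c', 'd')]
something = ('a', 'b')

# Membership in a list compares the tuple with each element of the list.
in_list = something in y

# Membership in a string is a substring test, so the left operand must be a str.
try:
    something in 'y'
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(in_list)
print(error_message)
```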

Try this:
t = [('a', 'b'),
     ('c', 'd'),
     ('e', 'f'),
     ('g', 'h'),
     ('i', 'j')]
y = [('a', 'b'), ('c', 'd')]
z = [('e', 'f'), ('g', 'h')]
new_list = []
for x in t:
    if x in y:
        new_list.append('color1')
    elif x in z:
        new_list.append('color2')
    else:
        new_list.append('color3')
print(new_list)
Output:
['color1', 'color1', 'color2', 'color2', 'color3']
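If the lists grow large, one possible variation is to build a lookup dictionary once and label each tuple with a single dictionary access. This is only a sketch: the name color_of is made up for illustration, and it assumes the tuples are hashable (true when their elements are strings, as here). Entries from y are merged last so that y wins over z, mirroring the if/elif order above.

```python
t = [('a', 'b'), ('c', 'd'), ('e', 'f'), ('g', 'h'), ('i', 'j')]
y = [('a', 'b'), ('c', 'd')]
z = [('e', 'f'), ('g', 'h')]

# Map every tuple in z and y to its label; y is unpacked last so it takes
# precedence if a tuple appears in both lists.
color_of = {**dict.fromkeys(z, 'color2'), **dict.fromkeys(y, 'color1')}

# Anything not found in the lookup falls back to 'color3'.
new_list = [color_of.get(item, 'color3') for item in t]
print(new_list)
```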

Related

Why tuples in a set won't convert to any other type in a loop?

I'm trying to remove a certain item from a set of tuples. To do so I must convert the tuples to a list or a set (i.e. a mutable object). I'm trying to do this in a for loop, but the tuples won't convert and my item is not removed.
a = [('A', 'C'), ('B', 'C'), ('B', 'C')]
for i in a:
    i = list(i)
    if 'C' in i:
        i.remove('C')
print(a)
This is the output:
[('A', 'C'), ('B', 'C'), ('B', 'C')]
You have the right intuition. As your tuples are immutable, you need to create new ones.
However, in your code you create lists and modify them, but never save them back into the original list.
You could use a list comprehension:
[tuple(e for e in t if e != 'C') for t in a]
Output:
[('A',), ('B',), ('B',)]
You are modifying a temporary list inside the loop but never storing it anywhere, so a is left unchanged. Collect the results in a new list instead.
Try this:
a = [('A', 'C'), ('B', 'C'), ('B', 'C')]
b = []
for i in a:
    i = list(i)
    if 'C' in i:
        i.remove('C')
    b.append(i)
print(b)
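To update a itself rather than build a second list, one option is to write each new tuple back by index. This is a sketch only; the names index and pair are chosen for illustration. Reassigning the loop variable only rebinds a local name, whereas assigning to a[index] changes the list:

```python
a = [('A', 'C'), ('B', 'C'), ('B', 'C')]

for index, pair in enumerate(a):
    # Build a new tuple without 'C' and store it back into the list.
    a[index] = tuple(e for e in pair if e != 'C')

print(a)
```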

Select first item in each list

Here is my list:
[(('A', 'B'), ('C', 'D')), (('E', 'F'), ('G', 'H'))]
Basically, I'd like to get:
[('A', 'C'), ('E', 'G')]
So, I'd like to select the first element of each lowest-level tuple and build the mid-level tuples from them.
====================================================
Additional explanation below:
I could just zip them with:
list(zip([w[0][0] for w in list1], [w[1][0] for w in list1]))
But later I'd like to add a condition: the second elements in the lowest-level tuples must be 'B' and 'D' respectively, so the final outcome should be:
[('A', 'C')] # ('E', 'G') must be filtered out
I'm a beginner and can't find this case anywhere... I would be grateful for help.
I'd do it the following way (the variable is named list1 rather than list, so the built-in list is not shadowed):
list1 = [(('A', 'B'), ('C', 'D')), (('E', 'F'), ('G', 'H'))]
out = []
for i in list1:
    listAux = []
    for j in i:
        listAux.append(j[0])
    out.append((listAux[0], listAux[1]))
print(out)
I hope that's what you're looking for.
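The extra condition from the question (second elements must be 'B' and 'D') can be added as a filter. A possible sketch with list comprehensions, where the names firsts, required and filtered are assumptions made for this example:

```python
list1 = [(('A', 'B'), ('C', 'D')), (('E', 'F'), ('G', 'H'))]

# First element of every innermost tuple, grouped per outer tuple.
firsts = [tuple(inner[0] for inner in outer) for outer in list1]

# Keep an outer tuple only if its second elements are exactly ('B', 'D').
required = ('B', 'D')
filtered = [
    tuple(inner[0] for inner in outer)
    for outer in list1
    if tuple(inner[1] for inner in outer) == required
]

print(firsts)
print(filtered)
```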

Pandas shows inconsistency in rounding of floats to one decimal place

I have data that looks like this:
data = [[(21.2071607142856,)], [(Decimal('0.11904761904761904762'),)], [(9.54183035714285,)], [(9.54433035714284,)], [(17.1964285714286,)]]
As you can see, all of the values are float except one which is of type Decimal.
Now I need to round the values to one decimal place. This is the script I use to do that with pandas:
formatted_result_list = []
for sub_result in data:
    formatted_result = pd.DataFrame(sub_result).round(1).fillna("").to_records(index=False).tolist()
    formatted_result_list.append(formatted_result)
return formatted_result_list
This is what I get:
[[(21.2,)], [(Decimal('0.11904761904761904762'),)], [(9.5,)], [(9.5,)], [(17.2,)]]
It rounds the floats to one decimal place, but it leaves the value of type Decimal unchanged. So I changed the third line to this:
# use .astype(float)
formatted_result = pd.DataFrame(sub_result).astype(float).round(1).fillna("").to_records(index=False).tolist()
So now I get this
[[(21.2,)], [(0.1,)], [(9.5,)], [(9.5,)], [(17.2,)]]
But it doesn't work for data like this
data = [[('A', 204.593564568,), ('B', 217.421341061, 23.33), ('C', 237.296250326, 20.33), ('D', 217.464281998, 34.44), ('E', 206.329901299, 55.213)], [('F', 210.297625953,), ('G', 228.117692718, 34.22), ('H', 4, 0.99), ('I', 265.319671257, 90.99), ('K',)]]
Here the output is identical to the input.
So what can I do to ensure that a Decimal is converted to float and rounded, and that a float is always rounded?
Data for test:
#data = [[(21.2071607142856,)], [(Decimal('0.11904761904761904762'),)], [(9.54183035714285,)], [(9.54433035714284,)], [(17.1964285714286,)]]
data = [[('A', Decimal(204.593564568),), ('B', 217.421341061, 23.33), ('C', 237.296250326, 20.33), ('D', 217.464281998, 34.44), ('E', 206.329901299, 55.213)], [('F', 210.297625953,), ('G', 228.117692718, 34.22), ('H', 4, 0.99), ('I', 265.319671257, 90.99), ('K',)]]
#data = [21.2071607142856,Decimal(204.593564568)]
Here is a general solution that works with tuples and scalars, including Decimal values (it rounds to two places; pass 1 as the second argument of round for one decimal place):
import pandas as pd
from decimal import Decimal
def round_custom(x):
    out = []
    for y in x:
        if isinstance(y, tuple):
            # round only the Decimal and float members of the tuple
            L = [round(float(z), 2) if isinstance(z, (Decimal, float)) else z for z in y]
            out.append(tuple(L))
        elif isinstance(y, (Decimal, float)):
            out.append(round(float(y), 2))
        else:
            # any other type: leave the whole column untouched
            return x
    return pd.Series(out, name=x.name)
df = pd.DataFrame(data).apply(round_custom).values.tolist()
print(df)
[[('A', 204.59), ('B', 217.42, 23.33), ('C', 237.3, 20.33),
('D', 217.46, 34.44), ('E', 206.33, 55.21)], [('F', 210.3),
('G', 228.12, 34.22), ('H', 4, 0.99), ('I', 265.32, 90.99), ('K',)]]
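For nested lists and tuples like these, a plain-Python alternative without pandas is also possible. The following is a sketch under assumptions: round_nested and ndigits are names invented for this example, and only list, tuple, Decimal and float values are handled; everything else (strings, ints) is returned unchanged.

```python
from decimal import Decimal

def round_nested(value, ndigits=1):
    """Recursively round Decimal and float values inside lists and tuples."""
    if isinstance(value, (list, tuple)):
        # Rebuild the same container type with rounded members.
        return type(value)(round_nested(v, ndigits) for v in value)
    if isinstance(value, (Decimal, float)):
        return round(float(value), ndigits)
    return value

data = [[(21.2071607142856,)], [(Decimal('0.11904761904761904762'),)], [('H', 4, 0.99), ('K',)]]
result = round_nested(data)
print(result)
```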

How to extract colon separated values from the same line?

I am using python regular expressions. I want all colon separated values in a line.
e.g.
input = 'a:b c:d e:f'
expected_output = [('a','b'), ('c', 'd'), ('e', 'f')]
But when I do
>>> re.findall('(.*)\s?:\s?(.*)','a:b c:d')
I get
[('a:b c', 'd')]
I have also tried
>>> re.findall('(.*)\s?:\s?(.*)[\s$]','a:b c:d')
[('a', 'b')]
The following code works for me (note the raw string, so that \S is passed to the regex engine unchanged):
import re
inpt = 'a:b c:d e:f'
re.findall(r'(\S+):(\S+)', inpt)
Output:
[('a', 'b'), ('c', 'd'), ('e', 'f')]
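To see why the pattern from the question fails, it can help to run both patterns side by side. In this sketch (the names text, greedy and pairs are illustrative), the greedy .* consumes as much of the line as it can and only backtracks to the last colon, while \S+ cannot cross whitespace and so stays within one key:value token:

```python
import re

text = 'a:b c:d e:f'

# Greedy .* swallows everything up to the last colon in the line.
greedy = re.findall(r'(.*)\s?:\s?(.*)', text)

# \S+ matches only non-whitespace, so each match stays inside one token.
pairs = re.findall(r'(\S+):(\S+)', text)

print(greedy)
print(pairs)
```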
Use split instead of a regex, and avoid variable names that shadow built-ins such as input:
inpt = 'a:b c:d e:f'
k = [tuple(i.split(':')) for i in inpt.split()]
print(k)
# [('a', 'b'), ('c', 'd'), ('e', 'f')]
The easiest way is a list comprehension with split:
[tuple(ele.split(':')) for ele in input.split(' ')]
# Example values:
# IN : input = 'a:b c:d e:f'
# OUT: [('a', 'b'), ('c', 'd'), ('e', 'f')]
You may use
list(map(lambda x: tuple(x.split(':')), input.split()))
where
input.split() is
>>> input.split()
['a:b', 'c:d', 'e:f']
lambda x: tuple(x.split(':')) is a function that converts a string to a tuple: 'a:b' => ('a', 'b')
map applies that function to every list element and returns a map object (in Python 3), which list then converts to a list
Result
>>> list(map(lambda x: tuple(x.split(':')), input.split()))
[('a', 'b'), ('c', 'd'), ('e', 'f')]

write the elements of list to file

bigram is a list that looks like this:
[('a', 'b'), ('b', 'b'), ('b', 'b'), ('b', 'c'), ('c', 'c'), ('c', 'c'), ('c', 'd'), ('d', 'd'), ('d', 'e')]
Now I am trying to write each element of the list as a separate line in a file with this code:
bigram = list(nltk.bigrams(s.split()))
outfile1.write("%s" % ''.join(ele) for ele in bigram)
but I am getting this error:
TypeError: write() argument must be str, not generator
I want the file to contain:
('a', 'b')
('b', 'b')
('b', 'b')
('b', 'c')
('c', 'c')
......
You're passing a generator expression to write, which needs a string.
If I understand correctly, you want to write one tuple representation per line.
You can achieve that with:
outfile1.write("".join('{}\n'.format(ele) for ele in bigram))
or
outfile1.writelines('{}\n'.format(ele) for ele in bigram)
The second version passes a generator expression to writelines, which avoids building the whole string in memory before writing it (and looks more like your attempt).
It produces a file with this content:
('a', 'b')
('b', 'b')
('b', 'b')
('b', 'c')
('c', 'c')
('c', 'c')
('c', 'd')
('d', 'd')
('d', 'e')
Try this:
outfile1.writelines("{}\n".format(ele) for ele in bigram)
The problem is how the argument is parsed.
The for clause applies to the whole expression before it, so the argument is read as a single generator expression:
("%s" % ''.join(ele)) for ele in bigram
It is not read as a format applied to a generator:
"%s" % (''.join(ele) for ele in bigram)
So write receives a generator rather than a string, and adding parentheses around the format does not change that. You need to call write on each element the generator yields.
If you want to write each pair on a separate line, you have to add line separators explicitly. The easiest, to my mind, is an explicit loop (%r keeps the quotes around each element, matching the desired output):
for pair in bigram:
    outfile1.write("(%r, %r)\n" % pair)
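A self-contained sketch of the writelines approach, for trying it out without nltk. Several things here are assumptions made for the example: the sample sentence, the file name bigrams.txt in a temporary directory, and the use of zip to pair each word with its successor in place of nltk.bigrams.

```python
import os
import tempfile

# Stand-in for nltk.bigrams: pair each word with the word that follows it.
words = "a b b b c c c d d e".split()
bigram = list(zip(words, words[1:]))

# Hypothetical output location in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "bigrams.txt")

with open(path, "w") as outfile1:
    # One tuple representation per line; writelines adds no newlines itself.
    outfile1.writelines("{}\n".format(ele) for ele in bigram)

with open(path) as infile:
    lines = infile.read().splitlines()

print(lines)
```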
