The variable bigram is a list that looks like this:
[('a', 'b'), ('b', 'b'), ('b', 'b'), ('b', 'c'), ('c', 'c'), ('c', 'c'), ('c', 'd'), ('d', 'd'), ('d', 'e')]
Now I am trying to write each element of the list as a separate line in a file with this code:
bigram = list(nltk.bigrams(s.split()))
outfile1.write("%s" % ''.join(ele) for ele in bigram)
But I am getting this error:
TypeError: write() argument must be str, not generator
I want the file to look like this:
('a', 'b')
('b', 'b')
('b', 'b')
('b', 'c')
('c', 'c')
......
You're passing a generator expression to write, which needs a string.
If I understand correctly, you want to write one tuple representation per line.
You can achieve that with:
outfile1.write("".join('{}\n'.format(ele) for ele in bigram))
or
outfile1.writelines('{}\n'.format(ele) for ele in bigram)
The second version passes a generator expression to writelines, which avoids building one big string in memory before writing it (and looks more like your attempt).
It produces a file with this content:
('a', 'b')
('b', 'b')
('b', 'b')
('b', 'c')
('c', 'c')
('c', 'c')
('c', 'd')
('d', 'd')
('d', 'e')
Try this:
outfile1.writelines("{}\n".format(ele) for ele in bigram)
This looks like an operator precedence problem, but precedence is not the cause.
The argument to write is already parsed like this:
("%s" % ''.join(ele)) for ele in bigram
It is not parsed like this, which would build a single string out of the generator object:
"%s" % (''.join(ele) for ele in bigram)
So adding explicit parentheses does not help.
The whole argument ("%s" % ''.join(ele)) for ele in bigram is itself a generator, and write accepts only a str. You need to call write on each element from it.
If you want to write each pair in a separate line, you have to add line separators explicitly. The easiest, to my mind, is an explicit loop:
for pair in bigram:
    # %r keeps the quotes around each item, giving ('a', 'b')
    outfile1.write("(%r, %r)\n" % pair)
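For reference, here is a minimal self-contained sketch of the writelines approach from the answers above. It builds the bigrams with zip instead of nltk.bigrams so that it runs without NLTK; the sample sentence and the temporary file are assumptions made only for illustration.

```python
import os
import tempfile

# Sample input (an assumption for this sketch); the question uses nltk.bigrams.
s = "a b b b c c c d d e"
words = s.split()

# Pair each word with its successor to get the adjacent pairs.
bigram = list(zip(words, words[1:]))

# Write one tuple representation per line to a temporary file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as outfile1:
    outfile1.writelines("{}\n".format(ele) for ele in bigram)
    path = outfile1.name

# Read the file back to check the result, then clean up.
with open(path) as infile:
    lines = infile.read().splitlines()
os.remove(path)

print(lines[0])  # ('a', 'b')
```

The with block closes the file automatically, so the data is flushed before it is read back.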
Related
I'm trying to remove a certain item from a set of tuples. To do so, I must convert the tuples to a list or a set (i.e. a mutable object). I'm trying to do this in a for loop, but the tuples won't convert and my item is not removed.
a = [('A', 'C'), ('B', 'C'), ('B', 'C')]
for i in a:
    i = list(i)
    if 'C' in i:
        i.remove('C')
print(a)
This is the output:
[('A', 'C'), ('B', 'C'), ('B', 'C')]
You got the right intuition. As your tuples are immutable, you need to create new ones.
However, in your code, you create lists, modify them, but fail to save them back in the original list.
You could use a list comprehension.
[tuple(e for e in t if e != 'C') for t in a]
Output:
[('A',), ('B',), ('B',)]
You modify a list copy of each tuple but never store it anywhere, so a stays unchanged. Collect the results in a new list.
Try this:
a = [('A', 'C'), ('B', 'C'), ('B', 'C')]
b = []
for i in a:
    i = list(i)
    if 'C' in i:
        i.remove('C')
    b.append(i)
print(b)
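The two answers above can be combined into a small helper. This is only a sketch: the function name drop_item is hypothetical, and it returns tuples (as in the list comprehension answer) rather than lists.

```python
def drop_item(pairs, item):
    """Return a new list of tuples with every occurrence of item removed."""
    return [tuple(e for e in t if e != item) for t in pairs]


a = [('A', 'C'), ('B', 'C'), ('B', 'C')]
b = drop_item(a, 'C')
print(b)  # [('A',), ('B',), ('B',)]
print(a)  # the original list is left unchanged
```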
Let's assume there is a list of tuples:
for something in x.something():
    print(something)
and it returns
('a', 'b')
('c', 'd')
('e', 'f')
('g', 'h')
('i', 'j')
And I have created two other lists containing certain elements from the x.something():
y = [('a', 'b'), ('c', 'd')]
z = [('e', 'f'), ('g', 'h')]
So I want to assign the tuples from x.something() to a new list based on y and z by
newlist = []
for something in x.something():
    if something in 'y':
        newlist.append('color1')
    elif something in 'z':
        newlist.append('color2')
    else:
        newlist.append('color3')
I would like newlist to look like this:
['color1', 'color1', 'color2', 'color2', 'color3']
But I've got
TypeError: 'in <string>' requires string as left operand, not tuple
What went wrong and how to fix it?
I think you want if something in y instead of if something in 'y', because y and z are two separate lists, not strings:
newlist = []
for something in x.something():
    if something in y:
        newlist.append('color1')
    elif something in z:
        newlist.append('color2')
    else:
        newlist.append('color3')
You should remove the quotes from if something in 'y' because it assumes that you're checking if something is in the string 'y'. Same for z.
Try this:
t = [('a', 'b'),
     ('c', 'd'),
     ('e', 'f'),
     ('g', 'h'),
     ('i', 'j')]
y = [('a', 'b'), ('c', 'd')]
z = [('e', 'f'), ('g', 'h')]
new_list = []
for x in t:
    if x in y:
        new_list.append('color1')
    elif x in z:
        new_list.append('color2')
    else:
        new_list.append('color3')
print(new_list)
output:
['color1', 'color1', 'color2', 'color2', 'color3']
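As an alternative to the if/elif chain, the lookup can be expressed with a dictionary. This is a sketch under assumptions: t stands in for x.something() from the question, and color_of is a hypothetical name.

```python
t = [('a', 'b'), ('c', 'd'), ('e', 'f'), ('g', 'h'), ('i', 'j')]
y = [('a', 'b'), ('c', 'd')]
z = [('e', 'f'), ('g', 'h')]

# Map each known tuple to its color. Entries from y are added last so they
# win over z, mirroring the order of the if/elif chain.
color_of = {pair: 'color2' for pair in z}
color_of.update({pair: 'color1' for pair in y})

# Anything not found in either list falls back to 'color3'.
new_list = [color_of.get(pair, 'color3') for pair in t]
print(new_list)  # ['color1', 'color1', 'color2', 'color2', 'color3']
```

A dictionary lookup avoids scanning y and z for every element, which may matter once the lists grow.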
Here is my list:
[(('A', 'B'), ('C', 'D')), (('E', 'F'), ('G', 'H'))]
Basically, I'd like to get:
[('A', 'C'), ('E', 'G')]
So, I'd like to select first elements from the lowest-level lists and build mid-level lists with them.
====================================================
Additional explanation below:
I could just zip them by
list(zip([w[0][0] for w in list1], [w[1][0] for w in list1]))
But later I'd like to add a condition: the second elements in the lowest level lists must be 'B' and 'D' respectively, so the final outcome should be:
[('A', 'C')]  # ('E', 'G') must be filtered out
I'm a beginner, but can't find the case anywhere... Would be grateful for help.
I'd do it the following way
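# Named list1 (as in the question) so the built-in list is not shadowed.
list1 = [(('A', 'B'), ('C', 'D')), (('E', 'F'), ('G', 'H'))]
out = []
for i in list1:
    # Collect the first element of each inner pair.
    listAux = []
    for j in i:
        listAux.append(j[0])
    out.append((listAux[0], listAux[1]))
print(out)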
I hope that's what you're looking for.
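The condition mentioned in the question (second elements must be 'B' and 'D') could be handled with tuple unpacking in a list comprehension. This is a sketch, assuming every item holds exactly two pairs; the names firsts and filtered are illustrative.

```python
list1 = [(('A', 'B'), ('C', 'D')), (('E', 'F'), ('G', 'H'))]

# Unpack each item into its two inner pairs and keep the first elements.
firsts = [(a, c) for (a, b), (c, d) in list1]
print(firsts)  # [('A', 'C'), ('E', 'G')]

# Same, but keep only items whose second elements are 'B' and 'D'.
filtered = [(a, c) for (a, b), (c, d) in list1 if (b, d) == ('B', 'D')]
print(filtered)  # [('A', 'C')]
```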
I am using python regular expressions. I want all colon separated values in a line.
e.g.
input = 'a:b c:d e:f'
expected_output = [('a','b'), ('c', 'd'), ('e', 'f')]
But when I do
>>> re.findall('(.*)\s?:\s?(.*)','a:b c:d')
I get
[('a:b c', 'd')]
I have also tried
>>> re.findall('(.*)\s?:\s?(.*)[\s$]','a:b c:d')
[('a', 'b')]
The following code works for me:
import re
inpt = 'a:b c:d e:f'
re.findall(r'(\S+):(\S+)', inpt)
Output:
[('a', 'b'), ('c', 'd'), ('e', 'f')]
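The original pattern fails because .* is greedy and also matches spaces and colons, so the first group swallows everything up to the last colon. A short sketch of the fix follows; the second pattern, which tolerates one optional space around the colon, is an assumption about the intended input rather than something the question states.

```python
import re

inpt = 'a:b c:d e:f'

# \S+ cannot cross whitespace, so each match stays within one key:value token.
pairs = re.findall(r'(\S+):(\S+)', inpt)
print(pairs)  # [('a', 'b'), ('c', 'd'), ('e', 'f')]

# Variant allowing one optional space on either side of the colon.
spaced = re.findall(r'(\w+)\s?:\s?(\w+)', 'a : b c:d')
print(spaced)  # [('a', 'b'), ('c', 'd')]
```

Raw strings (the r prefix) keep backslashes such as \S and \s from being treated as string escapes.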
Use split instead of a regex, and avoid variable names that shadow built-ins such as input:
inpt = 'a:b c:d e:f'
k= [tuple(i.split(':')) for i in inpt.split()]
print(k)
# [('a', 'b'), ('c', 'd'), ('e', 'f')]
The easiest way is a list comprehension with split:
[tuple(ele.split(':')) for ele in input.split(' ')]
#driver values :
IN : input = 'a:b c:d e:f'
OUT : [('a', 'b'), ('c', 'd'), ('e', 'f')]
You may use
list(map(lambda x: tuple(x.split(':')), input.split()))
where
input.split() is
>>> input.split()
['a:b', 'c:d', 'e:f']
lambda x: tuple(x.split(':')) is a function that converts a string to a tuple, e.g. 'a:b' => ('a', 'b')
map applies that function to every list element and returns a map object (in Python 3), which is then converted to a list using list
Result
>>> list(map(lambda x: tuple(x.split(':')), input.split()))
[('a', 'b'), ('c', 'd'), ('e', 'f')]
I have found many algorithms and approaches for finding the shortest path between two points, but in my situation the data is modeled as:
[(A,B),(C,D),(B,C),(D,E)...] # list of possible paths
Suppose I need the path from A to E. The result should be:
(A,B)=>(B,C)=>(C,D)=>(D,E)
But I can't find a Pythonic way to do this search.
The Pythonic way is to use a module if one exists. In this case networkx is available, so we can write:
Implementation
import networkx as nx
G = nx.Graph([('A','B'),('C','D'),('B','C'),('D','E')])
path = nx.shortest_path(G, 'A', 'E')
Output
list(zip(path, path[1:]))
[('A', 'B'), ('B', 'C'), ('C', 'D'), ('D', 'E')]
If you think of your points as vertices in a graph and your pairs as edges in that graph, then you can assign to each edge a weight equal to the distance between its two points.
Framed this way your problem is just the classic shortest path problem.
You asked for a Pythonic way to write it. The only advice I'd give is to represent your graph as a dictionary, so that each key is a point and each value is a list of the other points directly reachable from that point. That will make traversing the graph faster. For your example, graph[C] -> [B, D].
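The dictionary representation can be sketched as follows, using breadth-first search. Two assumptions are made here that the question does not state: every pair can be traversed in both directions, and every edge has the same cost. The function name shortest_path_edges is hypothetical.

```python
from collections import deque


def shortest_path_edges(pairs, start, goal):
    """Breadth-first search over an undirected graph with unit edge costs.

    Returns the path as a list of (from, to) steps, or None if unreachable.
    """
    # Build the adjacency dictionary: each point maps to its neighbours.
    graph = {}
    for u, v in pairs:
        graph.setdefault(u, []).append(v)
        graph.setdefault(v, []).append(u)

    # previous[node] remembers the node from which each node was first reached.
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            break
        for neighbour in graph.get(node, []):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)

    if goal not in previous:
        return None

    # Walk back from the goal to the start, then reverse.
    nodes = []
    node = goal
    while node is not None:
        nodes.append(node)
        node = previous[node]
    nodes.reverse()
    return list(zip(nodes, nodes[1:]))


pairs = [('A', 'B'), ('C', 'D'), ('B', 'C'), ('D', 'E')]
result = shortest_path_edges(pairs, 'A', 'E')
print(result)  # [('A', 'B'), ('B', 'C'), ('C', 'D'), ('D', 'E')]
```

If the edges carry different weights, breadth-first search is no longer enough and an algorithm such as Dijkstra's would be needed instead.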
Here is a solution using A*:
pip install pyformulas==0.2.8
import pyformulas as pf
transitions = [('A', 'B'), ('B', 'C'), ('C', 'A'), ('C', 'F'), ('D', 'F'), ('F', 'D'), ('F', 'B'), ('D', 'E'), ('E', 'C')]
initial_state = ('A',)
def expansion_fn(state):
    valid_transitions = [tn for tn in transitions if tn[0] == state[-1]]
    step_costs = [1 for t in valid_transitions]
    return valid_transitions, step_costs

def goal_fn(state):
    return state[-1] == 'E'
path = pf.discrete_search(initial_state, expansion_fn, goal_fn) # A*
print(path)
Output:
[('A',), ('A', 'B'), ('B', 'C'), ('C', 'F'), ('F', 'D'), ('D', 'E')]