Find Longest Weighted Path from DAG with Networkx in Python?

I need an algorithm to find the longest weighted path in a directed acyclic graph, a networkx.MultiDiGraph(). My graph has weighted edges, and many edges have a weight of zero. I have found nothing in the networkx documentation that solves this problem. My graph has the following structure:
>>> print graph.nodes()
[0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 15, 16, 17, 20, 21, 22, 25, 26, 'end']
>>> print graph.edges()
[(0, 'end'), (1, 0), (1, 10), (1, 5), (2, 1), (2, 11), (2, 6), (3, 2), (3, 12), (3, 7), (4, 8), (4, 3), (4, 13), (5, 'end'), (6, 5), (6, 15), (7, 16), (7, 6), (8, 17), (8, 7), (10, 'end'), (11, 10), (11, 20), (11, 15), (12, 16), (12, 11), (12, 21), (13, 17), (13, 12), (13, 22), (15, 'end'), (16, 25), (16, 15), (17, 16), (17, 26), (20, 'end'), (21, 25), (21, 20), (22, 26), (22, 21), (25, 'end'), (26, 25)]
>>> print graph.edge[7][16]
{1: {'weight': 100.0, 'time': 2}}
>>> print graph.edge[7][6]
{0: {'weight': 0, 'time': 2}}
I found these, but I have problems with the implementation:
networkx: efficiently find absolute longest path in digraph. This solution does not take weights into account.
How to find the longest path with Python NetworkX? This solution converts the weights into negative values, but my graph has zero weights, and nx.dijkstra_path() does not support negative values.
Does anyone have an idea, or has anyone found a solution to a similar problem?

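Take the solution from the first link and change this line:
pairs = [[dist[v][0]+1,v] for v in G.pred[node]] # incoming pairs
to something like the following, which adds each edge's weight instead of 1 (in a MultiDiGraph, G[v][node] maps each parallel edge's key to its attribute dict):
pairs = [[dist[v][0]+edge['weight'],v] for v in G.pred[node] for key, edge in G[v][node].items()] # incoming pairs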
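For reference, the same idea (process nodes in topological order and keep the heaviest path ending at each node) can be sketched without any library. This is only an illustration: the function name longest_weighted_path and the (u, v, weight) input format are assumptions made for the example, not part of the linked solution or of networkx.

```python
from collections import defaultdict, deque

def longest_weighted_path(edges):
    """Return (total_weight, path) for the heaviest path in a DAG.

    `edges` is an iterable of (u, v, weight) triples; parallel edges are fine.
    """
    successors = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for u, v, w in edges:
        successors[u].append((v, w))
        indegree[v] += 1
        nodes.update((u, v))
    if not nodes:
        return 0, []

    # best[n] = (weight of the heaviest path ending at n, predecessor on that path)
    best = {n: (0, None) for n in nodes}

    # Kahn's algorithm: visit nodes in topological order.
    queue = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while queue:
        u = queue.popleft()
        visited += 1
        for v, w in successors[u]:
            if best[u][0] + w > best[v][0]:
                best[v] = (best[u][0] + w, u)
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    if visited != len(nodes):
        raise ValueError("graph contains a cycle")

    # Walk back from the node with the heaviest incoming path.
    end = max(best, key=lambda n: best[n][0])
    total = best[end][0]
    path = []
    node = end
    while node is not None:
        path.append(node)
        node = best[node][1]
    return total, path[::-1]

# Two parallel a->b edges: the heavier one (5) is the one that counts.
edges = [("a", "b", 2), ("a", "b", 5), ("b", "c", 1), ("a", "c", 3)]
total, path = longest_weighted_path(edges)
print(total, path)  # 6 ['a', 'b', 'c']
```

When several paths tie for the maximum weight (easy to hit with many zero-weight edges), which of them is returned is arbitrary.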


Unexpected output when applying LDA trained model to given corpus

I have trained a LDA model using the following parameters:
>> model = gensim.models.ldamodel.LdaModel(corpus=corpus,
Then, I applied this model to a given corpus:
>> lda_corpus = model[corpus]
I was expecting lda_corpus to be a list of lists or a 2D matrix, with one row per document and one column per topic, where each element is a tuple of the form (topic_index, probability). However, I am getting this odd result where some elements are themselves lists:
>> print(lda_model_1[corpus[0]])
>> ([(0, 0.012841966), (3, 0.073988825), (4, 0.05184835), (8, 0.38537887), (10, 0.022958927), (11, 0.24562633), (13, 0.05168812), (17, 0.06522224), (21, 0.024792604)], [(0, [11]), (1, [8, 3, 17, 13]), (2, [3, 17, 8, 13]), (3, [8, 3]), (4, [11]), (5, [8, 17, 3]), (6, [4]), (7, [4, 8]), (8, [8, 13, 3]), (9, [11]), (10, [8, 0]), (11, [8, 13, 0]), (12, [21]), (13, [11]), (14, [11]), (15, [8]), (16, [8, 11, 13, 0]), (17, [11]), (18, [11, 17]), (19, [8, 13, 17, 3]), (20, [17, 13, 8]), (21, [17, 11, 8]), (22, [11]), (23, [8]), (24, [8, 13]), (25, [8, 3, 13])], [(0, [(11, 1.0)]), (1, [(3, 0.15384258), (8, 0.71774876), (13, 0.011975089), (17, 0.11643356)]), (2, [(3, 0.45133045), (8, 0.21692151), (13, 0.09479065), (17, 0.23232804)]), (3, [(3, 0.24423833), (8, 0.75576156)]), (4, [(11, 1.0)]), (5, [(3, 0.02001735), (8, 1.6895359), (17, 0.2904468)]), (6, [(4, 1.0)]), (7, [(4, 1.2565874), (8, 0.7367453)]), (8, [(3, 0.05150538), (8, 0.8553984), (13, 0.07775658)]), (9, [(11, 2.0)]), (10, [(0, 0.13937186), (8, 0.8588695)]), (11, [(0, 0.023420962), (8, 0.7131521), (13, 0.263427)]), (12, [(21, 1.0)]), (13, [(11, 0.99124163)]), (14, [(11, 2.0)]), (15, [(8, 1.0)]), (16, [(0, 0.011193657), (8, 1.7189965), (11, 0.23104382), (13, 0.029387457)]), (17, [(11, 1.9989293)]), (18, [(11, 0.9135094), (17, 0.08400644)]), (19, [(3, 0.07146881), (8, 2.1837764), (13, 0.38799366), (17, 0.352704)]), (20, [(8, 0.22638415), (13, 0.24114841), (17, 0.52740365)]), (21, [(8, 0.02224951), (11, 0.24574266), (17, 0.7231928)]), (22, [(11, 1.0)]), (23, [(8, 1.0)]), (24, [(8, 0.972818), (13, 0.027181994)]), (25, [(3, 0.16742931), (8, 0.7671518), (13, 0.05224549)])])
I would appreciate any help.
The problem was related to model parameters. I was using the following config:
lda_model = gensim.models.ldamodel.LdaModel(corpus=corpus,
However, some of those parameters weren't necessary and were causing the trouble. The config I am using now is the following:
lda_model = gensim.models.ldamodel.LdaModel(corpus=corpus, id2word=id2word, num_topics=ntopics, \
update_every=1, chunksize=10000, passes=1)
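The three-part result in the question has the shape gensim returns when a model is created with per_word_topics=True: document topics, then per-word topics, then per-word phi values. Whether that flag was among the dropped parameters is a guess, since the original config is cut off. Either way, the document-level distribution is the first element. A minimal sketch of turning it into a dense row, using a hypothetical helper dense_topic_row and plain Python data in place of a real model:

```python
def dense_topic_row(result, num_topics):
    """Build one dense row of topic probabilities for a single document."""
    # A 3-tuple means (doc_topics, word_topics, phi_values); keep only the first part.
    doc_topics = result[0] if isinstance(result, tuple) else result
    row = [0.0] * num_topics
    for topic_id, prob in doc_topics:
        row[topic_id] = prob
    return row

# Shortened stand-in for the output shown in the question.
result = ([(0, 0.0128), (3, 0.0740)], [(0, [11])], [(0, [(11, 1.0)])])
row = dense_topic_row(result, num_topics=5)
print(row)  # [0.0128, 0.0, 0.0, 0.074, 0.0]
```

Topics missing from the sparse list are left at 0.0 in the row.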

Restart nested loop in python after condition is met

I have two ranges:
range_1 (0,10)
range_2 (11, 40)
I want to create a list of tuples from the two ranges above (range_1 and range_2), pairing one element from each range whenever their sum is an even number.
Thus 0 from range_1 and 12 from range_2 sum to 12, which is even; likewise 1 from range_1 and 13 from range_2 sum to 14, which is even.
However, I don't want to go through all the elements in range_2. Only 5 successful matches are needed; then I have to move on immediately to the next iteration in range_1.
Thus for the first iteration:
(0, 12, 12), (0, 14, 14), (0, 16, 16), (0, 18, 18), (0, 20, 20)
then we go to the second iteration:
(1, 11, 12), (1, 13, 14), (1, 15, 16), (1, 17, 18), (1, 19, 20)
and so on till 9 in range_1:
(9, 11, 20), (9, 13, 22), (9, 15, 24), (9, 17, 26), (9, 19, 28)
Here is my code, which is wrong because it goes through all the elements in range_2:
list_1 = []
for i in range(10):
    for j in range(11,40):
        if (i+j)%2 == 0:
            list_1.append((i, j, (i+j)))
Just store a counter, reset it for each i, and break out of the inner for-loop once it reaches five:
list_1 = []
for i in range(10):
    counter = 0
    for j in range(11,40):
        if (i+j)%2 == 0:
            list_1.append((i, j, (i+j)))
            counter += 1
            if counter == 5:
                break
which gives list_1 as:
[(0, 12, 12), (0, 14, 14), (0, 16, 16), (0, 18, 18), (0, 20, 20),
(1, 11, 12), (1, 13, 14), (1, 15, 16), (1, 17, 18), (1, 19, 20),
(2, 12, 14), (2, 14, 16), (2, 16, 18), (2, 18, 20), (2, 20, 22),
(3, 11, 14), (3, 13, 16), (3, 15, 18), (3, 17, 20), (3, 19, 22),
(4, 12, 16), (4, 14, 18), (4, 16, 20), (4, 18, 22), (4, 20, 24),
(5, 11, 16), (5, 13, 18), (5, 15, 20), (5, 17, 22), (5, 19, 24),
(6, 12, 18), (6, 14, 20), (6, 16, 22), (6, 18, 24), (6, 20, 26),
(7, 11, 18), (7, 13, 20), (7, 15, 22), (7, 17, 24), (7, 19, 26),
(8, 12, 20), (8, 14, 22), (8, 16, 24), (8, 18, 26), (8, 20, 28),
(9, 11, 20), (9, 13, 22), (9, 15, 24), (9, 17, 26), (9, 19, 28)]
It should be noted that this is not the most efficient way to go about creating your data structure. Clearly only every other j-value generated in the inner for-loop will be used, which is wasteful.
Therefore you could specify a step of 2 for the j for-loop so that only every other j-value is considered. However, you must be careful with the starting value now. If you were to always start at 11 and step in 2s, then you would only get odd j-values, and these could never combine with the current i-value to give an even number if i was even. Therefore you would have to change the j for-loop to start at 12 if i is even and 11 if i is odd.
As others have commented, you can simplify the problem quite a bit by constructing ranges that accommodate what you're trying to do. Here it is as a nested comprehension:
[(i, j, i+j) for i in range(0, 10) for j in range((11 if i % 2 else 12), 21, 2)]
If you're smart about creating your range, you can simplify the algorithm. Use a step to make the range skip every other j, adjust the start based on whether i is even, and set the end to 21 since it will only ever get that high.
list_1 = []
for i in range(10):
    start = 12 if i % 2 == 0 else 11
    for j in range(start, 21, 2):
        list_1.append((i, j, i+j))
    print(list_1[-5:]) # For testing
First two lines of output:
[(0, 12, 12), (0, 14, 14), (0, 16, 16), (0, 18, 18), (0, 20, 20)]
[(1, 11, 12), (1, 13, 14), (1, 15, 16), (1, 17, 18), (1, 19, 20)]
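If the rule is really "take the first five matches, whatever they are" rather than a pattern you can bake into the range, the counter can also be replaced with itertools.islice. This is an alternative sketch, not what the answers above use:

```python
from itertools import islice

list_1 = []
for i in range(10):
    # Lazily generate every matching tuple for this i ...
    matches = ((i, j, i + j) for j in range(11, 40) if (i + j) % 2 == 0)
    # ... but only consume the first five; the rest are never computed.
    list_1.extend(islice(matches, 5))

print(list_1[:5])  # [(0, 12, 12), (0, 14, 14), (0, 16, 16), (0, 18, 18), (0, 20, 20)]
```

Because the generator is lazy, the inner range stops being scanned as soon as the fifth match is taken, just as with break.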

A given number from 2 to N should have each number divisible outputted as [(2,4), (2,6),( 2,8),( 3,9),( 3,12), (3,15) .....] in Python 2/3

We are given the numbers from 2 to N. Each number should be paired with the numbers it divides, as below:
[(2,4), (2,6), (2,8),(3,9),(3,12),(3,15),(4,8),(4,12),(4,16) ...] up to N
I tried it myself (see below), but I am not getting the expected output shown above.
>>> [(a, b) for a in range(5) for b in range(5) if a%2 == 0 and b %2==0]
[(0, 0), (0, 2), (0, 4), (2, 0), (2, 2), (2, 4), (4, 0), (4, 2), (4, 4)]
NOTE: Any number divisible by n (where n is a whole number - 1, 2, 3, 4 ...) is a multiple of n.
You can try:
n = int(input("Enter a number: "))
multiples = [(a, b) for a in range(2, n + 1) for b in range(2, n + 1) if b%a == 0 and a != b]
print (multiples)
Here n is the upper limit. range(2, n) would stop just below n; using n + 1 includes n itself, so multiples up to and including n are listed.
For example, when n = 10, it will give this output, [(2, 4), (2, 6), (2, 8), (2, 10), (3, 6), (3, 9), (4, 8), (5, 10)].
The comprehension has two conditions. b%a == 0 checks that the second number is a multiple of the first (that is, a divides b). Starting both ranges at 2 makes sure that a is never zero, so b%a cannot raise a ZeroDivisionError.
Without b%a == 0, you would have (shown here for n = 9):
[(2, 3), (2, 4), (2, 5), (2, 6), (2, 7), (2, 8), (2, 9), (3, 2), (3, 4), (3, 5), (3, 6), (3, 7), (3, 8), (3, 9), (4, 2), (4, 3), (4, 5), (4, 6), (4, 7), (4, 8), (4, 9), (5, 2), (5, 3), (5, 4), (5, 6), (5, 7), (5, 8), (5, 9), (6, 2), (6, 3), (6, 4), (6, 5), (6, 7), (6, 8), (6, 9), (7, 2), (7, 3), (7, 4), (7, 5), (7, 6), (7, 8), (7, 9), (8, 2), (8, 3), (8, 4), (8, 5), (8, 6), (8, 7), (8, 9), (9, 2), (9, 3), (9, 4), (9, 5), (9, 6), (9, 7), (9, 8)]
Since you do not want a equal to b (for example (3, 3)), add a != b; pairs such as (3, 3) or (4, 4) are then excluded.
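Both conditions can also be built into the inner range, so that only actual multiples are ever generated. This is a sketch of an alternative, with the hypothetical helper name proper_multiples; starting at 2 * a skips the pair (a, a), and stepping by a visits only multiples of a:

```python
def proper_multiples(n):
    """Pairs (a, b) with 2 <= a < b <= n and b a multiple of a."""
    return [(a, b) for a in range(2, n + 1) for b in range(2 * a, n + 1, a)]

print(proper_multiples(10))
# [(2, 4), (2, 6), (2, 8), (2, 10), (3, 6), (3, 9), (4, 8), (5, 10)]
```

This avoids testing every (a, b) combination, which matters once n gets large.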
For a larger sample, n = 50 gives this output:
[(2, 4), (2, 6), (2, 8), (2, 10), (2, 12), (2, 14), (2, 16), (2, 18), (2, 20), (2, 22), (2, 24), (2, 26), (2, 28), (2, 30), (2, 32), (2, 34), (2, 36), (2, 38), (2, 40), (2, 42), (2, 44), (2, 46), (2, 48), (2, 50), (3, 6), (3, 9), (3, 12), (3, 15), (3, 18), (3, 21), (3, 24), (3, 27), (3, 30), (3, 33), (3, 36), (3, 39), (3, 42), (3, 45), (3, 48), (4, 8), (4, 12), (4, 16), (4, 20), (4, 24), (4, 28), (4, 32), (4, 36), (4, 40), (4, 44), (4, 48), (5, 10), (5, 15), (5, 20), (5, 25), (5, 30), (5, 35), (5, 40), (5, 45), (5, 50), (6, 12), (6, 18), (6, 24), (6, 30), (6, 36), (6, 42), (6, 48), (7, 14), (7, 21), (7, 28), (7, 35), (7, 42), (7, 49), (8, 16), (8, 24), (8, 32), (8, 40), (8, 48), (9, 18), (9, 27), (9, 36), (9, 45), (10, 20), (10, 30), (10, 40), (10, 50), (11, 22), (11, 33), (11, 44), (12, 24), (12, 36), (12, 48), (13, 26), (13, 39), (14, 28), (14, 42), (15, 30), (15, 45), (16, 32), (16, 48), (17, 34), (18, 36), (19, 38), (20, 40), (21, 42), (22, 44), (23, 46), (24, 48), (25, 50)]
Hope this helps!
[(a, b) for a in range(5) for b in range(5)]
gives tuples containing all the combinations of pairs of numbers starting from, and including 0, up to (but not including) 5.
a%2 == 0 is checking for a being even. Likewise for b%2 == 0, so you are finding tuples of even numbers from the unfiltered list above, hence your output.
If you don't want zero, start at 1 or 2, e.g. range(1, 5) or range(2, 5).
In fact, you need to avoid zero for a, because b % 0 raises a ZeroDivisionError.
To check that the second number is a multiple of the first, test b%a == 0, having made sure a won't be zero.
So more like this:
>>> [(a, b) for a in range(2, 5) for b in range(2, 5) if b%a == 0]
[(2, 2), (2, 4), (3, 3), (4, 4)]
To avoid (3, 3) and similar cases, add a != b to the condition in the list comprehension:
>>> [(a, b) for a in range(2, 5) for b in range(2, 5) if b%a == 0 and a!=b]
[(2, 4)]
If you then want each second number to just appear once, you can subsequently filter that list.
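One way to do that filtering, as a sketch: keep the first pair seen for each second number and skip later ones. The names seen and unique are just illustrative.

```python
pairs = [(a, b) for a in range(2, 11) for b in range(2, 11) if b % a == 0 and a != b]

seen = set()    # second numbers that have already been used
unique = []
for a, b in pairs:
    if b not in seen:
        seen.add(b)
        unique.append((a, b))

print(unique)  # [(2, 4), (2, 6), (2, 8), (2, 10), (3, 9)]
```

Since pairs is ordered by a, each second number ends up paired with its smallest divisor greater than 1.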

Joining two sets of tuples at a common value

setA = [(1, 25), (2, 24), (3, 23), (4, 22), (5, 21), (6, 20),
(7, 19), (8, 18), (9, 17), (10, 16), (11, 15), (12, 14),
(13, 13),(14, 12), (15, 11), (16, 10), (17, 9), (18, 8),
(19, 7),(20, 6), (21, 5), (22, 4), (23, 3), (24, 2), (25, 1)]
setB = [(1, 19), (2, 18), (3, 17), (4, 16), (5, 15), (6, 14), (7, 13),
(8, 12), (9, 11), (10, 10), (11, 9), (12, 8), (13, 7), (14, 6),
(15, 5), (16, 4), (17, 3), (18, 2), (19, 1)]
How can I combine the two sets, using the first element of each tuple as a common key? For the tuple at position 1 in each set, that would be (1,25) and (1,19) respectively, which joined together would yield (25,1,19).
Note: the order within each output tuple must be maintained. Example:
(setA value, common value, setB value)
(setA value, common value, setB value) etc.
Note: I must use the Python 2.7.x standard library.
I'm trying to do something like [(a,b,c) for (a,b),(b,c) in zip(setA,setB)] but I don't fully understand the proper syntax and logic.
Thank you.
Seems like what you want can be implemented as easily as a dictionary lookup on setB inside a list comprehension.
mapping = dict(setB)
out = [(b, a, mapping.get(a)) for a, b in setA]
[(25, 1, 19),
(24, 2, 18),
(23, 3, 17),
(22, 4, 16),
(21, 5, 15),
(20, 6, 14),
(19, 7, 13),
(18, 8, 12),
(17, 9, 11),
(16, 10, 10),
(15, 11, 9),
(14, 12, 8),
(13, 13, 7),
(12, 14, 6),
(11, 15, 5),
(10, 16, 4),
(9, 17, 3),
(8, 18, 2),
(7, 19, 1),
(6, 20, None),
(5, 21, None),
(4, 22, None),
(3, 23, None),
(2, 24, None),
(1, 25, None)]
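If keys that exist only in setA should be dropped instead of padded with None, the same dictionary lookup can be combined with a membership test. A small sketch with shortened stand-in lists (it uses nothing beyond built-ins, so it should run unchanged on Python 2.7):

```python
setA = [(1, 25), (2, 24), (3, 23)]
setB = [(1, 19), (2, 18)]

mapping = dict(setB)  # common key -> setB value
# Keep only keys present in both lists (an inner join on the first element).
joined = [(b, a, mapping[a]) for a, b in setA if a in mapping]

print(joined)  # [(25, 1, 19), (24, 2, 18)]
```

Unlike a positional zip, this does not depend on the two lists being sorted the same way.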
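Since the lists have different lengths, zip is not a solution: it stops at the end of the shorter list.
One solution is the zip_longest function from the itertools package (named izip_longest in Python 2.7). It pairs items by position, which works here because both lists are sorted by the same consecutive keys.
from itertools import zip_longest  # Python 2.7: from itertools import izip_longest as zip_longest
finalSet = [(b, a, c[1] if c is not None else c) for (a,b), c in zip_longest(setA, setB)]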
(25, 1, 19)
(24, 2, 18)
(23, 3, 17)
(22, 4, 16)
(21, 5, 15)
(20, 6, 14)
(19, 7, 13)
(18, 8, 12)
(17, 9, 11)
(16, 10, 10)
(15, 11, 9)
(14, 12, 8)
(13, 13, 7)
(12, 14, 6)
(11, 15, 5)
(10, 16, 4)
(9, 17, 3)
(8, 18, 2)
(7, 19, 1)
(6, 20, None)
(5, 21, None)
(4, 22, None)
(3, 23, None)
(2, 24, None)
(1, 25, None)
setA = [(1, 25), (2, 24), (3, 23), (4, 22), (5, 21), (6, 20),
(7, 19), (8, 18), (9, 17), (10, 16), (11, 15), (12, 14),
(13, 13),(14, 12), (15, 11), (16, 10), (17, 9), (18, 8),
(19, 7),(20, 6), (21, 5), (22, 4), (23, 3), (24, 2), (25, 1)]
setB = [(1, 19), (2, 18), (3, 17), (4, 16), (5, 15), (6, 14), (7, 13),
(8, 12), (9, 11), (10, 10), (11, 9), (12, 8), (13, 7), (14, 6),
(15, 5), (16, 4), (17, 3), (18, 2), (19, 1)]
la, lb = len(setA), len(setB)
temp=[[setA[i][1] if i<la else None, i+1, setB[i][1] if i<lb else None] for i in range(0,max(la,lb))]
[[25, 1, 19],
[24, 2, 18],
[23, 3, 17],
[22, 4, 16],
[21, 5, 15],
[20, 6, 14],
[19, 7, 13],
[18, 8, 12],
[17, 9, 11],
[16, 10, 10],
[15, 11, 9],
[14, 12, 8],
[13, 13, 7],
[12, 14, 6],
[11, 15, 5],
[10, 16, 4],
[9, 17, 3],
[8, 18, 2],
[7, 19, 1],
[6, 20, None],
[5, 21, None],
[4, 22, None],
[3, 23, None],
[2, 24, None],
[1, 25, None]]
If you want setC in the same format as setA and setB, I think this workaround will do.
Tuples are immutable, so values cannot be added to one directly; instead we append the new tuples to a list and then convert that list to a tuple.
setC = []
i = 0
# Assumes setA and setB are each a one-element list wrapping a tuple of pairs.
# Check the bound first, then walk both in step while the keys match.
while i < min(len(setA[0]), len(setB[0])) and setA[0][i][0] == setB[0][i][0]:
    setC.append((setA[0][i][1],setA[0][i][0], setB[0][i][1]))
    i += 1
setC = [tuple(setC)]

Plot Clustering Datapoints Python

I'm doing a clustering exercise and I have a list that looks something like this
[(1, 3), (2, 5), (2, 6), (1, 2), (1, 8)],
[(4, 7), (5, 5), (6, 4)]
[(8, 9), (10, 9), (11, 12), (10, 12)]
[(18, 20), (20, 29), (17, 16), (18, 22)]
Basically I made the clusters into an array of arrays. I was wondering how to plot this in Python so that different clusters have different colors. I've tried using matplotlib, but I'm quite confused.
You can use plt.scatter, calling it once per cluster with a different color:
import matplotlib.pyplot as plt
s = [[(1, 3), (2, 5), (2, 6), (1, 2), (1, 8)], [(4, 7), (5, 5), (6, 4)], [(8, 9), (10, 9), (11, 12), (10, 12)], [(18, 20), (20, 29), (17, 16), (18, 22)]]
c = iter(['b', 'g', 'r', 'c', 'm', 'y', 'k', 'w'])
for group in s:
    plt.scatter(*zip(*group), color = next(c))
plt.show()
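The *zip(*group) idiom is the least obvious part: zip(*group) transposes a list of (x, y) points into one tuple of x values and one tuple of y values, and the outer * passes those two tuples as the first two arguments of plt.scatter. The step in isolation, without any plotting:

```python
group = [(4, 7), (5, 5), (6, 4)]   # one cluster of (x, y) points

xs, ys = zip(*group)               # transpose: all x values, then all y values

print(xs)  # (4, 5, 6)
print(ys)  # (7, 5, 4)
```

So plt.scatter(*zip(*group)) is equivalent to plt.scatter(xs, ys) for that cluster.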
