partitionBy assigns partitions, but WHERE in each partition - python

With hash function:
balanceLoad = lambda x: bisect.bisect_left(boundary_array, -keyfunc(x))
Where boundary_array is [-64, -10, 35]
The following tells me which partition to assign each element to:
rdd.partitionBy(numPartitions, balanceLoad)
However, is there a way to determine or control WHERE in each partition they are placed? For example, {1,2,3} vs {3,2,1}.
For example when I do this:
rdd = CleanRDD(sc.parallelize(range(100), 4).map(lambda x: (x *((-1) ** x) , x)))
sortByKey(rdd, keyfunc=lambda key: key, ascending=False).collect()
Elements in each partition are in reverse order:
[(64, 64),
(66, 66),
(68, 68),
(70, 70),
(72, 72),
(74, 74),
(76, 76),
(78, 78),
(80, 80),
(82, 82),
(84, 84),
(86, 86),
(88, 88),
(90, 90),
(92, 92),
(94, 94),
(96, 96),
(98, 98),
(10, 10),
(12, 12),
(14, 14),
(16, 16),
(18, 18),
(20, 20),
(22, 22),
(24, 24),
(26, 26),
(28, 28),
(30, 30),
(32, 32),
(34, 34),
(36, 36),
(38, 38),
(40, 40),
(42, 42),
(44, 44),
(46, 46),
(48, 48),
(50, 50),
(52, 52),
(54, 54),
(56, 56),
(58, 58),
(60, 60),
(62, 62),
(-35, 35),
(-33, 33),
(-31, 31),
(-29, 29),
(-27, 27),
(-25, 25),
(-23, 23),
(-21, 21),
(-19, 19),
(-17, 17),
(-15, 15),
(-13, 13),
(-11, 11),
(-9, 9),
(-7, 7),
(-5, 5),
(-3, 3),
(-1, 1),
(0, 0),
(2, 2),
(4, 4),
(6, 6),
(8, 8),
(-99, 99),
(-97, 97),
(-95, 95),
(-93, 93),
(-91, 91),
(-89, 89),
(-87, 87),
(-85, 85),
(-83, 83),
(-81, 81),
(-79, 79),
(-77, 77),
(-75, 75),
(-73, 73),
(-71, 71),
(-69, 69),
(-67, 67),
(-65, 65),
(-63, 63),
(-61, 61),
(-59, 59),
(-57, 57),
(-55, 55),
(-53, 53),
(-51, 51),
(-49, 49),
(-47, 47),
(-45, 45),
(-43, 43),
(-41, 41),
(-39, 39),
(-37, 37)]
Notice that elements in each of the three groups are in reverse order.
How can I correct this?

Determine: no, because the order of the shuffle is nondeterministic.
You can control the order, but not as part of the partitioning process, or at least not in PySpark. Instead, you can take an approach similar to sortByKey and enforce the order per partition afterwards:
def applyOrdering(iter):
    """Takes an itertools.chain object
    and returns iterable with specific ordering"""
    ...
rdd.partitionBy(numPartitions, balanceLoad).mapPartitions(applyOrdering)
Note that iter may be too large to fit into memory, so you should either increase granularity or use a sorting mechanism which doesn't require reading all data at once.
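As a rough illustration of the approach above, here is a minimal plain-Python sketch that does not need a Spark cluster. The names balance_load and apply_ordering, and the dictionary used as a stand-in for partitionBy, are assumptions made for the example. It also assumes each partition is small enough to sort in memory and that descending key order is what is wanted.

```python
import bisect
from collections import defaultdict

boundary_array = [-64, -10, 35]  # boundaries taken from the question


def balance_load(key):
    # Same idea as balanceLoad in the question, with keyfunc as the identity.
    return bisect.bisect_left(boundary_array, -key)


def apply_ordering(partition):
    # Assumption: the partition fits in memory, so sorted() is acceptable here.
    return iter(sorted(partition, key=lambda kv: kv[0], reverse=True))


pairs = [(x * ((-1) ** x), x) for x in range(100)]

# Plain-Python stand-in for partitionBy: group records by partition index.
partitions = defaultdict(list)
for kv in pairs:
    partitions[balance_load(kv[0])].append(kv)

# Stand-in for mapPartitions: order the records inside each partition.
ordered = {p: list(apply_ordering(iter(rows))) for p, rows in partitions.items()}

for p in sorted(ordered):
    print(p, ordered[p][:3], "...")
```

With a real RDD the same function would be passed as rdd.partitionBy(numPartitions, balanceLoad).mapPartitions(apply_ordering).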

Related

Merge two lists of tuples based on overlap python

Given two lists x, y such that they both have been initialized as shown below:
x = [(0, 3), (5, 8), (16, 19), (21, 24), (28, 30), (40, 42), (46, 47), (50, 54), (58, 63), (69, 71)]
y = [(9, 10), (26, 27), (29, 31), (35, 36), (41, 43), (48, 49), (66, 67), (70, 72), (77, 78), (85, 86)]
I want to form a new list of tuples where each tuple has contiguous tuples from x and an overlapping tuple from y.
For the example above, the output would be:
[((5, 8) (9, 10) (16, 19)), ((21, 24) (26, 27) (28, 30)), ((28, 30) (29, 31) (40, 42)), ((28, 30) (35, 36) (40, 42)), ((40, 42) (41, 43) (46, 47)), ((46, 47) (48, 49) (50, 54)),((58, 63) (66, 67) (69, 71))]
My code:
lst = []
for i in range(len(x)):
    if i+1 < len(x):
        context = x[i],x[i+1]
        for j in y:
            if j[0] >= context[0][0] and j[0] <= context[1][0]:
                lst.append((context[0],j,context[1]))
I need better and efficient ways to write this code.
You can use two variables to keep track of the indices in the x and y lists. Using the conditions specified in the problem, these indices are incremented whenever the relevant condition has been satisfied.
At every iteration, the algorithm checks whether x[i][0] < y[j][0] and x[i+1][1] > y[j][1] (the lower and upper bounds provided by the contiguous tuples in x). If this condition is true, we record the match and increment j (the y-index) so that we can check whether the next element lies in the given range. Otherwise, we increment i (the x-index) and repeat the process.
x = [(0, 3), (5, 8), (16, 19), (21, 24), (28, 30), (40, 42), (46, 47), (50, 54), (58, 63), (69, 71)]
y = [(9, 10), (26, 27), (29, 31), (35, 36), (41, 43), (48, 49), (66, 67), (70, 72), (77, 78), (85, 86)]
i = 0
j = 0
result = list()
while i < len(x) - 1 and j < len(y):
    if y[j][0] > x[i][0] and y[j][1] < x[i + 1][1]:
        result.append((x[i], y[j], x[i + 1]))
        j += 1
    else:
        i += 1
print(result)
Output -
[((5, 8), (9, 10), (16, 19)),
((21, 24), (26, 27), (28, 30)),
((28, 30), (29, 31), (40, 42)),
((28, 30), (35, 36), (40, 42)),
((40, 42), (41, 43), (46, 47)),
((46, 47), (48, 49), (50, 54)),
((58, 63), (66, 67), (69, 71))]
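To sanity-check the two-index walk, one option is to compare it against a brute-force scan that applies the same condition to every consecutive pair in x. The sketch below does that; the function names merge_two_pointer and merge_brute_force are made up for the example, and both lists are assumed to be sorted as in the question. On this input the two agree, but in general the two-index version reports each tuple of y at most once, while the brute-force version would report it for every pair it fits.

```python
x = [(0, 3), (5, 8), (16, 19), (21, 24), (28, 30),
     (40, 42), (46, 47), (50, 54), (58, 63), (69, 71)]
y = [(9, 10), (26, 27), (29, 31), (35, 36), (41, 43),
     (48, 49), (66, 67), (70, 72), (77, 78), (85, 86)]


def merge_two_pointer(x, y):
    # Single pass over both sorted lists: O(len(x) + len(y)).
    i = j = 0
    result = []
    while i < len(x) - 1 and j < len(y):
        if y[j][0] > x[i][0] and y[j][1] < x[i + 1][1]:
            result.append((x[i], y[j], x[i + 1]))
            j += 1
        else:
            i += 1
    return result


def merge_brute_force(x, y):
    # Checks every consecutive pair of x against every tuple of y: O(len(x) * len(y)).
    return [(a, t, b)
            for a, b in zip(x, x[1:])
            for t in y
            if t[0] > a[0] and t[1] < b[1]]


fast = merge_two_pointer(x, y)
slow = merge_brute_force(x, y)
print(fast == slow, len(fast))
```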
You can use Python Sorting
from operator import itemgetter
output = sorted((x + y), key=itemgetter(0))

How to divide a specific number in a list?

A Mersenne prime follows the formula 2^n-1. I have created a new type of factoring method for numbers which do not produce Mersenne primes. It is very abstract. Its premise is that if a specific number is applied using modular math and the new number becomes zero, it is not a Mersenne prime. I submitted a paper to The Journal of Number Theory online, however it was rejected by the journal. I have attached it if you would like to look it over; I still feel my method is promising, yet I'm no coding expert. This is a pdf I sent to the Journal of Number Theory. My problem is that in my new code I don't know how to divide a number in the list. The list enumerates OK, yet I want to subtract z=11 from 253, which equals 242, then mod it by 121; however, when I create a range from 1-254 I cannot seem to do this math. The reason I'm interested in this is that 253//11=23, which is a factor of 2^11-1. I got this idea from a ratio page.
Type 1:11 and the second number is 22; just add 1 and it's 23.
Check it out
https://goodcalculators.com/ratio-calculator/
The formula will target any number in the range and what I'm looking for is a zero.
Additional details for grismar as per request:
Grismar and others,
What I have found is that Mersenne primes will produce fewer zeros below the number 11 vs. a number like 2^11-1. Also, when you output the number by subtraction of z and then mod z*z, you may find the number with the lowest factor in it after you divide it by z. The range must be large enough to find that number, yet if it is zero simply divide by z. Then, for instance, when you find 23 by dividing 11 into 253, you can divide 23 into 2047 and you should get 89. More than likely, if you use a different number to check this factor you will get a fraction. So when checking using this method you find a zero for a number which does not produce a Mersenne prime. Let's pick 29: 536870911 ÷ 233 = 2304167, so you get a factor, not a fraction.
These are all the factors of 536870911
[1, 233, 1103, 256999, 2089, 486737, 2304167, 536870911]
If you would like even more details leave a comment please.
Programmer in learning looking for help here is my program:
# 1 should be the start range!
while True:
    x = int(input("Use 1 for the start range to make this work correctly: "))
    i = int(input("End Range: "))
    z = int(input("square of primes multiplied by a number plus z which does not make a mersenne prime, this finds its factor of z: "))
    fact = [(i + 1, x) for i, x in enumerate(range(x, i))]
    print([((int(i)-z) % (z*z)) if isinstance(i, str) else i for i in fact])
Maybe this is what you are trying to do. The int call is unnecessary since the values are integers from the start. Also, don't use the same variable i for different purposes:
calculations = [
    (index + 1, (fact_tuple[0] - z) % (z*z)) for index, fact_tuple in enumerate(fact)
]
print(calculations) # with x = 1, i = 254, z = 11
>>> [(1, 111), (2, 112), (3, 113), (4, 114), (5, 115), (6, 116), (7, 117), (8, 118), (9, 119), (10, 120), (11, 0), (12, 1), (13, 2), (14, 3), (15, 4), (16, 5), (17, 6), (18, 7), (19, 8), (20, 9), (21, 10), (22, 11), (23, 12), (24, 13), (25, 14), (26, 15), (27, 16), (28, 17), (29, 18), (30, 19), (31, 20), (32, 21), (33, 22), (34, 23), (35, 24), (36, 25), (37, 26), (38, 27), (39, 28), (40, 29), (41, 30), (42, 31), (43, 32), (44, 33), (45, 34), (46, 35), (47, 36), (48, 37), (49, 38), (50, 39), (51, 40), (52, 41), (53, 42), (54, 43), (55, 44), (56, 45), (57, 46), (58, 47), (59, 48), (60, 49), (61, 50), (62, 51), (63, 52), (64, 53), (65, 54), (66, 55), (67, 56), (68, 57), (69, 58), (70, 59), (71, 60), (72, 61), (73, 62), (74, 63), (75, 64), (76, 65), (77, 66), (78, 67), (79, 68), (80, 69), (81, 70), (82, 71), (83, 72), (84, 73), (85, 74), (86, 75), (87, 76), (88, 77), (89, 78), (90, 79), (91, 80), (92, 81), (93, 82), (94, 83), (95, 84), (96, 85), (97, 86), (98, 87), (99, 88), (100, 89), (101, 90), (102, 91), (103, 92), (104, 93), (105, 94), (106, 95), (107, 96), (108, 97), (109, 98), (110, 99), (111, 100), (112, 101), (113, 102), (114, 103), (115, 104), (116, 105), (117, 106), (118, 107), (119, 108), (120, 109), (121, 110), (122, 111), (123, 112), (124, 113), (125, 114), (126, 115), (127, 116), (128, 117), (129, 118), (130, 119), (131, 120), (132, 0), (133, 1), (134, 2), (135, 3), (136, 4), (137, 5), (138, 6), (139, 7), (140, 8), (141, 9), (142, 10), (143, 11), (144, 12), (145, 13), (146, 14), (147, 15), (148, 16), (149, 17), (150, 18), (151, 19), (152, 20), (153, 21), (154, 22), (155, 23), (156, 24), (157, 25), (158, 26), (159, 27), (160, 28), (161, 29), (162, 30), (163, 31), (164, 32), (165, 33), (166, 34), (167, 35), (168, 36), (169, 37), (170, 38), (171, 39), (172, 40), (173, 41), (174, 42), (175, 43), (176, 44), (177, 45), (178, 46), (179, 47), (180, 48), (181, 49), (182, 50), (183, 51), (184, 52), (185, 53), (186, 54), (187, 55), (188, 56), (189, 57), (190, 58), 
(191, 59), (192, 60), (193, 61), (194, 62), (195, 63), (196, 64), (197, 65), (198, 66), (199, 67), (200, 68), (201, 69), (202, 70), (203, 71), (204, 72), (205, 73), (206, 74), (207, 75), (208, 76), (209, 77), (210, 78), (211, 79), (212, 80), (213, 81), (214, 82), (215, 83), (216, 84), (217, 85), (218, 86), (219, 87), (220, 88), (221, 89), (222, 90), (223, 91), (224, 92), (225, 93), (226, 94), (227, 95), (228, 96), (229, 97), (230, 98), (231, 99), (232, 100), (233, 101), (234, 102), (235, 103), (236, 104), (237, 105), (238, 106), (239, 107), (240, 108), (241, 109), (242, 110), (243, 111), (244, 112), (245, 113), (246, 114), (247, 115), (248, 116), (249, 117), (250, 118), (251, 119), (252, 120), (253, 0)]
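Building on the output above, a short sketch of the next step described in the question: keep only the numbers whose remainder is zero, divide them by z, and check which quotients divide 2^z - 1. The variable names (zeros, quotients, factors) are chosen for the example, and this only reproduces the arithmetic from the question for z = 11; it is not a claim that the method works in general.

```python
z = 11
start, end = 1, 254

# Numbers n in the range for which (n - z) mod z*z is zero.
zeros = [n for n in range(start, end) if (n - z) % (z * z) == 0]

# Divide each of them by z, as described in the question (253 // 11 == 23).
quotients = [n // z for n in zeros]

# Keep the quotients greater than 1 that divide 2**z - 1 exactly.
mersenne_number = 2 ** z - 1
factors = [q for q in quotients if q > 1 and mersenne_number % q == 0]

print(zeros)      # [11, 132, 253]
print(quotients)  # [1, 12, 23]
print(factors, [mersenne_number // q for q in factors])  # [23] [89]
```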

How to determine whether two 2-dimensional lists are exactly the same?

This is a part of a large program. I have a list like
cnfn=[(1, -3), (2, -3), (-1, -2, 3), (-1, 4), (-2, 4), (1, 2, -4), (-4, -5), (4, 5), (-3, 6), (-5, 6), (3, 5, -6), (7, -8), (6, -8), (-7, -6, 8), (-6, 9), (-7, 9), (6, 7, -9), (-9, -10), (9, 10), (-8, 11), (-10, 11), (8, 10, -11), (7, -12), (4, -12), (-7, -4, 12), (-12, 13), (-3, 13), (12, 3, -13), (14, -16), (15, -16), (-14, -15, 16), (-16, -17), (16, 17), (-14, 18), (-15, 18), (14, 15, -18), (17, -19), (18, -19), (-17, -18, 19), (13, -20), (19, -20), (-13, -19, 20), (-20, -21), (20, 21), (-19, 22), (-13, 22), (19, 13, -22), (21, -23), (22, -23), (-21, -22, 23), (13, -24), (18, -24), (-13, -18, 24), (-24, 25), (-16, 25), (24, 16, -25), (26, -28), (27, -28), (-26, -27, 28), (-28, -29), (28, 29), (-26, 30), (-27, 30), (26, 27, -30), (29, -31), (30, -31), (-29, -30, 31), (25, -32), (31, -32), (-25, -31, 32), (-32, -33), (32, 33), (-31, 34), (-25, 34), (31, 25, -34), (33, -35), (34, -35), (-33, -34, 35), (25, -36), (30, -36), (-25, -30, 36), (-36, 37), (-28, 37), (36, 28, -37), (38, -40), (39, -40), (-38, -39, 40), (-40, -41), (40, 41), (-38, 42), (-39, 42), (38, 39, -42), (41, -43), (42, -43), (-41, -42, 43), (37, -44), (43, -44), (-37, -43, 44), (-44, -45), (44, 45), (-43, 46), (-37, 46), (43, 37, -46), (45, -47), (46, -47), (-45, -46, 47), (37, -48), (42, -48), (-37, -42, 48), (-48, 49), (-40, 49), (48, 40, -49), (-50, -51), (50, 51), (-51, 53), (-52, 53), (51, 52, -53), (-52, -54), (52, 54), (-54, 55), (-50, 55), (54, 50, -55), (53, -56), (55, -56), (-53, -55, 56), (-56, -57), (56, 57), (58, -59), (57, -59), (-58, -57, 59), (52, -60), (50, -60), (-52, -50, 60), (-59, 61), (-60, 61), (59, 60, -61), (56, -62), (58, -62), (-56, -58, 62), (-58, -63), (58, 63), (57, -64), (63, -64), (-57, -63, 64), (-62, 65), (-64, 65), (62, 64, -65), (-66, -67), (66, 67), (-67, 69), (-68, 69), (67, 68, -69), (-68, -70), (68, 70), (-70, 71), (-66, 71), (70, 66, -71), (69, -72), (71, -72), (-69, -71, 72), (-72, -73), (72, 73), (61, -74), (73, -74), (-61, -73, 74), (68, -75), (66, -75), 
(-68, -66, 75), (-74, 76), (-75, 76), (74, 75, -76), (72, -77), (61, -77), (-72, -61, 77), (-61, -78), (61, 78), (73, -79), (78, -79), (-73, -78, 79), (-77, 80), (-79, 80), (77, 79, -80), (-81, -82), (81, 82), (-82, 84), (-83, 84), (82, 83, -84), (-83, -85), (83, 85), (-85, 86), (-81, 86), (85, 81, -86), (84, -87), (86, -87), (-84, -86, 87), (-87, -88), (87, 88), (76, -89), (88, -89), (-76, -88, 89), (83, -90), (81, -90), (-83, -81, 90), (-89, 91), (-90, 91), (89, 90, -91), (87, -92), (76, -92), (-87, -76, 92), (-76, -93), (76, 93), (88, -94), (93, -94), (-88, -93, 94), (-92, 95), (-94, 95), (92, 94, -95), (-96, -97), (96, 97), (-97, 99), (-98, 99), (97, 98, -99), (-98, -100), (98, 100), (-100, 101), (-96, 101), (100, 96, -101), (99, -102), (101, -102), (-99, -101, 102), (-102, -103), (102, 103), (91, -104), (103, -104), (-91, -103, 104), (-104, -105), (104, 105), (-104, 106), (-105, 106), (104, 105, -106), (102, -107), (91, -107), (-102, -91, 107), (-91, -108), (91, 108), (103, -109), (108, -109), (-103, -108, 109), (-107, 110), (-109, 110), (107, 109, -110), (-1, 50), (1, -50), (-2, 52), (2, -52), (-7, 58), (7, -58), (-14, 66), (14, -66), (-15, 68), (15, -68), (-26, 81), (26, -81), (-27, 83), (27, -83), (-38, 96), (38, -96), (-39, 98), (39, -98), (-11, -65, -111), (-11, 65, 111), (11, -65, 111), (11, 65, -111), (-23, -80, -112), (-23, 80, 112), (23, -80, 112), (23, 80, -112), (-35, -95, -113), (-35, 95, 113), (35, -95, 113), (35, 95, -113), (-47, -106, -114), (-47, 106, 114), (47, -106, 114), (47, 106, -114), (-49, -110, -115), (-49, 110, 115), (49, -110, 115), (49, 110, -115), (111, 112, 113, 114, 115)]
And there is another list
cnfb=[(1, -3), (2, -3), (-1, -2, 3), (-1, 4), (-2, 4), (1, 2, -4), (4, 5), (-4, -5), (-3, 6), (-5, 6), (3, 5, -6), (7, -8), (6, -8), (-7, -6, 8), (-6, 9), (-7, 9), (6, 7, -9), (9, 10), (-9, -10), (-8, 11), (-10, 11), (8, 10, -11), (7, -12), (4, -12), (-7, -4, 12), (-12, 13), (-3, 13), (12, 3, -13), (14, -16), (15, -16), (-14, -15, 16), (16, 17), (-16, -17), (-14, 18), (-15, 18), (14, 15, -18), (17, -19), (18, -19), (-17, -18, 19), (13, -20), (19, -20), (-13, -19, 20), (20, 21), (-20, -21), (-19, 22), (-13, 22), (19, 13, -22), (21, -23), (22, -23), (-21, -22, 23), (13, -24), (18, -24), (-13, -18, 24), (-24, 25), (-16, 25), (24, 16, -25), (26, -28), (27, -28), (-26, -27, 28), (28, 29), (-28, -29), (-26, 30), (-27, 30), (26, 27, -30), (29, -31), (30, -31), (-29, -30, 31), (25, -32), (31, -32), (-25, -31, 32), (32, 33), (-32, -33), (-31, 34), (-25, 34), (31, 25, -34), (33, -35), (34, -35), (-33, -34, 35), (25, -36), (30, -36), (-25, -30, 36), (-36, 37), (-28, 37), (36, 28, -37), (38, -40), (39, -40), (-38, -39, 40), (40, 41), (-40, -41), (-38, 42), (-39, 42), (38, 39, -42), (41, -43), (42, -43), (-41, -42, 43), (37, -44), (43, -44), (-37, -43, 44), (44, 45), (-44, -45), (-43, 46), (-37, 46), (43, 37, -46), (45, -47), (46, -47), (-45, -46, 47), (37, -48), (42, -48), (-37, -42, 48), (-48, 49), (-40, 49), (48, 40, -49), (50, 51), (-50, -51), (-51, 53), (-52, 53), (51, 52, -53), (52, 54), (-52, -54), (-54, 55), (-50, 55), (54, 50, -55), (53, -56), (55, -56), (-53, -55, 56), (56, 57), (-56, -57), (58, -59), (57, -59), (-58, -57, 59), (52, -60), (50, -60), (-52, -50, 60), (-59, 61), (-60, 61), (59, 60, -61), (56, -62), (58, -62), (-56, -58, 62), (58, 63), (-58, -63), (57, -64), (63, -64), (-57, -63, 64), (-62, 65), (-64, 65), (62, 64, -65), (66, 67), (-66, -67), (-67, 69), (-68, 69), (67, 68, -69), (68, 70), (-68, -70), (-70, 71), (-66, 71), (70, 66, -71), (69, -72), (71, -72), (-69, -71, 72), (72, 73), (-72, -73), (61, -74), (73, -74), (-61, -73, 74), (68, -75), (66, -75), 
(-68, -66, 75), (-74, 76), (-75, 76), (74, 75, -76), (72, -77), (61, -77), (-72, -61, 77), (61, 78), (-61, -78), (73, -79), (78, -79), (-73, -78, 79), (-77, 80), (-79, 80), (77, 79, -80), (81, 82), (-81, -82), (-82, 84), (-83, 84), (82, 83, -84), (83, 85), (-83, -85), (-85, 86), (-81, 86), (85, 81, -86), (84, -87), (86, -87), (-84, -86, 87), (87, 88), (-87, -88), (76, -89), (88, -89), (-76, -88, 89), (83, -90), (81, -90), (-83, -81, 90), (-89, 91), (-90, 91), (89, 90, -91), (87, -92), (76, -92), (-87, -76, 92), (76, 93), (-76, -93), (88, -94), (93, -94), (-88, -93, 94), (-92, 95), (-94, 95), (92, 94, -95), (96, 97), (-96, -97), (-97, 99), (-98, 99), (97, 98, -99), (98, 100), (-98, -100), (-100, 101), (-96, 101), (100, 96, -101), (99, -102), (101, -102), (-99, -101, 102), (102, 103), (-102, -103), (91, -104), (103, -104), (-91, -103, 104), (104, 105), (-104, -105), (-104, 106), (-105, 106), (104, 105, -106), (102, -107), (91, -107), (-102, -91, 107), (91, 108), (-91, -108), (103, -109), (108, -109), (-103, -108, 109), (-107, 110), (-109, 110), (107, 109, -110), (35, 95, -111), (-35, -95, -111), (-35, 95, 111), (35, -95, 111), (23, 80, -112), (-23, -80, -112), (-23, 80, 112), (23, -80, 112), (49, 106, -113), (-49, -106, -113), (-49, 106, 113), (49, -106, 113), (47, 110, -114), (-47, -110, -114), (-47, 110, 114), (47, -110, 114), (11, 65, -115), (-11, -65, -115), (-11, 65, 115), (11, -65, 115), [111, 112, 113, 114, 115], (-26, 83), (26, -83), (-2, 50), (2, -50), (-38, 98), (38, -98), (-27, 81), (27, -81), (-39, 96), (39, -96), (-7, 58), (7, -58), (-14, 68), (14, -68), (-15, 66), (15, -66), (-1, 52), (1, -52)]
If I check by eye, they look like they have the same values, but if I pass them to the same function the results are different. How can I determine whether the two have exactly the same type and the same values?
The two lists are NOT the same. That is why a function may be giving you a different result for the different lists.
To check if 2 lists are identical, you can do:
list1 == list2
So to give some examples:
>>> [1, 2, 3, 4, 5] == [1, 2, 3, 4, 5]
True
>>> [1, 2, 3, 4, 5] == [1, 2, 3, 4, 3]
False
>>> [1, 2, 3, 4, 5] == [5, 4, 3, 2, 1]
False
>>> [(1, 2), (3, 4)] == [(1, 2), (3, 4)]
True
>>> [(1, 2), (3, 4)] == [(1, 2), (3, 5)]
False
If you want to find what the differences are, you can do the following:
[e for e in list1 if e not in list2] + [e for e in list2 if e not in list1]
which I think is actually very readable for what it is.
So we could put that inside a function:
def comp(list1, list2):
    return [e for e in list1 if e not in list2] + [e for e in list2 if e not in list1]
and some examples:
>>> comp([1, 2, 3], [1, 2, 3]) # should be empty as there is no difference
[]
[]
>>> comp([(1, 2), (3, 4)], [(1, 2), (3, 5)])
[(3, 4), (3, 5)]
>>> comp([(1, 2), (3, 4)], [(1, 2), (3, 5), (6, 7)])
[(3, 4), (3, 5), (6, 7)]
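If the order of the clauses should not matter, and a clause written as a list should count the same as one written as a tuple (cnfb in the question contains one list among its tuples), a possible sketch is to normalise every clause to a tuple and compare multisets. The helper name same_clauses and the small sample lists are assumptions for the example; whether ignoring order and container type is appropriate depends on what the consuming function expects.

```python
from collections import Counter


def same_clauses(a, b):
    # Convert every clause to a tuple so that [1, 2] and (1, 2) compare equal,
    # then compare as multisets so that the order of clauses is ignored.
    return Counter(tuple(c) for c in a) == Counter(tuple(c) for c in b)


first = [(1, -3), (4, 5), (111, 112)]
second = [(4, 5), (1, -3), [111, 112]]  # different order, one clause is a list

print(first == second)              # False: order and element types differ
print(same_clauses(first, second))  # True
print((1, 2) == [1, 2])             # False: a tuple never equals a list
```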

Sorting Tuples Python

I want to sort tuples using this method...
If (a1,b1) < (a2,b2) then a2>a1 or (a1==a2 and b2>b1).
The algorithm should not work in place, and it's expected that it will receive numbers in the range [0,99].
Input:
[(9, 7), (78, 24), (17, 74), (53, 81), (40, 43), (79, 82), (84, 46), (68, 53),
(92, 95), (60, 38), (20, 62), (72, 57)]
Output:
[(9, 7), (17, 74), (20, 62), (40, 43), (53, 81), (60, 38), (68, 53), (72, 57),
(78, 24), (79, 82), (84, 46), (92, 95)]
I thought of using the concept of counting sort since the time complexity has to be O(n), but then the list counter length would be 100*100. That wouldn't be a very efficient approach.
Do you have any suggestions?
The built-in sorted() function should work just fine for your case: it compares the first elements, and if the first elements of two items are the same, it then compares the second elements, and so on.
In the following example, simple_list[0][0] and simple_list[1][0] are equal (4 and 4), so simple_list[0][1] and simple_list[1][1] (3 and 5) are compared:
>>> simple_list = [(4, 3), (4, 5), (1, 2)]
>>> sorted(simple_list)
[(1, 2), (4, 3), (4, 5)]
For your case, try the following:
tuples_list = [(9, 7), (78, 24), (17, 74), (53, 81), (40, 43), (79, 82), (84, 46), (68, 53), (92, 95), (60, 38), (20, 62), (72, 57)]
sorted_list = sorted(tuples_list)
Output:
>>> sorted(tuples_list)
[(9, 7), (17, 74), (20, 62), (40, 43), (53, 81), (60, 38), (68, 53), (72, 57), (78, 24), (79, 82), (84, 46), (92, 95)]
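sorted() runs in O(n log n). If the O(n) requirement from the question is strict, one possible sketch is a two-pass bucket (radix) sort: a stable pass on the second element, then a stable pass on the first, each using 100 buckets instead of a 100*100 counter. The function names bucket_pass and radix_sort_pairs are made up for the example, and it assumes every value is an integer in the range [0, 99] as stated in the question.

```python
def bucket_pass(items, index, size=100):
    # Stable distribution of items into buckets keyed by items[i][index].
    buckets = [[] for _ in range(size)]
    for item in items:
        buckets[item[index]].append(item)
    # Concatenate the buckets in key order; order within a bucket is preserved.
    return [item for bucket in buckets for item in bucket]


def radix_sort_pairs(pairs):
    # Least significant key first (second element), then the first element.
    # Returns a new list, so the input is not modified in place.
    return bucket_pass(bucket_pass(pairs, 1), 0)


tuples_list = [(9, 7), (78, 24), (17, 74), (53, 81), (40, 43), (79, 82),
               (84, 46), (68, 53), (92, 95), (60, 38), (20, 62), (72, 57)]
print(radix_sort_pairs(tuples_list) == sorted(tuples_list))
```

Each pass touches every item once plus the 100 buckets, so the total work is O(n + 100).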

Append values of two strings into pairs

I start with two numpy arrays, the "x values" and the "y values":
import numpy as np
x = np.arange(100)
y = np.arange(100)
The output is
[ 0 1 2 3 4 ..... 96 97 98 99]
[ 0 1 2 3 4 ..... 96 97 98 99]
I would like to append these values together into an array of len() = 100 such that the output is
[ (0,0) (1,1) (2,2) (3,3) .... (98,98) (99,99) ]
How does one use indexing to both (A) put the pairs in the correct order and (B) put the parentheses ( and comma , in the correct order?
For your particular requirement, you can use the built-in zip function, which combines multiple lists at their corresponding indexes (that is, the ith elements of all the iterables passed to it are combined into the ith tuple of the result).
Example -
import numpy as np
x = np.arange(100)
y = np.arange(100)
print(list(zip(x,y)))
>>> [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6), (7, 7), (8, 8), (9, 9), (10, 10), (11, 11), (12, 12), (13, 13), (14, 14), (15, 15), (16, 16), (17, 17), (18, 18), (19, 19), (20, 20), (21, 21), (22, 22), (23, 23), (24, 24), (25, 25), (26, 26), (27, 27), (28, 28), (29, 29), (30, 30), (31, 31), (32, 32), (33, 33), (34, 34), (35, 35), (36, 36), (37, 37), (38, 38), (39, 39), (40, 40), (41, 41), (42, 42), (43, 43), (44, 44), (45, 45), (46, 46), (47, 47), (48, 48), (49, 49), (50, 50), (51, 51), (52, 52), (53, 53), (54, 54), (55, 55), (56, 56), (57, 57), (58, 58), (59, 59), (60, 60), (61, 61), (62, 62), (63, 63), (64, 64), (65, 65), (66, 66), (67, 67), (68, 68), (69, 69), (70, 70), (71, 71), (72, 72), (73, 73), (74, 74), (75, 75), (76, 76), (77, 77), (78, 78), (79, 79), (80, 80), (81, 81), (82, 82), (83, 83), (84, 84), (85, 85), (86, 86), (87, 87), (88, 88), (89, 89), (90, 90), (91, 91), (92, 92), (93, 93), (94, 94), (95, 95), (96, 96), (97, 97), (98, 98), (99, 99)]
For Python 2.x, please note you do not need list(zip(...)), since zip itself returns a list. In Python 3.x, zip returns an iterator, so to print it we need to convert it into a list.
You can use np.dstack to get the columns :
>>> np.dstack((x,y))
array([[[ 0, 0],
[ 1, 1],
[ 2, 2],
[ 3, 3],
[ 4, 4],
[ 5, 5],
[ 6, 6],
[ 7, 7],
[ 8, 8],
[ 9, 9],
...
[99, 99]]])
And if you want tuples instead of lists, you can use map to convert each row to a tuple (in Python 3, map returns an iterator, so wrap the call in list(...) to see the result):
>>> map(tuple,np.dstack((x,y))[0])
[(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6), (7, 7), (8, 8), (9, 9), (10, 10), (11, 11), (12, 12), (13, 13), (14, 14), (15, 15), (16, 16), (17, 17), (18, 18), (19, 19), (20, 20), (21, 21), (22, 22), (23, 23), (24, 24), (25, 25), (26, 26), (27, 27), (28, 28), (29, 29), (30, 30), (31, 31), (32, 32), (33, 33), (34, 34), (35, 35), (36, 36), (37, 37), (38, 38), (39, 39), (40, 40), (41, 41), (42, 42), (43, 43), (44, 44), (45, 45), (46, 46), (47, 47), (48, 48), (49, 49), (50, 50), (51, 51), (52, 52), (53, 53), (54, 54), (55, 55), (56, 56), (57, 57), (58, 58), (59, 59), (60, 60), (61, 61), (62, 62), (63, 63), (64, 64), (65, 65), (66, 66), (67, 67), (68, 68), (69, 69), (70, 70), (71, 71), (72, 72), (73, 73), (74, 74), (75, 75), (76, 76), (77, 77), (78, 78), (79, 79), (80, 80), (81, 81), (82, 82), (83, 83), (84, 84), (85, 85), (86, 86), (87, 87), (88, 88), (89, 89), (90, 90), (91, 91), (92, 92), (93, 93), (94, 94), (95, 95), (96, 96), (97, 97), (98, 98), (99, 99)]
>>>
You could use vstack
In [36]: xy = np.vstack((x,y)).T
In [37]: xy.shape
Out[37]: (100, 2)
In [38]: xy[0]
Out[38]: array([0, 0])
In [39]: xy[1]
Out[39]: array([1, 1])
