List of list, compare the last item with python - python

I need to compare the elements of a list of lists. My code works for two items inside the list of lists, but when I have more than two I don't know how to proceed.
My inputs always have the same length. I need to compare d[i][:-1] and, if it is repeated, check d[i][-1] and keep the d[i] with the smallest d[i][-1].
The output I need: d = [[1, 2, 3, 4, 4], [3, 2, 4, 2, 1]]
Code:
d = [[1, 2, 3, 4, 5],
     [1, 2, 3, 4, 6],
     [1, 2, 3, 4, 4],
     [3, 2, 4, 2, 5],
     [3, 2, 4, 2, 1]]
if d[0][:-1] == d[1][:-1]:
    if d[0][-1] < d[1][-1]:
        d.remove(d[1])
    else:
        d.remove(d[0])
>>> print d
[[1, 2, 3, 4, 5], [1, 2, 3, 4, 4], [3, 2, 4, 2, 5], [3, 2, 4, 2, 1]]
Edited:
from operator import itemgetter
from itertools import groupby
d = [['4027221', 'MX', '0.4', 3],
['4027221', 'MX', '30', 1],
['4027222', 'MX', '0.4', 3],
['4027222', 'MX', '30', 1]]
d.sort()
d = [min(g, key=lambda s: s[-2]) for _, g in groupby(d, key=lambda s: s[:-2])]
[['4027221', 'MX', '0.4', 3], ['4027222', 'MX', '0.4', 3]]

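You can use itertools.groupby to group the list by all but the last item, and then pick the sub-list with the smallest last item from each group with min. Note that groupby only groups consecutive items, so equal prefixes must be adjacent in the list (sort it first if they are not; here they already are):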
from operator import itemgetter
from itertools import groupby
d = [[1, 2, 3, 4, 5],
[1, 2, 3, 4, 6],
[1, 2, 3, 4, 4],
[3, 2, 4, 2, 5],
[3, 2, 4, 2, 1]]
print([min(g, key=itemgetter(-1)) for _, g in groupby(d, key=lambda s: s[:-1])])
This outputs:
[[1, 2, 3, 4, 4], [3, 2, 4, 2, 1]]
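Because groupby only merges neighbouring items, the one-liner above depends on rows with the same prefix being adjacent. A minimal sketch of the same idea for unordered input, sorting by the prefix first (the function name smallest_per_prefix and the sample data are made up for illustration):

```python
from itertools import groupby
from operator import itemgetter


def smallest_per_prefix(rows):
    """For rows sharing the same prefix (all but the last item),
    keep the row whose last item is smallest."""
    def prefix(row):
        return row[:-1]

    # Sort by prefix so that equal prefixes become adjacent for groupby.
    ordered = sorted(rows, key=prefix)
    return [min(group, key=itemgetter(-1))
            for _, group in groupby(ordered, key=prefix)]


# Hypothetical input: same rows as in the question, deliberately shuffled.
unsorted_rows = [[3, 2, 4, 2, 5],
                 [1, 2, 3, 4, 5],
                 [3, 2, 4, 2, 1],
                 [1, 2, 3, 4, 6],
                 [1, 2, 3, 4, 4]]
result = smallest_per_prefix(unsorted_rows)
print(result)  # [[1, 2, 3, 4, 4], [3, 2, 4, 2, 1]]
```

The result comes back ordered by prefix, which may differ from the order in which the prefixes first appeared in the input.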

If I understood correctly what you want, this should do the trick:
d = [[1, 2, 3, 4, 5],
     [1, 2, 3, 4, 6],
     [1, 2, 3, 4, 4],
     [3, 2, 4, 2, 5],
     [3, 2, 4, 2, 1]]
mins = {}
for a_list in d:
    list_key = ','.join(map(str, a_list[:-1]))
    list_orderer = a_list[-1]
    # compare against the last item of the stored sublist, not the sublist itself
    if list_key not in mins or mins[list_key][-1] > list_orderer:
        mins[list_key] = a_list
print(sorted(mins.values()))  # [[1, 2, 3, 4, 4], [3, 2, 4, 2, 1]]
It works in Python 2 and 3, it does not require the input to be sorted, and it does not require any dependency (which is not a real argument).

You can do it this way too:
d = [[1, 2, 3, 4, 5],
[1, 2, 3, 4, 6],
[1, 2, 3, 4, 4],
[3, 2, 4, 2, 5],
[3, 2, 4, 2, 1]]
sublists = list(set(tuple(i[:-1]) for i in d))
mins = [min([elem for elem in d if elem[:-1] == list(s)]) for s in sublists]
print(mins)
Output (the order may vary, since sets are unordered):
[[3, 2, 4, 2, 1], [1, 2, 3, 4, 4]]

Scaling up blhsing's solution for bigger data, and skipping the need to sort:
import pandas as pd
cols = ['v1', 'v2', 'v3', 'v4', 'v5']
df = pd.DataFrame(d, columns=cols)
ndf = df.groupby(cols[:-1], as_index=False).min()
out = ndf.values.tolist()
print(out)
[[1, 2, 3, 4, 4], [3, 2, 4, 2, 1]]

You can use a dictionary, taking advantage of the fact that, as you iterate, only the last value will be attached to any given key. The solution does no sorting itself, but because it keeps the last sublist seen for each key, it returns the minimum only when the smallest last item comes last within each group (as it does in the example data).
d2 = {tuple(key): val for *key, val in d}
res = [list(k) + [v] for k, v in d2.items()]
print(res)
[[1, 2, 3, 4, 4],
 [3, 2, 4, 2, 1]]
Note tuple conversion is required since lists are not hashable, so they cannot be used as dictionary keys.
Edit: as @JonClements suggests, you can write this more simply as:
res = list({tuple(el[:-1]): el for el in d}.values())
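The "last value wins" behaviour is easy to see when the smallest item is not the last one in its group. A minimal sketch, assuming a helper named keep_min_last (hypothetical, not part of the answer above) that compares before overwriting, next to the plain dictionary comprehension on reordered data:

```python
def keep_min_last(rows):
    """Keep, per prefix (all but the last item), the row with the smallest last item."""
    best = {}
    for row in rows:
        key = tuple(row[:-1])  # lists are unhashable, so use a tuple as the key
        if key not in best or row[-1] < best[key][-1]:
            best[key] = row
    return list(best.values())


# Hypothetical reordering of the question's data: the minimum comes first in each group.
rows = [[1, 2, 3, 4, 4],
        [3, 2, 4, 2, 1],
        [1, 2, 3, 4, 6],
        [3, 2, 4, 2, 5],
        [1, 2, 3, 4, 5]]

last_wins = list({tuple(row[:-1]): row for row in rows}.values())
min_wins = keep_min_last(rows)

print(last_wins)  # [[1, 2, 3, 4, 5], [3, 2, 4, 2, 5]]
print(min_wins)   # [[1, 2, 3, 4, 4], [3, 2, 4, 2, 1]]
```

Both versions make a single pass and need no sorting; only the explicit comparison makes the result independent of the input order.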

Related

How to increment an integer in a list iteratively within a while loop

Can anyone help me understand this, please? I'm using a while loop to increment an integer within a list, and then I'm trying to add the list to another list to create a list of lists. The list of lists has the expected number of elements, but they all show the incremented integer as it is at the end of the while loop, not as it was at each iteration.
Here's what I've tried:
my_start_list = [1, 2, 3, 4]
my_end_list = []
while my_start_list[0] != 6:
    my_start_list[0] += 1
    my_end_list.append(my_start_list)
print(my_start_list)
print(my_end_list)
and here's what I get:
[6, 2, 3, 4]
[[6, 2, 3, 4], [6, 2, 3, 4], [6, 2, 3, 4], [6, 2, 3, 4], [6, 2, 3, 4]]
And I was kind of expecting:
[6, 2, 3, 4]
[[2, 2, 3, 4], [3, 2, 3, 4], [4, 2, 3, 4], [5, 2, 3, 4], [6, 2, 3, 4]]
Can anyone explain what is going on here or point me in a direction that could explain this?
Here append stores a reference to the same list object rather than a separate copy, so every entry in my_end_list points to the one list you keep modifying. You need to copy the list at each iteration.
my_start_list = [1, 2, 3, 4]
my_end_list = []
c = []
while my_start_list[0] != 6:
    my_start_list[0] = my_start_list[0] + 1
    c = my_start_list.copy()
    my_end_list.append(c)
print(my_start_list)
print(my_end_list)
Output:
[6, 2, 3, 4]
[[2, 2, 3, 4], [3, 2, 3, 4], [4, 2, 3, 4], [5, 2, 3, 4], [6, 2, 3, 4]]

You must create a copy of the list you are going to append, otherwise you are always appending the same list:
my_start_list = [1, 2, 3, 4]
my_end_list = []
while my_start_list[0] != 6:
    my_start_list[0] += 1
    my_end_list.append(my_start_list.copy())  # appends a copy of the list
print(my_start_list)
print(my_end_list)
Output:
[6, 2, 3, 4]
[[2, 2, 3, 4], [3, 2, 3, 4], [4, 2, 3, 4], [5, 2, 3, 4], [6, 2, 3, 4]]
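To see the difference directly, here is a small sketch (variable names are made up for illustration) that appends both the list itself and a copy, and then checks object identity with `is`:

```python
nums = [1, 2, 3, 4]
aliases = []  # will hold references to the very same list object
copies = []   # will hold independent snapshots

for _ in range(3):
    nums[0] += 1
    aliases.append(nums)     # no copy: just another reference to nums
    copies.append(nums[:])   # slicing makes a shallow copy, like nums.copy()

print(all(item is nums for item in aliases))  # True
print(aliases)  # [[4, 2, 3, 4], [4, 2, 3, 4], [4, 2, 3, 4]]
print(copies)   # [[2, 2, 3, 4], [3, 2, 3, 4], [4, 2, 3, 4]]
```

A shallow copy is enough here because the elements are integers; if the list contained other mutable objects that also change, copy.deepcopy would be the safer choice.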

Rank elements in nested list without sorting list

Let's say I have a nested list:
list = [[10, 2, 8, 4], [12, 6, 4, 1], [8, 4, 3, 2], [9, 3, 4, 6]]
I want to rank the elements in the sublist against each other to create a new nested list with the rankings.
result = [[1, 4, 2, 3], [1, 2, 3, 4], [1, 2, 3, 4], [1, 4, 3, 2]]
In the first sublist, 10 would be 1st, 8 2nd, etc.
There are already some good solutions. Here is just another one, a functional approach for reference, with no third-party library:
lst = [[10, 2, 8, 4], [12, 6, 4, 1], [8, 4, 3, 2], [9, 3, 4, 6]]  # your lists; don't use the builtin name "list"
def ranking(nums):
    ranks = {x: i for i, x in enumerate(sorted(nums, reverse=True), 1)}
    return [ranks[x] for x in nums]  # quick mapping back: O(1) per lookup
Calling it:
result = list(map(ranking, lst))
As already mentioned in the comments, you can use numpy.argsort. Applying it twice gives the zero-based ascending rank of each value; subtracting that from the length of the sublist ranks from highest to lowest. A list comprehension does this for all the sublists:
>>> import numpy as np
>>> lst = [[10, 2, 8, 4], [12, 6, 4, 1], [8, 4, 3, 2], [9, 3, 4, 6]]
>>> [(len(sub)-np.argsort(sub).argsort()).tolist() for sub in lst]
[[1, 4, 2, 3], [1, 2, 3, 4], [1, 2, 3, 4], [1, 4, 3, 2]]
You can even use 2D numpy array and negate the values, then directly call argsort twice on the resulting array, and finally add 1:
>>> (-np.array(lst)).argsort().argsort()+1
array([[1, 4, 2, 3],
       [1, 2, 3, 4],
       [1, 2, 3, 4],
       [1, 4, 3, 2]], dtype=int64)
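Why applying argsort twice yields ranks can be traced by hand. The sketch below uses a pure-Python stand-in for numpy.argsort (the helper names argsort and rank_descending are hypothetical) so that the intermediate values are visible:

```python
def argsort(values):
    """Indices that would sort `values` in ascending order (stand-in for numpy.argsort)."""
    return sorted(range(len(values)), key=lambda i: values[i])


def rank_descending(values):
    order = argsort(values)          # [1, 3, 2, 0] for [10, 2, 8, 4]
    ascending_rank = argsort(order)  # [3, 0, 2, 1]: position of each value once sorted
    # Subtract from the length so that the largest value gets rank 1.
    return [len(values) - r for r in ascending_rank]


lst = [[10, 2, 8, 4], [12, 6, 4, 1], [8, 4, 3, 2], [9, 3, 4, 6]]
ranked = [rank_descending(sub) for sub in lst]
print(ranked)  # [[1, 4, 2, 3], [1, 2, 3, 4], [1, 2, 3, 4], [1, 4, 3, 2]]
```

The first argsort answers "which index holds the k-th smallest value", and the second inverts that permutation into "which position does each index end up in", which is the rank.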
You can use scipy.stats.rankdata:
my_list = [[10, 2, 8, 4], [12, 6, 4, 1], [8, 4, 3, 2], [9, 3, 4, 6]]
from scipy.stats import rankdata
[list(len(l)+1-rankdata(l).astype(int)) for l in my_list]
output:
[[1, 4, 2, 3], [1, 2, 3, 4], [1, 2, 3, 4], [1, 4, 3, 2]]
Without numpy/scipy:
[[sorted(li, reverse=True).index(x)+1 for x in li] for li in data]
[[1, 4, 2, 3], [1, 2, 3, 4], [1, 2, 3, 4], [1, 4, 3, 2]]
Another solution with no external libraries, and with a better time complexity, just in case your sublists are a bit longer than 4 items (this has some overhead but I presume it is O(n log n) because of the call to sorted).
def rank_all(ls):
    result = []
    for subls in ls:
        pairs = sorted([(subls[j], j) for j in range(len(subls))], reverse=True)
        ranked = [0] * len(subls)
        for j, p in enumerate(pairs):
            ranked[p[1]] = j + 1
        result.append(ranked)
    return result

Nested dictionary from 3 different length lists in Python

I want to make a nested Dictionary out of three different lists which are unequal in length.
These are the lists.
jaren = ['2017', '2018']
wedstrijden = ['NED', 'GER', 'GBR', 'USA']
eventresults = [[1, 2, 3, 4], [1,2], [1,2,3,4,5,6], [1,2,3,4,5,6,7,8,9,10], [3,2,1], [6,5,4,3,2,1], [4,5,6,3], [1,2,3,4,5,6,7]]
The output should be like:
main_dict = {'2017': {'NED':[1, 2, 3, 4], 'GER':[1,2], 'GBR':[1,2,3,4,5,6], 'USA':[1,2,3,4,5,6,7,8,9,10]}, '2018': {'NED':[3, 2, 1], 'GER':[6,5,4,3,2,1], 'GBR':[4,5,6,3], 'USA':[1,2,3,4,5,6,7]}}
My current output is:
main_dict = {'2017': {'NED':[1, 2, 3, 4], 'GER':[1, 2, 3, 4], 'GBR':[1, 2, 3, 4], 'USA':[1, 2, 3, 4]}, '2018': {'NED':[1,2], 'GER':[1,2], 'GBR':[1,2], 'USA':[1,2]}}
And I use this code:
main_dict = {}
for jaar, eventresult in zip(jaren, eventresults):
    main_dict[jaar] = {}
    for wedstrijd in wedstrijden:
        main_dict[jaar][wedstrijd] = eventresult
Actually my list eventresults is a list of DataFrames instead of lists with integers.
Can someone give me a help with the code?
This is easily solved by zipping wedstrijden with an iterator over eventresults:
event_itr = iter(eventresults)
result = {}
for year in jaren:
    result[year] = dict(zip(wedstrijden, event_itr))
# result:
# {'2017': {'NED': [1, 2, 3, 4], 'GER': [1, 2], 'GBR': [1, 2, 3, 4, 5, 6], 'USA': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]},
# '2018': {'NED': [3, 2, 1], 'GER': [6, 5, 4, 3, 2, 1], 'GBR': [4, 5, 6, 3], 'USA': [1, 2, 3, 4, 5, 6, 7]}}
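The iterator trick works because event_itr remembers its position between the zip calls, so each year picks up where the previous one stopped. A sketch of the same idea with the chunking made explicit through itertools.islice (this variant is an illustration, not the answer's own code):

```python
from itertools import islice

jaren = ['2017', '2018']
wedstrijden = ['NED', 'GER', 'GBR', 'USA']
eventresults = [[1, 2, 3, 4], [1, 2], [1, 2, 3, 4, 5, 6],
                [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                [3, 2, 1], [6, 5, 4, 3, 2, 1], [4, 5, 6, 3],
                [1, 2, 3, 4, 5, 6, 7]]

event_itr = iter(eventresults)  # one shared iterator over all results
main_dict = {}
for jaar in jaren:
    # Take exactly one result per event for this year; the iterator
    # keeps its position, so the next year continues after this chunk.
    chunk = list(islice(event_itr, len(wedstrijden)))
    main_dict[jaar] = dict(zip(wedstrijden, chunk))

print(main_dict['2018']['NED'])  # [3, 2, 1]
```

Since nothing here depends on the element type, the same loop should work when eventresults holds DataFrames instead of lists of integers.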

Need to get set of numbers in cyclic order - Python

I am writing a Python script to generate JSON. I constructed the JSON successfully, but I am stuck on selecting numbers in cyclic order.
Let us say I have a list of 1,2,3,4,5. I need to select the first 4 numbers (1,2,3,4) for the first item, 2,3,4,5 for the second, 3,4,5,1 for the third, and so on, 30 times.
import json
import random
json_dict = {}
number = []
brokers = [1,2,3,4,5]
json_dict["version"] = version
json_dict["partitions"] = [{"topic": "topic1", "name": i,
                            "replicas": random.choice(brokers)} for i in range(0, 30)]
with open("output.json", "w") as outfile:
    json.dump(json_dict, outfile, indent=4)
Output
"version": "1",
"partitions": [
{
"topic": "topic1",
"name": 0,
"replicas": 1,2,3,4
},
{
"topic": "topic1",
"name": 1,
"replicas": 2,3,4,5
},
{
"topic": "topic1",
"name": 3,
"replicas": 3,4,5,1
Anyway, how can i achieve this?
In order to get cyclic elements from your brokers list, you can use deque from the collections module and call its rotate(-1) method, as in this example:
from collections import deque
def grouper(iterable, elements, rotations):
    if elements > len(iterable):
        return []
    b = deque(iterable)
    for _ in range(rotations):
        yield list(b)[:elements]
        b.rotate(-1)
brokers = [1,2,3,4,5]
# Pick 4 elements from brokers and yield 30 cycles
cycle = list(grouper(brokers, 4, 30))
print(cycle)
Output:
[[1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 1], [4, 5, 1, 2], [5, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 1], [4, 5, 1, 2], [5, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 1], [4, 5, 1, 2], [5, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 1], [4, 5, 1, 2], [5, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 1], [4, 5, 1, 2], [5, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 1], [4, 5,1, 2], [5, 1, 2, 3]]
Also, here is one way to apply this solution to your final dict:
# in this example i'm using only 5 cycles
cycles = grouper(brokers, 4, 5)
partitions = [{"topic": "topic1", "name": i, "replicas": cycle_elem} for i, cycle_elem in zip(range(5), cycles)]
final_dict = {"version": "1", "partitions": partitions}
print(final_dict)
Output:
{'partitions': [{'name': 0, 'replicas': [1, 2, 3, 4], 'topic': 'topic1'}, {'name': 1, 'replicas': [2, 3, 4, 5], 'topic': 'topic1'}, {'name': 2, 'replicas': [3, 4, 5, 1], 'topic': 'topic1'}, {'name': 3, 'replicas': [4, 5, 1, 2], 'topic': 'topic1'}, {'name': 4, 'replicas': [5, 1, 2, 3], 'topic': 'topic1'}], 'version': '1'}
Here's a purely procedural solution which also adds flexibility of selecting any number of groups, of any size (even bigger than the original 'brokers' list) with any offset:
def get_subgroups(groups, base, size, offset=1):
    # cover the group size > len(base) case by expanding the base
    # this step is completely optional if your group size will never be bigger
    base = base * -(-size // len(base))  # new list, so the caller's list is not modified
    result = []  # storage for our groups
    base_size = len(base)  # no need to call len() all the time
    current_offset = 0  # tracking current cycle offset
    for i in range(groups):  # use xrange() on Python 2.x instead
        tail = current_offset + size  # end index for our current slice
        end = min(tail, base_size)  # normalize to the base size
        group = base[current_offset:end] + base[:tail - end]  # get our slice
        result.append(group)  # append it to our result storage
        current_offset = (current_offset + offset) % base_size  # increase our current offset
    return result
brokers = [1, 2, 3, 4, 5]
print(get_subgroups(5, brokers, 4)) # 5 groups of size 4, with default offset
# prints: [[1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 1], [4, 5, 1, 2], [5, 1, 2, 3]]
print(get_subgroups(3, brokers, 7, 2)) # 3 groups of size 7, with offset 2
# prints: [[1, 2, 3, 4, 5, 1, 2], [3, 4, 5, 1, 2, 3, 4], [5, 1, 2, 3, 4, 5, 1]]
And it does it in O(N) time with a single loop.
If you're planning to generate a very large number of groups, you can turn the get_subgroups() function into a generator by forgoing the result collection and doing yield group instead of result.append(group). That way you can call it in a loop as: for group in get_subgroups(30, brokers, 4): and store the group in whatever structure you want.
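A minimal sketch of that generator variant, under the assumption that it is given a separate name (iter_subgroups, made up here) and that it works on an expanded copy of the base rather than the caller's list:

```python
def iter_subgroups(groups, base, size, offset=1):
    """Yield `groups` cyclic slices of length `size` from `base`, one at a time."""
    # Expand a copy of the base so that size > len(base) is covered.
    base = base * -(-size // len(base))
    base_size = len(base)
    current_offset = 0
    for _ in range(groups):
        tail = current_offset + size
        end = min(tail, base_size)
        # Slice up to the end of the base, then wrap around to the start.
        yield base[current_offset:end] + base[:tail - end]
        current_offset = (current_offset + offset) % base_size


brokers = [1, 2, 3, 4, 5]
for group in iter_subgroups(3, brokers, 4):
    print(group)
# [1, 2, 3, 4]
# [2, 3, 4, 5]
# [3, 4, 5, 1]
```

Only one group exists in memory at a time, so the caller decides whether to collect the groups, write them out, or stop early.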
UPDATE
If memory is not an issue, we can optimize (processing-wise) this even more by expanding the whole base (or brokers in your case) to fit the whole set:
def get_subgroups(groups, base, size, offset=1):  # warning, heavy memory usage!
    # build an expanded copy instead of modifying the caller's list in place
    base = base * -(-(offset * groups + size) // len(base))
    result = []  # storage for our groups
    current_offset = 0  # tracking current cycle offset
    for i in range(groups):  # use xrange() on Python 2.x instead
        result.append(base[current_offset:current_offset + size])
        current_offset += offset
    return result
Or we can make it even faster with list comprehension if we don't need the ability to turn it into a generator:
def get_subgroups(groups, base, size, offset=1):  # warning, heavy memory usage!
    # build an expanded copy instead of modifying the caller's list in place
    base = base * -(-(offset * groups + size) // len(base))
    return [base[i:i + size] for i in range(0, groups * offset, offset)]
    # as previously mentioned, use xrange() on Python 2.x instead
This is a pretty cool problem, and I think I have a pretty cool solution:
items = [1, 2, 3, 4, 5]
[(items * 2)[x:x+4] for i in range(30) for x in [i % len(items)]]
which gives
[[1, 2, 3, 4],
[2, 3, 4, 5],
[3, 4, 5, 1],
[4, 5, 1, 2],
[5, 1, 2, 3],
[1, 2, 3, 4],
[2, 3, 4, 5],
[3, 4, 5, 1],
[4, 5, 1, 2],
[5, 1, 2, 3],
[1, 2, 3, 4],
[2, 3, 4, 5],
[3, 4, 5, 1],
[4, 5, 1, 2],
[5, 1, 2, 3],
[1, 2, 3, 4],
[2, 3, 4, 5],
[3, 4, 5, 1],
[4, 5, 1, 2],
[5, 1, 2, 3],
[1, 2, 3, 4],
[2, 3, 4, 5],
[3, 4, 5, 1],
[4, 5, 1, 2],
[5, 1, 2, 3],
[1, 2, 3, 4],
[2, 3, 4, 5],
[3, 4, 5, 1],
[4, 5, 1, 2],
[5, 1, 2, 3]]
What this is doing is taking your list of things and appending it to itself (items * 2 -> [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]), and then picking a starting place (x) by taking our loop iteration (i) modulo the number of items we have (x in [i % len(items)]).
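The same wrap-around can be expressed with modular indexing on each element instead of doubling the list. A small sketch (the function name cyclic_windows is hypothetical), which also covers windows longer than the list:

```python
def cyclic_windows(items, size, count):
    """Return `count` windows of length `size`, each starting one step further along `items`."""
    n = len(items)
    # (start + k) % n wraps the index back to the beginning of the list.
    return [[items[(start + k) % n] for k in range(size)]
            for start in range(count)]


items = [1, 2, 3, 4, 5]
windows = cyclic_windows(items, 4, 30)
print(windows[:3])  # [[1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 1]]
```

This trades the slice for a per-element index computation, so it is likely slower than slicing for large windows, but it never builds a temporary doubled list.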

Generating and merging all combinations of multiple lists python

Similar questions to this one have been asked before, but none exactly like it, and I'm kind of lost.
If I have 2 lists of lists
listOLists = [[1,2,3],[1,3,2]]
listOLists2 = [[4,5,6],[4,6,5]]
And I want to 'merge' the two lists to make
mergedLists = [[1,2,3,4,5,6],[1,3,2,4,5,6],[1,2,3,4,6,5],[1,3,2,4,6,5]]
How would I do this?
list1s = [[1,2,3],[3,2,1],[2,2,2]]
list2s = [[3,3,3],[4,4,4],[5,5,5]]
for indis1 in list1s:
    for indis2 in list2s:
        print(indis1 + indis2)
Try this; it prints:
[1, 2, 3, 3, 3, 3]
[1, 2, 3, 4, 4, 4]
[1, 2, 3, 5, 5, 5]
[3, 2, 1, 3, 3, 3]
[3, 2, 1, 4, 4, 4]
[3, 2, 1, 5, 5, 5]
[2, 2, 2, 3, 3, 3]
[2, 2, 2, 4, 4, 4]
[2, 2, 2, 5, 5, 5]
You may use a list comprehension to simplify your code, like this:
a = [[1, 2, 3], [1, 3, 2], [2, 1, 3]]
b = [[4, 5, 6], [4, 6, 5], [5, 4, 6]]
c = [i + j for i in a for j in b]
print(c)
Output:
[[1, 2, 3, 4, 5, 6], [1, 2, 3, 4, 6, 5], [1, 2, 3, 5, 4, 6], [1, 3, 2, 4, 5, 6], [1, 3, 2, 4, 6, 5], [1, 3, 2, 5, 4, 6], [2, 1, 3, 4, 5, 6], [2, 1, 3, 4, 6, 5], [2, 1, 3, 5, 4, 6]]
list1 = [[1,2,3],[1,3,2]]
list2 = [[4,5,6],[4,6,5]]
mergedLists = []
for list1_inner in list1:
    for list2_inner in list2:
        mergedLists.append(list1_inner + list2_inner)
print(mergedLists)
A comparison of methods:
import itertools
import random
l1 = [[random.randint(1,100) for _ in range(100)]for _ in range(100)]
l2 = [[random.randint(1,100) for _ in range(100)]for _ in range(100)]
With itertools:
def itert(l1, l2):
    [list(itertools.chain(*x)) for x in itertools.product(l1, l2)]
With for loops:
def forloops(list1, list2):
    mergedLists = []
    for list1_inner in list1:
        for list2_inner in list2:
            mergedLists.append(list1_inner + list2_inner)
With a simple Comprehension:
def comp(l1, l2):
    [i + j for i in l1 for j in l2]
Speed
%time itert(l1, l2)
Wall time: 99.8 ms
%time comp(l1, l2)
Wall time: 31.3 ms
%time forloops(l1, l2)
Wall time: 46.9 ms
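The %time lines above are IPython magics. Outside IPython, a comparable measurement can be sketched with the standard timeit module; the function names, input sizes, and repeat count below are arbitrary choices for illustration, and the functions return their results so that they can be checked against each other first:

```python
import itertools
import random
import timeit

random.seed(0)  # fixed seed so that runs are repeatable
l1 = [[random.randint(1, 100) for _ in range(30)] for _ in range(30)]
l2 = [[random.randint(1, 100) for _ in range(30)] for _ in range(30)]


def with_itertools(a, b):
    return [list(itertools.chain(*pair)) for pair in itertools.product(a, b)]


def with_loops(a, b):
    merged = []
    for inner_a in a:
        for inner_b in b:
            merged.append(inner_a + inner_b)
    return merged


def with_comprehension(a, b):
    return [i + j for i in a for j in b]


# All three approaches should produce the same merged lists.
reference = with_comprehension(l1, l2)
same_results = (with_itertools(l1, l2) == reference
                and with_loops(l1, l2) == reference)
print(same_results)

for func in (with_itertools, with_loops, with_comprehension):
    seconds = timeit.timeit(lambda: func(l1, l2), number=10)
    print(func.__name__, round(seconds, 4))
```

Absolute timings depend on the machine and Python version, so only the relative ordering is worth comparing.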
