The function needs to be able to take a list such as:
[("Alice", [1, 2, 1, 1, 1, 1]), ("Bob", [3, 1, 5, 3, 2, 5]), ("Clare", [2, 3, 2, 2, 4, 2]), ("Dennis", [5, 4, 4, 4, 3, 4]), ("Eva", [4, 5, 3, 5, 5, 3])]
and process the information to order it by the total of each person's results, outputting the data in the original format but starting with the person with the lowest score and working downwards. Ties must be broken using the first value in each list (the first result for each person).
What I have written so far can take one entry and work out the total score:
def result(name, a):
    a.remove(max(a))
    score = 0
    for i in a:
        score = score + i
    return score
But I need to be able to adapt this to take any number of entries and to output more than just the total.
I know I need the function to work out the total scores while keeping the original sets intact, but I don't know how to interact with just one part of an entry and iterate through all of them doing the same.
I'm using Python 3.4.
If I've understood your question properly, you'd like to sort the list, and have the sort order defined by the sum of the numbers provided in each tuple. So, Alice's numbers add up to 7, Clare's add up to 15, so Alice is before Clare.
sorted() can take a function to override the normal sort order. For example, you'd want:
sorted(data, key=lambda entry: sum(entry[1]))
This will mean that in order to work out what's bigger and smaller, sorted() will look at the sum of the list of numbers, and compare those. For example, when looking at Alice, the lambda (anonymous) function will be given ("Alice", [1, 2, 1, 1, 1, 1]), and so entry[1] is [1, 2, 1, 1, 1, 1], sum(entry[1]) is then 7, and that's the number that sorted() will use to put it in the right place.
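Putting it together with the tie-break from the question (a sketch; this assumes ties are broken by each person's first result, and it leaves the original tuples untouched):

```python
data = [("Alice", [1, 2, 1, 1, 1, 1]), ("Bob", [3, 1, 5, 3, 2, 5]),
        ("Clare", [2, 3, 2, 2, 4, 2]), ("Dennis", [5, 4, 4, 4, 3, 4]),
        ("Eva", [4, 5, 3, 5, 5, 3])]

# Sort by (total, first result): the tuple key breaks ties on the
# first value, without mutating the score lists.
ordered = sorted(data, key=lambda entry: (sum(entry[1]), entry[1][0]))

print([name for name, scores in ordered])
# ['Alice', 'Clare', 'Bob', 'Dennis', 'Eva']
```

If you also want to drop each person's highest score before totalling, as the `result()` function above does, the key could use `sum(entry[1]) - max(entry[1])` instead of `sum(entry[1])`.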
Sorry for the long title, but I'm not sure how to shorten it. I'm trying to program an object that targets other objects. Each object is assigned an integer id starting from 0, and that's all that's really relevant here. I can access objects by id, so I just need to get the numbers. Each object should target the same number of other objects, not target itself, and not target any other object more than once. I want each object to randomly choose its targets given these conditions, which is trivial. But this can make things uneven, which I want to avoid. By uneven, I mean that the number of times an object is targeted by other objects is random. So what I want is for each object to not only have the same number of targets, but also be targeted the same number of times, which would end up being the same number. This leads to my problem. I need to randomly generate groups of numbers from a given range. Each group will be the same size. Within each group, each number is unique. The order is irrelevant; these groups could just be sets. Overall, each number should appear the same number of times. The number of times each number is repeated and the size of the groups are the same, and given. Making sure each object's id isn't in the corresponding group can be done after the groups are generated, so that's not an issue.
Attempt 1: Guess and check
So my first thought was a simple guess-and-check method: just randomly generating the groups of numbers and rejecting invalid numbers. Ignoring that this program sometimes ends up in an infinite loop (this was just a quick thing made for this), it gets the job done. But I feel like it's kind of inefficient, and there's still the whole infinite-loop problem.
import random
from typing import Tuple, Set, Iterator

def randTargets(numObjs: int, numTargets: int) -> Iterator[Tuple[int, Set[int]]]:
    # numObjs is the total number of objects
    # numTargets is the number of other objects each object will target
    # numTargets can be assumed to be in the range [2, numObjs)
    numLeft = [numTargets] * numObjs
    # numLeft represents how many times each object can be targeted
    # i.e., if numLeft[i] is n, i can only be targeted n more time(s)
    for i in range(numObjs):
        targets = set()
        # this can get caught in an infinite loop where the only
        # target left is the object itself, but I'm too lazy to fix
        # that right now. Just assume that doesn't happen lmao
        while len(targets) < numTargets:
            t = random.randrange(numObjs)
            # checks that the target isn't the object itself, hasn't already
            # been targeted too many times, and hasn't already been targeted
            # by this item
            if t != i and numLeft[t] > 0 and t not in targets:
                targets.add(t)
                numLeft[t] -= 1
        yield i, targets
    # check to make sure every object has been targeted exactly numTargets times
    assert all(i == 0 for i in numLeft)
Attempt 2: Not really random
I tried to take a crack at this, but the best thing I could come up with wasn't exactly random.
def randTargetsSlightlyBetter(numObjs: int, numTargets: int) -> Iterator[Tuple[int, Set[int]]]:
    # numObjs is the total number of objects
    # numTargets is the number of other objects each object will target
    # numTargets can be assumed to be in the range [2, numObjs)
    objs = list(range(numObjs))
    targets = []
    for offset in random.sample(range(1, numObjs), k=numTargets):
        # shifts the objs list to the right by offset, wrapping around
        targets.append(objs[offset:] + objs[:offset])
    for obj, *targets in zip(objs, *targets):
        yield obj, targets
I feel like it might be kinda hard to tell what that does, so here:
# if numObjs = 4, objs looks like this:
[0, 1, 2, 3]
# let's assume instead of random offsets I just use (1, 2)
# targets would look like this:
[[1, 2, 3, 0],
[2, 3, 0, 1]]
# adding objs to the beginning of targets would get:
[[0, 1, 2, 3],
[1, 2, 3, 0],
[2, 3, 0, 1]]
# each column is a group, the first row being the object targeting the others
# transposing the list using zip(*targets), we get:
[(0, 1, 2),
(1, 2, 3),
(2, 3, 0),
(3, 0, 1)]
# which would be equivalent to zip(objs, *targets)
Trying to randomize this by shuffling the values of the initial list objs wouldn't do anything, because the individual objects are interchangeable and the ids are arbitrary. So I thought to randomize how much the target lists are offset, which kinda works. But this wouldn't be completely random; there would still be a pattern to things. Looking at the example with offsets of (1, 2), we can see that object 0 would target objects 1 and 2. Object 1 would target 2 and 3, and so on. While the pattern would be harder to see with randomized offsets, there would still be one, and that's what I'm trying to avoid.
Sorry if any of this was confusingly explained; I can have a weird way of thinking about things lmao. If anything needs clarifying, let me know.
TL;DR:
I have a range of integer numbers, and I want to randomly generate groups of numbers from this range such that no number appears more than once in any given group. The order of numbers in a group doesn't matter, and groups don't have to contain every number, just a certain amount. Additionally, I want every number in the range to appear the same number of times across all the groups.
Example output:
>>> randGroups(range(4), repeats=2)
[{0, 2}, {1, 3}, {0, 3}, {1, 2}]
>>> randGroups(range(10), repeats=3)
[{1, 8, 9}, {1, 2, 8}, {5, 6, 7}, {2, 3, 6}, {4, 5, 9}, {5, 7, 9}, {0, 8, 9}, {0, 2, 8}, {1, 6, 7}, {0, 3, 4}]
This randomizes via the index rather than the actual parameter; while it isn't the most random, it might still count.
# import library
import random

# input parameters
range_ = 9
repeats = 4

# rotate array elements
def rotate(input, n):
    return input[n:] + input[:n]

# shifts for shuffling; since they are all different, no index can get the same element twice
shifts = [i for i in range(0, range_)]
random.shuffle(shifts)

# arrays to partition for shuffling
to_shuffle = [[j for j in range(0, range_)] for i in range(0, repeats)]

# rotate arrays relative to shift
for i in range(0, repeats):
    to_shuffle[i] = rotate(to_shuffle[i], shifts[i])

# exchange rows and columns so as to give proper output
output_arrays = [[to_shuffle[j][i] for j in range(0, repeats)] for i in range(0, range_)]

# random output
print(output_arrays)
Example output:
[[3, 8, 4, 5], [4, 0, 5, 6], [5, 1, 6, 7], [6, 2, 7, 8], [7, 3, 8, 0], [8, 4, 0, 1], [0, 5, 1, 2], [1, 6, 2, 3], [2, 7, 3, 4]]
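The script above can also be wrapped as a reusable function (a sketch; `rand_groups` is a hypothetical name). Using `random.sample` to pick distinct shifts is what guarantees no duplicates within a group:

```python
import random

def rand_groups(range_, repeats):
    # Distinct shifts mean every column (group) gets `repeats`
    # different elements, and each number appears `repeats` times overall.
    shifts = random.sample(range(range_), repeats)
    rows = [list(range(range_))[s:] + list(range(range_))[:s] for s in shifts]
    # Transpose: each column becomes one group
    return [set(col) for col in zip(*rows)]

groups = rand_groups(9, 4)
print(groups)
```

Note that a shift of 0 is still possible here (as in the script above), so an object can end up targeting itself; per the question, filtering that out can be done after the groups are generated.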
This is hard to describe with a good title. Here is what I want to do:
I have a numpy array with unique items in it:
unique_arr = np.asarray([1, 4, 12, 5])
...then I have a second array that is very long, and has many occurrences of the items in the first array:
long_arr = np.asarray([12, 4, 4, 1, 12, 5, 5, ... ])
I'd like to make a third array that is the same length as long_arr, but instead of the items long_arr has, it contains the indexes of those items in unique_arr:
long_idxs = something_magic(unique_arr, long_arr)
print(long_idxs)
>>> [2, 1, 1, 0, 2, 3, 3, ...]
Is there an efficient numpy-way of accomplishing this?
You can use searchsorted, but then you need to sort unique_arr first:
unique, idx = np.unique(unique_arr, return_index=True)
a = np.searchsorted(unique, long_arr)
long_idxs = idx[a]
Output:
array([2, 1, 1, 0, 2, 3, 3])
Note that searchsorted doesn't check for an exact match; e.g. if long_arr contained 3, it would still return 1. You may need to validate the result.
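One way to validate (a sketch): index back into unique_arr with the computed indexes and compare against long_arr; any mismatch means a value wasn't actually present.

```python
import numpy as np

unique_arr = np.asarray([1, 4, 12, 5])
long_arr = np.asarray([12, 4, 4, 1, 12, 5, 5])

unique, idx = np.unique(unique_arr, return_index=True)
long_idxs = idx[np.searchsorted(unique, long_arr)]

# Round-trip check: if every value of long_arr really occurs in
# unique_arr, indexing back must reproduce long_arr exactly.
assert np.array_equal(unique_arr[long_idxs], long_arr)
print(long_idxs)  # [2 1 1 0 2 3 3]
```

(This assumes no value in long_arr exceeds the largest value in unique_arr; otherwise searchsorted can return an out-of-range position and the indexing itself will raise.)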
So, I have an iterable in Input like this:
[4, 6, 2, 2, 6, 4, 4, 4]
And I want to sort it in decreasing order of frequency, so the result will be this:
[4, 4, 4, 4, 6, 6, 2, 2]
So what happened here is that when an element has the same frequency as another one, they keep their first-appearance order (6 appeared first, so the 6s go before the 2s).
I tried to implement this mechanism using the sorted function but I have a big problem.
def frequency_sort(items):
    return sorted(items, key=lambda elem: sum([True for i in items if i == elem]), reverse=True)
I know this short way is difficult to read, but it just sorts the array using the key parameter to extract the frequency of each number. But the output is this:
[4, 4, 4, 4, 6, 2, 2, 6]
As you can see, the output is a little different from what it should be. That happens (I think) because sorted() does a "stable sort", i.e. a sort that keeps the original order of elements with equal keys.
So what is happening here is like a strong stable sort. I want more of a soft sort that takes the original order into account but still puts equal elements next to each other.
You could use collections.Counter and its most_common method, which returns entries in descending order of frequency:
from collections import Counter

def frequency_sorted(lst):
    counts = Counter(lst)
    return [k for k, v in counts.most_common() for _ in range(v)]
result = frequency_sorted([4, 6, 2, 2, 6, 4, 4, 4])
print(result)
Output
[4, 4, 4, 4, 6, 6, 2, 2]
From the documentation on most_common:
Return a list of the n most common elements and their counts from the
most common to the least. If n is omitted or None, most_common()
returns all elements in the counter. Elements with equal counts are
ordered in the order first encountered
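If you'd rather stay with sorted(), the stable sort can be steered with a composite key: sort by descending count, then by the position of each value's first appearance. A sketch (`frequency_sort_stable` is a hypothetical name; this relies on dicts preserving insertion order, i.e. Python 3.7+):

```python
from collections import Counter

def frequency_sort_stable(items):
    counts = Counter(items)
    # dict.fromkeys keeps first-occurrence order, so this maps each
    # value to the position where it first appeared (for tie-breaking)
    first_seen = {v: i for i, v in enumerate(dict.fromkeys(items))}
    return sorted(items, key=lambda x: (-counts[x], first_seen[x]))

print(frequency_sort_stable([4, 6, 2, 2, 6, 4, 4, 4]))
# [4, 4, 4, 4, 6, 6, 2, 2]
```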
I am using the bisect module to keep a list sorted while inserting numbers.
Let's say I am going to insert three numbers 9, 2, 5, in this order.
The final state of this list would obviously be [2, 5, 9]; however, is there any chance I can find the list of indexes at which the numbers ended up after insertion? For this list it would be [2, 0, 1] (9 ends up at index 2, 2 at index 0, and 5 at index 1). Since bisect sorts as it inserts, the sort happens incrementally, which is why I could not find a way. I could just sort with the key feature of the sorted function, but I don't want to increase the complexity. So my question is: is this achievable with the bisect module?
Here is the code I use,
import bisect
lst = []
bisect.insort(lst, 9)
bisect.insort(lst, 2)
bisect.insort(lst, 5)
print(lst)
Edit: Another example: I am going to insert the numbers 4, 7, 1, 2, 9 into some empty list. (Let's first assume, without bisect, that I already have the numbers in the list.)
[4, 7, 1, 2, 9]
# indexes [0, 1, 2, 3, 4], typical enumeration
after sorting,
[1, 2, 4, 7, 9]
# now the index list [2, 3, 0, 1, 4]
Can it be done with bisect without increasing complexity?
Note: The order of the insertion is not arbitrary. It is known; that's why I try to use indexes with bisect.
insort has no idea in what order the items were inserted. You'll have to add that logic yourself. One way to do so could be to insert 2-tuples consisting of the value and the index:
bisect.insort(lst, (9, 0))
bisect.insort(lst, (2, 1))
bisect.insort(lst, (5, 2))
You would need to keep track of the index yourself as you're adding objects, but as sequences are sorted first by the first item, then by the next, etc., this will still sort properly without any extra effort.
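A minimal sketch of that idea (`insort_with_order` is a hypothetical helper name): insert (value, insertion_index) tuples, then read each inserted value's final position back out of the sorted list.

```python
import bisect

def insort_with_order(values):
    lst = []
    for i, v in enumerate(values):
        # Tuples compare by value first, so the list stays sorted by value
        bisect.insort(lst, (v, i))
    # positions[i] = where the i-th inserted value sits in the sorted list
    positions = [0] * len(values)
    for pos, (_, i) in enumerate(lst):
        positions[i] = pos
    return [v for v, _ in lst], positions

print(insort_with_order([4, 7, 1, 2, 9]))
# ([1, 2, 4, 7, 9], [2, 3, 0, 1, 4])
```

Each insort stays O(n) in list-shifting cost, and the final pass to recover positions is a single O(n) scan, so this doesn't change the overall complexity of the insertion loop.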
Using update seems like it should be pretty straightforward, and I think I'm using it correctly, so it must be an error dealing with types or something else.
But anyway, here's the situation:
I'm doing coursework for a Coursera course (needless to say, answers minimizing or occluding code are most helpful!) and am stuck on the last problem. The task is to return a set that contains all the documents which contain all the words in a query. The function takes an inverseIndex, a dictionary with words as keys and lists of the documents containing those words as values, e.g.: {'a': [0, 1], 'be': [0, 1, 4], ...}
The way I've attempted to implement this is pretty simple: get a collection of sets, where each set contains the document IDs for one word, and then call .intersection(*sets) to merge the sets into a set containing only the doc IDs of docs that contain all words in the query.
def andSearch(inverseIndex, query):
    sets = set()
    s = set()
    for word in query:
        s.update(inverseIndex[word])
        print(inverseIndex[word])
    print(s)
    s.intersection(*sets)
    return s
Unfortunately, this returns all the documents in the inverseIndex when it should only return the document ID 3.
terminal output:
[0, 1, 2, 3, 4]
[0, 1, 2, 3]
[0, 1, 2, 3, 4]
[0, 1, 2, 3]
[0, 1, 3, 4]
[2, 3, 4]
set([0, 1, 2, 3, 4])
What's wrong?
Thanks so much!
sets = []
s = set()
for word in query:
    sets.append(inverseIndex[word])
print(sets)
s.intersection(*sets)
return s
Output:
[[0, 1, 2, 3, 4], [0, 1, 2, 3], [0, 1, 2, 3, 4], [0, 1, 2, 3], [0, 1, 3, 4], [2, 3, 4]]
set([])
You use update inside the loop, so on each iteration you add the new pages to s. But you need to intersect those pages, because you want the pages each of which contains all the words (not 'at least one word'). So you need to intersect on each iteration instead of updating.
Also, I don't see why you need sets at all.
This should work:
def andSearch(inverseIndex, query):
    return set.intersection(*(set(inverseIndex[word]) for word in query))
This just produces the array of sets:
>>> [set(ii[word]) for word in query]
[set([0, 1]), set([0, 1, 4])]
And then I just call set.intersection to intersect them all.
About your question update.
It happens because s is empty.
Consider this example:
>>> s = set()
>>> s.intersection([1,2,3],[2,3,4])
set([])
To intersect sets just use set.intersection. But it accepts only sets as arguments. So you should convert lists of pages to sets of pages, or keep pages as sets in the dictionary.
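A quick check of that one-liner with a small hypothetical index (the word lists here are made up for illustration):

```python
def andSearch(inverseIndex, query):
    # Convert each document list to a set, then intersect them all
    return set.intersection(*(set(inverseIndex[word]) for word in query))

# Hypothetical inverse index: word -> list of document IDs containing it
inverseIndex = {'a': [0, 1, 2, 3, 4], 'be': [0, 1, 2, 3], 'the': [2, 3, 4]}

print(andSearch(inverseIndex, ['a', 'be', 'the']))  # {2, 3}
```

Only documents 2 and 3 appear in all three lists, so only they survive the intersection.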