I have two lists with the same number of elements. In the first list, each element is a category type. In the second list, each element is a cost. Elements at the same index correspond to each other.
For example:
category_list = [1, 2, 2, 1, 3, 3, 3, 3, 4, 2]
cost_list = [30, 45, 21, 22, 21, 32, 11, 12, 13, 11]
I have to pick exactly one element from every category, and I would like to minimize the total cost. How can I implement this efficiently? Thank you for your help.
I recommend the code below. It uses the built-in zip to iterate over category_list and cost_list in parallel. For each pair, it checks whether the category is already a key in x_dict. If it is, the built-in min compares the cost already stored with the new cost and keeps the smaller one. Otherwise the key does not exist yet, so the key-value pair is added for the first time. Dictionary keys must be unique, which is why this logic works.
category_list = [1, 2, 2, 1, 3, 3, 3, 3, 4, 2]
cost_list = [30, 45, 21, 22, 21, 32, 11, 12, 13, 11]
x_dict=dict()
for cat, cost in zip(category_list, cost_list):
    if cat in x_dict:
        x_dict[cat] = min(x_dict[cat], cost)
    else:
        x_dict[cat] = cost
print(x_dict)
Output
{1: 22, 2: 11, 3: 11, 4: 13}
We could also pass the zip result directly to dict. That removes duplicate keys, but it keeps the last value seen for each key rather than the smallest one:
x=dict(zip(category_list,cost_list))
print(x)
output
{1: 22, 2: 11, 3: 12, 4: 13}
As you can see, the last cost in the list is assigned to each category, not the minimum. That is why the code above uses the explicit min logic.
You can store the minimum cost of each category in a dict.
category_list = [1, 2, 2, 1, 3, 3, 3, 3, 4, 2]
cost_list = [30, 45, 21, 22, 21, 32, 11, 12, 13, 11]
min_cst = dict()
for cat, cst in zip(category_list, cost_list):
    if cat in min_cst:
        min_cst[cat] = min(min_cst[cat], cst)
    else:
        min_cst[cat] = cst
print(min_cst)
# {1: 22, 2: 11, 3: 11, 4: 13}
The time complexity is O(N). The space complexity is O(K), where K is the number of distinct categories, which is O(N) in the worst case.
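A minimal sketch of the same single-pass idea wrapped in a function, which also returns the total cost that the question wants to minimize. The function name min_cost_per_category is hypothetical, and the sketch assumes both lists have the same length.

```python
def min_cost_per_category(categories, costs):
    """Return ({category: cheapest cost}, total of those cheapest costs)."""
    cheapest = {}
    for cat, cost in zip(categories, costs):
        # dict.get falls back to the current cost when the category is new
        cheapest[cat] = min(cheapest.get(cat, cost), cost)
    return cheapest, sum(cheapest.values())


category_list = [1, 2, 2, 1, 3, 3, 3, 3, 4, 2]
cost_list = [30, 45, 21, 22, 21, 32, 11, 12, 13, 11]
picks, total = min_cost_per_category(category_list, cost_list)
print(picks, total)
```

Using dict.get with a default removes the explicit if/else branch without changing the single pass over the data.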
Since there are already three nearly-identical answers using dict, here is a different answer using min only once per category:
category_list = [1, 2, 2, 1, 3, 3, 3, 3, 4, 2]
cost_list = [30, 45, 21, 22, 21, 32, 11, 12, 13, 11]
pick_indices = [min((i for i in range(len(cost_list)) if category_list[i] == cat), key=lambda i: cost_list[i]) for cat in set(category_list)]
pick_totalcost = sum(cost_list[i] for i in pick_indices)
Or alternatively:
pick_costs = [min(cost for i,(cost,cat) in enumerate(zip(cost_list, category_list)) if cat == c) for c in set(category_list)]
pick_totalcost = sum(pick_costs)
Solution 1:
You can iterate over category_list and keep the minimum cost per category in a dictionary keyed by category. Use dict.get() to fetch the minimum stored so far (falling back to the current cost when the key is new), and overwrite it whenever the current cost is not larger.
category_list = [1, 2, 2, 1, 3, 3, 3, 3, 4, 2]
cost_list = [30, 45, 21, 22, 21, 32, 11, 12, 13, 11]
dct_min = {}
for idx in range(len(category_list)):
    min_item = dct_min.get(category_list[idx], cost_list[idx])
    # alternative solution, thanks to Stef:
    # min_item = dct_min.setdefault(category_list[idx], cost_list[idx]); if cost_list[idx] < min_item: ...
    if cost_list[idx] <= min_item:
        dct_min[category_list[idx]] = cost_list[idx]
print(dct_min)
Output:
{1: 22, 2: 11, 3: 11, 4: 13}
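Solution 2: zip the two lists, sort the resulting tuples in descending order by category and then by cost, and convert the sorted tuples to a dictionary. A dictionary keeps only one value per key and later pairs overwrite earlier ones, so the smallest cost of each category is the one that survives.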
zip_lsts = list(zip(category_list, cost_list))
dict(sorted(zip_lsts, key=lambda element: (element[0], element[1]),reverse=True))
# {4: 13, 3: 11, 2: 11, 1: 22}
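A related sketch, not part of the answer above: sort in ascending order instead and use itertools.groupby, so that the first pair of each category group is its cheapest. The variable names pairs and cheapest are only illustrative.

```python
from itertools import groupby
from operator import itemgetter

category_list = [1, 2, 2, 1, 3, 3, 3, 3, 4, 2]
cost_list = [30, 45, 21, 22, 21, 32, 11, 12, 13, 11]

# Tuples sort by category first, then by cost, so each category's
# cheapest pair comes first within its group.
pairs = sorted(zip(category_list, cost_list))
cheapest = {cat: next(group)[1] for cat, group in groupby(pairs, key=itemgetter(0))}
print(cheapest)
```

Both sorting variants cost O(N log N), so the single-pass dictionary loop is still the better choice for large inputs.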
[{'_id': '5ebe39e41e1729d90de',
'modelId': '5ebe3536c289711579',
'lastAt': datetime.datetime(2020, 5, 15, 6, 42, 44, 79000),
'proId': '5ebe3536c2897115793dccfb',
'genId': '5ebe355ac2897115793dcd04',
'count':'ab'},
{'_id': '5ebe3a0d94fcb800fa474310',
'modelId': '5ebe3536c289711579',
'proId': '5ebe3536c2897115793d',
'genId': '5ebe355ac2897115793',
'lastAt': datetime.datetime(2020, 5, 15, 6, 43, 25, 105000),'count':'cd'}]
The count field in the documents above is stored in encrypted form. How can I extract the counts for each day of a week, the total count for a week, and the total count for a month?
Note: I need the sum of the counts.
I have a list of hours starting from 0 (0 is midnight).
hour = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]
I want to generate a sequence of 3 consecutive hours randomly. Example:
[3,6]
or
[15, 18]
or
[23,2]
and so on. random.sample does not achieve what I want!
import random
hourSequence = sorted(random.sample(range(1,24), 2))
Any suggestions?
I'm not exactly sure what you want, but probably:
import random
s = random.randint(0, 23)
r = [s, (s+3)%24]
r
Out[14]: [16, 19]
Note: None of the other answers take into consideration the possible sequence [23,0,1]
Consider the following, which uses itertools from the Python standard library:
from itertools import islice, cycle
from random import choice
hours = list(range(24)) # List w/ 24h
hours_cycle = cycle(hours) # Turn the list into an endless cycle
select_init = islice(hours_cycle, choice(hours), None) # Start the iterator at a random position
# Get the next 3 values from the iterator
select_range = []
for i in range(3):
    select_range.append(next(select_init))
print(select_range)
This prints a sequence of three values from your hours list in a circular way, so results such as [23,0,1] are also possible.
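The same circular behaviour can be sketched with modular arithmetic instead of cycle and islice. The helper name random_hour_window and its start parameter are hypothetical; start exists only so that the wrap-around case can be reproduced on demand.

```python
import random


def random_hour_window(length=3, hours_in_day=24, start=None):
    """Return `length` consecutive hours, wrapping past midnight."""
    if start is None:
        start = random.randrange(hours_in_day)  # random hour in 0..23
    # The modulo maps hour 24 back to 0, hour 25 to 1, and so on
    return [(start + offset) % hours_in_day for offset in range(length)]


print(random_hour_window())          # a random window
print(random_hour_window(start=23))  # forces the wrap-around case
```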
You can try this:
import random
hour = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]
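index = random.randint(0, len(hour) - 1)  # randint includes both endpoints
l = [hour[index], hour[(index + 3) % len(hour)]]  # wrap around instead of raising IndexError near the end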
print(l)
You can pick a random position in the hour list you already created and take the element 3 places after it, wrapping around at the end:
import random
def random_sequence_endpoints(l, span):
    i = random.choice(range(len(l)))
    return [l[i], l[(i + span) % len(l)]]  # index the list passed in, not the global hour
hour = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]
result = random_sequence_endpoints(hour, 3)
This works not only for the hours list above but for any other list, whatever elements it contains.
In this other SO post, a Python user asked how to group continuous numbers such that any sequences could just be represented by its start/end and any stragglers would be displayed as single items. The accepted answer works brilliantly for continuous sequences.
I need to adapt a similar solution for a sequence of numbers whose increments may (but do not always) vary. Ideally, the representation would also include the increment (so readers know whether it was every 3rd, 4th, 5th, or nth number).
Referencing the original question, the user asked for the following input/output
[2, 3, 4, 5, 12, 13, 14, 15, 16, 17, 20] # input
[(2,5), (12,17), 20]
What I would like is the following (Note: I wrote a tuple as the output for clarity but xrange would be preferred using its step variable):
[2, 3, 4, 5, 12, 13, 14, 15, 16, 17, 20] # input
[(2,5,1), (12,17,1), 20] # note, the last element in the tuple would be the step value
And it could also handle the following input
[2, 4, 6, 8, 12, 13, 14, 15, 16, 17, 20] # input
[(2,8,2), (12,17,1), 20] # note, the last element in the tuple would be the increment
I know that xrange() supports a step so it may be possible to even use a variant of the other user's answer. I tried making some edits based on what they wrote in the explanation but I wasn't able to get the result I was looking for.
For anyone that doesn't want to click the original link, the code that was originally posted by Nadia Alramli is:
ranges = []
for key, group in groupby(enumerate(data), lambda (index, item): index - item):
    group = map(itemgetter(1), group)
    if len(group) > 1:
        ranges.append(xrange(group[0], group[-1]))
    else:
        ranges.append(group[0])
The itertools pairwise recipe is one way to solve the problem. Combined with itertools.groupby, it creates groups of adjacent pairs that share the same difference. For multi-item groups, the first item of the first pair and the last item of the last pair are selected; for singleton groups, the last item of the pair is selected:
from itertools import groupby, tee, izip
def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    next(b, None)
    return izip(a, b)
def grouper(lst):
    result = []
    for k, g in groupby(pairwise(lst), key=lambda x: x[1] - x[0]):
        g = list(g)
        if len(g) > 1:
            try:
                if g[0][0] == result[-1]:
                    del result[-1]
                elif g[0][0] == result[-1][1]:
                    g = g[1:] # patch for duplicate start and/or end
            except (IndexError, TypeError):
                pass
            result.append((g[0][0], g[-1][-1], k))
        else:
            result.append(g[0][-1]) if result else result.append(g[0])
    return result
Trial: input -> grouper(lst) -> output
Input: [2, 3, 4, 5, 12, 13, 14, 15, 16, 17, 20]
Output: [(2, 5, 1), (12, 17, 1), 20]
Input: [2, 4, 6, 8, 12, 13, 14, 15, 16, 17, 20]
Output: [(2, 8, 2), (12, 17, 1), 20]
Input: [2, 4, 6, 8, 12, 12.4, 12.9, 13, 14, 15, 16, 17, 20]
Output: [(2, 8, 2), 12, 12.4, 12.9, (13, 17, 1), 20] # 12 does not appear in the second group
Update: (patch for duplicate start and/or end values)
s1 = [i + 10 for i in xrange(0, 11, 2)]; s2 = [30]; s3 = [i + 40 for i in xrange(45)]
Input: s1+s2+s3
Output: [(10, 20, 2), (30, 40, 10), (41, 84, 1)]
# to make 30 appear as an entry instead of a group change main if condition to len(g) > 2
Input: s1+s2+s3
Output: [(10, 20, 2), 30, (41, 84, 1)]
Input: [2, 4, 6, 8, 10, 12, 13, 14, 15, 16, 17, 20]
Output: [(2, 12, 2), (13, 17, 1), 20]
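To see what the groupby over pairwise differences produces before grouper's patching logic runs, here is a small Python 3 sketch. The helper name difference_runs is hypothetical, and zip(values, values[1:]) stands in for the pairwise recipe, which assumes the input is a list.

```python
from itertools import groupby


def difference_runs(values):
    """Group adjacent pairs of `values` by their difference (the step)."""
    pairs = zip(values, values[1:])  # (v0, v1), (v1, v2), ...
    return [(step, list(run))
            for step, run in groupby(pairs, key=lambda p: p[1] - p[0])]


for step, run in difference_runs([2, 4, 6, 8, 12, 13, 14]):
    print(step, run)
# 2 [(2, 4), (4, 6), (6, 8)]
# 4 [(8, 12)]
# 1 [(12, 13), (13, 14)]
```

The singleton run with step 4 is the jump between two progressions, which is why grouper treats single-pair groups differently from multi-pair groups.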
You can create an iterator to help with the grouping and pull the first element from the following group, which is the end of the previous group:
def ranges(lst):
    it = iter(lst)
    next(it) # move to second element for comparison
    grps = groupby(lst, key=lambda x: (x - next(it, -float("inf"))))
    for k, v in grps:
        i = next(v)
        try:
            step = next(v) - i # catches single element v or gives us a step
            nxt = list(next(grps)[1])
            yield xrange(i, nxt.pop(0), step)
            # outliers or another group
            if nxt:
                yield nxt[0] if len(nxt) == 1 else xrange(nxt[0], next(next(grps)[1]), nxt[1] - nxt[0])
        except StopIteration:
            yield i # no seq
which gives you:
In [2]: l1 = [2, 3, 4, 5, 8, 10, 12, 14, 13, 14, 15, 16, 17, 20, 21]
In [3]: l2 = [2, 4, 6, 8, 12, 13, 14, 15, 16, 17, 20]
In [4]: l3 = [13, 14, 15, 16, 17, 18]
In [5]: s1 = [i + 10 for i in xrange(0, 11, 2)]
In [6]: s2 = [30]
In [7]: s3 = [i + 40 for i in xrange(45)]
In [8]: l4 = s1 + s2 + s3
In [9]: l5 = [1, 2, 5, 6, 9, 10]
In [10]: l6 = {1, 2, 3, 5, 6, 9, 10, 13, 19, 21, 22, 23, 24}
In [11]:
In [11]: for l in (l1, l2, l3, l4, l5, l6):
....: print(list(ranges(l)))
....:
[xrange(2, 5), xrange(8, 14, 2), xrange(13, 17), 20, 21]
[xrange(2, 8, 2), xrange(12, 17), 20]
[xrange(13, 18)]
[xrange(10, 20, 2), 30, xrange(40, 84)]
[1, 2, 5, 6, 9, 10]
[xrange(1, 3), 5, 6, 9, 10, 13, 19, xrange(21, 24)]
When the step is 1 it is not included in the xrange output.
Here is a quickly written (and extremely ugly) answer:
def test(inArr):
    arr=inArr[:] #copy, unnecessary if we use index in a smart way
    result = []
    while len(arr)>1: #as long as there can be an arithmetic progression
        x=[arr[0],arr[1]] #take first two
        arr=arr[2:] #remove from array
        step=x[1]-x[0]
        while len(arr)>0 and x[1]+step==arr[0]: #check if the next value in array is part of progression too
            x[1]+=step #add it
            arr=arr[1:]
        result.append((x[0],x[1],step)) #append progression to result
    if len(arr)==1:
        result.append(arr[0])
    return result
print test([2, 4, 6, 8, 12, 13, 14, 15, 16, 17, 20])
This returns [(2, 8, 2), (12, 17, 1), 20]
Slow, as it copies a list and removes elements from it
It only finds complete progressions, and only in sorted arrays.
In short, it is shitty, but should work ;)
There are other (cooler, more pythonic) ways to do this, for example you could convert your list to a set, keep removing two elements, calculate their arithmetic progression and intersect with the set.
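As a sketch of the "use index in a smart way" idea mentioned in the code comment, here is an index-based Python 3 variant of the same greedy approach that avoids copying and slicing the list. The function name progressions is hypothetical. Like the version above, it assumes sorted input and treats any two leftover neighbours as a progression.

```python
def progressions(values):
    """Greedily split `values` into (start, end, step) runs plus a trailing single value."""
    result = []
    i, n = 0, len(values)
    while i < n:
        if i + 1 == n:
            # Only one value left: it cannot form a progression
            result.append(values[i])
            break
        step = values[i + 1] - values[i]
        j = i + 1
        # Extend the run while the next difference matches the step
        while j + 1 < n and values[j + 1] - values[j] == step:
            j += 1
        result.append((values[i], values[j], step))
        i = j + 1
    return result


print(progressions([2, 4, 6, 8, 12, 13, 14, 15, 16, 17, 20]))
# [(2, 8, 2), (12, 17, 1), 20]
```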
You could also reuse the answer you provided to check for certain step sizes. e.g.:
ranges = []
step_size=2
for key, group in groupby(enumerate(data), lambda (index, item): step_size*index - item):
    group = map(itemgetter(1), group)
    if len(group) > 1:
        ranges.append(xrange(group[0], group[-1]))
    else:
        ranges.append(group[0])
Which finds every group with step size of 2, but only those.
I came across such a case once. Here it goes.
import more_itertools as mit
iterable = [2, 3, 4, 5, 12, 13, 14, 15, 16, 17, 20] # input
x = [list(group) for group in mit.consecutive_groups(iterable)]
output = [(i[0],i[-1]) if len(i)>1 else i[0] for i in x]
print(output)
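If installing more_itertools is not an option, a comparable result can be sketched with the standard library alone, using the index-minus-value grouping trick in Python 3 syntax. The function name consecutive_ranges is hypothetical, and like the snippet above it only detects runs with a step of 1.

```python
from itertools import groupby


def consecutive_ranges(values):
    """Collapse runs of consecutive integers into (start, end) tuples."""
    result = []
    # value - index stays constant within a run of consecutive integers
    for _, group in groupby(enumerate(values), key=lambda pair: pair[1] - pair[0]):
        run = [value for _, value in group]
        result.append((run[0], run[-1]) if len(run) > 1 else run[0])
    return result


print(consecutive_ranges([2, 3, 4, 5, 12, 13, 14, 15, 16, 17, 20]))
# [(2, 5), (12, 17), 20]
```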
I'm trying to find the intersection list of 5 lists of datetime objects. I know the intersection of lists question has come up a lot on here, but my code is not performing as expected (like the ones from the other questions).
Here are the first 3 elements of the 5 lists with the exact length of the list at the end.
[datetime.datetime(2014, 8, 14, 19, 25, 6), datetime.datetime(2014, 8, 14, 19, 25, 7), datetime.datetime(2014, 8, 14, 19, 25, 9)] # length 38790
[datetime.datetime(2014, 8, 14, 19, 25, 6), datetime.datetime(2014, 8, 14, 19, 25, 7), datetime.datetime(2014, 8, 14, 19, 25, 9)] # length 38818
[datetime.datetime(2014, 8, 14, 19, 25, 6), datetime.datetime(2014, 8, 14, 19, 25, 7), datetime.datetime(2014, 8, 14, 19, 25, 9)] # length 38959
[datetime.datetime(2014, 8, 14, 19, 25, 6), datetime.datetime(2014, 8, 14, 19, 25, 7), datetime.datetime(2014, 8, 14, 19, 25, 9)] # length 38802
[datetime.datetime(2014, 8, 14, 19, 25, 6), datetime.datetime(2014, 8, 14, 19, 25, 7), datetime.datetime(2014, 8, 14, 19, 25, 9)] # length 40415
I've made a list of these lists called times. I've tried 2 methods of intersecting.
Method 1:
intersection = times[0] # make intersection the first list
for i in range(len(times)):
    if i == 0:
        continue
    intersection = [val for val in intersection if val in times[i]]
This method results in a list with length 20189 and takes 104 seconds to run.
Method 2:
intersection = times[0] # make intersection the first list
for i in range(len(times)):
    if i == 0:
        continue
    intersection = list(set(intersection) & set(times[i]))
This method results in a list with length 20148 and takes 0.1 seconds to run.
I've run into 2 problems with this. The first problem is that the two methods yield different size intersections and I have no clue why. And the other problem is that the datetime object datetime.datetime(2014, 8, 14, 19, 25, 6) is clearly in all 5 lists (see above) but when I print (datetime.datetime(2014, 8, 14, 19, 25, 6) in intersection) it returns False.
Your first list times[0] has duplicate elements; this is the reason for inconsistency. If you would do intersection = list(set(times[0])) in your first snippet, the problem would go away.
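A small illustration of that effect, using made-up lists a and b rather than the asker's data: the list comprehension keeps every duplicate from the first list, while the set intersection keeps each value once.

```python
from datetime import datetime

# Hypothetical sample data; the first timestamp appears twice in `a`
a = [datetime(2014, 8, 14, 19, 25, 6),
     datetime(2014, 8, 14, 19, 25, 6),
     datetime(2014, 8, 14, 19, 25, 7)]
b = [datetime(2014, 8, 14, 19, 25, 6),
     datetime(2014, 8, 14, 19, 25, 7),
     datetime(2014, 8, 14, 19, 25, 9)]

by_comprehension = [t for t in a if t in b]  # duplicates survive
by_sets = list(set(a) & set(b))              # duplicates collapse

print(len(by_comprehension), len(by_sets))   # 3 2
```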
As for your second snippet, it will be faster if you avoid converting back and forth between lists and sets:
intersection = set(times[0]) # make a set of the first list
for timeset in times[1:]:
    intersection.intersection_update(timeset)
# if necessary make into a list again
intersection = list(intersection)
And actually, since intersection accepts multiple iterables as separate arguments, you can simply replace all your code with:
intersection = set(times[0]).intersection(*times[1:])
For the in intersection problem, is the instance an actual datetime.datetime or just pretending to be? At least the timestamps seem not to be timezone aware.
Lists can have duplicate items, which can cause inconsistencies with the length. To avoid these duplicates, you can turn each list of datetimes into a set:
map(set, times)
This will give you a list of sets (with duplicate times removed). To find the intersection, you can use set.intersection:
intersection = set.intersection(*map(set, times))
With your example, intersection will be this set:
set([datetime.datetime(2014, 8, 14, 19, 25, 9), datetime.datetime(2014, 8, 14, 19, 25, 6), datetime.datetime(2014, 8, 14, 19, 25, 7)])
There may be duplicate times in the lists; you can handle that simply like this:
Python3:
import functools
result = functools.reduce(lambda x, y: set(x) & set(y), times)
Python2:
result = reduce(lambda x, y: set(x) & set(y), times)
intersection = set(*times[:1]).intersection(*times[1:])