How to store calculated values? - python

I have been trying to write code that finds the number of ways to reach a specified sum. This is very similar to the subset sum problem I found online (Finding all possible combinations of numbers to reach a given sum).
I modified the code slightly so that it can reuse each number multiple times.
object_list = [(2, 50), (3, 100), (5, 140)] # the first number in the tuple is my weight and the second is my cost
max_weight = 17
weight_values = [int(i[0]) for i in object_list]
cost_values = [int(i[1]) for i in object_list]
def subset_sum(objects, max_weight, weights=[]):
    w = sum(weights)
    if w == max_weight:
        print("sum(%s)=%s" % (weights, max_weight))
    if w >= max_weight:
        return
    for i in range(len(objects)):
        o = objects[i]
        subset_sum(objects, max_weight, weights + [o])
if __name__ == "__main__":
    subset_sum(weight_values, max_weight)
    print(subset_sum(weight_values, max_weight))
This gives the solution:
sum([2, 2, 2, 2, 2, 2, 2, 3])=17
sum([2, 2, 2, 2, 2, 2, 3, 2])=17
sum([2, 2, 2, 2, 2, 2, 5])=17
sum([2, 2, 2, 2, 2, 3, 2, 2])=17
...
And so on.
Unlike the original, I start from a list of tuples and build one list from the first value of each tuple (the weights) and another from the second value (the costs).
The part I am currently stuck on is how to store these values and reuse them in the next part of the code. I had a look at this post but I couldn't understand it (Python: How to store the result of an executed function and re-use later?).
So I want to store the list [2, 2, 2, 2, 2, 2, 2, 3] from the solution sum([2, 2, 2, 2, 2, 2, 2, 3])=17. I want to do this for all solutions, because in the next step I am going to replace each number with the other value in its tuple (so 2 will be replaced by 50, because the tuple that 2 is in is (2, 50)). Then I am going to compute a second sum from the replaced numbers and print the highest value (probably by sorting the solutions from highest to lowest and printing the first one in the list).
I tried using a dictionary to replace the values after the calculation, but I couldn't manage to do it.
I tried:
dictionary = dict(zip(weight_values, cost_values))
Any help is appreciated. Before anyone asks, I have looked online for solutions and have no one else to ask for help, since the only person I know who has a background in coding is my brother, who isn't at home.
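One possible way to keep the solutions, sketched here under assumptions rather than taken from an answer: have the recursive function append each matching list to a results list and return it instead of printing. The names subset_sums, found and cost_of are hypothetical. The weight-to-cost lookup relies on every weight appearing in only one tuple.

```python
def subset_sums(weights, target, partial=None, found=None):
    """Collect every ordered list of weights (reuse allowed) that sums to target."""
    if partial is None:
        partial = []
    if found is None:
        found = []
    total = sum(partial)
    if total == target:
        found.append(partial)      # store the solution instead of printing it
    if total >= target:
        return found               # no point in adding more weights
    for w in weights:
        subset_sums(weights, target, partial + [w], found)
    return found

object_list = [(2, 50), (3, 100), (5, 140)]   # (weight, cost)
cost_of = dict(object_list)                   # weight -> cost, assumes unique weights

solutions = subset_sums([w for w, _ in object_list], 17)
# Rank the stored solutions by the total cost of their weights.
best = max(solutions, key=lambda s: sum(cost_of[w] for w in s))
best_cost = sum(cost_of[w] for w in best)
print(best, best_cost)
```

Because the solutions are ordinary lists held in solutions, they can be sorted, filtered or converted to costs in any later step.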

Related

Applying an iterable mask, checking it against a value - if value doesn't satisfy the mask condition, move to the next value which does

I currently have some code where I've created a mask which checks to see if a variable matches the first position in a sequence, called index_pos_overload. If it matches, the variable is chosen, and the check ends. However, I want to be able to use this mask to not only check if the number satisfies the condition of the mask, but if it doesn't move along to the next value in the sequence which does. It's essentially to pick out a row in my pandas data column, hyst. My code currently looks like this:
import pandas as pd
from itertools import chain
hyst = pd.DataFrame({"test":[12, 4, 5, 4, 1, 3, 2, 5, 10, 9, 7, 5, 3, 6, 3, 2 ,1, 5, 2]})
possible_overload_cycle = 1
index_pos_overload = chain.from_iterable((hyst.index[i])
                                         for i in range(0, len(hyst)-1, 5))
if (possible_overload_cycle == index_pos_overload):
    hyst_overload_cycle = possible_overload_cycle
else:
    hyst_overload_cycle = 5 #next value in iterable where index_pos_overload is true
The expected output of hyst_overload_cycle should be this:
print(hyst_overload_cycle)
5
I've included my logic as to how I think this should work: possible_overload_cycle = 1 does not point to the first position in the dataframe, so hyst_overload_cycle should return as 5, the first position in the mask. I hope I've made sense, as I can't quite work out how I would go about this programmatically.
If I understood you correctly, it may be simpler than you think:
index_pos_overload can be an array or a list; there is no need for complex constructs to store a sequence of values
to find the first non-zero value in index_pos_overload, one can simply use np.nonzero(index_pos_overload)[0][0] (the first [0] selects the dimension, the second selects the first index along that axis) and use the result to index the original index_pos_overload array
The code would look like:
import numpy as np
import pandas as pd
hyst = pd.DataFrame({"test":[12, 4, 5, 4, 1, 3, 2, 5, 10, 9, 7, 5, 3, 6, 3, 2 ,1, 5, 2]})
possible_overload_cycle = 1
index_pos_overload = np.array([hyst.index[i] for i in range(0, len(hyst)-1, 5)])
if possible_overload_cycle in index_pos_overload:
    hyst_overload_cycle = possible_overload_cycle
else:
    hyst_overload_cycle = index_pos_overload[np.nonzero(index_pos_overload)[0][0]]
print(hyst_overload_cycle)
# 5
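The code above always falls back to the first non-zero mask position. If "the next value in the sequence" is meant as the smallest mask position at or after the candidate (an assumption about the question's intent), np.searchsorted on the sorted positions is one way to sketch it. The helper name next_mask_position and its None return value are hypothetical choices for this illustration.

```python
import numpy as np

def next_mask_position(candidate, mask_positions):
    """Return candidate if it is a mask position, else the next larger one.

    mask_positions must be sorted in ascending order. Returns None when no
    mask position lies at or after candidate (a choice made for this sketch).
    """
    positions = np.asarray(mask_positions)
    # Index of the first position that is >= candidate.
    idx = np.searchsorted(positions, candidate, side="left")
    if idx == len(positions):
        return None
    return int(positions[idx])

# Same positions as range(0, len(hyst)-1, 5) for the 19-row frame above.
mask = np.arange(0, 18, 5)
print(next_mask_position(1, mask))   # 5
print(next_mask_position(10, mask))  # 10
print(next_mask_position(11, mask))  # 15
```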

Reason why numpy rollaxis is so confusing?

The behavior of the numpy rollaxis function confuses me.
The documentation says:
Roll the specified axis backwards, until it lies in a given position.
And for the start parameter:
The axis is rolled until it lies before this position.
To me, this is already somehow inconsistent.
OK, a straightforward example (from the documentation):
>>> a = np.ones((3,4,5,6))
>>> np.rollaxis(a, 1, 4).shape
(3, 5, 6, 4)
The axis at index 1 (which has length 4) is rolled backward until it lies before index 4.
Now, when the start index is smaller than the axis index, we have this behavior:
>>> np.rollaxis(a, 3, 1).shape
(3, 6, 4, 5)
Instead of shifting the axis at index 3 before index 1, it ends up at 1.
Why is that? Why isn't the axis always rolled to the given start index?
NumPy v1.11 and newer includes a new function, moveaxis, that I recommend using instead of rollaxis (disclaimer: I wrote it!). The source axis always ends up at the destination, without any funny off-by-one issues depending on whether start is greater or less than end:
import numpy as np
x = np.zeros((1, 2, 3, 4, 5))
for i in range(5):
    print(np.moveaxis(x, 3, i).shape)
Results in:
(4, 1, 2, 3, 5)
(1, 4, 2, 3, 5)
(1, 2, 4, 3, 5)
(1, 2, 3, 4, 5)
(1, 2, 3, 5, 4)
Much of the confusion results from our human intuition - how we think about moving an axis. We could specify a number of roll steps (back or forth 2 steps), or a location in the final shape tuple, or location relative to the original shape.
I think the key to understanding rollaxis is to focus on the slots in the original shape. The most general statement that I can come up with is:
Roll a.shape[axis] to the position before a.shape[start]
before in this context means the same as in list insert(). So it is possible to insert before the end.
The basic action of rollaxis is:
axes = list(range(0, n))
axes.remove(axis)
axes.insert(start, axis)
return a.transpose(axes)
If axis<start, then start-=1 is applied before the insert, to account for the remove action.
Negative values get +=n, so for the 4-D a above rollaxis(a,-2,-3) is the same as np.rollaxis(a,2,1), e.g. a.shape[-3]==a.shape[1]. List insert also allows a negative insert position, but rollaxis doesn't make use of that feature.
So the keys are understanding that remove/insert pair of actions, and understanding transpose(x).
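The remove/insert description can be checked with a small sketch. rollaxis_sketch is a hypothetical name; it follows the steps described above rather than reproducing NumPy's source, and it skips the range validation that the real function performs.

```python
import numpy as np

def rollaxis_sketch(a, axis, start=0):
    """Re-create np.rollaxis from the remove/insert description."""
    n = a.ndim
    if axis < 0:
        axis += n
    if start < 0:
        start += n
    if axis < start:
        start -= 1                 # compensate for the removal below
    axes = list(range(n))
    axes.remove(axis)
    axes.insert(start, axis)
    return a.transpose(axes)

a = np.ones((3, 4, 5, 6))
print(rollaxis_sketch(a, 1, 4).shape)  # (3, 5, 6, 4)
print(rollaxis_sketch(a, 3, 1).shape)  # (3, 6, 4, 5)
```

Since all four axis lengths differ, comparing shapes with np.rollaxis for every valid axis/start pair is enough to confirm that both produce the same axis order.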
I suspect rollaxis is intended to be a more intuitive version of transpose. Whether it achieves that or not is another question.
You suggest either omitting the start-=1 or applying it across the board.
Omitting it doesn't change your 2 examples. It only affects the rollaxis(a,1,4) case, and axes.insert(4,1) is the same as axes.insert(3,1) when axes is [0,2,3]. The 1 is still placed at the end. Changing that test a bit:
np.rollaxis(a,1,3).shape
# (3, 5, 4, 6) # a.shape[1](4) placed before a.shape[3](6)
without the -=1
# transpose axes == [0, 2, 3, 1]
# (3, 5, 6, 4) # the 4 is placed at the end, after 6
If instead -=1 applies always
np.rollaxis(a,3,1).shape
# (3, 6, 4, 5)
becomes
(6, 3, 4, 5)
Now the 6 is before the 3, which was the original a.shape[0]. After the roll, 3 is a.shape[1]. But that's a different roll specification.
It comes down to how start is defined. Is it a position in the original order, or a position in the returned order?
If you prefer to think of start as an index position in the final shape, wouldn't it be simpler to drop the before part and just say 'move axis to dest slot'?
myroll(a, axis=3, dest=0) => np.transpose(a,[3,0,1,2])
myroll(a, axis=1, dest=3) => np.transpose(a,[0,2,3,1])
Simply dropping the -=1 test might do the trick (omitting the handling of negative numbers and boundaries):
def myroll(a,axis,dest):
    x=list(range(a.ndim))
    x.remove(axis)
    x.insert(dest,axis)
    return a.transpose(x)
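As a quick check, and assuming only non-negative indices, this "move axis to dest slot" behaviour appears to coincide with np.moveaxis for a single axis. The sketch below repeats myroll so that it runs on its own and compares the resulting shapes.

```python
import numpy as np

def myroll(a, axis, dest):
    # Remove the axis, then insert it at its final position (no start -= 1).
    order = list(range(a.ndim))
    order.remove(axis)
    order.insert(dest, axis)
    return a.transpose(order)

a = np.ones((3, 4, 5, 6))
# All axis lengths differ, so equal shapes imply the same axis order.
for axis in range(a.ndim):
    for dest in range(a.ndim):
        assert myroll(a, axis, dest).shape == np.moveaxis(a, axis, dest).shape

print(myroll(a, 1, 3).shape)  # (3, 5, 6, 4)
print(myroll(a, 3, 0).shape)  # (6, 3, 4, 5)
```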
a = np.arange(1*2*3*4*5).reshape(1,2,3,4,5)
np.rollaxis(a,axis,start)
'axis' is the index of the axis to be moved starting from 0. In my example the axis at position 0 is 1.
'start' is the index (again starting at 0) of the axis that we would like to move our selected axis before.
So, if start=2, the axis at position 2 is 3, therefore the selected axis will be placed before the 3.
Examples:
>>> np.rollaxis(a,0,2).shape # the 1 will be before the 3.
(2, 1, 3, 4, 5)
>>> np.rollaxis(a,0,3).shape # the 1 will be before the 4.
(2, 3, 1, 4, 5)
>>> np.rollaxis(a,1,2).shape # the 2 will be before the 3.
(1, 2, 3, 4, 5)
>>> np.rollaxis(a,1,3).shape # the 2 will be before the 4.
(1, 3, 2, 4, 5)
So, after the roll, the number that was at position axis sits just before the number that was at position start.
If you think of rollaxis like this it is very simple and makes perfect sense, though it's strange that they chose to design it this way.
So, what happens when axis and start are the same? Well, you obviously can't put a number before itself, so the number doesn't move and the instruction becomes a no-op.
Examples:
>>> np.rollaxis(a,1,1).shape # the 2 can't be moved to before the 2.
(1, 2, 3, 4, 5)
>>> np.rollaxis(a,2, 2).shape # the 3 can't be moved to before the 3.
(1, 2, 3, 4, 5)
How about moving the axis to the end? Well, there's no number after the end, but you can specify start as after the end.
Example:
>>> np.rollaxis(a,1,5).shape # the 2 will be moved to the end.
(1, 3, 4, 5, 2)
>>> np.rollaxis(a,2,5).shape # the 3 will be moved to the end.
(1, 2, 4, 5, 3)
>>> np.rollaxis(a,4,5).shape # the 5 is already at the end.
(1, 2, 3, 4, 5)

Python-search function

I want to write a search function that takes in a value x and a sorted sequence and returns the position that the value should go to by iterating through the elements of the sequence starting from the first element. The position that x should go to in the list should be the first position such that it will be less than or equal to the next element in the list.
Example:
>>> search(-5, (1, 5, 10))
0
>>> search(3, (1, 5, 10))
1
Building a list of every item would be a bit of a waste of resources if there were big gaps in the list; instead, you can just iterate through the list until you reach an element that is greater than or equal to the input.
In terms of your code:
def search(value, sequence):
    for i in range(len(sequence)):
        if sequence[i] >= value:  # first element that value is less than or equal to
            return i
    return len(sequence)  # value is larger than every element
print(search(-5, (1, 5, 10)))
#Result: 0
print(search(3, (1, 5, 10)))
#Result: 1
To insert the value into the list, this would work: I split the list in two at the index and add the value in the middle.
def insert(value, sequence):
    index = search(value, sequence) #Get where the value should be inserted
    newInput = [value]+list(sequence[index:]) #Add the end of the list to the value
    if index:
        newInput = list(sequence[:index])+newInput #Add the start of the list if the index isn't 0
    return newInput
print(insert(-5, (1, 5, 10)))
#Result: [-5, 1, 5, 10]
print(insert(3, (1, 5, 10)))
#Result: [1, 3, 5, 10]
Since someone has already answered a similar question, I will just draw a rough skeleton of what you may want to do.
Declare a list and populate it with your values:
mylist = [1, 2, 3, 5, 5, 6, 7]
Then write a function that iterates over the list:
def my_func(x, mylist):
    for i, item in enumerate(mylist):
        if item >= x:
            return i
Given 3 in list (1, 2, 3, 4, 5), the function should return index 2.
Given 3 in list (1, 2, 4, 5, 6), it should still return 2.
You may want to check my Python code, because I have not tested it for errors. I am assuming you know some Python, and with the skeleton you should be able to crack it. Also note that Python cares about the indentation.
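The question asks for an explicit left-to-right scan, but for comparison the standard library's bisect module returns the same position using binary search. This is offered as an alternative sketch, not as what either answer proposes. bisect_left gives the first index whose element is greater than or equal to x, which matches the "less than or equal to the next element" rule in the question.

```python
from bisect import bisect_left

def search(x, seq):
    """Position where x should go in the sorted sequence seq."""
    return bisect_left(seq, x)

print(search(-5, (1, 5, 10)))  # 0
print(search(3, (1, 5, 10)))   # 1
print(search(5, (1, 5, 10)))   # 1, x goes before an equal element
print(search(11, (1, 5, 10)))  # 3, x is larger than everything
```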

Number of distinct items between two consecutive uses of an item in realtime

I'm working on a problem that finds the distance, i.e. the number of distinct items between two consecutive uses of an item, in real time. The input is read from a large file (~10G), but for illustration I'll use a small list.
from collections import OrderedDict
unique_dist = OrderedDict()
input = [1, 4, 4, 2, 4, 1, 5, 2, 6, 2]
for item in input:
    if item in unique_dist:
        indx = unique_dist.keys().index(item) # find the index
        unique_dist.pop(item) # pop the item
        size = len(unique_dist) # find the size of the dictionary
        unique_dist[item] = size - indx # update the distance value
    else:
        unique_dist[item] = -1 # -1 if it is new
print input
print unique_dist
As you can see, for each item I first check whether it is already in the dictionary. If it is, I update its distance value; otherwise I insert it at the end with the value -1. The problem is that this becomes very inefficient as the size grows. Memory isn't a problem, but the pop function seems to be. I say that because if, just for comparison, I do:
import random
for item in input:
    unique_dist[item] = random.randint(1,99999)
the program runs really fast. My question is: is there any way to make my program more efficient (faster)?
EDIT:
It seems that the actual culprit is indx = unique_dist.keys().index(item). When I replaced that with indx = 1, the program was orders of magnitude faster.
According to a simple analysis I did with the cProfile module, the most expensive operations by far are OrderedDict.__iter__() and OrderedDict.keys().
The following implementation is roughly 7 times as fast as yours (according to the limited testing I did).
It avoids the call to unique_dist.keys() by maintaining a separate list of the items, keys. I'm not entirely sure, but I think this also avoids the call to OrderedDict.__iter__().
It avoids the call to len(unique_dist) by incrementing the size variable whenever necessary. (I'm not sure how expensive of an operation len(OrderedDict) is, but whatever)
def distance(input):
    dist = []
    key_set = set()
    keys = []
    size = 0
    for item in input:
        if item in key_set:
            index = keys.index(item)
            del keys[index]
            del dist[index]
            keys.append(item)
            dist.append(size-index-1)
        else:
            key_set.add(item)
            keys.append(item)
            dist.append(-1)
            size += 1
    return OrderedDict(zip(keys, dist))
I modified @Rawing's answer to avoid the overhead of lookups and insertions in the set data structure.
from random import randint
dist = {}
input = []
for x in xrange(1,10):
    input.append(randint(1,5))
keys = []
size = 0
for item in input:
    if item in dist:
        index = keys.index(item)
        del keys[index]
        keys.append(item)
        dist[item] = size-index-1
    else:
        keys.append(item)
        dist[item] = -1
        size += 1
print input
print dist
How about this:
from collections import OrderedDict
unique_dist = OrderedDict()
input = [1, 4, 4, 2, 4, 1, 5, 2, 6, 2]
for item in input:
    if item in unique_dist:
        indx = unique_dist.keys().index(item)
        #unique_dist.pop(item) # don't pop the item
        size = len(unique_dist) # now the dictionary is one element too big
        unique_dist[item] = size - indx - 1 # therefore decrement the value here
    else:
        unique_dist[item] = -1 # -1 if it is new
print input
print unique_dist
[1, 4, 4, 2, 4, 1, 5, 2, 6, 2]
OrderedDict([(1, 2), (4, 1), (2, 2), (5, -1), (6, -1)])
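Beware that the entries in unique_dist are now ordered by the first occurrence of each item in the input, and because an item is no longer moved to the end when it is reused, some values differ as well (here 2 maps to 2 instead of 1). Yours were ordered by the last occurrence: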
[1, 4, 4, 2, 4, 1, 5, 2, 6, 2]
OrderedDict([(4, 1), (1, 2), (5, -1), (6, -1), (2, 1)])
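The answers above still call index() on a list or key view, which scans linearly on every repeated item. A different technique, not used in any of the answers, is to keep a Fenwick (binary indexed) tree over input positions that marks the most recent position of each item. The distance is then a range count in O(log n). The sketch below is a Python 3 illustration under the assumption that the input length is known up front. The function name reuse_distances is hypothetical.

```python
def reuse_distances(items):
    """Map each item to the number of distinct items seen since its previous use.

    Items that occur only once map to -1, as in the question.
    """
    n = len(items)
    tree = [0] * (n + 1)           # Fenwick tree, 1-based internally

    def update(pos, delta):        # add delta to the mark at 0-based pos
        i = pos + 1
        while i <= n:
            tree[i] += delta
            i += i & -i

    def prefix(pos):               # number of marks at positions 0..pos
        i = pos + 1
        total = 0
        while i > 0:
            total += tree[i]
            i -= i & -i
        return total

    last_seen = {}                 # item -> position of its latest use
    dist = {}
    for t, item in enumerate(items):
        if item in last_seen:
            p = last_seen[item]
            # Marks strictly between p and t are the distinct items in between.
            dist[item] = prefix(t - 1) - prefix(p)
            update(p, -1)          # the old position is no longer the latest
        else:
            dist[item] = -1
        update(t, 1)
        last_seen[item] = t
    return dist

print(reuse_distances([1, 4, 4, 2, 4, 1, 5, 2, 6, 2]))
```

For the sample input this produces the same values as the question's original code, although the key order of the returned dictionary follows first occurrence instead of last.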

Number of permutations of a set with an element limited to some positions

Say I have a combination of digits [1, 3, 5, 0, 9]. How can I calculate the number of permutations of the combination where 0 is not at the first position? Also, there may be more than one 0 in the combination.
The literal translation of your problem into python code would be:
>>> from itertools import permutations
>>> len([x for x in permutations((1, 3, 5, 0, 9)) if x[0]!=0])
96
But note that this actually calculates all the permutations, which would take a long time when the sequence gets long enough.
If all you are interested in is the number of possible permutations fitting your restrictions, you'd be better off calculating that number via combinatorial considerations, as fredtantini mentioned.
Let's assume you are talking about a list (sets are not ordered and cannot contain an item more than once).
Calculating the number of permutations is a mathematical problem that can be solved without Python: the number of permutations of 5 distinct items is 5!. Since you don't want the permutations that start with 0, and there are 4! of those, the total is 5!-4!=96.
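The same counting argument can be extended to repeated digits, including more than one 0, with the multinomial formula n!/(c1! c2! ...). The sketch below assumes that arrangements which look identical are counted once. If repeated digits should count as distinct items, the plain factorials apply instead. The function names are hypothetical.

```python
from collections import Counter
from math import factorial

def distinct_permutations(counts):
    """Number of distinct arrangements of a multiset given its item counts."""
    total = factorial(sum(counts.values()))
    for c in counts.values():
        total //= factorial(c)
    return total

def count_without_leading_zero(digits):
    """Distinct permutations of digits that do not start with 0."""
    counts = Counter(digits)
    total = distinct_permutations(counts)
    if counts[0] == 0:
        return total               # no zero, so nothing to exclude
    counts[0] -= 1                 # fix one 0 in front, arrange the rest
    return total - distinct_permutations(counts)

print(count_without_leading_zero([1, 3, 5, 0, 9]))  # 96
print(count_without_leading_zero([1, 0, 0]))        # 1, only (1, 0, 0)
```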
Python's itertools module provides the permutations function. You can use a list comprehension to filter the results and take the length:
>>> from itertools import permutations
>>> [l for l in permutations(list({1, 3, 5, 0, 9})) if l[0]!=0]
[(9, 0, 3, 5, 1), (9, 0, 3, 1, 5), ..., (1, 5, 3, 9, 0)]
>>> len([l for l in permutations(list({1, 3, 5, 0, 9})) if l[0]!=0])
96
If I understand your problem correctly, the following logic should work:
a = [1, 3, 5, 0, 9]
import itertools
perm = list(itertools.permutations(a))
perm_new = []
for i in range(len(perm)):
    if perm[i][0] != 0:
        perm_new.append(perm[i])
