Say you have n items, each ranging from 1-100. How can I go over all possible variations within the range?
Example:
3 stocks A, B and C
Working to find possible portfolio allocation.
A - 0 0 0 ... 1 2 ... 1 1
B - 0 1 2 ... 0 0 ... 1 2
C - 100 99 98 ... 99 98 ... 98 97
Looking for an efficient way to get a matrix of all possible outcomes.
Sum should add up to 100 and cover all possible variations for n elements.
How I'd do it:
>>> import itertools
>>> cp = itertools.product(range(101),repeat=3)
>>> portfolios = list(p for p in cp if sum(p)==100)
But that creates unnecessary combinations. See discussions of integer partitioning to avoid that. E.g., Elegant Python code for Integer Partitioning
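One way to avoid the wasted combinations is to generate only tuples that already sum to the target: pick the first value, then recurse on what remains. A minimal sketch, assuming allocations are non-negative integers (the function name `allocations` is made up for illustration):

```python
def allocations(total, parts):
    """Yield every tuple of `parts` non-negative integers summing to `total`."""
    if parts == 1:
        # Only one slot left: it must take whatever remains.
        yield (total,)
        return
    for first in range(total + 1):
        # Fix the first value, then distribute the remainder over the rest.
        for rest in allocations(total - first, parts - 1):
            yield (first,) + rest


portfolios = list(allocations(100, 3))
```

For three stocks this produces 5151 tuples directly, rather than filtering all 101**3 candidates.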
Let's say I have the following line:
l = Line(Point(25, 0), Point(25, 25))
and I have a dataframe (df) which contains 2500 points, something like:
x y
0 0 49
1 13 48
2 0 47
3 5 46
4 9 45
...
How can I efficiently check whether the lines formed by each and every combination of those points intersect the above line?
Note that I am using the intersection function from the sympy library.
Also note that using two nested loops takes forever, so it is not efficient.
Here I have a list:
some_list = ['a','r','p','i','l','a','z','a','r','l','i','i','l','z','p']
I want some function to index each of the characters in the list with a unique index.
So the code should be something like:
for char in some_list:
    char_index = some_list.magic_index(char)
    print(char_index)
magic_index should be a function that returns a number from 0 to 14 incrementally for each character.
The output should be something like:
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
I know this isn't really indexing each character, but I just want some function to return a value from 0 to 14 for each character, so that each character has its own unique number from 0 to 14.
I know this is kind of a dumb question, but it is somehow just very hard for me. If someone knows how to solve this, please give me some help. Thank you!
Use enumerate and build a map of characters to indices (for a character that appears more than once, the map keeps its last index):
>>> magic_index = {c: i for i, c in enumerate(some_list)}.get
>>> magic_index('a')
7
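If the goal is just a running position for every element, repeated characters included, enumerate on its own is enough. A small sketch using the list from the question (with the missing opening quote restored, which is an assumption about the intended input):

```python
some_list = ['a', 'r', 'p', 'i', 'l', 'a', 'z', 'a', 'r', 'l', 'i', 'i', 'l', 'z', 'p']

# enumerate pairs each element with its position, starting at 0.
indexed = list(enumerate(some_list))
for index, char in indexed:
    print(index, char)
```

Each position is used exactly once, so the printed indices run from 0 to 14 even though several characters repeat.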
I have a large list of sub-lists (approx. 16000) in which I want to find where a repeating pattern starts and ends. I am not 100% sure that there is a repeat, but I have strong reason to believe so because of the diagonals that appear within the sub-list sequence. The structure of a list of sub-lists is preferred, as it is used that way for other things in this script. The data looks like this:
data = ['1100100100000010',
'1001001000000110',
'0010010000001100',
'0100100000011011', etc
I do not have any time constraints, however the fastest method would not be frowned upon. The code should be able to return the starting/ending sequence and its location within the list, to be called upon in the future. If there is an arrangement of the data that would be more useful, I can try to reformat it if necessary. Python is something that I have been learning for the past few months, so I am not quite able to create my own algorithms from scratch just yet. Thank you!
Here's some fairly simple code that scans a string for adjacent repeating subsequences. Set minrun to the length of the smallest subsequences that you want to check. For each match, the code prints the starting index of the first subsequence, the length of the subsequence, and the subsequence itself.
data = [
    '1100100100000010',
    '1001001000000110',
    '0010010000001100',
    '0100100000011011',
]
data = ''.join(data)
minrun = 3
lendata = len(data)
for runlen in range(minrun, lendata // 2):
    i = 0
    while i < lendata - runlen * 2:
        s1 = data[i:i + runlen]
        s2 = data[i + runlen:i + runlen * 2]
        if s1 == s2:
            print(i, runlen, s1)
            i += runlen
        else:
            i += 1
output
1 3 100
4 3 100
8 3 000
15 3 010
18 3 010
23 3 000
32 3 001
38 3 000
47 3 001
53 3 000
17 15 001001000000110
32 15 001001000000110
Note that we get the same sequence of length 3 at index 15 and 18 = 15 + 3 : 010; that indicates that there are 3 adjacent copies of 010. Similarly, there are 3 adjacent copies of the sequence at index 17 of length 15.
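Since the question asks for results that can be called upon later, the same scan can be wrapped in a function that returns the matches instead of printing them. This is a sketch rather than the code above verbatim: the name `find_adjacent_repeats` is made up, and the loop bounds are made inclusive (an assumption about the intent) so that a repeat ending exactly at the end of the string is also considered.

```python
def find_adjacent_repeats(rows, minrun=3):
    """Return (start, length, subsequence) for each adjacent repeated run."""
    data = ''.join(rows)
    lendata = len(data)
    matches = []
    for runlen in range(minrun, lendata // 2 + 1):
        i = 0
        while i <= lendata - runlen * 2:
            s1 = data[i:i + runlen]
            s2 = data[i + runlen:i + runlen * 2]
            if s1 == s2:
                matches.append((i, runlen, s1))
                i += runlen  # jump to the start of the second copy
            else:
                i += 1
    return matches


matches = find_adjacent_repeats(['abcab', 'cx'], minrun=3)
```

When all rows have the same length, `divmod(start, len(rows[0]))` converts a start offset back into a (row, column) position in the original list of sub-lists.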
I am researching how Python implements dictionaries. The dictionary implementation performs pseudo-random probing for an empty dictionary slot using the equation
j = ((j*5) + 1) % 2**i
which is explained here.
I have read this question, How are Python's Built In Dictionaries Implemented?, and basically understand how dictionaries are implemented.
What I don't understand is why/how the equation:
j = ((j*5) + 1) % 2**i
cycles through all the remainders of 2**i. For instance, if i = 3 for a total starting size of 8. j goes through the cycle:
0
1
6
7
4
5
2
3
0
if the starting size is 16, it would go through the cycle:
0 1 6 15 12 13 2 11 8 9 14 7 4 5 10 3 0
This is very useful for probing all the slots in the dictionary. But why does it work? Why does j = ((j*5)+1) work but not j = ((j*6)+1) or j = ((j*3)+1), both of which get stuck in smaller cycles?
I am hoping to get a more intuitive understanding of this than the equation just works and that's why they used it.
This is the same principle that pseudo-random number generators use, as Jasper hinted at, namely linear congruential generators. A linear congruential generator is a sequence that follows the relationship X_(n+1) = (a * X_n + c) mod m. From the wiki page,
The period of a general LCG is at most m, and for some choices of factor a much less than that. The LCG will have a full period for all seed values if and only if:
m and c are relatively prime.
a - 1 is divisible by all prime factors of m.
a - 1 is divisible by 4 if m is divisible by 4.
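With m = 2^i and c = 1, 5 is the smallest a greater than 1 that satisfies these requirements (a = 1 satisfies them trivially, but it just steps through the slots one by one), namely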
2^i and 1 are relatively prime.
4 is divisible by 2.
4 is divisible by 4.
Also interestingly, 5 is not the only number that satisfies these conditions. 9 will also work. Taking m to be 16, using j=(9*j+1)%16 yields
0 1 10 11 4 5 14 15 8 9 2 3 12 13 6 7
The proof for these three conditions can be found in the original Hull-Dobell paper on page 5, along with a bunch of other PRNG-related theorems that also may be of interest.
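The full-period claim is easy to check empirically. The sketch below (the helper name `probe_sequence` is made up for illustration) follows j = (a*j + c) % m from 0 until a value repeats and returns the values visited along the way:

```python
def probe_sequence(a, m, c=1, start=0):
    """Return the values visited by j = (a*j + c) % m before any repeat."""
    visited = []
    seen = set()
    j = start
    while j not in seen:
        seen.add(j)
        visited.append(j)
        j = (a * j + c) % m
    return visited


for a in (3, 5, 6, 9):
    seq = probe_sequence(a, 16)
    print(a, len(seq), seq)
```

For m = 16, the multipliers 5 and 9 (where a - 1 is divisible by 4) visit all 16 slots, while 3 and 6 stop short.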
Question: A set of numbers separated by spaces is passed as input. The program must print the largest snake sequence present in the numbers. A snake sequence is made up of adjacent numbers such that for each number, the number on the right or left is +1 or -1 of its value. If multiple snake sequences of maximum length are possible, print the snake sequence appearing in the natural input order.
Example Input/Output 1:
Input:
9 8 7 5 3 0 1 -2 -3 1 2
Output:
3 2 1 0 1
Example Input/Output 2:
Input:
-5 -4 -3 -1 0 1 4 6 5 4 3 4 3 2 1 0 2 -3 9
Output:
6 5 4 3 4 3 2 1 0 -1 0 1 2
Example Input/Output 3:
Input:
5 6 7 9 8 8
Output:
5 6 7 8 9 8
I have searched online & have only found references to find a snake sequence when a grid of numbers is given & not an array.
My Solution so far :
Create a 2D Array containing all the numbers from input as 1 value and the 2nd value being the max length sequence that can be generated starting from that number. But this doesn't always generate the max length sequence and doesn't work at all when there are 2 snakes of max length.
Assuming that the order in the original set of numbers does not matter, as seems to be the case in your question, this seems to be an instance of the Longest Path Problem, which is NP-hard.
Think of it that way: You can create a graph from your numbers, with edges between all pairs of nodes that have a difference of one. Now, the longest simple (acyclic) path in this graph is your solution. Your first example would correspond to this graph and path. (Note that there are two 1 nodes for the two ones in the input set.)
While this in itself does not solve your problem, it should help you getting started finding an algorithm to solve (or approximate) it, now that you know a better/more common name for the problem.
One algorithm works like this: Starting from each of the numbers, determine the "adjacent" numbers and do a sort of depth-first search through the graph to determine the longest path. Remember to temporarily remove the visited nodes from the graph. This has a worst-case complexity of O(2^n) 1), but apparently it's sufficient for your examples.
def longest_snake(numbers, counts, path):
    best = path
    for n in sorted(counts, key=numbers.index):
        if counts[n] > 0 and (path == [] or abs(path[-1] - n) == 1):
            counts[n] -= 1
            res = longest_snake(numbers, counts, path + [n])
            if len(res) > len(best):
                best = res
            counts[n] += 1
    return best
Example:
>>> from collections import Counter
>>> numbers = list(map(int, "9 8 7 5 3 0 1 -2 -3 1 2".split()))
>>> longest_snake(numbers, Counter(numbers), [])
[3, 2, 1, 0, 1]
Note that this algorithm will reliably find a maximum "snake" sequence, using no number more often than allowed. However, it may not find the specific sequence that's expected as the output, i.e. "the snake sequence appearing in the natural input order", whatever that's supposed to mean.
To get closer to the "natural order", you might try the numbers in the same order as they appear in the input (as I did with sorted), but that does not work perfectly, either. Anyway, I'm sure you can figure out the rest by yourself.
1) In this special case, the graph has a branching factor of 2, thus O(2^n); in the more general case, the complexity would be closer to O(n!).