Splitting math sums with python - python

This is a simple question that has been bothering me for a while now.
I am attempting to rewrite my code to be parallel, and in the process I need to split up a sum to be done on multiple nodes and then add those small sums together. The piece that I am working with is this:
def pia(n, i):
    k = 0
    lsum = 0
    while k < n:
        p = (n-k)
        ld = (8.0*k+i)
        ln = pow(16.0, p, ld)
        lsum += (ln/ld)
        k += 1
    return lsum
where n is the limit and i is an integer. Does anyone have some hints on how to split this up and get the same result in the end?
Edit: For those asking, I'm not using pow() but a custom version to do it efficiently with floating point:
def ssp(b, n, m):
    ssp = 1
    while n>0:
        if n % 2 == 1:
            ssp = (b*ssp) % m
        b = (b**2) % m
        n = n // 2
    return ssp

Since the only variable that's used from one pass to the next is k, and k just increments by one each time, it's easy to split the calculation.
If you also pass k into pia, then you'll have definable starting and ending points, and you can split this up into as many pieces as you want and, at the end, add all the results together. So something like:
# instead of pia(20000, i), use pia(n, i, k) and run
result = pia(20000, i, 10000) + pia(10000, i, 0)
Also, since n is used to both set the limits and in the calculation directly, these two uses need to be split.
# ssp() is the floating-point modular exponentiation helper defined in the question
def pia(nlimit, ncalc, i, k):
    lsum = 0
    while k < nlimit:
        p = ncalc-k
        ld = 8.0*k+i
        ln = ssp(16., p, ld)
        lsum += ln/ld
        k += 1
    return lsum
if __name__=="__main__":
    i, ncalc = 5, 10
    print(pia(10, ncalc, i, 0))
    print(pia(5, ncalc, i, 0) + pia(10, ncalc, i, 5))

Looks like I found a way. In the sum, I had each node calculate a portion in round-robin fashion (e.g. node 1 calculates k=1, node 2 k=2, node 3 k=3, node 4 k=4, node 1 k=5, ...) and then gathered the partial sums and added them.
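A minimal sketch of that round-robin split, run in a single process so it can be checked without a cluster. The names pia_strided, rank and size are hypothetical (they are not from the original code); each "node" takes every size-th value of k, and the partial sums are added at the end. Because floating-point addition is not associative, the combined result can differ from the sequential sum in the last few bits.

```python
def ssp(b, n, m):
    """Square-and-multiply: b**n mod m, kept in floating point."""
    result = 1.0
    while n > 0:
        if n % 2 == 1:
            result = (b * result) % m
        b = (b * b) % m
        n //= 2
    return result


def pia_strided(n, i, rank, size):
    """Partial sum over k = rank, rank + size, rank + 2*size, ... (k < n)."""
    lsum = 0.0
    for k in range(rank, n, size):
        ld = 8.0 * k + i
        lsum += ssp(16.0, n - k, ld) / ld
    return lsum


n, i, size = 1000, 5, 4
# One "node" doing all the work is the sequential reference.
full = pia_strided(n, i, 0, 1)
# Four "nodes", each taking every fourth k, then a final gather-and-add.
parts = sum(pia_strided(n, i, rank, size) for rank in range(size))
print(full, parts)
```

With a real message-passing setup, each node would call pia_strided with its own rank and the results would be combined with a reduction step.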

Related

Absolute value of a difference in gurobi with abs_() not working

I've got a problem with a gurobi program which is supposed to find a certain number of distinct shortest paths in a graph, each with a length not exceeding maxLength, by using an LP. To make sure that the different paths are distinct, I tried to sum up the number of arcs where one path differs from another. y[a,i,j] should be one if paths i and j differ in arc a and zero otherwise.
I tried to achieve that by taking the difference between x[a,i] and x[a,j] at every arc and then requiring the sum over all arcs, for every combination of i and j, to be at least one. Everything above that is just the constraints for a regular min cost flow. Somehow my problem is infeasible for every test instance if I want more than 1 path. Any ideas? Thanks in advance.
def findXShortestPaths(V, A, pred, succ, start, end, cost, amount, maxLength, origin, destination):
    model = Model("Shortest Path")
    I = range(amount)
    x = model.addVars(A, I, vtype = GRB.BINARY, name = "x")
    y = model.addVars(A, I, I, vtype = GRB.INTEGER, name = "y")
    z = model.addVars(A,I,I,vtype=GRB.BINARY,name="z")
    model.setObjective(quicksum(cost[a] * x[a, i] for a in A for i in I), GRB.MINIMIZE)
    model.addConstrs(quicksum(x[a,i] for a in pred[v]) - quicksum(x[a,i] for a in succ[v]) == 0 for i in I for v in V if v != origin and v != destination)
    model.addConstrs(quicksum(x[a,i] for a in succ[origin]) == 1 for i in I)
    model.addConstrs(quicksum(x[a,i] for a in pred[destination]) == 1 for i in I)
    model.addConstrs(x[a,i] + x[b,i] <= 1 for i in I for a in A for b in A if end[a] == start[b] and end[b] == start[a])
    model.addConstrs(y[a,i,j]==x[a,i]-x[a,j] for a in A for i in I for j in I)
    model.addConstrs(z[a,i,j]== abs_(y[a,i,j]) for a in A for i in I for j in I)
    model.addConstrs(quicksum(z[a,i,j] for a in A) >= 1 for i in I for j in I if i != j)
    model.addConstrs(quicksum(x[a,i]*cost[a] for a in A) <= maxLength for i in I)
    model.optimize()
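One possible cause, not verified against the asker's instances: Gurobi variables have a default lower bound of 0, so y[a,i,j] cannot take the value -1. Since the constraint y[a,i,j] == x[a,i] - x[a,j] is posted for both (i,j) and (j,i), both differences would have to be non-negative, which forces x[a,i] == x[a,j] on every arc. The sketch below uses plain Python rather than gurobipy (feasible_pairs is a hypothetical helper) and enumerates which binary pairs survive for a given lower bound on y.

```python
from itertools import product


def feasible_pairs(y_lower_bound):
    """Binary (x_i, x_j) pairs for which y_ij = x_i - x_j and
    y_ji = x_j - x_i both respect the lower bound on y."""
    pairs = []
    for xi, xj in product((0, 1), repeat=2):
        y_ij = xi - xj
        y_ji = xj - xi
        if y_ij >= y_lower_bound and y_ji >= y_lower_bound:
            pairs.append((xi, xj))
    return pairs


print(feasible_pairs(0))   # default bound: only pairs with x_i == x_j remain
print(feasible_pairs(-1))  # bound of -1: all four combinations are allowed
```

If this is indeed the problem, passing lb=-1 when creating y with model.addVars should let the paths differ.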

Finding first pair of numbers in array that sum to value

I'm trying to solve the following Codewars problem: https://www.codewars.com/kata/sum-of-pairs/train/python
Here is my current implementation in Python:
def sum_pairs(ints, s):
    right = float("inf")
    n = len(ints)
    m = {}
    dup = {}
    for i, x in enumerate(ints):
        if x not in m.keys():
            m[x] = i # Track first index of x using hash map.
        elif x in m.keys() and x not in dup.keys():
            dup[x] = i
    for x in m.keys():
        if s - x in m.keys():
            if x == s-x and x in dup.keys():
                j = m[x]
                k = dup[x]
            else:
                j = m[x]
                k = m[s-x]
            comp = max(j,k)
            if comp < right and j!= k:
                right = comp
    if right > n:
        return None
    return [s - ints[right],ints[right]]
The code seems to produce correct results; however, the input can be an array with up to 10 000 000 elements, so the execution times out for large inputs. I need help with optimizing/modifying the code so that it can handle sufficiently large arrays.
Your code is inefficient for the large-list test cases, so it gives a timeout error. Instead you can do:
def sum_pairs(lst, s):
    seen = set()
    for item in lst:
        if s - item in seen:
            return [s - item, item]
        seen.add(item)
We put the values in seen until we find a value that produces the specified sum with one of the seen values.
For more information, see: Reference link
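A short usage sketch of the set-based approach, with sample inputs chosen here for illustration. Because the list is scanned left to right and the function returns as soon as a partner has been seen, the pair it returns is the one whose second element appears earliest, and it returns None when no pair exists.

```python
def sum_pairs(lst, s):
    """Return the first pair (by position of its second element) summing to s."""
    seen = set()
    for item in lst:
        if s - item in seen:
            return [s - item, item]
        seen.add(item)
    return None  # no pair adds up to s


print(sum_pairs([10, 5, 2, 3, 7, 5], 10))  # 3 and 7 complete before the second 5
print(sum_pairs([5, 6, 5, 8], 14))
print(sum_pairs([1, 2, 3], 100))
```

Each element is looked up and inserted once, so the running time is linear in the length of the list on average.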
Maybe this code:
def sum_pairs(lst, s):
    c = 0
    while c<len(lst)-1:
        if c != len(lst)-1:
            x= lst[c]
            spam = c+1
            while spam < len(lst):
                nxt= lst[spam]
                if nxt + x== s:
                    return [x, nxt]
                spam += 1
        else:
            return None
        c +=1
lst = [5, 6, 5, 8]
s = 14
print(sum_pairs(lst, s))
Output:
[6, 8]
This answer unfortunately still times out, even though it should run in O(n log n) (it is dominated by the sort; the rest of the algorithm runs in O(n)). I'm not sure how to obtain a better complexity than this, but I thought I might put the idea out there.
def sum_pairs(ints, s):
    ints_with_idx = enumerate(ints)
    # Sort the (index, value) pairs by value
    ints_with_idx = sorted(ints_with_idx, key=lambda pair: pair[1])
    diff = 1000000
    l = 0
    r = len(ints) - 1
    # Indexes of the sum operands in sorted array
    lSum = 0
    rSum = 0
    while l < r:
        # Compute the absolute difference between the current sum and the desired sum
        # (named total so the built-in sum() is not shadowed)
        total = ints_with_idx[l][1] + ints_with_idx[r][1]
        absDiff = abs(total - s)
        if absDiff < diff:
            # Update the best difference
            lSum = l
            rSum = r
            diff = absDiff
        elif total > s:
            # Decrease the large value
            r -= 1
        else:
            # Test to see if the indexes are better (more to the left) for the same difference
            if absDiff == diff:
                rightmostIdx = max(ints_with_idx[l][0], ints_with_idx[r][0])
                if rightmostIdx < max(ints_with_idx[lSum][0], ints_with_idx[rSum][0]):
                    lSum = l
                    rSum = r
            # Increase the small value
            l += 1
    # Retrieve indexes of sum operands
    aSumIdx = ints_with_idx[lSum][0]
    bSumIdx = ints_with_idx[rSum][0]
    # Retrieve values of operands for sum in correct order
    aSum = ints[min(aSumIdx, bSumIdx)]
    bSum = ints[max(aSumIdx, bSumIdx)]
    if aSum + bSum == s:
        return [aSum, bSum]
    else:
        return None

Optimal Search Tree Using Python - Code Analysis

First of all, sorry about the naive question, but I couldn't find help elsewhere.
I'm trying to create an Optimal Search Tree using Dynamic Programming in Python that receives two lists (a set of keys and a set of frequencies) and returns two answers:
1 - The smallest path cost.
2 - The generated tree for that smallest cost.
I basically need to create a tree organized with the most accessed items on top (the most accessed item is the root), and return the smallest path cost for that tree, using the Dynamic Programming solution.
I have implemented the following code in Python:
def optimalSearchTree(keys, freq, n):
    #Create an auxiliary 2D matrix to store results of subproblems
    cost = [[0 for x in xrange(n)] for y in xrange(n)]
    #For a single key, cost is equal to frequency of the key
    #for i in xrange (0,n):
    #    cost[i][i] = freq[i]
    # Now we need to consider chains of length 2, 3, ... .
    # L is chain length.
    for L in xrange (2,n):
        for i in xrange(0,n-L+1):
            j = i+L-1
            cost[i][j] = sys.maxint
            for r in xrange (i,j):
                if (r > i):
                    c = cost[i][r-1] + sum(freq, i, j)
                elif (r < j):
                    c = cost[r+1][j] + sum(freq, i, j)
                elif (c < cost[i][j]):
                    cost[i][j] = c
    return cost[0][n-1]
def sum(freq, i, j):
    s = 0
    k = i
    for k in xrange (k,j):
        s += freq[k]
    return s
keys = [10,12,20]
freq = [34,8,50]
n=sys.getsizeof(keys)/sys.getsizeof(keys[0])
print(optimalSearchTree(keys, freq, n))
I'm trying to output the answer 1. The smallest cost for that tree should be 142 (the value stored on the Matrix Position [0][n-1], according to the Dynamic Programming solution). But unfortunately it's returning 0. I couldn't find any issues in that code. What's going wrong?
You have several very questionable statements in your code, definitely inspired by C/Java programming practices. For instance,
keys = [10,12,20]
freq = [34,8,50]
n=sys.getsizeof(keys)/sys.getsizeof(keys[0])
I think you expect this to calculate the number of items in the list. However, n is not 3:
sys.getsizeof(keys)/sys.getsizeof(keys[0])
3.142857142857143
What you need is this:
n = len(keys)
One more finding: elif (r < j) is always True when it is reached, because r is in the range between i (inclusive) and j (exclusive). As a result, the elif (c < cost[i][j]) condition is never checked, so the matrix cost is never updated in the loop. That's why you always end up with a 0.
Another suggestion: do not overwrite the built-in function sum(). Your namesake function calculates the sum of all items in a slice of a list:
sum(freq[i:j])
import sys
def optimalSearchTree(keys, freq):
    #Create an auxiliary 2D matrix to store results of subproblems
    n = len(keys)
    cost = [[0 for x in range(n)] for y in range(n)]
    storeRoot = [[0 for i in range(n)] for i in range(n)]
    #For a single key, cost is equal to frequency of the key
    for i in range (0,n):
        cost[i][i] = freq[i]
    # Now we need to consider chains of length 2, 3, ... .
    # L is chain length.
    for L in range (2,n+1):
        for i in range(0,n-L+1):
            j = i + L - 1
            cost[i][j] = sys.maxsize
            for r in range (i,j+1):
                c = (cost[i][r-1] if r > i else 0)
                c += (cost[r+1][j] if r < j else 0)
                c += sum(freq[i:j+1])
                if (c < cost[i][j]):
                    cost[i][j] = c
                    storeRoot[i][j] = r
    return cost[0][n-1], storeRoot
if __name__ == "__main__" :
    keys = [10,12,20]
    freq = [34,8,50]
    print(optimalSearchTree(keys, freq))
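The question also asks for the generated tree. A sketch of how it could be rebuilt from the table of chosen roots follows; optimal_search_tree and build_tree are hypothetical names, and the tree is represented as nested (key, left, right) tuples purely for illustration. One assumption differs from the code above: the diagonal entries root[i][i] are set to i, because a single-key range must be rooted at that key for the reconstruction to work.

```python
import sys


def optimal_search_tree(keys, freq):
    """Same DP as above, but also records the root of every single-key range."""
    n = len(keys)
    cost = [[0] * n for _ in range(n)]
    root = [[0] * n for _ in range(n)]
    for i in range(n):
        cost[i][i] = freq[i]
        root[i][i] = i  # assumption: needed so the tree can be rebuilt
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            cost[i][j] = sys.maxsize
            total = sum(freq[i:j + 1])
            for r in range(i, j + 1):
                c = total
                if r > i:
                    c += cost[i][r - 1]
                if r < j:
                    c += cost[r + 1][j]
                if c < cost[i][j]:
                    cost[i][j] = c
                    root[i][j] = r
    return cost[0][n - 1], root


def build_tree(keys, root, i, j):
    """Return (key, left_subtree, right_subtree), or None for an empty range."""
    if i > j:
        return None
    r = root[i][j]
    return (keys[r],
            build_tree(keys, root, i, r - 1),
            build_tree(keys, root, r + 1, j))


keys = [10, 12, 20]
freq = [34, 8, 50]
best_cost, root = optimal_search_tree(keys, freq)
tree = build_tree(keys, root, 0, len(keys) - 1)
print(best_cost)
print(tree)
```

For the sample data the most frequent key, 20, ends up at the root, with 10 as its left child and 12 below 10.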

Faster algorithm for finding number of paths between two nodes

I am trying to answer a question on an online judge in Python, but I am exceeding both the time limit and memory limit. The question is pretty much asking for the number of all paths from a start node to an end node. Full question specifications can be seen here.
This is my code:
import sys
lines = sys.stdin.read().strip().split('\n')
n = int(lines[0])
dict1 = {}
for i in xrange(1, n+1):
    dict1[i] = []
for i in xrange(1, len(lines) - 1):
    numbers = map(int, lines[i].split())
    num1 = numbers[0]
    num2 = numbers[1]
    dict1[num2].append(num1)
def pathfinder(start, graph, count):
    new = []
    if start == []:
        return count
    for i in start:
        numList = graph[i]
        for j in numList:
            if j == 1:
                count += 1
            else:
                new.append(j)
    return pathfinder(new, graph, count)
print pathfinder([n], dict1, 0)
The code starts at the end node and works its way up to the top by exploring all neighboring nodes. I essentially made a breadth-first search algorithm, but it's taking up too much space and time. How can I improve this code to make it more efficient? Is my approach wrong, and how should I fix it?
Since the graph is acyclic, there is a topological ordering, which we can immediately see to be 1, 2, ..., n. So we can use dynamic programming the same way it is used to solve the longest path problem. In a list paths, the element paths[i] stores how many paths there are from 1 to i. The update is simple: for each edge (i,j), taking i in topological order, we do paths[j] += paths[i].
from collections import defaultdict
graph = defaultdict(list)
n = int(input())
while True:
    tokens = input().split()
    a, b = int(tokens[0]), int(tokens[1])
    if a == 0:
        break
    graph[a].append(b)
paths = [0] * (n+1)
paths[1] = 1
for i in range(1, n+1):
    for j in graph[i]:
        paths[j] += paths[i]
print(paths[n])
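The same dynamic program, wrapped in a function that takes the edges as a list so it can be tried without the judge's input format. count_paths and the sample edges are illustrative names and data; as in the answer, the sketch assumes every edge (a, b) has a < b, so that 1, 2, ..., n is a valid topological order.

```python
from collections import defaultdict


def count_paths(n, edges):
    """Count paths from node 1 to node n in a DAG whose edges all go
    from a lower-numbered node to a higher-numbered one."""
    graph = defaultdict(list)
    for a, b in edges:
        graph[a].append(b)
    paths = [0] * (n + 1)
    paths[1] = 1  # one (empty) path from node 1 to itself
    for i in range(1, n + 1):       # visit nodes in topological order
        for j in graph[i]:
            paths[j] += paths[i]    # every path to i extends along (i, j)
    return paths[n]


edges = [(1, 2), (1, 3), (2, 4), (3, 4), (1, 4), (2, 3)]
# Paths: 1-4, 1-2-4, 1-3-4 and 1-2-3-4.
print(count_paths(4, edges))
```

Each node and each edge is handled once, so the work is proportional to the number of nodes plus the number of edges.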
Note that what you are implementing is not actually BFS, since you don't mark which vertices you've visited, which makes your start list grow out of proportion.
Test with the graph
for i in range(1, n+1):
    dict1[i] = list(range(i-1, 0, -1))
If you print the size of start, you can see that the max value it gets for a given n grows like binomial(n, floor(n/2)), which is on the order of 2^n/sqrt(n). Note also that BFS is not what you want, since it is not possible to count the number of paths in that way.
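A small sketch that measures this growth on the test graph. max_frontier is a hypothetical helper that mirrors the question's pathfinder, written as a loop instead of recursion. On this particular graph, an entry at depth k corresponds to a strictly decreasing chain of k nodes chosen from 2..n-1, so the list at depth k holds C(n-2, k) entries and the largest list is a central binomial coefficient.

```python
from math import comb


def max_frontier(n):
    """Run the question's level-by-level expansion on the complete DAG
    and return (largest list size seen, number of paths found)."""
    graph = {i: list(range(i - 1, 0, -1)) for i in range(1, n + 1)}
    start, count, largest = [n], 0, 1
    while start:
        new = []
        for i in start:
            for j in graph[i]:
                if j == 1:
                    count += 1      # reached node 1: one complete path
                else:
                    new.append(j)   # duplicates are kept, as in the question
        largest = max(largest, len(new))
        start = new
    return largest, count


for n in (8, 12, 16):
    print(n, max_frontier(n), comb(n - 2, (n - 2) // 2))
```

The path count itself doubles with every extra node, which the dynamic programming approach obtains without ever materializing these lists.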
import sys
from collections import defaultdict
def build_matrix(filename, x):
    # A[i] stores number of paths from node x to node i.
    # O(n) to build parents_of_node
    parents_of_node = defaultdict(list)
    with open(filename) as infile:
        num_nodes = int(infile.readline())
        A = [0] * (num_nodes + 1) # A[0] is dummy variable. Not used.
        for line in infile:
            # Strip the trailing newline before comparing with the terminator line
            if line.strip() == "0 0":
                break
            u, v = map(int, line.strip().split())
            parents_of_node[v].append(u)
            # Initialize all direct descendants of x to 1
            if u == x:
                A[v] = 1
    # Number of paths from x to i = sum(number of paths from x to parent of i)
    for i in range(1, num_nodes + 1): # O(n)
        A[i] += sum(A[p] for p in parents_of_node[i]) # O(max fan-in of graph), assuming O(1) for accessing dict.
    # Total time complexity to build A is O(n * (max_fan-in of graph))
    return A
def main():
    filename = sys.argv[1]
    x = 1 # Find number of paths from x
    y = 4 # to y
    A = build_matrix(filename, x)
    print(A[y])
if __name__ == "__main__":
    main()
What you are doing is a DFS (not a BFS) in that code...
Here's a link to a good solution...
EDITED:
Use this approach instead...
http://www.geeksforgeeks.org/find-paths-given-source-destination/

Longest arithmetic progression with a hole

The longest arithmetic progression subsequence problem is as follows. Given an array of integers A, devise an algorithm to find the longest arithmetic progression in it. In other words find a sequence i1 < i2 < … < ik, such that A[i1], A[i2], …, A[ik] form an arithmetic progression, and k is maximal. The following code solves the problem in O(n^2) time and space. (Modified from http://www.geeksforgeeks.org/length-of-the-longest-arithmatic-progression-in-a-sorted-array/ . )
#!/usr/bin/env python
import sys
def arithmetic(arr):
    n = len(arr)
    if (n<=2):
        return n
    llap = 2
    L = [[0]*n for i in xrange(n)]
    for i in xrange(n):
        L[i][n-1] = 2
    for j in xrange(n-2,0,-1):
        i = j-1
        k = j+1
        while (i >=0 and k <= n-1):
            if (arr[i] + arr[k] < 2*arr[j]):
                k = k + 1
            elif (arr[i] + arr[k] > 2*arr[j]):
                L[i][j] = 2
                i -= 1
            else:
                L[i][j] = L[j][k] + 1
                llap = max(llap, L[i][j])
                i = i - 1
                k = j + 1
        while (i >=0):
            L[i][j] = 2
            i -= 1
    return llap
arr = [1,4,5,7,8,10]
print arithmetic(arr)
This outputs 4.
However I would like to be able to find arithmetic progressions where up to one value is missing. So if arr = [1,4,5,8,10,13] I would like it to report that there is a progression of length 5 with one value missing.
Can this be done efficiently?
Adapted from my answer to Longest equally-spaced subsequence. n is the length of A, and d is the range, i.e. the largest item minus the smallest item.
A = [1, 4, 5, 8, 10, 13] # in sorted order
Aset = set(A)
for d in range(1, 13):
    already_seen = set()
    for a in A:
        if a not in already_seen:
            b = a
            count = 1
            while b + d in Aset:
                b += d
                count += 1
                already_seen.add(b)
            # if there is a hole to jump over:
            if b + 2 * d in Aset:
                b += 2 * d
                count += 1
                while b + d in Aset:
                    b += d
                    count += 1
                    # don't record in already_seen here
            print("found %d items in %d .. %d" % (count, a, b))
            # collect here the largest 'count'
I believe that this solution is still O(n*d), simply with larger constants than looking without a hole, despite the two "while" loops inside the two nested "for" loops. Indeed, fix a value of d: then we are in the "a" loop that runs n times; but each of the inner two while loops run at most n times in total over all values of a, giving a complexity O(n+n+n) = O(n) again.
Like the original, this solution is adaptable to the case where you're not interested in the absolute best answer but only in subsequences with a relatively small step d: e.g. n might be 1'000'000, but you're only interested in subsequences of step at most 1'000. Then you can make the outer loop stop at 1'000.
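A sketch of the same scan wrapped in a function that collects the best result and accepts a cap on the step. The name longest_ap_with_hole, the max_step parameter and the returned tuple layout are illustrative choices, not part of the original answer; the input is assumed to be a sorted list of distinct integers, and the count is the number of values actually present (the hole itself is not counted).

```python
def longest_ap_with_hole(A, max_step=None):
    """Return (count, first, last, step) for the longest progression in A
    that skips at most one missing value. A must be sorted and distinct."""
    if not A:
        return (0, None, None, None)
    Aset = set(A)
    if max_step is None:
        max_step = A[-1] - A[0]  # the range d from the answer
    best = (1, A[0], A[0], 0)
    for d in range(1, max_step + 1):
        already_seen = set()
        for a in A:
            if a in already_seen:
                continue
            b, count = a, 1
            while b + d in Aset:
                b += d
                count += 1
                already_seen.add(b)
            # if there is a hole to jump over:
            if b + 2 * d in Aset:
                b += 2 * d
                count += 1
                while b + d in Aset:
                    b += d
                    count += 1
                    # not recorded in already_seen, as in the answer
            if count > best[0]:
                best = (count, a, b, d)
    return best


print(longest_ap_with_hole([1, 4, 5, 8, 10, 13]))
print(longest_ap_with_hole([1, 4, 7, 13, 16], max_step=5))
```

For the question's example this finds the four present values 1, 4, 10, 13 with step 3, i.e. a progression of length 5 with the value 7 missing.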
