I developed a SCIP/MIP model using LP relaxation which relies on branching on 0-1 variables. However, it is quite inefficient as I have not figured out how to use relevant SCIP callbacks.
Here is my code:
isMIP = False
while True:
model.optimize()
if isMIP:
print("Optimal value:", model.getObjVal())
break
else:
print("Intermediate value:", model.getObjVal())
x,y,u = model.data
fracvars = []
for j in y:
w = model.getVal(y[j])
if w > 0.001 and w < 0.999:
fracvars.append([j,abs(w-0.5)])
if fracvars:
fracvars.sort(key = itemgetter(1))
min_var, min_value = min([(val[0],val[1]) for val in fracvars])
model.freeTransform()
model.chgVarType(y[min_var],"I") # the very inefficient part...
print("Integer constraint on y[%s]" % min_var)
else:
isMIP = True
Could anyone help me speed up the code? Many thanks.
Please see http://scip.zib.de/doc-5.0.1/html/BRANCH.php for how to write a branching rule and http://scip.zib.de/doc-5.0.1/html/SEPA.php for cutting plane separators (I am still not sure what you want to do exactly...). This is the description for C plugins, but the equivalents should exist in PySCIPOpt or should be easy to add if you know what you need.
I have been tasked with implementing a local (non-interactive) differential privacy mechanism. I am working with a large database of census data. The only sensitive attribute is "Number of children" which is a numerical value ranging from 0 to 13.
I decided to go with the Generalized Random Response mechanism as it seems like the most intuitive method. This mechanism is described here and presented here.
After loading each value into an array (ignoring the other attributes for now), I perform the perturbation as follows.
d = 14 # values may range from 0 to 13
eps = 1 # epsilon level of privacy
p = (math.exp(eps)/(math.exp(eps)+d-1))
q = 1/(math.exp(eps)+d-1)
p_dataset = []
for row in dataset:
coin = random.random()
if coin <= p:
p_dataset.append(row)
else:
p_dataset.append(random.randint(0,13))
Unless I have misinterpreted the definition, I believe this will guarantee epsilon differential privacy on p_dataset.
However, I am having difficulty understanding how the aggregator must interpret this dataset. Following the presentation above, I attempted to implement a method for estimating the number of individuals who answered a particular value.
v = 0 # we are estimating the number of individuals in the dataset who answered 0
nv = 0 # number of users in the perturbed dataset who answered the value
for row in p_dataset:
if row == v:
nv += 1
Iv = nv * p + (n - nv) * q
estimation = (Iv - (n*q)) / (p-q)
I do not know if I have correctly implemented the method described as I do not completely understand what it is doing, and cannot find a clear definition.
Regardless, I used this method to estimate the total amount of individuals who answered each value in the dataset with a value for epsilon ranging from 1 to 14, and then compared this to the actual values. The results are below (please excuse the formatting).
As you can see, the utility of the dataset suffers greatly for low values of epsilon. Additionally, when executed multiple times, there was relatively little deviation in estimations, even for small values of epsilon.
For example, when estimating the number of participants who answered 0 and using an epsilon of 1, all estimations seemed to be centered around 1600, with the largest distance between estimations being 100. Considering the actual value of this query is 5969, I am led to believe that I may have implemented something incorrectly.
Is this the expected behaviour of the Generalized Random Response mechanism, or have I made a mistake in my implementation?
I think when getting a false answer, we cannot directly use p_dataset.append(random.randint(0,13)), because it contains true answer
max_v = 13
min_v = 0
for row in dataset: #row就是dataset里的真实值
coin = random.random()
if coin <= p:
p_dataset.append(row)
else:
ans = []
if row == min_v:
ans = np.arange(min_v + 1, max_v + 1).tolist()
elif row == max_v:
ans = np.arange(min_v, max_v).tolist()
else:
a = np.arange(min_v, row).tolist()
b = np.arange(row + 1, max_v + 1).tolist()
[ans.append(i) for i in a]
[ans.append(i) for i in b]
p_dataset.append(random.sample(ans, 1)) # 这其实有一点问题 应该是真实值以外的其他值 这样写还包括真实值
I am trying to solve a question on an online judge about calculating all shortest paths on a complete graph. Full problem specifications can be seen here. However, I am exceeding the memory limit required. Here is the part of the code that does Dijkstra's algorithm:
n = int(raw_input())
dict1 = [[""for i in xrange(n+1)]for j in xrange(n+1)]
edges = [0]
for i in xrange(n):
x,y = map(int, raw_input().split())
edges.append((x,y))
for i,coord1 in enumerate(edges):
for j,coord2 in enumerate(edges):
if i==j or i==0 or j==0:
continue
x1,y1 = coord1
x2,y2 = coord2
weight = (x1-x2)*(x1-x2) + (y1-y2)*(y1-y2)
dict1[i][j] = weight
dict1[j][i] = weight
x = int(raw_input())
times = []
vertices = {i:1e13 if i!= x else 0 for i in xrange(1,n+1)}
while len(vertices)>0:
minimum = min(vertices.items(), key=lambda x: x[1])[0]
currentCost = vertices[minimum]
times.append(currentCost)
del vertices[minimum]
for neighbour,newWeight in enumerate(dict1[minimum]):
if neighbour in vertices and newWeight != "":
if currentCost + newWeight < vertices[neighbour]:
vertices[neighbour] = currentCost + newWeight
The code uses the original algorithm without the priority queue because of the better time complexity. Even though this gives the right answer, I have a feeling the memory exceeding has something to do with the way I am storing the weights, considering they can be as large as 10^12. Is there another way I can store the weights that will use less memory, or is something else causing the problem?
Your problem has nothing to do with big weights (10^12 is not a big number). If you want to see that this is the case (try dividing them by some number like 1000 to see that it will fail as well).
The problem is that you do not use priority queue and this deteriorate the time complexity to O(V^2) and if you will use a priority queue, you will get O(E + V log(V)).
So implement a normal Dijkstra and will get your answer accepted.
Sorry, have not read that this is a planar graph and that it is dense. Knowing that your graph consists of 2d points, you can take advantage of the distance heuristics and use A* algorithm.
Vexed is a popular puzzle game, with many versions available (some of them GPL free software). It is very suitable for small screen devices; versions are available for Android, iOS, etc. I discovered it on the PalmOS platform.
Just for fun, I'd like to write a solver that will solve Vexed levels.
Vexed is a block-sliding puzzle game. Here are the rules in a nutshell:
0) Each level is a grid of squares, bounded by an impassible border. In any level there will be some solid squares, which are impassible. There are some number of blocks of various colors; these could be resting on the bottom border, resting on solid squares, or resting on other blocks (of a different color). Most levels are 8x8 or smaller.
1) The only action you can take is to slide a block to the left or to the right. Each square traveled by a block counts as one move.
2) There is gravity. If, after you slide a block, it is no longer resting on a solid square or another block, it will fall until it comes to rest on another block, a solid square, or the bottom border. Note that you cannot ever lift it up again.
3) Any time two or more blocks of the same color touch, they disappear. Note that chains are possible: if a supporting block disappears, blocks that rested upon it will fall, which could lead to more blocks of the same color touching and thus disappearing.
4) The goal is to make all blocks disappear in the minimum number of moves. Each level has a "par score" which tells you the minimum number of moves. (In the original PalmOS game, the "par score" wasn't necessarily the minimum, but in the Android version I play these days it is the minimum.)
Here is the SourceForge project with the source for the PalmOS version of the game:
http://sourceforge.net/projects/vexed/
I'm an experienced software developer, but I haven't done really any work in AI sort of stuff (pathfinding, problem-solving, etc.) So I'm looking for advice to get me pointed in the right direction.
At the moment, I can see two basic strategies for me to pursue:
0) Just write a brute-force solver, probably in C for the speed, that cranks through every possible solution for every game and returns a list of all solutions, best one first. Would this be a reasonable approach, or would the total number of possible moves make this too slow? I don't think any levels exist larger than 10x10.
1) Learn some AI-ish algorithms, and apply them in a clever way to solve the problem, probably using Python.
Note that the source for PalmOS Vexed includes a solver. According to the author, "The solver uses A* with pruning heuristics to find solutions."
http://www.scottlu.com/Content/Vexed.html
So, one strategy I could pursue would be to study the A* algorithm and then study the C++ code for the existing solver and try to learn from that.
I'm going to tag this with Python and C tags, but if you think I should be using something else, make your sales pitch and I'll consider it!
Here is ASCII art of a level from "Variety 25 Pack"; level 48, "Dark Lord". I am able to solve most levels but this one has, well, vexed me. Par score for this level is 25 moves, but I have not yet solved it at all!
__________
|## g####|
|## # b##|
|## # p##|
|#g ###|
|bp ###|
|p# p g |
==========
In this picture, the borders are underscores, vertical bars, and equals characters. Filled-in squares are '#'. Open spaces are space characters. Colored blocks are 'g' (green), 'b' (blue) and 'p' (purple).
By the way, I'll likely make the input file format to the solver be ASCII art of the levels, just like this but without the fussy line border characters.
Thanks for any advice!
EDIT:
I have accepted an answer. Thank you to the people who gave me answers.
This is a semi-brute-force solver. It isn't using A* but it is cutting short unprofitable branches of the tree.
It reads in a simple text file with the level data. A letter is a block, a '_' (underscore) is an open space, and a '#' is a filled-in space.
#!/usr/bin/env python
#
# Solve levels from the game Vexed.
from collections import Counter
import sys
level_blocks = set(chr(x) for x in range(ord('a'), ord('z')+1))
level_other = set(['_', '#'])
level_valid = set().union(level_blocks, level_other)
def prn_err(s='\n'):
sys.stderr.write(s)
sys.stderr.flush()
def validate_llc(llc):
if len(llc) == 0:
raise ValueError, "need at least one row of level data"
w = len(llc[0])
if w < 2:
raise ValueError, "level data not wide enough"
for i, row in enumerate(llc):
if len(row) != w:
s = "level data: row %d is different length than row 0"
raise ValueError, s % i
for j, ch in enumerate(row):
if ch not in level_valid:
s = "char '%c' at (%d, %d) is invalid" % (ch, i, j)
raise ValueError, s
class Info(object):
pass
info = Info()
info.width = 0
info.height = 0
info.spaces = set()
info.boom_blocks = set()
info.best_solution = 9999999999
info.title = "unknown"
class Level(object):
"""
Hold the state of a level at a particular move. self.parent points
to the previous state, from a previous move, so the solver builds a
tree representing the moves being considered. When you reach a solution
(a state where there are no more blocks) you can walk up the tree
back to the root, and you have the chain of moves that leads to that
solution."""
def __init__(self, x):
if isinstance(x, Level):
self.blocks = dict(x.blocks)
self.colors = dict(x.colors)
self.parent = x
self.s_move = ''
self.rank = x.rank + 1
else:
if isinstance(x, basestring):
# allow to init from a one-line "data" string
# example: "___;___;r_r"
x = x.split(';')
# build llc: list of rows, each row a list of characters
llc = [[ch for ch in row.strip()] for row in x]
llc.reverse()
info.width = len(llc[0])
info.height = len(llc)
validate_llc(llc)
# Use llc data to figure out the level, and build self.blocks
# and info.spaces. self.blocks is a dict mapping a coordinate
# tuple to a block color; info.spaces is just a set of
# coordinate tuples.
self.blocks = {}
for y in range(info.height):
for x in range(info.width):
loc = (x, y)
c = llc[y][x]
if c == '_':
# it's a space
info.spaces.add(loc)
elif c in level_blocks:
# it's a block (and the block is in a space)
self.blocks[loc] = c
info.spaces.add(loc)
else:
# must be a solid square
assert(c == '#')
# colors: map each block color onto a count of blocks.
self.colors = Counter(self.blocks.values())
# parent: points to the level instance that holds the state
# previous to the state of this level instance.
self.parent = None
# s_move: a string used when printing out the moves of a solution
self.s_move = 'initial state:'
# rank: 0 == initial state, +1 for each move
self.rank = 0
self.validate()
print "Solving:", info.title
print
sys.stdout.flush()
if self._update():
print "level wasn't stable! after updating:\n%s\n" % str(self)
def lone_color(self):
return any(count == 1 for count in self.colors.values())
def is_solved(self):
return sum(self.colors.values()) == 0
def validate(self):
if info.height == 0:
raise ValueError, "need at least one row of level data"
if info.width < 2:
raise ValueError, "level data not wide enough"
if self.lone_color():
raise ValueError, "cannot have just one of any block color"
for x, y in info.spaces:
if not 0 <= x < info.width or not 0 <= y < info.height:
raise ValueError, "Bad space coordinate: " + str(loc)
for x, y in self.blocks:
if not 0 <= x < info.width or not 0 <= y < info.height:
raise ValueError, "Bad block coordinate: " + str(loc)
if any(count < 0 for count in self.colors.values()):
raise ValueError, "cannot have negative color count!"
colors = Counter(self.blocks.values())
for k0 in [key for key in self.colors if self.colors[key] == 0]:
del(self.colors[k0]) # remove all keys whose value is 0
if colors != self.colors:
raise ValueError, "self.colors invalid!\n" + str(self.colors)
def _look(self, loc):
"""
return color at location 'loc', or '_' if empty, or '#' for a solid sqaure.
A bad loc does not raise an error; it just returns '#'.
"""
if loc in self.blocks:
return self.blocks[loc]
elif loc in info.spaces:
return '_'
else:
return '#'
def _lookxy(self, x, y):
loc = x, y
return self._look(loc)
def _board_mesg(self, mesg, loc):
x, y = loc
return "%s %c(%d,%d)" % (mesg, self._look(loc), x, y)
def _blocked(self, x, y):
return self._lookxy(x, y) != '_'
def _s_row(self, y):
return ''.join(self._lookxy(x, y) for x in xrange(info.width))
def data(self, ch_join=';'):
return ch_join.join(self._s_row(y)
for y in xrange(info.height - 1, -1, -1))
# make repr() actually print a representation
def __repr__(self):
return type(self).__name__ + "(%s)" % self.data()
# make str() work
def __str__(self):
return self.data('\n')
def _move_block(self, loc_new, loc_old):
self.blocks[loc_new] = self.blocks[loc_old]
del(self.blocks[loc_old])
def _explode_block(self, loc):
if loc in info.boom_blocks:
return
info.boom_blocks.add(loc)
color = self.blocks[loc]
self.colors[color] -= 1
def _try_move(self, loc, d):
x, y = loc
if not d in ('<', '>'):
raise ValueError, "d value '%c' invalid, must be '<' or '>'" % d
if d == '<':
x_m = (x - 1)
else:
x_m = (x + 1)
y_m = y
loc_m = (x_m, y_m)
if self._blocked(x_m, y_m):
return None # blocked, so can't move there
# Not blocked. Let's try the move!
# Make a duplicate level...
m = Level(self)
# ...try the move, and see if anything falls or explodes...
m._move_block(loc_m, loc)
m._update()
if m.lone_color():
# Whoops, we have only one block of some color. That means
# no solution can be found by considering this board.
return None
# finish the update
m.s_move = self._board_mesg("move:", loc) + ' ' + d
m.parent = self
return m
def _falls(self, loc):
x, y = loc
# blocks fall if they can, and only explode when at rest.
# gravity loop: block falls until it comes to rest
if self._blocked(x, y - 1):
return False # it is already at rest
while not self._blocked(x, y - 1):
# block is free to fall so fall one step
y -= 1
loc_m = (x, y)
self._move_block(loc_m, loc)
return True # block fell to new location
def _explodes(self, loc):
x, y = loc
exploded = False
color = self._look(loc)
# look left, right, up, and down for blocks of same color
for e_loc in [(x-1, y), (x+1, y), (x, y-1)]:
if e_loc in self.blocks and self.blocks[e_loc] == color:
self._explode_block(e_loc)
exploded = True
if exploded:
self._explode_block(loc)
return exploded
def _update(self):
c = 0
while True:
# TRICKY: sum() works on functions that return a bool!
# As you might expect, True sums as 1 and False as 0.
f = sum(self._falls(loc) for loc in self.blocks)
e = sum(self._explodes(loc) for loc in self.blocks)
for loc in info.boom_blocks:
del(self.blocks[loc])
info.boom_blocks.clear()
c += f + e
if (f + e) == 0:
# no blocks fell or exploded; board is stable, update is done
break
return c
def print_moves(self):
lst = [self]
a = self
while a.parent:
a = a.parent
lst.append(a)
lst.reverse()
for i, a in enumerate(lst):
if i:
print "Move %d of %d" % (i, len(lst) - 1)
print a.s_move
print a
print
def solve(self):
c = 0
seen = set()
solutions = []
seen.add(self.data())
q = []
if self.is_solved():
solutions.append(self)
else:
q.append(self)
while q:
a = q.pop(0)
# Show dots while solver is 'thinking' to give a progress
# indicator. Dots are written to stderr so they will not be
# captured if you redirect stdout to save the solution.
c += 1
if c % 100 == 0:
prn_err('.')
if a.rank > info.best_solution:
# We cannot beat or even match the best solution.
# No need to think any more about this possibility.
# Just prune this whole branch of the solution tree!
continue
for loc in a.blocks:
for d in ('<', '>'):
m = a._try_move(loc, d)
if not m or m.data() in seen:
continue
if m.is_solved():
if info.best_solution > a.rank:
print "\nnew best solution: %d moves" % m.rank
info.best_solution = a.rank
else:
print "\nfound another solution: %d moves" % m.rank
solutions.append(m)
else:
seen.add(m.data())
q.append(m)
print
print "Considered %d different board configurations." % c
print
solutions.sort(key=lambda a: a.rank)
for n, a in enumerate(solutions):
print "solution %d): %d moves" % (n, a.rank)
a.print_moves()
if not solutions:
print "no solutions found!"
def load_vex_file(fname):
with open(fname, "rt") as f:
s = f.next().strip()
if s != "Vexed level":
raise ValueError, "%s: not a Vexed level file" % fname
s = f.next().strip()
if not s.startswith("title:"):
raise ValueError, "%s: missing title" % fname
info.title = s[6:].lstrip() # remove "title:"
for s in f:
if s.strip() == "--":
break
return Level(f)
if __name__ == "__main__":
if len(sys.argv) == 1:
print "Usage vexed_solver <vexed_level_file.vex>"
sys.exit(1)
fname = sys.argv[1]
level = load_vex_file(fname)
level.solve()
Here is an example level file:
Vexed level
title: 25-48, "Dark Lord"
--
##_f####
##_#_c##
##_#_p##
#f___###
cp___###
p#_p_f__
On my computer, it solves "Dark Lord" in almost exactly 10 seconds, considering 14252 different board configurations. I wrote in Python 2.x instead of Python 3, because I want to try this with PyPy and see how fast it becomes.
Next I should work on applying A* to this. I guess I can make a metric like "better to move an orange block toward another orange block than away" and try to work that in. But I do want all the solutions to pop out, so maybe I'm done already. (If there are three solutions that are all the minimum number of moves, I want to see all three.)
I welcome comments on this Python program. I had fun writing it!
EDIT: I did try this with PyPy but I never updated this until now. On the computer I used with PyPy, the solver could solve the "Dark Lord" level in 10 seconds using CPython; that dropped to 4 seconds with PyPy. The cool part is that I could see the speedup as the JIT kicked in: this program prints dots as it is working, and under PyPy I can see the dots start out slower and then just accelerate. PyPy is nifty.
Studying Wikipedia may be better than studying the actual source code. A* is written out pretty clearly there. But that feels like cheating, doesn't it?
As all good ideas, A* is actually pretty obvious in retrospective. It's fun trying to work it through, and there are a few nice insights along the way. Here's how you get to it:
Write the brute-force solver. You'll need much of what you write in the more advanced versions: a game state, and a description of getting from one state to another. You'll also end up removing duplicate states. You should have a queue of some sort for states to be considered, a set of states you've already done, and structure to hold the best solution(s) found so far. And a method that takes a state from the queue and generates a state's „neighbor“ states (ones reachable from it). That's the basic structure of classical AI algorithms. Note that you're technically „generating“ or „exploring“ a huge graph here.
After that, add a simple pruning algorithm: if a state has only one block of some color left, there's no need to consider it further. See if you can come up with other pruning algorithms (i.e. ones that mark a state as „unsolvable“). A good pruning algorithm will eliminate lots of pointless states, thus justifying the time it takes to run the pruning itself.
Then, introduce a heuristic score: rank each state with a number that tells you how „good“ the state looks – about how much more solving will it take. Make your queue a priority queue. This will allow you to consider the „best looking“ states first, so the program should come up with a solution faster. But, the first solution found may not actually be the best, so to be sure that you find the best one, you still need to run the whole program.
Store the minimum cost (number of moves) that you took to get to each state. Remember to update it if you find a better path. Take the states with the lowest sum of their cost and their heuristic score first; those will more likely lead to a better solution.
And here comes A*. You need to modify your heuristic function so that it doesn't overestimate the distance to the goal, i.e. it can be lower than the number of moves you will actually need, but not higher. Then, note that if you found a solution, its heuristic score will be 0. And, any state where the sum of its cost and heuristic is more than the cost of a solution can't lead to a better solution. So, you can prune that state. But since you're taking the states in order, once you hit that threshold you can just stop and return, since all other states in the queue would be pruned as well.
All that's left now is perfecting your heuristic: it can never overestimate, but the better estimate it gives the less time A* will take. The better the heuristic, the better your results. Take care that the heuristic doesn't take so much time to complete – you wouldn't want, say, generating the solution by brute force, even though it would give the perfect answer :)
Wikipedia has some more discussion and possible improvements, if you get this far. But the best improvements you can make at this point will likely come from improving the heuristic function.
Maybe translate it into a classical planning problem (using PDDL syntax). Then you can try out some planners that are freely available.
E.g. try Fast Forward.
I'm trying to model this problem (for details on it, http://www.mpi-hd.mpg.de/personalhomes/bauke/LABS/index.php)
I've seen that the proven minimum for a sequence of 10 digits is 13. However, my application seems to be getting 12 quite frequently. This implies some kind of error in my program. Is there an obvious error in the way I've modeled those summations in this code?
def evaluate(self):
self.fitness = 10000000000 #horrible practice, I know..
h = 0
for g in range(1, len(self.chromosome) - 1):
c = self.evaluateHelper(g)
h += c**2
self.fitness = h
def evaluateHelper(self, g):
"""
Helper for evaluate function. The c sub g function.
"""
totalSum = 0
for i in range(len(self.chromosome) - g - 1):
product = self.chromosome[i] * self.chromosome[(i + g) % (len(self.chromosome))]
totalSum += product
return totalSum
I can't spot any obvious bug offhand, but you're making things really complicated, so maybe a bug's lurking and hiding somewhere. What about
def evaluateHelper(self, g):
return sum(a*b for a, b in zip(self.chromosome, self.chomosome[g:]))
this should return the same values you're computing in that subtle loop (where I think the % len... part is provably redundant). Similarly, the evaluate method seems ripe for a similar 1-liner. But, anyway...
There's a potential off-by-one issue: the formulas in the article you point to are summing for g from 1 to N-1 included -- you're using range(1, len(...)-1), whereby N-1 is excluded. Could that be the root of the problem you observe?
Your bug was here:
for i in range(len(self.chromosome) - g - 1):
The maximum value for i will be len(self.chromosome) - g - 2, because range is exclusive. Thus, you don't consider the last pair. It's basically the same as your other bug, just in a different place.