b-tree pseudocode confusion

b-tree pseudocode confusion - python

I am trying to implement b-tree from pseudocode, here is some explanation about b-tree:
http://cs.utsa.edu/~dj/ut/utsa/cs3343/lecture17.html
http://www.di.ufpb.br/lucidio/Btrees.pdf
http://homepages.ius.edu/RWISMAN/C455/html/notes/Chapter18/BT-Basics.htm
So i want to implement the code in python, but only one thing is not clear for me, what is the purpose of "t" in this code:
def bTreeInsert(T, k): #k is the key
r = T.root #r - root node
if r.n == 2*t - 1: #t = ???
s = AlocateNode()
T.root = s
s.leaf = False
s.n = 0
s.c[1] = r
bTreeSplitChildren(s, 1)
bTreeInsertNonfull(s, k)
else:
bTreeInsertNonfull(r, l)
Any ideas?

t is the minimum degree of the tree, i.e. the minimum number of children each node in the tree must have (and also half of the maximum number of children each node may have).

Related

Dijkstra variation works most of the time, but gives wrong output on a test that I do not have access to

first of all let me pretext this by the fact that this is for an university homework, so I only want hints and not solutions.
The problem consists of finding the path from s to t that has the smallest maximum amount of snow on one of the edges, while choosing shortest distance for tie breaking. (i. e. if multiple edges with the same snow amount are considered, we take the one with the shortest length). For any two vertices, there is at most one edge that connects them.
n - number of vertices
m - number of edges
s - source
t - target
a - list of edge beginnings
b - list of edge ends
w - list of lengths corresponding to the edges
c - list of amounts of snow corresponding to the edges
I would really appreciate the help as I've been racking my head over this for a long time.
I tried this.
import heapq
# # intersections, # roads, start list, end list, len list, snow list
def snowy_road(n, m, s, t, a, b, w, c):
# Initialize distances to all nodes as infinite except for the source node, which has distance 0
distancesc = [float("inf")] * n
distancesc[s-1] = 0
distancesw = [float("inf")] * n
distancesw[s-1] = 0
# Create a set to store the nodes that have been visited
visited = set()
# Create a priority queue to store the nodes that still need to be processed
# We will use the distance to the source node as the priority
# (i.e. the next node to process will be the one with the smallest distance to the source)
pq = []
pq.append((0, 0, s-1))
while pq:
# Pop the node with the smallest distance from the priority queue
distc, distw, u = heapq.heappop(pq)
# Skip the node if it has already been visited
if u in visited:
continue
# Mark the node as visited
visited.add(u)
# Update the distances to all adjacent nodes
for i in range(m):
if a[i] == u+1:
v = b[i]-1
elif b[i] == u+1:
v = a[i]-1
else:
continue
altc = c[i]
if distc != float("inf"):
altc = max(distc, altc)
altw = distw + w[i]
if altc < distancesc[v]:
distancesc[v] = altc
distancesw[v] = altw
heapq.heappush(pq, (altc, altw, v))
elif altc == distancesc[v] and altw < distancesw[v]:
distancesw[v] = altw
heapq.heappush(pq, (altc, altw, v))
# Return the distance to the target node
return distancesc[t-1], distancesw[t-1]

If anyone is wondering, I solved this using two consequent Dijkstras. The first one finds the value of the shortest bottleneck path, the second one takes this value into account and doest a shortest path normal Dijkstra on the graph using w, while considering only edges that have a snow coverage <= than the bottleneck we found. Thank you to everyone that responded!

Time limit exceeded when finding tree height

this is my code to find the height of a tree of up to 10^5 nodes. May I know why I get the following error?
Warning, long feedback: only the beginning and the end of the feedback message is shown, and the middle was replaced by " ... ". Failed case #18/24: time limit exceeded
Input:
100000
Your output:
stderr:
(Time used: 6.01/3.00, memory used: 24014848/2147483648.)
Is there a way to speed up this algo?
This is the exact problem description:
Problem Description
Task. You are given a description of a rooted tree. Your task is to compute and output its height. Recall
that the height of a (rooted) tree is the maximum depth of a node, or the maximum distance from a
leaf to the root. You are given an arbitrary tree, not necessarily a binary tree.
Input Format. The first line contains the number of nodes 𝑛. The second line contains 𝑛 integer numbers
from −1 to 𝑛 − 1 — parents of nodes. If the 𝑖-th one of them (0 ≤ 𝑖 ≤ 𝑛 − 1) is −1, node 𝑖 is the root,
otherwise it’s 0-based index of the parent of 𝑖-th node. It is guaranteed that there is exactly one root.
It is guaranteed that the input represents a tree.
Constraints. 1 ≤ 𝑛 ≤ 105
Output Format. Output the height of the tree.
# python3
import sys, threading
from collections import deque, defaultdict
sys.setrecursionlimit(10**7) # max depth of recursion
threading.stack_size(2**27) # new thread will get stack of such size
class TreeHeight:
def read(self):
self.n = int(sys.stdin.readline())
self.parent = list(map(int, sys.stdin.readline().split()))
def compute_height(self):
height = 0
nodes = [[] for _ in range(self.n)]
for child_index in range(self.n):
if self.parent[child_index] == -1:
# child_index = child value
root = child_index
nodes[0].append(root)
# do not add to index
else:
parent_index = None
counter = -1
updating_child_index = child_index
while parent_index != -1:
parent_index = self.parent[updating_child_index]
updating_child_index = parent_index
counter += 1
nodes[counter].append(child_index)
# nodes[self.parent[child_index]].append(child_index)
nodes2 = list(filter(lambda x: x, nodes))
height = len(nodes2)
return(height)
def main():
tree = TreeHeight()
tree.read()
print(tree.compute_height())
threading.Thread(target=main).start()

First, why are you using threading? Threading isn't good. It is a source of potentially hard to find race conditions and confusing complexity. Plus in Python, thanks to the GIL, you often don't get any performance win.
That said, your algorithm essentially looks like this:
for each node:
travel all the way to the root
record its depth
If the tree is entirely unbalanced and has 100,000 nodes, then for each of 100,000 nodes you have to visit an average of 50,000 other nodes for roughly 5,000,000,000 operations. This takes a while.
What you need to do is stop constantly traversing the tree back to the root to find the depths. Something like this should work.
import sys
class TreeHeight:
def read(self):
self.n = int(sys.stdin.readline())
self.parent = list(map(int, sys.stdin.readline().split()))
def compute_height(self):
height = [None for _ in self.parent]
todo = list(range(self.n))
while 0 < len(todo):
node = todo.pop()
if self.parent[node] == -1:
height[node] = 1
elif height[node] is None:
if height[self.parent[node]] is None:
# We will try again after doing our parent
todo.append(node)
todo.append(self.parent[node])
else:
height[node] = height[self.parent[node]] + 1
return max(height)
if __name__ == "__main__":
tree = TreeHeight()
tree.read()
print(tree.compute_height())
(Note, I switched to a standard indent, and then made that indent 4. See this classic study for evidence that an indent in the 2-4 range is better for comprehension than an indent of 8. And, of course, the pep8 standard for Python specifies 4 spaces.)
Here is the same code showing how to handle accidental loops, and hardcode a specific test case.
import sys
class TreeHeight:
def read(self):
self.n = int(sys.stdin.readline())
self.parent = list(map(int, sys.stdin.readline().split()))
def compute_height(self):
height = [None for _ in self.parent]
todo = list(range(self.n))
in_redo = set()
while 0 < len(todo):
node = todo.pop()
if self.parent[node] == -1:
height[node] = 1
elif height[node] is None:
if height[self.parent[node]] is None:
if node in in_redo:
# This must be a cycle, lie about its height.
height[node] = -100000
else:
in_redo.add(node)
# We will try again after doing our parent
todo.append(node)
todo.append(self.parent[node])
else:
height[node] = height[self.parent[node]] + 1
return max(height)
if __name__ == "__main__":
tree = TreeHeight()
# tree.read()
tree.n = 5
tree.parent = [-1, 0, 4, 0, 3]
print(tree.compute_height())

Dijkstra algorithm in python using dictionaries

Dear computer science enthusiasts,
I have stumbled upon an issue when trying to implement the Dijkstra-algorithm to determine the shortest path between a starting node and all other nodes in a graph.
To be precise I will provide you with as many code snippets and information as I consider useful to the case. However, should you miss anything, please let me know.
I implemented a PQueue class to handle Priority Queues of each individual node and it looks like this:
class PQueue:
def __init__(self):
self.items = []
def push(self, u, value):
self.items.append((u, value))
# insertion sort
j = len(self.items) - 1
while j > 0 and self.items[j - 1][1] > value:
self.items[j] = self.items[j - 1] # Move element 1 position backwards
j -= 1
# node u now belongs to position j
self.items[j] = (u, value)
def decrease_key(self, u, value):
for i in range(len(self.items)):
if self.items[i][0] == u:
self.items[i][1] = value
j = i
break
# insertion sort
while j > 0 and self.items[j - 1][1] > value:
self.items[j] = self.items[j - 1] # Move element 1 position backwards
j -= 1
# node u now belongs to position j
self.items[j] = (u, value)
def pop_min(self):
if len(self.items) == 0:
return None
self.items.__delitem__(0)
return self.items.index(min(self.items))
In case you're not too sure about what the Dijkstra-algorithm is, you can refresh your knowledge here.
Now to get to the actual problem, I declared a function dijkstra:
def dijkstra(self, start):
# init
totalCosts = {} # {"node"= cost,...}
prevNodes = {} # {"node"= prevNode,...}
minPQ = PQueue() # [[node, cost],...]
visited = set()
# start init
totalCosts[str(start)] = 0
prevNodes[str(start)] = start
minPQ.push(start, 0)
# set for all other nodes cost to inf
for node in range(self.graph.length): # #nodes
if node != start:
totalCosts[str(node)] = np.inf
while len(minPQ.items) != 0: # Main loop
# remove smallest item
curr_node = minPQ.items[0][0] # get index/number of curr_node
minPQ.pop_min()
visited.add(curr_node)
# check neighbors
for neighbor in self.graph.adj_list[curr_node]:
# check if visited
if neighbor not in visited:
# check cost and put it in totalCost and update prev node
cost = self.graph.val_edges[curr_node][neighbor] # update cost of curr_node -> neighbor
minPQ.push(neighbor, cost)
totalCosts[str(neighbor)] = cost # update cost
prevNodes[str(neighbor)] = curr_node # update prev
# calc alternate path
altpath = totalCosts[str(curr_node)] + self.graph.val_edges[curr_node][neighbor]
# val_edges is a adj_matrix with values for the connecting edges
if altpath < totalCosts[str(neighbor)]: # check if new path is better
totalCosts[str(neighbor)] = altpath
prevNodes[str(neighbor)] = curr_node
minPQ.decrease_key(neighbor, altpath)
Which in my eyes should solve the problem mentioned above (optimal path for a starting node to every other node). But it does not. Can someone help me clean up this mess that I have been trying to debug for a while now. Thank you in advance!
Assumption:
In fact I realized that my dictionaries used to store the previously visited nodes (prevNodes) and the one where I save the corresponding total cost of visiting a node (totalCosts) are unequally long. And I do not understand why.

Dijsktra's Shortest Path Algorithm

When I run this, the end output is a table with columns:
Vertex - DisVal - PrevVal - Known.
The two nodes connected to my beginning node show the correct values, but none of the others end up getting updated. I can include the full program code if anyone wants to see, but I know the problem is isolated here. I think it may have to do with not changing the index the right way. This is a simple dijsktra's btw, not the heap/Q version.
Here's the rest of the code: http://ideone.com/UUOUn8
The adjList looks like this: [1: 2, 4, 2: 6, 3, ...] where it shows each node connected to a vertex. DV = distance value (weight), PV = previous value (node), known = has it bee visited
def dijkstras(graph, adjList):
pv = [None] * len(graph.nodes)
dv = [999]*len(graph.nodes)
known = [False] * len(graph.nodes)
smallestV = 9999
index = 0
dv[0] = 0
known[0] = True
for i in xrange(len(dv)):
if dv[i] < smallestV and known[i]:
smallestV = dv[i]
index = i
known[index] = True
print smallestV
print index
for edge in adjList[index]:
if (dv[index]+graph.weights[(index, edge)] < dv[edge]):
dv[edge] = dv[index] + graph.weights[(index, edge)]
pv[edge] = index
printTable(dv, pv, known)

The first iteration sets smallestV and index to 0 unconditionally, and they never change afterwards (assuming non-negative weights).
Hard to tell what you are trying to do here.

delete random edges of a node till degree = 1 networkx

If I create a bipartite graph G using random geomtric graph where nodes are connected within a radius. I then want to make sure all nodes have a particular degree (i.e. only one or two edges).
My main aim is to take one of the node sets (i.e node type a) and for each node make sure it has a maximum degree set by me. So for instance if a take node i that has a degree of 4, delete random edges of node i until its degree is 1.
I wrote the following code to run in the graph generator after generating edges. It deletes edges but not until all nodes have the degree of 1.
for n in G:
mu = du['G.degree(n)']
while mu > 1:
G.remove_edge(u,v)
if mu <=1:
break
return G
full function below:
import networkx as nx
import random
def my_bipartite_geom_graph(a, b, radius, dim):
G=nx.Graph()
G.add_nodes_from(range(a+b))
for n in range(a):
G.node[n]['pos']=[random.random() for i in range(0,dim)]
G.node[n]['type'] = 'A'
for n in range(a, a+b):
G.node[n]['pos']=[random.random() for i in range(0,dim)]
G.node[n]['type'] = 'B'
nodesa = [(node, data) for node, data in G.nodes(data=True) if data['type'] == 'A']
nodesb = [(node, data) for node, data in G.nodes(data=True) if data['type'] == 'B']
while nodesa:
u,du = nodesa.pop()
pu = du['pos']
for v,dv in nodesb:
pv = dv['pos']
d = sum(((a-b)**2 for a,b in zip(pu,pv)))
if d <= radius**2:
G.add_edge(u,v)
for n in nodesa:
mu = du['G.degree(n)']
while mu > 1:
G.remove_edge(u,v)
if mu <=1:
break
return G
Reply to words like jared. I tried using you code plus a couple changes I had to make:
def hamiltPath(graph):
maxDegree = 2
remaining = graph.nodes()
newGraph = nx.Graph()
while len(remaining) > 0:
node = remaining.pop()
neighbors = [n for n in graph.neighbors(node) if n in remaining]
if len(neighbors) > 0:
neighbor = neighbors[0]
newGraph.add_edge(node, neighbor)
if len(newGraph.neighbors(neighbor)) >= maxDegree:
remaining.remove(neighbor)
return newGraph
This ends up removing nodes from the final graph which I had hoped it would not.

Suppose we have a Bipartite graph. If you want each node to have degree 0, 1 or 2, one way to do this would be the following. If you want to do a matching, either look up the algorithm (I don't remember it), or change maxDegree to 1 and I think it should work as a matching instead. Regardless, let me know if this doesn't do what you want.
def hamiltPath(graph):
"""This partitions a bipartite graph into a set of components with each
component consisting of a hamiltonian path."""
# The maximum degree
maxDegree = 2
# Get all the nodes. We will process each of these.
remaining = graph.vertices()
# Create a new empty graph to which we will add pairs of nodes.
newGraph = Graph()
# Loop while there's a remaining vertex.
while len(remaining) > 0:
# Get the next arbitrary vertex.
node = remaining.pop()
# Now get its neighbors that are in the remaining set.
neighbors = [n for n in graph.neighbors(node) if n in remaining]
# If this list of neighbors is non empty, then add (node, neighbors[0])
# to the new graph.
if len(neighbors) > 0:
# If this is not an optimal algorithm, I suspect the selection
# a vertex in this indexing step is the crux. Improve this
# selection and the algorthim might be optimized, if it isn't
# already (optimized in result not time or space complexity).
neighbor = neighbors[0]
newGraph.addEdge(node, neighbor)
# "node" has already been removed from the remaining vertices.
# We need to remove "neighbor" if its degree is too high.
if len(newGraph.neighbors(neighbor)) >= maxDegree:
remaining.remove(neighbor)
return newGraph
class Graph:
"""A graph that is represented by pairs of vertices. This was created
For conciseness, not efficiency"""
def __init__(self):
self.graph = set()
def addEdge(self, a, b):
"""Adds the vertex (a, b) to the graph"""
self.graph = self.graph.union({(a, b)})
def neighbors(self, node):
"""Returns all of the neighbors of a as a set. This is safe to
modify."""
return (set(a[0] for a in self.graph if a[1] == node).
union(
set(a[1] for a in self.graph if a[0] == node)
))
def vertices(self):
"""Returns a set of all of the vertices. This is safe to modify."""
return (set(a[1] for a in self.graph).
union(
set(a[0] for a in self.graph)
))
def __repr__(self):
result = "\n"
for (a, b) in self.graph:
result += str(a) + "," + str(b) + "\n"
# Remove the leading and trailing white space.
result = result[1:-1]
return result
graph = Graph()
graph.addEdge("0", "4")
graph.addEdge("1", "8")
graph.addEdge("2", "8")
graph.addEdge("3", "5")
graph.addEdge("3", "6")
graph.addEdge("3", "7")
graph.addEdge("3", "8")
graph.addEdge("3", "9")
graph.addEdge("3", "10")
graph.addEdge("3", "11")
print(graph)
print()
print(hamiltPath(graph))
# Result of this is:
# 10,3
# 1,8
# 2,8
# 11,3
# 0,4

I don't know if it is your problem but my wtf detector is going crazy when I read those two final blocks:
while nodesa:
u,du = nodesa.pop()
pu = du['pos']
for v,dv in nodesb:
pv = dv['pos']
d = sum(((a-b)**2 for a,b in zip(pu,pv)))
if d <= radius**2:
G.add_edge(u,v)
for n in nodesa:
mu = du['G.degree(n)']
while mu > 1:
G.remove_edge(u,v)
if mu <=1:
break
you never go inside the for loop, since nodesa needs to be empty to reach it
even if nodesa is not empty, if mu is an int, you have an infinite loop in in your last nested while since you never modify it.
even if you manage to break from this while statement, then you have mu > 1 == False. So you immediatly break out of your for loop
Are you sure you are doing what you want here? can you add some comments to explain what is going on in this part?

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

b-tree pseudocode confusion - python

t is the minimum degree of the tree, i.e. the minimum number of children each node in the tree must have (and also half of the maximum number of children each node may have).

Related

Dijkstra variation works most of the time, but gives wrong output on a test that I do not have access to

Time limit exceeded when finding tree height

Dijkstra algorithm in python using dictionaries

Dijsktra's Shortest Path Algorithm

delete random edges of a node till degree = 1 networkx

Categories

Resources