My tree has the following structure:
tree = {'0': ('1', '2', '3'), '1': ('4',), '2': ('5', '6'), '3': (), '4': ('7', '8'), '8': ('9', '10', '11')}
How can I write Python code to retrieve all children (direct and indirect) of a particular node?
For example, if I give it node 4, the code should retrieve 7,8,9,10,11.
For node 2, it should retrieve 5, 6 and so on.
I just started learning the basics of Python, but I have no idea how to implement this for non-binary trees.
You can use a queue. Push the user's requested value into the queue. Then, while the queue isn't empty, pop a value, look it up in the dict, and if it is a key, print each of its children and add them to the queue so their own children are processed on a later pass.
import queue

tree = {'0': ('1', '2', '3'), '1': ('4',), '2': ('5', '6'), '3': (),
        '4': ('7', '8'), '8': ('9', '10', '11')}

num = input("what you want ")
q = queue.Queue()
q.put(num)
while not q.empty():
    n = q.get()
    # Only nodes that appear as keys in the dict have children.
    for child in tree.get(n, ()):
        print(child)
        q.put(child)
Note that if you have a tree like tree = {'0': ('1',), '1': ('0',)}, or any other circular reference, this code will run forever. Be careful!
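If your data can contain cycles, a common guard (a minimal sketch reusing the names above) is to track visited nodes in a set:

import queue

tree = {'0': ('1',), '1': ('0',)}  # the cyclic example from above

num = '0'
q = queue.Queue()
q.put(num)
visited = {num}
while not q.empty():
    n = q.get()
    for child in tree.get(n, ()):
        if child not in visited:  # skip nodes already printed/queued
            print(child)
            visited.add(child)
            q.put(child)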
I was wondering if there is any concurrent structure like queue in Python, but with the ability to remove a specific element.
Example:
import queue

# with queue it would be
q = queue.Queue()
# put some element
q.put(elem)
# I want to delete a specific element,
# but Queue does not provide this method
q.remove(elem)
What could I use?
Python lists actually work the way you are looking for. The translation of your code (which requires no imports) looks like this:
# Create the list
q = [element1, element2, element3...]
# Insert an element at a given position
q.insert(position, element4)
# Insert an element at the end
q.append(element4)
# Remove the element at a given position
del q[position]
That way you can manage the structure as desired. I hope that helps.
In the official Python docs (the heapq module documentation), the following is mentioned regarding heaps:
A nice feature of this sort is that you can efficiently insert new
items while the sort is going on, provided that the inserted items are
not “better” than the last 0’th element you extracted. This is
especially useful in simulation contexts, where the tree holds all
incoming events, and the “win” condition means the smallest scheduled
time. When an event schedules other events for execution, they are
scheduled into the future, so they can easily go into the heap.
I can only think of the following simple algorithm to implement a scheduler using a heap:
import heapq

# Priority queue using a heap
pq = []

# The first element in each tuple is the time at which the task should run.
task1 = (1, Task(...))
task2 = (2, Task(...))
heapq.heappush(pq, task1)
heapq.heappush(pq, task2)
# Add a few more root-level tasks

while pq:
    time, next_task = heapq.heappop(pq)
    next_task.perform()
    for child_task in next_task.get_child_tasks():
        # Add new child tasks (as (time, task) tuples) if available
        heapq.heappush(pq, child_task)
In this, where does sorting even come into the picture?
And even if a future child task had a time in the 'past', this algorithm would still work correctly.
So why does the author warn about child events only being scheduled for the future?
And what does this mean:
you can efficiently insert new items while the sort is going on,
provided that the inserted items are not “better” than the last 0’th
element you extracted.
Heaps are used as the data structure for priority queues; the fundamental property of a min-heap is that the lowest value is at the top (in a max-heap, the highest). That way you can always extract the lowest or highest element without searching for it.
You can insert new elements while the sort is going on; have a look at how heapsort works. On every pass you build (restore) the heap, extract the maximum value, put it at the end of the array, and then decrement heap.length by 1.
If you have already sorted some numbers, [..., 13, 15, 16], and you insert a new number that is greater than the last element extracted (13, the 0'th element), you get a wrong result: the new number is extracted next, but it doesn't end up in the right place, e.g. [1, 2, 5, 7, 14, 13, 15, 16]. It lands before 13 because extraction swaps it into the heap.length position.
This is obviously wrong, so you can only insert elements that are not "better" than the last 0'th element you extracted.
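A small illustration with made-up numbers, using heapq (a min-heap, so "better" means smaller): once 3 has been extracted, pushing 2 breaks the sorted output:

import heapq

heap = [5, 1, 7, 3]
heapq.heapify(heap)

out = [heapq.heappop(heap), heapq.heappop(heap)]  # pops 1, then 3
heapq.heappush(heap, 2)  # 2 is "better" than the last extracted item (3)
while heap:
    out.append(heapq.heappop(heap))
print(out)  # [1, 3, 2, 5, 7], no longer sorted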
I recently started programming in Python (3.5) and I am trying to solve a simple breadth-first search problem (see the code below).
import queue
import networkx as nx

def bfs(graph, start, target):
    frontier = queue.Queue()
    frontier.put(start)
    explored = list()
    while not frontier.empty():
        state = frontier.get()
        explored.append(state)
        print(explored)
        if state == target:
            return 'success'
        print(graph.neighbors(state))
        for neighbor in graph.neighbors(state):
            if neighbor not in explored:
                frontier.put(state)
    return 'Failure to find path'
The code runs into an infinite loop, where it seems that frontier.get() does not remove the item from the queue. This makes the while loop infinite, as the first value in the queue is always the start node given in the function input; the variable state is the same on every pass of the loop (always the start node).
What am I doing wrong? From what I understood, the queue should move from the start node to the neighbours of the start node, so a loop should not occur.
Two things. First, I assume everything from the while on down ought to be indented by one level.
If I'm reading your algorithm correctly, I believe the error is on the last line before the return. You have:
frontier.put(state)
which just inserts the node you were already looking at. I think what you should be doing instead is:
frontier.put(neighbor)
so that you explore all the immediate neighbors of state. Otherwise you just keep looking at the start node over and over.
Because you're putting the state value in the queue again. Change this:
for neighbor in graph.neighbors(state):
    if neighbor not in explored:
        frontier.put(state)  # Here you put the 'state' back!
to this:
for neighbor in graph.neighbors(state):
    if neighbor not in explored:
        frontier.put(neighbor)  # Put in the neighbours instead.
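Putting it together, a corrected version of the function as a sketch (only the one line changes):

import queue

def bfs(graph, start, target):
    frontier = queue.Queue()
    frontier.put(start)
    explored = list()
    while not frontier.empty():
        state = frontier.get()
        explored.append(state)
        if state == target:
            return 'success'
        for neighbor in graph.neighbors(state):
            if neighbor not in explored:
                frontier.put(neighbor)  # enqueue the neighbour, not 'state'
    return 'Failure to find path'

Note that this can still enqueue a node more than once before it is explored; adding nodes to explored when they are enqueued (rather than when they are popped) avoids the duplicates.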
I'm working on a web crawler. The crawler is built for a web page which has many categories; these categories can have subcategories, which in turn can have their own subcategories, and so on.
So I made a recursive method which performs a depth-first search.
def deep_search(url):
    if is_leaf(url):
        return get_data(url)
    for sub_url in get_subcategories(url):
        deep_search(sub_url)
This method works fine, but it takes a long time to finish, so there are situations where the connection drops or another error is raised.
What would you do to remember the state in case an error occurs, so that next time the crawl continues from that state?
I can't just remember the last url or category, since there are loops and the program would not know which urls and categories have already been processed in the upper loops.
If the order of search paths is stable (your script visits sub-categories in the same order every time), then you can maintain a branch-number list in your DFS and make it persistent by saving it to a file or database:
current_path = []  # the path currently being visited

def deep_search(url, last_saved_path=None):
    if is_leaf(url):
        if last_saved_path:
            # continue where you left off
            if path_reached(last_saved_path):
                data = get_data(url)
        else:  # first run
            data = get_data(url)
        # save the whole path persistently
        save_to_file(current_path)
        # add data to result
    else:
        for index, sub_url in enumerate(get_subcategories(url)):
            current_path.append(index)
            deep_search(sub_url, last_saved_path)
            del current_path[-1]

def path_reached(old_path):
    print(old_path, current_path)
    # has this path already been visited in the last run?
    for i, index in enumerate(current_path):
        if index < old_path[i]:
            return False
        elif index > old_path[i]:
            return True
    return True
When running the crawler for a second time, you can load the saved path and start where you left off:
# first run
deep_search(url)
# subsequent runs
last_path = load_last_saved_path_from_file()
deep_search(url, last_path)
That said, I think in a web crawler there are two kinds of tasks: traversing the graph and downloading data, and it's better to keep them separate. Use the above DFS algorithm (plus logic to skip paths that have already been visited) to traverse the links and save the download urls in a queue; then start a bunch of workers to take urls from the queue and download them. This way, you only need to record the current position in the queue if interrupted.
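A minimal sketch of that split, assuming the is_leaf/get_subcategories/get_data helpers from the question:

import queue
import threading

url_queue = queue.Queue()

def traverse(url):
    # DFS that only collects leaf urls; downloading happens elsewhere
    if is_leaf(url):
        url_queue.put(url)
        return
    for sub_url in get_subcategories(url):
        traverse(sub_url)

def worker():
    while True:
        url = url_queue.get()
        try:
            get_data(url)  # a failed download doesn't affect the traversal
        finally:
            url_queue.task_done()

# start a bunch of download workers
for _ in range(4):
    threading.Thread(target=worker, daemon=True).start()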
I'd also recommend scrapy: I haven't read the scrapy source, but I'd guess it implements all of the above, and more.
As a simple hint, you can use a try-except statement to handle errors and save the relevant url. A good choice for such a task is collections.deque with a capacity of 1, which you check in the next iterations.
Demo:
from collections import deque

def deep_search(url, deq=deque(maxlen=1)):
    if is_leaf(url):
        return get_data(url)
    try:
        for url in get_subcategories(url):
            if deq[0] == url:
                deep_search(url, deq)
    except:  # you can put the error type after 'except'
        deq.append(url)
As a more Pythonic way of dealing with networks, you can use networkx:
NetworkX is a Python language software package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks.
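For example, a minimal sketch (with made-up category names) of modelling the category tree as a networkx graph and walking it depth-first:

import networkx as nx

g = nx.DiGraph()
g.add_edges_from([('root', 'books'), ('root', 'music'),
                  ('books', 'fiction'), ('books', 'non-fiction')])

# visit the categories in depth-first (preorder) order
for node in nx.dfs_preorder_nodes(g, source='root'):
    print(node)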
I am relatively new to Python and I need some help. This is also my first post on this site. I am trying to change the value of the colorspace knob in a Read node I have labeled "Plate", so I can use that value later on. Here is my code so far:
import nuke

def LabelPlate():
    n = nuke.thisNode()
    if n is not None:
        label = n['label'].value()
        n['label'].setValue('Plate')

def LabelLook():
    name = "Plate"
    for node in nuke.allNodes():
        if name == node.knob("label").value():
            return True

def LabelLookTwo():
    name = "Plate"
    for node in nuke.allNodes():
        if name == node.knob("label").value():
            return node.knob("name").value()

def PlateColorspaceSet():
    n = LabelLookTwo()
    if nuke.toNode(n)["colorspace"].value() == "default (sRGB)":
        nuke.toNode(n)["colorspace"].setValue("sRGB")

def LabelQuestion():
    if not LabelLook():
        if nuke.ask('Is this Read your main Plate?'):
            LabelPlate()
            PlateColorspaceSet()

nuke.addOnUserCreate(LabelQuestion, nodeClass='Read')
So the order of events is:
1. Bring in a Read node.
2. Ask if the Read node is your main plate.
   a. If yes, label the node "Plate" and proceed to step 3.
   b. If no, bring in an unlabeled Read node.
3. Change the colorspace in the node labeled "Plate" from the default to an actual value.
So far, I can get the first two steps to work. But on step 3, I get:
"'NoneType' object has no attribute '__getitem__'"
Any ideas why? Is there a better way to get the colorspace value?
I figured out the problem.
nuke.addOnUserCreate is where I was calling the function, on creation of the Read node. The problem is, it runs the script before everything exists, so not everything works because not everything is there in Nuke yet; things like my variable n = LabelLookTwo() return None.
Using addOnCreate instead runs the script after the node and its defaults have been created and set. Using this, the rest of the script runs exactly as originally written.
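In other words, the only change needed is the registration call at the end (same callback as above; addOnCreate is part of the same Nuke callback API):

# Runs after the node exists and its default knob values are set
nuke.addOnCreate(LabelQuestion, nodeClass='Read')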