I am implementing an acoustic feature extraction system in python, and I need to implement a makefile-style algorithm to ensure that all blocks in the feature extraction system are run in the correct order, and without repeating any feature extractions stages.
The input to this feature extraction system will be a graph detailing the links between the feature extraction blocks, and I'd like to work out which functions to run when based upon the graph.
An example of such a system might be the following:
,-> [B] -> [D] ----+
input --> [A] ^ v
`-> [C] ----+---> [E] --> output
and the function calls (assuming each block X is a function of the form output = X(inputs) might be something like:
a = A(input)
b = B(a)
c = C(a)
d = D(b,c) # can't call D until both b and c are ready
output = E(d,c) # can't call E until both c and d are ready
I already have the function graph loaded in the form of a dictionary with each dictionary entry of the form (inputs, function) like so:
blocks = {'A' : (('input'), A),
'B' : (('A'), B),
'C' : (('A'), C),
'D' : (('B','C'), D),
'output' : (('D','C'), E)}
I'm just currently drawing a blank on what the makefile algorithm does exactly, and how to go about implementing it. My google-fu appears to be not-very-helpful here too. If someone at least can give me a pointer to a good discussion of the makefile algorithm that would probably be a good start.
Topological sorting
blocks is basically an adjacency list representation of the (acyclic) dependency graph. Hence, to get the correct order to process the blocks, you can perform a topological sort.
As the other helpful answerers have already pointed out, what I'm after is a topological sort, but I think my particular case is a little simpler because the function graph must always start at input and end at output.
So, here is what I ended up doing (I've edited it slightly to remove some context-dependent stuff, so it may not be completely correct):
def extract(func_graph):
def _getsignal(block,signals):
if block in signals:
return signals[block]
else:
inblocks, fn = func_graph[block]
inputs = [_getsignal(i,signals) for i in inblocks]
signals[block] = fn(inputs)
return signals[block]
def extract_func (input):
signals = dict(input=input)
return _getsignal('output',signals)
return extract_func
So now given I can set up the function with
fn = extract(blocks)
And use it as many times as I like:
list_of_outputs = [fn(i) for i in list_of_inputs]
I should probably also put in some checks for loops, but that is a problem for another day.
For code in many languages, including Python try these Rosetta code links: Topological sort, and Topological sort/Extracted top item.
Related
My algorithm needs to modify the children() of the existing logic gate. Suppose i have the following code
a = Bool('a')
b = Bool('b')
c = Bool('c')
or_gate = Or(a, b)
I want to modify or_gate to be Or(a, c).
I have tried the following:
or_gate.children()[1] = c
print(or_gate)
The above code doesn't work, or_gate is still Or(a, b). So how do i change the children of a logic gate in z3?
Edit: i can't create new Or() that contains the children i want due to nested gate. For example consider the following
a = Bool('a')
b = Bool('b')
c = Bool('c')
d = Bool('d')
e = Bool('e')
or_gate_one = Or(a, b)
or_gate_two = Or(or_gate_one, c)
or_gate_three = Or(or_gate_two, d)
If we create new object instead of directly modifying the children of or_gate_one, then or_gate_two and or_gate_three should also be modified, which is horrible for large scale. I need to modify the children of or_gate_one directly.
Use the substitute method:
from z3 import *
a = Bool('a')
b = Bool('b')
c = Bool('c')
or_gate = Or(a, b)
print(or_gate)
or_gate_c = substitute(or_gate, *[(a, c)])
print(or_gate_c)
This prints:
Or(a, b)
Or(c, b)
I wouldn't worry about efficiency, unless you observe it to be a problem down the road: (1) z3 will "share" the internal AST when you use substitute as much as it can, and (2) in most z3 programming, run-time is dominated by solving the constraints, not building them.
If you find that building the constraints is very costly, then you should switch to the lower-level APIs (i.e., C/C++), instead of using Python so you can avoid the extra-level of interpretation in between. But again, only do this if you have run-time evidence that building is the bottleneck, not the solving. (Because in the common case of the latter, switching to C/C++ isn't going to save you much anyhow.)
For my current assignment, I am to establish the stability of intersection/equilibrium points between two nullclines, which I have defined as follows:
def fNullcline(F):
P = (1/k)*((1/beta)*np.log(F/(1-F))-c*F+v)
return P
def pNullcline(P):
F = (1/delta)*(pD-alpha*P+(r*P**2)/(m**2+P**2))
return F
I also have a method "stability" that applies the Hurwitz criteria on the underlying system's Jacobian:
def dPdt(P,F):
return pD-delta*F-alpha*P+(r*P**2)/(m**2+P**2)
def dFdt(P,F):
return s*(1/(1+sym.exp(-beta*(-v+c*F+k*P)))-F)
def stability(P,F):
x = sym.Symbol('x')
ax = sym.diff(dPdt(x, F),x)
ddx = sym.lambdify(x, ax)
a = ddx(P)
# shortening the code here: the same happens for b, c, d
matrix = [[a, b],[c,d]]
eigenvalues, eigenvectors = np.linalg.eig(matrix)
e1 = eigenvalues[0]
e2 = eigenvalues[1]
if(e1 >= 0 or e2 >= 0):
return 0
else:
return 1
The solution I was looking for was later provided. Basically, values became too small! So this code was added to make sure no too small values are being used for checking the stability:
set={0}
for j in range(1,210):
for i in range(1,410):
x=i*0.005
y=j*0.005
x,y=fsolve(System,[x,y])
nexist=1
for i in set:
if(abs(y-i))<0.00001:
nexist=0
if(nexist):
set.add(y)
set.discard(0)
I'm still pretty new to coding so the function in and on itself is still a bit of a mystery to me, but it eventually helped in making the little program run smoothly :) I would again like to express gratitude for all the help I have received on this question. Below, there are still some helpful comments, which is why I will leave this question up in case anyone might run into this problem in the future, and can find a solution thanks to this thread.
After a bit of back and forth, I came to realise that to avoid the log to use unwanted values, I can instead define set as an array:
set = np.arange(0, 2, 0.001)
I get a list of values within this array as output, complete with their according stabilities. This is not a perfect solution as I still get runtime errors (in fact, I now get... three error messages), but I got what I wanted out of it, so I'm counting that as a win?
Edit: I am further elaborating on this in the original post to improve the documentation, however, I would like to point out again here that this solution does not seem to be working, after all. I was too hasty! I apologise for the confusion. It's a very rocky road for me. The correct solution has since been provided, and is documented in the original question.
I have a video where I want to insert a dynamic amount of TextClip(s). I have a while loop that handles the logic for actually creating the different TextClips and giving them individual durations & start_times (this works). I do however have a problem with actually "compiling" the video itself with inserting these texts.
Code for creating a TextClip (that works).
text = mpy.TextClip(str(contents),
color='white', size=[1700, 395], method='caption').set_duration(
int(list[i - 1])).set_start(currentTime).set_position(("center", 85))
print(str(i) + " written")
textList.append(text)
Code to "compile" the video. (that doesn't work)
final_clip = CompositeVideoClip([clip, len(textList)])
final_clip.write_videofile("files/final/TEST.mp4")
I tried several approaches but now I'm stuck and can't figure out a way to continue. Before I get a lot of "answers" telling me to do a while loop on the compiling, let me just say that the actual compiling takes about 5 minutes and I have 100-500 different texts I need implemented in the final video which would take days. Instead I want to add them one by one and then do 1 big final compile which I know it will take slightly longer than 5 minutes, but still a lot quicker than 2-3 days.
For those of you that may not have used moviepy before I will post a snippet of "my code" that actually works, not in the way I need it to though.
final_clip = CompositeVideoClip([clip, textList[0], textList[1], textList[2]])
final_clip.write_videofile("files/final/TEST.mp4")
This works exactly as intended (adding 3 texts), However I dont/can't know how many texts there will be in each video beforehand so I need to somehow insert a dynamic amount of textList[] into the function.
Kind regards,
Unsure what the arguments after clip, do (you could clarify), but if the problem's solved by inserting a variable number of textList[i] args, the solution's simple:
CompositeVideoClip([clip, *textList])
The star unpacks the list (or any iterable); ex: arg=4,5 -- def f(a,b): return a+b -- f(*arg)==9. If you have many textLists, you can manage them via a nested list or a dictionary:
textListDict = {'1':textList1, '2':textList2, ...}
textListNest = [textList1, textList2, ...] # probably preferable - using for below
# now iterate:
for textList in textListNest:
final_clip = CompositeVideoClip([clip, *textList])
Unpacking demo:
def show_then_join(a, b, c):
print("a={}, b={}, c={}".format(a,b,c))
print(''.join("{}{}{}".format(a,b,c)))
some_list = [1, 2, 'dog']
show_then_sum(*some_list) # only one arg is passed in, but is unpacked into three
# >> a=1, b=2, c=dog
# >> 12dog
Can't get my mind around this...
I read a bunch of spreadsheets, do a bunch of calculations and then want to create a summary DF from each set of calculations. I can create the initial df but don't know how to control my loops so that I
create the initial DF (1st time though the loop)
If it has been created append the next DF (last two rows) for each additional tab.
I just can't wrap my head how to create the right nested loop so that once the 1st one is done the subsequent ones get appended?
My current code looks like: (which just dumbly prints each tab's results separately rather than create a new consolidated sumdf with just the last 2 rows of each tabs' results..
#make summary
area_tabs=['5','12']
for area_tabs in area_tabs:
actdf,aname = get_data(area_tabs)
lastq,fcast_yr,projections,yrahead,aname,actdf,merged2,mergederrs,montdist,ols_test,mergedfcst=do_projections(actdf)
sumdf=merged2[-2:]
sumdf['name']= aname #<<< I'll be doing a few more calculations here as well
print sumdf
Still a newb learning basic python loop techniques :-(
Often a neater way than writing for loops, especially if you are planning on using the result, is to use a list comprehension over a function:
def get_sumdf(area_tab): # perhaps you can name better?
actdf,aname = get_data(area_tab)
lastq,fcast_yr,projections,yrahead,aname,actdf,merged2,mergederrs,montdist,ols_test,mergedfcst=do_projections(actdf)
sumdf=merged2[-2:]
sumdf['name']= aname #<<< I'll be doing a few more calculations here as well
return sumdf
[get_sumdf(area_tab) for area_tab in areas_tabs]
and concat:
pd.concat([get_sumdf(area_tab) for area_tab in areas_tabs])
or you can also use a generator expression:
pd.concat(get_sumdf(area_tab) for area_tab in areas_tabs)
.
To explain my comment re named tuples and dictionaries, I think this line is difficult to read and ripe for bugs:
lastq,fcast_yr,projections,yrahead,aname,actdf,merged2,mergederrs,montdist,ols_test,mergedfcst=do_projections(actdf)
A trick is to have do_projections return a named tuple, rather than a tuple:
from collections import namedtuple
Projection = namedtuple('Projection', ['lastq', 'fcast_yr', 'projections', 'yrahead', 'aname', 'actdf', 'merged2', 'mergederrs', 'montdist', 'ols_test', 'mergedfcst'])
then inside do_projections:
return (1, 2, 3, 4, ...) # don't do this
return Projection(1, 2, 3, 4, ...) # do this
return Projection(last_q=last_q, fcast_yr=f_cast_yr, ...) # or this
I think this avoids bugs and is a lot cleaner, especially to access the results later.
projections = do_projections(actdf)
projections.aname
Do the initialisation outside the for loop. Something like this:
#make summary
area_tabs=['5','12']
if not area_tabs:
return # nothing to do
# init the first frame
actdf,aname = get_data(area_tabs[0])
lastq,fcast_yr,projections,yrahead,aname,actdf,merged2,mergederrs,montdist,ols_test,mergedfcst =do_projections(actdf)
sumdf=merged2[-2:]
sumdf['name']= aname
for area_tabs in area_tabs[1:]:
actdf,aname = get_data(area_tabs)
lastq,fcast_yr,projections,yrahead,aname,actdf,merged2,mergederrs,montdist,ols_test,mergedfcst =do_projections(actdf)
sumdf=merged2[-2:]
sumdf['name']= aname #<<< I'll be doing a few more calculations here as well
print sumdf
You can further improve the code by putting the common steps into a function.
Inside of a network, information (package) can be passed to different node(hosts), by modify it's content it can carry different meaning. The final package depends on hosts input via it's given route of network.
Now I want to implement a calculating network model can do small jobs by give different calculate path.
Prototype:
def a(p): return p + 1
def b(p): return p + 2
def c(p): return p + 3
def d(p): return p + 4
def e(p): return p + 5
def link(p, r):
p1 = p
for x in r:
p1 = x(p1)
return p1
p = 100
route = [a,c,d]
result = link(p,result)
#========
target_result = 108
if result = target_result:
# route is OK
I think finally I need something like this:
p with [init_payload, expected_target, passed_path, actual_calculated_result]
|
\/
[CHAOS of possible of functions networks]
|
\/
px [a,a,b,c,e] # ok this path is ok and match the target
Here is my questions hope may get your help:
can p carry(determin) the route(s) by inspect the function and estmated result?
(1.1 ) for example, if on the route there's a node x()
def x(p): return x / 0 # I suppose it can pass the compile
can p know in somehow this path is not good then avoid select this path?
(1.2) Another confuse is if p is a self-defined class type, the payload inside of this class essentially is a string, when it carry with a path [a,c,d], can p know a() must with a int type then avoid to select this node?'
same as 1.2 when generating the path, can I avoid such oops
def a(p): return p + 1
def b(p): return p + 2
def x(p): return p.append(1)
def y(p): return p.append(2)
full_node_list = [a,b,x,y]
path = random(2,full_node_list) # oops x,y will be trouble for the inttype P and a,b will be trouble to list type.
pls consider if the path is lambda list of functions
PS: as the whole model is not very clear in my mind the any leading and directing will be appreciated.
THANKS!
You could test each function first with a set of sample data; any function which returns consistently unusable values might then be discarded.
def isGoodFn(f):
testData = [1,2,3,8,38,73,159] # random test input
goodEnough = 0.8 * len(testData) # need 80% pass rate
try:
good = 0
for i in testData:
if type(f(i)) is int:
good += 1
return good >= goodEnough
except:
return False
If you know nothing about what the functions do, you will have to essentially do a full breadth-first tree search with error-checking at each node to discard bad results. If you have more than a few functions this will get very large very quickly. If you can guarantee some of the functions' behavior, you might be able to greatly reduce the search space - but this would be domain-specific, requiring more exact knowledge of the problem.
If you had a heuristic measure for how far each result is from your desired result, you could do a directed search to find good answers much more quickly - but such a heuristic would depend on knowing the overall form of the functions (a distance heuristic for multiplicative functions would be very different than one for additive functions, etc).
Your functions can raise TypeError if they are not satisfied with the data types they receive. You can then catch this exception and see whether you are passing an appropriate type. You can also catch any other exception type. But trying to call the functions and catching the exceptions can be quite slow.
You could also organize your functions into different sets depending on the argument type.
functions = { list : [some functions taking a list], int : [some functions taking an int]}
...
x = choose_function(functions[type(p)])
p = x(p)
I'm somewhat confused as to what you're trying to do, but: p cannot "know about" the functions until it is run through them. By design, Python functions don't specify what type of data they operate on: e.g. a*5 is valid whether a is a string, a list, an integer or a float.
If there are some functions that might not be able to operate on p, then you could catch exceptions, for example in your link function:
def link(p, r):
try:
for x in r:
p = x(p)
except ZeroDivisionError, AttributeError: # List whatever errors you want to catch
return None
return p