Scheduling problem and nonlinear simulations - python

My problem appears to be a scheduling problem, but there are a few twists I don't know how to handle - I'm not even sure what terms to Google.
I have a set of “resources”, which may or may not have a lower bound on their first availability date (or even an upper bound on their last date of availability). The number of resources is fixed.
Then I have a set of tasks: those tasks do not really have any precedence over one another (in general), but some of them may have a lower bound on when they can be executed (i.e., not before a certain date). They of course have durations (and in general every task has a different duration).
There cannot be two resources allocated to the same task. Also, the tasks can only be executed at a set of specific locations (i.e., 2D points on a map). For this reason, there cannot be two resources at the same location at the same time - even if they are executing two different tasks.
Now, assuming I could ever formulate (and solve) this type of problem, the resulting schedule has to be fed to a time-based fluid-flow simulation, which is nonlinear. The results of this simulation tell me how much each task is worth (and how much all the tasks together are worth, which is the quantity I want to maximize).
Of course, changing the timing of a task will make the nonlinear simulation return a different "worth" for it: executing it in January 2023 gives one number, executing it in June 2023 gives another. Inside the simulator the tasks interfere with one another and are subject to additional, complex constraints.
Does anyone have any suggestion on how to approach this problem? I’m familiar with Python and linear programming (LPSolve, GLPK, but anything would do I guess).
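One possible direction, sketched only: since the objective comes out of a black-box nonlinear simulator, you can treat the whole thing as simulation-based optimization - encode a candidate schedule, check the hard constraints, score it with one simulator run, and search with a metaheuristic such as simulated annealing. In the sketch below, run_fluid_simulation, the task/resource layout, and the feasible() stub are hypothetical placeholders for your own model, not a known API:

    import math
    import random

    def run_fluid_simulation(schedule):
        """Placeholder for the nonlinear simulator: returns the total 'worth' of a schedule."""
        return -sum(day for _resource, day in schedule.values())   # dummy stand-in

    def feasible(schedule, tasks, locations):
        """Stub for the hard constraints described above (availability windows,
        one resource per task, no two resources at the same location at once)."""
        return True

    def random_schedule(tasks, resources, horizon_days):
        """One random (resource, start day) per task; feasibility is checked separately."""
        return {t["id"]: (random.choice(resources),
                          random.randint(t.get("earliest", 0), horizon_days))
                for t in tasks}

    def neighbour(schedule, tasks, resources, horizon_days):
        """Perturb one task: shift its start date or swap its resource."""
        sched = dict(schedule)
        t = random.choice(tasks)
        res, day = sched[t["id"]]
        if random.random() < 0.5:
            day = min(horizon_days, max(t.get("earliest", 0), day + random.randint(-30, 30)))
        else:
            res = random.choice(resources)
        sched[t["id"]] = (res, day)
        return sched

    def anneal(tasks, resources, locations, horizon_days, iters=500, temp=1.0, cooling=0.99):
        current = random_schedule(tasks, resources, horizon_days)
        current_val = run_fluid_simulation(current)        # one expensive call per candidate
        best, best_val = current, current_val
        for _ in range(iters):
            cand = neighbour(current, tasks, resources, horizon_days)
            if not feasible(cand, tasks, locations):
                continue
            val = run_fluid_simulation(cand)
            # Always accept improvements; accept worse moves with a temperature-dependent probability.
            if val > current_val or random.random() < math.exp((val - current_val) / max(temp, 1e-9)):
                current, current_val = cand, val
                if val > best_val:
                    best, best_val = cand, val
            temp *= cooling
        return best, best_val

    tasks = [{"id": 1, "earliest": 0}, {"id": 2, "earliest": 90}]   # toy data
    print(anneal(tasks, resources=["rig_A", "rig_B"], locations=None, horizon_days=365))

A constraint solver (e.g. OR-Tools CP-SAT) could take over the hard scheduling constraints inside feasible(), with the expensive simulator kept in the outer search loop; the key point is that the simulator only ever needs to be called as a black box.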

Related

Rising and falling edges in multiple signals - Python

Here is the overall scenario: I'm recording some simple signals from a novel sensor using Python 3.8. I have already filtered the signals to obtain a cleaner representation on which to run further data-analysis algorithms. Nothing special.
Below are some of the signals on which I need to run my algorithm:
First Example
Second Example
These signals come from the sensor I am working on. My aim is to find the timestamps at which a signal starts to increase or decrease. I actually need to run the algorithm on only one signal (blue or orange).
I have shown both signals because they behave antagonistically, which may be useful for the task.
In other words, these signals relate to foot flexion/extension (FLE/EXT): the point where they start to increase is the point at which I start to move my foot. Vice versa, when I move my foot back, the signal amplitude decreases.
My job is to identify the FLE/EXT. I tried examining the first derivative, but it doesn't seem to give me any useful information.
I have also tried a convolution with a fixed-length array of ones, looking for the point where the next window's average is greater than the current one.
This approach has two limitations:
Fixed-length window: when the signal represents a faster FLE/EXT (i.e., a shorter span on the x-axis), the window is too long to catch the variation.
Threshold criterion: choosing how much larger the next average must be than the current one in order to keep that point as a candidate.
I am stuck here, because I want a dynamic-threshold approach (or something similar) that lets me avoid any fixed thresholds.
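One possible direction (not from the post above): compare a smoothed derivative against a rolling, noise-based threshold - for example the local median plus a multiple of the median absolute deviation - so the threshold adapts to the signal instead of being fixed. A minimal sketch, assuming a sampling rate fs in Hz:

    import numpy as np

    def detect_onsets(signal, fs, smooth_s=0.05, window_s=1.0, k=4.0):
        """Return sample indices where the signal starts to rise or fall.

        smooth_s - smoothing window (s) applied to the derivative
        window_s - rolling window (s) used to estimate the local noise level
        k        - how many robust 'sigmas' above the local noise counts as an edge
        """
        x = np.asarray(signal, dtype=float)
        n_smooth = max(1, int(smooth_s * fs))
        n_win = max(1, int(window_s * fs))

        dx = np.gradient(x)                                    # first derivative
        dx = np.convolve(dx, np.ones(n_smooth) / n_smooth, mode="same")

        onsets = []
        for i in range(n_win, len(dx)):
            hist = dx[i - n_win:i]
            med = np.median(hist)
            mad = np.median(np.abs(hist - med)) + 1e-12        # robust spread estimate
            if abs(dx[i] - med) > k * 1.4826 * mad:            # adaptive threshold
                if not onsets or i - onsets[-1] > n_win:       # keep one event per window
                    onsets.append(i)
        return onsets

    # e.g. timestamps = [i / fs for i in detect_onsets(filtered_signal, fs=100)]

The window length and k are still parameters, but k is expressed in units of the local noise rather than as an absolute amplitude, which is what removes the fixed threshold.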
I hope to discuss this with you and solve my problem. What do you think?
If anything is unclear, I am happy to clarify further.
Best regards,
V

Optimization of a scheduling problem in Python

I am trying to learn about scheduling and have the following use case:
I have different parts that need to be delivered on specific dates; they also have different quantities and different runtimes. Only one machine is considered. The delivery date is a hard constraint, but I would also like to see if I can optimize the setup of the machine for each product. I have a table with the different tools used for each part: when a cell is 0 the tool is not used, when it is 1 the tool is used. I have around 50 tools in total across all parts. I do not want to look only at the delivery dates; I also want to see how I can shorten the changeover from part A to part B, so that I change as few tools as possible.
I was able to sort my data by date, but I do not know where to start with the optimization or which algorithm might be a good fit - a genetic algorithm or ant colony optimization? I cannot provide any code yet, and I am not asking for a complete solution; a good starting point is what I am after.
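Before reaching for a genetic algorithm or ant colony optimization, one cheap baseline is a greedy sequence: order parts by due date, and among the parts sharing the most urgent due date pick the one whose tool vector differs least from the part currently on the machine. The data below is a made-up stand-in for the table described above:

    from datetime import date

    parts = [  # hypothetical example data: due date, runtime (hours), tool usage vector (0/1)
        {"name": "A", "due": date(2024, 3, 1), "runtime": 4, "tools": [1, 0, 1, 0]},
        {"name": "B", "due": date(2024, 3, 1), "runtime": 2, "tools": [1, 1, 1, 0]},
        {"name": "C", "due": date(2024, 3, 5), "runtime": 3, "tools": [0, 1, 0, 1]},
    ]

    def tool_changes(a, b):
        """Number of tools that must be mounted or removed between two parts."""
        return sum(x != y for x, y in zip(a["tools"], b["tools"]))

    def greedy_sequence(parts):
        remaining = sorted(parts, key=lambda p: p["due"])
        sequence = [remaining.pop(0)]
        while remaining:
            earliest_due = remaining[0]["due"]
            # Only consider parts sharing the most urgent due date so deliveries stay feasible,
            # then choose the one with the fewest tool changes from the current part.
            candidates = [p for p in remaining if p["due"] == earliest_due]
            nxt = min(candidates, key=lambda p: tool_changes(sequence[-1], p))
            sequence.append(nxt)
            remaining.remove(nxt)
        return sequence

    print([p["name"] for p in greedy_sequence(parts)])   # -> ['A', 'B', 'C'] for this toy data

If the greedy result is not good enough, the same tool_changes() cost can be dropped into a metaheuristic (GA, simulated annealing) or into a constraint/MIP model, for example with Google OR-Tools.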

Is there any way to do scipy.optimize.minimize (or something functionally equivalent) in parallel?

I have a multivariate optimization problem that I want to run. Each evaluation is quite slow, so the ability to farm it out to multiple machines would be quite nice. I have no trouble writing the code to dispatch jobs to other machines. However, scipy.optimize.minimize makes each evaluation call sequentially; it won't give me another set of parameters to evaluate until the previous one returns.
Now, I know that the "easy" solution would be "run your evaluation task in a parallel manner - break it up". Indeed, while it is possible to do this to some extent, it only goes so far: the communication overhead rises the more you split it up, until splitting further actually starts to slow you down. Having another axis in which to parallelize - i.e., the minimization function itself - would greatly increase scalability.
Is there no way to do this with scipy.optimize.minimize? Or any other utility that works in a roughly equivalent manner (trying to find as low a minimum as possible)? Surely it's possible for a minimization utility to make use of parallelism, particularly on a multivariate problem where there are many axes whose partial derivatives need to be evaluated at each point.
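Two options that do exist in the standard stack, sketched with a stand-in objective (expensive_objective is a hypothetical placeholder for the slow evaluation): scipy.optimize.differential_evolution takes a workers argument (SciPy 1.2+) and evaluates its population in parallel, and scipy.optimize.minimize can be given a jac callback that dispatches all the finite-difference evaluations to a process pool at once:

    import numpy as np
    from multiprocessing import Pool
    from scipy.optimize import differential_evolution, minimize

    def expensive_objective(x):
        """Stand-in for the real slow evaluation (hypothetical)."""
        return float(np.sum((x - 1.0) ** 2))

    def parallel_gradient(x, pool, eps=1e-6):
        """Forward-difference gradient with all n+1 evaluations dispatched at once."""
        points = [x] + [x + eps * np.eye(len(x))[i] for i in range(len(x))]
        vals = pool.map(expensive_objective, points)
        return (np.array(vals[1:]) - vals[0]) / eps

    if __name__ == "__main__":
        bounds = [(-5.0, 5.0)] * 4

        # Option 1: population-based search; SciPy evaluates the population in parallel.
        de = differential_evolution(expensive_objective, bounds, workers=-1, updating="deferred")

        # Option 2: keep a gradient-based minimize(), but parallelise the gradient itself.
        with Pool() as pool:
            bfgs = minimize(expensive_objective, x0=np.zeros(4),
                            jac=lambda x: parallel_gradient(x, pool), method="BFGS")
        print(de.x, bfgs.x)

Neither gives parallelism inside a single line search step, but both break the strictly sequential one-evaluation-at-a-time pattern of plain minimize().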

Scheduling: Minimizing Gaps between Non-Overlapping Time Ranges

I'm using Django to develop a small scheduling web application where people are assigned certain times to meet with their superiors. Employees are stored as models, with a one-to-many relation to a model representing the time ranges and days of the week when they are free. For instance:
Bob: (W 9:00, 9:15), (W 9:15, 9:30), ... (W 15:00, 15:20)
Sarah: (Th 9:05, 9:20), (F 9:20, 9:30), ... (Th 16:00, 16:05)
...
Mary: (W 8:55, 9:00), (F 13:00, 13:35), ... etc
My program allows a basic schedule setup, where employers can choose to view the first N possible schedules with the fewest gaps between meetings, under the condition that they meet each of their employees at least once during the week. I am currently generating all possible permutations of meetings and filtering out schedules with overlapping meeting times. Is there a way to generate the first N schedules out of M possible ones without going through all M possibilities?
Clarification: We are trying to get the minimum sum of gaps for any given day, summed over all days.
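For reference, a minimal sketch (not from the question) of the objective being described - the sum of idle gaps between consecutive meetings, per day - assuming each chosen meeting is a (day, start_minutes, end_minutes) tuple:

    from collections import defaultdict

    def total_gap(schedule):
        """Sum of idle gaps per day; None if any two meetings overlap."""
        by_day = defaultdict(list)
        for day, start, end in schedule:
            by_day[day].append((start, end))
        gap = 0
        for slots in by_day.values():
            slots.sort()
            for (_, prev_end), (next_start, _) in zip(slots, slots[1:]):
                if next_start < prev_end:
                    return None          # overlapping meetings: schedule is invalid
                gap += next_start - prev_end
        return gap

    # Example: Bob at W 9:00-9:15, Mary at W 8:55-9:00, Sarah at Th 9:05-9:20 -> gap of 0
    print(total_gap([("W", 540, 555), ("W", 535, 540), ("Th", 545, 560)]))   # -> 0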
I would use a search algorithm, like A-star, to do this. Each node in the graph represents one of a person's available time slots, and a path from one node to another means that both node_a and node_b are in the partial schedule.
Another solution would be to create a graph in which the nodes are each person's availability slots and there is an edge from node_a to node_b if the person associated with node_a is not the same as the person associated with node_b. The weight of each edge is the amount of time between the slots associated with the two nodes.
After creating this graph, you could generate a variant of a minimum spanning tree from the graph. The variant would differ from MSTs in that:
you'll only add a node to the MST if the person associated with the node is not already in the MST.
you finish creating the MST when all persons are in the MST.
The minimum spanning tree generated would represent a single schedule.
To generate other schedules, remove all the edges from the graph which are found in the schedule you just created and then create a new minimum spanning tree from the graph with the removed edges.
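A literal, hedged sketch of that construction, assuming single-day slots represented as (person, start_minutes, end_minutes) tuples; it grows the "tree" greedily, skipping any slot that overlaps one already chosen:

    def gap(a, b):
        """Idle time between two slots, or None if they overlap."""
        (_, s1, e1), (_, s2, e2) = a, b
        if s2 >= e1:
            return s2 - e1
        if s1 >= e2:
            return s1 - e2
        return None

    def grow_schedule(slots):
        tree, covered = [slots[0]], {slots[0][0]}
        people = {p for p, _, _ in slots}
        while covered != people:
            best, best_w = None, None
            for cand in slots:
                if cand[0] in covered:
                    continue
                gaps = [gap(cand, t) for t in tree]
                if any(g is None for g in gaps):
                    continue                     # overlaps an already-chosen slot
                w = min(gaps)
                if best_w is None or w < best_w:
                    best, best_w = cand, w
            if best is None:
                return None                      # someone has no non-overlapping slot left
            tree.append(best)
            covered.add(best[0])
        return sorted(tree, key=lambda s: s[1])

    slots = [("Bob", 540, 555), ("Bob", 555, 570), ("Mary", 535, 540), ("Sarah", 560, 575)]
    print(grow_schedule(slots))   # -> [('Mary', 535, 540), ('Bob', 540, 555), ('Sarah', 560, 575)]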
In general, scheduling problems are NP-hard, and while I can't work out a reduction to prove this one is as well, it's quite similar to a number of other well-known NP-complete problems. There may be a polynomial-time solution for finding the minimum gap for a single day (though I don't know one offhand, either), but I have less hope for solving it over multiple days. Unfortunately, it's a complicated problem, and there may not be a perfectly elegant answer. (Or, I'm going to kick myself when someone posts one later.)
First off, I'd say that if your dataset is reasonably small and you've been able to compute all possible schedules fairly quickly, you may just want to stick with that solution, as all others will be approximations and could possibly end up running slower if the constant factor of their running time is large. (That factor doesn't grow with the size of the dataset, so it matters relatively less for a large dataset.)
The simplest approximation would be to just use a greedy heuristic. It will almost assuredly not find the optimum schedules, and may take a long time to find a solution if most of the schedules are overlapping, and there are only a few that are even valid solutions - but I'm going to assume that this is not the case for employee times.
Start with an arbitrary schedule, choosing one timeslot for each employee at random. For each iteration, pick one employee and change his timeslot to the best possible time with respect to the rest of the current schedule. Repeat this process until you're satisfied with the result - when it isn't improving quickly enough anymore or has taken too long. You probably don't want to repeat until you can't make any more changes that improve the schedule, since this process will likely loop for most data.
It's not a great heuristic, but it should produce some reasonable schedules, and it has a lot of room for adjustment. You may want to always try to switch overlapping times first before any others, or you may want to flip the employee who currently contributes to the largest gap, or maybe exclude certain slots you've already tried. You may sometimes want to allow a move to a less optimal solution in the hope that you're at a local minimum and want to get out of it - some randomness can also help with this. Make sure you always keep track of the best solution you've seen so far.
To generate more schedules, the most obvious thing would be to just start the process over with a different random schedule. Or, maybe flip a few arbitrary times from the previous solution you found, and repeat from there.
Edit: This is all fairly related to genetic algorithms, and you may want to use some of the ideas I presented here in a GA.
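A minimal sketch of the heuristic described above, assuming availability is a dict mapping each employee to a list of (day, start_minutes, end_minutes) slots:

    import random

    def total_gap(schedule):
        by_day = {}
        for day, s, e in schedule.values():
            by_day.setdefault(day, []).append((s, e))
        gap = 0
        for day_slots in by_day.values():
            day_slots.sort()
            for (_, prev_end), (next_start, _) in zip(day_slots, day_slots[1:]):
                if next_start < prev_end:
                    return float("inf")          # overlap: treat as infeasible
                gap += next_start - prev_end
        return gap

    def local_search(availability, iters=2000):
        schedule = {emp: random.choice(slots) for emp, slots in availability.items()}
        best, best_gap = dict(schedule), total_gap(schedule)
        for _ in range(iters):
            emp = random.choice(list(availability))
            # best possible slot for this employee given everyone else's current slot
            schedule[emp] = min(availability[emp],
                                key=lambda s: total_gap({**schedule, emp: s}))
            g = total_gap(schedule)
            if g < best_gap:
                best, best_gap = dict(schedule), g
        return best, best_gap

    availability = {"Bob": [("W", 540, 555), ("W", 555, 570)],
                    "Mary": [("W", 535, 540), ("F", 780, 815)],
                    "Sarah": [("Th", 545, 560)]}
    print(local_search(availability))

Restarting local_search() from several random initial schedules, or mutating the best result found so far, gives the multiple candidate schedules mentioned above.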

Comparing Root-finding (of a function) algorithms in Python

I would like to compare different methods of finding roots of functions in Python (like Newton's method or other simple calculus-based methods). I don't think I will have too much trouble writing the algorithms.
What would be a good way to make the actual comparison? I read up a little bit about Big-O. Would this be the way to go?
The answer from #sarnold is right -- it doesn't make sense to do a Big-Oh analysis.
The principal differences between root finding algorithms are:
rate of convergence (number of iterations)
computational effort per iteration
what is required as input (e.g., do you need to know the first derivative, do you need to set lo/hi limits for bisection, etc.)
what functions it works well on (e.g., works fine on polynomials but fails on functions with poles)
what assumptions it makes about the function (e.g., a continuous first derivative, or being analytic, etc.)
how simple the method is to implement
I think you will find that each of the methods has some good qualities, some bad qualities, and a set of situations where it is the most appropriate choice.
Big O notation is ideal for expressing the asymptotic behavior of algorithms as the inputs to the algorithms "increase". This is probably not a great measure for root finding algorithms.
Instead, I would think the number of iterations required to bring the actual error below some tolerance ε would be a better measure. Another measure would be the number of iterations required to bring the difference between successive iterations below some ε. (The difference between successive iterations is probably the better choice if you don't have exact root values at hand for your inputs. You would use a criterion such as successive differences to decide when to terminate your root finders in practice, so you could or should use it here, too.)
While you can characterize the number of iterations required for different algorithms by the ratios between them (one algorithm may take roughly ten times more iterations to reach the same precision as another), there often isn't "growth" in the iterations as inputs change.
Of course, if your algorithms take more iterations with "larger" inputs, then Big O notation makes sense.
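To make that measure concrete, a small sketch that counts iterations until the change between successive estimates drops below a tolerance, using the same stopping test for both methods and f(x) = x**2 - 2, whose root is known:

    def newton_iters(f, df, x0, eps=1e-12, max_iter=100):
        x = x0
        for i in range(1, max_iter + 1):
            x_new = x - f(x) / df(x)
            if abs(x_new - x) < eps:
                return i, x_new
            x = x_new
        return max_iter, x

    def bisection_iters(f, lo, hi, eps=1e-12, max_iter=200):
        for i in range(1, max_iter + 1):
            mid = (lo + hi) / 2
            if (hi - lo) / 2 < eps:
                return i, mid
            lo, hi = (mid, hi) if f(lo) * f(mid) > 0 else (lo, mid)
        return max_iter, (lo + hi) / 2

    f, df = lambda x: x**2 - 2, lambda x: 2*x
    print(newton_iters(f, df, 1.0))      # converges in a handful of iterations
    print(bisection_iters(f, 1.0, 2.0))  # needs roughly 40 halvings for the same tolerance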
Big-O notation is designed to describe how an algorithm behaves in the limit, as n goes to infinity. This is much easier to work with in a theoretical study than in a practical experiment. I would pick things to study that you can easily measure and that people care about, such as accuracy and computer resources (time/memory) consumed.
When you write and run a computer program to compare two algorithms, you are performing a scientific experiment, just like somebody who measures the speed of light, or somebody who compares the death rates of smokers and non-smokers, and many of the same factors apply.
Try to choose an example problem or problems to solve that are representative, or at least interesting to you, because your results may not generalise to situations you have not actually tested. You may be able to increase the range of situations to which your results apply if you sample at random from a large set of possible problems and find that all your random samples behave in much the same way, or at least follow much the same trend. You can get unexpected results even when theoretical studies show that there should be a nice n log n trend, because theoretical studies rarely account for suddenly running out of cache, or out of memory, or usually even for things like integer overflow.
Be alert for sources of error, and try to minimise them, or have them apply to the same extent to all the things you are comparing. Of course you want to use exactly the same input data for all of the algorithms you are testing. Make multiple runs of each algorithm, and check how variable the results are - perhaps a few runs are slower because the computer was doing something else at the time. Be aware that caching may make later runs of an algorithm faster, especially if you run them immediately after each other. Which time you want depends on what you decide you are measuring. If you have a lot of I/O to do, remember that modern operating systems and computers cache huge amounts of disk I/O in memory. I once ended up powering the computer off and on again after every run, as the only way I could find to be sure that the device I/O cache was flushed.
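As a concrete example of that kind of experiment, a minimal timing harness might look like the following, using SciPy's bisect and brentq purely as stand-ins for whatever methods you implement yourself:

    import statistics
    import time
    from scipy.optimize import bisect, brentq

    f = lambda x: x**2 - 2          # same input for every method

    def time_method(method, repeats=20):
        runs = []
        for _ in range(repeats):
            t0 = time.perf_counter()
            method(f, 1.0, 2.0)
            runs.append(time.perf_counter() - t0)
        # report the spread, not just a single number
        return min(runs), statistics.median(runs), max(runs)

    print("bisect:", time_method(bisect))
    print("brentq:", time_method(brentq))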
You can get wildly different answers for the same problem just by changing starting points. Pick an initial guess that's close to the root and Newton's method will give you a result that converges quadratically. Choose another in a different part of the problem space and the root finder will diverge wildly.
What does this say about the algorithm? Good or bad?
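As a small illustration of that point (not from the answer above): the same Newton iteration on f(x) = arctan(x) converges quickly from x0 = 0.5 and diverges from x0 = 2.0:

    import math

    def newton(x0, steps=6):
        x = x0
        trace = [x]
        for _ in range(steps):
            x = x - math.atan(x) * (1 + x * x)   # x - f(x)/f'(x), with f'(x) = 1/(1+x^2)
            trace.append(x)
        return trace

    print(newton(0.5))   # quickly approaches the root at 0
    print(newton(2.0))   # |x| blows up within a few steps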
I would suggest having a look at the following Python root-finding demo.
It is a simple piece of code, with several different methods and comparisons between them (in terms of rate of convergence).
http://www.math-cs.gordon.edu/courses/mat342/python/findroot.py
I just finished a project comparing the bisection, Newton, and secant root-finding methods. Since this is a practical case, I don't think you need to use Big-O notation; Big-O is more suited to an asymptotic view. What you can do is compare them in terms of:
Speed - in my case, Newton was the fastest when the right conditions were met
Number of iterations - in my case, bisection took the most iterations
Accuracy - how often it converges to the right root when there is more than one root, or whether it converges at all
Input - what information it needs to get started; for example, Newton needs an x0 near the root in order to converge, and it also needs the first derivative, which is not always easy to obtain
Other - rounding errors
For the sake of visualization, you can store the value at each iteration in arrays and plot them. Use a function whose roots you already know.
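A minimal sketch of that suggestion, assuming matplotlib is available: record the absolute error at every iteration of bisection, Newton, and the secant method on f(x) = x**2 - 2 (known root sqrt(2)) and plot the three error curves on a log scale:

    import math
    import matplotlib.pyplot as plt

    f, df, root = lambda x: x**2 - 2, lambda x: 2*x, math.sqrt(2)

    def newton_trace(x0, n=10):
        xs = [x0]
        for _ in range(n):
            xs.append(xs[-1] - f(xs[-1]) / df(xs[-1]))
        return xs

    def secant_trace(x0, x1, n=10):
        xs = [x0, x1]
        for _ in range(n):
            a, b = xs[-2], xs[-1]
            if f(b) == f(a):                 # converged to machine precision; stop early
                break
            xs.append(b - f(b) * (b - a) / (f(b) - f(a)))
        return xs

    def bisection_trace(lo, hi, n=10):
        xs = []
        for _ in range(n):
            mid = (lo + hi) / 2
            xs.append(mid)
            lo, hi = (mid, hi) if f(lo) * f(mid) > 0 else (lo, mid)
        return xs

    for name, xs in [("Newton", newton_trace(1.0)),
                     ("secant", secant_trace(1.0, 2.0)),
                     ("bisection", bisection_trace(1.0, 2.0))]:
        plt.semilogy([abs(x - root) for x in xs], label=name)
    plt.xlabel("iteration"); plt.ylabel("absolute error"); plt.legend(); plt.show()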
Although this is a very old post, my 2 cents :)
Once you've decided how you want to compare them (your "evaluation protocol", so to speak), you might be interested in ways to run your challengers on actual datasets.
This tutorial explains how to do it, based on an example (comparing polynomial fitting algorithms on several datasets).
(I'm the author, feel free to provide feedback on the github page!)
