Optimization of a scheduling problem in Python

I am trying to learn about scheduling and have the following use case:
I have different parts that need to be delivered on specific dates; they also have different quantities and different runtimes. Only one machine is considered. The delivery date is a hard constraint, but I would also like to see whether I can optimize the setup of the machine for each product. For that I have a table of the tools used for each part: a 0 in a cell means the tool is not used, a 1 means it is used. There are around 50 tools in total across all parts. I do not want to look only at the delivery dates; I also want to shorten the changeover time from part A to part B by changing as few tools as possible.
I was able to sort my data by date, but I do not know where to start with the optimization or which algorithm might be suitable: a genetic algorithm, or ant colony optimization? I cannot provide any code yet, and I am not asking for a complete program, just a good starting point.
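
One possible starting point, before reaching for a genetic algorithm or ant colony optimization, is to define the changeover cost between two parts as the number of tools that differ between their rows in the 0/1 table. The sketch below assumes that cost definition and a simple greedy rule: keep the parts in due-date order, and only reorder parts that share the same due date so that each next part needs the fewest tool changes. The function names, the tuple layout `(name, due_date, tool_vector)` and the sample data are all hypothetical, and runtimes and quantities are ignored.

```python
from itertools import groupby


def changeover_cost(tools_a, tools_b):
    """Number of tools that differ between two 0/1 tool vectors."""
    return sum(a != b for a, b in zip(tools_a, tools_b))


def greedy_sequence(parts):
    """Order parts by due date; within one due date, pick the part
    with the cheapest changeover from the previously scheduled part."""
    ordered = []
    current_tools = None
    by_due = sorted(parts, key=lambda p: p[1])
    for _, group in groupby(by_due, key=lambda p: p[1]):
        remaining = list(group)
        while remaining:
            if current_tools is None:
                nxt = remaining[0]  # nothing mounted yet, take the first part
            else:
                nxt = min(remaining,
                          key=lambda p: changeover_cost(current_tools, p[2]))
            remaining.remove(nxt)
            ordered.append(nxt)
            current_tools = nxt[2]
    return ordered


def total_changeovers(sequence):
    """Sum of changeover costs between consecutive parts."""
    return sum(changeover_cost(a[2], b[2])
               for a, b in zip(sequence, sequence[1:]))


# Hypothetical data: (name, due date as a day number, tool usage vector)
parts = [
    ("A", 1, (1, 1, 0, 0)),
    ("B", 2, (0, 0, 1, 1)),
    ("C", 2, (1, 1, 1, 0)),
    ("D", 3, (0, 0, 1, 1)),
]
sequence = greedy_sequence(parts)
date_only = sorted(parts, key=lambda p: p[1])
print([p[0] for p in sequence], total_changeovers(sequence))
print([p[0] for p in date_only], total_changeovers(date_only))
```

The same `total_changeovers` function could later serve as the objective (fitness) for a metaheuristic, with the due dates checked as a feasibility constraint.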

Related

Python GEKKO Unexpected Behavior with Constraints

I've been playing around with GEKKO for solving flow optimizations and I have come across behavior that is confusing me.
Context:
Sources --> [mixing and delivery] --> Sinks
I have multiple sources (where my flow is coming from) and multiple sinks (where my flow goes to). For a given source (e.g., SOURCE_1), the total flow to the resulting sinks must equal the volume from SOURCE_1. This is my idea of conservation of mass, where the 'mixing' plant blends all the source volumes together.
Constraint Example (DOES NOT WORK AS INTENDED):
When I try to create a constraint for the two SINK volumes, and the one SOURCE volume:
m.Equation(volume_sink_1[i] + volume_sink_2[i] == max_volumes_for_source_1)
I end up with weird results. With that, I mean, it's not actually optimal, it ends up assigning values very poorly. I am off from the optimal by at least 10% (I tried with different max volumes).
Constraint Example (WORKS BUT I DON'T GET WHY):
When I try to create a constraint for the two SINK volumes, and the one SOURCE volume like this:
m.Equation(volume_sink_1[i] + volume_sink_2[i] <= max_volumes_for_source_1 * 0.999999)
With this, I get MUCH closer to the actual optimum to the point where I can just treat it as the optimum. Please note that I had to change it to a less than or equal to and also multiply by 0.999999 which was me messing around with it nonstop and eventually leading to that.
Also, please note that this uses practically all of the source (up to 99.9999% of it) as I would expect. So both formulations make sense to me but the first approach doesn't work.
The only thing I can think of for this behavior is that it's stricter to solve for == than <=. That doesn't explain to me why I have to multiply by 0.999999 though.
Why is this the case? Also, is there a way for me to debug occurrences like this easier?

The same improvement occurs with complementarity constraints for conditional statements when using s1*s2<=0 (easier to solve) instead of s1*s2==0 (harder to solve).
From the research papers I've seen, the justification is that the solver has more room to search to find the optimal solution even if it always ends up at s1*s2==0. It sounds like your problem may have multiple local minima as well if it converges to a solution, but it isn't the global optimum.
If you can post a complete and minimal problem that demonstrates the issue, we can give more specific suggestions.

Scheduling problem and nonlinear simulations

My problem appears to be a scheduling problem, but there’s a few twists I don’t know how to handle - I’m not even sure what terms to Google.
I have a set of “resources”, which may or may not have a lower bound on their first availability date (or even an upper bound on their last date of availability). The number of resources is fixed.
Then I have a set of tasks: those tasks do not really have any precedence over one another (in general), but some of them may have a lower bound on when they can be executed (i.e., not before a certain date). They of course have durations (in general every task has a different duration).
There cannot be two resources allocated to the same task. Also, the tasks to be executed can only be done at a set of specific locations (i.e., 2D points on a map). For this reason, there cannot be two resources at the same location at the same time, even if they execute two different tasks.
Now, assuming I could ever formulate (and solve) this type of problem, the resulting schedule has to be fed to a time-based fluid flow simulation, which is nonlinear. The results of this simulation are going to give me how much each task is worth (together with how much all the tasks together are worth, which is my target to maximize).
Of course, changing the timing of execution of a task will make the nonlinear simulation give a different “worth” for that task, i.e., executing it in January 2023 will give one number and executing it in June 2023 will give another. Inside the simulator these tasks interfere with one another and they are subject to additional, complex constraints.
Does anyone have any suggestion on how to approach this problem? I’m familiar with Python and linear programming (LPSolve, GLPK, but anything would do I guess).

Rising and Falling Edge in multiple signals - PYTHON

This is the overall scenario: I'm recording some simple signals from a novel sensor using Python 3.8. I have already filtered the signals to get a cleaner representation on which to run other data-analysis algorithms. Nothing special.
Here are some of the signals on which I need to run my algorithm:
First Example
Second Example
These signals come from a sensor I am working on. My aim is to get the timestamps at which the signals start to increase or decrease. I actually need to run this algorithm on only one signal (blue or orange).
I have included both signals because they behave antagonistically, which may be useful for the task.
In other words, these signals relate to foot flexion/extension (FLE/EXT): the point where they start to increase is the point where I start to move my foot. Conversely, when I move my foot back, the signal amplitude decreases.
My job is to identify the FLE/EXT events. I tried examining the first derivative, but it does not seem to give me any useful information.
I have also tried convolving the signal with a fixed-length array of ones and looking for points where the next window's average is greater than the current one.
This approach has two limitations:
Fixed-length array: when the signal represents a faster FLE/EXT (a shorter distance along the x-axis), the window is not adequate to catch the variation.
Threshold criterion: I have to choose how much larger the next average must be than the current one for the point to count.
I am stuck here, because I want a dynamic threshold or something similar that lets me avoid any fixed thresholds.
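
For reference, the moving-average comparison described above can be sketched as follows. This is only an illustration under stated assumptions: the threshold is taken as a fraction of the smoothed signal's own range (one simple way to avoid a hard-coded absolute value, not necessarily the best one), the window length is still fixed, and the names `moving_average`, `find_edges` and `rel_threshold` as well as the synthetic signal are hypothetical.

```python
def moving_average(signal, window):
    """Mean of each full window (the 'valid' part of a ones-kernel convolution)."""
    running = sum(signal[:window])
    out = [running / window]
    for i in range(window, len(signal)):
        running += signal[i] - signal[i - window]
        out.append(running / window)
    return out


def find_edges(signal, window=3, rel_threshold=0.05):
    """Return (sample_index, 'rising' | 'falling') where a trend starts.

    A step between successive window averages counts as a trend when it
    exceeds rel_threshold times the range of the smoothed signal.
    """
    smooth = moving_average(signal, window)
    threshold = rel_threshold * (max(smooth) - min(smooth))
    edges = []
    state = 0  # +1 rising, -1 falling, 0 flat
    for i, (cur, nxt) in enumerate(zip(smooth, smooth[1:])):
        step = nxt - cur
        new_state = 1 if step > threshold else -1 if step < -threshold else 0
        if new_state != 0 and new_state != state:
            # i + window is the newest sample entering the next window
            edges.append((i + window, "rising" if new_state > 0 else "falling"))
        state = new_state
    return edges


# Synthetic example: flat, ramp up, plateau, ramp down, flat
signal = [0] * 5 + [1, 2, 3, 4] + [4] * 5 + [3, 2, 1, 0] + [0] * 5
edges = find_edges(signal)
print(edges)
```

On real, noisy data the relative threshold and the window would still need tuning, so this does not remove the fixed-window limitation mentioned above.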
I would welcome a discussion on how to solve this. What do you think?
If anything is unclear, I am happy to clarify.
Best regards,
V

Scheduling: Minimizing Gaps between Non-Overlapping Time Ranges

Using Django to develop a small scheduling web application where people are assigned certain times to meet with their superiors. Employees are stored as models, with a OneToMany relation to a model representing time ranges and day of the week where they are free. For instance:
Bob: (W 9:00, 9:15), (W 9:15, 9:30), ... (W 15:00, 15:20)
Sarah: (Th 9:05, 9:20), (F 9:20, 9:30), ... (Th 16:00, 16:05)
...
Mary: (W 8:55, 9:00), (F 13:00, 13:35), ... etc
My program allows a basic schedule setup, where employers can choose to view the first N possible schedules with the least gaps in between meetings under the condition that they meet all their employees at least once during that week. I am currently generating all possible permutations of meetings, and filtering out schedules where there are overlaps in meeting times. Is there a way to generate the first N schedules out of M possible ones, without going through all M possibilities?
Clarification: We are trying to get the minimum sum of gaps for any given day, summed over all days.

I would use a search algorithm, like A-star, to do this. Each node in the graph represents a person's available time slots and a path from one node to another means that node_a and node_b are in the partial schedule.
Another solution would be to create a graph in which the nodes are each person's availability times and there is an edge from node_a to node_b if the person associated with node_a is not the same as the person associated with node_b. The weight of each edge is the amount of time between the times associated with the two nodes.
After creating this graph, you could generate a variant of a minimum spanning tree from the graph. The variant would differ from MSTs in that:
you'll only add a node to the MST if the person associated with the node is not already in the MST.
you finish creating the MST when all persons are in the MST.
The minimum spanning tree generated would represent a single schedule.
To generate other schedules, remove all the edges from the graph which are found in the schedule you just created and then create a new minimum spanning tree from the graph with the removed edges.

In general, scheduling problems are NP-hard, and while I can't figure out a reduction for this problem to prove it, it's quite similar to a number of other well-known NP-complete problems. There may be a polynomial-time solution for finding the minimum gap for a single day (though I don't know it offhand, either), but I have less hope for solving it over multiple days. Unfortunately, it's a complicated problem, and there may not be a perfectly elegant answer. (Or, I'm going to kick myself when someone posts one later.)
First off, I'd say that if your dataset is reasonably small and you've been able to compute all possible schedules fairly quickly, you may just want to stick with that solution, as all others will be approximations, and could possibly end up running slower, if the constant factor of their running time is large. (Meaning that it doesn't grow with the size of the dataset, so it will relatively be smaller for a large dataset.)
The simplest approximation would be to just use a greedy heuristic. It will almost assuredly not find the optimum schedules, and may take a long time to find a solution if most of the schedules are overlapping, and there are only a few that are even valid solutions - but I'm going to assume that this is not the case for employee times.
Start with an arbitrary schedule, choosing one timeslot for each employee at random. For each iteration, pick one employee and change his timeslot to the best possible time with respect to the rest of the current schedule. Repeat this process until you're satisfied with the result, that is, when it isn't improving quickly enough anymore or has taken too long. You're probably not going to want to repeat until you can't make any more changes that improve the schedule, since this process will likely loop for most data.
It's not a great heuristic, but it should produce some reasonable schedules, and it has a lot of room for adjustment. You may want to always try to switch overlapping times first before any others, or you may want to try to flip the employee who currently contributes to the largest gap, or maybe eliminate certain slots that you've already tried. You may want to sometimes allow a move to a less optimal solution in the hope of escaping a local minimum; some randomness can also help with this. Make sure you always keep track of the best solution you've seen so far.
To generate more schedules, the most obvious thing would be to just start the process over with a different random schedule. Or, maybe flip a few arbitrary times from the previous solution you found, and repeat from there.
Edit: This is all fairly related to genetic algorithms, and you may want to use some of the ideas I presented here in a GA.
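
A minimal sketch of the heuristic described above, under several assumptions that are not in the question: slots are `(day, start, end)` tuples with times in minutes since midnight, each employee gets exactly one slot, the cost is the sum of gaps between consecutive meetings on each day, and any overlap between consecutive meetings is charged a large penalty. The names `schedule_cost`, `local_search` and `OVERLAP_PENALTY` and the sample data are hypothetical.

```python
import random

OVERLAP_PENALTY = 10_000  # assumed weight: one overlap is worse than any gap


def schedule_cost(schedule):
    """Sum of gaps between consecutive meetings per day, plus a penalty
    for each pair of consecutive meetings that overlap."""
    by_day = {}
    for day, start, end in schedule.values():
        by_day.setdefault(day, []).append((start, end))
    cost = 0
    for meetings in by_day.values():
        meetings.sort()
        for (_, prev_end), (next_start, _) in zip(meetings, meetings[1:]):
            gap = next_start - prev_end
            cost += gap if gap >= 0 else OVERLAP_PENALTY
    return cost


def local_search(availability, iterations=200, seed=0):
    """Random start, then repeatedly move one person to their best slot."""
    rng = random.Random(seed)
    schedule = {p: rng.choice(slots) for p, slots in availability.items()}
    best, best_cost = dict(schedule), schedule_cost(schedule)
    people = list(availability)
    for _ in range(iterations):
        person = rng.choice(people)
        # Best slot for this person given everyone else's current slot
        schedule[person] = min(
            availability[person],
            key=lambda slot: schedule_cost({**schedule, person: slot}),
        )
        cost = schedule_cost(schedule)
        if cost < best_cost:  # always remember the best schedule seen so far
            best, best_cost = dict(schedule), cost
    return best, best_cost


# Hypothetical availability, times in minutes since midnight
availability = {
    "Bob": [("W", 540, 555), ("W", 555, 570), ("W", 900, 920)],
    "Sarah": [("W", 555, 570), ("Th", 545, 560), ("Th", 960, 965)],
    "Mary": [("W", 535, 540), ("W", 570, 600), ("F", 780, 815)],
}
best, best_cost = local_search(availability)
print(best, best_cost)
```

Calling `local_search` with different `seed` values is one way to produce several candidate schedules, in the spirit of restarting from a different random schedule.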

Simulation of molecular dynamics in Python

I am searching for a python package that I can use to simulate molecular dynamics in non-equilibrium situations. I need a setup that can handle a fairly large number of molecules in a primarily kinetic theory manner, and that can handle having solid surfaces present. With regards to the surfaces, I would need to be able to create arbitrary shapes and monitor pressure and other variables resulting from the molecular action. Alternatively, I could add the surface parts myself if I had molecules that could handle it.
Does anyone know of any packages that might be suitable?

Have you considered SimPy? SimPy is a rather generic Discrete Event Simulation package, but could feasibly meet your needs.
Better yet the Molecular Modelling ToolKit (MMTK) seems more specialized...
I have used neither, but this sounds like fun. Python, as a language, seems to be in a privileged position for use in simulation software, whereby people can script the specific details of their model while relying on the framework for all the common logic, such as scheduling, visualization, monitoring, etc. The unknown is how well such toolkits scale when fed with agent counts commensurate with biology models (BTW, how "big" is that?)

LAMMPS and GROMACS are two well-known molecular dynamics codes. Both have some Python-based wrappers, but I am not sure how much functionality the wrappers expose. They may not give you enough control over the simulation.
Google for "GromacsWrapper" or google for "lammps" and "pizza.py"
Digital Material and ASE are two molecular dynamics codes that expose a lot of functionality, but last time I looked, they were both fairly specialized. They may not allow you to use the force potentials that you want.
Google for "digital material" and "cornell" or google for "ase" and dtu
Note to MJV: Normal MD codes take one time step at a time, and they move all particles in each time step. Most of the time is spent calculating the total force on each atom. This involves iterating over a list of pairs of neighboring atoms. I think the best idea is to do the force calculation and a few more basics in C++ or Fortran and then wrap that functionality in Python. (But it could be fun to see how far one can get by using numpy matrices.)
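
To make the time-stepping structure concrete, here is a minimal pure-Python sketch, not taken from any of the packages named above. It assumes a Lennard-Jones pair potential in reduced units, two dimensions, no cutoff and no neighbor list (it loops over all pairs), and velocity Verlet integration. The function names and the three-particle setup are hypothetical, and plain Python like this would be far too slow for a large number of molecules.

```python
def lj_forces(positions, epsilon=1.0, sigma=1.0):
    """Lennard-Jones forces from a loop over all pairs of particles."""
    n = len(positions)
    forces = [[0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dx = positions[i][0] - positions[j][0]
            dy = positions[i][1] - positions[j][1]
            r2 = dx * dx + dy * dy
            sr6 = (sigma * sigma / r2) ** 3
            # -(dU/dr)/r for U = 4*eps*((sigma/r)**12 - (sigma/r)**6)
            f_over_r = 24.0 * epsilon * (2.0 * sr6 * sr6 - sr6) / r2
            fx, fy = f_over_r * dx, f_over_r * dy
            forces[i][0] += fx
            forces[i][1] += fy
            forces[j][0] -= fx  # Newton's third law
            forces[j][1] -= fy
    return forces


def velocity_verlet_step(positions, velocities, forces, dt, mass=1.0):
    """Advance all particles by one time step; returns the new forces."""
    for p, v, f in zip(positions, velocities, forces):
        for d in range(2):
            v[d] += 0.5 * dt * f[d] / mass  # first half kick
            p[d] += dt * v[d]               # drift
    new_forces = lj_forces(positions)
    for v, f in zip(velocities, new_forces):
        for d in range(2):
            v[d] += 0.5 * dt * f[d] / mass  # second half kick
    return new_forces


# Three particles at rest, separated by about 1.5 sigma
positions = [[0.0, 0.0], [1.5, 0.0], [0.75, 1.3]]
velocities = [[0.0, 0.0] for _ in positions]
forces = lj_forces(positions)
for _ in range(100):
    forces = velocity_verlet_step(positions, velocities, forces, dt=0.001)

momentum_x = sum(v[0] for v in velocities)
momentum_y = sum(v[1] for v in velocities)
print(positions, momentum_x, momentum_y)
```

The pair loop in `lj_forces` is the part that dominates the run time and is the natural candidate for moving into C++, Fortran or vectorized numpy code.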

The following programs can be used to run MD simulations:
GROMACS
AMBER
CHARMM
OpenMM
many others...
The following Python packages are useful for preparing and analysing MD trajectories:
MDtraj and the OMNIA ecosystem
MDAnalysis
ProDy
MMTK

Another generic simulations framework is my own GarlicSim. You can try that. I could help you get a simpack up if you're serious about it.

I don't know if this program has all the features you need, but there is Avogadro among the KDE programs. I think it is extensible, and since it is open source you could do anything with it. http://www.kde-apps.org/content/show.php/Avogadro?content=59521
It is really advanced and was programmed by a friend of mine.

I second MMTK, but take a look at VMD, which is the best MD software I'm aware of, and is Python-scriptable (in addition to Tk). See this for examples and tutorials.

I recommend using dedicated molecular dynamics software such as GROMACS to run the MD simulations. This software is highly optimized for that particular purpose. You can also run it on GPUs, which lets you run larger systems in less time.
Afterwards, you run only the analysis of the generated trajectories with Python packages such as:
mdtraj
pmx
