I am starting a project which involves an allocation problem, and having explored a bit by myself, it is pretty challenging for me to solve it efficiently.
What I call an allocation problem here is the following:
There is a set of available slots in 3D space, randomly sampled ([xspot_0, yspot_0, zspot_0], [xspot_1, yspot_1, zspot_1], etc.). These slots are identified with an ID and a position, and are fixed, so they will not change with time.
There are then mobile elements (same number as the number of available slots, on the order of 250,000) which can go from spot to spot. They are identified with an ID, and at a given time step, the spot in which they are.
Each spot must have one and only one element at a given step.
At first, elements are ordered in the same way as spots: the first element (element_id=0) is in the first spot (spot_id=0), etc.
But then, these elements need to move, based on a motion vector that is defined for each spot, which is also fixed. For example, ideally at the first step, the first element should move from [xspot_0, yspot_0, zspot_0] to [xspot_0 + dxspot_0, yspot_0 + dyspot_0, zspot_0 + dzspot_0], etc.
Since spots were randomly sampled, the new target position might not exist among the spots. The goal is therefore to find a candidate slot for the next step that is as close as possible to the "ideal" position the element should be in.
On top of that first challenge, since this will probably be done through a loop, it is possible that the best candidate was already assigned to another element.
Once all new slots are defined for each element (or each element is assigned to a new slot, depending on how you see it), we do it again, applying the same motion with the new order. This is repeated as many times as I need.
Now that I have defined the problem: the first thing I tried was a simple greedy allocation based on this information. However, if I pick the best candidate each time based on the distance to the target position, some elements find their best candidate already taken, as I said, so they fall back to the 2nd, 3rd, ... 20th, ... 100th candidate slot, which ends up far from the ideal position.
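To make the first attempt concrete, here is a minimal sketch of one time step of that first-come, first-served allocation. The function name `greedy_step` and the list-based data layout are assumptions made for illustration, not code from the project, and the brute-force nearest-slot search is only practical for small inputs; with about 250,000 slots a spatial index would be needed.

```python
import math

def greedy_step(slots, motions, element_slot):
    """One time step of the first-come, first-served allocation.

    slots[s]        -> (x, y, z) position of slot s (fixed)
    motions[s]      -> (dx, dy, dz) motion vector attached to slot s (fixed)
    element_slot[e] -> slot currently occupied by element e
    Returns the new element -> slot list (hypothetical layout).
    """
    free = set(range(len(slots)))
    new_slot = [None] * len(element_slot)
    for element, s in enumerate(element_slot):
        # Ideal position: current slot position plus that slot's motion vector.
        target = tuple(p + d for p, d in zip(slots[s], motions[s]))
        # Nearest slot that is still free; ties are broken by slot index.
        best = min(free, key=lambda c: (math.dist(slots[c], target), c))
        new_slot[element] = best
        free.remove(best)
    return new_slot

slots = [(0, 0, 0), (1, 0, 0), (2, 0, 0)]
motions = [(1, 0, 0)] * 3
print(greedy_step(slots, motions, [0, 1, 2]))
```

In this toy case the last element is pushed to the only slot left, three units away from its ideal position, which is the degradation described above. Elements processed late in the loop pay for the choices made by earlier ones.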
Another technique I was trying, without being entirely sure about what I was doing, was to assign a probability distribution calculated by doing the inverse exponential of the distance between the slots and the target position. Then I normalized this distribution to obtain probabilities (which seem arbitrary). I still do not get very good results for a single step.
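A sketch of how such a distribution could be computed, assuming "inverse exponential" means a weight of exp(-distance / scale) per candidate slot. The `scale` parameter and the function name are hypothetical; the choice of scale is exactly what makes the resulting probabilities feel arbitrary.

```python
import math

def slot_probabilities(distances, scale=1.0):
    """Turn distances to the target into selection probabilities.

    Each weight is exp(-distance / scale). Subtracting the smallest
    distance first does not change the normalised result, but it avoids
    all weights underflowing to zero when every distance is large.
    """
    nearest = min(distances)
    weights = [math.exp(-(d - nearest) / scale) for d in distances]
    total = sum(weights)
    return [w / total for w in weights]

print(slot_probabilities([0.0, 1.0, 2.0]))
```

A small `scale` concentrates almost all probability on the nearest slot, while a large one makes the choice nearly uniform.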
Therefore, I was wondering if someone knows how to solve this type of problem in a more accurate/more efficient way. For your information, I mainly use Python 3 for development.
Thank you!
I have a rather simple problem to define but I did not find a simple answer so far.
I have two graphs (i.e. sets of vertices and edges) which are identical. Each of them has independently labelled vertices. Look at the example below:
How can the computer detect, without prior knowledge of it, that 1 is identical to 9, 2 to 10 and so on?
Note that in the case of symmetry, there may be several possible one-to-one pairings which give complete equivalence, but just finding one of them is sufficient for me.
This is in the context of a Python implementation. Does someone have a pointer towards a simple algorithm publicly available on the Internet? The problem sounds simple, but I simply lack the mathematical knowledge to come up with it myself or to find the proper keywords to search for.
EDIT: Note that I also have atom types (i.e. labels) for each graph, as well as the full distance matrix for the two graphs to align. However, the positions may be similar but not exactly equal.
This is known as the graph isomorphism problem, and it is probably very hard, although the exact details of how hard are still a subject of research.
(But things look better if your graphs are planar.)
So, after searching for it a bit, I think that I found a solution that works most of the time for moderate computational cost. This is a kind of genetic algorithm which uses a bit of randomness, but it is practical enough for my purposes it seems. I didn't have any aberrant configuration with my samples so far even if it is theoretically possible that this happens.
Here is how I proceeded:
Determine the complete set of 2-paths, 3-paths and 4-paths
Determine vertex types using both atom type and surrounding topology, creating an "identity card" for each vertex
Do the following ten times:
Start with a random candidate set of pairings complying with the allowed vertex types
Evaluate how much of 2-paths, 3-paths and 4-paths correspond between the two pairings by scoring one point for each corresponding vertex (also using the atom type as an additional descriptor)
Evaluate all other shortlisted candidates for a given vertex by permuting the pairings for this candidate with its other positions in the same way
Sort the scores in descending order
For each score, check if the configuration is among the excluded configurations, and if it is not, take it as the new configuration and put it into the excluded configurations.
If the score is perfect (ie all of the 2-paths, 3-paths and 4-paths correspond), then stop the loop and calculate the sum of absolute differences between the distance matrices of the two graphs to pair using the selected pairing, otherwise go back to 4.
Stop this process after it has been done 10 times
Check the difference between distance matrices and take the pairings associated with the minimal sum of absolute differences between the distance matrices.
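For small graphs, the "identity card" idea can also drive an exact search instead of a randomized one. The sketch below is not the procedure described above: it is a plain backtracking search with worst-case exponential running time, and the card contents (label, degree, sorted neighbour labels) as well as all function names are assumptions made for illustration.

```python
from collections import Counter

def identity_card(adj, labels, v):
    """Vertex signature: own label, degree, sorted labels of its neighbours."""
    return (labels[v], len(adj[v]), tuple(sorted(labels[u] for u in adj[v])))

def find_isomorphism(adj_a, labels_a, adj_b, labels_b):
    """Return a dict mapping vertices of graph A to graph B, or None.

    adj_x maps each vertex to the set of its neighbours.
    """
    cards_a = {v: identity_card(adj_a, labels_a, v) for v in adj_a}
    cards_b = {v: identity_card(adj_b, labels_b, v) for v in adj_b}
    # Both graphs must contain the same multiset of identity cards.
    if Counter(cards_a.values()) != Counter(cards_b.values()):
        return None
    order = list(adj_a)
    mapping, used = {}, set()

    def extend(i):
        if i == len(order):
            return True
        v = order[i]
        for w in adj_b:
            if w in used or cards_b[w] != cards_a[v]:
                continue
            # Edges to already paired vertices must agree in both graphs.
            if all((u in adj_a[v]) == (mapping[u] in adj_b[w]) for u in mapping):
                mapping[v] = w
                used.add(w)
                if extend(i + 1):
                    return True
                del mapping[v]
                used.discard(w)
        return False

    return mapping if extend(0) else None

adj_a = {1: {2}, 2: {1, 3}, 3: {2}}
adj_b = {9: {10}, 10: {9, 11}, 11: {10}}
print(find_isomorphism(adj_a, {1: "C", 2: "O", 3: "C"},
                       adj_b, {9: "C", 10: "O", 11: "C"}))
```

When several pairings are valid because of symmetry, this returns the first one found, which matches the requirement that any one pairing is enough. The distance matrices could then be used to choose among symmetric alternatives.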
I was given a problem in which you are supposed to write Python code that distributes a number of different weights among 4 boxes.
Logically, we can't expect a perfect distribution: given weights like 10, 65, 30, 40, 50 and 60 kilograms, there is no way of grouping those numbers without making one box heavier than another. But we can aim for the most homogeneous distribution, e.g. ((60),(40,30),(65),(50,10)).
I can't even think of an algorithm to complete this task let alone turn it into python code. Any ideas about the subject would be appreciated.
The problem you're describing is similar to the "fair teams" problem, so I'd suggest looking there first.
Because a simple greedy algorithm where each weight is added to the lightest box is not guaranteed to find the best split, the most straightforward solution would be a brute-force recursive backtracking algorithm that keeps track of the best solution it has found while iterating over all possible combinations.
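A minimal sketch of such a backtracking search, assuming "most homogeneous" means the smallest difference between the heaviest and the lightest box (other criteria are possible). The function name `distribute` is made up for this example, and the running time is exponential, so it only suits short lists.

```python
def distribute(weights, boxes=4):
    """Try every assignment of weights to boxes and keep the most even one.

    Returns (spread, groups), where spread is max load minus min load.
    """
    weights = sorted(weights, reverse=True)
    loads = [0] * boxes
    groups = [[] for _ in range(boxes)]
    best = {"spread": float("inf"), "groups": None}

    def place(i):
        if i == len(weights):
            spread = max(loads) - min(loads)
            if spread < best["spread"]:
                best["spread"] = spread
                best["groups"] = [list(g) for g in groups]
            return
        tried_empty = False
        for b in range(boxes):
            if not groups[b]:
                # All empty boxes are interchangeable: try only the first one.
                if tried_empty:
                    continue
                tried_empty = True
            loads[b] += weights[i]
            groups[b].append(weights[i])
            place(i + 1)
            groups[b].pop()
            loads[b] -= weights[i]

    place(0)
    return best["spread"], best["groups"]

spread, groups = distribute([10, 65, 30, 40, 50, 60])
print(spread, groups)
```

For the weights in the question, the best achievable spread is 10 kilograms, with box loads of 60, 60, 65 and 70, which agrees with the grouping given there.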
As stated in j_random_hacker's response, this is not going to be something easily done. My best idea right now is to find some baseline. I describe a baseline as the object with the largest value, since it cannot be subdivided. Using that, you can start trying to match the rest of the data to that value, which would only take about three iterations to do. The first and second would create a list of every possible combination, and then the third can go over that list and compare the different options by taking the average of each group and storing the closest average value to your baseline.
Using your example, 65 is the baseline, and since you cannot subdivide it, you know that it has to be the minimum bound on your data grouping, so you would try to match all of the rest of the values to that. It won't be great, but it does give you something to start with.
As j_random_hacker notes, the partition problem is NP-complete. This problem is also NP-complete by a reduction from the 4-partition problem (the article also contains a link to a paper by Garey and Johnson that proves that 4-partition itself is NP-complete).
In particular, given a list to 4-partition, you could feed that list as an input to a function that solves your box distribution problem. If each box had the same weight in it, a 4-partition would exist, otherwise not.
Your best bet would be to create an exponential-time algorithm that uses backtracking to iterate over the 4^n possible assignments, because unless P = NP (highly unlikely), no polynomial-time algorithm exists for this problem.
This is the overall scenario: I'm recording some simple signals from a novel sensor using Python 3.8. I have already filtered the signals to get a cleaner representation on which to run other data-analysis algorithms. Nothing special.
Below are some of the signals on which I need to run my algorithm:
First Example
Second Example
These signals come from a sensor I am working on. My aim is to get the timestamps where the signals start to increase or decrease. I actually need to run this algorithm on only one signal (blue or orange).
I have shown both signals because they behave antagonistically, which might be useful for the task.
In other words, these signals relate to foot flexion/extension (FLE/EXT): the point where they start to increase is the point where I start to move my foot. Vice versa, when I move my foot back, the signal amplitude decreases.
My job is to identify the FLE/EXT. I tried examining the first derivative, but it does not seem to give me any useful information.
I have also tried a convolution with a fixed-length array of ones, looking for points where the next window's average is greater than the current one.
This approach has 2 constraints:
Fixed-length array: when the signals represent a faster FLE/EXT (i.e. one spanning a shorter distance on the time axis), the window is not adequate to catch the variation.
Threshold criterion: I have to choose how much greater the next average must be than the current one for the point to be kept.
I am stuck here, because I would like to use a dynamic threshold or something similar that lets me avoid any fixed threshold.
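The moving-average approach described above, with its two fixed parameters, can be sketched roughly as follows. This is an illustrative reconstruction, not the actual code: the names `moving_average` and `change_onsets`, and the default values of `window` and `threshold`, are assumptions.

```python
def moving_average(signal, window):
    """Mean of every full window (same as convolving with ones / window)."""
    total = sum(signal[:window])
    means = [total / window]
    for i in range(window, len(signal)):
        total += signal[i] - signal[i - window]
        means.append(total / window)
    return means

def change_onsets(signal, window=3, threshold=0.5):
    """Find where the smoothed signal starts to rise (+1) or fall (-1).

    Returns (index, direction) pairs; index is the start of the window
    whose successor differs from it by more than the threshold.
    """
    means = moving_average(signal, window)
    onsets, state = [], 0
    for i in range(len(means) - 1):
        diff = means[i + 1] - means[i]
        new_state = 1 if diff > threshold else -1 if diff < -threshold else 0
        # Record only the transition into a rising or falling phase.
        if new_state != 0 and new_state != state:
            onsets.append((i, new_state))
        state = new_state
    return onsets

signal = [0, 0, 0, 0, 1, 2, 3, 3, 3, 3, 2, 1, 0, 0, 0]
print(change_onsets(signal, window=3, threshold=0.2))
```

Both `window` and `threshold` have to be tuned by hand here, which is the limitation the two points above describe.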
I hope to have a discussion with you for solving my problem. What do you think?
Please, if something is unclear, I am ready to clarify better.
Best regards,
V
I am trying to solve the following problem:
A group of people consists of N members. Every member has one or more friends in the group. You are to write a program that divides this group into two teams. Every member of each team must have a friend in the other team.
Input:
The first line of input contains the only number N (N ≤ 100). Members are numbered from 1 to N. The second, the third,…and the (N+1)th line contain list of friends of the first, the second, …and the Nth member respectively. This list is finished by zero. Remember that friendship is always mutual in this group.
Output:
The first line of output should contain the number of people in the first team or zero if it is impossible to divide people into two teams. If the solution exists you should write the list of the first group into the second line of output. Numbers should be divided by single space. If there are more than one solution you may find any of them.
My algorithm looks like this:
create a dictionary where each player maps to a list of friends
team1 = ['1']
team2 = []
left = []
for player in dictionary:
    if its friend in team1:
        add to team2
    elif its friend in team2:
        add to team1
    else:
        add it to left
But it still isn't correct. There may be cycles in the dictionary, where the only friend of 6 is 7 and the only friend of 7 is 6. What should I do in such a case? I do not know how long such a cycle may be. Since I have a while loop around my code, I currently run into an infinite loop. I am also trying to add players from left to the teams, but it's not working since they have cycles among them. I don't know how to solve this problem.
Thanks.
Since this is a competition problem and it's clear you want to learn from it, I'll be a little sparse on details and explain more about how I thought about the problem.
First, consider a connected friendship component, then pick any vertex. Since the friendship relationship is symmetric, it's easy to see that adding an edge means that both of its vertices are "solved". This seems to suggest something like finding a perfect matching.
However, finding a perfect matching is not the right criterion: for the complete graph with three vertices, a perfect matching doesn't exist, yet the problem can be solved. So thinking about it a little more, it seems that a Hamiltonian path is sufficient, because you can just alternate teams along it.
If you consider a sufficiently large tree though, it should be clear that there's no Hamiltonian path, but the obvious splitting of teams by even or odd height produces the right result. So the answer seems to be that if you can find a spanning tree, that tree can be used to split the teams into two.
This can be repeated for each component, and just playing around with graphs, it should be convincing enough for a competition, as every component has a spanning tree, so there's nowhere else to expand to. I'm not sure what would be a graph with no possible assignment. Maybe if you have an unconnected node, that's considered invalid?
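A minimal sketch of the spanning-tree idea, using a breadth-first search per component and assigning teams by depth parity. The function name and the dictionary-of-lists input are assumptions made for this example, and an isolated member is treated as making the division impossible.

```python
from collections import deque

def split_by_bfs_depth(friends):
    """Split people into two teams by the parity of their BFS depth.

    friends maps each person to a list of (mutual) friends.
    Returns (team_a, team_b), or None if someone has no friends.
    """
    team = {}
    for start in friends:
        if start in team:
            continue
        if not friends[start]:
            return None  # nobody to pair an isolated person with
        team[start] = 0
        queue = deque([start])
        while queue:
            person = queue.popleft()
            for friend in friends[person]:
                if friend not in team:
                    # A child in the BFS tree goes opposite to its parent.
                    team[friend] = 1 - team[person]
                    queue.append(friend)
    team_a = [p for p in friends if team[p] == 0]
    team_b = [p for p in friends if team[p] == 1]
    return team_a, team_b

friends = {1: [2, 3], 2: [1, 3], 3: [1, 2], 6: [7], 7: [6]}
print(split_by_bfs_depth(friends))
```

Every non-root vertex has its tree parent on the other team, and each root has at least one child there, so the triangle and the isolated pair 6 and 7 from the question are both handled.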
Update: I found an even simpler solution. The original answer is at the bottom. This one is cleaner and comes with a proof ;)
We will be building the solution incrementally. The initial state is that all the people are unallocated, and both teams are empty. We will extend the solution using one of two actions below. After each step, the division will be legal, meaning every allocated person will have a friend allocated to the other team.
Action 1: pick any two unallocated guys that are friends. Put one of them in Team A, the other in Team B. The invariant holds, because newly allocated people know each other and are on separate teams.
Action 2: pick any unallocated guy who has an allocated friend, and place him on the team opposite to that friend. The invariant holds, because the newcomer has that friend on the other team, and everyone allocated earlier keeps the friend they already had there.
So at every step you pick any doable action and execute it. Repeat until there are no more possible actions. When does this happen? It would mean that none of the unallocated people has any friends. Since we assumed that everyone has at least one friend, you will be able to execute the actions until there is nobody left.
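The two actions can be sketched as follows. This is one possible ordering of the actions, not the only one: Action 1 seeds a pair, then Action 2 is applied exhaustively before the next seed is chosen. The function name and the input format are assumptions made for illustration.

```python
def split_with_actions(friends):
    """Allocate everyone to team "A" or "B" using the two actions.

    friends maps each person to a list of (mutual) friends.
    Returns a dict person -> team, or None if someone has no friends.
    """
    team = {}
    for person in friends:
        if person in team:
            continue
        # Action 1: pair this person with any still unallocated friend.
        partner = next((f for f in friends[person] if f not in team), None)
        if partner is None:
            return None  # only reachable when the person has no friends
        team[person], team[partner] = "A", "B"
        pending = [person, partner]
        # Action 2: place unallocated friends opposite an allocated person.
        while pending:
            allocated = pending.pop()
            for f in friends[allocated]:
                if f not in team:
                    team[f] = "B" if team[allocated] == "A" else "A"
                    pending.append(f)
    return team

friends = {1: [2, 3], 2: [1, 3], 3: [1, 2], 6: [7], 7: [6]}
print(split_with_actions(friends))
```

Because Action 2 runs until it no longer applies, any person still unallocated when the outer loop reaches them has only unallocated friends, so Action 1 can always pick the first one.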
Original answer:
The problem seems complicated at first, but in fact does not require rocket science. The constraint on the division is rather loose - everyone needs just one friend on the other team.
Consider a simpler case first. Let's say you are given two teams of people and one extra player who arrived late to the party and needs to be allocated to one of the two existing teams. If he has no friends at all, that's impossible. But if he does have any friends, you pick one of his friends and allocate the newcomer to the other team.
The outcome? If you could start with some small teams and then arrange the rest of the people in such a way that they always know someone who came before, you're golden. This means we have reduced the initial big problem to two smaller ones.
Tackling the first one is easy. In order to bootstrap the teams, just pick any two guys that know each other, put one in Team A, the other in Team B, and it works.
Now, the second: adding the rest of the people. Take a look at all the people that are already allocated to teams and see if they have any unallocated friends. Case 1: one of the already allocated guys has an unallocated friend. You can easily add him to the team opposite his friend. Case 2: all the friends of the allocated guys are already allocated, too. This means the initial friendship graph was not connected, which doesn't hurt at all: just take any unallocated guy together with one of his friends and bootstrap them as before, one in each team.
I'm attempting to write a program which finds the 'pits' in a list of
integers.
A pit is any integer x where x is less than or equal to the integers
immediately preceding and following it. If the integer is at the start
or end of the list it is only compared on the inward side.
For example in:
[2,1,3] 1 is a pit.
[1,1,1] all elements are pits.
[4,3,4,3,4] the elements at 1 and 3 are pits.
I know how to work this out by taking a linear approach and walking along
the list; however, I am curious about how to apply divide-and-conquer
techniques to do this comparatively quickly. I am quite inexperienced and
am not really sure where to start. I feel like something similar to a binary
tree could be applied?
If it's pertinent, I'm working in Python 3.
Thanks for your time :).
Without any additional information on the distribution of the values in the list, it is not possible to achieve an algorithmic complexity better than O(n), where n is the number of elements in the list.
Logically, if the dataset is random, such as Brownian noise, a pit can happen anywhere, so every element has to be inspected in order to correctly find every pit.
Even if one just wants to find the absolute lowest pit in the sequence, that cannot be done in sub-linear time without sacrificing the correctness of the results.
Optimizations can be considered, such as mere parallelization or skipping the values neighboring a pit that is strictly smaller than them, but the overall complexity would stay the same.
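For reference, the linear scan that this bound refers to can be sketched as below, using the definition of a pit from the question. The function name `find_pits` is made up for this example, and treating a single-element list as one pit is an assumption, since the question does not cover that case.

```python
def find_pits(values):
    """Return the indices of all pits in one left-to-right pass."""
    pits = []
    last = len(values) - 1
    for i, x in enumerate(values):
        # An element at either end is only compared on its inward side.
        left_ok = i == 0 or x <= values[i - 1]
        right_ok = i == last or x <= values[i + 1]
        if left_ok and right_ok:
            pits.append(i)
    return pits

print(find_pits([2, 1, 3]))        # [1]
print(find_pits([1, 1, 1]))        # [0, 1, 2]
print(find_pits([4, 3, 4, 3, 4]))  # [1, 3]
```

Each element is compared with at most two neighbours, so the scan runs in O(n) time and already meets the lower bound discussed above.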