I have a few lists of movement tracking data, which look something like this.
I want to create a list of outputs where I mark these large spikes, essentially indicating that there is movement at that point.
I applied a rolling standard deviation to the data with a window size of two and got this result.
Now I can see the spikes that mark the points of interest, but I am not sure how to detect them in code. Is there a statistical tool to measure these spikes that could be used to flag them?
There are several approaches that you can use for an anomaly detection task.
The choice depends on your data.
If you want to use a statistical approach, you can use some measures like z-score or IQR.
Here you can find a tutorial for these measures.
Here, instead, you can find another tutorial for a statistical approach that uses mean and variance.
Last but not least, I also suggest you check how to use a control chart, because in some cases it's enough.
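For example, here is a minimal sketch of the z-score idea applied to the rolling standard deviation you already computed. The synthetic signal, the window size, and the 3-sigma threshold are illustrative assumptions, not values tuned to your data:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the tracking data: mostly flat, with a few spikes.
rng = np.random.default_rng(0)
signal = pd.Series(rng.normal(0, 0.1, 500))
signal.iloc[[100, 250, 400]] += 5  # injected "movements"

# Rolling standard deviation, as in the question (window size of two).
rolling_std = signal.rolling(window=2).std()

# z-score of the rolling std; anything more than 3 sigma out gets flagged.
z = (rolling_std - rolling_std.mean()) / rolling_std.std()
movement = z.abs() > 3

print(signal[movement])  # the flagged spike positions
```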
I am working on a problem but cannot quite categorize it.
The problem consists of placing some items on a grid.
Each edge between one item and another gives a certain benefit, and the items are supposed to be placed such that I get the maximum possible sum of benefits for the particular configuration of that grid.
The grid itself has some coordinates that cannot contain an item (such as a wall).
It is also allowed to leave some items unplaced.
Can I get some guidance on how to go about it?
An algorithm, a process for approaching the problem, a data structure to use, or any relevant information would help.
Unfortunately I can't make the question any clearer.
Thanks
Here are my high-level thoughts on this.
So, this sounds to me like a graph problem. Each item in your "item bank" will become a node in the graph. You then make edges between the nodes and assign a weight (integer value) to each edge based on how beneficial the relationship is between the two nodes the edge connects.
Once you've set up the graph, you could then implement a maximum flow algorithm. Of course, this would also require that you set up "source" and "sink" nodes. The maximum flow you get would then represent the maximum benefit.
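As a hedged sketch of that setup, here is what the graph construction and a maximum-flow call could look like with networkx. The item names, benefit values, and source/sink capacities are made up for illustration, and this sketch leaves out the grid/wall constraints, which you would still have to encode:

```python
import networkx as nx

# Build the graph: items are nodes, and edge "capacity" is the benefit
# of having the two items relate to each other.
G = nx.DiGraph()
G.add_edge("source", "itemA", capacity=10)
G.add_edge("source", "itemB", capacity=10)
G.add_edge("itemA", "itemC", capacity=4)   # A relating to C is worth 4
G.add_edge("itemB", "itemC", capacity=7)   # B relating to C is worth 7
G.add_edge("itemC", "sink", capacity=10)

# Maximum flow from source to sink; the flow value would then be read
# as the maximum achievable benefit.
flow_value, flow_dict = nx.maximum_flow(G, "source", "sink")
print(flow_value)
```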
I'm trying to clean the line noise from this captcha so I can implement an algorithm to read it. However, I'm having some difficulty making it readable to an AI using techniques such as OpenCV thresholding combined with some resources from PIL.Image. I also tried an algorithm to "chop" the image, which gave me better results, but still far from what I expected. I want to know if there is an alternative that removes noise from captchas like this one effectively.
(I'm using Python.)
Initially, the Captcha looks like this:
Once processed using OpenCV + Pillow, I've got this:
Later, using the "chop method", this is what we have:
However, I need a better final image, and I think this combination of methods is not the right one. Is there a better alternative?
I think you could try minisom: https://github.com/JustGlowing/minisom
SOMs (self-organizing maps) are a type of neural network that groups clusters of points in data. With an appropriate threshold, it could help you remove the lines that are not surrounding the numbers/letters, and combining that with the chop method could do the job.
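To make that concrete, here is a rough sketch of the idea, assuming a binarized captcha where the glyphs are the dense foreground regions. The SOM grid size, training length, and distance threshold are guesses that would need tuning on your actual image:

```python
import cv2
import numpy as np
from minisom import MiniSom

# Binarize so that foreground (ink) pixels are nonzero.
img = cv2.imread("captcha.png", cv2.IMREAD_GRAYSCALE)
_, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# Treat each foreground pixel (row, col) as a 2-D sample.
points = np.column_stack(np.nonzero(binary)).astype(float)

# Train a small SOM; its nodes drift toward the dense clusters (the glyphs).
som = MiniSom(4, 4, 2, sigma=1.0, learning_rate=0.5)
som.train_random(points, 500)

# Keep only pixels close to their best-matching node; thin line noise
# tends to lie farther from the clusters the nodes converge to.
weights = som.get_weights()
keep = np.array([np.linalg.norm(p - weights[som.winner(p)]) < 10.0 for p in points])

cleaned = np.zeros_like(binary)
ys, xs = points[keep].astype(int).T
cleaned[ys, xs] = 255
cv2.imwrite("cleaned.png", cleaned)
```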
I have two identical images. One was marked by an algorithm, and the other (already marked) serves as ground truth. I'm able to segment the marks from the images like in the following example.
GROUND_TRUTH
ALGORITHM
My question is what is the best way to compare the mark produced by the algorithm with the ground truth?
So far I've tried subtracting the image marked by the algorithm from the ground truth and counting the remaining pixels to compute the success of the comparison, using the equation success = 1 - (number of remaining pixels after subtraction) / (number of pixels in the ground truth).
But I'm not convinced by this method, especially in the case where the mark made by the algorithm and the ground truth are in different places. In the example, the part of the mark made by the algorithm that is at the top is not accounted for in the comparison. How could I deal with this?
SUBTRACTED
I'm using openCV and python to work with the images.
You have binary masks.
Calculate intersection over union ("IoU").
Both numpy itself and OpenCV have ways to calculate the logical AND/OR of two boolean arrays, and both have ways to count the number of non-zeros.
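A minimal sketch with OpenCV and numpy, assuming both masks are same-sized single-channel images whose nonzero pixels are the mark (the file names are placeholders):

```python
import cv2
import numpy as np

# Load both masks as boolean arrays.
gt = cv2.imread("ground_truth_mask.png", cv2.IMREAD_GRAYSCALE) > 0
pred = cv2.imread("algorithm_mask.png", cv2.IMREAD_GRAYSCALE) > 0

intersection = np.count_nonzero(gt & pred)
union = np.count_nonzero(gt | pred)

iou = intersection / union if union else 1.0  # two empty masks match trivially
print(f"IoU: {iou:.3f}")  # 1.0 = perfect overlap, 0.0 = no overlap
```

Unlike the subtraction score, IoU penalizes both missed ground-truth pixels and extra pixels the algorithm marked elsewhere, so the misplaced top part of the mark would lower the score.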
Is it better to implement my own K-means Algorithm in Python or use the pre-implemented K-mean Algorithm in Python libraries like for example Scikit-Learn?
Before answering which is better, here is a quick reminder of the algorithm:
"Choose" the number of clusters K
Initialize your first centroids
For each point, find the closest centroid according to a distance function D
When all points are attributed to a cluster, calculate the barycenter of each cluster, which becomes its new centroid
Repeat steps 3 and 4 until convergence
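To make the steps concrete, here is a minimal NumPy sketch of the loop above, using Euclidean distance for D and the mean as the barycenter (for brevity it ignores the empty-cluster edge case):

```python
import numpy as np

def kmeans(points, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    # Step 2: pick k distinct points as the initial centroids.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Step 3: assign each point to its closest centroid (Euclidean D).
        dists = np.linalg.norm(points[:, None] - centroids[None], axis=2)
        labels = dists.argmin(axis=1)
        # Step 4: the barycenter (mean) of each cluster becomes its new centroid.
        new_centroids = np.array([points[labels == i].mean(axis=0) for i in range(k)])
        # Convergence metric: stop when the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids
```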
As stressed previously, the algorithm depends on various parameters:
The number of clusters
Your initial centroid positions
A distance function to calculate distance between any point and centroid
A function to calculate the barycenter of each new cluster
A convergence metric
...
If none of the above is familiar to you, and you want to understand the role of each parameter, I would recommend re-implementing it on low-dimensional datasets. Moreover, the implemented Python libraries might not match your specific requirements - even though they provide good tuning possibilities.
If your goal is to use it quickly with a big-picture understanding, you can use an existing implementation - scikit-learn would be a good choice.
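For instance, the library version boils down to a couple of lines (random data here, just to show the call):

```python
import numpy as np
from sklearn.cluster import KMeans

X = np.random.rand(200, 2)  # any (n_samples, n_features) array

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(km.labels_[:10])       # cluster index of the first ten points
print(km.cluster_centers_)   # the final centroids
```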
I want to create a music player with Python which uses OpenGL for visualizing the audio spectrum.
I already have the 3d engine set up and all I need is to get the spectrum data and feed it to the renderer.
I'd imagine it would be a list of numbers updated every few milliseconds or so.
I've heard you can get that info with FMOD, and there's the pyfmodex Python wrapper for it, but I can't access the FMOD documentation and pyfmodex is almost undocumented. I can't find what I need just by browsing the class/function names.
If there's another library which can get that info that will also work.
numpy has an FFT function that will compute a fast Fourier transform on a block of input data. You can use its output to obtain your spectral information.
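As a hedged sketch, here is one way to get a block of spectral magnitudes with numpy, assuming a mono 16-bit WAV read with the standard-library wave module (any audio source that yields PCM samples would do; "song.wav" is a placeholder):

```python
import wave
import numpy as np

BLOCK = 1024  # samples per analysis window (~23 ms at 44.1 kHz)

wav = wave.open("song.wav", "rb")
rate = wav.getframerate()

raw = wav.readframes(BLOCK)
samples = np.frombuffer(raw, dtype=np.int16).astype(float)

# rfft returns the spectrum of a real-valued signal; its magnitudes are
# the per-bin energies you would feed to the renderer each frame.
spectrum = np.abs(np.fft.rfft(samples))
freqs = np.fft.rfftfreq(len(samples), d=1.0 / rate)
print(freqs[:5], spectrum[:5])
```

In the player you would repeat this per block in sync with playback, so the renderer gets a fresh list of magnitudes every few milliseconds.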