Seam insertion coordinates - Seam Carving - python

I'm having some trouble understanding seam insertion for image enlarging with Seam Carving. AFAIK, to enlarge an image by k pixels it's necessary to remove k seams, recording their coordinates and using them to reproduce the process backwards, i.e. re-add the deleted seams but duplicating them and applying some kind of average with neighbouring seams (I'm not concerned with this, since it should be the easy part). My confusion is about the correctness of the recorded coordinates: they are local to the image from which each seam was removed, so by 'restoring' the first seam the coordinates of every other recorded seam become invalid. Am I supposed to correct these coordinates by checking whether every pixel coordinate of every seam still to be added comes after the previously added seams? This seems rather cumbersome and highly inefficient, given that I've read inserting seams should be trivial once the seam removal part has been achieved (it has).
I'm not sure if I'm communicating my doubt properly. Let me know if that's not the case, although I tried to be as clear as possible.

As pointed out in the comments, you have to fix indices when inserting anyway, even if you can avoid "fixing" them in the removal part.
You can find a full implementation of seam carving and seam insertion in python here.
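For what it's worth, a minimal sketch of that bookkeeping might look like the code below. It assumes helper functions remove_seam(img) (returning the carved image plus the seam as one NumPy array of column indices, one per row) and insert_seam(img, seam) (duplicating that column and averaging with its neighbours) already exist; those names are placeholders, not from any particular library.
def enlarge_by_k(img, k):
    # Phase 1: remove k seams from a working copy, recording each one.
    seams = []
    carved = img
    for _ in range(k):
        carved, seam = remove_seam(carved)  # seam: NumPy array of column indices, one per row
        seams.append(seam)

    # Phase 2: re-insert the recorded seams into the original image, in removal order.
    out = img
    for _ in range(k):
        seam = seams.pop(0)
        out = insert_seam(out, seam)
        # The remaining recorded seams were found in smaller images, so in each row
        # any coordinate at or to the right of the just-inserted column shifts by 2
        # (the original pixel plus its duplicate).
        for later in seams:
            later[later >= seam] += 2
    return out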

Related

How to get the largest rectangle inside a contour?

I'd like to ask if there's a better or faster alternative way to get the largest rectangle inside an almost rectangular contour.
The rectangle should be aligned to both x and y axis and should be completely inside the rectangular contour. That means it would not contain any external white pixels, yet occupy the largest area in the contour.
Test image is here:
I've tried these two, but I'm looking to see if there's a faster and neater way to go about this.
I also tried going through the points of a contour and getting the minimum and maximum points as in here, but of course it just gives results similar to what cv2.boundingRect already does.
Maybe this is a bit of lateral thinking, but looking at your examples and spec, why not fill in the white pixels contiguous with the outside bounding box instead? (Like a 'paint pot' brush in a Paint-type application.)
E.g. (red pixels being the ones you would turn black from white):
You could probably even limit the process to the outer N pixels.
============================
So how might one implement this? It is essentially a version of the "flood fill" algorithm used in pixel graphics programmes, except that you start not from a single seed pixel but by checking every point on the edge of the outside bounding rectangle. You start filling in and build a stack of points you need to come back to, because you can't necessarily follow every area at once and may need to go back on yourself.
You can look that algorithm up, but a 'pure' version will be very stack-heavy if you push every point you can't follow right now, particularly starting with the whole boundary of the shape.
I haven't implemented it this way, but my first thought would be a scan from a boundary inwards, taking a whole line of pixels at a time and marking all the 'white' pixels with a new 3rd colour, then on the next row you fill all the white pixels touching the previously marked pixels, and so on. (It doesn't matter whether you mark the changed pixels with a 3rd colour, a mask, an alpha channel or whatever - but you must be able to tell newly filled-in pixels from the old black ones.)
As you go, you need to check for any 'stranded' areas where you need to work backwards to fill in white areas that are not directly connected to the outside:
Start filling from the edge...
Watch out for stranded areas - if you find one, scan backwards to fill it before returning to where you were, to carry on (you may need to recurse if your stranded area turns back on itself again, though in your particular application this shouldn't be a huge issue, unlike in some graphics applications).
And continue, not forgetting to fill in from the other edges if required (see note below) until you come to a row with no further pixels to fill and no more back-filling to do. Then restart at the far side of the image as you need to start a backward pass from the far side to catch anything else on that side.
For a practical implementation there is some thinking to do. Your examples will have a lot of filling at the edge but not much by way of complex internal shapes to follow, which keeps things simple. But you need to work from all 4 sides to do it efficiently - perhaps working inwards as a series of concentric rectangles rather than one side at a time. More complexity to work through in the design, but massively more efficient in this example.
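For illustration, a queue-based version of that border-seeded fill (rather than the scanline variant just described) might look roughly like this, assuming img is a 2-D NumPy array where white is 255 and using a third value (128) as the fill colour:
from collections import deque

FILL = 128  # the "3rd colour" marking white pixels connected to the outside

def fill_from_border(img):
    h, w = img.shape
    queue = deque()
    # seed with every pixel on the image boundary
    for x in range(w):
        queue.extend([(0, x), (h - 1, x)])
    for y in range(h):
        queue.extend([(y, 0), (y, w - 1)])
    while queue:
        y, x = queue.popleft()
        if 0 <= y < h and 0 <= x < w and img[y, x] == 255:
            img[y, x] = FILL
            queue.extend([(y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)])
    return img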
Food for thought anyhow.

Interpolation between two keys is off after Python import

I'm trying to import an animation from a file to Maya, but it gives me odd results between the interpolations:
https://i.imgur.com/cP27Yai.mp4
It was weird because the keyframes looked right at first, until I looked at the graph editor.
I thought at first this was gimbal lock, so I used the Euler Filter, but it didn't solve it. Sometimes the difference between one key and the next is 180, which is why the keys look fine just watching the animation, but the interpolation makes it do a 180 rotation.
So if I go one by one, subtract 180 from the value of the key, and then invert the number (to positive or negative depending on the case), I can make it work by tweaking it a bit.
However, this is too much work, especially for biped animations; it could take me forever.
Is this a common issue or something that has happened to anyone else before? Is there any way to fix this? Maybe it's the way I'm applying the Euler angles, since they were initially quaternions, but I did not find a way to apply the quaternions directly:
import math
import maya.cmds as cmds
from maya.api.OpenMaya import MEulerRotation  # or maya.OpenMaya, depending on which API the rest of the script uses

# Taking a rotation from the quaternion array and converting it to Euler (radians to degrees).
arot = anim.AnimRot[index].normal().asEulerRotation()
frot = MEulerRotation(math.degrees(arot.x), math.degrees(arot.y), math.degrees(arot.z))
cmds.setAttr(obj + ".rotate", frot.x, frot.y, frot.z)
cmds.setKeyframe(obj, time=anim.TotalKeys[index])
Is there any way to fix this from the editor or the script? Anything that fixes it would do me a great favor for importing this biped animation. I believe this is due to the Euler conversion, but I found no way to apply a quaternion to a bone in the Maya API.
If the rotations already are quaternions, you might want to simply set the anim curves to quaternion interpolation using something like
cmds.rotationInterpolation( 'pSphere2.rotateX', 'pSphere2.rotateY', 'pSphere2.rotateZ', c="quaternionSquad")
To be safe I'd set one key, then apply the rotationInterpolation to change the keys to quats, then step through applying your original quaternions. Since you're already in API land you can make an MTransformationMatrix and use its setRotationComponents method to set the quat values so you don't ever convert to eulers.
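As a rough sketch of that workflow (key once, convert the curves, then step through the keys), reusing the anim, obj, AnimRot and TotalKeys names assumed from the question rather than any real API:
import math
import maya.cmds as cmds

# 1) Key the rotation once so the rotate anim curves exist.
cmds.setKeyframe(obj, attribute="rotate", time=anim.TotalKeys[0])

# 2) Convert those curves to quaternion (squad) interpolation.
cmds.rotationInterpolation(obj + ".rotateX", obj + ".rotateY", obj + ".rotateZ",
                           c="quaternionSquad")

# 3) Step through the keys; the Euler values written here are now interpolated
#    as quaternions between keys, so the 180-degree flips should disappear.
for index, key_time in enumerate(anim.TotalKeys):
    rot = anim.AnimRot[index].normal().asEulerRotation()
    cmds.setAttr(obj + ".rotateX", math.degrees(rot.x))
    cmds.setAttr(obj + ".rotateY", math.degrees(rot.y))
    cmds.setAttr(obj + ".rotateZ", math.degrees(rot.z))
    cmds.setKeyframe(obj, attribute="rotate", time=key_time)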

Separating Axis Theorem in Python/Pyglet

I'm trying to make a (sort-of) clone of Asteroids in Python using Pyglet. I figured I'd try to get a little fancy and implement the separating axis theorem to do collision. I got it to work, but the problem is that it's miserably slow. I test collision between bullets that the player shoots and the asteroids on the screen in a double for-loop, which I believe is quadratic time, but the frame rate drops from about 60 to 30 fps by the time there's about 6 asteroids and 6 bullets on the screen, which seems incredibly slow, even for a non-optimized way of detecting collision.
So I ran a profiler to determine where, exactly, in the code the program is getting hung up. It seems to be hung up in the method where I transform shape vertices into world space (I define the shapes around the origin and use OpenGL code to transform to world space for drawing, which I believe is the right way to do it). I grab the transformation matrix from OpenGL, turn it into a NumPy array, and then multiply each vertex by this matrix to get the transformed vertices. It's worth noting that I do this every collision check: I used to use XNA, and when I implemented the SAT in that (I made an asteroids clone there, too), the vertices were also defined around the origin and then you had to transform them using a world matrix.
Is it best to store the vertices around (0, 0) and transform each call, or just store the transformed vertices? I feel like the algorithm shouldn't be THIS slow, so I'm willing to bet I screwed up implementing something. If I was better at profiling (I'm pretty unfamiliar with it) I might be able to get a more complete picture, but I was hoping you guys might have some idea.
Here's a direct link to the file with the Shape class in it, where all the collision logic happens: shape.py. The specific method that the profiler seemed to mark as the bottleneck was __get_transformed_verts. Obviously you can get to the entire repo from there too, but just be aware that there's still a good deal not commented.
As Nico suggests in comments, a quick way to get a good speed-up would be to check simpler geometry first. For an Asteroids clone I guess a circle will be a good fit (or sphere for 3D). If the circles (at least large enough to cover your actual shape) don't overlap, then there is no need to do the more expensive geometry test.
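For example, a minimal sketch of that broad-phase check (center and radius are assumed to be stored on each shape, with the radius large enough to enclose all of its vertices, and sat_collision standing in for your existing separating-axis test):
def circles_overlap(a, b):
    dx = a.center[0] - b.center[0]
    dy = a.center[1] - b.center[1]
    r = a.radius + b.radius
    return dx * dx + dy * dy <= r * r  # compare squared distances, no sqrt needed

def collides(a, b):
    if not circles_overlap(a, b):
        return False  # cheap rejection; most pairs stop here
    return sat_collision(a, b)  # only now run the expensive SAT test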
If you have many objects, you will probably want to avoid doing n*n tests every frame. Take a look at space partitioning structures/algorithms. The simplest scheme with a lot of moving objects in 2D would be a grid. Then you only need to test objects belonging to the same - or neighbouring - grid cells for collision.
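A sketch of the simplest grid scheme, assuming each object exposes a center (x, y) and picking a cell size roughly the diameter of your largest object:
from collections import defaultdict
from itertools import product

CELL = 64.0  # assumed cell size; tune to your asteroid sizes

def build_grid(objects):
    grid = defaultdict(list)
    for obj in objects:
        cell = (int(obj.center[0] // CELL), int(obj.center[1] // CELL))
        grid[cell].append(obj)
    return grid

def candidate_pairs(bullets, grid):
    # Only bullet/asteroid pairs in the same or a neighbouring cell get tested.
    for bullet in bullets:
        cx = int(bullet.center[0] // CELL)
        cy = int(bullet.center[1] // CELL)
        for dx, dy in product((-1, 0, 1), repeat=2):
            for asteroid in grid.get((cx + dx, cy + dy), ()):
                yield bullet, asteroid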
Another thing I noticed: You generate the transformed vertices every time you test for collision. It would be quicker to generate them only once per timestep (frame) for each object that fails the circle-circle test.
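In code that could be as simple as the sketch below, where get_transformed_verts stands in for your __get_transformed_verts:
def update(dt, shapes):
    # Transform each shape's vertices once per frame...
    for shape in shapes:
        shape.world_verts = shape.get_transformed_verts()
    # ...then every collision test this frame reads shape.world_verts
    # instead of redoing the matrix multiply per test.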

Simple quick robust image comparison

I have an image-finding and "blur-compare" task. I could not figure out which methods I should use.
The setup is this: a box of, say, 100x100 pixels either is mostly filled by an object or not. To the human eye this object is always almost the same, but it might change by blur, slight rescaling, tilting 3-dimensionally, moving to the side or up/down by one or two pixels, or other very small graphical changes.
What is a simple quick robust and reliable way to check if the transformed object is there or not? Points to python packages as well as code would be nice.
Not sure I entirely understand your question, but I'll give it a shot..
Assuming:
we just want to know if there is some object in a box.
the empty box is always the same
perfect box alignment etc.
You can do this:
subtract the query image from your empty box image.
sum all pixels
if the value is zero the images are identical, therefore no change, so no object.
Obviously there actually is some difference between the box parts of the two images, but the key thing is that the non-object parts of the images are as similar as possible in both pictures. If this is the case, then we can use the above method but with a threshold test as the 3rd step. Provided the threshold is set reasonably, it should give a decent prediction of whether the box is empty or not.
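A minimal sketch of those three steps with NumPy (the file names and threshold are placeholders you would tune on a few known empty and known occupied boxes):
import numpy as np
from PIL import Image

empty = np.asarray(Image.open("empty_box.png").convert("L"), dtype=np.int32)
query = np.asarray(Image.open("query_box.png").convert("L"), dtype=np.int32)

# 1) subtract, 2) sum the (absolute) differences, 3) compare against a threshold
difference = np.abs(query - empty).sum()
THRESHOLD = 50000
print("object present" if difference > THRESHOLD else "box empty")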

Region Growing Algorithm

Hey everyone. I'm really struggling to figure out the logic with this one and was hoping you could help me out. Before I continue, I just want to let you know that I am an amateur programmer and a beginner at that, with no formal Computer Science training of any sort, so please bear with me. :D Also, I'm using Python, but I could use Java or something similar.
Anywho, I am looking to implement a region-growing algorithm for use in a rudimentary Drawbot.
Here is an article on region growing: http://en.wikipedia.org/wiki/Region_growing
The way I envision it, the image the drawing is based upon will meet the following criteria:
The image will be at most 3x3 inches in size at an arbitrary Color Depth
The image will be a black continuous shape on a white background
The shape can be located anywhere on the background.
I've considered the following solutions to this problem. While some work to an extent, each has some considerable flaws in either their performance or feasibility (at least they don't seem feasible to me). Furthermore, because this is a Drawbot, this needs to be done with a single continuous line. This doesn't mean however that I can't backtrack, it only eliminates the possibility of multiple starting points (seeds).
Considered Approaches:
Random Walk:
Solving this problem with a random walk was my first instinct. A random walk program accomplishing this would, I imagine, look something like this:
# pseudo python...
cells_to_visit = number_of_black_cells
cells_visited = 0
mark_color = red
while cells_visited < cells_to_visit:
    if current_cell is black:
        mark current_cell as visited  # change pixel to red
        cells_visited += 1
    neighbors = get_adjacent_cells()  # returns cells that are either black or red
    next_cell = random.choice(neighbors)
    current_cell = next_cell
While I suppose this is feasible, it seems to me to be highly ineffective and doesn't guarantee good results, but in the interest of actually getting something done I may end up trying this... Is my logic in the pseudocode even vaguely correct?
Sweeping Pattern:
This method seemed to me to be the most trivial to implement. My idea here is that I could choose a starting point at one extreme of the shape (e.g. the lowest, left-most point). From there it would draw to the right, moving only on the x axis until it hit a white pixel. From here it would move up one pixel on the y axis, and then move left on the x axis until it reached a white pixel. If the pixel directly above it happened to be white, backtrack on the x axis until it finds a black pixel above it.
This method, upon further inspection, has some major shortcomings.
When faced with a shape such as this:
The result will look like this:
And even if I were to tell it to start sweeping down after a while, the middle leg would still be overlooked.
4/8 Connected Neighborhood:
http://en.wikipedia.org/wiki/8-connected_neighborhood
This method appears to me to be the most powerful and effective, but at this point I can't figure it out fully, nor can I think of how I would implement it without potentially leaving some areas overlooked.
At every cell I would look at the neighboring black cells, devise some method for ranking which one I should visit first, visit all of them, and repeat the process until all cells are covered.
The problems I can see here are, first of all, dealing with the data structure necessary to accomplish this, and also merely figuring out the logic behind it.
Those are the best solutions I've been able to think of. Thank you for taking the time to read this, I realize it is long, but I thought that I should make it as explicit as possible. Any and all suggestions will be greatly appreciated... Thanks!
Edit:
I also looked into maze generating and solving algorithms, but wasn't sure how to implement that here. My understanding of the maze solving algorithms is that they rely on the passages of the maze to be of equal width. I could of course be wrong about that.
Basic region growing, in pseudocode looks something like:
seed_point   // starting point
visited      // boolean array/matrix, same size as image
point_queue  // empty queue

point_queue.enqueue( seed_point )
visited( seed_point ) = true

while( point_queue is not empty ) {
    this_point = point_queue.dequeue()
    for each neighbour of this_point {
        if not visited( neighbour ) and neighbour is black/red/whatever {
            point_queue.enqueue( neighbour )
            visited( neighbour ) = true
        }
    }
}
// we are done. the "visited" matrix tells
// us which pixels are in the region
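As a concrete Python version of that pseudocode (assuming image is a 2-D NumPy array where black pixels are 0 and seed_point is a (row, col) tuple inside the shape):
from collections import deque
import numpy as np

def grow_region(image, seed_point):
    visited = np.zeros(image.shape, dtype=bool)
    queue = deque([seed_point])
    visited[seed_point] = True
    while queue:
        r, c = queue.popleft()
        # 4-connected neighbours; add the diagonals for 8-connectivity
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < image.shape[0] and 0 <= nc < image.shape[1]
                    and not visited[nr, nc] and image[nr, nc] == 0):
                queue.append((nr, nc))
                visited[nr, nc] = True
    return visited  # True for every pixel in the grown region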
I don't understand where the ranking that you've mentioned comes into it though. Am I missing something?
I'm confused by the very long question.
Are you sure you aren't just trying to do a flood fill?
Here's a really nice little screencast on writing a recursive maze solver: http://thinkcode.tv/catalog/amazing-python/
I think it might give you some ideas for the problem you are trying to solve.
Also, here's a little recursive maze solving script that I wrote after watching the screencast http://pastie.org/1854582. Equal width passages are not necessary, the only things that are necessary are open space, walls, and some kind of an ending condition, in this case, finding the end of the maze.
If you don't want to go recursive, the other thing you can do is use a "backtracking" method. You can see a little example of it being used in the random generation of mazes on this page:
http://weblog.jamisbuck.org/2011/2/7/maze-generation-algorithm-recap (First example on the page).
Is this sounding relevant? If it is, let me know if you want me to explain anything in more detail.
Edit:
This seems like a really good discussion on doing flood fills in python http://www.daniweb.com/software-development/python/threads/148874
A simple technique used for some maze-solving problems, keeping one hand on the wall, might also help here.
Note however that if you choose a random starting point, you might choose a point from which, whichever way you travel, you block off a portion, i.e. if you were to start in the middle of an hourglass shape, you would only be able to fill in one half.
