Rectangle coordinates of ABBYY Cloud OCR (XML output) - Python

So I'm trying to extract data from invoices, and for that I'm using ABBYY Cloud OCR. I got the output as an XML file. Now what I want to do is look for a piece of text, take its rectangle coordinates, then look for the closest rectangle and take its value.
To do that I need the rectangle coordinates. The XML file does return coordinates, but I can't understand them.
I'll show you an example of the XML output (I'll replace unneeded text with '....'):
<line baseline="2062" l="2037" t="2033" r="2206" b="2064">....</line>
<line baseline="2101" l="295" t="2070" r="588" b="2097">....</line>
These are two different rectangles. Anyway, I went to the documentation, and this is what it says:
baseline — the distance from the base line to the top edge of the page
l — the coordinate of the left border of the surrounding rectangle
t — the coordinate of the top border of the surrounding rectangle
r — the coordinate of the right border of the surrounding rectangle
b — the coordinate of the bottom border of the surrounding rectangle
What does "the coordinate of the left border of the surrounding rectangle" mean?
Aren't rectangle coordinates in the format [[x1,y1],[x2,y2],[x3,y3],[x4,y4]]?
Can you explain what they mean by these coordinates, or how I can use them?
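As I read the documentation quoted above, l/t/r/b describe an axis-aligned box: (l, t) is the top-left corner and (r, b) the bottom-right, measured in pixels from the top-left corner of the page, so the four corners in your format would be [[l,t],[r,t],[r,b],[l,b]]. A minimal sketch of how the boxes could be parsed and the closest one found, assuming namespaced ABBYY XML and a made-up 'Total' label to search for:

import xml.etree.ElementTree as ET

# 'result.xml' stands in for the XML file returned by ABBYY Cloud OCR
tree = ET.parse('result.xml')

boxes = []
for elem in tree.iter():
    # line tags are namespaced in the ABBYY schema, so match on the suffix
    if elem.tag.endswith('line'):
        l, t = int(elem.get('l')), int(elem.get('t'))
        r, b = int(elem.get('r')), int(elem.get('b'))
        text = ''.join(elem.itertext()).strip()
        boxes.append((text, (l, t, r, b)))

def center(box):
    l, t, r, b = box
    return ((l + r) / 2, (t + b) / 2)

def distance(box_a, box_b):
    (xa, ya), (xb, yb) = center(box_a), center(box_b)
    return ((xa - xb) ** 2 + (ya - yb) ** 2) ** 0.5

# 'Total' is a hypothetical label; replace it with the text you search for
target = next(item for item in boxes if 'Total' in item[0])
nearest = min((item for item in boxes if item is not target),
              key=lambda item: distance(item[1], target[1]))
print(nearest[0])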

Related

Get real GPS coordinates out of known edge values in Python

I'm trying to find a way to convert pixels into real coordinates. I have an image with known (GPS) values at its corners:
Top left = 43.51281, -70.46223
Top right = 43.51279, -70.46213
Bottom left = 43.51272, -70.46226
Bottom right = 43.51270, -70.46215
Image with known edge values
I have another script that prints the pixel coordinates of a click on an image. Is there any way to declare the value of each corner, so that it prints the real coordinates of where I clicked?
For example: the image shape is [460, 573], and when I click somewhere on it, the pixel coordinates of that click are shown; I want real coordinates instead.
Example
An option is to use OpenCV's getPerspectiveTransform() function; see this for an intuitive explanation of how the function maps real-world coordinates to coordinates in another image (which in your case means mapping the GPS values to the pixel values within the image):
https://towardsdatascience.com/how-to-track-football-players-using-yolo-sort-and-opencv-6c58f71120b8
And these for examples of the function in use:
Python Open CV perspectiveTransform()
https://www.geeksforgeeks.org/perspective-transformation-python-opencv/
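A minimal sketch of that approach, assuming the image is 460 px wide and 573 px tall (the shape quoted in the question may well be (height, width), so swap if needed); the clicked point is made up:

import cv2
import numpy as np

w, h = 460, 573  # assumed width/height of the image

# Pixel corners and the matching GPS corners, in the same order
pixel_corners = np.float32([[0, 0], [w, 0], [0, h], [w, h]])
gps_corners = np.float32([
    [43.51281, -70.46223],  # top left
    [43.51279, -70.46213],  # top right
    [43.51272, -70.46226],  # bottom left
    [43.51270, -70.46215],  # bottom right
])

M = cv2.getPerspectiveTransform(pixel_corners, gps_corners)

# Map a clicked pixel (an arbitrary example point here) to GPS;
# perspectiveTransform expects an array of shape (N, 1, 2)
click = np.float32([[[230, 286]]])
lat, lon = cv2.perspectiveTransform(click, M)[0][0]
print(lat, lon)

Note that float32 only just resolves latitude/longitude differences this small; for more precision, subtract a reference origin from the GPS values before building the transform and add it back afterwards.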

Merging a perspective-corrected image with a transparent-background template image using Pillow [PIL, Python]

Problem: I have multiple book cover images. I made a "book"-like template with a 3D perspective. All I have to do now is take each book cover image, correct its perspective (which is always the same, because the template never changes), and merge the perspective-corrected image with the template (background/canvas).
For easier understanding - here is an example created in Adobe Photoshop:
With the red arrows I tried to show the vertex points of the original cover image (before perspective correction). As you can see, the two vertex points on the right have to stay; the other two points, on the left, always have to be corrected the same way.
Can you please show me how to achieve that?
UPDATE
What I have:
1) The cover itself
2) A template with a transparent background
I need to transform the perspective of the cover and merge it with the template image.
You don't really need to write any Python; you can just do it in the Terminal with ImageMagick, using a "Perspective Transform" like this:
magick cover.png -virtual-pixel none -distort perspective "0,0 96,89 %w,0 325,63 %w,%h 326,522 0,%h 96,491" template.png +swap -flatten result.png
Looking at the parameters of the perspective transform, you can hopefully see there are 4 pairs of coordinates, one pair for each corner, showing how each source location gets mapped in the output image.
So, the top-left corner of the cover (0,0) gets mapped to the top-left of the empty area in the template (96,89). The top right of the cover (width,0) gets mapped to the top-right of the empty area of the template (325,63). The bottom-right of the cover (width,height) gets mapped to the bottom-right of the empty area on the template (326,522). The bottom-left of the cover (0,height) gets mapped to the bottom-left corner of the empty area of the template (96,491).
If you are using the old v6 ImageMagick, replace magick with convert.
Note that, if you really want to do it in Python, there is a Python binding for ImageMagick called wand. I am not very experienced with wand, but this seems to be equivalent:
#!/usr/bin/env python3
from itertools import chain
from wand.image import Image

with Image(filename='cover.png') as cover, Image(filename='template.png') as template:
    w, h = cover.size
    cover.virtual_pixel = 'transparent'
    source_points = (
        (0, 0),
        (w, 0),
        (w, h),
        (0, h)
    )
    destination_points = (
        (96, 89),
        (325, 63),
        (326, 522),
        (96, 491)
    )
    # Interleave source and destination corners into the flat argument
    # list that distort() expects: x1,y1, X1,Y1, x2,y2, X2,Y2, ...
    order = chain.from_iterable(zip(source_points, destination_points))
    arguments = list(chain.from_iterable(order))
    cover.distort('perspective', arguments)
    # Overlay the distorted cover onto the template and save
    template.composite(cover, left=0, top=0)
    template.save(filename='result.png')
Keywords: Python, ImageMagick, wand, image processing, perspective transform, distort.

How to get a box around a contour using skimage.segmentation.felzenszwalb?

I'm trying to get a box around a segmented object at the edge of the image; that is, there is no contour around the segmentation because the object is only partially inside the image region.
I use skimage.segmentation (felzenszwalb), find_boundaries, clear_border, and regionprops. However, regionprops does not cover those edge cases:
import cv2
from skimage.measure import label, regionprops
from skimage.segmentation import clear_border, felzenszwalb

segments_fz = felzenszwalb(cv2.cvtColor(image, cv2.COLOR_BGR2RGB), scale=300, sigma=0.5, min_size=50)
cleared = clear_border(segments_fz)
label_image = label(cleared)
regionprops(label_image)
What I want: a box around the segmented object near the edge of the image region.
You shouldn't use clear_border; without it, the objects on the border will be treated like any others. The bbox property of regionprops should give you a bounding box for your object of interest, while find_boundaries and mark_boundaries will let you get or visualise the boundaries between segments.
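A minimal sketch of that suggestion, with a placeholder image path; felzenszwalb numbers segments from 0 and label() treats 0 as background by default, so the segment IDs are shifted by one first:

import cv2
from skimage.measure import label, regionprops
from skimage.segmentation import felzenszwalb

image = cv2.cvtColor(cv2.imread('image.png'), cv2.COLOR_BGR2RGB)  # placeholder path

# No clear_border here, so segments touching the image edge keep their labels
segments_fz = felzenszwalb(image, scale=300, sigma=0.5, min_size=50)
label_image = label(segments_fz + 1)  # +1 so segment 0 isn't dropped as background

for region in regionprops(label_image):
    min_row, min_col, max_row, max_col = region.bbox  # axis-aligned bounding box
    print(region.label, region.bbox)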

Pixel image to tile map

I am working on a game in pygame/Python, and I am wondering if anyone has the know-how to show me how to turn an image into a map.
The idea is simple: the image is colored by tile type. When the program loads the image, I want one color (for example #ff13ae) to be matched to a certain grass tile, and another color (for example #ff13bd) to a different tile. Now, I know that I may very well have to convert from hex codes to RGB, but that is trivial. I just want to know how I would go about this, mainly because none of my other games do anything of this sort.
Use pygame.PixelArray:
The PixelArray wraps a Surface and provides direct access to the surface's pixels.
[...]
pxarray = pygame.PixelArray(surface)
# Check if the first pixel at the top-left corner is blue
if pxarray[0, 0] == surface.map_rgb((0, 0, 255)):
...
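A sketch of how that could be used to build a tile map; the colors, tile names, and filename below are all made up:

import pygame

# Hypothetical color -> tile-type mapping; use your own palette
COLOR_TO_TILE = {
    (0xFF, 0x13, 0xAE): 'grass_a',
    (0xFF, 0x13, 0xBD): 'grass_b',
}

pygame.init()
surface = pygame.image.load('level.png')  # placeholder filename

pxarray = pygame.PixelArray(surface)
tile_map = []
for y in range(surface.get_height()):
    row = []
    for x in range(surface.get_width()):
        # PixelArray stores mapped integers; unmap_rgb turns them back into a Color
        c = surface.unmap_rgb(pxarray[x, y])
        row.append(COLOR_TO_TILE.get((c.r, c.g, c.b), None))
    tile_map.append(row)
del pxarray  # release the surface lock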

PIL - Identifying an object with a virtual box

I have an image (sorry, I cannot link it for copyright reasons) that has a character outlined with a black line. The black line outlining the character is the darkest thing in the picture (I planned on using this fact to help find it). What I need to do is obtain four coordinates that draw a virtual box around the character. The box should be as small as possible while still keeping the outlined character inside it. I intend to use the box to pinpoint the central point of the character's figure, via the center point of the box.
I started by trying to identify parts of the outline. Since it's the darkest line in the image, I used getextrema() to obtain at least one point on the outline, but I can't figure out how to get more points and then combine them into a box.
Any insight into this problem is greatly appreciated. Cheers!
EDIT
This is what I have now:
from PIL import Image

im = Image.open("pic.jpg")
im = im.convert("L")
lo, hi = im.getextrema()
im = im.point(lambda p: p == lo)
rect = im.getbbox()
x = 0.5 * (rect[0] + rect[2])
y = 0.5 * (rect[1] + rect[3])
It seems to be pretty consistent at getting inside the figure, but it's really not that close to the center. Any idea why?
1) Find an appropriate threshold that separates the outline from the rest of the image, perhaps using the extrema you already have. If the contrast is big enough, this shouldn't be too hard; just add some value to the minimum.
2) Threshold the image with the value you found (see this question). You want the dark part to become white in the binary thresholded image, so use a smaller-than threshold (lambda p: p < T).
3) Use thresholdedImage.getbbox() to get the bounding box of the outline, as in the sketch below.
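Putting those steps together, a sketch with a guessed margin of 30 above the minimum (tune it for your image):

from PIL import Image

im = Image.open('pic.jpg').convert('L')
lo, hi = im.getextrema()

T = lo + 30  # assumed margin; pick whatever separates the outline from the rest
mask = im.point(lambda p: 255 if p < T else 0)

rect = mask.getbbox()  # (left, upper, right, lower) around all nonzero pixels
cx = 0.5 * (rect[0] + rect[2])
cy = 0.5 * (rect[1] + rect[3])
print(rect, (cx, cy))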
