get_y() value of the image bottom in FPDF in Python

I'm using Python FPDF to generate a PDF. The PDF generally contains text followed by images followed by text, and so on. But the cells that contain text are overlapping with the images above them. I tried to calculate the image height and pass that to set_y for the next cell to avoid overlapping, still with no luck. So I tried using get_y(), but it returns the y value of the previous text cell instead of the bottom of the image.
So how can I get the y-coordinate of the bottom of the image?

I was facing the same issue and resolved it using the Pillow library.
Pillow gives you the height and width of the image in pixels; in my case, dividing the pixel height by 10 converted it to roughly the units the document uses.
After this I was able to get the y-coordinate exactly below the inserted image, and could place another image directly beneath the previous one:
from PIL import Image

y = pdf.get_y()
img_height = Image.open(imagePath).height / 10  # rough pixel-to-document-unit conversion
y = y + img_height + 10
pdf.set_y(y)
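Note that the divide-by-10 trick only works if the image happens to be rendered at that fixed scale. A more robust sketch computes the height the image will occupy from its aspect ratio and the width you pass to pdf.image() (rendered_height is a hypothetical helper of mine, not part of FPDF):

```python
def rendered_height(px_width, px_height, target_width):
    """Height an image occupies when drawn target_width units wide,
    preserving its aspect ratio (pure arithmetic, no FPDF needed)."""
    return target_width * px_height / px_width

# e.g. a 1200x800 px image drawn 90 mm wide occupies 60 mm of height
print(rendered_height(1200, 800, 90))  # 60.0
```

After something like pdf.image(imagePath, w=90), calling pdf.set_y(pdf.get_y() + rendered_height(...) + margin) should land just below the image regardless of the file's pixel resolution.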

Related

Programming a picture maker template in Python possible?

I'm looking for a library that enables to "create pictures" (or even videos) with the following functions:
Accepting picture inputs
Resizing said inputs to fit given template / scheme
Positioning the pictures in pre-set up layers or coordinates
A rather schematic approach to look at this (sketch image omitted): the red spots represent e.g. text, picture (or, if possible, video) elements.
The end goal is to give the .py script multiple input pictures and have it create a finished composition like the one described above.
I tried looking into Python PIL, but I wasn't able to find what I was looking for.
Yes, it is possible to do this with Python.
The library you are looking for is OpenCV (https://opencv.org/).
Some basic OpenCV Python tutorials: https://docs.opencv.org/master/d9/df8/tutorial_root.html
1) You can use the imread() function to read images from files.
2) You can use the resize() function to resize the images.
3) You can create an empty master numpy array matching the size and depth (color depth) of the black rectangle in the figure you have shown, then resize your image and copy its contents into the empty array starting at the position you want.
Below is sample code that does something close to what you might need; you can modify it to suit your actual needs. (Since your requirements are not entirely clear, the code is written so that it can at least guide you.)
import numpy as np
import cv2
import matplotlib.pyplot as plt

# You can store most of these values in another file and load them.
# Dimensions of the background image.
BG_IMAGE_WIDTH = 100
BG_IMAGE_HEIGHT = 100
BG_IMAGE_COLOR_DEPTH = 3

# This acts as the black bounding box shown in the figure.
# You can also load another image instead of creating an empty background.
empty_background_image = np.zeros(
    (BG_IMAGE_HEIGHT, BG_IMAGE_WIDTH, BG_IMAGE_COLOR_DEPTH),
    dtype=np.uint8,  # np.int is removed in recent NumPy; uint8 is the usual image dtype
)

# Load an image; it will be copied into one of the red boxes shown.
IMAGE_PATH = "./image1.jpg"
foreground_image = cv2.imread(IMAGE_PATH)

# Resize target and top-left position with respect to the background image.
X_POS = 4
Y_POS = 10
RESIZE_TARGET_WIDTH = 30
RESIZE_TARGET_HEIGHT = 30

# Resize.
foreground_image = cv2.resize(
    src=foreground_image,
    dsize=(RESIZE_TARGET_WIDTH, RESIZE_TARGET_HEIGHT),
)

# Copy into the background image.
empty_background_image[
    Y_POS:Y_POS + RESIZE_TARGET_HEIGHT,
    X_POS:X_POS + RESIZE_TARGET_WIDTH,
] = foreground_image

# OpenCV loads images as BGR; convert for matplotlib's RGB display.
plt.imshow(cv2.cvtColor(empty_background_image, cv2.COLOR_BGR2RGB))
plt.show()

Scale and position image in Scribus

I am customizing the built-in CalendarWizard Python script in Scribus to add birthdays with pictures. I have the profile pictures in a folder for each person and I would like to save scale and position information for these pictures, so they are automatically applied when the calendar is generated.
The image box is created like:
kep = createImage(self.marginl + colCnt * self.colSize,
                  self.calHeight + szovegsor + rowCnt * self.rowSize,
                  self.colSize, kepsor)
Then I fill the box with the photo:
szkep = 'C:\\profilepics\\' + sznapos + '.jpg'
kkep = loadImage(szkep,kep)
The sznapos variable contains the name of the current birthday person. So far we don't have multiple birthdays.
The next action would be to scale the loaded image. I can fit to the box:
setScaleImageToFrame(1, 1, kep)
This works.
But what I would like is to scale the image by a given value, and I am not sure how.
I first tried with static values, in two ways:
setImageScale(0.1, 0.1, kep)
scaleImage(0.1, 0.1, kep)
I expected these to scale the image to 10%, but it remains at 100%. No error is raised; just nothing happens. Can somebody please tell me what I am doing wrong?
Edit:
I tried to shift the image to filter out other possible issues and this works as expected:
setImageOffset(10,10,kep)
The image is shifted by 10 points in both directions. Only scaling doesn't work.
Finally, I found a workaround here:
http://forums.scribus.net/index.php?topic=94.0
In my case:
setProperty(kep, 'imageXScale', xscale)

Why can't tesseract-ocr detect text which is in a box?

Consider this experiment:
I have two images, one with free text and another with the text inside a box (surrounded by a border).
If I run tesseract-ocr on these two images, the free-text image outputs 'Text' while the boxed image outputs nothing ('').
Why is that?
As a fix, I can crop the borders with some image processing, but I am trying to understand what is causing this problem.
[free-text image]
[boxed image]
So far:
I cropped the borders of the image using the logic below (you should feed it an image cropped to the outer border contour), and I was then able to detect the text. However, I still don't understand why tesseract isn't detecting the boxed text. Feel free to experiment on the attached images.
# The code below modifies (x, y) and (h, w) so that the new values
# describe a slightly smaller box enclosed by the original box.
y = y + int(0.025*h)
x = x + int(0.025*w)
h = h - int(0.05*h)
w = w - int(0.05*w)
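The crop arithmetic above can be wrapped in a small helper (shrink_box and the 2.5% margin are my own naming and choice, mirroring the snippet):

```python
def shrink_box(x, y, w, h, margin=0.025):
    """Return a box shrunk inward by `margin` of its size on each side,
    mirroring the border-cropping arithmetic above."""
    return (x + int(margin * w),
            y + int(margin * h),
            w - int(2 * margin * w),
            h - int(2 * margin * h))

# A 100x100 box at the origin becomes a 95x95 box starting at (2, 2);
# cropping the image to it removes the drawn border before OCR.
print(shrink_box(0, 0, 100, 100))  # (2, 2, 95, 95)
```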

opencv zoom function strange results

I am trying to write a zoom function which looks something like this:
centre = ((im.width-1)/2, (im.height-1)/2)
width = int(im.width/(2.0*level))
height = int(im.height/(2.0*level))
rect = (centre[0]-width, centre[1]-height, width*2, height*2)
dst = cv.GetSubRect(im, rect)
cv.Resize(dst, im)
When I use exactly what is written above, I get an odd result where the bottom half of the resulting image is distorted and blurry. However, when I replace the line cv.Resize(dst, im) with
size = cv.CloneImage(im)
cv.Resize(dst, size)
im = size
it works fine. Why is this? Is there something fundamentally wrong with the way I am performing the zoom?
cv.Resize requires the source and destination to be separate memory locations.
In the first snippet of your code, cv.GetSubRect generates an object pointing to the area of the image you wish to zoom into. The new object is NOT pointing to a new memory location; it points to a memory location that is a subset of the original object.
Since cv.Resize requires the two memory locations to be different, what you are getting is the result of undefined behavior.
In the second part of your code you fulfil this criterion by using cv.CloneImage: you first create a copy of im (i.e. size; a blank image of the same dimensions would have worked as well) and then use cv.Resize to resize dst, writing the resulting image into size.
My advice is to go through the function documentation before using these functions.
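With the modern cv2 API this aliasing pitfall disappears, because cv2.resize allocates a fresh output array. A pure-NumPy sketch of the same zoom idea (zoom is my own helper; nearest-neighbour upscaling via np.repeat stands in for real interpolation, and an integer level that divides the image size is assumed):

```python
import numpy as np

def zoom(im, level):
    """Crop the central 1/level portion of `im` and scale it back up by
    nearest-neighbour repetition. In real code, cv2.resize would replace
    the np.repeat calls for proper interpolation."""
    h, w = im.shape[:2]
    ch, cw = h // level, w // level           # size of the central crop
    y0, x0 = (h - ch) // 2, (w - cw) // 2     # top-left of the crop
    crop = im[y0:y0 + ch, x0:x0 + cw]         # a view, like cv.GetSubRect
    # np.repeat returns a NEW array, so source and destination never alias.
    return np.repeat(np.repeat(crop, level, axis=0), level, axis=1)

im = np.arange(16).reshape(4, 4)
print(zoom(im, 2).shape)  # (4, 4) -- same size as the input
```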

Use Python / PIL or similar to shrink whitespace

Any ideas how to use Python with the PIL module to shrink the whitespace around an image's content? I know this can be achieved with GIMP, but I'm trying to package my app as small as possible and a GIMP install is not an option for the end user.
Say you have two images, one 400x500 and the other 200x100. Both are white with a 100x100 text block somewhere within the image's boundaries. What I'm trying to do is automatically strip the whitespace around that text and load the 100x100 text block into a variable for further text extraction.
It's obviously not this simple, since just running the text extraction on the whole image won't work; I just wanted to ask about the basic process. There is not much available on Google about this topic. If solved, perhaps it could help someone else as well.
Thanks for reading!
If you put the image into a numpy array, it's simple to find the edges, which you can then use PIL to crop. Here I'm assuming that the whitespace is the color (255, 255, 255); you can adjust to your needs:
from PIL import Image
import numpy as np

im = Image.open("test.png")
pix = np.asarray(im)
pix = pix[:, :, 0:3]            # drop the alpha channel
idx = np.where(pix - 255)[0:2]  # ignore the color axis when finding edges
box = list(map(min, idx))[::-1] + list(map(max, idx))[::-1]  # list() is needed on Python 3
region = im.crop(box)
region_pix = np.asarray(region)
To show what the results look like, I've left the axis labels on so you can see the size of the box region:
from pylab import *
subplot(121)
imshow(pix)
subplot(122)
imshow(region_pix)
show()
The general algorithm would be to find the color of the top-left pixel, then do a spiral scan inwards until you find a pixel not of that color. That defines one edge of your bounding box. Keep scanning until you have found all four edges.
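A row/column scan is a simpler variant of the same idea (content_bbox is a hypothetical helper of mine; bg is the assumed background pixel value):

```python
def content_bbox(pixels, bg):
    """Return (left, top, right, bottom) of the region that differs from
    the background value `bg`. `pixels` is a 2-D list of pixel values.
    Returns None if the whole image is background."""
    rows = [r for r, row in enumerate(pixels) if any(p != bg for p in row)]
    cols = [c for c in range(len(pixels[0]))
            if any(row[c] != bg for row in pixels)]
    if not rows:
        return None
    # Right/bottom are exclusive, matching PIL's crop() convention.
    return (min(cols), min(rows), max(cols) + 1, max(rows) + 1)

img = [[255, 255, 255, 255],
       [255, 0,   0,   255],
       [255, 255, 0,   255],
       [255, 255, 255, 255]]
print(content_bbox(img, 255))  # (1, 1, 3, 3)
```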
http://blog.damiles.com/2008/11/basic-ocr-in-opencv/
might be of some help. You can use the simple bounding-box method described in that tutorial, or @Tyler Eaves' spiral suggestion, which works equally well.
