Needing a little help/direction with a project. Our task is to take in a .ppm file (the one we are required to test with is found here: http://beastie.cs.ua.edu/cs250/projects/asciiart/tux.ppm) and reprint it out on the screen using ascii characters. We are required to convert the pixels to greyscale. This is really where I am stuck. Cannot figure out how to read in every three elements (because every three is a pixel in PPM files), convert them to greyscale and move on. Again PIL is not allowed. Any help or links on what to read up on would be awesome!
The PPM isn't hard to parse.
The header:
P3
50 50
255
P3 means that the image is an ASCII pixmap (color).
50 50 is the width and height.
255 is the max color value.
The body:
254 254 252 254 254 252 254 254 252 254 254 252 254 254 252 254 254 252
254 254 252 254 254 252 254 254 252 254 254 252 254 254 252 254 254 252
254 254 252 254 254 252 254 254 252 253 255 250 239 244 237 251 255 248
234 236 231 255 255 251 252 251 249 255 254 251 253 248 242 255 255 244
...
Just remove all newlines:
body.replace('\n', ' ')
And parse it in triplets (not too elegant):
raw = body.split(' ')
for i in range(0, len(raw), 3):
red = raw[i]
green = raw[i + 1]
blue = raw[i + 2]
Reading the ppm file you can do this:
# Open the PPM file and process the 3 first lines
f = open("tux.ppm")
color = f.readline().splitlines()
size_x, size_y = f.readline().split()
max = f.readline().splitlines()
You really don't need to know about the 3 first lines of the file. You only must know that you are working with a RGB image, which means you have 3 values (0-255) for each pixel.
To convert the image to grayscale you have two options: you can either generate another PPM file (with 3 values per pixel) or you can generate a PGM file which has the same format as the PPM but the first line is P2 instead of P3 and you will have only one value per pixel (that's the cool way).
To convert a RGB color value (r,g,b) in one grayscale intensity value you can apply this formula (better than simply apply the average):
0.21*r + 0.71*g + 0.07*b
Generate the grayscale image with one value per pixel (if you want it in 3-values you only have to repeat 3 times it for r,g,b):
# Getting the image data (if you have read the 3 first lines...)
data = f.read().split()
# Generate a new array of pixels with grayscale values (1 per pixel)
gray_data = [0.21*data[i] + 0.71*data[i+1] + 0.07*data[i+2] for i in range(0,len(data),3)]
Related
So in Python NumPy, I have a list of list from 0 to 99 divided by 5:
array_b = np.arange(0,100).reshape(5, 20)
list_a = array_b.tolist()
I want to add the numbers in the list by column so that the result will be:
[200 205 210 215 220 225 230 235 240 245 250 255 260 265 270 275 280 285 290 295]
I know how to do it in the array version, but I want to do the same thing in the list version (without using np.sum(array_b, axis=0)).
Any help?
Without numpy this can be done with zip and map quite elegantly:
list(map(sum, zip(*list_a)))
Explanation:
zip(*list_a) aggregates the lists element-wise
map(sum, ...) tells to apply the sum on each of these aggregations
finally, list(..) simply unpacks the iterator returned by map into a list.
Easy as (num)py...
Use .sum(axis=0) on a numpy array
import numpy as np
result = np.array(values).sum(axis=0)
# [200 205 210 215 220 225 230 235 240 245 250 255 260 265 270 275 280 285 290 295]
With the other axis possibilities
result = np.array(values).sum(axis=1) # [ 190 590 990 1390 1790]
result = np.array(values).sum() # 4950
import numpy as np
a = [[...]]
sum_array = np.sum(a, axis=0)
Just tried using blobs to detect my image, using the example here:
https://www.learnopencv.com/blob-detection-using-opencv-python-c/,
but it just does not detect anything.
https://imgur.com/a/YE1YZpV
I tried using the original Image, grey image, and thresholding it to only black and white, but none of them detect any blobs, and the keypoints always remain 0.
import numpy as np
import cv2
im_width = 320
im_height = 240
img = cv2.imread("D:\\20190822\\racket.bmp")
GreyImage=cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
ret,thresh=cv2.threshold(GreyImage,50,255,cv2.THRESH_BINARY)
detector = cv2.SimpleBlobDetector_create()
keypoints = detector.detect(thresh)
blobs = cv2.drawKeypoints(thresh, keypoints, np.array([]), (0,0,255), cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
print(len(keypoints))
cv2.imshow("Keypoints", blobs)
cv2.waitKey(0)
cv2.destroyAllWindows()
I am not an expert on this, but it appears from the documentation that the default is to find circular blobs. You do not have circles except for some small dots. So you have to relax all the arguments to catch every shape. See
https://docs.opencv.org/3.4/d0/d7a/classcv_1_1SimpleBlobDetector.html
https://docs.opencv.org/3.4/d2/d29/classcv_1_1KeyPoint.html#a308006c9f963547a8cff61548ddd2ef2
https://craftofcoding.wordpress.com/tag/cv2/
So try this:
Input:
import numpy as np
import cv2
import math
img = cv2.imread("racket.png")
GreyImage=cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
ret,thresh=cv2.threshold(GreyImage,50,255,cv2.THRESH_BINARY)
# erode to one large blob
#thresh = cv2.erode(thresh, None, iterations=4)
cv2.imshow("Threshold", thresh)
cv2.waitKey(0)
cv2.destroyAllWindows()
# Set up the SimpleBlobdetector with default parameters.
params = cv2.SimpleBlobDetector_Params()
# Change thresholds
params.minThreshold = 0
params.maxThreshold = 256
# Filter by Area.
params.filterByArea = True
params.minArea = 0
params.maxArea = 100000000000
# Filter by Color (black)
params.filterByColor = True
params.blobColor = 0
# Filter by Circularity
params.filterByCircularity = True
params.minCircularity = 0
params.maxCircularity = 100000000
# Filter by Convexity
params.filterByConvexity = True
params.minConvexity = 0
params.maxConvexity = 100000000
# Filter by InertiaRatio
params.filterByInertia = True
params.minInertiaRatio = 0
params.maxInertiaRatio = 100000000
# Distance Between Blobs
params.minDistBetweenBlobs = 0
# Do detecting
detector = cv2.SimpleBlobDetector_create(params)
# Get keypoints
keypoints = detector.detect(thresh)
print(len(keypoints))
print('')
# Get keypoint locations and radius
for keypoint in keypoints:
x = int(keypoint.pt[0])
y = int(keypoint.pt[1])
s = keypoint.size
r = int(math.floor(s/2))
print (x,y,r)
#cv2.circle(img, (x, y), r, (0, 0, 255), 2)
# Draw blobs
blobs = cv2.drawKeypoints(thresh, keypoints, np.array([]), (0,0,255), cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
cv2.imshow("Keypoints", blobs)
cv2.waitKey(0)
cv2.destroyAllWindows()
# Save result
cv2.imwrite("racket_blobs.jpg", blobs)
42
136 226 35
138 225 1
136 225 1
134 225 1
140 223 1
122 223 1
137 222 1
144 221 1
114 221 1
83 232 9
144 219 1
150 217 1
114 215 1
164 209 1
158 209 1
163 206 1
118 203 1
128 195 1
175 194 1
134 189 1
185 184 1
154 175 1
197 174 1
159 174 1
157 172 1
196 171 1
162 171 1
200 169 1
198 167 1
204 165 1
200 165 1
200 163 1
211 162 1
179 160 1
208 159 1
210 157 1
204 157 1
135 227 1
203 156 1
214 155 1
204 155 1
200 155 1
If you want to see the shapes, then you might be better off using contours rather than blobs.
I am trying to get the coordinates or positions of text character from an Image using Tesseract.
I want to know the exact pixel position, so that i can click that text using some other tool.
Edit :
import pytesseract
from pytesseract import pytesseract
import PIL
from PIL import Image
import cv2
import csv
img = 'E:\\OCR-DATA\\sample.jpg'
imge = Image.open(img)
data=pytesseract.image_to_string(imge,lang='eng',boxes=True,config='hocr')
print(data)
data contains recognized text with box boundary value. But i am not sure , how to use that boundary value to get the co-ordinates of the text.
Value of the data variable is as follows:
O 100 356 115 373 0
u 117 356 127 368 0
t 130 356 138 372 0
p 141 351 152 368 0
u 154 356 164 368 0
t 167 356 175 371 0
you can try This:
img = 'tes.jpg'
imge = Image.open(img)
data=pytesseract.image_to_boxes(imge)
print(data)
This will directly give you the result Like:
T 22 58 52 97 0
H 62 58 95 96 0
R 102 58 135 97 0
E 146 57 174 97 0
A 184 57 216 96 0
D 225 56 258 96 0
You have the coordinates of the bounding box in every line.
From: Training Tesseract – Make Box Files
character, left, bottom, right, top, page
So for each character you get the character, followed by its bounding box characters, followed by the 0-based page number.
Is there a way to save a custom maplotlib colourmap (matplotlib.cm) as a file (e.g Color Palette Table file (.cpt), like used in MATLAB) to be shared and then use later in other programs? (e.g. Panopoly, MATLAB...)
Example
Below a new LinearSegmentedColormap is made by modifying an existing colormap (by truncation, as shown in another question linked here).
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
# Get an existing colorbar
cb = 'CMRmap'
cmap = plt.get_cmap( cb )
# Variables to modify (truncate) the colormap with
minval = 0.15
maxval = 0.95
npoints = 100
# Now modify (truncate) the colorbar
cmap = matplotlib.colors.LinearSegmentedColormap.from_list(
'trunc({n},{a:.2f},{b:.2f})'.format(n=cmap.name, a=minval,
b=maxval), cmap(np.linspace(minval, maxval, npoints)))
# Now the data can be extracted as a dictionary
cdict = cmap._segmentdata
# e.g. variables ('blue', 'alpha', 'green', 'red')
print( cdict.keys() )
# Now, is it possible to save to this as a .cpt?
More detail
I am aware of ways of loading external colormaps in matplotlib (e.g. shown here and here).
From NASA GISS's Panoply documentation:
Color Palette Table (CPT) indicates a color palette format used by the
Generic Mapping Tools program. The format defines a number of solid
color and/or gradient bands between the colorbar extrema rather than a
finite number of distinct colors.
The following is a function that takes a colormap, some limits (vmin and vmax) and the number of colors as input and creates a cpt file from it.
import matplotlib.pyplot as plt
import numpy as np
def export_cmap_to_cpt(cmap, vmin=0,vmax=1, N=255, filename="test.cpt",**kwargs):
# create string for upper, lower colors
b = np.array(kwargs.get("B", cmap(0.)))
f = np.array(kwargs.get("F", cmap(1.)))
na = np.array(kwargs.get("N", (0,0,0))).astype(float)
ext = (np.c_[b[:3],f[:3],na[:3]].T*255).astype(int)
extstr = "B {:3d} {:3d} {:3d}\nF {:3d} {:3d} {:3d}\nN {:3d} {:3d} {:3d}"
ex = extstr.format(*list(ext.flatten()))
#create colormap
cols = (cmap(np.linspace(0.,1.,N))[:,:3]*255).astype(int)
vals = np.linspace(vmin,vmax,N)
arr = np.c_[vals[:-1],cols[:-1],vals[1:],cols[1:]]
# save to file
fmt = "%e %3d %3d %3d %e %3d %3d %3d"
np.savetxt(filename, arr, fmt=fmt,
header="# COLOR_MODEL = RGB",
footer = ex, comments="")
# test case: create cpt file from RdYlBu colormap
cmap = plt.get_cmap("RdYlBu",255)
# you may create your colormap differently, as in the question
export_cmap_to_cpt(cmap, vmin=0,vmax=1,N=20)
The resulting file looks like
# COLOR_MODEL = RGB
0.000000e+00 165 0 38 5.263158e-02 190 24 38
5.263158e-02 190 24 38 1.052632e-01 215 49 39
1.052632e-01 215 49 39 1.578947e-01 231 83 55
1.578947e-01 231 83 55 2.105263e-01 244 114 69
2.105263e-01 244 114 69 2.631579e-01 249 150 86
2.631579e-01 249 150 86 3.157895e-01 253 181 104
3.157895e-01 253 181 104 3.684211e-01 253 207 128
3.684211e-01 253 207 128 4.210526e-01 254 230 153
4.210526e-01 254 230 153 4.736842e-01 254 246 178
4.736842e-01 254 246 178 5.263158e-01 246 251 206
5.263158e-01 246 251 206 5.789474e-01 230 245 235
5.789474e-01 230 245 235 6.315789e-01 206 234 242
6.315789e-01 206 234 242 6.842105e-01 178 220 235
6.842105e-01 178 220 235 7.368421e-01 151 201 224
7.368421e-01 151 201 224 7.894737e-01 120 176 211
7.894737e-01 120 176 211 8.421053e-01 96 149 196
8.421053e-01 96 149 196 8.947368e-01 70 118 180
8.947368e-01 70 118 180 9.473684e-01 59 86 164
9.473684e-01 59 86 164 1.000000e+00 49 54 149
B 165 0 38
F 49 54 149
N 0 0 0
and would be in the required format.
Searching Google for Histogram Equalization Python or Contrast Stretching Python I am directed to the same links from python documentation in OpenCv which are actually both related to equalization and not stretching (IMO).
http://docs.opencv.org/2.4/doc/tutorials/imgproc/histograms/histogram_equalization/histogram_equalization.html
http://docs.opencv.org/3.2.0/d5/daf/tutorial_py_histogram_equalization.html
Read the documentation, it seems to be a confusion with the wording, as it describes equalization as a stretching operation:
What Histogram Equalization does is to stretch out this range.
AND
So you need to stretch this histogram to either ends (as given in below image, from wikipedia) and that is what Histogram Equalization does (in simple words)
I feel that is wrong because nowhere on Wikipedia it says that histogram equalization means stretching, and reading other sources they clearly distinguish the two operations.
http://homepages.inf.ed.ac.uk/rbf/HIPR2/histeq.htm
http://homepages.inf.ed.ac.uk/rbf/HIPR2/stretch.htm
My questions:
does the OpenCV documentation actually implements Histogram Equalization, while badly explaining it?
Is there any implementation for contrast stretching in Python? (OpenCV, etc?)
OpenCV doesn't have any function for contrast stretching and google yields the same result because histogram equalization does stretch the histogram horizontally but its just the difference of the transformation function. (Both methods increase the contrast of the images.Transformation function transfers the pixel intensity levels from the given range to required range.)
Histogram equalization derives the transformation function(TF) automatically from probability density function (PDF) of the given image where as in contrast stretching you specify your own TF based on the applications' requirement.
One simple TF through which you can do contrast stretching is min-max contrast stretching -
((pixel – min) / (max – min))*255.
You do this for each pixel value. min and max being the minimum and maximum intensities.
You can also use cv2.LUT for contrast stretching by creating a custom table using np.interp. Links to their documentation are this and this respectively. Below an example is shown.
import cv2
import numpy as np
img = cv2.imread('messi.jpg')
original = img.copy()
xp = [0, 64, 128, 192, 255]
fp = [0, 16, 128, 240, 255]
x = np.arange(256)
table = np.interp(x, xp, fp).astype('uint8')
img = cv2.LUT(img, table)
cv2.imshow("original", original)
cv2.imshow("Output", img)
cv2.waitKey(0)
cv2.destroyAllWindows()
The table created
[ 0 0 0 0 1 1 1 1 2 2 2 2 3 3 3 3 4 4
4 4 5 5 5 5 6 6 6 6 7 7 7 7 8 8 8 8
9 9 9 9 10 10 10 10 11 11 11 11 12 12 12 12 13 13
13 13 14 14 14 14 15 15 15 15 16 17 19 21 23 24 26 28
30 31 33 35 37 38 40 42 44 45 47 49 51 52 54 56 58 59
61 63 65 66 68 70 72 73 75 77 79 80 82 84 86 87 89 91
93 94 96 98 100 101 103 105 107 108 110 112 114 115 117 119 121 122
124 126 128 129 131 133 135 136 138 140 142 143 145 147 149 150 152 154
156 157 159 161 163 164 166 168 170 171 173 175 177 178 180 182 184 185
187 189 191 192 194 196 198 199 201 203 205 206 208 210 212 213 215 217
219 220 222 224 226 227 229 231 233 234 236 238 240 240 240 240 240 241
241 241 241 242 242 242 242 243 243 243 243 244 244 244 244 245 245 245
245 245 246 246 246 246 247 247 247 247 248 248 248 248 249 249 249 249
250 250 250 250 250 251 251 251 251 252 252 252 252 253 253 253 253 254
254 254 254 255]
Now cv2.LUT will replace the values of the original image with the values in the table. For example, all the pixels having values 1 will be replaced by 0 and all pixels having values 4 will be replaced by 1.
Original Image
Contrast Stretched Image
The values of xp and fp can be varied to create custom tables as required and it will stretch the contrast even if min and max pixels are 0 and 255 unlike the answer provided by hashcode55.
Python/OpenCV can do contrast stretching via the cv2.normalize() method using min_max normalization. For example:
Input:
#!/bin/python3.7
import cv2
import numpy as np
# read image
img = cv2.imread("zelda3_bm20_cm20.jpg", cv2.IMREAD_COLOR)
# normalize float versions
norm_img1 = cv2.normalize(img, None, alpha=0, beta=1, norm_type=cv2.NORM_MINMAX, dtype=cv2.CV_32F)
norm_img2 = cv2.normalize(img, None, alpha=0, beta=1.2, norm_type=cv2.NORM_MINMAX, dtype=cv2.CV_32F)
# scale to uint8
norm_img1 = (255*norm_img1).astype(np.uint8)
norm_img2 = np.clip(norm_img2, 0, 1)
norm_img2 = (255*norm_img2).astype(np.uint8)
# write normalized output images
cv2.imwrite("zelda1_bm20_cm20_normalize1.jpg",norm_img1)
cv2.imwrite("zelda1_bm20_cm20_normalize2.jpg",norm_img2)
# display input and both output images
cv2.imshow('original',img)
cv2.imshow('normalized1',norm_img1)
cv2.imshow('normalized2',norm_img2)
cv2.waitKey(0)
cv2.destroyAllWindows()
Normalize1:
Normalize2:
You can also do your own stretching by using a simple linear equation with 2 pair of input/ouput values using the form y=A*x+B and solving the two simultaneous equations. See concept for stretching shown in How can I make the gradient appearance of one image equal to the other?
Ok, so I wrote this function that does Standard Deviation Contrast Stretching, on each band of an image.
For normal distributions, 68% of the observations lie within – 1 standard deviation of the mean, 95.4% of all observations lie within – 2 standard deviations, and 99.73% within – 3 standard deviations.
this is basically a min-max stretch but the max is mean+sigma*std and min is mean-sigma*std
def stretch(img,sigma =3,plot_hist=False):
stretched = np.zeros(img.shape)
for i in range(img.shape[2]): #looping through the bands
band = img[:,:,i] # copiying each band into the variable `band`
if np.min(band)<0: # if the min is less that zero, first we add min to all pixels so min becomes 0
band = band + np.abs(np.min(band))
band = band / np.max(band)
band = band * 255 # convertaning values to 0-255 range
if plot_hist:
plt.hist(band.ravel(), bins=256) #calculating histogram
plt.show()
# plt.imshow(band)
# plt.show()
std = np.std(band)
mean = np.mean(band)
max = mean+(sigma*std)
min = mean-(sigma*std)
band = (band-min)/(max-min)
band = band * 255
# this streching cuases the values less than `mean-simga*std` to become negative
# and values greater than `mean+simga*std` to become more than 255
# so we clip the values ls 0 and gt 255
band[band>255]=255
band[band<0]=0
print('band',i,np.min(band),np.mean(band),np.std(band),np.max(band))
if plot_hist:
plt.hist(band.ravel(), bins=256) #calculating histogram
plt.show()
stretched[:,:,i] = band
stretched = stretched.astype('int')
return stretched
in the case above, I didn't need the band ratios to stay the same, but the best practice for an RGB image would be like this:
https://docs.opencv.org/4.x/d5/daf/tutorial_py_histogram_equalization.html
Unfortunately, this CLAHE stretching does not work on multi-band images so should be applied to each band separately - which gives wrong results since the contrast between each band will be lost and the images tend to be gray. what we need to do is:
we need to transform the image into HSV color space and stretch the V (value - intensity) and leave the rest. this is how we get a good stretch(pun intended).
The thing about cv.COLOR_HSV2RGB is that it actually returns BGR instead of RGB so after the HSV2RGB we need to reverse the bands.
here's the function I wrote:
def stack_3_channel(r,g,b , clipLimit = 20 , tileGridSize=(16,16) ):
img = np.stack([r,g,b], axis=2)
img = cv.normalize(img, None, 0, 255, cv.NORM_MINMAX, dtype=cv.CV_8U)
hsv_img = cv.cvtColor(img, cv.COLOR_BGR2HSV)
h, s, v = hsv_img[:,:,0], hsv_img[:,:,1], hsv_img[:,:,2]
clahe = cv.createCLAHE(clipLimit, tileGridSize)
v = clahe.apply(v) #stretched histogram for showing the image with better contrast - its not ok to use it for scientific calculations
hsv_img = np.dstack((h,s,v))
# NOTE: HSV2RGB returns BGR instead of RGB
bgr_stretched = cv.cvtColor(hsv_img, cv.COLOR_HSV2RGB)
#reversing the bands back to RGB
rgb_stretched = np.zeros(bgr_stretched.shape)
rgb_stretched[:,:,0] = bgr_stretched[:,:,2]
rgb_stretched[:,:,1] = bgr_stretched[:,:,1]
rgb_stretched[:,:,2] = bgr_stretched[:,:,0]
# if the valuse are float, plt will have problem showing them
rgb_stretched = rgb_stretched.astype('uint8')
return img , rgb_stretched