I'm a newbie in Python.
I want to extract the RGB values from multiple images and use the RGB values of every image as input for K-Fold Cross Validation.
So far I can only get the RGB values of a single image, so I tried to process multiple images with the following code:
from __future__ import with_statement
from PIL import Image
import glob

#Path to file
for img in glob.glob({Path}+"*.jpg"):
    im = Image.open(img)

    #Load the pixel info
    pix = im.load()

    #Get a tuple of the x and y dimensions of the image
    width, height = im.size

    #Open a file to write the pixel data
    with open('output_file.csv', 'w+') as f:
        f.write('R,G,B\n')

        #Read the details of each pixel and write them to the file
        for x in range(width):
            for y in range(height):
                r = pix[x,y][0]
                g = pix[x,x][1]
                b = pix[x,x][2]
                f.write('{0},{1},{2}\n'.format(r,g,b))
I expect to get output like this in the CSV file:
img_name,R,G,B
1.jpg,50,50,50
2.jpg,60,60,70
But the actual CSV file contains 40,000+ rows.
Is it possible to automate extracting the RGB values from multiple images?
Your code currently writes each pixel's values as a separate row in the CSV file, so you are bound to end up with a huge number of rows.
To work on multiple files, you need to rearrange your code a bit and indent the file writing inside your loop. It is also a good idea to use Python's csv library to write the CSV file, in case any of your filenames contain commas: it would then correctly wrap the field in quotes.
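As a quick illustration (not part of the original answer), csv.writer quotes a field only when it needs to, for example when a filename contains the delimiter:

```python
import csv
import io

# Write one row to an in-memory buffer instead of a real file
buf = io.StringIO()
csv.writer(buf).writerow(["a,b.jpg", 50, 50, 50])

# The filename field is wrapped in quotes because it contains a comma
print(buf.getvalue().strip())  # "a,b.jpg",50,50,50
```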
from PIL import Image
import glob
import os
import csv

#Open a file to write the pixel data
with open('output_file.csv', 'w', newline='') as f_output:
    csv_output = csv.writer(f_output)
    csv_output.writerow(["img_name", "R", "G", "B"])

    #Path to file
    for filename in glob.glob("*.jpg"):
        im = Image.open(filename)
        img_name = os.path.basename(filename)

        #Load the pixel info
        pix = im.load()

        #Get a tuple of the x and y dimensions of the image
        width, height = im.size
        print(f'{filename}, Width {width}, Height {height}')    # show progress

        #Read the details of each pixel and write them to the file
        for x in range(width):
            for y in range(height):
                r = pix[x,y][0]
                g = pix[x,y][1]
                b = pix[x,y][2]
                csv_output.writerow([img_name, r, g, b])
Note: there was also a problem with how you were getting your r, g and b values: you had pix[x,x] in two places where it should have been pix[x,y].
As noted by @GiacomoCatenazzi, your loops could also be removed:
from itertools import product
from PIL import Image
import glob
import os
import csv

#Open a file to write the pixel data
with open('output_file.csv', 'w', newline='') as f_output:
    csv_output = csv.writer(f_output)
    csv_output.writerow(["img_name", "R", "G", "B"])

    #Path to file
    for filename in glob.glob("*.jpg"):
        im = Image.open(filename)
        img_name = os.path.basename(filename)

        #Load the pixel info
        pix = im.load()

        #Get a tuple of the x and y dimensions of the image
        width, height = im.size
        print(f'{filename}, Width {width}, Height {height}')    # show progress

        #Write the details of each pixel to the file
        csv_output.writerows([img_name, *pix[x,y]] for x, y in product(range(width), range(height)))
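For large images, a numpy-based variant (a sketch assuming RGB images, not part of the original answer) avoids iterating over pixels in Python at all: np.asarray on a PIL image gives a (height, width, 3) array that can be reshaped into one (R, G, B) row per pixel.

```python
import numpy as np

# Tiny stand-in for an RGB image; with a real file you would use
# np.asarray(Image.open(filename).convert('RGB')) instead
img = np.arange(18, dtype=np.uint8).reshape(2, 3, 3)  # (height, width, 3)

# One (R, G, B) row per pixel; note this goes row by row, whereas the
# product(range(width), range(height)) loop above goes column by column
rows = img.reshape(-1, 3)
print(rows.shape)  # (6, 3)
```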
Related
I would like to perform a threshold analysis on several images, each with different values for the lower threshold limit, and save the results in a CSV file. Unfortunately, my code does not work the way I want it to. I am a Python beginner.
Thanks for your help!
import cv2 as cv
import numpy as np
import os
import csv

x = []
y = []
Source_Path = 'Images/Image'

with open('Thresh.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    for i, filename in enumerate(os.listdir(Source_Path)):
        for j in range(100, 255):
            ret, thresh = cv.threshold(i, j, 255, cv.THRESH_BINARY)
            count = np.sum(thresh == 255)
            x.append(count)
        y.append(x)
    writer.writerow(y)
You'll need to read the image first before you can threshold it. Start with
img = cv.imread("file.jpg")
img = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
ret, thresh = cv.threshold(img, 120, 255, cv.THRESH_BINARY)
Don't worry about writing the results to a CSV file until you have gathered all the statistics you want in a numpy array. Then write that array to disk using
np.savetxt("foo.csv", np_array, delimiter=",")
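A minimal sketch of that last step, assuming the white-pixel counts have already been collected into a 2-D array with one row per image (the counts here are made up):

```python
import numpy as np

# Hypothetical white-pixel counts: 2 images x 3 threshold values
counts = np.array([[120, 95, 60],
                   [200, 150, 80]])

# One call writes the whole table as CSV (the function is np.savetxt)
np.savetxt("thresh_counts.csv", counts, delimiter=",", fmt="%d")

# Reading it back gives the same array
print(np.loadtxt("thresh_counts.csv", delimiter=",", dtype=int))
```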
I am making an ML project to recognize the silhouettes of different users. I have a raw image dataset of 1900 images, and I want to convert it to a CSV dataset whose labels are the users' names. I am currently stuck on the part of converting the images to a numpy array. The code is here:
from PIL import Image
import numpy as np
import sys
import os
import csv

# default format can be changed as needed
def createFileList(myDir, format='.jpg'):
    fileList = []
    print(myDir)
    for root, dirs, files in os.walk(myDir, topdown=False):
        for name in files:
            if name.endswith(format):
                fullName = os.path.join(root, name)
                fileList.append(fullName)
    return fileList

rahul = []

# load the original image
myFileList = createFileList(r'C:\Users\Mr.X\PycharmProjects\Gait_Project\data\rahul')

for file in myFileList:
    print(file)
    img_file = Image.open(file)
    # img_file.show()

    # get original image parameters...
    width, height = img_file.size
    format = img_file.format
    mode = img_file.mode

    # Make image Greyscale
    img_grey = img_file.convert('L')
    img_res = img_grey.resize((480, 272))
    # img_grey.save('result.png')
    # img_grey.show()

    # Save Greyscale values
    value = np.asarray(img_res.getdata(), dtype=np.int).reshape((img_res.size[1], img_res.size[0]))
    value = value.flatten()
    print(value)
    npvalue = np.array(value)
    rahul.append(npvalue)

    #with open("rahul.csv", 'a') as f:
    #    writer = csv.writer(f)
    #    writer.writerow(value)

final = np.array(rahul)
np.save("rahul.npy", final)
My goal is to make a dataset with 1900 images and 4 labels. Currently, while making the numpy array, each pixel of an image goes into a separate column, giving 1900 rows and 200k+ columns; that needs to become 1900 rows and 2 columns. Any suggestion or help is appreciated.
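One common way to get that shape (a sketch with made-up stand-in data, not code from the question) is to keep the labels in their own array alongside the stacked pixel rows, rather than forcing both into CSV columns:

```python
import numpy as np

# Stand-ins for two users' flattened greyscale images (6 pixels each);
# in the real project these would be the arrays appended in the loop above
features = np.stack([np.zeros(6, dtype=np.uint8),
                     np.full(6, 255, dtype=np.uint8)])
labels = np.array(["rahul", "priya"])  # hypothetical user names

# One file holding both: features is (n_images, n_pixels), labels is (n_images,)
np.savez("dataset.npz", features=features, labels=labels)

data = np.load("dataset.npz")
print(data["features"].shape, data["labels"].shape)  # (2, 6) (2,)
```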
It's a letter recognition task: there are 284 images and 19 classes, and I want to apply naive Bayes. First I have to convert each image to a feature vector, and to reduce the extra information I should use some feature selection, such as cropping the images to remove the extra black borders. But I'm not very experienced in Python.
How can I crop the black areas in the images in order to decrease the size of the CSV files (the columns are more than expected!)? And how can I resize the images so they are all the same size?
from PIL import Image, ImageChops
from resize import trim
import numpy as np
import cv2
import os
import csv

#Useful function
def createFileList(myDir, format='.jpg'):
    fileList = []
    print(myDir)
    for root, dirs, files in os.walk(myDir, topdown=False):
        for name in files:
            if name.endswith(format):
                fullName = os.path.join(root, name)
                fileList.append(fullName)
    return fileList

# load the original image
myFileList = createFileList('image_ocr')
#print(myFileList)

for file in myFileList:
    #print(file)
    img_file = Image.open(file)
    # img_file.show()

    # get original image parameters...
    width, height = img_file.size
    format = img_file.format
    mode = img_file.mode

    # Make image Greyscale
    img_grey = img_file.convert('L')

    # Save Greyscale values
    value = np.asarray(img_grey.getdata(), dtype=np.int).reshape((img_grey.size[1], img_grey.size[0]))
    value = value.flatten()
    #print(value)
    with open("trainData.csv", 'a') as f:
        writer = csv.writer(f)
        writer.writerow(value)
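The code above imports trim from a local resize module; for reference, a typical implementation (a sketch, assuming a uniform black border) crops to the bounding box of everything that differs from the background, using ImageChops:

```python
from PIL import Image, ImageChops

def trim(im, bg_value=0):
    """Crop away a uniform border (default black) around the image."""
    bg = Image.new(im.mode, im.size, bg_value)
    diff = ImageChops.difference(im, bg)
    bbox = diff.getbbox()          # bounding box of the non-background region
    return im.crop(bbox) if bbox else im

# Demo: a 10x10 black image with a white block at columns 3-6, rows 2-7
im = Image.new("L", (10, 10), 0)
im.paste(255, (3, 2, 7, 8))
print(trim(im).size)  # (4, 6)
```

The cropped images can then be brought to a common size with trim(img).resize((w, h)) before flattening.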
I am working on a program to take an image and flatten it so it can be written into a CSV file. That part works. I am having an issue when I try to read back the line from the CSV file. I try to reconstruct the image and I get an error of "ValueError: cannot reshape array of size 0 into shape (476,640,3)". I added sample output from the CSV file.
import csv
import cv2
import numpy as np
from skimage import io
from matplotlib import pyplot as plt

image = cv2.imread('Li.jpg')

def process_images(img):
    img = np.array(img)
    img = img.flatten()
    return img

def save_data(img):
    dataset = open('dataset.csv', 'w+')
    with dataset:
        writer = csv.writer(dataset)
        writer.writerow(img)

def load_data():
    with open('dataset.csv', 'r') as processed_data:
        reader = csv.reader(processed_data)
        for row in reader:
            img = np.array(row, dtype='uint8')
            img = img.reshape(476, 640, 3)
    return img

def print_image_stats(img):
    print(img)
    print(img.shape)
    print(img.dtype)

def rebuilt_image(img):
    img = img.reshape(476, 640, 3)
    plt.imshow(img)
    plt.show()
    return img

p_images = process_images(image)
print_image_stats(p_images)
r_image = rebuilt_image(p_images)
print_image_stats(r_image)

save_data(p_images)
loaded_data = load_data()
#r_image = rebuilt_image(loaded_data)
#print_image_stats(r_image)
The empty lines at the end of the file you posted are significant. They are considered rows by the CSV reader object and will be iterated over in your for loop. Thus there are passes through the loop in which an empty row is converted to an array of size zero, as the row has no elements. Reshaping that obviously fails.
Either remove the rows from the CSV file, or directly use the np.loadtxt function, specifying the delimiter=',' option.
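A minimal sketch of the np.loadtxt route, with a made-up 2x2 image so it stands alone: blank lines in the file are skipped, so the trailing empty rows stop being a problem.

```python
import numpy as np

# A flattened 2x2 RGB "image" written as a single CSV row, followed by
# blank lines like the ones at the end of the file in the question
flat = np.arange(12, dtype=np.uint8)
with open("dataset.csv", "w") as f:
    f.write(",".join(map(str, flat)) + "\n\n\n")

# np.loadtxt ignores the blank lines, so the reshape succeeds
img = np.loadtxt("dataset.csv", delimiter=",", dtype=np.uint8).reshape(2, 2, 3)
print(img.shape)  # (2, 2, 3)
```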
I am required to access all the images in a folder and store them in a matrix. I was able to do it using MATLAB, and here is the code:
input_dir = 'C:\Users\Karim\Downloads\att_faces\New Folder';
image_dims = [112, 92];

filenames = dir(fullfile(input_dir, '*.pgm'));
num_images = numel(filenames);
images = [];
for n = 1:num_images
    filename = fullfile(input_dir, filenames(n).name);
    img = imread(filename);
    img = imresize(img, image_dims);
end
but I am required to do it using Python, and here is my Python code:
import os
from PIL import Image
from numpy import *
import numpy as np

#import images
dirname = "C:\\Users\\Karim\\Downloads\\att_faces\\New folder"

#get number of images and dimensions
path, dirs, files = os.walk(dirname).next()
num_images = len(files)

image_file = "C:\\Users\\Karim\\Downloads\\att_faces\\New folder\\2.pgm"
im = Image.open(image_file)
width, height = im.size
images = []

for x in xrange(1, num_images):
    filename = os.listdir(dirname)[x]
    img = Image.open(filename)
    img = im.convert('L')
    images[:, x] = img[:]
but I am getting this error:
IOError: [Errno 2] No such file or directory: '10.pgm'
although the file is present.
I'm not quite sure what your end goal is, but try something more like this:
import numpy as np
import Image
import glob
filenames = glob.glob('/path/to/your/files/*.pgm')
images = [Image.open(fn).convert('L') for fn in filenames]
data = np.dstack([np.array(im) for im in images])
This will yield a width x height x num_images numpy array, assuming that all of your images have the same dimensions.
However, your images will be unsorted, so you may want to do filenames.sort().
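One caveat with a plain filenames.sort(): it is lexicographic, so '10.pgm' sorts before '2.pgm'. A small natural-sort key (a sketch, not part of the original answer) fixes that:

```python
import re

def natural_key(name):
    # Split runs of digits out of the string so they compare numerically
    return [int(tok) if tok.isdigit() else tok for tok in re.split(r"(\d+)", name)]

filenames = ["10.pgm", "2.pgm", "1.pgm"]
filenames.sort(key=natural_key)
print(filenames)  # ['1.pgm', '2.pgm', '10.pgm']
```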
Also, you may or may not want things as a 3D numpy array, but that depends entirely on what you're actually doing. If you just want to operate on each "frame" individually, then don't bother stacking them into one gigantic array.