Convert PDF to JPG - can't see outputs - Python

I have Python 3.6 and want to know how to convert 30+ pdf images into jpgs. I have these pdf images stored in one folder and would like to run a script to run through all the pdfs, convert them to jpgs and split them out into a new folder.
I tried to test this out on one image (see code below):
from pdf2jpg import pdf2jpg
inputpath = r"C:\Users\Admin-dsc\Documents\Image project\pdfinputs\RWG003209_2 Red.pdf"
outputpath = r"C:\Users\Admin-dsc\Documents\Image project\jpgoutputs"
result = pdf2jpg.convert_pdf2jpg(inputpath, outputpath, pages="1")
print(result)
The code runs fine, but when I look in the folder:
C:\Users\Admin-dsc\Documents\Image project\jpgoutputs
I see a folder called RWG003209_2 Red.pdf which is empty. I am confused - shouldn't the jpgs be saved here? Have I misunderstood something?
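One way to see what actually got written is to glob the output tree for jpgs. A minimal sketch (`find_jpgs` is a made-up helper, not part of pdf2jpg); note that pdf2jpg wraps a Java tool, so an empty output subfolder often means the conversion itself failed silently, e.g. because no Java runtime is installed. Inspecting the printed `result` can also reveal what the library reports as its output files.

```python
from pathlib import Path

def find_jpgs(output_root):
    """Recursively collect every .jpg under the output folder."""
    return sorted(str(p) for p in Path(output_root).rglob("*.jpg"))

# find_jpgs(r"C:\Users\Admin-dsc\Documents\Image project\jpgoutputs")
```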

Related

How do I convert a folder of images to a npy file?

I have a folder containing images of gestures, but to make them work with my code I need to convert them to X.npy and Y.npy. I looked at many questions about this kind of problem but am still in the dark. How do I convert the folder into an npy dataset of my own? Is there any code or converter for this?
I found a piece of code for this purpose on github.
from PIL import Image
import os
import numpy as np

'''
Converts all images in a directory to '.npy' format.
Use np.save and np.load to save and load the images.
Use it for training your neural networks in ML/DL projects.
'''

# Path to image directory
path = "/path/to/image/directory/"
dirs = os.listdir(path)
dirs.sort()
x_train = []

def load_dataset():
    # Append images to a list
    for item in dirs:
        if os.path.isfile(path + item):
            im = Image.open(path + item).convert("RGB")
            im = np.array(im)
            x_train.append(im)

if __name__ == "__main__":
    load_dataset()
    # Convert and save the list of images in '.npy' format
    imgset = np.array(x_train)
    np.save("imgds.npy", imgset)
You can refer to the code snippet in the following GitHub gist, which converts a folder of images to an npy file:
https://gist.github.com/anilsathyan7/ffb35601483ac46bd72790fde55f5c04
Here, every image in the folder is converted to a NumPy array and appended to a list named x_train. To convert and save this list of images as a single '.npy' file, use the same code snippet:
imgset = np.array(x_train)
np.save("imgds.npy", imgset)
To convert and save the list as multiple '.npy' files (one per image), use this snippet instead:
imgset = np.array(x_train, dtype=object)
for i in range(len(imgset)):
    np.save("imgds" + str(i) + ".npy", imgset[i])
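As a sanity check (a minimal sketch with stand-in data, not from the thread), the single saved file can be read back with np.load and should have shape (number_of_images, height, width, 3):

```python
import os
import tempfile

import numpy as np

# Stand-in "dataset": two 4x4 RGB images
x_train = [np.zeros((4, 4, 3), dtype=np.uint8),
           np.ones((4, 4, 3), dtype=np.uint8)]

out = os.path.join(tempfile.gettempdir(), "imgds.npy")
imgset = np.array(x_train)
np.save(out, imgset)     # single-file save, as above

loaded = np.load(out)
print(loaded.shape)      # (2, 4, 4, 3)
```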

Python script to convert an RGB image dataset into grayscale images using Pillow

I want to convert an RGB image dataset to a grayscale dataset using Pillow. I want to write a script that takes the dataset path and converts all the images one by one to grayscale. I then want to run this script on the server itself, to avoid copying the huge dataset over.
This code should work for you:
Code:
import os
from PIL import Image

# directory where images are stored
dir = '/content/pizza_steak/test/steak'

for file_name in os.listdir(dir):
    # creating a final path
    final_path = dir + '/' + file_name
    # convert and save the image
    Image.open(final_path).convert('L').save(f"/content/meow/gray{file_name}")
    # uncomment this if you want to delete the original file
    # os.remove(final_path)
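Since the question asks for a script that takes the dataset path as input, here is a sketch of the same idea wrapped in a reusable function (`convert_dataset_to_grayscale` is an invented name and the paths are placeholders):

```python
import os

from PIL import Image

def convert_dataset_to_grayscale(src_dir, dst_dir):
    """Convert every image file in src_dir to grayscale and save it in dst_dir."""
    os.makedirs(dst_dir, exist_ok=True)
    for name in os.listdir(src_dir):
        src = os.path.join(src_dir, name)
        if os.path.isfile(src):
            Image.open(src).convert("L").save(os.path.join(dst_dir, name))

# convert_dataset_to_grayscale('/content/pizza_steak/test/steak', '/content/meow')
```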

Can Python load all files in all subfolders without the complete names?

Please help me.
I'm new to Python and I want to load all the jpg images in 500 folders (each folder has up to 100 jpg images).
I also have a CSV file with a label for each folder, and I want every jpg image in a folder to get that folder's label.
Example file name: folder_name[.....].jpg
Each file starts with its folder's name; only the bracketed part differs from file to file.
How can I tell Python to match the files no matter what is inside the brackets?
I would appreciate any help.
train = pd.read_csv("COAD_CMS_label_train.csv")
train_image = []
for i in tqdm(range(train.shape[0])):
    img = image.load_img('tiles/' + train['folder_name'][i] + ' [*]' + '.jpg',
                         target_size=(256, 256, 3), grayscale=False)
example for labels:
folder_name,Label
TCGA-A6-2683-01Z-00-DX1.0dfc5d0a-68f4-45e1-a879-0428313c6dbc,CMS2
TCGA-F4-6459-01Z-00-DX1.80a78213-1137-4521-9d60-ac64813dec4c,CMS4
TCGA-A6-6653-01Z-00-DX1.e130666d-2681-4382-9e7a-4a4d27cb77a4,CMS1
You could do something like this, which gives you an iterator over all files with the jpg extension, where parent_dir is the parent directory containing all the subdirectories with the images. Note that matching files inside subdirectories needs the ** pattern together with recursive=True:
import glob
images_files = glob.iglob(f'{parent_dir}/**/*.jpg', recursive=True)
Hope that helps.
Good luck
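To also attach each folder's label from the CSV, one sketch looks like this (`label_images` is an invented helper; it assumes the CSV has the folder_name,Label columns shown in the question):

```python
import csv
from pathlib import Path

def label_images(parent_dir, labels_csv):
    """Map every jpg under parent_dir to the label of its containing folder."""
    with open(labels_csv, newline="") as f:
        labels = {row["folder_name"]: row["Label"] for row in csv.DictReader(f)}
    return {str(p): labels.get(p.parent.name)
            for p in Path(parent_dir).rglob("*.jpg")}
```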

How can I save the output from a subprocess to a dataframe?

I'm working on a script to extract exif data (latitude, longitude, and altitude) from RTK drone images. I have more or less copied the code below from a YouTube video (Franchyze923), with a few modifications. [I've been coding for a very short time.] How can I get the results of the subprocess saved to a table/dataframe (eventually I want to save the information to a .csv)?
A different version of this script generated a .csv for every image; I then imported all the csv files and pd.concat() them into one dataframe. That works but seems clunky.
import os
import subprocess

# Extracting exif data for images in Agisoft folder
exiftool_location = #path to exiftool.exe
images_to_extract_exif = #path to images

for path, directories, files in os.walk(images_to_extract_exif):
    for file_name in files:
        if file_name.endswith("JPG"):
            full_jpg_path = os.path.join(path, file_name)
            exiftool_command = [exiftool_location, "-filename", "-gpslatitude",
                                "-gpslongitude", "-gpsaltitude", "-T", "-n",
                                full_jpg_path]
            subprocess.run(exiftool_command)
The output from the code looks great - I just have no clue how to save it to a table/dataframe.
DJI_0001.JPG 45.2405341666667 -95.3808298055556 354.427
DJI_0002.JPG 45.2405253333333 -95.3808253055556 354.434
DJI_0003.JPG 45.2404568888889 -95.3808200277778 354.447
DJI_0004.JPG 45.2403695277778 -95.3808205555556 354.431
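One sketch of an answer: capture the tool's stdout instead of letting it print, then split the tab-separated lines that exiftool's -T mode produces (`run_and_parse` is an invented helper, and the column names below are assumptions based on the sample output above):

```python
import subprocess

def run_and_parse(cmd):
    """Run a command, capture its stdout, and split each line on tabs."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return [line.split("\t") for line in proc.stdout.splitlines() if line]

# rows = run_and_parse(exiftool_command)  # one [filename, lat, lon, alt] row per image
# df = pd.DataFrame(rows, columns=["filename", "lat", "lon", "alt"])
# df.to_csv("exif.csv", index=False)
```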

How do I get the face_recognition encoding from many images in a directory and store them in a CSV File?

This is the code I have, and it works for single images:
Load the images and apply the encoding:
import face_recognition
import numpy as np
import pandas as pd

Image1 = face_recognition.load_image_file("Folder/Image1.jpg")
Image_encoding1 = face_recognition.face_encodings(Image1)
Image2 = face_recognition.load_image_file("Folder/Image2.jpg")
Image_encoding2 = face_recognition.face_encodings(Image2)
Face encodings are stored in the first element of each result; after column_stack we have to resize:
Encodings_For_File = np.column_stack(([Image_encoding1[0]],
                                      [Image_encoding2[0]]))
Encodings_For_File.resize((2, 128))
Convert the array to a pandas DataFrame and write it to csv:
Encodings_For_File_Panda = pd.DataFrame(Encodings_For_File)
Encodings_For_File_Panda.to_csv("Celebrity_Face_Encoding.csv")
How do I loop over the images in 'Folder' and extract the encodings into a csv file? I have to do this with many images and cannot do it manually. I tried several approaches, but none are working for me. Can cv2 be used instead of load_image_file?
Try this.
Note: I am assuming you don't need to specify the folder path before the file name in your command. This code shows how to iterate over the directory, list the files, and process them:
import os
import face_recognition
import numpy as np

my_dir = 'folder/path/'  # Folder where all your image files reside. Ensure it ends with '/'
encoding_for_file = []   # Create an empty list for collecting the encodings

for i in os.listdir(my_dir):                             # Loop over the folder to list individual files
    image = face_recognition.load_image_file(my_dir + i)     # Run your load command
    image_encoding = face_recognition.face_encodings(image)  # Run your encoding command
    encoding_for_file.append(image_encoding[0])          # Append the first face's encoding

encodings = np.array(encoding_for_file)  # Shape (number_of_images, 128); plain lists have no resize
You can then convert to pandas and export to csv. Let me know how it goes
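That last step might look like this (a sketch with stand-in data, since producing real encodings needs the face_recognition models installed; the file name follows the question's example):

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Stand-in for real encodings: three fake 128-dimensional face vectors
encodings = np.zeros((3, 128))

df = pd.DataFrame(encodings)
out = os.path.join(tempfile.gettempdir(), "Celebrity_Face_Encoding.csv")
df.to_csv(out)
print(df.shape)   # (3, 128)
```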
