I want to save frames from VideoCapture under a certain condition, so I wrote the following code:
date = datetime.now()
detector = dlib.get_frontal_face_detector()
cap = cv2.VideoCapture(1) #I use second camera
global count, no_face
total_number = 0
count = 0
no_face = 0
num_bad_posture = 0
not_detected_posture = 0
while True:
    ret, frame = cap.read()
    frame = cv2.resize(frame, None, fx=sf, fy=sf, interpolation=cv2.INTER_AREA)
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    dets = detector(gray, 1)
    if not dets:
        print('no face')
        no_face += 1
        if no_face > 15:
            print('no face!! for real!!')
            now = str(date.now())
            not_detected = cv2.resize(frame, None, fx=5, fy=5, interpolation=cv2.INTER_CUBIC)
            not_detected_posture += 1
            print(type(now))
            cv2.imwrite('./images/non_detected/non_detected({0},{1}).jpg'.format(not_detected_posture, now), not_detected)
            no_face = 0
    for i, d in enumerate(dets):
        no_face = 0
        w = d.width()
        h = d.height()
        x = d.left()
        y = d.top()
If I run this code, the file is not saved. If I remove date.now() from the filename and use only the counter, it is saved properly. I have no idea what is wrong with the file name, because its type is str and other strings are saved properly. Also, if I do
print(type(now), now)
the type and value appear as expected. I need help.
I would use a more limited set of characters for the filenames, avoiding things like spaces, colons, parentheses. Use a custom datetime format to generate a timestamp in a form you want.
Example:
from datetime import datetime
from os import path
BASE_DIR = './images/non_detected'
PREFIX = 'non_detected'
EXTENSION = 'jpg'
file_name_format = "{:s}-{:d}-{:%Y%m%d_%H%M%S}.{:s}"
date = datetime.now()
not_detected_posture = 0
file_name = file_name_format.format(PREFIX, not_detected_posture, date, EXTENSION)
file_path = path.normpath(path.join(BASE_DIR, file_name))
# ---------
print("File name = '%s'" % file_name)
print("File path = '%s'" % file_path)
Output:
File name = 'non_detected-0-20170819_152459.jpg'
File path = 'images/non_detected/non_detected-0-20170819_152459.jpg'
Simple, unambiguous and portable.
Perhaps one more improvement. If you have a rough idea how many images you will generate, you can zero-pad the number, so that the filenames are sorted in order. For example, if you know there will be at most 10000, then you can do
file_name_format = "{:s}-{:04d}-{:%Y%m%d_%H%M%S}.{:s}"
so you get a file name like
File name = 'non_detected-0000-20170819_152459.jpg'
The problem could be with the ':' or the '.' in the filename due to the inclusion of the datetime. I am not sure which one, because you can have filenames that include those and you can write such filenames from within python AFAIK. So, it might be an issue specific to cv2.
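To see where the unsafe characters come from, here is a small stdlib-only sketch (no cv2 required) comparing the default str() form of a datetime with a strftime-based one; the helper name safe_timestamp is just for illustration:

```python
from datetime import datetime

def safe_timestamp(dt):
    """Render a datetime using only filename-safe characters."""
    return dt.strftime("%Y%m%d_%H%M%S")

stamp = datetime(2017, 8, 19, 15, 24, 59)
risky = str(stamp)            # contains a space and colons -- risky in filenames
safe = safe_timestamp(stamp)  # digits and an underscore only
```

On Windows in particular, colons are forbidden in filenames, so the strftime form is the safer default everywhere.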
Related
Per title, I'm trying to write code to loop through multiple videos in a folder to extract their frames, then write each video's frames to their own new folder, e.g. video1 to frames_video1, video2 to frames_video2.
This is my code:
subclip_video_path = main_path + "\\subclips"
frames_path = main_path + "\\frames"

# loop through videos in folder
for subclips in subclip_video_path:
    currentVid = cv2.VideoCapture(subclips)
    success, image = currentVid.read()
    count = 0
    while success:
        # create new frames folder for each video
        newFrameFolder = ("frames_" + subclips)
        os.makedirs(newFrameFolder)
I get this error:
[ERROR:0] global C:\Users\appveyor\AppData\Local\Temp\1\pip-req-build-k8sx3e60\opencv\modules\videoio\src\cap.cpp (142) cv::VideoCapture::open VIDEOIO(CV_IMAGES): raised OpenCV exception:
OpenCV(4.4.0) C:\Users\appveyor\AppData\Local\Temp\1\pip-req-build-k8sx3e60\opencv\modules\videoio\src\cap_images.cpp:253: error: (-5:Bad argument) CAP_IMAGES: can't find starting number (in the name of file): P in function 'cv::icvExtractPattern'
What does this mean? How can I fix this?
You can't loop through a string: for subclips in subclip_video_path: iterates over the characters of the path, not over files.
You need to get the list of your videos:
from glob import glob
sub_clip_video_path = glob("sub_clip_video_path/*.mp4")
This gets all the video files with the .mp4 extension and stores them in the sub_clip_video_path variable.
My Result:
['sub_clip_video_path/output.mp4', 'sub_clip_video_path/result.mp4']
Since the directory contains two .mp4 files, we can continue.
You don't need to re-declare VideoCapture for each frame.
for count, sub_clips in enumerate(sub_clip_video_path):
    currentVid = cv2.VideoCapture(sub_clips)
    success, image = currentVid.read()
    count = 0
Declare VideoCapture once, read all the frames of the current video, and only then declare VideoCapture for the next video:
for count, sub_clips in enumerate(sub_clip_video_path):
    currentVid = cv2.VideoCapture(sub_clips)
    image_counter = 0
    while currentVid.isOpened():
        .
        .
Don't use while success; it creates an endless loop. If the first frame is grabbed from the video, the success variable is True. When you say:
while success:
    # create new frames folder for each video
    newFrameFolder = ("frames_" + subclips)
    os.makedirs(newFrameFolder)
you will try to create the same folder over and over for the current frame.
Here is my result:
import os
import cv2
from glob import glob

sub_clip_video_path = glob("sub_clip_video_path/*.mp4")  # each video has the `.mp4` extension

for count, sub_clips in enumerate(sub_clip_video_path):
    currentVid = cv2.VideoCapture(sub_clips)
    image_counter = 0
    while currentVid.isOpened():
        success, image = currentVid.read()
        if success:
            newFrameFolder = "frames_video{}".format(count + 1)
            if not os.path.exists(newFrameFolder):
                os.makedirs(newFrameFolder)
            image_name = os.path.join(newFrameFolder, "frame{}.png".format(image_counter + 1))
            cv2.imwrite(image_name, image)
            image_counter += 1
        else:
            break
I gathered all the videos using glob
While the current video is being read:
for count, sub_clips in enumerate(sub_clip_video_path):
    currentVid = cv2.VideoCapture(sub_clips)
    image_counter = 0
    while currentVid.isOpened():
If the current frame was successfully grabbed, build the folder name. If the folder does not exist, create it:
if success:
    newFrameFolder = "frames_video{}".format(count + 1)
    if not os.path.exists(newFrameFolder):
        os.makedirs(newFrameFolder)
Then declare the image name and save it.
image_name = os.path.join(newFrameFolder, "frame{}.png".format(image_counter + 1))
cv2.imwrite(image_name, image)
image_counter += 1
I have written a script that finds the image size and aspect ratio of all images in a directory, along with their corresponding file paths. I want to print the dict values to a CSV file with the headers width, height, aspect-ratio, and filepath.
import os
import json
from PIL import Image

folder_images = "/home/user/Desktop/images"
size_images = dict()

def yocd(a, b):  # greatest common divisor
    if b == 0:
        return a
    else:
        return yocd(b, a % b)

for dirpath, _, filenames in os.walk(folder_images):
    for path_image in filenames:
        if path_image.endswith(".png") or path_image.endswith('.jpg') or path_image.endswith('.JPG') or path_image.endswith('.jpeg'):
            image = os.path.abspath(os.path.join(dirpath, path_image))
            """ ImageFile.LOAD_TRUNCATED_IMAGES = True """
            try:
                with Image.open(image) as img:
                    img.LOAD_TRUNCATED_IMAGES = True
                    img.verify()
                    print('Valid image')
            except Exception:
                print('Invalid image')
                img = False
            if img is not False:
                width, heigth = img.size
                divisor = yocd(width, heigth)
                w = str(int(width / divisor))
                h = str(int(heigth / divisor))
                aspectratio = w + ':' + h
                size_images[image] = {'width': width, 'heigth': heigth, 'aspect-ratio': aspectratio, 'filepath': image}

for k, v in size_images.items():
    print(k, '-->', v)

with open('/home/user/Documents/imagesize.txt', 'w') as file:
    file.write(json.dumps(size_images))
You can add a (properly constructed) dict directly to a pandas.DataFrame. Then, DataFrames have a .to_csv() function.
Here are the docs:
Pandas: Create a DataFrame
Pandas: Write to CSV
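For example (assuming pandas is installed; the sample values are made up, and I keep the question's 'heigth' key spelling):

```python
import pandas as pd

# sample data in the same shape as size_images from the question
size_images = {
    '/images/a.png': {'width': 1920, 'heigth': 1080,
                      'aspect-ratio': '16:9', 'filepath': '/images/a.png'},
    '/images/b.png': {'width': 800, 'heigth': 600,
                      'aspect-ratio': '4:3', 'filepath': '/images/b.png'},
}

# orient='index' turns each inner dict into one DataFrame row
df = pd.DataFrame.from_dict(size_images, orient='index')
df.to_csv('imagesize.csv', index=False)  # one CSV row per image
```

Since 'filepath' is already a column in each inner dict, index=False drops the redundant path index.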
Without dependencies (but you may have to tweak the formatting)
csv_sep = ';'  # choose which field separator you want

with open('your_csv', 'w') as f:
    # header
    f.write("width" + csv_sep + "height" + csv_sep + "aspect-ratio" + csv_sep + "filepath\n")
    # data (size_images maps each path to a dict, so iterate the values;
    # note the question's dict spells the key 'heigth')
    for img in size_images.values():
        fields = [str(img['width']), str(img['heigth']), img['aspect-ratio'], img['filepath']]
        f.write(csv_sep.join(fields) + '\n')
I am trying to iterate through PDFs to extract information from emails. My individual regex statements work when I try them on individual examples; however, when I put the code together in a for loop to iterate over multiple PDFs at once, I am unable to append to my aggregate df (currently I just create an empty df). I need the try/except because not all emails have all fields (e.g. some do not have the 'Attachments' field). Below is the code I have written so far:
import os
import re
import pandas as pd
pd.options.display.max_rows = 999
import numpy
from numpy import NaN
from tika import parser

root = r"my_dir"
agg_df = pd.DataFrame()

for directory, subdirectory, files in os.walk(root):
    for file in files:
        filepath = os.path.join(directory, file)
        print(file)
        raw = parser.from_file(filepath)
        img = raw['content']
        img = img.replace('\n', '')
        try:
            from_field = re.search(r'From:(.*?)Sent:', img).group(1)
        except:
            pass
        try:
            sent_field = re.search(r'Sent:(.*?)To:', img).group(1)
        except:
            pass
        try:
            to_field = re.search(r'To:(.*?)Cc:', img).group(1)
        except:
            pass
        try:
            cc_field = re.search(r'Cc:(.*?)Subject:', img).group(1)
        except:
            pass
        try:
            subject_field = re.search(r'Subject:(.*?)Attachments:', img).group(1)
        except:
            pass
        try:
            attachments_field = re.search(r'Attachments:(.*?)NOTICE', img).group(1)
        except:
            pass
        img_df = pd.DataFrame(columns=['From', 'Sent', 'To',
                                       'Cc', 'Subject', 'Attachments'])
        img_df['From'] = from_field
        img_df['Sent'] = sent_field
        img_df['To'] = to_field
        img_df['Cc'] = cc_field
        img_df['Subject'] = subject_field
        img_df['Attachments'] = attachments_field
        agg_df = agg_df.append(img_df)
There are two things:

When you don't get a match, you shouldn't just pass on the exception; use a default value instead.

Don't append to your dataframe each time through the loop; that is slow. Keep everything in a dictionary, and then construct the dataframe at the end.
E.g.
from collections import defaultdict

data = defaultdict(list)
for directory, _, files in os.walk(root):
    for file in files:
        filepath = os.path.join(directory, file)
        print(file)
        raw = parser.from_file(filepath)
        img = raw['content']
        img = img.replace('\n', '')

        from_match = re.search(r'From:(.*?)Sent:', img)
        if not from_match:
            sent_by = None
        else:
            sent_by = from_match.group(1)
        data["from"].append(sent_by)

        to_match = re.search(r'To:(.*?)Cc:', img)
        if not to_match:
            sent_to = None
        else:
            sent_to = to_match.group(1)
        data["to"].append(sent_to)

        # All your other regexes

df = pd.DataFrame(data)
Also, if you're doing this for a lot of files, you should look into using compiled expressions (re.compile).
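For instance, a minimal sketch of one compiled pattern with a default of None (the helper extract_from and the sample string are made up for illustration):

```python
import re

# compile once, outside the per-file loop, so the pattern
# is not re-parsed for every file
FROM_RE = re.compile(r'From:(.*?)Sent:')

def extract_from(text):
    """Return the From: field, or None when it is absent."""
    match = FROM_RE.search(text)
    return match.group(1).strip() if match else None

sample = "From: alice@example.com Sent: Monday To: bob@example.com"
```

The same pattern applies to the other header fields: one compiled regex each, searched inside the loop.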
I searched online but found nothing really helpful. I am trying to verify a file name; if that file name already exists, change the name slightly. For instance, when writing a file User.1.1.jpg, I want it to change to User.2.1.jpg if 1.1 already exists, and so on.
import cv2
import os

cam = cv2.VideoCapture(0)
cam.set(3, 640)
cam.set(4, 480)
face_detector = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
#face_id = input('\n id: ')
print("\n [INFO] Initializing face capture. Look at the camera and wait ...")
count = 1
face_id = 1
while(True):
    ret, img = cam.read()
    img = cv2.flip(img, 1)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = face_detector.detectMultiScale(gray, 1.3, 5)
    for (x, y, w, h) in faces:
        cv2.rectangle(img, (x, y), (x+w, y+h), (255, 0, 0), 2)
        count += 1
        if os.path.exists("dataset/User.%s.1.jpg" % face_id):
            face_id + 1
        cv2.imwrite("dataset/User." + str(face_id) + '.' + str(count) + ".jpg", gray[y:y+h, x:x+w])
    cv2.imshow('image', img)
    k = cv2.waitKey(100) & 0xff
    if k == 27:
        break
    elif count >= 30:
        break
print("\n [INFO] Exiting Program and cleanup stuff")
cam.release()
cv2.destroyAllWindows()
You can use a while loop instead of an if statement to keep incrementing face_id until the target file name is found to be available.
Change:

if os.path.exists("dataset/User.%s.1.jpg" % face_id):
    face_id + 1

to:

while os.path.exists("dataset/User.%s.1.jpg" % face_id):
    face_id += 1
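A quick, self-contained way to convince yourself the loop works, using a throwaway temp directory (next_free_face_id is just a name for this sketch):

```python
import os
import tempfile

def next_free_face_id(dataset_dir, face_id=1):
    """Keep incrementing face_id until User.<id>.1.jpg is unused."""
    while os.path.exists(os.path.join(dataset_dir, "User.%s.1.jpg" % face_id)):
        face_id += 1
    return face_id

# simulate ids 1 and 2 already taken; the next free id should be 3
with tempfile.TemporaryDirectory() as d:
    for taken in (1, 2):
        open(os.path.join(d, "User.%s.1.jpg" % taken), "w").close()
    free_id = next_free_face_id(d)
```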
Here is a function I made to add an incrementing number to the end of the existing file name. You would only need to change the string manipulation depending on your desired new file name formatting.
import os
from pathlib import Path

def uniq_file_maker(file: str) -> str:
    """Create a unique file path"""
    # get file name and extension
    filename, filext = os.path.splitext(os.path.basename(file))
    # get file directory path
    directory = os.path.dirname(file)
    # get file path without extension
    filexx = str(directory + os.sep + filename)
    # check if file exists
    if Path(file).exists():
        # create incrementing variable
        i = 1
        # determine incremented filename
        while os.path.exists(f"{filexx} ({str(i)}){filext}"):
            # update the incrementing variable
            i += 1
        # build the file name with the incremented variable
        return directory + os.sep + filename + ' (' + str(i) + ')' + filext
    # the file does not exist yet, so the original path is already unique
    return file
Additionally, here is a similar function I made that does the same thing when creating a new directory.
import os
from pathlib import Path
from tkinter import Tk, filedialog

def uniq_dir_maker(directoryname: str) -> str:
    """Create a unique directory at destination"""
    # file destination select dialogue
    Tk().withdraw()  # prevent root window
    # open file explorer folder select window
    dirspath = filedialog.askdirectory(title='Select the output file save destination')
    # build the full directory path
    dirsavepath = str(dirspath + os.sep + directoryname)
    # try to create directory
    try:
        # create directory at destination without overwriting
        Path(dirsavepath).mkdir(parents=True, exist_ok=False)
    # if directory already exists add incremental integers until unique
    except FileExistsError:
        # create incrementing variable
        i = 1
        # determine incremented directory name
        while os.path.exists(f"{dirsavepath} ({str(i)})"):
            i += 1
        # update directory path with incremented variable
        dirsavepath = dirsavepath + ' (' + str(i) + ')'
        # create the now-unique directory
        Path(dirsavepath).mkdir(parents=True, exist_ok=False)
    # add os separator to new directory for saving
    savepath = dirsavepath + os.sep
    return savepath
If you want to use the pathlib library and add a datetime suffix, it would look like this:
from pathlib import Path
from datetime import datetime

# path and file parameters
path = Path('/home/')  # path to your files
file_name = 'file.txt'  # your filename here

filename_full = path.joinpath(file_name)
if filename_full.is_file():  # checks whether a file of this name already exists
    suffix = datetime.now().strftime("%Y%m%d_%H%M%S")  # suffix with current date and exact time
    print(f'The file exists. Adding datetime {suffix} to the filename')
    file_name1 = filename_full.stem + suffix  # adds the suffix to the filename
    filename_full = path.joinpath(file_name1).with_suffix('.txt')  # full filename with path and the final suffix
    print(filename_full)
I am trying to do a mass extraction of GPS EXIF data; my code is below:
from PIL import Image
from PIL.ExifTags import TAGS, GPSTAGS

def get_exif_data(image):
    exif_data = {}
    info = image._getexif()
    if info:
        for tag, value in info.items():
            decoded = TAGS.get(tag, tag)
            if decoded == "GPSInfo":
                gps_data = {}
                for t in value:
                    sub_decoded = GPSTAGS.get(t, t)
                    gps_data[sub_decoded] = value[t]
                exif_data[decoded] = gps_data
            else:
                exif_data[decoded] = value
    return exif_data

def _get_if_exist(data, key):
    if key in data:
        return data[key]
    else:
        pass

def get_lat_lon(exif_data):
    gps_info = exif_data["GPSInfo"]
    lat = None
    lon = None
    if "GPSInfo" in exif_data:
        gps_info = exif_data["GPSInfo"]
        gps_latitude = _get_if_exist(gps_info, "GPSLatitude")
        gps_latitude_ref = _get_if_exist(gps_info, "GPSLatitudeRef")
        gps_longitude = _get_if_exist(gps_info, "GPSLongitude")
        gps_longitude_ref = _get_if_exist(gps_info, "GPSLongitudeRef")
        if gps_latitude and gps_latitude_ref and gps_longitude and gps_longitude_ref:
            lat = _convert_to_degrees(gps_latitude)
            if gps_latitude_ref != "N":
                lat = 0 - lat
            lon = _convert_to_degrees(gps_longitude)
            if gps_longitude_ref != "E":
                lon = 0 - lon
    return lat, lon
Code source
Which is run like:
if __name__ == "__main__":
    image = Image.open("photo directory")
    exif_data = get_exif_data(image)
    print(get_lat_lon(exif_data))
This works fine for one photo, so I've used glob to iterate over all photos in a folder:
import glob

file_names = []
for name in glob.glob(photo directory):
    file_names.append(name)

for item in file_names:
    if __name__ == "__main__":
        image = Image.open(item)
        exif_data = get_exif_data(image)
        print(get_lat_lon(exif_data))
    else:
        pass
Which works fine, as long as every photo in the folder is a) an image and b) has GPS data. I have tried adding a pass in the _get_if_exist function as well as in my file iteration; however, neither seems to have had any impact, and I'm still receiving KeyError: 'GPSInfo'.
Any ideas on how I can ignore photos with no data or different file types?
A possible approach would be to write a small helper function that first checks whether the file is actually an image file and, as a second step, checks whether the image contains EXIF data.
def is_metadata_image(filename):
    try:
        image = Image.open(filename)
        return 'exif' in image.info
    except OSError:
        return False
I found that PIL does not always work with .png files that do contain EXIF information when using _getexif(). So instead I check for the key exif in the info dictionary of the image.
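With that helper in place, filtering a folder could look like this (the photos/ directory is a placeholder; note that Pillow's open errors, including a missing file, subclass OSError, so they all map to False):

```python
from glob import glob
from PIL import Image

def is_metadata_image(filename):
    try:
        image = Image.open(filename)
        return 'exif' in image.info
    except OSError:  # unreadable, not an image, or missing
        return False

# keep only files that are readable images AND carry EXIF data
usable = [f for f in glob("photos/*") if is_metadata_image(f)]
```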
I've tried this source code.
You simply need to remove
gps_info = exif_data["GPSInfo"]
from the first line of the get_lat_lon(exif_data) function; it works well for me.
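In other words, let the existing `if "GPSInfo" in exif_data` guard run before any indexing. A minimal sketch of the guarded lookup (plain dicts only; the conversion to degrees is omitted for brevity):

```python
def get_lat_lon(exif_data):
    """Return (lat, lon) values, or (None, None) when GPS data is absent."""
    lat = None
    lon = None
    if "GPSInfo" in exif_data:  # guard BEFORE indexing
        gps_info = exif_data["GPSInfo"]
        lat = gps_info.get("GPSLatitude")
        lon = gps_info.get("GPSLongitude")
    return lat, lon

no_gps = get_lat_lon({})  # no KeyError anymore
with_gps = get_lat_lon({"GPSInfo": {"GPSLatitude": 51.5, "GPSLongitude": -0.1}})
```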