Combine list of videos into one video - python

I have a list of videos (all .mp4) that I would like to combine into one large .mp4 video. The files are named as follows: vid0.mp4, vid1.mp4, vid2.mp4, .... After searching, I found a Quora question which explains that the main file should be opened, then all the sub-files should be read (as bytes) and written to it. So here is my code:
import os
with open("MainVideo.mp4","wb") as f:
    for video in os.listdir("/home/timmy/sd/videos/"):
        temp=open('/home/timmy/sd/videos/%s'%video)
        h=temp.read()
        '''
        for i in h:
            f.write(i) #Error
        '''
        f.write(h)
        temp.close()
This is only writing the first video. Is there a way to write it without using outside libraries? If not, please refer me to one.
I also tried the moviepy library, but I get an OSError.
code:
from moviepy.editor import VideoFileClip, concatenate_videoclips
li1=[]
for i in range(0,30):
    name_of_file = "/home/timmy/sd/videos/vid%d.mp4"%i
    clip = VideoFileClip(name_of_file)
    #print(name_of_file)
    li1.append(clip)
I get OSError after the 9th clip. (I think this is because of the size of the list.)
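For what it's worth, appending the raw bytes of several .mp4 files does not give a playable combined video: each file carries its own container metadata, so players typically stop after the first one. A common route is ffmpeg's concat demuxer, which reads a text file listing the inputs. Below is a minimal sketch, assuming ffmpeg is installed and all clips share the same codec and resolution. The helper names write_concat_list and concat_command and the list file name videos.txt are made up for illustration.

```python
from pathlib import Path


def write_concat_list(video_paths, list_path):
    """Write the list file for ffmpeg's concat demuxer: one "file '<path>'" line per input."""
    lines = []
    for p in video_paths:
        # A single quote inside a path is escaped as '\'' (ffmpeg quoting rules).
        escaped = str(p).replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    Path(list_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines


def concat_command(list_path, output_path):
    """Build the ffmpeg command; '-c copy' joins the streams without re-encoding."""
    return ["ffmpeg", "-f", "concat", "-safe", "0",
            "-i", str(list_path), "-c", "copy", str(output_path)]


# Hypothetical usage with the file names from the question:
# videos = ["/home/timmy/sd/videos/vid%d.mp4" % i for i in range(30)]
# write_concat_list(videos, "videos.txt")
# subprocess.run(concat_command("videos.txt", "MainVideo.mp4"), check=True)
```

Because the file names are generated from range(30), the clips are listed in numerical order, which os.listdir does not guarantee.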

Related

Errno 20: Not a directory when saving into zip file

When I try to save a pyplot figure as a jpg, I keep getting a directory error saying that the given file name is not a directory. I am working in Colab. I have a numpy array called z_img and have opened a zip file.
import matplotlib.pyplot as plt
from zipfile import ZipFile
zipObj = ZipFile('slices.zip', 'w') # opening zip file
plt.imshow(z_img, cmap='binary')
The plotting works fine. I did a test of saving the image into Colab's regular memory like so:
plt.savefig(str(ii)+'um_slice.jpg')
And this works perfectly, except I am intending to use this code in a for loop. ii is an index to differentiate between each image, and several hundred images would be created so I want them going in the zipfile. Now when I try adding the path to the zipfile:
plt.savefig('/content/slices.zip/'+str(ii)+'um_slice.jpg')
I get: NotADirectoryError: [Errno 20] Not a directory: '/content/slices.zip/150500um_slice.jpg'
I assume it's because the {}.jpg string is a filename, and not a directory per se. But I am quite new to Python, and don't know how to get the plot into the zip file. That's all I want. Would love any advice!
First off, for anything that's not photographic content (i.e. nice and soft), JPEG is the wrong format. You'll have a better time using a different file format. PNG is nice for pixels, SVG for vector graphics (in case you embed this in a website later!), PDF for vector, too.
The error message is quite on point: you cannot just save to a zip file as if it was a directory.
Multiple ways around:
use the tempfile module's mkdtemp to make a temporary directory, save into that, and zip the result
save not into a filename, but into a buffer (BytesIO I guess) and append that to the compressed stream (I'm not too familiar with ZipFile)
use PDF as output and simply generate a multipage PDF; it's not hard, and probably much nicer in the long term. You can still convert that vector graphic result to PNG (or any other pixel format) as desired, but for the time being, it's space-efficient, arbitrarily scalable and keeps all your pages in one place. It's easy to import selected pages into LaTeX (as a matter of fact, \includegraphics does it directly) or into websites (pdf.js).
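The buffer option from the list above can be sketched with the standard library alone. This is only an illustration: add_rendered_file and fake_render are made-up names, and fake_render stands in for a call such as plt.savefig(fileobj, format="png"), so the sketch runs without matplotlib.

```python
import io
import zipfile


def add_rendered_file(zip_file, member_name, render):
    """Render into an in-memory buffer, then store the bytes as one zip member.

    `render` is any callable that writes binary data to a file-like object.
    """
    buffer = io.BytesIO()
    render(buffer)
    zip_file.writestr(member_name, buffer.getvalue())


def fake_render(fileobj):
    # Stand-in for: plt.savefig(fileobj, format="png")
    fileobj.write(b"not really an image")


archive_bytes = io.BytesIO()  # in-memory archive so the sketch leaves no files behind
with zipfile.ZipFile(archive_bytes, "w", zipfile.ZIP_DEFLATED) as zip_obj:
    for ii in (150500, 150600):
        add_rendered_file(zip_obj, f"{ii}um_slice.png", fake_render)

with zipfile.ZipFile(archive_bytes) as zip_obj:
    print(zip_obj.namelist())
```

In the real loop you would open ZipFile('slices.zip', 'w') once, outside the loop, and pass a function that calls plt.savefig on the buffer.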
From the docs, matplotlib.pyplot.savefig accepts a binary file-like object, and ZipFile.open creates binary file-like objects. These two have to get together! Since there is no file name for savefig to infer the format from, pass the format explicitly:
with zipObj.open(str(ii)+'um_slice.jpg', 'w') as fp:
    plt.savefig(fp, format='jpg')

Append series of videos together in Python/OpenCV

I'm processing a video file and decided to split it up into equal chunks for parallel processing with each chunk running on its own process. I generate this series of video files that I want to connect together to make the original video.
I'm wondering what's the most efficient way of stringing these videos together without having to append frame by frame? (and ideally deleting the video files after they are read so I'm only left with one big video).
I wanted a programmatic solution as opposed to a command. I found moviepy very useful for concatenating videos (it's based on ffmpeg). Natsort is very useful for sorting the files in numerical order.
import os
from moviepy.editor import VideoFileClip, concatenate_videoclips
from natsort import natsorted

# path is the path to the folder of videos
def concatVideos(path):
    currentVideo = None
    # Go through the files in the directory in natural order and append them one by one
    for fileName in natsorted(os.listdir(path)):
        if fileName.endswith(".mov"):
            clip = VideoFileClip(os.path.join(path, fileName))
            if currentVideo is None:
                currentVideo = clip
                continue
            currentVideo = concatenate_videoclips([currentVideo, clip])
    currentVideo.write_videofile("export.mp4")
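To see why natural sorting matters for numbered chunks, here is a small standard-library sketch that mimics the idea. natural_key is a made-up helper for illustration, not part of natsort, and it only covers the simple case of embedded integers.

```python
import re


def natural_key(name):
    """Split a name into text and integer parts so 'vid10' sorts after 'vid2'."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


names = ["vid10.mp4", "vid2.mp4", "vid0.mp4", "vid1.mp4"]
print(sorted(names))                   # plain string order puts vid10 before vid2
print(sorted(names, key=natural_key))  # numerical order: vid0, vid1, vid2, vid10
```

With plain string sorting the chunks would be joined out of order as soon as there are ten or more of them.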

Remove all images from docx files

I've searched the documentation for python-docx and other packages, as well as Stack Overflow, but could not find how to remove all images from docx files with Python.
My exact use case: I need to convert hundreds of Word documents to a "draft" format to be viewed by clients. Those drafts should be identical to the original documents, but all the images must be deleted / redacted from them.
Sorry for not including an example of things I tried, what I have tried is hours of research that didn't give any info. I found this question on how to extract images from word files, but that doesn't delete them from the actual document: Extract pictures from Word and Excel with Python
From there and other sources I've found out that docx files could be read as simple zip files, I don't know if that means that it's possible to "re-zip" without the images without affecting the integrity of the docx file (edit: simply deleting the images works, but prevents python-docx from continuing to work with this file because of missing references to images), but thought this might be a path to a solution.
Any ideas?
If your goal is to redact images maybe this code I used for a similar usecase could be useful:
import io
import zipfile
import zlib
from PIL import Image, ImageFilter

blur = ImageFilter.GaussianBlur(40)

def redact_images(filename):
    outfile = filename.replace(".docx", "_redacted.docx")
    with zipfile.ZipFile(filename) as inzip:
        with zipfile.ZipFile(outfile, "w") as outzip:
            for info in inzip.infolist():
                name = info.filename
                print(info)
                content = inzip.read(info)
                if name.endswith((".png", ".jpeg", ".gif")):
                    # Blur the image and re-encode it in its original format
                    fmt = name.split(".")[-1]
                    img = Image.open(io.BytesIO(content))
                    img = img.convert().filter(blur)
                    outb = io.BytesIO()
                    img.save(outb, fmt)
                    content = outb.getvalue()
                    info.file_size = len(content)
                    info.CRC = zlib.crc32(content)
                # Every entry is copied; only image entries have new content
                outzip.writestr(info, content)
Here I used PIL to blur images in some files, but instead of the blur filter any other suitable operation could be used. This worked quite nicely for my usecase.
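If the images should disappear rather than be blurred, one variation is to swap each picture for a tiny transparent placeholder, so the references inside the document still resolve. The sketch below uses only the standard library and makes several assumptions: it only touches .png files under word/media/, other formats are left as they are, and I have not checked how Word lays out the result. blank_png and blank_out_pngs are made-up names.

```python
import struct
import zipfile
import zlib


def _png_chunk(tag, data):
    """One PNG chunk: length, tag, data, CRC over tag + data."""
    crc = zlib.crc32(tag + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", crc)


def blank_png():
    """Build a 1x1 fully transparent RGBA PNG by hand."""
    header = struct.pack(">IIBBBBB", 1, 1, 8, 6, 0, 0, 0)  # 1x1, 8-bit, RGBA
    pixel_row = b"\x00" + b"\x00\x00\x00\x00"  # filter byte + one RGBA pixel
    return (b"\x89PNG\r\n\x1a\n"
            + _png_chunk(b"IHDR", header)
            + _png_chunk(b"IDAT", zlib.compress(pixel_row))
            + _png_chunk(b"IEND", b""))


def blank_out_pngs(src, dst):
    """Copy a .docx, replacing every PNG under word/media/ with the placeholder."""
    placeholder = blank_png()
    replaced = []
    with zipfile.ZipFile(src) as inzip, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as outzip:
        for info in inzip.infolist():
            content = inzip.read(info)
            name = info.filename
            if name.startswith("word/media/") and name.lower().endswith(".png"):
                content = placeholder
                replaced.append(name)
            outzip.writestr(name, content)
    return replaced
```

Calling blank_out_pngs("report.docx", "report_draft.docx") (hypothetical file names) returns the list of replaced entries, which is handy for logging when converting hundreds of files.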
I don't think it's currently implemented in python-docx.
Pictures in the Word Object Model are defined as either floating shapes or inline shapes. The docx documentation states that it only supports inline shapes.
The Word Object Model for Inline Shapes supports a Delete() method, which should be accessible. However, it is not listed in the examples of InlineShapes and there is also a similar method for paragraphs. For paragraphs, there is an open feature request to add this functionality - which dates back to 2014! If it's not added to paragraphs it won't be available for InlineShapes as they are implemented as discrete paragraphs.
You could do this with win32com if you have a machine with Word and Python installed.
This would allow you to call the Word Object Model directly, giving you access to the Delete() method. In fact you could probably cheat - rather than scrolling through the document to get each image, you can call Find and Replace to clear the image. This SO question talks about win32com find and replace:
import win32com.client
from os import getcwd, listdir
docs = [i for i in listdir('.') if i[-3:]=='doc' or i[-4:]=='docx'] #All Word file
FromTo = {"First Name":"John",
          "Last Name":"Smith"} #You can insert as many as you want
word = win32com.client.DispatchEx("Word.Application")
word.Visible = True #Keep comment after tests
word.DisplayAlerts = False
for doc in docs:
    word.Documents.Open('{}\\{}'.format(getcwd(), doc))
    for From in FromTo.keys():
        word.Selection.Find.Text = From
        word.Selection.Find.Replacement.Text = FromTo[From]
        word.Selection.Find.Execute(Replace=2, Forward=True) #You made the mistake here=> Replace must be 2
    name = doc.rsplit('.',1)[0]
    ext = doc.rsplit('.',1)[1]
    word.ActiveDocument.SaveAs('{}\\{}_2.{}'.format(getcwd(), name, ext))
word.Quit() # releases Word object from memory
In this case, since we want images, we would use the special code ^g as the Find text and an empty string as the replacement:
find = word.Selection.Find
find.Text = "^g"
find.Replacement.Text = ""
find.Execute(Replace=2, Forward=True)  # 2 = wdReplaceAll, so every match is removed
I don't know this library, but looking through the documentation I found this section about images. It mentions that it is currently not possible to insert images other than inline ones. If that is what you have in your documents, I assume you can also retrieve them by looking in the Document object and then remove them?
The Document is explained here.
Although not a duplicate, you might also want to look at this question's answer where user "scanny" explains how he finds images using the library.

imread_collection There is no item

I am trying to read several images from archive with skimage.io.imread_collection, but for some reason it throws an error:
"There is no item named '00071198d059ba7f5914a526d124d28e6d010c92466da21d4a04cd5413362552/masks/*.png' in the archive".
I checked several times: the directory exists in the archive, and with *.png I just specify that I want all the images in my collection. imread_collection works well when I load the images from the extracted folder rather than from the archive.
#specify folder name
each_img_idx = '00071198d059ba7f5914a526d124d28e6d010c92466da21d4a04cd5413362552'
with zipfile.ZipFile('stage1_train.zip') as archive:
    mask_ = skimage.io.imread_collection(archive.open(str(each_img_idx) + '/masks/*.png')).concatenate()
Could someone explain what's going on?
Not all scikit-image plugins support reading from bytes, so I recommend using imageio. You'll also have to tell ImageCollection how to access the images inside the archive, which is done using a customized load_func:
import zipfile
import imageio
from skimage import io

archive = zipfile.ZipFile('foo.zip')
images = [f.filename for f in archive.filelist]

def zip_imread(fn):
    # Read the member's bytes from the archive and decode them with imageio
    return imageio.imread(archive.read(fn))

ic = io.ImageCollection(images, load_func=zip_imread)
ImageCollection has some benefits like not loading all images into memory at the same time. But if you simply want a long list of NumPy arrays, you can do:
collection = [imageio.imread(archive.read(f)) for f in archive.filelist]
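As for the original error: ZipFile.open expects the exact name of one member and does not expand wildcards, so '.../masks/*.png' is looked up literally and not found. To select only the masks, the member names can be filtered first. A minimal sketch using fnmatch from the standard library follows; matching_members is a made-up helper, and the archive is a tiny in-memory one with placeholder names and contents.

```python
import fnmatch
import io
import zipfile


def matching_members(archive, pattern):
    """Return the archive member names that match a shell-style pattern."""
    return sorted(fnmatch.filter(archive.namelist(), pattern))


# Build a tiny in-memory archive to demonstrate (placeholder names and contents).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("abc123/masks/a.png", b"mask a")
    zf.writestr("abc123/masks/b.png", b"mask b")
    zf.writestr("abc123/images/abc123.png", b"image")

with zipfile.ZipFile(buffer) as archive:
    names = matching_members(archive, "abc123/masks/*.png")
    masks = [archive.read(name) for name in names]
    print(names)
```

The resulting list of names can then be passed on to an image reader one member at a time.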

Moviepy unable to read duration of file

I have been using Moviepy to combine several shorter video files into hour-long files. Some of the small files are "broken": they contain video, but the recording was not completed correctly (i.e. they play in VLC, but there is no duration and you cannot skip around in the video).
I noticed this issue when I tried to create a clip using the VideoFileClip(file) function. The error that comes up is:
MoviePy error: failed to read the duration of file
Is there a way to still read the "good" frames from this video file and then add them to the longer video?
UPDATE
To clarify, my issue specifically is with the following function call:
clip = mp.VideoFileClip("/home/test/"+file)
Stepping through the code it seems to be an issue when checking the duration of the file in ffmpeg_reader.py where it looks for the duration parameter in the video file. However, since the file never finished recording properly this information is missing. I'm not very familiar with the way video files are structured so I am unsure of how to proceed from here.
You're correct. This issue arises commonly when the video duration info is missing from the file.
Here's a thread on the issue: GitHub moviepy issue 116
One user proposed the solution of using MP4Box to convert the video using this guide: RASPIVID tutorial
The final solution that worked for me involved specifying the path to ImageMagick's binary, as WDBell mentioned in this post.
I had the path correctly set in my environment variables, but it wasn't until I specifically defined it in config_defaults.py that it started working.
I solved it in a simpler way: with VLC, I converted the file to the format "MPEG4 xxx TV/device", where xxx is 720p or 1080p, depending on the output format you choose.
You can then use the new file from Python without any problem.
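A similar conversion can be scripted instead of done by hand in VLC. The sketch below only builds and runs an ffmpeg command that copies the streams into a newly written container. It assumes ffmpeg is installed, and whether a plain stream copy restores the missing duration depends on how the file is damaged, so treat it as something to try and not as a guaranteed fix. remux_command and remux are made-up names.

```python
import shutil
import subprocess


def remux_command(src, dst):
    """ffmpeg command that copies the streams into a freshly written container."""
    return ["ffmpeg", "-y", "-i", str(src), "-c", "copy", str(dst)]


def remux(src, dst):
    """Run the remux if ffmpeg is on PATH; return True when it was run."""
    if shutil.which("ffmpeg") is None:
        return False
    subprocess.run(remux_command(src, dst), check=True)
    return True
```

If the copy does not help, dropping "-c", "copy" from the command makes ffmpeg re-encode the file, which is slower but closer to what the VLC conversion does.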
I already answered this question in the GitHub issue: https://github.com/Zulko/moviepy/issues/116
The issue appears when moviepy's VideoFileClip(file) function looks for the duration parameter in the video file and it is missing. To avoid this for corrupted files, make sure the total frame count is not zero before calling the function: clip = mp.VideoFileClip("/home/test/"+file)
So, I handled it in a simpler way using cv2.
The idea:
find out the total number of frames
if the frame count is zero, use cv2's writer to generate a temporary copy of the video clip
mix the audio from the original video into the copy
replace the original video and delete the copy
then call clip = mp.VideoFileClip("/home/test/"+file)
Clarification: Since OpenCV VideoWriter does not encode audio, the new copy will not contain audio, so it would be necessary to extract the audio from the original video and then mix it with the copy, before replacing it with the original video.
You must import cv2
import cv2
And then add something like this in your code before the evaluation:
cap = cv2.VideoCapture("/home/test/"+file)
frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
fps = int(cap.get(cv2.CAP_PROP_FPS))
print(f'Checking Video {file} Frames {frames} fps: {fps}')
For a broken file this will most likely return 0 frames, but it should still return the frame rate (fps).
Now we can add the check that avoids the error by writing a temporary video:
if frames == 0:
    print(f'No frames data in video {file}, trying to convert this video..')
    writer = cv2.VideoWriter("/home/test/fixVideo.avi", cv2.VideoWriter_fourcc(*'DIVX'), int(cap.get(cv2.CAP_PROP_FPS)),(int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))))
    while True:
        ret, frame = cap.read()
        if ret is True:
            writer.write(frame)
        else:
            cap.release()
            print("Stopping video writer")
            writer.release()
            writer = None
            break
Mix the audio from the original video into the copy. I have created a function for this:
from moviepy.editor import VideoFileClip, CompositeAudioClip

def mix_audio_to_video(pathVideoInput, pathVideoNonAudio, pathVideoOutput):
    videoclip = VideoFileClip(pathVideoInput)
    audioclip = videoclip.audio
    new_audioclip = CompositeAudioClip([audioclip])
    videoclipNew = VideoFileClip(pathVideoNonAudio)
    videoclipNew.audio = new_audioclip
    videoclipNew.write_videofile(pathVideoOutput)

mix_audio_to_video("/home/test/"+file, "/home/test/fixVideo.avi", "/home/test/fixVideo.mp4")
Replace the original video and delete the temporary copy (this needs import os):
os.replace("/home/test/fixVideo.mp4", "/home/test/"+file)
os.remove("/home/test/fixVideo.avi")
I had the same problem and found a solution.
I don't know why, but if the path is written as a raw string, path = r'<path>', instead of with escaped backslashes ("F:\\path"), there is no error.
Just open the file
C:\Users\gladi\AppData\Local\Programs\Python\Python311\Lib\site-packages\moviepy\video\io\ffmpeg_reader.py
delete the code in it, and replace it with the version I published on GitHub: https://github.com/dudegladiator/Edited-ffmpeg-for-moviepy
After that, the duration can be read:
clip1=VideoFileClip('path')
c=clip1.duration
print(c)
