replace downloaded youtube video content with a bold centered text - python

I have zero experience with python, but it is clear enough (for most of the code).
There is this code:
from moviepy.editor import *
video = VideoFileClip("myHolidays.mp4").subclip(50,60)
# Make the text. Many more options are available.
txt_clip = ( TextClip("My Holidays 2013",fontsize=70,color='white')
             .set_position('center')
             .set_duration(10) )
result = CompositeVideoClip([video, txt_clip]) # Overlay text on video
result.write_videofile("myHolidays_edited.webm",fps=25) # Many options...
What I want to do:
Replace my whole video with centered, bold text (maybe with some small effects) on a solid color background.
How do I do that?
A* If I delete the "subclip(50,60)" part, will that select the whole clip?
B* And if I delete ".set_duration(10)", will the rest of the code still work?
C* How do I delete the whole (previous) video content?
D* Please suggest a simple, professional-looking effect (for text).

If you are trying to create a video of just the text, with the same length as the original video, find the duration of the whole video, set the duration of the text clip to the same value, and then txt_clip.write(...
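A minimal sketch of that idea, assuming the moviepy 1.x API used in the question (ColorClip, TextClip, set_duration) and a working ImageMagick install for TextClip. The function name, the background color and the 'Arial-Bold' font name are illustrative assumptions, not part of the original code. Dropping .subclip(...) gives the whole clip, the text clip takes its duration from the source (as far as I know a TextClip has no duration of its own, so it still needs one), and the old frames are left out simply by not adding the source video to the composite.

```python
def make_text_only_video(source_path, text, out_path, bg_color=(20, 20, 60)):
    """Write a video as long as source_path that shows only centered text."""
    # Imported lazily so the function can be defined without moviepy installed.
    from moviepy.editor import (ColorClip, CompositeVideoClip, TextClip,
                                VideoFileClip)

    source = VideoFileClip(source_path)  # no .subclip(): the whole clip
    duration = source.duration           # length in seconds

    # Solid color background with the original frame size and length.
    background = ColorClip(size=source.size, color=bg_color, duration=duration)

    # Bold text needs a bold font. The font name is an assumption and
    # depends on which fonts ImageMagick can find on the machine.
    txt_clip = (TextClip(text, fontsize=70, color='white', font='Arial-Bold')
                .set_position('center')
                .set_duration(duration)
                .crossfadein(1))         # simple one-second fade-in

    # The source video is not part of the composite, so its frames are gone.
    result = CompositeVideoClip([background, txt_clip])
    result.write_videofile(out_path, fps=25)
    source.close()
```

It would be called as make_text_only_video("myHolidays.mp4", "My Holidays 2013", "myHolidays_edited.webm"). A fade-in is about the simplest effect that still looks deliberate.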

Related

How to filter PDF text by font?

PDF example
A PDF may contain multiple fonts, how can I only keep 1 font with the most words with Python?
disclaimer: I am the author of borb (the library I will use in this example)
Oddly enough, there is a fairly close matching example in the borb examples repository for filtering by font. You can find that example here.
In this example, we extract all the text in a particular font in the PDF (e.g. all text written in Courier).
You can easily build on this code to make something that counts the characters for each particular font (and, at a later stage, returns only the font with the most characters).
I'll repeat the example here for completeness:
import typing
from borb.pdf.document.document import Document
from borb.pdf.pdf import PDF
from borb.toolkit.text.font_name_filter import FontNameFilter
from borb.toolkit.text.simple_text_extraction import SimpleTextExtraction

def main():

    # create FontNameFilter
    l0: FontNameFilter = FontNameFilter("Courier")

    # filtered text just gets passed to SimpleTextExtraction
    l1: SimpleTextExtraction = SimpleTextExtraction()
    l0.add_listener(l1)

    # read the Document
    doc: typing.Optional[Document] = None
    with open("output.pdf", "rb") as in_file_handle:
        doc = PDF.loads(in_file_handle, [l0])

    # check whether we have read a Document
    assert doc is not None

    # print the text on the first page that is written in the filtered font
    print(l1.get_text_for_page(0))

if __name__ == "__main__":
    main()
Aside from the imports, everything is quite straightforward. You specify the string of the font you want to filter on. This filter object will process the parsing/rendering of the PDF, and will only push events to its children if they are relevant (if the font information matches).
We add SimpleTextExtraction as its child, and in doing so only get the text that is rendered in the desired font.
After we've set up this entire thing, we need to actually process (parse) the Document which is what happens in the next lines.
Some caveats:
PDF documents might contain so-called 'subset fonts'. This is when a font is artificially made smaller by throwing out unused letters, i.e. if a PDF never uses the uppercase 'X', the font does not need to store information on how to render it. Typically, the names of subset fonts are not the same as those of their original font. You might get something like Courier+AEOKFF.
If this happens to be the case, check out the code of FontNameFilter and make another version that only checks the name using startswith, which ought to do the trick.
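The counting step and the subset-name caveat can be sketched in plain Python, independent of borb. The sketch assumes you have already collected (font name, text) pairs from the document; how to obtain them from borb is not shown, and the function names are hypothetical. As far as I know the PDF specification puts the six-letter subset tag before the plus sign (e.g. AEOKFF+Courier), so the helper strips a tag on either side rather than relying on startswith alone.

```python
import re
from collections import Counter

# A subset tag is six uppercase letters joined to the font name by "+".
_SUBSET_PREFIX = re.compile(r"^[A-Z]{6}\+")
_SUBSET_SUFFIX = re.compile(r"\+[A-Z]{6}$")


def base_font_name(name):
    """Strip a subset tag from either end of a font name."""
    name = _SUBSET_PREFIX.sub("", name)
    return _SUBSET_SUFFIX.sub("", name)


def dominant_font(chunks):
    """Return the font with the most words, or None if chunks is empty.

    chunks is an iterable of (font_name, text) pairs (an assumed input format).
    """
    words = Counter()
    for font_name, text in chunks:
        words[base_font_name(font_name)] += len(text.split())
    if not words:
        return None
    return words.most_common(1)[0][0]
```

The returned name could then be passed to FontNameFilter in a second pass to keep only that font's text.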

Can python-docx preserve font color and styles when importing documents?

Essentially what I need to do is write a program that takes in many .docx files and puts them all in one, ordered in a certain way. I have importing working via:
import docx, os, glob

finaldocname = 'Midterm-All-Questions.docx'
finaldoc=docx.Document()
docstoworkon = glob.glob('*.docx')
if finaldocname in docstoworkon:
    docstoworkon.remove(finaldocname) #dont process final doc if it exists
for f in docstoworkon:
    doc=docx.Document(f)
    fullText=[]
    for para in doc.paragraphs:
        fullText.append(para.text) #generates a long text list
    # finaldoc.styles = doc.styles
    for l in fullText:
        # if l=='u\'\\n\'':
        if '#' in l:
            print('We got here!')
            if '#1 ' not in l: #check last two characters to see if this is the first question
                finaldoc.add_section() #only add a page break between questions
        finaldoc.add_paragraph(l)
    # finaldoc.add_page_break
# finaldoc.add_page_break
finaldoc.save(finaldocname)
But I need to preserve text styles, like font colors, sizes, italics, etc., and they aren't in this method since it just gets the raw text and dumps it. I can't find anything on the python-docx documentation about preserving text styles or importing in something other than raw text. Does anyone know how to go about this?
Styles are a bit difficult to work with in python-docx but it can be done.
See this explanation first to understand some of the problems with styles and Word.
The Long Way
When you read in a file as a Document() it will bring in all of the paragraphs and within each of these are the runs. These runs are chunks of text with the same style attached to them.
You can find out how many paragraphs or runs there are by doing len() on the object or you can iterate through them like you did in your example with paragraphs.
You can inspect the style of any given paragraph but runs may have different styles than the paragraph as a whole, so I would skip to the run itself and inspect the style there using paragraphs[0].runs[0].style which will give you a style object. You can inspect the font object beyond that which will tell you a number of attributes like size, italic, bold, etc.
Now to the long solution:
First create a new blank paragraph, then add_run() one by one with the text from your original. For each of these you can define a style attribute, but it would have to be a named style as described in the first link. You cannot apply a style object directly, as it won't copy the attributes over. But there is a way around that: check the attributes that you care about copying to the output and then ensure your new run applies the same attributes.
doc_out = docx.Document()
for para in doc.paragraphs:
    p = doc_out.add_paragraph()
    for run in para.runs:
        r = p.add_run(run.text)
        if run.bold:
            r.bold = True
        if run.italic:
            r.italic = True
        # etc
Obviously this is inefficient and not a great solution, but it will work to ensure you have copied the style appropriately.
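The "# etc" part can be filled in for the attributes the question mentions (color, size, italics). This is a sketch, not a complete style copier: it only touches the run-level attributes python-docx exposes as run.bold, run.italic, run.underline and run.font (name, size, color.rgb), and the helper names are my own. Formatting that comes from a named paragraph or character style is not carried over.

```python
def copy_run_format(src, dst):
    """Copy direct character formatting from one python-docx run to another."""
    # bold/italic/underline are tri-state (True, False, or None = inherit),
    # so assigning them directly also preserves "inherit".
    dst.bold = src.bold
    dst.italic = src.italic
    dst.underline = src.underline
    dst.font.name = src.font.name
    dst.font.size = src.font.size
    # Only copy an explicitly set RGB color.
    rgb = src.font.color.rgb
    if rgb is not None:
        dst.font.color.rgb = rgb


def copy_paragraphs(doc_in, doc_out):
    """Append every paragraph of doc_in to doc_out, run by run."""
    for para in doc_in.paragraphs:
        p = doc_out.add_paragraph()
        for run in para.runs:
            copy_run_format(run, p.add_run(run.text))
```

In the question's loop, copy_paragraphs(doc, finaldoc) would take the place of the finaldoc.add_paragraph(l) call, with the '#' check done on para.text.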
Add New Styles
There is a way to add styles by name but because it isn't likely that the Word document you are getting the text and styles from is using named styles (rather than just applying bold, etc. to the words that you want), it is probably going to be a long road to adding a lot of slightly different styles or sometimes even the same ones.
Unfortunately that is the best answer I have for you on how to do this. Working with Word, Outlook, and Excel documents is not great in Python, especially for what you are trying to do.

How can I accurately set the new cursor positions after text replacements have been made

I am trying to adapt a Sublime Text 3 plugin for automated text replacement. What I want it to do is paste in text from the clipboard and make some automatic text substitutions:
import sublime
import sublime_plugin
import re

class PasteAndEscapeCommand(sublime_plugin.TextCommand):
    def run(self, edit):
        # Position of cursor for all selections
        before_selections = [sel for sel in self.view.sel()]
        # Paste from clipboard
        self.view.run_command('paste')
        # Position of cursor for all selections after paste
        after_selections = [sel for sel in self.view.sel()]
        # Define a new region based on pre and post paste cursor positions
        new_selections = list()
        delta = 0
        for before, after in zip(before_selections, after_selections):
            new = sublime.Region(before.begin() + delta, after.end())
            delta = after.end() - before.end()
            new_selections.append(new)
        # Clear any existing selections
        self.view.sel().clear()
        # Select the saved region
        self.view.sel().add_all(new_selections)
        # Replace text accordingly
        for region in self.view.sel():
            # Get the text from the selected region
            text = self.view.substr(region)
            # Make the required edits on the text
            text = text.replace("\\","\\\\")
            text = text.replace("_","\\_")
            text = text.replace("*","\\*")
            # Paste the text back to the saved region
            self.view.replace(edit, region, text)
        # Clear selections and set cursor position
        self.view.sel().clear()
        self.view.sel().add_all(after_selections)
This works for the most part, except that I need to get the new region for the edited text. The cursor is placed at the end of the pasted text. However, since my replacements always make the text longer, the final position will be inaccurate.
I know very little about Python for Sublime and like most others this is my first plugin.
How do I set the cursor position to account for the size changes in the text? I know I need to do something with the after_selections list, but I am not sure how to create new regions, as they were created from selections which are cleared in an earlier step.
I feel that I am getting close with
# Add the updated region to the selection
self.view.sel().subtract(region)
self.view.sel().add(sublime.Region(region.begin()+len(text)))
This, for some reason I don't yet understand, places the cursor at the beginning and end of the replaced text. A guess would be that I am removing the regions one by one but forgetting some "initial" region that also exists.
Note
I am pretty sure the double loop in the code in the question is redundant, but that is outside the scope of the question.
I think your own answer to your question is a good one and probably the way I would go if I was to do something like this in this manner.
In particular, since the plugin is modifying the text on the fly and making it longer, the first way that immediately presents itself as a solution other than what your own answer is doing would be to track the length change of the text after the replacements so you can adjust the selections accordingly.
Since I can't really provide a better answer to your question than the one you already came up with, here's an alternative solution to this instead:
import sublime
import sublime_plugin

class PasteAndEscapeCommand(sublime_plugin.TextCommand):
    def run(self, edit):
        org_text = sublime.get_clipboard()

        text = org_text.replace("\\","\\\\")
        text = text.replace("_","\\_")
        text = text.replace("*","\\*")

        sublime.set_clipboard(text)
        self.view.run_command("paste")
        sublime.set_clipboard(org_text)
This modifies the text on the clipboard to be quoted the way you want it to be quoted so that it can just use the built in paste command to perform the paste.
The last part puts the original clipboard text back on the clipboard, which for your purposes may or may not be needed.
So, one approach for this would be to make new regions as the replaced text is created using their respective lengths as starting positions. Then once the loop is complete clear all existing selections and set the new one we created in the replacement loop.
# Replace text accordingly
new_replacedselections = list()
for region in self.view.sel():
    # Get the text from the selected region
    text = self.view.substr(region)
    # Make the required edits on the text
    text = text.replace("\\","\\\\") # Double up slashes
    text = text.replace("*","\\*") # Escape *
    text = text.replace("_","\\_") # Escape _
    # Paste the text back to the saved region
    self.view.replace(edit, region, text)
    # Add the updated region to the collection
    new_replacedselections.append(sublime.Region(region.begin()+len(text)))
# Set the selection positions after the new insertions.
self.view.sel().clear()
self.view.sel().add_all(new_replacedselections)
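The length bookkeeping discussed in this thread can be tried outside Sublime on a plain string. The sketch below is not plugin code: it models the buffer as a str and the regions as (begin, end) offsets, which is an assumption made so the arithmetic is easy to check. Each replacement shifts everything after it by the number of characters it added, so a running delta is applied to the later regions.

```python
def escape_markdown(text):
    """Apply the same three substitutions as the plugin, backslashes first."""
    text = text.replace("\\", "\\\\")
    text = text.replace("_", "\\_")
    text = text.replace("*", "\\*")
    return text


def replace_and_track(buffer, regions):
    """Escape each region and return (new_buffer, cursor_positions).

    regions must be sorted, non-overlapping (begin, end) offsets into the
    ORIGINAL buffer. Each cursor position is the end of the replaced text.
    """
    delta = 0      # how much the buffer has grown so far
    cursors = []
    for begin, end in regions:
        begin += delta
        end += delta
        new_text = escape_markdown(buffer[begin:end])
        buffer = buffer[:begin] + new_text + buffer[end:]
        cursors.append(begin + len(new_text))
        delta += len(new_text) - (end - begin)
    return buffer, cursors
```

For example, replace_and_track("a_b c*d", [(0, 3), (4, 7)]) escapes both words, and the second cursor lands one character further right than it would without the delta.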

Text boxes in xmgrace (preferably with GracePlot.py)

I am currently plotting figures with xmgrace from python using GracePlot.py and I would like to make text annotations in the graph and place them inside a box, in order to make the reading easy when the grid is on.
Does anybody know how to do it with GracePlot.py? Or from xmgrace GUI?
The code I use is similar to the following:
import GracePlot as xg
import math
from numpy import arange
x=arange(0,10,0.1)
y=[math.exp(-q) for q in x]
grace=xg.GracePlot()
graph=grace[0]
data=xg.Data(x=x,y=y)
graph.plot(data)
graph.text('This should be placed inside a box',5,0.5)
I had a quick look through the latest GracePlot module source code. It seems that the author has not yet implemented the capability to make boxes.
Normally, the "Box" tool can be found under "Drawing Objects" when using the Grace/xmgrace GUI. Create a box and save the project, then view it in a text editor as the file is saved in an ASCII format. The following section can be found:
#with box
# box on
# box loctype view
# box 0.340196078431, 0.691176470588, 0.619607843137, 0.513725490196
# box linestyle 1
# box linewidth 1.0
# box color 1
# box fill color 1
# box fill pattern 0
#box def
As you can see by comparing similar chunks for text creation etc. with the source code, the GracePlot module is just printing similar commands for each of the things it is generating. It would be quite easy to add the capability to make boxes. Perhaps you have time yourself? :)
The capability for text has been implemented:
from GracePlot import *
p = GracePlot()
[....]
p.text('Hello, world!', 0.5, 0.4, color=violet, charsize=1.2)
will place some violet-colored text at (0.5, 0.4) with 1.2 character size.
Boxes in Grace do not have their own text, so in order to solve your question you can simply place a text object over the box you created.
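As a starting point for adding box support, the commands from the saved project can be generated by a small helper. This is only a sketch: box_commands is a hypothetical function, not part of GracePlot, and it builds the command strings without sending them. How to pass raw commands to the running Grace process depends on your GracePlot version, so check its source for the method it uses internally for text.

```python
def box_commands(x1, y1, x2, y2, color=1, fill_color=1,
                 fill_pattern=0, linewidth=1.0):
    """Return the Grace commands for one box, in view coordinates.

    The defaults mirror the values found in the saved project file.
    """
    return [
        "with box",
        "box on",
        "box loctype view",
        "box %s, %s, %s, %s" % (x1, y1, x2, y2),
        "box linestyle 1",
        "box linewidth %s" % linewidth,
        "box color %s" % color,
        "box fill color %s" % fill_color,
        "box fill pattern %s" % fill_pattern,
        "box def",
    ]
```

With fill pattern 0 the box is only an outline. To hide the grid behind the text you would probably want a non-zero fill pattern and the background color as fill color, which is worth trying in the GUI first.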

cut parts of a video using gstreamer/Python (gnonlin?)

I have a video file and I'd like to cut out some scenes (either identified by a time position or a frame). As far as I understand, that should be possible with gnonlin, but so far I wasn't able to find a sample of how to do that (ideally using Python). I don't want to modify the video/audio parts if possible (but conversion to mp4/webm would be acceptable).
Am I correct that gnonlin is the right component in the gstreamer universe to do that? Also I'd be glad for some pointers/recipes how to approach the problem (gstreamer newbie).
Actually it turns out that "gnonlin" is too low-level and still requires a lot of gstreamer knowledge. Luckily there is "gstreamer-editing-services" (gst-editing-services), which is a library offering a higher level API on top of gstreamer and gnonlin.
With a tiny bit of RTFM reading and a helpful blog post with a Python example I was able to solve my basic problem:
Load the asset (video)
Create a Timeline with a single layer
Add the asset multiple times to the layer, adjusting start, inpoint and duration so only the relevant parts of the video are present in the output video
Most of my code is directly taken from the referenced blog post above so I don't want to dump all of that here. The relevant stuff is this:
asset = GES.UriClipAsset.request_sync(source_uri)
timeline = GES.Timeline.new_audio_video()
layer = timeline.append_layer()
start_on_timeline = 0
start_position_asset = 10 * 60 * Gst.SECOND
duration = 5 * Gst.SECOND
# GES.TrackType.UNKNOWN => add every kind of stream to the timeline
clip = layer.add_asset(asset, start_on_timeline, start_position_asset,
    duration, GES.TrackType.UNKNOWN)

start_on_timeline = duration
start_position_asset = start_position_asset + 60 * Gst.SECOND
duration = 20 * Gst.SECOND
clip2 = layer.add_asset(asset, start_on_timeline, start_position_asset,
    duration, GES.TrackType.UNKNOWN)
timeline.commit()
The resulting video includes the segments 10:00–10:05 and 11:00–11:20 (the second inpoint is the first one plus 60 seconds), so essentially there are two cuts: one at the beginning and one in the middle.
From what I have seen this worked perfectly fine, audio and video in sync, no worries about key frames and whatnot. The only part left is to find out if I can translate the "frame number" into a timing reference for gst editing services.
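For the frame-number question, one possible approach is plain arithmetic, since GStreamer timestamps are nanoseconds and Gst.SECOND is 1,000,000,000. The sketch below assumes a constant frame rate given as a fraction (for example 25/1 or 30000/1001); frame_to_ns is a hypothetical helper, and reading the actual frame rate from the asset is not shown.

```python
GST_SECOND = 1_000_000_000  # same value as Gst.SECOND (nanoseconds)


def frame_to_ns(frame, fps_num, fps_den=1):
    """Convert a frame index to a timestamp in nanoseconds.

    Assumes a constant frame rate of fps_num / fps_den frames per second.
    Integer arithmetic avoids floating point drift on long videos.
    """
    return frame * fps_den * GST_SECOND // fps_num
```

The result can be used wherever the code above uses multiples of Gst.SECOND, e.g. start_position_asset = frame_to_ns(15000, 25) for the frame at 10:00 in a 25 fps file.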
