Save doc file as pdf file using python - python

I want to save a .docx file as a PDF using Python. I have tried many solutions but couldn't find one that works.
This is my code. I tried to save the output as a PDF file, but it won't open. Any help is highly appreciated:
def replace_string(filenameInput, filenameOutput):
    doc = Document(filenameInput)
    for p in doc.paragraphs:
        for d in J['data']:
            if p.text.find(d['variable']) >= 0:
                p.text = p.text.replace(d['variable'], d['value'])
    doc.save(filenameOutput)

replace_string('test.docx', 'test2.pdf')

import docx2pdf

def convert_file(filenameInput, filenameOutput):
    docx2pdf.convert(filenameInput, filenameOutput)

convert_file('test.docx', 'test2.pdf')
There is a Python package called docx2pdf that converts a .docx file to PDF. Note that it drives Microsoft Word to do the conversion, so Word must be installed (Windows or macOS).
Here is a link to the package: https://pypi.org/project/docx2pdf/
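The original output would not open because Document.save always writes a .docx package (a zip archive), whatever extension the output name has. A quick way to confirm what a file really contains is to look at its first bytes. This is a minimal sketch; the function name sniff_format is made up for illustration.

```python
def sniff_format(path):
    """Guess a file's real format from its first bytes, ignoring the extension."""
    with open(path, "rb") as fh:
        head = fh.read(5)
    if head.startswith(b"%PDF-"):
        return "pdf"
    if head.startswith(b"PK"):
        # .docx, .odt and plain .zip files are all zip containers starting with "PK"
        return "zip-based"
    return "unknown"
```

Calling sniff_format('test2.pdf') on the file produced by the question's code should report a zip-based file rather than a PDF.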

Related

I want to open a json file in python but got an error. It said No such file or directory

I wrote the code like this:
intents = json.loads(open('intents.json').read())
Check that your intents.json file is in the same folder as your Python file, and that you run the script from that folder: a relative path like 'intents.json' is resolved against the current working directory.
You can use, for example, the built-in os module to check whether the file exists, and os.path for path manipulation. See the official docs at https://docs.python.org/3/library/os.path.html
import os

file = 'intents.json'
# location of the current working directory
w_dir = os.path.abspath('.')
if os.path.isfile(os.path.join(w_dir, file)):
    with open(file, 'r') as fd:
        data = fd.read()
else:
    print('Such file does not exist here "{}"...'.format(w_dir))
You can try opening the file using normal file operations and then use json.load or json.loads to parse the data as needed. As far as I can tell your one-liner is valid Python, so the error is about the path rather than the syntax, but splitting it into two steps makes it easier to see which one fails.
You can open the file like this:
f = open(file_name)
Then parse the data:
data = json.load(f)
You can refer to this link for more info and reference
https://www.geeksforgeeks.org/read-json-file-using-python/
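Putting the two answers together, here is a minimal sketch that checks the path first and reports where Python actually looked before parsing. The helper name load_json is hypothetical and not part of the json module.

```python
import json
from pathlib import Path


def load_json(path):
    """Load a JSON file, failing with a message that shows where Python looked."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(
            f"{path.name} not found at {path.resolve()} (cwd is {Path.cwd()})"
        )
    # json.load parses straight from the open file; the with block closes it
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)
```

To load a file that sits next to the script regardless of where the script is started from, one option is load_json(Path(__file__).resolve().parent / 'intents.json').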

How to open PDF file with Docx in Python?

I want to open a pdf file from my mac, but I get this error:
'This file can't be opened. It's possible damaged or has a document structure which Preview doesn't recognize.'
This is the code I'm using:
from docx import Document

#open the document
doc=Document('./testDoc.docx')
a = input('Whats your name ')
b = input('Whats your date of birth ')
Dictionary = {"name": a, "dob": b}
for i in Dictionary:
    for p in doc.paragraphs:
        if p.text.find(i)>=0:
            p.text=p.text.replace(i,Dictionary[i])
#save changed document
doc.save('/my/path/contract{}.pdf'.format(a))
Does anyone know what is going wrong?
Unfortunately, I don't think the docx module works for pdfs--there's nothing in their documentation about it. But you can use the docx2pdf module instead: https://pypi.org/project/docx2pdf/
Here's the simple how-to that's in their documentation:
from docx2pdf import convert
convert("input.docx", "output.pdf")
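Applied to the question, the fix is a two-step process: save the edited document as a real .docx, then convert that file. The sketch below assumes python-docx and docx2pdf are installed and that Microsoft Word is available for the conversion. The names pdf_path_for and fill_and_convert are made up for illustration.

```python
from pathlib import Path


def pdf_path_for(docx_path):
    """Return the same path with a .pdf extension."""
    return Path(docx_path).with_suffix(".pdf")


def fill_and_convert(template, filled_docx, replacements):
    """Replace placeholders, save as .docx, then convert that file to PDF."""
    # Imported here so pdf_path_for works without these packages installed
    from docx import Document
    from docx2pdf import convert

    doc = Document(template)
    for p in doc.paragraphs:
        for old, new in replacements.items():
            if old in p.text:
                # Assigning p.text collapses the paragraph into a single run
                p.text = p.text.replace(old, new)
    doc.save(str(filled_docx))                # step 1: still a .docx file
    pdf_path = pdf_path_for(filled_docx)
    convert(str(filled_docx), str(pdf_path))  # step 2: conversion done by Word
    return pdf_path
```

For example, fill_and_convert('./testDoc.docx', 'contract.docx', {'name': a, 'dob': b}) would leave both contract.docx and contract.pdf on disk.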
The docx module cannot convert a Word document to PDF.
On Windows with Microsoft Word installed, you can use the pywin32 module:
import win32com.client

def wordToPdf(input_path, output_path):
    word = win32com.client.Dispatch("Word.Application")
    doc = word.Documents.Open(str(input_path))
    # 17 is Word's wdFormatPDF constant
    doc.SaveAs(str(output_path), FileFormat=17)
    doc.Close()
    word.Quit()

docx.opc.exceptions.PackageNotFoundError

I am trying to use the docxtpl library, with this example from the documentation:
from docxtpl import DocxTemplate
doc = DocxTemplate("my_word_template.docx")
But I get the error "Package not found at '%s'" % pkg_file. If I do this:
import os.path
if os.path.isfile('my_word_template.docx'):
    print ("File exist")
it prints File exist. The file is in the same directory as the script. I also tried using an absolute path to the file, but that didn't help. In the source I found the place that raises this exception. How can I fix it?
This probably indicates that the file is not a real .docx file. Could you please check it using the is_zipfile function from the zipfile module?
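A minimal sketch of that check, using only the standard library. The helper name looks_like_docx is hypothetical, and it assumes the usual layout in which the main document part is stored as word/document.xml.

```python
import zipfile


def looks_like_docx(path):
    """Return True if path is a zip archive that contains word/document.xml."""
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as zf:
        return "word/document.xml" in zf.namelist()
```

If looks_like_docx('my_word_template.docx') returns False, the file is likely an old .doc, an RTF, or some other format that was renamed to .docx, and it needs to be re-saved from Word as a real .docx.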
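Try using python-docx, which you can install with pip install python-docx.
Then, in your file, write something like this:
import docx
from docx.opc.exceptions import PackageNotFoundError

try:
    document = docx.Document('your_doc_name.docx')
except PackageNotFoundError:
    # Missing or not a valid .docx package: start from a blank document.
    # Note that this overwrites whatever was at that path.
    document = docx.Document()
    document.save('your_doc_name.docx')
    print("Previous file was corrupted or didn't exist - new file was created.")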

Python: Use Dropbox API - Save .ODT File

I'm using the Dropbox API with Python. I have no problems with the API itself; all the authentication steps work.
When I use this code:
pdf_dropbox = client.get_file('/Example.pdf')
new_file = open('/home/test.pdf','w')
new_file.write(pdf_dropbox.read())
I generate a file in the path /home/test.pdf, it's a PDF file and the content is displayed same as original.
But when I try same code with an .odt file, it fails generating the new file:
odt_dropbox = client.get_file('/Example.odt')
new_file = open('/home/test_odt.odt','w')
new_file.write(odt_dropbox.read())
This new file test_odt.odt has errors and I can't see its content.
# With this instruction I have the content of the odt file inside odt_dropbox
odt_dropbox = client.get_file('/Example.odt')
What is the best way to save the content of an .odt file?
Is there a better way to write LibreOffice files?
I'd appreciate any helpful information.
Thanks
Solved. I forgot two things:
Open the file for binary writing, wb instead of w:
new_file = open('/home/test_odt.odt','wb')
Close the file after writing, with new_file.close(), so the buffered data is flushed to disk.
Full Code:
odt_dropbox = client.get_file('/Example.odt')
new_file = open('/home/test_odt.odt','wb')
new_file.write(odt_dropbox.read())
new_file.close()
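The same two fixes can be expressed with a with block, which closes (and therefore flushes) the file even if the write fails. This is a minimal sketch that assumes the source object only needs a read() method, as the object returned by client.get_file does above. The helper name copy_stream is made up for illustration.

```python
import shutil


def copy_stream(source, dest_path):
    """Copy a binary file-like object to dest_path in chunks."""
    # "wb" writes the bytes unchanged; the with block closes and flushes the file
    with open(dest_path, "wb") as dest:
        shutil.copyfileobj(source, dest)
```

With the names from the question this would be copy_stream(odt_dropbox, '/home/test_odt.odt').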

Search and Replace not working in header? Python docx

I'm using the python-docx module to do some edits on a large number of documents. They all contain a header in which I need to replace a number, but every time I do this the document won't open, with the error that the content is unreadable. Does anyone have any ideas as to why this is happening, or sample working code snippets? Thanks.
from docx import *
#document = yourdocument.docx
filename = "NUR-ADM-2001"
relationships = relationshiplist()
document = opendocx("C:/Users/ai/My Documents/Nursing docs/" + filename + ".docx")
docbody = document.xpath('/w:document/w:body',namespaces=nsprefixes)[0]
advReplace(docbody, "NUR-NPM 101", "NUR-NPM 202")
# Create our properties, contenttypes, and other support files
coreprops = coreproperties(title='Nursing Doc',subject='Policies',creator='IA',keywords=['Policy'])
appprops = appproperties()
contenttypes = contenttypes()
websettings = websettings()
wordrelationships = wordrelationships(relationships)
# Save our document
savedocx(document,coreprops,appprops,contenttypes,websettings, wordrelationships,"C:/Users/ai/My Documents/Nursing docs/" + filename + ".docx")
Edit: So it eventually can open the document, but it says some content cannot be displayed and the headers have vanished... thoughts?
I don't know this module, but in general you should not edit a file in place. Open file "A", write file "/tmp/A". Close both files and make sure you have no errors, then move "/tmp/A" to "A". Otherwise you risk clobbering your file if something goes wrong during the write.
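The write-then-move pattern described above can be sketched with the standard library. This is an illustration only, not specific to the docx module: the helper name write_atomically is hypothetical, and it assumes the new content is already available as bytes.

```python
import os
import tempfile


def write_atomically(path, data):
    """Write data to a temporary file, then move it over path in one step."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the final move stays on one filesystem
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        # os.replace overwrites the target only after the write has succeeded
        os.replace(tmp_path, path)
    except BaseException:
        # Leave the original file untouched and clean up the partial copy
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If anything fails before the os.replace call, the original file is left exactly as it was.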
