I'm a beginning programmer, and Python is my first language. I'm trying to write a script that opens a random PDF from a directory and selects a random page from that PDF to read. When I run my script, I get the error IOError: [Errno 2], followed by the title of the selected PDF. How can I fix this? I am using the pyPdf module. Are there any other problems in the code that you can see?
import os, random, pyPdf
from pyPdf import PdfFileReader
b = random.choice(os.listdir("/home/illtic/PDF"))
pdf_toread = pyPdf.PdfFileReader(open(b, 'r'))
last_page = pdf_toread.getNumPages() - 1
page_one = pdf_toread.getPage(random.randint(0, last_page))
print " %d " % page_one
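What value does b have? I am pretty sure it is just the file name without the path, because os.listdir returns bare names. open(b) then looks for the file in the current working directory and fails. Prepend the directory to the file name and it should work. PDFs are binary files, so it is also safer to open them with 'rb':
pdf_toread = pyPdf.PdfFileReader(open('/home/illtic/PDF/' + b, 'rb'))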
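The same fix can be sketched with os.path.join, which avoids writing the path separator by hand. This is a minimal sketch using only the standard library; the helper name pick_random_file and the extension filter are illustrative assumptions, not part of the original script.

```python
import os
import random


def pick_random_file(directory, extension=".pdf"):
    """Return the full path of a randomly chosen file in `directory`.

    os.listdir yields bare file names, so the chosen name is joined with
    the directory before it is returned. (Hypothetical helper.)
    """
    names = [n for n in os.listdir(directory) if n.lower().endswith(extension)]
    if not names:
        raise ValueError("no %s files found in %s" % (extension, directory))
    return os.path.join(directory, random.choice(names))
```

The returned path can then be passed to open(path, 'rb') regardless of the directory the script is started from.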
import os
import glob
import comtypes.client
from PyPDF2 import PdfFileMerger

def docxs_to_pdf():
    """Converts all word files in pdfs and append them to pdfslist"""
    word = comtypes.client.CreateObject('Word.Application')
    pdfslist = PdfFileMerger()
    x = 0
    for f in glob.glob("*.docx"):
        input_file = os.path.abspath(f)
        output_file = os.path.abspath("demo" + str(x) + ".pdf")
        # loads each word document
        doc = word.Documents.Open(input_file)
        doc.SaveAs(output_file, FileFormat=16+1)
        doc.Close() # Closes the document, not the application
        pdfslist.append(open(output_file, 'rb'))
        x += 1
    word.Quit()
    return pdfslist

def joinpdf(pdfs):
    """Unite all pdfs"""
    with open("result.pdf", "wb") as result_pdf:
        pdfs.write(result_pdf)

def main():
    """docxs to pdfs: Open Word, create pdfs, close word, unite pdfs"""
    pdfs = docxs_to_pdf()
    joinpdf(pdfs)

main()
I am using Jupyter Notebook and it throws an error. What should I do?
This is the error message:
I want to convert many .doc files into one PDF. Please help, I am a beginner in this field.
Make sure you have all the dependencies installed in your environment. comtypes.client is part of the comtypes package, which you can install with pip from your terminal:
pip install comtypes
You can download _ctypes from sourceforge:
https://sourceforge.net/projects/ctypes/files/ctypes/1.0.2/ctypes-1.0.2.tar.gz/download?use_mirror=deac-fra
Using docx2pdf does seem easier for your task, though. After you have converted the files, you can use PyPDF2 to append them.
I am trying to delete duplicate images by comparing their MD5 file hashes.
My code is:
from PIL import Image
import hashlib
import os
import sys
import io
import urllib.request

img_file = urllib.request.urlopen(img_url, timeout=30)
f = open('C:\\Users\\user\\Documents\\' + img_name, 'wb')
f.write(img_file.read())
f.close # subject image, status = ok
im = Image.open('C:\\Users\\user\\Documents\\' + img_name)
m = hashlib.md5() # get hash
with io.BytesIO() as memf:
    im.save(memf, 'PNG')
    data = memf.getvalue()
    m.update(data)
md5hash = m.hexdigest() # hash done, status = ok
im.close()
if md5hash in hash_list[name]: # comparing hash
    os.remove('C:\\Users\\user\\Documents\\' + img_name) # delete file, ERROR
else:
    hash_list[name].append(m.hexdigest())
and I get this error:
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process:
'C:\\Users\\user\\Documents\\myimage.jpg'
I tried running it from an admin command prompt, but I still get this error. Can you tell what is accessing the file?
Just noticed you're using f.close instead of f.close(). Without the parentheses the method is never called, so the file stays open.
Add () and check whether the problem still occurs.
Cheers ;)
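The difference between f.close and f.close() can be seen in a short sketch. It uses a temporary scratch file (an assumption made for illustration, not the image path from the question) and checks the file object's closed attribute before and after the real call.

```python
import os
import tempfile

# Create a scratch file so the example does not touch real data.
fd, path = tempfile.mkstemp()
os.close(fd)

f = open(path, "wb")
f.write(b"data")

f.close                          # only looks up the method; nothing is called
closed_without_call = f.closed   # the file is still open at this point

f.close()                        # actually calls the method
closed_after_call = f.closed     # the handle has now been released

os.remove(path)                  # safe to delete once the file is closed
```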
Your issue is indeed what Adrian Daniszewski described. However, there are quite a few more programming problems in your code.
First of all, you should familiarize yourself with the with statement. You use with for BytesIO(), but it can also be used for opening files.
The benefit of with open(...) as f: is that you don't have to check whether you closed the file or remember to close it. The file is closed when the indented block ends.
Second, there is some duplication in your code. Your code should be DRY (don't repeat yourself), so that you are not forced to change the same thing in multiple places.
Imagine having to change where you save the image files. Right now you would have to change it in three different places.
Now imagine overlooking one of those places.
My suggestion is to first save the path in a variable and use that everywhere:
bytesimgfile = 'C:\\Users\\user\\Documents\\' + img_name
An example of using with in your code would look like this:
with open(bytesimgfile, 'wb') as f:
    f.write(img_file.read())
A full example with your given code:
from PIL import Image
import hashlib
import os
import sys
import io
import urllib.request

img_file = urllib.request.urlopen(img_url, timeout=30)
bytesimgfile = 'C:\\Users\\user\\Documents\\' + img_name
with open(bytesimgfile, 'wb') as f:
    f.write(img_file.read())
with Image.open(bytesimgfile) as im:
    m = hashlib.md5() # get hash
    with io.BytesIO() as memf:
        im.save(memf, 'PNG')
        data = memf.getvalue()
    m.update(data)
md5hash = m.hexdigest() # hash done
# Both with blocks have ended, so the image file is closed before the removal.
if md5hash in hash_list[name]: # comparing hash
    os.remove(bytesimgfile) # delete the duplicate file
else:
    hash_list[name].append(md5hash)
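As an alternative sketch, the duplicate check can hash the file's raw bytes directly instead of re-encoding the image through PIL. Note that this is a different technique from the code above: it detects byte-identical files, not images that merely decode to the same pixels. The names file_md5, remove_if_duplicate, and seen_hashes are hypothetical.

```python
import hashlib
import os


def file_md5(path, chunk_size=65536):
    """Return the hex MD5 digest of the file's raw bytes."""
    digest = hashlib.md5()
    with open(path, "rb") as f:          # closed automatically on exit
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def remove_if_duplicate(path, seen_hashes):
    """Delete `path` if its hash is already in `seen_hashes` (a set).

    Returns True when the file was removed, False when it was kept.
    """
    md5hash = file_md5(path)             # the file is closed again by now
    if md5hash in seen_hashes:
        os.remove(path)
        return True
    seen_hashes.add(md5hash)
    return False
```

Because the file is only ever opened inside a with block, no handle is left open when os.remove runs.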
I am having trouble creating and writing to a text file in Python. I am running Python 3.5.1 and have the following code to try and create and write to a file:
from os import *

custom_path = "MyDirectory/"
if not path.exists(custom_path):
    mkdir(custom_path)
text_path = custom_path + "MyTextFile.txt"
text_file = open(text_path, "w")
text_file.write("my text")
But I get a TypeError saying an integer is required (got type str) at the line text_file = open(text_path, "w").
I don't know what I'm doing wrong as my code is just about identical to that of several tutorial sites showing how to create and write to files.
Also, does the above code create the text file if it doesn't exist, and if not how do I create it?
Please don't import everything from the os module. Import only the names you need:
from os import path, mkdir

custom_path = "MyDirectory/"
if not path.exists(custom_path):
    mkdir(custom_path)
text_path = custom_path + "MyTextFile.txt"
text_file = open(text_path, 'w')
text_file.write("my text")
The reason is that the os module also has an open function, and from os import * replaces the built-in open with it. os.open expects integer flags as its second argument, which is why you get the TypeError.
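The shadowing can be checked directly. The sketch below runs the star-import in a scratch namespace (an approach chosen for illustration so the surrounding module keeps its own built-in open) and then shows that os.open rejects a mode string.

```python
import builtins
import os

# Run the star-import in a scratch namespace so this module's own
# built-in open stays untouched.
namespace = {}
exec("from os import *", namespace)

# After the star-import, the name "open" refers to os.open, not the built-in.
star_open_is_os_open = namespace["open"] is os.open
star_open_is_builtin = namespace["open"] is builtins.open

# os.open expects integer flags (for example os.O_WRONLY), so passing a
# mode string fails before any file is touched.
try:
    os.open("MyTextFile.txt", "w")
    raised_type_error = False
except TypeError:
    raised_type_error = True
```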
I'm trying to write a Python program to manipulate my Beamer PDF presentations. My professor uses on-click dynamic transitions, so one slide spans several pages, one per click. I want to print these presentations, but they have around 5000 pages in total. I want to keep only the last page of each click sequence, which would cut the page count to around 500. I'm using the PyPDF2 module, but it does not produce a valid PDF file. Here's the code:
from pyPdf import PdfFileWriter, PdfFileReader
import os,sys

pdful = raw_input("Uneti ime fajla:")  # "Enter the file name:"
output = PdfFileWriter()
input1 = PdfFileReader(open(pdful, "rb"))
m = []
f = True
# "Enter the pages you want to keep. 0 ends the input:"
print ("Uneti strane koje zelite da zadrzite.String 0 kraj unosa:\n")
while f:
    l = int(raw_input("Uneti broj stranice:"))  # "Enter the page number:"
    if l == 0:
        f = not f
    else: m.append(l-1)
for i in range(len(m)):
    strana = input1.getPage(int(m[i]))
    output.addPage(strana)
outputStream = file("Mat8.pdf","wb")
output.write(outputStream)
The prompt strings are in Serbian, but that's not important. The program should take two inputs from the user: the name of the file to manipulate and the pages to copy.
from pyPdf import PdfFileWriter, PdfFileReader
pyPdf has been discontinued and is succeeded by PyPDF2. I am not sure about Python 2, but in Python 3 you should import PyPDF2.
There is no need to import os and sys, since the script never uses them. If you did use sys.argv, you could run python3 xyz.py some_arg in bash, and then sys.argv[1] == "some_arg".
Instead of the while loop, I would prefer map, as long as you don't need to read the input line by line. For example:
print ("Uneti strane koje zelite da zadrzite.String 0 kraj unosa:\n")
m = map(lambda x: int(x) - 1, raw_input("Uneti broj stranice:").split())
Also, iterate over the objects themselves instead of over indices:
for page_number in m:
    strana = input1.getPage(page_number)
    output.addPage(strana)
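The page-number parsing can also be sketched in isolation, without any PDF library. The helper name parse_page_numbers and the policy of skipping out-of-range numbers are assumptions made for illustration; the question does not say how invalid input should be handled.

```python
def parse_page_numbers(text, page_count):
    """Convert space-separated 1-based page numbers to 0-based indices.

    Numbers outside 1..page_count are skipped. (Hypothetical helper; the
    skipping policy is an assumption.)
    """
    indices = []
    for token in text.split():
        number = int(token)              # raises ValueError on non-numeric input
        if 1 <= number <= page_count:
            indices.append(number - 1)
    return indices
```

For example, parse_page_numbers("3 7 12", 10) returns [2, 6], because page 12 does not exist in a 10-page document.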
Finally, wrap file operations in a with statement. Python then closes the file automatically, so you cannot forget to do so. Write to the output file name (Mat8.pdf in your code), not to the input file:
with open("Mat8.pdf", 'wb') as outputStream:
    output.write(outputStream)
I am wondering how I can make my script save to the Desktop. Here's my code:
from reportlab.pdfgen import canvas
from reportlab.lib.pagesizes import letter
from reportlab.platypus import Image
import csv
import os

data_file = "hata.csv"

def import_data(data_file):
    inv_data = csv.reader(open(data_file, "r"))
    for row in inv_data:
        var1 = row[0]
        # do more stuff
        pdf_file = os.path.abspath("~/Desktop/%s.pdf" % var1)
        generate_pdf(variable, pdf_file)

def generate_pdf(variable, file_name):
    c = canvas.Canvas(file_name, pagesize=letter)
    # do some stuff with my variables
    c.setFont("Helvetica", 40, leading=None)
    c.drawString(150, 2300, var1)
    c.showPage()
    c.save()

import_data(data_file)
So this works perfectly, and it saves/creates the PDF I want -- but in the directory of the script. I would instead like to save it to, say, the Desktop.
When I researched this and found os.path.abspath, I thought I had solved it, but I receive the following error:
File "/usr/local/lib/python3.4/site-packages/reportlab/pdfbase/pdfdoc.py", line 218, in SaveToFile
f = open(filename, "wb")
FileNotFoundError: [Errno 2] No such file or directory: '/Users/TARDIS/Desktop/tests/~/Desktop/00001.pdf'
which tells me that it's trying to save starting from my script's home folder. How do I get it to see outside of that?
After much trial and error using different methods that all had drawbacks, I came up with a solution and figured I'd post it here for posterity. I'm rather new to programming so apologies if this is obvious to the more experienced.
First, I give my pdf file a name:
pdf_name = number + ".pdf"
Then, I find the path to the Desktop for the current user (I don't know what the user name will be, which was the original root of the problem) and build a path to it so that the PDF can be saved there.
save_name = os.path.join(os.path.expanduser("~"), "Desktop/", pdf_name)
Finally, that's passed in to my pdf generation function:
...
save_name = ....
generate_pdf(variable, save_name)
def generate_pdf(variable, save_name):
    c = canvas.Canvas(save_name, pagesize=letter)
....
And that's it.
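For comparison, here is a small sketch of why os.path.abspath did not work and os.path.expanduser does. The helper name desktop_path is hypothetical, and the sketch assumes the folder is literally named "Desktop" under the home directory, which may not hold on every system or locale.

```python
import os


def desktop_path(file_name):
    """Build a path to `file_name` inside the current user's Desktop folder.

    Assumes a folder named "Desktop" directly under the home directory.
    """
    return os.path.join(os.path.expanduser("~"), "Desktop", file_name)


# abspath does not expand "~"; it only prepends the current directory,
# which is how the "~" ended up in the middle of the failing path.
not_expanded = os.path.abspath("~/Desktop/00001.pdf")

# expanduser replaces "~" with the actual home directory.
expanded = desktop_path("00001.pdf")
```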