Open pdf files format with python - python

I am trying to open 100 PDF files with Python 2.7 using this code:
import arcpy,fnmatch,os
rootPath = r"D:\desktop"
pattern = '*.pdf'
counter = 0
for root, dirs, files in os.walk(rootPath):
    for filename in fnmatch.filter(files, pattern):
        os.startfile(rootPath)
        counter = counter + 1
print counter
As a result, the rootPath folder opened and Python printed the number of PDF files:
>>>
39
>>>
No PDF files were opened. I searched the forum and didn't find any question that answers this. Thanks for any help.

I don't know exactly what you are trying to do, but os.startfile will open a PDF in Adobe Reader (or whatever application is set as the default PDF reader) when you give it the path to the file. Here is how I managed to do that, and it seems to work:
import os
rootPath = "D:\\desktop"
counter = 0
for file in os.listdir(rootPath):
    if file.endswith('.pdf'):
        os.startfile("%s/%s" %(rootPath, file))
        counter = counter + 1
print counter
Or, without changing much of your original code:
import arcpy,fnmatch,os
rootPath = r"D:\desktop"
pattern = '*.pdf'
counter = 0
for root, dirs, files in os.walk(rootPath):
    for filename in fnmatch.filter(files, pattern):
        # join with root (the folder currently being walked) so PDFs in subfolders are found too
        os.startfile(os.path.join(root, filename))
        counter = counter + 1
print counter

You are always calling
os.startfile(rootPath)
where rootPath is only "D:\desktop". You must call os.startfile with the path to the PDF file as the argument.
os.startfile("{}/{}".format(rootPath, file))
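The answers above all come down to handing os.startfile the full path of each PDF instead of the folder. Here is a minimal Python 3 sketch of that idea, assuming a hypothetical helper named find_pdfs (it does not appear in the original posts). Since os.startfile exists only on Windows, the launch step is left as a comment:

```python
import fnmatch
import os


def find_pdfs(root_path, pattern="*.pdf"):
    """Return the full path of every file under root_path that matches pattern."""
    matches = []
    for root, dirs, files in os.walk(root_path):
        for filename in fnmatch.filter(files, pattern):
            # Join with `root` (the folder being visited), not root_path,
            # so that files in subfolders resolve correctly.
            matches.append(os.path.join(root, filename))
    return matches


# Usage on Windows (os.startfile is Windows-only):
# for pdf in find_pdfs(r"D:\desktop"):
#     os.startfile(pdf)
```

Collecting the paths first keeps the counting separate from the launching, so len(find_pdfs(...)) gives the same number the original code printed.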

Related

The system cannot find the file specified windows error python

I want to rename all the files in test folder as 1, 2, 3 and so on
import os, sys, path
path = r"F:\test"
dirs = os.listdir(path)
print(dirs)
count = 1
for files in dirs:
    str1 = str(count)
    os.rename(files, str1)
    count += 1
but my code gives me this error:
WindowsError: [Error 2] The system cannot find the file specified
os.listdir returns bare names, not full paths, so os.rename looks for each name in the current working directory and cannot find it. Iterating over dirs also won't give you the contents of any subdirectories; you would need another os.listdir for that.
Also, to rename the files, you have to go through each of them.
A better solution would be:
import os
count = 1
path = r"F:\test"
for root, dirs, files in os.walk(path):
    for filename in files:
        os.rename(os.path.join(root, filename), os.path.join(root, str(count)))
        count += 1
Just add one line to change the current working directory.
import os  # only os is needed; "import path" is unnecessary here
path = r"F:\test"
dirs = os.listdir(path)
os.chdir(path) # Change the current working directory
print(dirs)
count = 1
for files in dirs:
    str1 = str(count)
    os.rename(files, str1)
    count += 1
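Both answers work by making sure os.rename receives a path it can resolve. Here is a Python 3 sketch of the same idea that avoids changing the working directory. The helper name rename_sequentially, the sorted ordering, and the refusal to overwrite an existing file are assumptions added for illustration, not part of the original posts:

```python
import os


def rename_sequentially(folder, start=1):
    """Rename each regular file in folder to start, start+1, ... and return the new names."""
    new_names = []
    count = start
    for name in sorted(os.listdir(folder)):
        src = os.path.join(folder, name)
        if not os.path.isfile(src):
            continue  # leave subdirectories alone
        dst = os.path.join(folder, str(count))
        if src != dst:
            if os.path.exists(dst):
                # Refuse to clobber a file that already has the target name.
                raise FileExistsError(dst)
            os.rename(src, dst)
        new_names.append(str(count))
        count += 1
    return new_names
```

Like the original code, this drops the file extension. The existence check is there because os.rename silently replaces an existing destination file on POSIX systems.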

Python - Iterating over all text files recursively

I am creating a text parser with Python 3.6. I have a file layout like the one below:
(The real file structure I will be using is much more extensive than this.)
-Directory(main folder)
    -amerigroup.txt
    -bcbs.txt
    -childfolder
        -medicare.txt
I need to extract text into 2 different lists (going through and appending to my ever-growing lists). Whenever I run my current code, I can't seem to get my program to open up my medicare.txt file to read and extract the information. I get an error stating that there is no such file or directory: 'medicare.txt'.
My goal is to get the data from the 3 files and extract it in one go. How do I get the amerigroup and bcbs data then go into the childfolder and get medicare.txt, then repeat that for all branches of my file path?
I am simply trying to open and close my text files in this code snippet. Here's what I have so far:
import re
import os
import pandas as pd
#change active directory
os.chdir(r'\\company\Files\HomeDrive\user\My Documents\claimstest')
#rootdir = r'\\company\Files\HomeDrive\user\My Documents\claimstest'
#set up Regular Expression objects to parse X12
claimidRegex = re.compile(r'(CLM\*)(\d+)')
dxRegex = re.compile(r'(ABK:)(\w\d+)(\*|~)(ABF:)?(\w\d+)?(\*|~)?(ABF:)?(\w\d+)?(\*|~)?(ABF:)?(\w\d+)?(\*|~)?(ABF:)?(\w\d+)?(\*|~)?(ABF:)?(\w\d+)?(\*|~)?(ABF:)?(\w\d+)?(\*|~)?(ABF:)?(\w\d+)?(\*|~)?')
claimids = []
dxinfo = []
for dirpath, dirnames, files in os.walk(topdir):
    for name in files:
        cid = []
        dx = []
        if name.lower().endswith(exten):
            data = open(name, 'r')
            data.close()
Thank you so much for taking your time to assist me on this!
edit: I have tried using walk to no avail so far. My most recent attempt (I tried using txtfile_full_path as well--did not work):
for dirpath, dirnames, filename in os.walk(base_dir):
    for filename in filename:
        #defining file type
        txtfile=open(filename,"r")
        txtfile_full_path = os.path.join(dirpath, filename)
        print(filename)
edit2 for anyone interested. This was my final solution to the problem:
import re
import os
import pandas as pd
#change active directory
os.chdir(r'\\company\Files\HomeDrive\user\My Documents\claimstest')
base_dir = (r'\\company\Files\HomeDrive\user\My Documents\claimstest')
#set up Regular Expression objects to parse X12
claimidRegex = re.compile(r'(CLM\*)(\d+)')
dxRegex = re.compile(r'(ABK:)(\w\d+)(\*|~)(ABF:)?(\w\d+)?(\*|~)?(ABF:)?(\w\d+)?(\*|~)?(ABF:)?(\w\d+)?(\*|~)?(ABF:)?(\w\d+)?(\*|~)?(ABF:)?(\w\d+)?(\*|~)?(ABF:)?(\w\d+)?(\*|~)?(ABF:)?(\w\d+)?(\*|~)?')
claimids = []
dxinfo = []
for dirpath, dirnames, filename in os.walk(base_dir):
    for filename in filename:
        txtfile_full_path = os.path.join(dirpath, filename)
        x12 = open(txtfile_full_path, 'r')
        for i in x12:
            match = claimidRegex.findall(i)
            for word in match:
                claimids.append(word[1])
        x12.seek(0)
        for i in x12:
            match = dxRegex.findall(i)
            for word in match:
                dxinfo.append(word)
        x12.close()
datadic = dict(zip(claimids, dxinfo))
You need to pass the full path to open. Just creating a string variable somewhere won't do anything for you! So the following should avoid your error:
txt_list = []
for dirpath, dirnames, filenames in os.walk(base_dir):
    for filename in filenames:
        # create full path
        txtfile_full_path = os.path.join(dirpath, filename)
        with open(txtfile_full_path) as f:
            txt_list.append(f.read())
It should be easy enough to integrate the segregation based on your regexes now...
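One way to fold the regex step into the walk is sketched below in Python 3. It reuses the claim-ID pattern from the question. The function name collect_claim_ids and the extension filter are assumptions made for this example, and only the claim-ID extraction is shown:

```python
import os
import re

# Same claim-ID pattern as in the question: "CLM*" followed by digits.
claimidRegex = re.compile(r'(CLM\*)(\d+)')


def collect_claim_ids(base_dir, extension=".txt"):
    """Walk base_dir and return every claim ID found in files ending with extension."""
    claimids = []
    for dirpath, dirnames, filenames in os.walk(base_dir):
        for filename in filenames:
            if not filename.lower().endswith(extension):
                continue
            # Build the full path; a bare filename only resolves in the cwd.
            full_path = os.path.join(dirpath, filename)
            with open(full_path, "r") as handle:
                for line in handle:
                    for match in claimidRegex.findall(line):
                        claimids.append(match[1])  # second group holds the digits
    return claimids
```

Because every open call receives the joined path, the os.chdir call from the question should no longer be necessary.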

Python - Renaming all files in a directory using a loop

I have a folder with images that are currently named with timestamps. I want to rename all the images in the directory so they are named 'captured(x).jpg' where x is the image number in the directory.
I have been trying to implement different suggestions from this website and others with no luck. Here is my code:
import os
path = '/home/pi/images/'
i = 0
for filename in os.listdir(path):
    os.rename(filename, 'captured'+str(i)+'.jpg')
    i = i +1
I keep getting an error saying "No such file or directory" for the os.rename line.
The results returned by os.listdir() do not include the path.
import os
path = '/home/pi/images/'
i = 0
for filename in os.listdir(path):
    os.rename(os.path.join(path,filename), os.path.join(path,'captured'+str(i)+'.jpg'))
    i = i +1
os.rename() resolves relative paths against the current working directory. You are giving it only the file names, so it can't locate the files.
Prepend the folder's path to each filename to get the full path:
import os
path = 'G:/ftest'
i = 0
for filename in os.listdir(path):
    os.rename(path+'/'+filename, path+'/captured'+str(i)+'.jpg')
    i = i +1
Two suggestions:
Use glob. This gives you more fine-grained control over which filenames and directories to iterate over.
Use enumerate instead of manually counting the iterations.
Example:
import glob
import os
path = '/home/pi/images/'
for i, filename in enumerate(glob.glob(path + '*.jpg')):
    os.rename(filename, os.path.join(path, 'captured' + str(i) + '.jpg'))
This will work
import glob2
import os
def rename(f_path, new_name):
    filelist = glob2.glob(f_path + "*.ma")
    count = 0
    for file in filelist:
        print("File Count : ", count)
        filename = os.path.split(file)
        print(filename)
        new_filename = f_path + new_name + str(count + 1) + ".ma"
        os.rename(f_path+filename[1], new_filename)
        print(new_filename)
        count = count + 1
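The function takes two arguments: the path of the folder containing the files (ending with a path separator, since it is concatenated directly with the filenames) and the base name to give the renamed files.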
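The glob and enumerate suggestions can be combined into a rename that is planned first and applied second, which makes it easy to inspect the mapping before touching any file. This is a Python 3 sketch. The helper names plan_renames and apply_renames are hypothetical, and sorting the matches is an added assumption so that the numbering follows the timestamp order of the original names:

```python
import glob
import os


def plan_renames(folder, prefix="captured", extension=".jpg"):
    """Return (old_path, new_path) pairs without touching the disk."""
    # glob.escape protects any wildcard characters in the folder name itself.
    pattern = os.path.join(glob.escape(folder), "*" + extension)
    # glob.glob returns matches in arbitrary order, so sort for stable numbering.
    old_paths = sorted(glob.glob(pattern))
    return [
        (old, os.path.join(folder, "{}{}{}".format(prefix, i, extension)))
        for i, old in enumerate(old_paths)
    ]


def apply_renames(pairs):
    """Perform the renames produced by plan_renames."""
    for old, new in pairs:
        os.rename(old, new)
```

A call such as apply_renames(plan_renames('/home/pi/images/')) would then do the work. Note that the sketch does not guard against a target name that already exists in the folder.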

python count size specific type file txt in a directory

In my folder, there are two types of files: html and txt.
I want to know the total size of the txt files.
I found this code, but how do I apply it for my needs?
import os
from os.path import join, getsize
size = 0
count = 0
for root, dirs, files in os.walk(path):
    size += sum(getsize(join(root, name)) for name in files)
    count += len(files)
print count, size
You can qualify which files by adding an if to the comprehensions like:
for root, dirs, files in os.walk(path):
    size += sum(getsize(join(root, name)) for name in files if name.endswith('.txt'))
    count += sum(1 for name in files if name.endswith('.txt'))
print count, size
Better to use glob (https://docs.python.org/3/library/glob.html) rather than os.walk to find your files; IMHO that makes the code more readable.
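import glob
import os
path = '/tmp'
# '**' only descends into subdirectories when recursive=True (Python 3.5+)
files = glob.glob(os.path.join(path, '**', '*.txt'), recursive=True)
total_size = 0
for file in files:
    # glob already returns paths that include `path`, so no further join is needed
    total_size += os.path.getsize(file)
print(len(files), total_size)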
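For comparison, the os.walk variant can be wrapped in a small function that returns both numbers. This is a Python 3 sketch, and the name txt_usage and its extension parameter are assumptions made for illustration:

```python
import os


def txt_usage(path, extension=".txt"):
    """Return (file_count, total_bytes) for files under path ending with extension."""
    count = 0
    size = 0
    for root, dirs, files in os.walk(path):
        for name in files:
            if name.endswith(extension):
                count += 1
                # getsize needs the full path, hence the join with root.
                size += os.path.getsize(os.path.join(root, name))
    return count, size
```

Calling txt_usage(path, ".html") would give the same figures for the other file type in the folder.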

not listing entire directory

I'm new to Python (I'm actually on a short course this week), but I have a very specific request and I don't know how to deal with it right now. I have many different txt files in a folder, but when I use the following code I get the filenames of only two of them. Why is this?
regards!
import dircache
lista = dircache.listdir('C:\FDF')
i = 0
check = len(lista[0])
temp = []
count = len(lista)
while count != 0:
    if len(lista[i]) != check:
        temp.append(lista[i- 1])
        check = len(lista[i])
    else:
        i = i + 1
        count = count - 1
print (temp)
Maybe you can use the glob library: http://docs.python.org/2/library/glob.html
It matches filenames with Unix shell-style wildcards, so maybe it can work for this:
import glob
directory = 'yourdirectory/'
filelist = glob.glob(directory+'*.txt')
If I've understood you correctly, you would like to get all the files?
In that case, try this:
import os
filesList = None
dir = 'C:\FDF'
for root, dirs, files in os.walk(dir):
    filesList = files
    break
print(filesList)
If you need the full paths, use:
import os.path
filesList = []  # must start as a list; None has no append method
dir = r'C:\FDF'
for root, dirs, files in os.walk(dir):
    for file in files:
        filesList.append(os.path.join(root, file))
print(filesList)
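If only the files directly inside the folder are wanted, without recursing into subfolders, os.listdir combined with os.path.isfile is another option. This is a Python 3 sketch, and the helper name list_files and its full_path flag are hypothetical:

```python
import os


def list_files(directory, full_path=False):
    """Return the regular files directly inside directory (no recursion)."""
    result = []
    for name in sorted(os.listdir(directory)):
        candidate = os.path.join(directory, name)
        # os.listdir also returns subfolder names, so filter them out.
        if os.path.isfile(candidate):
            result.append(candidate if full_path else name)
    return result
```

For the folder in the question, the call would be list_files(r'C:\FDF') for bare names or list_files(r'C:\FDF', full_path=True) for full paths.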
