I am writing a simple Python script to tell me file size for a set of documents which I am importing from a CSV. I verified that none of the entries are over 100 characters, so this error "ValueError: scandir: path too long for Windows" does not make sense to me.
Here is my code:
# determine size of a given folder in MBytes
import os, subprocess, json, csv, platform
# Function to check if a Drive Letter exists
def hasdrive(letter):
    return "Windows" in platform.system() and os.system("vol %s: 2>nul>nul" % (letter)) == 0
# Define Drive to check for
letter = 'S'
# Check if Drive doesnt exist, if not then map drive
if not hasdrive(letter):
    subprocess.call(r'net use s: /del /Y', shell=True)
    subprocess.call(r'net use s: \\path_to_files', shell=True)
list1 = []
# Import spreadsheet to calculate size
with open('c:\Temp\files_to_delete_subset.csv') as f:
    reader = csv.reader(f, delimiter=':', quoting=csv.QUOTE_NONE)
    for row in reader:
        list1.extend(row)
# Define variables
folder = "S:"
folder_size = 0
# Exporting outcome
for list1 in list1:
    folder = folder + str(list1)
    for root, dirs, files in os.walk(folder):
        for name in files:
            folder_size += os.path.getsize(os.path.join(root, name))
            print(folder)
            # print(os.path.join(root, name) + " " + chr(os.path.getsize(os.path.join(root, name))))
print(folder_size)
From my understanding the max path length in Windows is 260 characters, so a drive letter plus a 100-character path should NOT exceed the Windows max.
Here is an example of a path: '/Document/8669/CORRESP/1722165.doc'
The folder string you're trying to walk is growing forever. Simplifying the code to the problem area:
folder = "S:"
# Exporting outcome
for list1 in list1:
    folder = folder + str(list1)
You never set folder otherwise, so it starts out as S:<firstpath>, then on the next loop it's S:<firstpath><secondpath>, then S:<firstpath><secondpath><thirdpath>, etc. Simple fix: Separate drive from folder:
drive = "S:"
# Exporting outcome
for path in list1:
    folder = drive + path
Now folder is constructed from scratch on each loop, throwing away the previous path, rather than concatenating them.
I also gave the iteration value a useful name (and removed the str call, because the values should all be str already).
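For completeness, here is a minimal sketch of the whole loop with that fix applied. The function name `folder_sizes` and the demo files are my own inventions for illustration, and it reports one total per CSV entry rather than a single running total, which may or may not be what you want.

```python
import os
import tempfile

def folder_sizes(drive, paths):
    """Return {path: total size in bytes} for each path under drive."""
    sizes = {}
    for path in paths:
        folder = drive + path  # rebuilt from scratch on every iteration
        total = 0
        for root, dirs, files in os.walk(folder):
            for name in files:
                total += os.path.getsize(os.path.join(root, name))
        sizes[path] = total
    return sizes

# Demo against a throwaway directory standing in for the mapped drive.
with tempfile.TemporaryDirectory() as base:
    for sub, payload in (("a", b"12345"), ("b", b"12")):
        os.makedirs(os.path.join(base, sub))
        with open(os.path.join(base, sub, "f.bin"), "wb") as fh:
            fh.write(payload)
    print(folder_sizes(base, ["/a", "/b"]))  # {'/a': 5, '/b': 2}
```

Plain concatenation is kept on purpose: the CSV entries start with a slash, and os.path.join treats such a component as a new root on some platforms.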
Using the progressbar library:
import progressbar
.
.
bar = progressbar.ProgressBar(maxval=len(files_total)).start()
This is my base for loop to read and store all .txt files in path4safe (a local test folder with 200 .txt files); it also counts how many text files are in the folder, in "d":
for root, dirnames, files in os.walk(path4safe):
    for x in files:
        if x.endswith(tuple(ext)):
            d += 1
            files_total.append(root + "/" + x)
So I tried this:
for root, dirnames, files in os.walk(path4safe):
    for idx, files in enumerate(files_total):
        for x in files:
            if x.endswith(tuple(ext)):
                d += 1
                files_total.append(root + "/" + x)
                bar.update(idx)
But I only get an empty progress bar; I feel like I'm mixing up one of my variables. Basically, I'm trying to use "d" to drive the progress bar.
[]0% | |
Total files:
0
import os
src = "/home/user/Desktop/images/"
ext = ".jpg"
for i, filename in enumerate(os.listdir(src)):
    # print(i,filename)
    if filename.endswith(ext):
        os.rename(src + filename, src + str(i) + ext)
        print(filename, src + str(i) + ext)
    else:
        os.remove(src + filename)
This code renames all the images in a folder to 0.jpg, 1.jpg, etc., and removes anything that is not a .jpg. But what if I already had some numbered images in that folder? Say I had 0.jpg, 1.jpg and 2.jpg, and then added a few others called im5.jpg and someImage.jpg.
What I want to do is adjust the code to read the number of the last image, in this case 2, and start counting from 3.
In other words, I'll ignore the already labeled images and proceed with the new ones, counting from 3.
Terse and semi-tested version:
import os
import glob
offset = sorted(int(os.path.splitext(os.path.basename(filename))[0])
                for filename in glob.glob(os.path.join(src, '*' + ext)))[-1] + 1
for i, filename in enumerate(os.listdir(src), start=offset):
    ...
This works provided every *.jpg file name consists only of a number before its extension. Otherwise you will get a ValueError.
And if there happens to be a gap in the numbering, that gap will not be filled with new files. E.g., 1.jpg, 2.jpg, 3.jpg, 123.jpg will continue with 124.jpg (which is safer anyway).
If you need to filter out filenames such as im5.jpg or someImage.jpg, you could add an if-clause to the list comprehension, with a regular expression:
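import os
import glob
import re
# Match the whole base name: a plain re.search would still accept im5.jpg
# (it finds "5.jpg" inside it), and int('im5') would then raise ValueError.
offset = sorted(int(os.path.splitext(os.path.basename(filename))[0])
                for filename in glob.glob(os.path.join(src, '*' + ext))
                if re.fullmatch(r'\d+' + re.escape(ext), os.path.basename(filename)))[-1] + 1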
Of course, by now the three lines are pretty unreadable, and may not win the code beauty contest.
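If readability matters more than brevity, the same idea can be split into two small functions. This is only a sketch: the names `next_index` and `plan_renames` are made up for this answer, and it deliberately works on a plain list of file names so that it can be tried without touching any files. It also returns 0 when no numbered file exists yet, where the one-liner above would raise an IndexError.

```python
import re

def next_index(names, ext=".jpg"):
    """Return one more than the highest purely numeric stem among names."""
    pattern = re.compile(r"(\d+)" + re.escape(ext))
    numbers = [int(m.group(1)) for m in map(pattern.fullmatch, names) if m]
    return max(numbers, default=-1) + 1

def plan_renames(names, ext=".jpg"):
    """Pair each not-yet-numbered file with the next free numbered name."""
    pattern = re.compile(r"\d+" + re.escape(ext))
    new_files = sorted(n for n in names
                       if n.endswith(ext) and not pattern.fullmatch(n))
    start = next_index(names, ext)
    return [(old, "%d%s" % (i, ext)) for i, old in enumerate(new_files, start=start)]

names = ["0.jpg", "1.jpg", "2.jpg", "im5.jpg", "someImage.jpg"]
print(plan_renames(names))  # [('im5.jpg', '3.jpg'), ('someImage.jpg', '4.jpg')]
```

To apply the plan, feed it os.listdir(src) and call os.rename(os.path.join(src, old), os.path.join(src, new)) for each pair.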
I am attempting to take a file name such as 'OP 40 856101.txt' from a directory, remove the .txt, set each single word to a specific variable, then reorder the filename based on a required order such as '856101 OP 040'. Below is my code:
import os
dir = 'C:/Users/brian/Documents/Moeller'
orig = os.listdir(dir) #original names of the files in the folder
for orig_name in orig: #This loop splits each file name into a list of strings containing each word
    f = os.path.splitext(orig_name)[0]
    sep = f.split() #Separation is done by a space
    for t in sep: #Loops across each list of strings into an if statement that saves each part to a specific variable
        #print(t)
        if t.isalpha() and len(t) == 3:
            wc = t
        elif len(t) > 3 and len(t) < 6:
            wc = t
        elif t == 'OP':
            op = t
        elif len(t) >= 4:
            pnum = t
        else:
            opnum = t
    if len(opnum) == 2:
        opnum = '0' + opnum
    new_nam = '%s %s %s %s' % (pnum,op,opnum, wc) #This is the variable that contains the text for the new name
    print("The orig filename is %r, the new filename is %r" % (orig_name, new_nam))
    os.rename(orig_name, new_nam)
However I am getting an error with my last for loop where I attempt to rename each file in the directory.
FileNotFoundError: [WinError 2] The system cannot find the file specified: '150 856101 OP CLEAN.txt' -> '856101 OP 150 CLEAN'
The code runs perfectly until the os.rename() command, if I print out the variable new_nam, it prints out the correct naming order for all of the files in the directory. Seems like it cannot find the original file though to replace the filename to the string in new_nam. I assume it is a directory issue, however I am newer to python and can't seem to figure where to edit my code. Any tips or advice would be greatly appreciated!
Try this (just changed the last line):
os.rename(os.path.join(dir,orig_name), os.path.join(dir,new_nam))
You need to give os.rename the full path of the file to rename; otherwise a bare file name is looked up relative to the current working directory, not the directory you listed.
Incidentally, it's better not to use dir as a variable name, because that's the name of a built-in.
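Here is a small self-contained sketch of the same fix that you can run safely, since it only works inside a temporary directory. The helper name `rename_in_dir` is my own, and carrying the original extension over to the new name is an extra assumption: your new_nam drops the ".txt", which you may not intend.

```python
import os
import tempfile

def rename_in_dir(directory, old_name, new_name):
    """Rename directory/old_name to directory/new_name and return the new path."""
    src = os.path.join(directory, old_name)
    dst = os.path.join(directory, new_name)
    os.rename(src, dst)
    return dst

# Demo in a throwaway directory so nothing real is touched.
with tempfile.TemporaryDirectory() as folder:
    open(os.path.join(folder, "OP 40 856101.txt"), "w").close()
    stem, ext = os.path.splitext("OP 40 856101.txt")
    rename_in_dir(folder, "OP 40 856101.txt", "856101 OP 040" + ext)
    print(os.listdir(folder))  # ['856101 OP 040.txt']
```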
import os
import time
source = [r'C:\\Documents','/home/swaroop/byte', '/home/swaroop/bin']
target_dir =r'C:\\Documents','/mnt/e/backup/'
today = target_dir + time.strftime("%Y%m%d")
now = time.strftime('%H%M%S')
os.path.exists(today)
os.mkdir(today)
print 'Successful created directory', today
target = today + os.sep + now + '.zip'
zip_command = "zip -qr '%s' %s" % (target, ' '.join(source))
if os.system(zip_command) == 0:
print 'Successful backup to', target
else:
print 'Backup FAILED'
today = target_dir + time.strftime("%Y%m%d")
Please help
TypeError: can only concatenate tuple (not "str") to tuple
Check the comma, as @MosesKoledoye said. The book has:
# 1. The files and directories to be backed up are
# specified in a list.
# Example on Windows:
# source = ['"C:\\My Documents"', 'C:\\Code']
# Example on Mac OS X and Linux:
source = ['/Users/swa/notes']
# Notice we had to use double quotes inside the string
# for names with spaces in it.
# 2. The backup must be stored in a
# main backup directory
# Example on Windows:
# target_dir = 'E:\\Backup'
# Example on Mac OS X and Linux:
target_dir = '/Users/swa/backup'
# Remember to change this to which folder you will be using
Your code has:
source = [r'C:\\Documents','/home/swaroop/byte', '/home/swaroop/bin']
target_dir =r'C:\\Documents','/mnt/e/backup/'
In the target_dir assignment the comma makes the right-side of the assignment a tuple. To join two strings together use a +, not a comma:
target_dir =r'C:\\Documents' + '/mnt/e/backup/'
Better yet, use a single string. However, /mnt is a Linux directory name, not a Windows one. I suspect you actually need:
target_dir = '/mnt/e/backup/'
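To see the difference for yourself, here is a short sketch (written for Python 3, whereas your script is Python 2; the variable names `target_tuple` and `error_message` are mine):

```python
import time

# The comma builds a 2-tuple, not a string.
target_tuple = r'C:\\Documents', '/mnt/e/backup/'
print(type(target_tuple))

try:
    target_tuple + time.strftime("%Y%m%d")
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)

# With a single string the concatenation works.
target_dir = '/mnt/e/backup/'
today = target_dir + time.strftime("%Y%m%d")
print(today)
```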
You have also made the Windows path a raw string, which means the two back-slashes will be retained. Either do this:
'C:\\Documents'
or this:
r'C:\Documents'
(unless of course you actually do want \\)
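A quick check of what each spelling produces (the variable names are just for this illustration):

```python
escaped = 'C:\\Documents'      # escape sequence: one backslash in the string
raw_single = r'C:\Documents'   # raw string: one backslash in the string
raw_double = r'C:\\Documents'  # raw string: both backslashes are kept

print(escaped == raw_single)          # True
print(len(escaped), len(raw_double))  # 12 13
print(raw_double.count('\\'))         # 2
```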
Edit: I just noticed you also have an indentation problem:
if os.system(zip_command) == 0:
print 'Successful backup to', target
else:
should be:
if os.system(zip_command) == 0:
    print 'Successful backup to', target
else:
Finally, when you say "I copy all the code" and it fails, look to see where yours differs from what it says in the book.
I have recently moved a set of near-identical programs from my Mac to my school's Windows machine, and while the paths appear to be the same (or at least the tail end of them), they will not run properly.
import glob
import pylab
from pylab import *
def main():
    outfnam = "igdata.csv"
    fpout = open(outfnam, "w")
    nrows = 0
    nprocessed = 0
    nbadread = 0
    filenames = [s.split("/")[1] for s in glob.glob("c/Cmos6_*.IG")]
    dirnames = "c an0 an1 an2 an3 an4".split()
    for suffix in filenames:
        nrows += 1
        row = []
        row.append(suffix)
        for dirnam in dirnames:
            fnam = dirnam+"/"+suffix
            lines = [l.strip() for l in open(fnam).readlines()]
            nprocessed += 1
            if len(lines)<5:
                nbadread += 1
                print "warning: file %s contains only %d lines"%(fnam, len(lines))
                tdate = "N/A"
                irrad = dirnam
                Ig_zeroVds_largeVgs = 0.0
            else:
                data = loadtxt(fnam, skiprows=5)
                tdate = lines[0].split(":")[1].strip()
                irrad = lines[3].split(":")[1].strip()
                # pull out last column (column "-1") from second-to-last row
                Ig_zeroVds_largeVgs = data[-2,-1]
            row.append(irrad)
            row.append("%.3e"%(Ig_zeroVds_largeVgs))
        fpout.write(", ".join(row) + "\n")
    print "wrote %d rows to %s"%(nrows, outfnam)
    print "processed %d input files, of which %d had missing data"%( \
        nprocessed, nbadread)
This program worked fine on my Mac, but on Windows these statements:
print "wrote %d rows to %s"%(nrows, outfnam)
print "processed %d input files, of which %d had missing data"%( \
    nprocessed, nbadread)
keep printing:
wrote 0 rows to file name
processed 0 input files, of which 0 had missing data
On my Mac I got 144 rows written to the file.
Does anyone have any suggestions?
If the script doesn't raise any errors, this piece of code is most likely returning an empty list.
glob.glob("c/Cmos6_*.IG")
Seeing as glob.glob works perfectly fine with forward slashes on Windows, the problem is most likely that it's not finding the files, which most likely means that the string you provided has an error somewhere in it. Make sure there isn't any error in "c/Cmos6_*.IG".
If the problem isn't caused by this, then unfortunately, I have no idea why it is happening.
Also, when I tried it, filenames returned by glob.glob have backslashes in them on Windows, so you should probably split by "\\" instead.
Off the top of my head, it looks like a problem of using / in the path. Windows uses \ instead.
os.path contains a number of functions to ease working with paths across platforms.
Your s.split("/") should definitely be s.split(os.sep). I got bitten by this once... :) (Note that os.pathsep is something else: it is the separator used between entries of the PATH environment variable.)
In fact, glob returns paths with \ on Windows and / on Mac OS X, so you need to do your splitting with the appropriate path separator (os.sep).
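A portable way to avoid splitting by hand is os.path.basename. The sketch below uses the ntpath and posixpath modules (the Windows and POSIX implementations behind os.path) so that both behaviours can be shown on any machine; the file name Cmos6_001.IG is made up for the example. In your script you would simply write os.path.basename(s).

```python
import ntpath
import posixpath

windows_result = "c\\Cmos6_001.IG"  # the shape glob returns on Windows
posix_result = "c/Cmos6_001.IG"     # the shape glob returns on Mac OS X / Linux

# Splitting on "/" only works for the POSIX form.
print(windows_result.split("/"))  # ['c\\Cmos6_001.IG']
print(posix_result.split("/"))    # ['c', 'Cmos6_001.IG']

# basename copes with the separator of its own platform.
windows_name = ntpath.basename(windows_result)
posix_name = posixpath.basename(posix_result)
print(windows_name, posix_name)

# sep is the directory separator; pathsep separates entries in PATH.
print(repr(ntpath.sep), repr(ntpath.pathsep))
print(repr(posixpath.sep), repr(posixpath.pathsep))
```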