I have a large folder containing 900+ sub-folders, each of which has another folder inside it, which in turn contains a zipped file.
It's like this:
-MyFolder
-----MySubfolder
---------MySubSubfolder
-------------MyFile.zip
How can I decompress all the zipped files in their respective folder OR in a separate folder elsewhere in Windows using Python?
Any help would be great!!
You could try something like:
import os
import zipfile

def unzip(source_filename, dest_dir):
    with zipfile.ZipFile(source_filename) as zf:
        for member in zf.infolist():
            # Skip entries that try to escape the destination directory
            if '..' in member.filename.split('/'):
                continue
            zf.extract(member, dest_dir)

def unzipFiles(dest_dir):
    for file in os.listdir(dest_dir):
        full_path = os.path.join(dest_dir, file)
        if os.path.isdir(full_path):
            unzipFiles(full_path)  # recurse into every sub-folder (no early return)
        elif file.endswith(".zip"):
            print('Found file "{}" in "{}" - extracting'.format(file, dest_dir))
            unzip(full_path, dest_dir)

unzipFiles('./MyFolder')
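If you only need every .zip extracted next to itself, a shorter equivalent sketch uses pathlib's rglob (assuming Python 3.4+; extract()/extractall() there already strip '..' components from member names):

```python
from pathlib import Path
import zipfile

# Sketch: rglob finds every .zip at any depth below root;
# each archive is extracted into its own parent folder.
def unzip_all(root):
    for zip_path in Path(root).rglob("*.zip"):
        with zipfile.ZipFile(zip_path) as zf:
            zf.extractall(zip_path.parent)
```

To extract into a separate folder elsewhere instead, pass that folder (or a per-archive sub-folder of it) to extractall.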
I have a Python script which synchronizes files from a remote server by comparing them with the local files. It uses a file-downloading tool written in Java. I store the file path, file size, and file name in a dictionary which looks like:
eddsFilesDict = {'/part07_data07/Science/TGO/ACS/SDU_': [
    '860904 SDU__DACS_51FC_023D8101_2021-155T02-12-17__00001.EXM',
    '17660866 SDU__DACS_51FB_023D8101_2021-155T02-10-16__00001.EXM',
    '17660866 SDU__DACS_51FA_023D8101_2021-155T02-02-18__00001.EXM',
    '17660866 SDU__DACS_51F9_023D8101_2021-155T02-00-16__00001.EXM']}
To list the files on the local machine I use the following code:
filenames = []
for top, dirs, files in os.walk('/data/local/'):
    for fn in files:
        filenames.append(str(os.stat(os.path.join(top, fn)).st_size) + ' ' + fn)
But in the following part of my script, CPU usage goes up to 100% and stays there for 1-2 minutes, and I cannot understand why. Through experimentation I ruled out subprocess as the cause.
for pathname, fileName in eddsFilesDict.items():
    for f in fileName:
        if f not in filenames:
            logger.info('REQUESTING FILE: ' + pathname + '/' + f.split('.')[0].split(' ')[1]
                        + ' FILE SIZE: ' + f.split(' ')[0])
            if 'SDU' in f:
                # here I call the Java tool to start the file download
                argsFile = shlex.split('/home/user1/client/bin/fs_client --arc tar')
                p1 = subprocess.Popen(argsFile, stderr=subprocess.PIPE)
                out, err = p1.communicate()
                logger.warning('REQUESTING FILE: ' + pathname + '/' + f.split('.')[0].split(' ')[1]
                               + ' FILE SIZE: ' + f.split(' ')[0] + '\n' + str(err))
            elif 'SCI' in f:
                argsFile = shlex.split('/home/user1/client/bin/fs_client --arc tar')
                p2 = subprocess.Popen(argsFile, stderr=subprocess.PIPE)
                out, err = p2.communicate()
                logger.warning('REQUESTING FILE: ' + pathname + '/' + f.split('.')[0].split(' ')[1]
                               + ' FILE SIZE: ' + f.split(' ')[0] + '\n' + str(err))
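One thing worth checking (an observation about the code shown, not a confirmed diagnosis): `if f not in filenames` scans the whole list for every remote file, which is O(n·m) across both loops. Building the local index as a set makes each membership test O(1). A minimal sketch, wrapping the walk from above in a hypothetical helper:

```python
import os

# Sketch: index local files as a set of 'size filename' strings so each
# membership test is a hash lookup instead of a full list scan.
def local_file_index(root):
    index = set()
    for top, dirs, files in os.walk(root):
        for fn in files:
            size = os.stat(os.path.join(top, fn)).st_size
            index.add(str(size) + ' ' + fn)
    return index
```

The rest of the loop works unchanged, since `in` has the same syntax for sets as for lists.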
I have a requirement to get file details for certain locations (on the local system and on SFTP), and to get the file size for some SFTP locations, which is achieved with the code below.
def getFileDetails(location: str):
    filenames: list = []
    if location.find(":") != -1:
        for file in glob.glob(location):
            filenames.append(getFileNameFromFilePath(file))
    else:
        with pysftp.Connection(host=myHostname, username=myUsername, password=myPassword) as sftp:
            remote_files = [x.filename for x in sorted(sftp.listdir_attr(location), key=lambda f: f.st_mtime)]
            if location == LOCATION_SFTP_A:
                for filename in remote_files:
                    filenames.append(filename)
                    sftp_archive_d_size_mapping[filename] = sftp.stat(location + "/" + filename).st_size
            elif location == LOCATION_SFTP_B:
                for filename in remote_files:
                    filenames.append(filename)
                    sftp_archive_e_size_mapping[filename] = sftp.stat(location + "/" + filename).st_size
            else:
                for filename in remote_files:
                    filenames.append(filename)
            sftp.close()
    return filenames
There are more than 10,000 files in LOCATION_SFTP_A and LOCATION_SFTP_B. For each file, I need to get its size. To get the size I am using
sftp_archive_d_size_mapping[filename] = sftp.stat(location + "/" + filename).st_size
sftp_archive_e_size_mapping[filename] = sftp.stat(location + "/" + filename).st_size
# Time Taken : 5 min+
sftp_archive_d_size_mapping[filename] = 1 #sftp.stat(location + "/" + filename).st_size
sftp_archive_e_size_mapping[filename] = 1 #sftp.stat(location + "/" + filename).st_size
# Time Taken : 20-30 s
If I comment out sftp.stat(location + "/" + filename).st_size and assign a static value instead, the entire script takes only 20-30 seconds to run. How can I optimize the time taken to get the file sizes?
The Connection.listdir_attr already gives you the file size in SFTPAttributes.st_size.
There's no need to call Connection.stat for each file to get the size (again).
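For instance, a sketch under the assumption that the listing entries carry `filename`, `st_mtime`, and `st_size` (as paramiko's `SFTPAttributes` objects do; `build_size_mapping` is a hypothetical helper name):

```python
# Sketch: build the size mapping from the attributes listdir_attr already
# returned, instead of issuing one stat() round trip per file.
def build_size_mapping(attrs):
    # attrs: SFTPAttributes-like objects with .filename, .st_mtime, .st_size
    return {a.filename: a.st_size
            for a in sorted(attrs, key=lambda a: a.st_mtime)}
```

With pysftp this would be called as `build_size_mapping(sftp.listdir_attr(location))`: one server round trip per directory rather than one per file.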
See also:
With pysftp or Paramiko, how can I get a directory listing complete with attributes?
How to fetch sizes of all SFTP files in a directory through Paramiko
I got the answer to my post. Thanks for helping me, and I hope others can learn from my mistakes. Good luck, all!
I have folders that contain .meta files and I want to save each file name in a variable.
All folders contain the same meta file names, but I prefixed each file name with its folder name and a hyphen, meaning
-> foldername + '-' + filename
I want to print each folder's meta file names into a file I created on a specific drive, and I used os.chdir() to change into that path.
But when I go to print each folder's meta file name into this file, the variable isn't saved.
for dirpath, dirnames, files in os.walk('.'):
    print('loop')
    for file in files:
        print('file')
        if file.endswith('.meta'):
            print('meta')
            METAPath = os.path.abspath(os.path.join(dirpath, file))
            METABase = os.path.basename(dirpath)
            if True:
                if file.startswith(METABase + '-' + 'handling'):
                    HandlingFile = "'" + file + "'"
                    return HandlingFile
                elif file.startswith(METABase + '-' + 'vehicles'):
                    VehiclesFile = "'" + file + "'"
                    return VehiclesFile
                elif file.startswith(METABase + '-' + 'carvariations'):
                    CarVariationsFile = "'" + file + "'"
                    return CarVariationsFile
                elif file.startswith(METABase + '-' + 'carcols'):
                    CarcolsFile = "'" + file + "'"
                    return CarcolsFile
                elif file.startswith(METABase + '-' + 'dlctext'):
                    DLCTextFile = "'" + file + "'"
                    return DLCTextFile
print(HandlingFile, VehiclesFile, CarVariationsFile, CarcolsFile, DLCTextFile)
Error:
Traceback (most recent call last):
  File "D:\pythonEx\MyFiveMPython\test.py", line 220, in <module>
    Stress_Veh()
  File "D:\pythonEx\MyFiveMPython\test.py", line 213, in Stress_Veh
    print(HandlingFile, VehiclesFile, CarVariationsFile, CarcolsFile, DLCTextFile)
NameError: name 'HandlingFile' is not defined
Delete these five statements. They're the source of your error, and they don't do anything:
HandlingFile = HandlingFile
VehiclesFile = VehiclesFile
CarVariationsFile = CarVariationsFile
CarcolsFile = CarcolsFile
DLCTextFile = DLCTextFile
To cut down your code a bit...
if file.startswith(...):
    HandlingFile = <some stuff>
    return HandlingFile
print(HandlingFile...)
When the if condition is False, HandlingFile is never defined.
Aaah! Now I understand what those extra five statements were trying to do... you were trying to initialize your variables. You didn't want to do
HandlingFile = HandlingFile
you wanted
HandlingFile = None # or False, or '' or something else
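As an alternative sketch (the suffix names below are taken from the question's startswith checks; `find_meta_files` is a hypothetical helper): collect the matches into a dict whose keys always exist, so nothing can be undefined at print time:

```python
import os

# Sketch: one dict entry per expected suffix; every key exists up front,
# so the final lookup can never raise NameError.
PREFIXES = ('handling', 'vehicles', 'carvariations', 'carcols', 'dlctext')

def find_meta_files(root='.'):
    found = {p: None for p in PREFIXES}   # initialized before the walk
    for dirpath, dirnames, files in os.walk(root):
        base = os.path.basename(dirpath)
        for file in files:
            if not file.endswith('.meta'):
                continue
            for prefix in PREFIXES:
                if file.startswith(base + '-' + prefix):
                    found[prefix] = file
    return found
```

Unmatched suffixes simply stay None instead of crashing the print.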
I am running the following block of code to create the path to a new file:
# Opens/create the file that will be created
device_name = target_device["host"].split('.')
path = "/home/user/test_scripts/configs/" + device_name[-1] + "/"
print(path)
# Check if path exists
if not os.path.exists(path):
    os.makedirs(path)
# file = open(time_now + "_" + target_device["host"] + "_config.txt", "w")
file = open(path + time_now + "_" + device_name[0] + "_config.txt", "w")
# Time Stamp File
file.write('\n Create on ' + now.strftime("%Y-%m-%d") +
           ' at ' + now.strftime("%H:%M:%S") + ' GMT\n')
# Writes output to file
file.write(output)
# Close file
file.close()
The code runs as intended, except that it creates and saves the files in the directory /home/user/test_scripts/configs/ instead of the intended /home/user/test_scripts/configs/device_name[-1]/.
Please advise.
Regards,
./daq
Try using os.path.join(base_path, new_path) instead of string concatenation. For example:
path = os.path.join("/home/user/test_scripts/configs/", device_name[-1])
os.makedirs(path, exist_ok=True)
new_name = time_now + "_" + device_name[0] + "_config.txt"
with open(os.path.join(path, new_name), "w+") as file:
file.write("something")
Although I don't get why you're creating the directory from device_name[-1] but building the file name from device_name[0].
I am using ftputil within a Python script to get the last modification/creation date of files in a directory, and I am having a few problems I hoped you could help with.
host.stat_cache.resize(200000)
recursive = host.walk(directory, topdown=True, onerror=None)
for root, dirs, files in recursive:
    for name in files:
        #mctime = host.stat(name).mtime
        print name
The above outputs a listing of all files in the directory
host.stat_cache.resize(200000)
recursive = host.walk(directory, topdown=True, onerror=None)
for root, dirs, files in recursive:
    for name in files:
        if host.path.isfile("name"):
            mtime1 = host.stat("name")
            mtime2 = host.stat("name").mtime
            #if crtime < now - 30 * 86400:
            #    print name + " Was Created " + " " + crtime + " " + mtime
            print name + " Was Created " + " " + " " + mtime1 + " " + mtime2
The above produces no output.
You've put name in quotes. So Python will always be checking for the literal filename "name", which presumably doesn't exist. You mean:
if host.path.isfile(name):
mtime1 = host.stat(name)
mtime2 = host.stat(name).mtime
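A related detail worth noting as a sketch (an assumption about the directory layout, not part of the original answer): `walk` yields file names relative to each `root`, so joining with `root` keeps the path valid in sub-directories. `collect_mtimes` is a hypothetical helper around an ftputil-style host:

```python
# Sketch: pass the variable `name` (not the string "name"), joined with the
# walk's current root so the path stays valid below the top directory.
def collect_mtimes(host, directory):
    mtimes = {}
    for root, dirs, files in host.walk(directory):
        for name in files:
            full = host.path.join(root, name)
            if host.path.isfile(full):
                mtimes[full] = host.stat(full).mtime
    return mtimes
```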