IsADirectoryError: [Errno 21] Is a directory: '/' explanation - python

I am getting this error from this code.
import os
import requests
import shutil
path = "/Users/mycode/Documents/API upload/"
api_endpoint = "xxxxxx"
files = {
    'file': open(p,'rb') for p in os.path.abspath(path)
}
for file in os.path.abspath(path):
    response = requests.post(url=api_endpoint, files=files)
    if response.status_code == 200:
        print(response.status_code)
        print("success!")
    else:
        print("did not work")
IsADirectoryError: [Errno 21] Is a directory: '/'
What does this error mean? I tried googling it but still do not understand it in my case. It has something to do with the paths, but I am not sure why.
anything helps!

for p in os.path.abspath(path)
doesn't do what you think it does.
It does not iterate over all files in a given directory. Use os.listdir for that. You can combine the directory path and the filename inside the directory using os.path.join. The pathlib module has an IMHO simpler to use / higher level interface to all of this.
What your code does is iterate over all characters in the string returned by os.path.abspath(path). And the first character is /. Which you then try to open as a file. And that doesn't work, because / is a directory.
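To make the difference concrete, here is a minimal sketch. The temporary directory and the names demo_dir, first_chars and file_paths are made up for the illustration and are not part of the question's code. It shows that iterating over the string gives single characters, while os.listdir plus os.path.join gives usable file paths.

```python
import os
import tempfile

# Build a throwaway directory with two files so the example is self-contained.
demo_dir = tempfile.mkdtemp()
for name in ("a.txt", "b.txt"):
    with open(os.path.join(demo_dir, name), "w") as fh:
        fh.write("demo")

# Iterating over a string yields its characters, not directory entries.
first_chars = list(os.path.abspath(demo_dir))[:1]

# os.listdir returns bare names; os.path.join turns them into usable paths.
file_paths = sorted(
    os.path.join(demo_dir, name)
    for name in os.listdir(demo_dir)
    if os.path.isfile(os.path.join(demo_dir, name))
)

print(first_chars)   # a one-character string, e.g. the leading path separator
print(file_paths)    # full paths of the two demo files
```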

You might want to consider doing this in chunks because if your directory contents are very large, you could run out of file descriptors.
Something like this should work:
from requests import post
from glob import glob
from os.path import join, isfile
DIR = '/Users/mycode/Documents/API upload/'
CHUNK = 10
API_ENDPOINT = '...'
filelist = [filename for filename in glob(join(DIR, '*')) if isfile(filename)]
for idx in range(0, len(filelist), CHUNK):
    files = [('file', open(fn, 'rb')) for fn in filelist[idx:idx+CHUNK]]
    try:
        post(API_ENDPOINT, files=files).raise_for_status()
    finally:
        # close the handles even if the request fails
        for _, fd in files:
            fd.close()
Note:
For improved efficiency you should consider multithreading for this
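One possible shape for the multithreading idea, as a rough sketch only. upload_all and fake_send are hypothetical names; fake_send stands in for a function that would open one file, post it with requests, close it, and return the status code. No network call is made here.

```python
from concurrent.futures import ThreadPoolExecutor


def upload_all(paths, send, max_workers=4):
    """Call send(path) for every path on a thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(send, paths))


def fake_send(path):
    # Hypothetical stand-in: a real version would open `path`, post it,
    # close the file, and return response.status_code.
    return 200


statuses = upload_all(["a.bin", "b.bin", "c.bin"], fake_send)
print(statuses)
```

Because each worker would open and close its own file, the number of open file descriptors stays bounded by max_workers.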

Related

How to retrieve the path of files from a folder into a text file with a python script?

import shutil
import os
def get_files():
    source = os.listdir("/output_folder/sample/cufflinks/")
    destination = "output_folder/assemblies.txt"
    for files in source:
        if files.with("transcripts.gtf"):
            shutil.move(files,destination)
I want to write the paths of the transcripts.gtf files under "/output_folder/sample/cufflinks/" into assemblies.txt. Is the above code correct? Please help me out. Thank you!
You can use os.walk:
import os
from os.path import join
destination = "/output_folder/assemblies.txt" # Absolute path, or use join
folder_to_look_in = "/output_folder/sample/cufflinks/" # Absolute folder path, or use join
for root, _, files in os.walk(folder_to_look_in):
    for file_name in files:
        if file_name.endswith("transcripts.gtf"):
            try:
                # 'a' will append the data to the tail of the file
                with open(destination, 'a') as my_super_file:
                    # write the full path, one per line
                    my_super_file.write(join(root, file_name) + '\n')
            except OSError as e:
                print("I/O error({0}): {1}".format(e.errno, e.strerror))
I think I finally understand what you are trying to achieve. Use the glob.glob() function:
import glob
destination = "output_folder/assemblies.txt"
source = '/output_folder/sample/cufflinks/*/*transcripts.gtf'
with open(destination, 'w') as f:
    f.write('\n'.join(glob.glob(source))+'\n')
f.write('\n'.join(glob.glob(source))+'\n') is the equivalent of:
s = ''
for path in glob.glob(source):
    s += '{}\n'.format(path)
f.write(s)
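If the files can sit at different depths, pathlib's rglob is another option. This is only a sketch run against a temporary tree; the sample folder names s1 and s2 and the variable names are invented for the example.

```python
import tempfile
from pathlib import Path

# Throwaway tree standing in for /output_folder/sample/cufflinks/.
root = Path(tempfile.mkdtemp())
for sample in ("s1", "s2"):
    (root / sample).mkdir()
    (root / sample / "transcripts.gtf").write_text("")
(root / "s1" / "other.txt").write_text("")

# rglob searches recursively, so the depth of the match does not matter.
matches = sorted(str(p) for p in root.rglob("*transcripts.gtf"))

destination = root / "assemblies.txt"
destination.write_text("\n".join(matches) + "\n")
print(destination.read_text())
```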

Python Bulk Blank folder creation

I am trying to create folders in bulk based on a simple text file. os.makedirs helps to create a new folder, but I am not sure how to combine the newpath variable with the folder list. The following is what I am trying. I understand that the code has a syntax error, so I need some help to correct/enhance the code.
import os.path
newpath = r'C:\Program Files\test\'
with open('folders.txt') as f:
    for line in f:
        ff = os.makedirs(newpath,line.strip())
        ff.close()
Use os.path.join to join path components.
import os.path
newpath = r'C:\Program Files\test'  # a raw string cannot end with a backslash
with open('folders.txt') as f:
    for line in f:
        os.makedirs(os.path.join(newpath, line.strip()))
You can use the os.path.join function.
Perhaps something like this?
import os, sys
newpath = r'C:\Program Files\test'  # raw string, so \t is not read as a tab
with open('folders.txt') as f:
    for line in f:
        newdir = os.path.join(newpath, line.strip())
        try:
            os.makedirs(newdir)
        except OSError: # if makedirs() failed
            sys.stderr.write("ERR: Could not create %s\n" % newdir)
            pass # continue with next line
Notes:
Use os.path.join() to combine paths. This will automatically use separators that are suitable for your OS.
os.makedirs() does not return anything
os.makedirs() will raise an OSError exception if the directory already exists or cannot be created.
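If an already existing folder should simply be skipped, Python 3.2+ accepts exist_ok=True. A small sketch, with a temporary directory standing in for the real parent folder and a hard-coded list standing in for the lines of folders.txt (both are assumptions for the demo):

```python
import os
import tempfile

base = tempfile.mkdtemp()                  # stand-in for the real parent folder
folder_names = ["alpha", "beta", "alpha"]  # stand-in for lines read from folders.txt

for name in folder_names:
    # exist_ok=True turns a repeated name into a no-op instead of an OSError.
    os.makedirs(os.path.join(base, name.strip()), exist_ok=True)

created = sorted(os.listdir(base))
print(created)
```

Other failures, such as missing permissions, still raise OSError, so the try/except above remains useful.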

Python File Concatenation

I have a data folder, with subfolders for each subject that ran through a program. So, for example, in the data folder, there are folders for Bob, Fred, and Tom. Each one of those folders contains a variety of files and subfolders. However, I am only interested in the 'summary.log' file contained in each subject's folder.
I want to concatenate the 'summary.log' file from Bob, Fred, and Tom into a single log file in the data folder. In addition, I want to add a column to each log file that will list the subject number.
Is this possible to do in Python? Or is there an easier way to do it? I have tried a number of different batches of code, but none of them get the job done. For example,
#!/usr/bin/python
import sys, string, glob, os
fls = glob.glob(r'/Users/slevclab/Desktop/Acceptability Judgement Task/data/*');
outfile = open('summary.log','w');
for x in fls:
    file=open(x,'r');
    data=file.read();
    file.close();
    outfile.write(data);
outfile.close();
Gives me the error,
Traceback (most recent call last):
  File "fileconcat.py", line 8, in <module>
    file=open(x,'r');
IOError: [Errno 21] Is a directory
I think this has to do with the fact that the data folder contains subfolders, but I don't know how to work around it. I also tried this, but to no avail:
from glob import iglob
import shutil
import os
PATH = r'/Users/slevclab/Desktop/Acceptability Judgement Task/data/*'
destination = open('summary.log', 'wb')
for filename in iglob(os.path.join(PATH, '*.log'))
    shutil.copyfileobj(open(filename, 'rb'), destination)
destination.close()
This gives me an "invalid syntax" error at the "for filename" line, but I'm not sure what to change.
The syntax error is not related to the use of glob.
You forgot the ":" at the end of the for statement:
for filename in iglob(os.path.join(PATH, '*.log')):
                                                  ^--- missing
The following pattern also works:
PATH = r'/Users/slevclab/Desktop/Acceptability Judgement Task/data/*/*.log'
destination = open('summary.log', 'wb')
for filename in iglob(PATH):
    shutil.copyfileobj(open(filename, 'rb'), destination)
destination.close()
The colon (:) is missing in the for line.
Besides, you should use with, because it closes the files for you even when an exception is raised (your code is not exception safe).
from glob import iglob
import shutil
import os
PATH = r'/Users/slevclab/Desktop/Acceptability Judgement Task/data/*'
with open('summary.log', 'wb') as destination:
    for filename in iglob(os.path.join(PATH, '*.log')):
        with open(filename, 'rb') as in_:
            shutil.copyfileobj(in_, destination)
In your first example:
import sys, string, glob, os
you are not using sys, string or os, so there is no need to import those.
fls = glob.glob(r'/Users/slevclab/Desktop/Acceptability Judgement Task/data/*');
here, you are selecting the subject folders. Since you are interested in summary.log files within these folders, you may change the pattern as follows:
fls = glob.glob('/Users/slevclab/Desktop/Acceptability Judgement Task/data/*/summary.log')
In Python, there is no need to end lines with semicolons.
outfile = open('summary.log','w')
for x in fls:
    file = open(x, 'r')
    data = file.read()
    file.close()
    outfile.write(data)
outfile.close()
As VGE's answer shows, your second solution works once you've fixed the syntax error. But note that a more general solution is to use os.walk:
>>> import os
>>> for i in os.walk('foo'):
...     print(i)
...
('foo', ['bar', 'baz'], ['oof.txt'])
('foo/bar', [], ['rab.txt'])
('foo/baz', [], ['zab.txt'])
This goes through all the directories in the tree rooted at the start directory and maintains a nice separation between directories and files.
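None of the snippets above add the subject column the question asks for. A minimal sketch of that part, assuming the logs are tab-separated text and that the subject's folder name is what should go in the new column (both are assumptions; the question does not say). The temporary data and the output name summary_all.log are made up for the demo.

```python
import os
import tempfile
from glob import glob

# Throwaway data folder with one summary.log per subject.
data_dir = tempfile.mkdtemp()
for subject in ("Bob", "Fred"):
    os.mkdir(os.path.join(data_dir, subject))
    with open(os.path.join(data_dir, subject, "summary.log"), "w") as fh:
        fh.write("trial1\t0.5\ntrial2\t0.7\n")

combined_path = os.path.join(data_dir, "summary_all.log")
with open(combined_path, "w") as out:
    for log_path in sorted(glob(os.path.join(data_dir, "*", "summary.log"))):
        # The subject is taken from the name of the folder holding the log.
        subject = os.path.basename(os.path.dirname(log_path))
        with open(log_path) as src:
            for line in src:
                out.write(subject + "\t" + line)

with open(combined_path) as fh:
    combined = fh.read().splitlines()
print(combined)
```

Writing the combined file under a different name keeps it from being picked up as an input on a second run.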

Extract files from zip without keeping the structure using python ZipFile?

I am trying to extract all files from a .zip containing subfolders into one folder. I want all the files from the subfolders extracted into a single folder without keeping the original structure. At the moment, I extract everything, move the files to a folder, then remove the leftover subfolders. Files with the same name are overwritten.
Is it possible to do it before writing files?
Here is a structure for example:
my_zip/file1.txt
my_zip/dir1/file2.txt
my_zip/dir1/dir2/file3.txt
my_zip/dir3/file4.txt
At the end I want this:
my_dir/file1.txt
my_dir/file2.txt
my_dir/file3.txt
my_dir/file4.txt
What can I add to this code?
import zipfile
my_dir = "D:\\Download\\"
my_zip = "D:\\Download\\my_file.zip"
zip_file = zipfile.ZipFile(my_zip, 'r')
for files in zip_file.namelist():
    zip_file.extract(files, my_dir)
zip_file.close()
If I rename the file paths from zip_file.namelist(), I get this error:
KeyError: "There is no item named 'file2.txt' in the archive"
This opens a file handle for each member of the zip archive, takes the base filename, and copies the contents to a target file (that's how ZipFile.extract works, minus the handling of subdirectories).
import os
import shutil
import zipfile
my_dir = r"D:\Download"
my_zip = r"D:\Download\my_file.zip"
with zipfile.ZipFile(my_zip) as zip_file:
    for member in zip_file.namelist():
        filename = os.path.basename(member)
        # skip directories
        if not filename:
            continue
        # copy file (taken from zipfile's extract)
        source = zip_file.open(member)
        target = open(os.path.join(my_dir, filename), "wb")
        with source, target:
            shutil.copyfileobj(source, target)
It is possible to iterate over the ZipFile.infolist(). On the returned ZipInfo objects you can then manipulate the filename to remove the directory part and finally extract it to a specified directory.
import zipfile
import os
my_dir = "D:\\Download\\"
my_zip = "D:\\Download\\my_file.zip"
with zipfile.ZipFile(my_zip) as zip:
    for zip_info in zip.infolist():
        if zip_info.filename[-1] == '/':
            continue
        zip_info.filename = os.path.basename(zip_info.filename)
        zip.extract(zip_info, my_dir)
Just read each member into memory as bytes, compute the filename, and write it out yourself instead of letting the library do it. In short, use the read() method instead of extract().
Python 3.6+ update (2020): the same code as the original answer, but using pathlib.Path, which eases file-path manipulation and other operations (like write_bytes):
from pathlib import Path
import zipfile
my_dir = Path("D:\\Download\\")
my_zip = my_dir / "my_file.zip"
with zipfile.ZipFile(my_zip, 'r') as zip_file:
    for name in zip_file.namelist():
        if name.endswith("/"):  # skip directory entries
            continue
        # read() takes the member name; its second argument is a password
        data = zip_file.read(name)
        myfile_path = my_dir / Path(name).name
        myfile_path.write_bytes(data)
Original code in answer without pathlib:
import zipfile
import os
my_dir = "D:\\Download\\"
my_zip = "D:\\Download\\my_file.zip"
zip_file = zipfile.ZipFile(my_zip, 'r')
for files in zip_file.namelist():
    if files.endswith("/"):  # skip directory entries
        continue
    data = zip_file.read(files)
    # the zip format uses "/" as the directory separator regardless of OS
    myfile_path = os.path.join(my_dir, files.split("/")[-1])
    myfile = open(myfile_path, "wb")
    myfile.write(data)
    myfile.close()
zip_file.close()
A similar concept to the solution of Gerhard Götz, but adapted for extracting single files instead of the entire zip:
import os
from zipfile import ZipFile
with ZipFile(zipPath, 'r') as zipObj:
    zipInfo = zipObj.getinfo(path_in_zip)
    zipInfo.filename = os.path.basename(destination)
    zipObj.extract(zipInfo, os.path.dirname(os.path.realpath(destination)))
If you are getting a BadZipFile error, you can unzip the archive with a 7-Zip subprocess. Assuming 7-Zip is installed, use the following code.
import subprocess
my_dir = destFolder  # destination folder
my_zip = destFolder + "/" + "filename.zip"  # file you want to extract
ziploc = "C:/Program Files/7-Zip/7z.exe"  # location where 7zip is installed
# extract only txt files, from all subdirectories
cmd = [ziploc, 'e', my_zip, '-o' + my_dir, '*.txt', '-r']
sp = subprocess.Popen(cmd, stderr=subprocess.STDOUT, stdout=subprocess.PIPE)
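The question mentions that files with the same name get overwritten, and none of the answers above deal with that. Here is a sketch of one way to handle it. extract_flat is a hypothetical helper, and the "_1", "_2" suffix scheme is just one assumed policy; the demo archive is built in a temporary directory.

```python
import os
import tempfile
import zipfile


def extract_flat(zip_path, dest_dir):
    """Extract every file member into dest_dir, ignoring archive subfolders."""
    written = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            base = os.path.basename(info.filename)
            if not base:  # directory entries end with "/"
                continue
            stem, ext = os.path.splitext(base)
            target = os.path.join(dest_dir, base)
            counter = 1
            # Assumed policy: append _1, _2, ... rather than overwrite.
            while os.path.exists(target):
                target = os.path.join(dest_dir, "%s_%d%s" % (stem, counter, ext))
                counter += 1
            with open(target, "wb") as out:
                out.write(archive.read(info))
            written.append(os.path.basename(target))
    return written


# Demo archive with a name collision between two folders.
work = tempfile.mkdtemp()
zip_path = os.path.join(work, "demo.zip")
with zipfile.ZipFile(zip_path, "w") as archive:
    archive.writestr("file1.txt", "one")
    archive.writestr("dir1/file1.txt", "two")
    archive.writestr("dir1/dir2/file3.txt", "three")

out_dir = os.path.join(work, "out")
os.mkdir(out_dir)
names = extract_flat(zip_path, out_dir)
print(names)
```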

Deleting files which start with a name Python

I have a few files I want to delete, they have the same name at the start but have different version numbers. Does anyone know how to delete files using the start of their name?
Eg.
version_1.1
version_1.2
Is there a way of deleting any file that starts with the name version?
Thanks
import os, glob
for filename in glob.glob("mypath/version*"):
    os.remove(filename)
Substitute the correct path (or . (= current directory)) for mypath. And make sure you don't get the path wrong :)
This will raise an Exception if a file is currently in use.
If you really want to use Python, you can just use a combination of os.listdir(), which returns a listing of all the files in a certain directory, and os.remove().
I.e.:
import os
my_dir = "mypath"  # enter the dir name
for fname in os.listdir(my_dir):
    if fname.startswith("version"):
        os.remove(os.path.join(my_dir, fname))
However, as other answers pointed out, you really don't have to use Python for this, the shell probably natively supports such an operation.
In which language?
In bash (Linux / Unix) you could use:
rm version*
or in batch (Windows / DOS) you could use:
del version*
If you want to write something to do this in Python it would be fairly easy - just look at the documentation for regular expressions.
edit:
just for reference, this is how to do it in Perl:
opendir (folder, "./") || die ("Cannot open directory!");
@files = readdir (folder);
closedir (folder);
unlink foreach (grep /^version/, @files);
import os
os.chdir("/home/path")
for file in os.listdir("."):
    if os.path.isfile(file) and file.startswith("version"):
        try:
            os.remove(file)
        except Exception as e:
            print(e)
The following function will remove all files and folders in a directory which start with a common string:
import os
import shutil
def cleanse_folder(directory, prefix):
    for item in os.listdir(directory):
        path = os.path.join(directory, item)
        if item.startswith(prefix):
            if os.path.isfile(path):
                os.remove(path)
            elif os.path.isdir(path):
                shutil.rmtree(path)
            else:
                print("A symlink or something called {} was not deleted.".format(item))
import os
import re
directory = "./uploaded"
pattern = "1638813371180"
files_in_directory = os.listdir(directory)
filtered_files = [file for file in files_in_directory if ( re.search(pattern,file))]
for file in filtered_files:
    path_to_file = os.path.join(directory, file)
    os.remove(path_to_file)
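Since a wrong path or prefix deletes the wrong files, it may be worth listing the matches before removing anything. A sketch with a dry-run switch; remove_with_prefix and its dry_run parameter are hypothetical names, and the demo runs in a temporary directory.

```python
import tempfile
from pathlib import Path


def remove_with_prefix(directory, prefix, dry_run=True):
    """Return names of files starting with `prefix`; delete them unless dry_run."""
    matches = sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.name.startswith(prefix)
    )
    if not dry_run:
        for path in matches:
            path.unlink()
    return [p.name for p in matches]


demo = Path(tempfile.mkdtemp())
for name in ("version_1.1", "version_1.2", "keep.txt"):
    (demo / name).write_text("")

would_delete = remove_with_prefix(demo, "version")             # nothing removed yet
deleted = remove_with_prefix(demo, "version", dry_run=False)   # now actually delete
remaining = sorted(p.name for p in demo.iterdir())
print(would_delete, remaining)
```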
