Using listdir realpath abspath with symbolic links - python

Something that I don't understand :
In a shell :
mkdir -p /tmp/toto/titi/tutu
touch /tmp/toto/tata
ln -s /tmp/toto/tata /tmp/toto/titi/tutu/
python
Then in python:
import os
zeList = os.listdir("/tmp/toto/titi/tutu/")
print os.path.realpath(zeList[0])
>'/tata'
print os.path.abspath(zeList[0])
>'/tata'
The expected result should be : /tmp/toto/tata (or /tmp/toto/titi/tutu/tata).
Can anyone explain this result ?

os.listdir() returns base filenames, not full paths:
>>> import os
>>> os.listdir("/tmp/toto/titi/tutu/")
['tata']
Without a path, the file is then considered relative to the current working directory:
>>> os.getcwd()
'/Users/mj/Development/venvs/stackoverflow-2.7'
>>> os.path.realpath('tata')
'/Users/mj/Development/venvs/stackoverflow-2.7/tata'
Join the filename with the path first:
testdir = "/tmp/toto/titi/tutu/"
zeList = [os.path.join(testdir, fname) for fname in os.listdir(testdir)]
Now the symbolic link is properly replaced:
>>> testdir = "/tmp/toto/titi/tutu/"
>>> zeList = [os.path.join(testdir, fname) for fname in os.listdir(testdir)]
>>> print os.path.realpath(zeList[0])
/private/tmp/toto/tata
>>> print os.path.abspath(zeList[0])
/tmp/toto/titi/tutu/tata

listdir returns filenames, not paths. Therefore you are passing a relative path to realpath which is interpreted relative to your working directory /.
Use os.path.realpath(os.path.join(..., zeList[0])) for correct results.

Related

Python filename from path with \

I have a list of paths that looks like this C:/Users/myuser/Documents/files\my_file_1.csv and I want to get the file name from that by doing this:
path=['C:/Users/myuser/Documents/files\my_file_1.csv','C:/Users/myuser/Documents/files\my_file_2.csv',...]
filename, file_extension = os.path.splitext(path[0])
and I always get 'C:/Users/myuser/Documents/files\my_file_1' I know it must be for the ' \ ' slash but I haven't been able to replace it. Can anyone give me an idea ?
You can use os.path.basename to get just the filename without the full directory, then os.path.splitext to remove the file extension.
>>> import os
>>> [os.path.splitext(os.path.basename(i))[0] for i in path]
['my_file_1', 'my_file_2']
Or if you want the filename and extension, but no directories
>>> [os.path.basename(i) for i in path]
['my_file_1.csv', 'my_file_2.csv']
As you are using windows and if you are using python 3.4+
>>> from pathlib import PureWindowsPath
>>> path=['C:/Users/myuser/Documents/files\my_file_1.csv','C:/Users/myuser/Documents/files\my_file_2.csv']
>>> print([PureWindowsPath(i).name for i in path])
['my_file_1.csv', 'my_file_2.csv']

Extract a part of the filepath (a directory) in Python

I need to extract the name of the parent directory of a certain path. This is what it looks like:
C:\stuff\directory_i_need\subdir\file.jpg
I would like to extract directory_i_need.
import os
## first file in current dir (with full path)
file = os.path.join(os.getcwd(), os.listdir(os.getcwd())[0])
file
os.path.dirname(file) ## directory of file
os.path.dirname(os.path.dirname(file)) ## directory of directory of file
...
And you can continue doing this as many times as necessary...
Edit: from os.path, you can use either os.path.split or os.path.basename:
dir = os.path.dirname(os.path.dirname(file)) ## dir of dir of file
## once you're at the directory level you want, with the desired directory as the final path node:
dirname1 = os.path.basename(dir)
dirname2 = os.path.split(dir)[1] ## if you look at the documentation, this is exactly what os.path.basename does.
For Python 3.4+, try the pathlib module:
>>> from pathlib import Path
>>> p = Path('C:\\Program Files\\Internet Explorer\\iexplore.exe')
>>> str(p.parent)
'C:\\Program Files\\Internet Explorer'
>>> p.name
'iexplore.exe'
>>> p.suffix
'.exe'
>>> p.parts
('C:\\', 'Program Files', 'Internet Explorer', 'iexplore.exe')
>>> p.relative_to('C:\\Program Files')
WindowsPath('Internet Explorer/iexplore.exe')
>>> p.exists()
True
All you need is parent part if you use pathlib.
from pathlib import Path
p = Path(r'C:\Program Files\Internet Explorer\iexplore.exe')
print(p.parent)
Will output:
C:\Program Files\Internet Explorer
Case you need all parts (already covered in other answers) use parts:
p = Path(r'C:\Program Files\Internet Explorer\iexplore.exe')
print(p.parts)
Then you will get a list:
('C:\\', 'Program Files', 'Internet Explorer', 'iexplore.exe')
Saves tone of time.
First, see if you have splitunc() as an available function within os.path. The first item returned should be what you want... but I am on Linux and I do not have this function when I import os and try to use it.
Otherwise, one semi-ugly way that gets the job done is to use:
>>> pathname = "\\C:\\mystuff\\project\\file.py"
>>> pathname
'\\C:\\mystuff\\project\\file.py'
>>> print pathname
\C:\mystuff\project\file.py
>>> "\\".join(pathname.split('\\')[:-2])
'\\C:\\mystuff'
>>> "\\".join(pathname.split('\\')[:-1])
'\\C:\\mystuff\\project'
which shows retrieving the directory just above the file, and the directory just above that.
import os
directory = os.path.abspath('\\') # root directory
print(directory) # e.g. 'C:\'
directory = os.path.abspath('.') # current directory
print(directory) # e.g. 'C:\Users\User\Desktop'
parent_directory, directory_name = os.path.split(directory)
print(directory_name) # e.g. 'Desktop'
parent_parent_directory, parent_directory_name = os.path.split(parent_directory)
print(parent_directory_name) # e.g. 'User'
This should also do the trick.
This is what I did to extract the piece of the directory:
for path in file_list:
directories = path.rsplit('\\')
directories.reverse()
line_replace_add_directory = line_replace+directories[2]
Thank you for your help.
You have to put the entire path as a parameter to os.path.split. See The docs. It doesn't work like string split.

os.path.dirname(__file__) returns empty

I want to get the path of the current directory under which a .py file is executed.
For example a simple file D:\test.py with code:
import os
print os.getcwd()
print os.path.basename(__file__)
print os.path.abspath(__file__)
print os.path.dirname(__file__)
It is weird that the output is:
D:\
test.py
D:\test.py
EMPTY
I am expecting the same results from the getcwd() and path.dirname().
Given os.path.abspath = os.path.dirname + os.path.basename, why
os.path.dirname(__file__)
returns empty?
Because os.path.abspath = os.path.dirname + os.path.basename does not hold. we rather have
os.path.dirname(filename) + os.path.basename(filename) == filename
Both dirname() and basename() only split the passed filename into components without taking into account the current directory. If you want to also consider the current directory, you have to do so explicitly.
To get the dirname of the absolute path, use
os.path.dirname(os.path.abspath(__file__))
import os.path
dirname = os.path.dirname(__file__) or '.'
os.path.split(os.path.realpath(__file__))[0]
os.path.realpath(__file__)return the abspath of the current script; os.path.split(abspath)[0] return the current dir
can be used also like that:
dirname(dirname(abspath(__file__)))
print(os.path.join(os.path.dirname(__file__)))
You can also use this way
Since Python 3.4, you can use pathlib to get the current directory:
from pathlib import Path
# get parent directory
curr_dir = Path(__file__).parent
file_path = curr_dir.joinpath('otherfile.txt')
None of the above answers is correct. OP wants to get the path of the current directory under which a .py file is executed, not stored.
Thus, if the path of this file is /opt/script.py...
#! /usr/bin/env python3
from pathlib import Path
# -- file's directory -- where the file is stored
fd = Path(__file__).parent
# -- current directory -- where the file is executed
# (i.e. the directory of the process)
cwd = Path.cwd()
print(f'{fd=} {cwd=}')
Only if we run this script from /opt, fd and cwd will be the same.
$ cd /
$ /opt/script.py
cwd=PosixPath('/') fd=PosixPath('/opt')
$ cd opt
$ ./script.py
cwd=PosixPath('/opt') fd=PosixPath('/opt')
$ cd child
$ ../script.py
cwd=PosixPath('/opt/child') fd=PosixPath('/opt/child/..')
I guess this is a straight forward code without the os module..
__file__.split(__file__.split("/")[-1])[0]

Rename multiple files in a directory in Python

I'm trying to rename some files in a directory using Python.
Say I have a file called CHEESE_CHEESE_TYPE.*** and want to remove CHEESE_ so my resulting filename would be CHEESE_TYPE
I'm trying to use the os.path.split but it's not working properly. I have also considered using string manipulations, but have not been successful with that either.
Use os.rename(src, dst) to rename or move a file or a directory.
$ ls
cheese_cheese_type.bar cheese_cheese_type.foo
$ python
>>> import os
>>> for filename in os.listdir("."):
... if filename.startswith("cheese_"):
... os.rename(filename, filename[7:])
...
>>>
$ ls
cheese_type.bar cheese_type.foo
Here's a script based on your newest comment.
#!/usr/bin/env python
from os import rename, listdir
badprefix = "cheese_"
fnames = listdir('.')
for fname in fnames:
if fname.startswith(badprefix*2):
rename(fname, fname.replace(badprefix, '', 1))
The following code should work. It takes every filename in the current directory, if the filename contains the pattern CHEESE_CHEESE_ then it is renamed. If not nothing is done to the filename.
import os
for fileName in os.listdir("."):
os.rename(fileName, fileName.replace("CHEESE_CHEESE_", "CHEESE_"))
Assuming you are already in the directory, and that the "first 8 characters" from your comment hold true always. (Although "CHEESE_" is 7 characters... ? If so, change the 8 below to 7)
from glob import glob
from os import rename
for fname in glob('*.prj'):
rename(fname, fname[8:])
I have the same issue, where I want to replace the white space in any pdf file to a dash -.
But the files were in multiple sub-directories. So, I had to use os.walk().
In your case for multiple sub-directories, it could be something like this:
import os
for dpath, dnames, fnames in os.walk('/path/to/directory'):
for f in fnames:
os.chdir(dpath)
if f.startswith('cheese_'):
os.rename(f, f.replace('cheese_', ''))
Try this:
import os
import shutil
for file in os.listdir(dirpath):
newfile = os.path.join(dirpath, file.split("_",1)[1])
shutil.move(os.path.join(dirpath,file),newfile)
I'm assuming you don't want to remove the file extension, but you can just do the same split with periods.
This sort of stuff is perfectly fitted for IPython, which has shell integration.
In [1] files = !ls
In [2] for f in files:
newname = process_filename(f)
mv $f $newname
Note: to store this in a script, use the .ipy extension, and prefix all shell commands with !.
See also: http://ipython.org/ipython-doc/stable/interactive/shell.html
Here is a more general solution:
This code can be used to remove any particular character or set of characters recursively from all filenames within a directory and replace them with any other character, set of characters or no character.
import os
paths = (os.path.join(root, filename)
for root, _, filenames in os.walk('C:\FolderName')
for filename in filenames)
for path in paths:
# the '#' in the example below will be replaced by the '-' in the filenames in the directory
newname = path.replace('#', '-')
if newname != path:
os.rename(path, newname)
It seems that your problem is more in determining the new file name rather than the rename itself (for which you could use the os.rename method).
It is not clear from your question what the pattern is that you want to be renaming. There is nothing wrong with string manipulation. A regular expression may be what you need here.
import os
import string
def rename_files():
#List all files in the directory
file_list = os.listdir("/Users/tedfuller/Desktop/prank/")
print(file_list)
#Change current working directory and print out it's location
working_location = os.chdir("/Users/tedfuller/Desktop/prank/")
working_location = os.getcwd()
print(working_location)
#Rename all the files in that directory
for file_name in file_list:
os.rename(file_name, file_name.translate(str.maketrans("","",string.digits)))
rename_files()
This command will remove the initial "CHEESE_" string from all the files in the current directory, using renamer:
$ renamer --find "/^CHEESE_/" *
I was originally looking for some GUI which would allow renaming using regular expressions and which had a preview of the result before applying changes.
On Linux I have successfully used krename, on Windows Total Commander does renaming with regexes, but I found no decent free equivalent for OSX, so I ended up writing a python script which works recursively and by default only prints the new file names without making any changes. Add the '-w' switch to actually modify the file names.
#!/usr/bin/python
# -*- coding: utf-8 -*-
import os
import fnmatch
import sys
import shutil
import re
def usage():
print """
Usage:
%s <work_dir> <search_regex> <replace_regex> [-w|--write]
By default no changes are made, add '-w' or '--write' as last arg to actually rename files
after you have previewed the result.
""" % (os.path.basename(sys.argv[0]))
def rename_files(directory, search_pattern, replace_pattern, write_changes=False):
pattern_old = re.compile(search_pattern)
for path, dirs, files in os.walk(os.path.abspath(directory)):
for filename in fnmatch.filter(files, "*.*"):
if pattern_old.findall(filename):
new_name = pattern_old.sub(replace_pattern, filename)
filepath_old = os.path.join(path, filename)
filepath_new = os.path.join(path, new_name)
if not filepath_new:
print 'Replacement regex {} returns empty value! Skipping'.format(replace_pattern)
continue
print new_name
if write_changes:
shutil.move(filepath_old, filepath_new)
else:
print 'Name [{}] does not match search regex [{}]'.format(filename, search_pattern)
if __name__ == '__main__':
if len(sys.argv) < 4:
usage()
sys.exit(-1)
work_dir = sys.argv[1]
search_regex = sys.argv[2]
replace_regex = sys.argv[3]
write_changes = (len(sys.argv) > 4) and sys.argv[4].lower() in ['--write', '-w']
rename_files(work_dir, search_regex, replace_regex, write_changes)
Example use case
I want to flip parts of a file name in the following manner, i.e. move the bit m7-08 to the beginning of the file name:
# Before:
Summary-building-mobile-apps-ionic-framework-angularjs-m7-08.mp4
# After:
m7-08_Summary-building-mobile-apps-ionic-framework-angularjs.mp4
This will perform a dry run, and print the new file names without actually renaming any files:
rename_files_regex.py . "([^\.]+?)-(m\\d+-\\d+)" "\\2_\\1"
This will do the actual renaming (you can use either -w or --write):
rename_files_regex.py . "([^\.]+?)-(m\\d+-\\d+)" "\\2_\\1" --write
You can use os.system function for simplicity and to invoke bash to accomplish the task:
import os
os.system('mv old_filename new_filename')
This works for me.
import os
for afile in os.listdir('.'):
filename, file_extension = os.path.splitext(afile)
if not file_extension == '.xyz':
os.rename(afile, filename + '.abc')
What about this :
import re
p = re.compile(r'_')
p.split(filename, 1) #where filename is CHEESE_CHEESE_TYPE.***

Directory-tree listing in Python

How do I get a list of all files (and directories) in a given directory in Python?
This is a way to traverse every file and directory in a directory tree:
import os
for dirname, dirnames, filenames in os.walk('.'):
# print path to all subdirectories first.
for subdirname in dirnames:
print(os.path.join(dirname, subdirname))
# print path to all filenames.
for filename in filenames:
print(os.path.join(dirname, filename))
# Advanced usage:
# editing the 'dirnames' list will stop os.walk() from recursing into there.
if '.git' in dirnames:
# don't go into any .git directories.
dirnames.remove('.git')
You can use
os.listdir(path)
For reference and more os functions look here:
Python 2 docs: https://docs.python.org/2/library/os.html#os.listdir
Python 3 docs: https://docs.python.org/3/library/os.html#os.listdir
Here's a helper function I use quite often:
import os
def listdir_fullpath(d):
return [os.path.join(d, f) for f in os.listdir(d)]
import os
for filename in os.listdir("C:\\temp"):
print filename
If you need globbing abilities, there's a module for that as well. For example:
import glob
glob.glob('./[0-9].*')
will return something like:
['./1.gif', './2.txt']
See the documentation here.
For files in current working directory without specifying a path
Python 2.7:
import os
os.listdir('.')
Python 3.x:
import os
os.listdir()
Try this:
import os
for top, dirs, files in os.walk('./'):
for nm in files:
print os.path.join(top, nm)
While os.listdir() is fine for generating a list of file and dir names, frequently you want to do more once you have those names - and in Python3, pathlib makes those other chores simple. Let's take a look and see if you like it as much as I do.
To list dir contents, construct a Path object and grab the iterator:
In [16]: Path('/etc').iterdir()
Out[16]: <generator object Path.iterdir at 0x110853fc0>
If we want just a list of names of things:
In [17]: [x.name for x in Path('/etc').iterdir()]
Out[17]:
['emond.d',
'ntp-restrict.conf',
'periodic',
If you want just the dirs:
In [18]: [x.name for x in Path('/etc').iterdir() if x.is_dir()]
Out[18]:
['emond.d',
'periodic',
'mach_init.d',
If you want the names of all conf files in that tree:
In [20]: [x.name for x in Path('/etc').glob('**/*.conf')]
Out[20]:
['ntp-restrict.conf',
'dnsextd.conf',
'syslog.conf',
If you want a list of conf files in the tree >= 1K:
In [23]: [x.name for x in Path('/etc').glob('**/*.conf') if x.stat().st_size > 1024]
Out[23]:
['dnsextd.conf',
'pf.conf',
'autofs.conf',
Resolving relative paths become easy:
In [32]: Path('../Operational Metrics.md').resolve()
Out[32]: PosixPath('/Users/starver/code/xxxx/Operational Metrics.md')
Navigating with a Path is pretty clear (although unexpected):
In [10]: p = Path('.')
In [11]: core = p / 'web' / 'core'
In [13]: [x for x in core.iterdir() if x.is_file()]
Out[13]:
[PosixPath('web/core/metrics.py'),
PosixPath('web/core/services.py'),
PosixPath('web/core/querysets.py'),
A recursive implementation
import os
def scan_dir(dir):
for name in os.listdir(dir):
path = os.path.join(dir, name)
if os.path.isfile(path):
print path
else:
scan_dir(path)
I wrote a long version, with all the options I might need: http://sam.nipl.net/code/python/find.py
I guess it will fit here too:
#!/usr/bin/env python
import os
import sys
def ls(dir, hidden=False, relative=True):
nodes = []
for nm in os.listdir(dir):
if not hidden and nm.startswith('.'):
continue
if not relative:
nm = os.path.join(dir, nm)
nodes.append(nm)
nodes.sort()
return nodes
def find(root, files=True, dirs=False, hidden=False, relative=True, topdown=True):
root = os.path.join(root, '') # add slash if not there
for parent, ldirs, lfiles in os.walk(root, topdown=topdown):
if relative:
parent = parent[len(root):]
if dirs and parent:
yield os.path.join(parent, '')
if not hidden:
lfiles = [nm for nm in lfiles if not nm.startswith('.')]
ldirs[:] = [nm for nm in ldirs if not nm.startswith('.')] # in place
if files:
lfiles.sort()
for nm in lfiles:
nm = os.path.join(parent, nm)
yield nm
def test(root):
print "* directory listing, with hidden files:"
print ls(root, hidden=True)
print
print "* recursive listing, with dirs, but no hidden files:"
for f in find(root, dirs=True):
print f
print
if __name__ == "__main__":
test(*sys.argv[1:])
Here is another option.
os.scandir(path='.')
It returns an iterator of os.DirEntry objects corresponding to the entries (along with file attribute information) in the directory given by path.
Example:
with os.scandir(path) as it:
for entry in it:
if not entry.name.startswith('.'):
print(entry.name)
Using scandir() instead of listdir() can significantly increase the performance of code that also needs file type or file attribute information, because os.DirEntry objects expose this information if the operating system provides it when scanning a directory. All os.DirEntry methods may perform a system call, but is_dir() and is_file() usually only require a system call for symbolic links; os.DirEntry.stat() always requires a system call on Unix but only requires one for symbolic links on Windows.
Python Docs
The one worked with me is kind of a modified version from Saleh's answer elsewhere on this page.
The code is as follows:
dir = 'given_directory_name'
filenames = [os.path.abspath(os.path.join(dir,i)) for i in os.listdir(dir)]
A nice one liner to list only the files recursively. I used this in my setup.py package_data directive:
import os
[os.path.join(x[0],y) for x in os.walk('<some_directory>') for y in x[2]]
I know it's not the answer to the question, but may come in handy
For Python 2
#!/bin/python2
import os
def scan_dir(path):
print map(os.path.abspath, os.listdir(pwd))
For Python 3
For filter and map, you need wrap them with list()
#!/bin/python3
import os
def scan_dir(path):
print(list(map(os.path.abspath, os.listdir(pwd))))
The recommendation now is that you replace your usage of map and filter with generators expressions or list comprehensions:
#!/bin/python
import os
def scan_dir(path):
print([os.path.abspath(f) for f in os.listdir(path)])
#import modules
import os
_CURRENT_DIR = '.'
def rec_tree_traverse(curr_dir, indent):
"recurcive function to traverse the directory"
#print "[traverse_tree]"
try :
dfList = [os.path.join(curr_dir, f_or_d) for f_or_d in os.listdir(curr_dir)]
except:
print "wrong path name/directory name"
return
for file_or_dir in dfList:
if os.path.isdir(file_or_dir):
#print "dir : ",
print indent, file_or_dir,"\\"
rec_tree_traverse(file_or_dir, indent*2)
if os.path.isfile(file_or_dir):
#print "file : ",
print indent, file_or_dir
#end if for loop
#end of traverse_tree()
def main():
base_dir = _CURRENT_DIR
rec_tree_traverse(base_dir," ")
raw_input("enter any key to exit....")
#end of main()
if __name__ == '__main__':
main()
FYI Add a filter of extension or ext file
import os
path = '.'
for dirname, dirnames, filenames in os.walk(path):
# print path to all filenames with extension py.
for filename in filenames:
fname_path = os.path.join(dirname, filename)
fext = os.path.splitext(fname_path)[1]
if fext == '.py':
print fname_path
else:
continue
If figured I'd throw this in. Simple and dirty way to do wildcard searches.
import re
import os
[a for a in os.listdir(".") if re.search("^.*\.py$",a)]
Below code will list directories and the files within the dir
def print_directory_contents(sPath):
import os
for sChild in os.listdir(sPath):
sChildPath = os.path.join(sPath,sChild)
if os.path.isdir(sChildPath):
print_directory_contents(sChildPath)
else:
print(sChildPath)
Here is a one line Pythonic version:
import os
dir = 'given_directory_name'
filenames = [os.path.join(os.path.dirname(os.path.abspath(__file__)),dir,i) for i in os.listdir(dir)]
This code lists the full path of all files and directories in the given directory name.
I know this is an old question. This is a neat way I came across if you are on a liunx machine.
import subprocess
print(subprocess.check_output(["ls", "/"]).decode("utf8"))
Easiest way:
list_output_files = [os.getcwd()+"\\"+f for f in os.listdir(os.getcwd())]

Categories