differentiate between file name and file path - python

I need to invoke different functionality depending on whether a file name or a file path is passed, for example:
python test.py test1 (invokes one function)
python test.py /home/sai/test1 (invokes a different function)
I can get the argument from sys.argv[1], but I am not able to tell whether it is a file name or a file path.

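It's tricky, because a bare file name is also a valid relative path, right?
Strictly speaking, you cannot differentiate them.
On the other hand, if you only want to detect an absolute path, or a relative path starting with a slash or backslash, you could use os.path.isabs(path). The documentation says it checks whether the path begins with a slash on Unix, or with a (back)slash on Windows after chopping off a potential drive letter. The output below is from Windows; note that since Python 3.13 a Windows path that starts with exactly one (back)slash and has no drive is no longer reported as absolute: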
>>> import os
>>> os.path.isabs('C:\\folder\\name.txt')
True
>>> os.path.isabs('\\folder\\name.txt')
True
>>> os.path.isabs('name.txt')
False
However, this will fail with a relative path that does not begin with a slash:
>>> os.path.isabs('folder\\name.txt')
False
A solution that works for all the cases above, whether or not the relative path contains slashes, is to compare the tail of the path with the path itself using os.path.basename(path). If they are equal, it's just a name:
>>> os.path.basename('C:\\folder\\name.txt') == 'C:\\folder\\name.txt'
False
>>> os.path.basename('\\folder\\name.txt') == '\\folder\\name.txt'
False
>>> os.path.basename('folder\\name.txt') == 'folder\\name.txt'
False
>>> os.path.basename('name.txt') == 'name.txt'
True
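The basename comparison above can be wrapped into a small dispatcher. This is only a minimal sketch: is_bare_name, handle_name, handle_path and dispatch are hypothetical names that do not appear in the question, and the check uses the path separator of the platform the script runs on.

```python
import os
import sys


def is_bare_name(arg):
    """Return True if arg has no directory component on this platform."""
    # An empty argument is treated as "not a name" (an assumption of this sketch).
    return bool(arg) and os.path.basename(arg) == arg


def handle_name(name):
    # Hypothetical handler for a bare file name.
    return "name: " + name


def handle_path(path):
    # Hypothetical handler for anything with a directory component.
    return "path: " + path


def dispatch(arg):
    """Pick a handler depending on whether arg is a bare name or a path."""
    return handle_name(arg) if is_bare_name(arg) else handle_path(arg)


if __name__ == "__main__" and len(sys.argv) > 1:
    print(dispatch(sys.argv[1]))
```

With this, python test.py test1 would go to handle_name and python test.py /home/sai/test1 to handle_path. A relative path such as sub/test1 is also treated as a path.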

You can use isdir() and isfile():
File:
>>> os.path.isdir('a.txt')
False
>>> os.path.isfile('a.txt')
True
Dir:
>>> os.path.isfile('Doc')
False
>>> os.path.isdir('Doc')
True

Did you try os.path.basename or os.path.dirname?

Related

Validate relative path by the wildcard in Python

I have a condition where I need to compare the relative name/path of the file (string) with another path that has wildcards:
files list:
aaa/file1.txt
bbb/ccc/file2.txt
accepted files by the wildcard:
aaa/*
bbb/**/*
I need to pick only those files that match any of the wildcard masks.
aaa/file1.txt equals aaa/* => True
ddd/file3.txt equals aaa/* or bbb/**/* => False
Is it possible to do that with glob or any other module?
Edit: I removed all unnecessary details from the question after the discussion.
Answering my own question:
The wcmatch module does what I need:
from wcmatch import pathlib
print(pathlib.PurePath('a/a/file1.txt').globmatch('a/**/*', flags=pathlib.GLOBSTAR))
print(pathlib.PurePath('b/file2.txt').globmatch('b/**/*', flags=pathlib.GLOBSTAR))
print(pathlib.PurePath('c/file3.txt').globmatch('c/*', flags=pathlib.GLOBSTAR))
print(pathlib.PurePath('d/d/file4.txt').globmatch('d/*', flags=pathlib.GLOBSTAR))
True
True
True
False
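If installing wcmatch is not an option, a rough standard-library approximation is to translate the masks into regular expressions. This is only a sketch under stated assumptions: glob_to_regex and matches_any are hypothetical helper names, paths use forward slashes, and only *, ? and ** are handled (no character classes or braces), so it does not reproduce wcmatch's full semantics.

```python
import re


def glob_to_regex(pattern):
    """Translate a simplified glob (only *, ? and **) into a compiled regex."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append(r"(?:[^/]+/)*")  # zero or more whole directories
            i += 3
        elif pattern.startswith("**", i):
            parts.append(r".*")           # anything, including slashes
            i += 2
        elif pattern[i] == "*":
            parts.append(r"[^/]*")        # anything within one path segment
            i += 1
        elif pattern[i] == "?":
            parts.append(r"[^/]")         # one character, not a slash
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts) + r"\Z")


def matches_any(path, patterns):
    """Return True if path matches at least one of the glob masks."""
    return any(glob_to_regex(p).match(path) is not None for p in patterns)


masks = ["aaa/*", "bbb/**/*"]
for name in ["aaa/file1.txt", "bbb/ccc/file2.txt", "ddd/file3.txt"]:
    print(name, matches_any(name, masks))
```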

return found path in glob

If I have a glob('path/to/my/**/*.json', recursive = True), function returns something like:
path/to/my/subfolder1/subfolder2/file1.json
path/to/my/subfolder1/subfolder2/file2.json
path/to/my/subfolder1/subfolder2/file3.json
path/to/my/subfolder1/file4.json
path/to/my/file5.json
...
I'd like to get only part that starts after ** in the glob, so
subfolder1/subfolder2/file1.json
subfolder1/subfolder2/file2.json
subfolder1/subfolder2/file3.json
subfolder1/file4.json
file5.json
...
What is the best way to do it? Does glob support it natively? The glob input is provided on the command line, hence a direct str.replace may be difficult.
Use os.path.commonprefix on the returned paths, then os.path.relpath using the common prefix to get paths relative to it.
An example from a Node.js project with a whole bunch of package.jsons.
>>> pkgs = glob.glob("node_modules/**/package.json", recursive=True)[:10]
>>> pkgs
['node_modules/queue-microtask/package.json', 'node_modules/callsites/package.json', 'node_modules/sourcemap-codec/package.json', 'node_modules/reusify/package.json', 'node_modules/is-bigint/package.json', 'node_modules/which-boxed-primitive/package.json', 'node_modules/jsesc/package.json', 'node_modules/@types/scheduler/package.json', 'node_modules/@types/react-dom/package.json', 'node_modules/@types/prop-types/package.json']
>>> pfx = os.path.commonprefix(pkgs)
>>> pfx
'node_modules/'
>>> [os.path.relpath(pkg, pfx) for pkg in pkgs]
['queue-microtask/package.json', 'callsites/package.json', 'sourcemap-codec/package.json', 'reusify/package.json', 'is-bigint/package.json', 'which-boxed-primitive/package.json', 'jsesc/package.json', '@types/scheduler/package.json', '@types/react-dom/package.json', '@types/prop-types/package.json']
>>>
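One caveat: os.path.commonprefix compares character by character, so if every match happens to share a subfolder or the start of a name, it strips more than the fixed part of the pattern. An alternative sketch is to derive the base directory from the pattern itself. static_prefix and relative_matches are hypothetical helper names, and the sketch assumes the pattern is written with forward slashes as in the question.

```python
import os


def static_prefix(pattern):
    """Return the leading directories of a glob pattern that contain no wildcards."""
    fixed = []
    for part in pattern.split("/"):
        if any(ch in part for ch in "*?["):
            break  # first component with a wildcard ends the fixed part
        fixed.append(part)
    prefix = "/".join(fixed)
    if prefix:
        return prefix
    # No fixed part: the root for an absolute pattern, else the current directory.
    return "/" if pattern.startswith("/") else "."


def relative_matches(paths, pattern):
    """Express each matched path relative to the pattern's fixed prefix."""
    base = static_prefix(pattern)
    return [os.path.relpath(p, base) for p in paths]


pattern = "path/to/my/**/*.json"
found = ["path/to/my/subfolder1/file4.json", "path/to/my/file5.json"]
print(relative_matches(found, pattern))
```

In real use, found would be the result of glob.glob(pattern, recursive=True); the list is hard-coded here so the sketch runs without any files on disk.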

Having trouble converting paths in Pathlib

I have to take in a file path that looks like this:
'C:/Users/xxx/Desktop/test_folder'
It gets stored into a variable as a string so:
path_intake = 'C:/Users/xxx/Desktop/test_folder'
I want to turn that string into a Path object:
p = Path(path_intake)
But when p takes in path_intake, it changes the path to:
'C:\Users\xxx\Desktop\test_folder'
That is not what I want, since I believe .rglob can only read the path like this:
p = Path('C:/Users/xxx/Desktop/test_folder')
How can I obtain this path from the first one?
The value
C:/Users/xxx/Desktop/test_folder
is not a canonical Windows path string. As everyone knows, Windows uses backslashes. So if you supply /, pathlib turns the path into a canonical path string for your platform, which is
C:\Users\xxx\Desktop\test_folder
But the two Path objects are identical, as you will quickly see if you do this:
>>> p = pathlib.Path(r"C:\Users\xxx\Desktop\test_folder")
>>> p2 = pathlib.Path(r"C:/Users/xxx/Desktop/test_folder")
>>> p == p2
True
You are not correct when you say that ".rglob can only read a path like this: C:/Users/xxx/Desktop/test_folder". To demonstrate that, do this:
>>> list(p.rglob("*.txt")) == list(p2.rglob("*.txt"))
True
The Path objects are identical and you can call .rglob() on either one and get the expected result.
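A small sketch of the same point that runs on any operating system, using PureWindowsPath so that Windows semantics apply even on Linux or macOS (pure paths never touch the file system, so the folder does not need to exist). If the forward-slash string is ever needed for display, as_posix() returns it.

```python
from pathlib import PureWindowsPath

# Both spellings parse to the same path object.
p = PureWindowsPath("C:/Users/xxx/Desktop/test_folder")
p2 = PureWindowsPath(r"C:\Users\xxx\Desktop\test_folder")

print(p == p2)        # the two objects compare equal
print(str(p))         # canonical Windows form, with backslashes
print(p.as_posix())   # the same path written with forward slashes
```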

Comparing two paths in python

Consider:
path1 = "c:/fold1/fold2"
list_of_paths = ["c:\\fold1\\fold2","c:\\temp\\temp123"]
if path1 in list_of_paths:
    print("found")
I would like the if statement to return True, but it evaluates to False,
since it is a string comparison.
How to compare two paths irrespective of the forward or backward slashes they have? I'd prefer not to use the replace function to convert both strings to a common format.
Use os.path.normpath to convert c:/fold1/fold2 to c:\fold1\fold2:
>>> path1 = "c:/fold1/fold2"
>>> list_of_paths = ["c:\\fold1\\fold2","c:\\temp\\temp123"]
>>> os.path.normpath(path1)
'c:\\fold1\\fold2'
>>> os.path.normpath(path1) in list_of_paths
True
>>> os.path.normpath(path1) in (os.path.normpath(p) for p in list_of_paths)
True
os.path.normpath(path1) in map(os.path.normpath, list_of_paths) also works, but in Python 2.x map builds the entire list of normalized paths first, even if a match occurs in the middle.
On Windows, you should also use os.path.normcase when comparing paths, because Windows paths are not case-sensitive.
All of these answers mention os.path.normpath, but none of them mention os.path.realpath:
os.path.realpath(path)
Return the canonical path of the specified filename, eliminating any symbolic links encountered in the path (if they are supported by the operating system).
New in version 2.2.
So then:
if os.path.realpath(path1) in (os.path.realpath(p) for p in list_of_paths):
    # ...
The os.path module contains several functions to normalize file paths so that equivalent paths normalize to the same string. You may want normpath, normcase, abspath, samefile, or some other tool.
If you are using python-3, you can use pathlib to achieve your goal:
import pathlib
path1 = pathlib.Path("c:/fold1/fold2")
list_of_paths = [pathlib.Path(path) for path in ["c:\\fold1\\fold2","c:\\temp\\temp123"]]
assert path1 in list_of_paths
Store each entry of list_of_paths as a list of path components instead of a string:
list_of_paths = [["c:","fold1","fold2"],["c:","temp","temp123"]]
Then split the given path on '/' or '\' (whichever is present) and use the in keyword.
Use os.path.normpath to canonicalize the paths before comparing them. For example:
if any(os.path.normpath(path1) == os.path.normpath(p)
       for p in list_of_paths):
    print("found")
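Putting the normpath and normcase suggestions together, here is a minimal sketch. It imports ntpath (the Windows flavour of os.path) directly so that the example behaves the same on any platform; on Windows itself, os.path is already ntpath. same_windows_path is a hypothetical helper name. It compares strings only and does not resolve symbolic links, for which realpath or samefile would be needed.

```python
import ntpath  # Windows implementation of os.path, importable on every platform


def same_windows_path(a, b):
    """Compare two Windows path strings, ignoring slash direction and case."""
    def canon(p):
        # normpath collapses "..", "." and converts "/" to "\";
        # normcase lowercases the result.
        return ntpath.normcase(ntpath.normpath(p))
    return canon(a) == canon(b)


path1 = "c:/fold1/fold2"
list_of_paths = ["c:\\fold1\\fold2", "c:\\temp\\temp123"]

if any(same_windows_path(path1, p) for p in list_of_paths):
    print("found")
```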

How to check whether a path represented by a QString with German umlauts exists?

I get a QString which represents a directory from a QLineEdit. Now I want to check whether a certain file exists in this directory. But if I try this with os.path.exists and os.path.join, I get into trouble when German umlauts occur in the directory path:
# the directory coming from the user input in the QLineEdit
# I convert this QString to the local 8-bit encoding and then make
# a string from it
target_dir = str(lineEdit.text().toLocal8Bit())
# the file name that should be checked for
file_name = 'some-name.txt'
# this fails with a UnicodeDecodeError when an umlaut occurs in target_dir
os.path.exists(os.path.join(target_dir, file_name))
How would you check whether the file exists when you might encounter German umlauts?
I was getting nowhere with this on my Ubuntu box with an ext3 filesystem. So I guess you should first make sure the filesystem supports Unicode filenames, or else I believe the behavior is undefined?
>>> os.path.supports_unicode_filenames
True
If that's True, you should be able to pass unicode strings to the os.path calls directly:
>>> print u'\xf6'
ö
>>> target_dir = os.path.join(os.getcwd(), u'\xf6')
>>> print target_dir
C:\Python26\ö
>>> os.path.exists(os.path.join(target_dir, 'test.txt'))
True
You should look at QString.toUtf8 and maybe pass the returned value through os.path.normpath before handing it over to os.path.join.
Good luck!
Never mind, it works fine on my Ubuntu box as well...
>>> os.path.supports_unicode_filenames
False
>>> target_dir = os.path.join(os.getcwd(), u'\xf6')
>>> print target_dir
/home/clayg/ö
>>> os.path.exists(os.path.join(target_dir, 'test'))
True
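For readers on Python 3, where str is already Unicode, the same check needs no manual encoding step. The sketch below assumes the directory arrives as a normal Python str (as a line edit's text does in recent Qt bindings, which is an assumption about your setup) and uses a temporary directory so it can run anywhere. file_exists_in and demo are hypothetical names.

```python
import os
import tempfile


def file_exists_in(target_dir, file_name):
    """Return True if file_name exists as a regular file inside target_dir."""
    return os.path.isfile(os.path.join(target_dir, file_name))


def demo():
    # Build a directory whose name contains umlauts, then look for a file in it.
    with tempfile.TemporaryDirectory() as root:
        target_dir = os.path.join(root, "Gr\u00f6\u00dfe_\u00e4\u00fc")
        os.mkdir(target_dir)
        with open(os.path.join(target_dir, "some-name.txt"), "w"):
            pass  # create an empty file
        return file_exists_in(target_dir, "some-name.txt")


print(demo())
```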
