a='C:/Users/me/Documents/PythonProjects/opencv/Train\11\00011_00014_00018.png'
I am running a for loop over variables such as a, which are strings.
I want to extract the number 11 from the string above.
Using a.replace('\\','/'), I get the exact same string back, that is, 'C:/Users/me/Documents/PythonProjects/opencv/Train\11\00011_00014_00018.png'
The only way I got it to work was with r'C:/Users/me/Documents/PythonProjects/opencv/Train\11\00011_00014_00018.png'.replace('\\','/'), but that does not work with variables, i.e.
r'a'.replace('\\','/')
It's not like f-strings, where I can pass in a variable, as in f'{a}'.
I would instead recommend using os.path if your intention is to clean up or mutate filesystem paths
>>> import os
>>> a='C:/Users/me/Documents/PythonProjects/opencv/Train\11\00011_00014_00018.png'
>>> os.path.normpath(a)
'C:\\Users\\me\\Documents\\PythonProjects\\opencv\\Train\t\x0011_00014_00018.png'
Using os.path for path manipulation will generally behave correctly on different operating systems without you having to manually modify slashes, drive names, etc.
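As a rough sketch of how the folder number could then be pulled out: the snippet below assumes the path can be written as a raw string (so that `\11` is not read as an octal escape) and uses `PureWindowsPath` so that it behaves the same on any operating system. The variable names `broken` and `class_id` are illustrative only.

```python
from pathlib import PureWindowsPath

# Without the r prefix, '\11' is an octal escape for a tab character,
# so the literal no longer contains the text "11" after "Train".
broken = 'Train\11'
assert broken == 'Train\t'

# Raw string: backslashes are kept literally.
a = r'C:/Users/me/Documents/PythonProjects/opencv/Train\11\00011_00014_00018.png'

# PureWindowsPath accepts both separators and does no filesystem access.
class_id = PureWindowsPath(a).parent.name
print(class_id)  # 11
```

If the string arrives without the raw prefix, the digits are already gone by the time any function sees it, which is why `replace` appeared to do nothing.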
Thanks, it worked! I changed:
root_dir = 'C:/Users/me/Documents/PythonProjects/opencv/Train'
all_img_paths = glob.glob(os.path.join(root_dir, '**.png'))
for img_path in all_img_paths:
    try:
        img = preprocess_img(io.imread(img_path))
        label = get_class(img_path)
to:
all_img_paths = glob.glob(os.path.join(os.path.normpath(root_dir), '**.png'))
np.random.shuffle(all_img_paths)
I used r' in front of the string so that Python doesn't treat the backslashes as escape sequences.
df.to_excel(r'C:\Users\A\Desktop\Data\file.xlsx', index = False) #exporting to excelfile named "file.xlsx"
However, this time I need the filename to be a variable instead.
I usually format it with an f-string, but I can't combine r' and f' like this. It doesn't work:
df.to_excel(r'f'C:\Users\A\Desktop\Data\{filename}.xlsx'', index = False)
How can i solve this? Thanks
I would suggest using either pathlib or os.path module in case you are working with paths and want your project to be compatible with different OS.
For pathlib, you can use the following snippet. Note that the forward slashes will be automatically converted in the correct kind of slash for the current OS.
from pathlib import Path
data_folder = Path("C:/Users/A/Desktop/Data/")
file_name = 'myname.xlsx'
file_path = data_folder / file_name
df.to_excel(file_path, index = False)
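To see the separator handling without touching the filesystem, here is a small sketch using `PureWindowsPath`, which applies Windows rules on any OS. It is only meant to illustrate the conversion; on a real Windows machine, plain `Path` behaves the same way.

```python
from pathlib import PureWindowsPath

data_folder = PureWindowsPath("C:/Users/A/Desktop/Data/")
file_path = data_folder / "myname.xlsx"

print(file_path)             # C:\Users\A\Desktop\Data\myname.xlsx
print(file_path.as_posix())  # C:/Users/A/Desktop/Data/myname.xlsx
```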
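The answer to your current question would be string concatenation. Note that a raw string cannot end with a single backslash, so the trailing separator is added as a normal escaped string. Something like this:
df.to_excel(r'C:\Users\A\Desktop\Data' + '\\' + f'{filename}.xlsx', index = False)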
You don't have to place each within quote marks- 1 set will do:
fr'C:\Users\A\Desktop\Data\{filename}.xlsx'
Quotes are wrong. You can use it like this.
df.to_excel(rf'C:\Users\A\Desktop\Data\{filename}.xlsx', index = False)
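A quick check that the two prefix orders are interchangeable; `filename` here is a hypothetical value used only for the demonstration.

```python
filename = "file"  # hypothetical value

path_rf = rf'C:\Users\A\Desktop\Data\{filename}.xlsx'
path_fr = fr'C:\Users\A\Desktop\Data\{filename}.xlsx'

# Both spellings give the same string: backslashes are kept literally
# and {filename} is still substituted.
print(path_rf == path_fr)  # True
print(path_rf)             # C:\Users\A\Desktop\Data\file.xlsx
```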
I am creating a format string that is different based on class, that is used to generate a filename by a generic class method. I'm using the Python 3.4+ pathlib.Path module for object-oriented file I/O.
In building this string, the path separator is missing, and rather than just put the windows version in, I want to add a platform independent file separator.
I searched the pathlib docs, and answers here about this, but all the examples assume I'm building a Path object, not a string. The pathlib functions will add the correct separator in any string outputs, but those are actual paths - so it won't work.
Besides something hacky like writing a string and parsing it to figure out what the separator is, is there a way to directly get the current, correct file separator string?
Prefer an answer using pathlib.Path, rather than os or shutil packages.
Here's what the code looks like:
In the constructor:
self.outputdir = Path('drive:\\dir\\file')
self.indiv_fileformatstr = str(self.outputdir) + '{}_new.m'
In the final method used:
indiv_filename = Path(self.indiv_fileformatstr.format(topic))
This leaves out the file separator
There is nothing public in the pathlib module providing the character used by the operating system to separate pathname components. If you really need it, import os and use os.sep.
But you likely don't need it in the first place - it's missing the point of pathlib if you convert to strings in order to join a filename. In typical usage, the separator string itself isn't used for concatenating path components because pathlib overrides division operators (__truediv__ and __rtruediv__) for this purpose. Similarly, it's not needed for splitting due to methods such as Path.parts.
Instead of:
self.indiv_fileformatstr = str(self.outputdir) + '{}_new.m'
You would usually do something like:
self.indiv_fileformatpath = self.outputdir / '{}_new.m'
self.indiv_fileformatstr = str(self.indiv_fileformatpath)
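A self-contained sketch of the same idea, assuming a hypothetical output directory `outdir` and topic name `topic`. `PurePath` is used so that nothing touches the disk.

```python
from pathlib import PurePath

outputdir = PurePath("outdir")           # hypothetical output directory
template_path = outputdir / "{}_new.m"   # "/" inserts the separator

# Fill in the placeholder later and wrap the result in a path again.
indiv_filename = PurePath(str(template_path).format("topic"))

print(indiv_filename.name)    # topic_new.m
print(indiv_filename.parent)  # outdir
```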
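The platform-independent separator can also be reached as pathlib.os.sep, but that relies on pathlib importing os internally, which is an implementation detail; os.sep is the documented name.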
Solution using wim's answer
Based on wim's answer, the following works great:
Save the format string in the Path object
When needing to substitute into the templated filename in the future, just use str(path_object) to get the string back out.
import pathlib
# Start with following, with self.outputdir as pathlib.Path object
outputdir = pathlib.Path('c:\\myfolder')
file_template_path = outputdir / '{}_new.m'
# Then to make the final file object later (i.e. in a child class, etc.)
base_filename_string = 'myfile'
new_file = pathlib.Path(str(file_template_path).format(base_filename_string))
This creates:
pathlib.Path("c:\\myfolder\\myfile_new.m")
Creating the template with prefix/postfix/etc.
If you need to apply other variables, you can use 2 levels of formatting to apply specialized prefixes/postfixes, etc., then store the final template in a Path object, as shown above.
When creating 2 levels of formatting, use double brackets where the first level formatter should just create a single bracket and not try to interpret a tag. i.e. {{basename}} becomes just {basename} without any variable substitution.
prefix = 'new_'
postfix = '_1'
ext = 'txt'
file_template_path = outputdir / f'{prefix}{{}}{postfix}.{ext}'
which becomes a path object with the following string:
$ file_template_path
pathlib.Path("c:\\myfolder\\new_{}_1.txt")
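The two formatting levels can be checked on the file name alone, leaving the directory out. This is just an illustration of the doubled braces, using the same example values.

```python
prefix = 'new_'
postfix = '_1'
ext = 'txt'

# First level: the f-string fills prefix/postfix/ext and turns {{}} into {}.
template = f'{prefix}{{}}{postfix}.{ext}'
print(template)  # new_{}_1.txt

# Second level: str.format fills the remaining placeholder.
print(template.format('myfile'))  # new_myfile_1.txt
```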
Given is a variable that contains a windows file path. I have to then go and read this file. The problem here is that the path contains escape characters, and I can't seem to get rid of it. I checked os.path and pathlib, but all expect the correct text formatting already, which I can't seem to construct.
For example this. Please note that fPath is given, so I cant prefix it with r for a rawpath.
#this is given, I cant rawpath it with r
fPath = "P:\python\t\temp.txt"
file = open(fPath, "r")
for line in file:
    print(line)
How can I turn fPath via some function or method from:
"P:\python\t\temp.txt"
to
"P:/python/t/temp.txt"
I've also tried .replace("\","/"), which doesn't work.
I'm using Python 3.7 for this.
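You can use os.path.abspath() to normalize it, as long as the literal is a raw string (or the backslashes are doubled) so that \t is not read as a tab:
print(os.path.abspath(r"P:\python\t\temp.txt"))
On Windows this prints P:\python\t\temp.txt; add .replace("\\", "/") if you need forward slashes.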
See the documentation of os.path here.
I've solved it.
The issue lies with the Python interpreter. \t and the other escapes aren't stored as such in the data; they are interpreted as non-printing characters.
So I got a bit lucky and someone else already faced the same problem and solved it with a hard brute-force method:
http://code.activestate.com/recipes/65211/
I just had to find it.
After that I have a raw string without escaped characters, and just need to run the simple replace() on it to get a workable path.
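The linked recipe is not reproduced here. The following is only a rough sketch of the same brute-force idea, under the assumption that any control character in the path came from a misread backslash escape. `ESCAPES` and `restore_backslashes` are names made up for this example. It cannot recover octal or hex escapes such as `\11` or `\x41`, and it would also rewrite a genuine tab, so treat it as lossy.

```python
# Control characters that Python produces from common backslash escapes,
# mapped back to the two-character text that was probably intended.
ESCAPES = {
    '\a': r'\a', '\b': r'\b', '\f': r'\f', '\n': r'\n',
    '\r': r'\r', '\t': r'\t', '\v': r'\v',
}

def restore_backslashes(path):
    """Replace control characters with their backslash spelling."""
    return ''.join(ESCAPES.get(ch, ch) for ch in path)

# Same value as the literal in the question: each "\t" became a tab.
fPath = "P:\\python\t\temp.txt"

fixed = restore_backslashes(fPath).replace('\\', '/')
print(fixed)  # P:/python/t/temp.txt
```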
You can use the Path class from the pathlib library.
from pathlib import Path
docs_folder = Path("some_folder/some_folder/")
text_file = docs_folder / "some_file.txt"
f = open(text_file)
if you would like to do replace then do
replace("\\","/")
When using python version >= 3.4, the class Path from module pathlib offers a function called as_posix, which will sort of convert a path to *nix style path. For example, if you were to build Path object via p = pathlib.Path('C:\\Windows\\SysWOW64\\regedit.exe'), asking it for p.as_posix() it would yield C:/Windows/SysWOW64/regedit.exe. So to obtain a complete *nix style path, you'd need to convert the drive letter manually.
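A short check of that behaviour, using `PureWindowsPath` so the result is the same regardless of the OS the snippet runs on:

```python
from pathlib import PureWindowsPath

p = PureWindowsPath('C:\\Windows\\SysWOW64\\regedit.exe')

print(p.as_posix())  # C:/Windows/SysWOW64/regedit.exe
print(p.drive)       # C:
```

The drive letter is left as-is, which is why a fully *nix-style path still needs manual handling.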
I came across similar problem with Windows file paths. This is what is working for me:
import os
file = input().split('\\')
file = '/'.join(file)
This gave me the input from this:
"D:\test.txt"
to this:
"D:/test.txt"
Basically, when working with a Windows path, Python tends to show '\' as '\\'. This goes for every backslash. When working with file paths, you won't have real double backslashes, since each one just separates folder names.
This way you can list all the folders in order by splitting on '\\' and then rejoining them with the .join function using a forward slash.
Hopefully this helps!
I'm searching for .txt files only
from glob import glob
result = glob('*.txt')
>> result
['text1.txt','text2.txt','text3.txt']
but I'd like result without the file extensions
>> result
['text1','text2','text3']
Is there a regex pattern that I can use with glob to exclude the file extensions from the output, or do I have to use a list comprehension on result?
There is no way to do that with glob(). You need to take the list it returns and build a new one holding the values without the extension:
import os
from glob import glob
[os.path.splitext(val)[0] for val in glob('*.txt')]
os.path.splitext(val) splits the file names into file names and extensions. The [0] just returns the filenames.
Since you’re trying to split off a filename extension, not split an arbitrary string, it makes more sense to use os.path.splitext (or the pathlib module). While it’s true that it makes no practical difference on the only platforms that currently matter (Windows and *nix), it’s still conceptually clearer what you’re doing. (And if you later start using path-like objects instead of strings, it will continue to work unchanged, to boot.)
So:
paths = [os.path.splitext(path)[0] for path in paths]
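For the pathlib route mentioned above, a minimal sketch using the `stem` attribute. The list stands in for the output of `glob('*.txt')` so that the example does not depend on files being present.

```python
from pathlib import PurePath

paths = ['text1.txt', 'text2.txt', 'text3.txt']  # stand-in for glob('*.txt')

stems = [PurePath(p).stem for p in paths]
print(stems)  # ['text1', 'text2', 'text3']
```

With real files, the same thing can be written as `[p.stem for p in Path('.').glob('*.txt')]`.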
Meanwhile, if this really offends you for some reason, what glob does under the covers is just calling fnmatch to turn your glob expression into a regular expression and then applying that to all of the filenames. So, you can replace it by just replacing the regex yourself and using capture groups:
import os
import re
rtxt = re.compile(r'(.*?)\.txt$')
files = (rtxt.match(file) for file in os.listdir(dirpath))  # dirpath: the directory to search
files = [match.group(1) for match in files if match]
This way, you’re not doing a listcomp on top of the one that’s already in glob; you’re doing one instead of the one that’s already in glob. I’m not sure if that’s a useful win or not, but since you seem to be interested in eliminating a listcomp…
This glob only selects files without an extension: **/*/!(*.*) (note that this is shell extglob syntax, which Python's glob module does not support)
Use index slicing:
result = [i[:-4] for i in result]
Another way using rsplit:
>>> result = ['text1.txt','text2.txt.txt','text3.txt']
>>> [x.rsplit('.txt', 1)[0] for x in result]
['text1', 'text2.txt', 'text3']
You could do as a list-comprehension:
result = [x.rsplit(".txt", 1)[0] for x in glob('*.txt')]
Use str.split
>>> result = [r.split('.')[0] for r in glob('*.txt')]
>>> result
['text1', 'text2', 'text3']
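One caveat with `split('.')[0]` is that it cuts at the first dot, not the last. The comparison below uses a hypothetical file name, `archive.v2.txt`, to show where it differs from `os.path.splitext`.

```python
import os.path

names = ['text1.txt', 'archive.v2.txt']  # hypothetical file names

by_split = [n.split('.')[0] for n in names]
by_splitext = [os.path.splitext(n)[0] for n in names]

print(by_split)     # ['text1', 'archive']
print(by_splitext)  # ['text1', 'archive.v2']
```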
I have a string from which I would like to extract certain part. The string looks like :
E:/test/my_code/content/dir/disp_temp_2.hgx
This is a path on a machine for a specific file with extension hgx
I would like to capture exactly "disp_temp_2". The problem is that the strip function I used does not work correctly, as there are many '/'. Another problem is that the above location will always change on the computer.
Is there any method so that I can capture the exact string between the last '/' and '.'?
My code looks like:
path = path.split('.')
.. now I cannot split based on the last '/'.
Any ideas how to do this?
Thanks
Use the os.path module:
import os.path
filename = "E:/test/my_code/content/dir/disp_temp_2.hgx"
name = os.path.basename(filename).split('.')[0]
Python comes with the os.path module, which gives you much better tools for handling paths and filenames:
>>> import os.path
>>> p = "E:/test/my_code/content/dir/disp_temp_2.hgx"
>>> head, tail = os.path.split(p)
>>> tail
'disp_temp_2.hgx'
>>> os.path.splitext(tail)
('disp_temp_2', '.hgx')
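The two calls can be combined into one expression. A minimal sketch using the path from the question:

```python
import os.path

p = "E:/test/my_code/content/dir/disp_temp_2.hgx"

# basename drops the directories, splitext drops the last extension.
name = os.path.splitext(os.path.basename(p))[0]
print(name)  # disp_temp_2
```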
Standard libs are cool:
>>> from os import path
>>> f = "E:/test/my_code/content/dir/disp_temp_2.hgx"
>>> path.split(f)[1].rsplit('.', 1)[0]
'disp_temp_2'
Try this:
path=path.rsplit('/',1)[1].split('.')[0]
path = path.split('/')[-1].split('.')[0] works.
You can use the split on the other part :
path = path.split('/')[-1].split('.')[0]