How to get filename without some special extensions in python


I have a file with a special extension. Sometimes it is '.exe', sometimes '.exe.gz' or '.exe.tar.gz'. I want to get only the filename. I am using the code below to get the filename abc, but it does not work for all cases:
import os
filename = 'abc.exe'
base = os.path.basename(filename)
print(os.path.splitext(base)[0])
filename = 'abc.exe.gz'
base = os.path.basename(filename)
print(os.path.splitext(base)[0])
Note that I know the list of extensions in advance, such as ['.exe', '.exe.gz', '.exe.tar.gz', '.gz'].

You can just split on the . character and take the first element (this assumes the name itself contains no dots):
>>> filename = 'abc.exe'
>>> filename.split('.')[0]
'abc'
>>> filename = 'abc.exe.gz'
>>> filename.split('.')[0]
'abc'

How about a workaround like this?
suffixes = ['.exe','.exe.gz','.exe.tar.gz', '.gz']
def get_basename(filename):
    for suffix in suffixes:
        if filename.endswith(suffix):
            return filename[:-len(suffix)]
    return filename
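
The loop above depends on the order of the list (a short suffix such as '.gz' must come after the longer ones that end with it). A minimal sketch that sorts the suffixes by length first, so the order no longer matters. The function name `strip_known_suffix` and the constant `SUFFIXES` are made up for this illustration:

```python
import os

# Known suffixes, taken from the question; the order here does not matter.
SUFFIXES = ['.gz', '.exe', '.exe.gz', '.exe.tar.gz']

def strip_known_suffix(path, suffixes=SUFFIXES):
    """Return the file name with the longest matching known suffix removed."""
    name = os.path.basename(path)
    # Try longer suffixes first so '.exe.tar.gz' wins over '.gz'.
    for suffix in sorted(suffixes, key=len, reverse=True):
        if name.endswith(suffix):
            return name[:-len(suffix)]
    return name

print(strip_known_suffix('abc.exe'))         # abc
print(strip_known_suffix('abc.exe.tar.gz'))  # abc
print(strip_known_suffix('my.file.gz'))      # my.file
```

Unlike splitting on the first dot, this keeps dots that belong to the name itself, as in the last call.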

Related

how to split the filename using python

I am using Python to create an XML file with Element and SubElement.
I have a list of zip files in my folder listed below:
Retirement_participant-plan_info_v1_getPlankeys_rev1_2021_03_09.zip
Retirement_participant-plan_info_resetcache_secretmanager_rev1_2021_03_09.zip
Retirement_participant-plan_info_v1_mypru_plankeys_rev1_2021_03_09.zip
Retirement_participant-plan_info_resetcache_param_value_rev1_2021_03_09.zip
Retirement_participant-plan_info_resetcache_param_v1_balances_rev1_2021_03_09.zip
I want to split those zip files and get the name like this:
Retirement_participant-plan_info_v1_getPlankeys
Retirement_participant-plan_info_resetcache_secretmanager
Retirement_participant-plan_info_v1_mypru_plankeys
Retirement_participant-plan_info_resetcache_param_value
Retirement_participant-plan_info_resetcache_param_v1_balances
PS: I want to remove _rev1_2021_03_09.zip when creating a name from the zip file.
Here is my Python code. It works with Retirement_participant-plan_info_v1_getPlankeys_rev1_2021_03_09.zip, but it does not work when the zip file has a longer name, e.g. Retirement_participant-plan_info_resetcache_param_v1_balances_rev1_2021_03_09.zip
Proxies = SubElement(proxy, 'Proxies')
path = "./"
for f in os.listdir(path):
    if '.zip' in f:
        Proxy = SubElement(Proxies, 'Proxy')
        name = SubElement(Proxy, 'name')
        fileName = SubElement(Proxy, 'fileName')
        a = f.split('_')
        name.text = '_'.join(a[:3])
        fileName.text = str(f)

You can use str.split on _rev1_:
>>> filenames
['Retirement_participant-plan_info_v1_getPlankeys_rev1_2021_03_09.zip',
'Retirement_participant-plan_info_resetcache_secretmanager_rev1_2021_03_09.zip',
'Retirement_participant-plan_info_v1_mypru_plankeys_rev1_2021_03_09.zip',
'Retirement_participant-plan_info_resetcache_param_value_rev1_2021_03_09.zip',
'Retirement_participant-plan_info_resetcache_param_v1_balances_rev1_2021_03_09.zip']
>>> names = [fname.split('_rev1_')[0] for fname in filenames]
>>> names
['Retirement_participant-plan_info_v1_getPlankeys',
'Retirement_participant-plan_info_resetcache_secretmanager',
'Retirement_participant-plan_info_v1_mypru_plankeys',
'Retirement_participant-plan_info_resetcache_param_value',
'Retirement_participant-plan_info_resetcache_param_v1_balances']
The same can be achieved with str.rsplit by limiting maxsplit to 4:
>>> names = [fname.rsplit('_', 4)[0] for fname in filenames]
>>> names
['Retirement_participant-plan_info_v1_getPlankeys',
'Retirement_participant-plan_info_resetcache_secretmanager',
'Retirement_participant-plan_info_v1_mypru_plankeys',
'Retirement_participant-plan_info_resetcache_param_value',
'Retirement_participant-plan_info_resetcache_param_v1_balances']

If the revision and date are always the same (2021_03_09), just replace them with an empty string:
filenames = [f.replace("_rev1_2021_03_09.zip", "") for f in os.listdir(path)]
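
If the revision number or the date can vary, a regular expression is one option. This is only a sketch: it assumes every name ends in `_rev<digits>_<YYYY>_<MM>_<DD>.zip`, and the names `REV_SUFFIX` and `proxy_name` are invented for the example:

```python
import re

# Assumed pattern: "_rev<digits>_<YYYY>_<MM>_<DD>.zip" at the end of the name.
REV_SUFFIX = re.compile(r'_rev\d+_\d{4}_\d{2}_\d{2}\.zip$')

def proxy_name(zip_name):
    """Drop the trailing revision/date/extension part, if present."""
    return REV_SUFFIX.sub('', zip_name)

print(proxy_name('Retirement_participant-plan_info_resetcache_param_v1_balances_rev1_2021_03_09.zip'))
# Retirement_participant-plan_info_resetcache_param_v1_balances
```

Names that do not match the pattern are returned unchanged, so it is safe to run over a mixed folder.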

file renaming adding a suffix

How can I add the suffix _brain to the following variable
fIn = 'T1_r.nii.gz'
to create the following output filename?
fOut = 'T1_r_brain.nii.gz'
When I use
fIn2, file_extension = os.path.splitext(fIn)
it only removes the .gz extension.
Thank you for your help
Fred

I had to write a utility for this, and here's what I came up with.
from pathlib import Path
def add_str_before_suffixes(filepath, string: str) -> Path:
    """Append a string to a filename immediately before extension(s).

    Parameters
    ----------
    filepath : Path-like
        Path to modify. Can contain multiple extensions like `.bed.gz`.
    string : str
        String to append to filename.

    Returns
    -------
    Instance of `pathlib.Path`.

    Examples
    --------
    >>> add_str_before_suffixes("foo", "_baz")
    PosixPath('foo_baz')
    >>> add_str_before_suffixes("foo.bed", "_baz")
    PosixPath('foo_baz.bed')
    >>> add_str_before_suffixes("foo.bed.gz", "_baz")
    PosixPath('foo_baz.bed.gz')
    """
    filepath = Path(filepath)
    suffix = "".join(filepath.suffixes)
    orig_name = filepath.name.replace(suffix, "")
    new_name = f"{orig_name}{string}{suffix}"
    return filepath.with_name(new_name)
Here is an example:
>>> f_in = "T1_r.nii.gz"
>>> add_str_before_suffixes(f_in, "_brain")
PosixPath('T1_r_brain.nii.gz')

split_path = 'T1_r.nii.gz'.split('.')
split_path[0] += '_brain'
final_path = ".".join(split_path)
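
Splitting the whole string on '.' goes wrong as soon as a directory in the path contains a dot. A small sketch that splits off the directory first and then cuts the file name at its first dot. It assumes the name does not start with a dot and that everything after the first dot is extension; `add_before_extensions` is a made-up name:

```python
import os

def add_before_extensions(path, text):
    """Insert text before the first dot of the file name (not the directory)."""
    directory, name = os.path.split(path)
    # partition splits at the first dot only: 'T1_r.nii.gz' -> ('T1_r', '.', 'nii.gz')
    stem, dot, extensions = name.partition('.')
    return os.path.join(directory, stem + text + dot + extensions)

print(add_before_extensions('T1_r.nii.gz', '_brain'))  # T1_r_brain.nii.gz
```

A name without any extension simply gets the text appended, because partition then returns empty strings for the dot and the rest.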

How to remove all numbers from a file name or a string - Python3

I have a Python script that walks through all of the directories within the Test Folder (in this case) and removes all numbers at the beginning of each file name. How would I modify my script to remove numbers from the whole file name, not just the beginning or the end of it?
Thanks,
Alex
import os
for root, dirs, files in os.walk("Test Folder", topdown=True):
    for name in files:
        if (name.startswith("01") or name.startswith("02") or name.startswith("03") or name.startswith("04") or name.startswith("04") or name.startswith("05") or name.startswith("06") or name.startswith("07") or name.startswith("08") or name.startswith("09") or name[0].isdigit()):
            old_filepath = (os.path.join(root, name))
            _, new_filename = name.split(" ", maxsplit=1)
            new_filepath = (os.path.join(root, new_filename))
            os.rename(old_filepath, new_filepath)

Use a regular expression, particularly re.sub:
>>> import re
>>> filename = '12name34with56numbers78in9it.txt'
>>> re.sub(r'\d', '', filename)
'namewithnumbersinit.txt'
This replaces everything that matches the \d pattern, i.e. every digit, with '', i.e. nothing.
If you want to protect the extension, it gets messier. You have to split the extension from the string, replace the digits in the first part, then join the extension back on. os.path.splitext can help you with that:
>>> import os
>>> filename = '12name34with56numbers78in9it.mp3'
>>> name, ext = os.path.splitext(filename)
>>> re.sub(r'\d+', '', name) + ext
'namewithnumbersinit.mp3'
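
To tie the splitext and re.sub steps above back to the directory walk in the question, here is a rough sketch. The helper names `strip_digits` and `rename_without_digits` are invented, and the choices to trim leftover spaces, to keep names that consist only of digits, and to skip files whose new name already exists are my assumptions, not something the question asks for:

```python
import os
import re

def strip_digits(filename):
    """Remove every digit from the name part, leaving the extension alone."""
    name, ext = os.path.splitext(filename)
    # Also trim spaces left behind, e.g. '01 song' -> 'song'.
    new_name = re.sub(r'\d+', '', name).strip()
    # Keep the original name if nothing but digits would be left.
    return (new_name or name) + ext

def rename_without_digits(top):
    """Walk `top` and rename each file whose name contains digits."""
    for root, dirs, files in os.walk(top):
        for filename in files:
            new_name = strip_digits(filename)
            target = os.path.join(root, new_name)
            # Leave the file alone if nothing changes or the target is taken.
            if new_name == filename or os.path.exists(target):
                continue
            os.rename(os.path.join(root, filename), target)

print(strip_digits('12name34with56numbers78in9it.mp3'))  # namewithnumbersinit.mp3
```

Calling rename_without_digits("Test Folder") would then process the folder from the question.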

You can do this:
filename = "this2has8numbers323in5it"
filename = "".join(char for char in filename if not char.isdigit())
No imports necessary.

import os
def rename_files(directory):
    # Get the file names from the directory
    file_names = os.listdir(directory)
    # Translation table that deletes every digit
    # (str.translate(None, '0123456789') only works in Python 2)
    remove_digits = str.maketrans('', '', '0123456789')
    for file_name in file_names:
        new_name = file_name.translate(remove_digits)
        # Join with the directory so the current working directory does not matter
        os.rename(os.path.join(directory, file_name), os.path.join(directory, new_name))
rename_files("your directory path")

Renaming multiple images with .rename and .endswith

I've been trying to get this to work, but I feel like I'm missing something. There is a large collection of images in a folder, and I need to rename just part of each filename. For example, I'm trying to rename "RJ_200", "RJ_600", and "RJ_601" all to the same "RJ_500", while keeping the rest of the filename intact.
Image01.Food.RJ_200.jpg
Image02.Food.RJ_200.jpg
Image03.Basket.RJ_600.jpg
Image04.Basket.RJ_600.jpg
Image05.Cup.RJ_601.jpg
Image06.Cup.RJ_602.jpg
This is what I have so far, but it keeps just giving me the "else" instead of actually renaming any of them:
import os
import fnmatch
import sys
user_profile = os.environ['USERPROFILE']
dir = user_profile + "\Desktop" + "\Working"
print (os.listdir(dir))
for images in dir:
    if images.endswith("RJ_***.jpg"):
        os.rename("RJ_***.jpg", "RJ_500.jpg")
    else:
        print ("Arg!")

The Python string method endswith does not do pattern-matching with *, so you're looking for filenames which explicitly include the asterisk character and not finding any.
Try using regular expressions to match your filenames and then building your target filename explicitly:
import os
import re
patt = r'RJ_\d\d\d'
user_profile = os.environ['USERPROFILE']
path = os.path.join(user_profile, "Desktop", "Working")
image_files = os.listdir(path)
for filename in image_files:
    flds = filename.split('.')
    try:
        frag = flds[2]
    except IndexError:
        continue
    if re.match(patt, flds[2]):
        from_name = os.path.join(path, filename)
        to_name = '.'.join([flds[0], flds[1], 'RJ_500', 'jpg'])
        os.rename(from_name, os.path.join(path, to_name))
Note that you need to do your matching with the file's basename and join on the rest of the path later.

You don't need to use .endswith. You can split the image file name up using .split and check the results. Since there are several suffix strings involved, I've put them all into a set for fast membership testing.
import os
suffixes = {"RJ_200", "RJ_600", "RJ_601"}
new_suffix = "RJ_500"
user_profile = os.environ["USERPROFILE"]
dir = os.path.join(user_profile, "Desktop", "Working")
for image_name in os.listdir(dir):
    pieces = image_name.split(".")
    # Skip names that do not have the four expected dot-separated parts
    if len(pieces) == 4 and pieces[2] in suffixes:
        from_path = os.path.join(dir, image_name)
        new_name = ".".join([pieces[0], pieces[1], new_suffix, pieces[3]])
        to_path = os.path.join(dir, new_name)
        print("renaming {} to {}".format(from_path, to_path))
        os.rename(from_path, to_path)
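
If you would rather not split the name into pieces, a single regular-expression substitution can do the replacement. This is a sketch only: it assumes every file follows the `<something>.RJ_<three digits>.jpg` scheme from the question, it rewrites any three-digit code (so RJ_602 as well, which the set above does not include), and `RJ_CODE` and `with_new_code` are made-up names:

```python
import re

# Assumed naming scheme: "<anything>.RJ_<three digits>.jpg", as in the question.
RJ_CODE = re.compile(r'RJ_\d{3}(?=\.jpg$)')

def with_new_code(filename, new_code='RJ_500'):
    """Replace the RJ_xxx part just before '.jpg'; return None if it is absent."""
    new_name, count = RJ_CODE.subn(new_code, filename)
    return new_name if count else None

print(with_new_code('Image05.Cup.RJ_601.jpg'))  # Image05.Cup.RJ_500.jpg
print(with_new_code('notes.txt'))               # None
```

The None result makes it easy to skip files that do not match before calling os.rename.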

How do I change the name of a file path correctly in Python?

My code
specFileName = input("Enter the file path of the program you would like to capslock: ")
inFile = open(specFileName, 'r')
ified = inFile.read().upper()
outFile = open(specFileName + "UPPER", 'w')
outFile.write(ified)
outFile.close()
print(inFile.read())
This is basically made to take in any file, capitalize everything, and put it into a new file called UPPER"filename". How do I add the "UPPER" bit to the variable without it ending up at the very end or the very beginning? It won't work like that because of the rest of the file path at the beginning and the file extension at the end. For example, C:/users/me/directory/file.txt would become C:/users/me/directory/UPPERfile.txt

Look into the methods os.path.split and os.path.splitext from the os.path module.
Also, quick reminder: don't forget to close your "infile".

Depending on exactly how you're trying to do this, there are several approaches.
First of all you probably want to grab just the filename, not the whole path. Do this with os.path.split.
>>> pathname = r"C:\windows\system32\test.txt"
>>> os.path.split(pathname)
('C:\\windows\\system32', 'test.txt')
Then you can also look at os.path.splitext
>>> filename = "test.old.txt"
>>> os.path.splitext(filename)
('test.old', '.txt')
And finally string formatting would be good
>>> test_string = "Hello, {}"
>>> test_string.format("world") + ".txt"
'Hello, world.txt'
Put 'em together and you've probably got something like:
import os

def make_upper(filename, new_filename):
    with open(filename) as infile:
        data = infile.read()
    # Open the output file in write mode; the default mode is read-only
    with open(new_filename, 'w') as outfile:
        outfile.write(data.upper())

def main():
    user_in = input("What's the path to your file? ")
    path = user_in # just for clarity
    root, filename = os.path.split(user_in)
    head, tail = os.path.splitext(filename)
    new_filename = "UPPER{}{}".format(head, tail)
    new_path = os.path.join(root, new_filename)
    make_upper(path, new_path)
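
For what it's worth, pathlib can build the new path in one step, since with_name swaps out only the last component. A minimal sketch, where `upper_path` is a name I made up and the input path is just an example:

```python
from pathlib import Path

def upper_path(path):
    """Return the sibling path whose file name is prefixed with 'UPPER'."""
    p = Path(path)
    # with_name keeps the directory part and replaces only the file name.
    return p.with_name('UPPER' + p.name)

print(upper_path('directory/file.txt').name)  # UPPERfile.txt
```

The returned Path can be passed straight to open(), so it drops into the make_upper function above in place of the split/format/join steps.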
