Strip and split at the same time in Python

I'm trying to split and strip a string at the same time.
I have a file D:\printLogs\newPrintLogs\4.txt. I want to split the path so that I get only 4.txt, then strip the .txt and append ".zpr" to get "4.zpr".
This is the code that I tried to use:
name = str(logfile)
print ("File name: " + name.split('\\')[-1] + name.strip( '.txt' ))
But I get this output:
File name: 4.txtD:\printLogs\newPrintLogs\4

Don't use stripping and splitting.
First of all, stripping removes all characters from a set: you are removing every 't', 'x' and '.' character from the start and end of your string, regardless of order:
>>> 'tttx.foox'.strip('.txt')
'foo'
>>> 'tttx.foox'.strip('xt.')
'foo'
Secondly, Python offers you the os.path module for handling paths in a cross-platform and consistent manner:
basename = os.path.basename(logfile)
if basename.endswith('.txt'):
    basename = os.path.splitext(basename)[0]
You can drop the str.endswith() test if you just want to remove any extension:
basename = os.path.splitext(os.path.basename(logfile))[0]
Demo:
>>> import os.path
>>> logfile = r'D:\printLogs\newPrintLogs\4.txt'
>>> os.path.splitext(os.path.basename(logfile))[0]
'4'
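The question ultimately asks for "4.zpr" rather than just "4". A minimal sketch of that last step, assuming the path is always Windows-style; ntpath is the Windows flavour of os.path and can be imported on any platform. The helper name swap_extension is made up for illustration.

```python
import ntpath  # Windows flavour of os.path; importable on every platform


def swap_extension(path, new_ext):
    """Return the file name of `path` with its extension replaced by `new_ext`.

    Hypothetical helper, not part of the standard library.
    """
    basename = ntpath.basename(path)             # '4.txt'
    stem, _old_ext = ntpath.splitext(basename)   # ('4', '.txt')
    return stem + new_ext


print(swap_extension(r'D:\printLogs\newPrintLogs\4.txt', '.zpr'))  # 4.zpr
```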

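You're adding too much there. For this file name, this is all you need (note that strip('.txt') removes characters rather than a suffix, so other names may lose more than the extension):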
print ("File name: " + name.split('\\')[-1].strip( '.txt' ))
Better yet, use the os module:
>>> import os
>>> os.path.splitext(os.path.basename(r'D:\printLogs\newPrintLogs\4.txt'))[0]
'4'
Or, split up among several steps, with occasional feedback:
>>> import os
>>> name = r'D:\printLogs\newPrintLogs\4.txt'
>>> basename = os.path.basename(name)
>>> basename
'4.txt'
>>> splitname = os.path.splitext(basename)
>>> splitname
('4', '.txt')
>>> splitname[0]
'4'

Thank you all for your solutions. They helped me, but at first I didn't explain the question properly.
I found a solution for my problem:
name = str(logfile)
print ("Part name: " + name.split('\\')[-1].replace('.txt','.zpr'))

For python 3.4 or later:
import pathlib
name = r"D:\printLogs\newPrintLogs\4.txt"
stem = pathlib.Path(name).stem
print(stem) # prints 4
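If the goal is the renamed file rather than the bare stem, with_suffix() may be a closer fit. A small sketch, assuming the path is Windows-style; PureWindowsPath is used so the backslashes are parsed the same way on any operating system.

```python
import pathlib

name = r"D:\printLogs\newPrintLogs\4.txt"
path = pathlib.PureWindowsPath(name)

# Swap the extension, then keep only the file name part.
new_name = path.with_suffix(".zpr").name
print(new_name)  # 4.zpr
```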

You can split and rstrip:
print(s.rsplit("\\",1)[1].rstrip(".txt"))
But it may be safer to split on the .:
print(s.rsplit("\\",1)[1].rsplit(".",1)[0])
If you rstrip or strip, you could end up removing more than just the .txt.
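To illustrate how rstrip can remove too much, here is a short sketch with a made-up file name, followed by one possible suffix check. The helper name drop_suffix is hypothetical; on Python 3.9 and later, str.removesuffix() does the same job.

```python
def drop_suffix(text, suffix):
    """Remove `suffix` from the end of `text` only if it is really there."""
    if suffix and text.endswith(suffix):
        return text[:-len(suffix)]
    return text


# rstrip() treats its argument as a set of characters, not as a suffix.
print("matt.txt".rstrip(".txt"))        # ma
print(drop_suffix("matt.txt", ".txt"))  # matt
```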

Related

Absolute path of string that contains special characters

In my code, I get a path from the database that may contain special escape characters, and I need to convert it to a real path name. I'm using Python 3.7 on Windows.
Suppose this path: C:\Files\2c2b2541\00025\test.x
IMPORTANT: the path is not a fixed value in the code and it is an output of executing a Stored Procedure from pyodbc.
When I try to convert it to an absolute path I get this error:
ValueError: _getfullpathname: embedded null character in path
I also tried to replace "\" with "/" but with no luck.
import os
# path = cursor.execute(query, "some_input").fetchone()[0]
path = 'C:\Files\2c2b2541\00025\test.x'
print(os.path.abspath(path))
Judging by your comments on the other answers, it sounds like the data is already corrupted in the database you're using. That is, you have a literal null byte stored there, and perhaps other bogus bytes (like \2 perhaps turning into \x02). So you probably need two fixes.
First, you should fix whatever code is putting values into the database, so it won't put bogus data in any more. You haven't described how the data gets into the database, so we can't give you much guidance on how to do this. But most programming languages (and DB libraries) have tools to prevent escape sequences from being evaluated in strings where they're not wanted.
Once you've stopped new bad data from getting added, you can work on fixing the values that are already in the database. It probably shouldn't be too hard to write a query that will replace \0 null bytes with \\0 (or whatever the appropriate escape sequence is for your DB). You may want to look for special characters like newlines (\n) and unprintable characters (like \x02) as well.
I'd only try to fix this issue on the output end if you don't have any control of the database at all.
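Before repairing anything, it can help to see which values are affected. The following is only a rough sketch of such a check on the Python side. The function name find_control_chars and the sample value are assumptions for illustration; real values would come from the database cursor.

```python
def find_control_chars(value):
    """Return (index, character) pairs for ASCII control characters in `value`."""
    return [(i, ch) for i, ch in enumerate(value) if ord(ch) < 32 or ord(ch) == 127]


# Sample value mimicking the corrupted path from the question (assumed data).
corrupted = 'C:\\Files\x02c2b2541\x0025\test.x'

for index, ch in find_control_chars(corrupted):
    print(index, repr(ch))
```

An empty result means the value contains no control characters and can be passed to os.path.abspath() as it is.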
I think the below is the right way to solve your problem.
>>> def get_fixed_path(path):
...     path = repr(path)
...     path = path.replace("\\", "\\\\")
...     path = path.replace("\\x", "\\\\0")
...     path = os.path.abspath(path).split("'")[1]
...     return path
...
>>>
>>> path = 'C:\Files\2c2b2541\00025\test.x'
>>> path
'C:\\Files\x02c2b2541\x0025\test.x'
>>>
>>> print(path)
C:\Filesc2b2541 25 est.x
>>>
>>> final_path = get_fixed_path(path)
>>> final_path
'C:\\Files\\002c2b2541\\00025\\test.x'
>>>
>>> print(final_path)
C:\Files\002c2b2541\00025\test.x
>>>
Here is a detailed description of each step in the above solution.
First step (problem)
>>> import os
>>>
>>> path = 'C:\Files\2c2b2541\00025\test.x'
>>> path
'C:\\Files\x02c2b2541\x0025\test.x'
>>>
>>> print(path)
C:\Filesc2b2541 25 est.x
>>>
Second step (problem)
>>> path2 = repr(path)
>>> path2
"'C:\\\\Files\\x02c2b2541\\x0025\\test.x'"
>>>
>>> print(path2)
'C:\\Files\x02c2b2541\x0025\test.x'
>>>
Third step (problem)
>>> path3 = path2.replace("\\", "\\\\")
>>> path3
"'C:\\\\\\\\Files\\\\x02c2b2541\\\\x0025\\\\test.x'"
>>>
>>> print(path3)
'C:\\\\Files\\x02c2b2541\\x0025\\test.x'
>>>
>>> path3 = path3.replace("\\x", "\\\\0")
>>> path3
"'C:\\\\\\\\Files\\\\\\002c2b2541\\\\\\00025\\\\test.x'"
>>>
>>> print(path3)
'C:\\\\Files\\\002c2b2541\\\00025\\test.x'
>>>
Fourth step (problem)
>>> os.path.abspath(path3)
"C:\\Users\\RISHIKESH\\'C:\\Files\\002c2b2541\\00025\\test.x'"
>>>
>>> os.path.abspath(path2)
"C:\\Users\\RISHIKESH\\'C:\\Files\\x02c2b2541\\x0025\\test.x'"
>>>
>>> os.path.abspath('k')
'C:\\Users\\RISHIKESH\\k'
>>>
>>> os.path.abspath(path3).split("'")
['C:\\Users\\RISHIKESH\\', 'C:\\Files\\002c2b2541\\00025\\test.x', '']
>>> os.path.abspath(path3).split("'")[1]
'C:\\Files\\002c2b2541\\00025\\test.x'
>>>
Final step (solution)
>>> final_path = os.path.abspath(path3).split("'")[1]
>>>
>>> final_path
'C:\\Files\\002c2b2541\\00025\\test.x'
>>>
>>> print(final_path)
C:\Files\002c2b2541\00025\test.x
>>>
Replace "\" by "\\".
That's it.
You need to either use a raw string literal or double backslashes \\.
import os
path = r'C:\Files\2c2b2541\00025\test.x' #r before the string
print(os.path.abspath(path))
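A quick check, as a sketch, that the two spellings produce the same string and that neither contains the null character that os.path.abspath() complained about:

```python
raw = r'C:\Files\2c2b2541\00025\test.x'
doubled = 'C:\\Files\\2c2b2541\\00025\\test.x'

print(raw == doubled)  # True
print('\x00' in raw)   # False
```

This only concerns literals written in source code. A value read from a database at run time is not changed by how literals are spelled.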

How to add numbers in front of each file without touching the filename using Python?

I have some files (800+) in a folder as shown below:
test_folder
1_one.txt
2_two.txt
3_three.txt
4_power.txt
5_edge.txt
6_mobile.txt
7_test.txt
8_power1.txt
9_like.txt
10_port.txt
11_fire.txt
12_water.txt
I want to rename all these files using python like this:
test_folder
001_one.txt
002_two.txt
003_three.txt
004_power.txt
005_edge.txt
006_mobile.txt
007_test.txt
008_power1.txt
009_like.txt
010_port.txt
011_fire.txt
012_water.txt
Can we do this with Python? Please guide on how to do this.
Use zfill to pad zeros
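import glob
import os

src_folder = r"/user/bin/"
for file_path in glob.glob(os.path.join(src_folder, "*.txt")):
    # Split only the file name, not the folder part of the path
    folder, file_name = os.path.split(file_path)
    lst = file_name.split('_')
    if len(lst) > 1:
        try:
            int(lst[0])  # skip files whose prefix is not a number
        except ValueError:
            continue
        lst[0] = lst[0].zfill(3)
        os.rename(file_path, os.path.join(folder, '_'.join(lst)))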
Using zfill:
Split on the underscore _ and then use zfill to pad with zeros:
import os
os.chdir("test_folder")
for filename in os.listdir("."):
    os.rename(filename, filename.split("_")[0].zfill(3) + filename[filename.index('_'):])
Converting to integer:
Only renames if the prefix is a valid integer. Uses format(num, '03') to make sure the integer is padded with the appropriate leading zeros. Renames files such as 1_file.txt and 12_water.txt but skips a_baa.txt etc.
import os
os.chdir(r"E:\pythontest")
for filename in os.listdir("."):
    try:
        num = int(filename.split("_")[0])
        os.rename(filename, format(num, '03') + filename[filename.index('_'):])
    except ValueError:
        # Raised for a non-numeric prefix or a name without an underscore
        print('Skipped ' + filename)
EDIT: Both snippets ensure that if the filename contains multiple underscores then the later ones aren't snipped. So 1_file_new.txt gets renamed to 001_file_new.txt.
Examples:
# Before
'1_one.txt',
'12_twelve.txt',
'13_new_more_underscores.txt',
'a_baaa.txt',
'newfile.txt',
'onlycharacters.txt'
# After
'001_one.txt',
'012_twelve.txt',
'013_new_more_underscores.txt',
'a_baaa.txt',
'newfile.txt',
'onlycharacters.txt'
Here's a quick example to rename the files in the current directory:
import os
for f in os.listdir("."):
    if os.path.isfile(f) and "_" in f:
        # Split only on the first underscore so later ones are kept
        number, suffix = f.split("_", 1)
        if number.isdigit():
            new_name = "%03d_%s" % (int(number), suffix)
            os.rename(f, new_name)
You can use glob.glob() to get a list of text files. Then use a regular expression to ensure that the file being renamed starts with digits and an underscore. Then split the file up and add leading zeros as follows:
import re
import glob
import os
src_folder = r"c:\source folder"
for file_path in glob.glob(os.path.join(src_folder, "*.txt")):
    path, filename = os.path.split(file_path)
    re_file = re.match(r"(\d+)(_.*)", filename)
    if re_file:
        prefix, base = re_file.groups()
        new_filename = os.path.join(path, "{:03}{}".format(int(prefix), base))
        # Rename using the full source path, not just the bare file name
        os.rename(file_path, new_filename)
The {:03} tells Python to zero pad your number to 3 digits. Python's Format Specification Mini-Language is very powerful.
Note os.path.join() is used to safely concatenate path components, so you don't have to worry about trailing separators.
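Since os.rename() changes files on disk, it may be worth computing the new names first and checking them before renaming anything. A minimal sketch of that idea, assuming the same "digits, underscore, rest" naming as in the question; padded_name is a hypothetical helper.

```python
import re


def padded_name(filename, width=3):
    """Return `filename` with its numeric prefix zero-padded, or None if it has none."""
    match = re.match(r"(\d+)(_.*)", filename)
    if match is None:
        return None
    prefix, rest = match.groups()
    return str(int(prefix)).zfill(width) + rest


# Dry run: only print the planned names, nothing is renamed here.
for name in ["1_one.txt", "12_water.txt", "13_new_more_underscores.txt", "a_baaa.txt"]:
    print(name, "->", padded_name(name))
```

Once the printed names look right, the same function could feed os.rename() inside one of the loops above.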

strip() function doesn't remove trailing numbers

I tried the following code, but it fails to remove the trailing digits (Python 3.4.3):
file_name = "48athens22.jpg"
result = file_name.strip("0123456789")
print (result)
Output:
athens22.jpg
What has gone wrong?
strip() only strips from the start and end of a string; the 22 is not at the end of the string, because ".jpg" follows it.
Here's how to do what you want:
import os
def strip_filename(filename):
    root, ext = os.path.splitext(filename)
    root = root.strip('0123456789')
    return root + ext

print(strip_filename('48athens22.jpg')) # athens.jpg
strip only removes from the beginning and end of the string. Try re.sub instead, if you want to remove any occurrences of a substring or a pattern.
E.g.
re.sub('[0-9]', '', file_name)
Those numbers aren't trailing. They come before the '.jpg'.
file_name = "48athens22.jpg"
name, *extension = file_name.rpartition('.')
result = name.strip("0123456789") + ''.join(extension)
print (result)
Works for me:
file_name = "48athens22.jpg1234"
result = file_name.strip("0123456789")
print(result)
Gives:
athens22.jpg
If you want to remove all digits, try:
import re
file_name = "48athens22.jpg1234"
result = re.sub(r'\d+', "", file_name)
print(result)
Gives:
athens.jpg
If you only want to remove digits before the ".", try:
result = re.sub(r'\d+\.', ".", file_name)
There are no trailing numbers: the last character in your string is 'g', and 22 is actually in the middle. If you do not want to consider the extension when stripping, you will have to first split file_name on '.', then strip the first part, and then rejoin them.
Code -
filenames = file_name.split('.')
result = filenames[0].strip('0123456789') + '.' + '.'.join(filenames[1:])
print(result)
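If only the digits directly before the extension should go, while any leading digits stay, rstrip() on the stem is one option. This is a sketch under that assumption, since the question does not say which result is wanted; rstrip_stem_digits is a hypothetical name.

```python
import os


def rstrip_stem_digits(filename):
    """Remove digits from the end of the stem, leaving the extension untouched."""
    stem, ext = os.path.splitext(filename)
    return stem.rstrip("0123456789") + ext


print(rstrip_stem_digits("48athens22.jpg"))  # 48athens.jpg
```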

python strip function is not giving expected output

I have the code below, in which the filenames are FR1.1.csv, FR2.0.csv etc. I am using these names in the header row, but I want to change them to FR1.1, FR2.0 and so on. Hence I am using the strip function to remove .csv. When I tried it at the command prompt it worked fine, but when I added it to the main script it did not give the expected output.
for fname in filenames:
    print "fname : ", fname
    fname.strip('.csv');
    print "after strip fname: ", fname
    headerline.append(fname+' Compile');
    headerline.append(fname+' Run');
Output I am getting:
fname :FR1.1.csv
after strip fname: FR1.1.csv
Required output:
fname :FR1.1.csv
after strip fname: FR1.1
I guess there is some indentation problem in my code after the for loop.
Please tell me the correct way to achieve this.
Strings are immutable, so string methods can't change the original string, they return a new one which you need to assign again:
fname = fname.strip('.csv') # no semicolons in Python!
But this call doesn't do what you probably expect it to. It will remove all the leading and trailing characters c, s, v and . from your string:
>>> "cross.csv".strip(".csv")
'ro'
So you probably want to do
import re
fname = re.sub(r"\.csv$", "", fname)
Strings are immutable. strip() returns a new string.
>>> "FR1.1.csv".strip('.csv')
'FR1.1'
>>> m = "FR1.1.csv".strip('.csv')
>>> print(m)
FR1.1
You need to do fname = fname.strip('.csv').
And get rid of the semicolons in the end!
P.S - Please see Jon Clement's comment and Tim Pietzcker's answer to know why this code should not be used.
You probably should use os.path for path manipulations:
import os
#...
for fname in filenames:
    print "fname : ", fname
    fname = os.path.splitext(fname)[0]
    #...
The particular reason why your code fails is provided in other answers.
Change
fname.strip('.csv')
to
fname = fname.strip('.csv')
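Putting the pieces together for the header row from the question, here is a sketch that uses os.path.splitext() instead of strip(), so that only the extension is removed. The function name build_header is an assumption; the column suffixes are taken from the question.

```python
import os


def build_header(filenames):
    """Build header cells like 'FR1.1 Compile' and 'FR1.1 Run' for each file name."""
    headerline = []
    for fname in filenames:
        stem = os.path.splitext(fname)[0]  # drops only the last extension
        headerline.append(stem + ' Compile')
        headerline.append(stem + ' Run')
    return headerline


print(build_header(['FR1.1.csv', 'FR2.0.csv']))
```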

Adding a simple value to a string

If I have a string, let's say
path2 = '"C:\\Users\\bgbesase\\Documents\\Brent\\Code\\Visual Studio'
And I want to add a " at the end of the string how do I do that? Right now I have it like this.
path2 = '"C:\\Users\\bgbesase\\Documents\\Brent\\Code\\Visual Studio'
w = '"'
final = os.path.join(path2, w)
print final
However when it prints it out, this is what is returned:
"C:\Users\bgbesase\Documents\Brent\Code\Visual Studio\"
I don't need the \ I only want the "
Thanks for any help in advance.
How about?
path2 = '"C:\\Users\\bgbesase\\Documents\\Brent\\Code\\Visual Studio' + '"'
Or, as you had it
final = path2 + w
It's also worth mentioning that you can use raw strings (r'stuff') to avoid having to escape backslashes. Ex.
path2 = r'"C:\Users\bgbesase\Documents\Brent\Code\Visual Studio'
just do:
path2 = '"C:\\Users\\bgbesase\\Documents\\Brent\\Code\\Visual Studio' + '"'
I think the path2+w is the simplest answer here but you can also use string formatting to make it more readable:
>>> path2 = '"C:\\Users\\bgbesase\\Documents\\Brent\\Code\\Visual Studio'
>>> '{}"'.format(path2)
'"C:\\Users\\bgbesase\\Documents\\Brent\\Code\\Visual Studio"'
If path2 is long, string formatting is much easier to use than adding a + at the end of the string.
>>> path2 = '"C:\\Users\\bgbesase\\Documents\\Brent\\Code\\Visual Studio\\Documents\\Brent\\Code\\Visual Studio\\Documents\\Brent\\Code\\Visual Studio'
>>> w = '"'
>>> "{}{}".format(path2,w)
'"C:\\Users\\bgbesase\\Documents\\Brent\\Code\\Visual Studio\\Documents\\Brent\\Code\\Visual Studio\\Documents\\Brent\\Code\\Visual Studio"'
From the "Common pathname manipulations" section of the Python documentation:
The return value is the concatenation of path1, and optionally path2,
etc., with exactly one directory separator (os.sep) following each
non-empty part except the last.
In this case, os.path.join() treats your string '"' as a path component and adds the separator. Since you are not joining two parts of a path, you need to use string concatenation or string formatting.
The simplest would be just to add two strings:
final = path2 + '"'
You can actually modify path2 using += operator:
path2 += '"'
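The difference can be seen side by side. A small sketch; ntpath (the Windows flavour of os.path) is used here so the result matches the question on any platform.

```python
import ntpath

path2 = '"C:\\Users\\bgbesase\\Documents\\Brent\\Code\\Visual Studio'

joined = ntpath.join(path2, '"')  # treats '"' as a path component
quoted = path2 + '"'              # plain string concatenation

print(joined)  # ends with \"
print(quoted)  # ends with "
```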
