Rename invalid filename in XP via Python - python

My problem is similar to "Python's os.path choking on Hebrew filenames".
However, unlike the author of that post (who knew the names were originally Hebrew), I don't know the original encoding of the filenames I need to rename.
I was doing data recovery for a client and copied over the files to my XP SP3 machine,
and some of the file names have "?" replacing/representing invalid characters.
I tried to use Python's os.rename on the files, since I know Python has Unicode support. However, when I tell Python to rename the files, it seems unable to pass a valid filename back to the Windows API.
For example:
>>> os.chdir(r'F:\recovery\My Music')
>>> os.listdir(u'.')
[u'Don?t Be Them.mp3', u'That?s A Soldier.mp3']
>>> blah=os.listdir(u'.')
>>> blah[0]
Don?t Be Them.mp3
>>> os.rename(blah[0],'dont be them.mp3')
Traceback (most recent call last):
File "<pyshell#6>", line 1, in <module>
os.rename(blah[0],'dont be them.mp3')
WindowsError: [Error 123] The filename, directory name, or
volume label syntax is incorrect
I'm using Python 2.6, on Win XP SP3, with whatever encoding is standard XP behavior for US/English.
Is there a way to handle these renames without knowing the original language?

'?' is not a valid character in filenames. That is the reason why your approach failed.
You may try to use DOS short filenames:
import os
import win32api
# assumes the current directory is F:\recovery\My Music (e.g. after os.chdir), since the names below are relative
filelist = win32api.FindFiles(r'F:/recovery/My Music/*.*')
# this will extract "short names" from WIN32_FIND_DATA structure
filelist = [i[9] if i[9] else i[8] for i in filelist]
# EXAMPLE:
# this should rename all files in 'filelist' to 0.mp3, 1.mp3, 2.mp3, ...
for (number, filename) in enumerate(filelist):
    os.rename(filename, '%d.mp3' % (number))
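As a side note on why '?' is rejected: the characters Windows refuses in a file name can be checked before calling os.rename. The sketch below is not part of the answer above. The helper name invalid_windows_chars is made up for illustration, and the character set is the one documented for Win32 file names (< > : " / \ | ? * plus the control characters 0-31).

```python
# Characters that Win32 does not allow in a file name component.
WINDOWS_RESERVED = set('<>:"/\\|?*')

def invalid_windows_chars(name):
    """Return the sorted list of characters in `name` that Windows rejects."""
    return sorted(
        ch for ch in set(name)
        if ch in WINDOWS_RESERVED or ord(ch) < 32
    )

# Names taken from the question's directory listing.
print(invalid_windows_chars(u"Don?t Be Them.mp3"))
print(invalid_windows_chars(u"dont be them.mp3"))
```

If the name returned by os.listdir contains one of these characters, it may already be a lossy substitute for what is stored on disk, which is why going through the short names can help.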

Try passing a unicode string:
os.rename(blah[0], u'dont be them.mp3')

Related

Python FileNotFoundError how to handle long filenames

I have a weird problem. I can neither rename specific files nor remove them. I get a FileNotFoundError.
Similar questions have been asked before. The solution there was to use the full path and not just the filename.
My script worked before using only the filenames, but with different files I get this error, even when using the full path.
It seems that the filename is causing the error, but I cannot resolve it.
import os
cwd = os.getcwd()
file = "003de5664668f009cbaa7944fe188ee1_recursion1.c_2016-04-21-21-06-11_9bacb48fecd32b8cb99238721e7e27a3."
change = "student_1_recursion1.c_2016-04-21-21-06-11_9bacb48fecd32b8cb99238721e7e27a3."
oldname = os.path.join(cwd,file)
newname = os.path.join(cwd,change)
print(file in os.listdir())
print(os.path.isfile(file))
os.rename(oldname, newname)
I get the following output:
True
False
Traceback (most recent call last):
File "C:\Users\X\Desktop\code\sub\test.py", line 13, in <module>
os.rename(oldname, newname)
FileNotFoundError: [WinError 2] The system cannot find the file specified: 'C:\\Users\\X\\Desktop\\code\\sub\\003de5664668f009cbaa7944fe188ee1_recursion1.c_2016-04-21-21-06-11_9bacb48fecd32b8cb99238721e7e27a3.' -> 'C:\\Users\\X\\Desktop\\code\\sub\\student_1_recursion1.c_2016-04-21-21-06-11_9bacb48fecd32b8cb99238721e7e27a3.'
[Finished in 0.4s with exit code 1]
The file does exist: Windows search finds it in the folder.
If I try to use the full path, I also get a Windows error saying the file cannot be found.
I have also tried prepending a unicode prefix (u'' + filename) to the strings, because a user suggested it.
The path length is < 260, so what is causing the problem?
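This is a Windows/Python thing. Filenames with a trailing period are sometimes trimmed before Windows looks the file up, so the name that is searched for no longer matches the name on disk.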
If this is a once-off task, you can use two trailing periods as a workaround.
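Another approach that is often suggested for names like this is the Win32 extended-length prefix \\?\, which Microsoft documents as turning off path normalization (including the trimming of trailing periods and spaces). The following is a sketch under that assumption. extended_length_path is a hypothetical helper, not a standard-library function, and it expects an absolute drive-letter path that already uses backslashes.

```python
import ntpath  # Windows path rules, importable on any platform

PREFIX = "\\\\?\\"  # the four characters \\?\

def extended_length_path(path):
    """Return `path` with the Win32 extended-length prefix prepended."""
    if path.startswith(PREFIX):
        return path
    if not ntpath.isabs(path):
        # The prefix disables relative-path handling, so insist on an absolute path.
        raise ValueError("expected an absolute Windows path: %r" % path)
    return PREFIX + path

# Hypothetical file name ending in a period, for illustration only.
print(extended_length_path("C:\\Users\\X\\Desktop\\code\\sub\\old_name."))
```

On Windows one would then try os.rename(extended_length_path(oldname), extended_length_path(newname)), with oldname and newname built as in the question.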
This isn't exactly an answer (I lack the rep for that) but...
Two thoughts:
A) Are those file names supposed to end with periods?
B) Instead of escaping backslashes, you can use forward slashes here (i.e., C:/.../.../...)

Python pathlib glob function fails on WindowsError: [123]?

I've written the following python function that returns a python list of File Geodatabase Paths. Please note that input_folder is a raw string and contains no unicode characters.
try:
    gdbs = list(Path(input_folder).glob('**/*.gdb'))
    for gdb in gdbs:
        print(gdb)
except WindowsError as e:
    print("error")
The problem that I'm having is that pathlib glob method is failing when it encounters unicode characters in the path of files in the directory.
I tried the following but it still fails, which I assume is because I'm not converting the paths the glob generator is coming across.
try:
    gdbs = list(Path(unicode(input_folder)).glob('**/*.gdb'))
    for gdb in gdbs:
        print(gdb)
except WindowsError as e:
    print("error")
The error message that is returned is:
WindowsError: [Error 123] The filename, directory name, or volume label syntax is incorrect: 'R:\\Data\\Africa\\Tanzania\\fromDropbox\\DART\\BRT Phase 2-3 designs\\1.12 Engineering Drawings for Service\\ROAD LIGHT\\PDF\\01.Traffic Sign(Kilwa)-??04.pdf'
Any help to handle the following error will be appreciated.
Try this:
input_folder = r'R:\Data\Africa\Tanzania\fromDropbox\DART\BRT Phase 2-3 designs\1.12 Engineering Drawings for Service\ROAD LIGHT\PDF\01.Traffic Sign(Kilwa)-??04.pdf'
The path should be written as a raw string (an 'r' in front of the literal), using single backslashes.
It seems to be a problem with pathlib on Python 2.7, which is not able to handle non-ASCII characters in these paths. See "pathlib chokes up on international characters on Python 2 on Windows".
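One workaround that avoids pathlib's glob entirely is to walk the tree with os.walk and match names with fnmatch. This is a swapped-in technique rather than a fix for pathlib itself, and find_gdbs is a hypothetical helper name. On Python 2 the root would need to be passed as a unicode string so that os.walk returns unicode names; on Python 3 every str is already Unicode.

```python
import fnmatch
import os

def find_gdbs(root, pattern="*.gdb"):
    """Return paths under `root` whose final component matches `pattern`.

    Both directories and files are checked, which is roughly what
    Path(root).glob('**/*.gdb') would yield.
    """
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if fnmatch.fnmatch(name, pattern):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)

if __name__ == "__main__":
    # Lists matches below the current directory; substitute the real folder.
    for gdb in find_gdbs(u"."):
        print(gdb)
```

Because each name is only compared against the pattern and never re-encoded, a file with an unusual name elsewhere in the tree should not stop the search.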

File not found Error in reading text in python [duplicate]

This question already has answers here:
How should I write a Windows path in a Python string literal?
I am trying to read a text file on my hard drive via python with the following script:
fileref = open("H:\CloudandBigData\finalproj\BeautifulSoup\twitter.txt","r")
But it is giving the following error:
IOError Traceback (most recent call last)
<ipython-input-2-4f422ec273ce> in <module>()
----> 1 fileref = open("H:\CloudandBigData\finalproj\BeautifulSoup\twitter.txt","r")
IOError: [Errno 2] No such file or directory: 'H:\\CloudandBigData\x0cinalproj\\BeautifulSoup\twitter.txt'
I also tried with other way:
with open('H:\CloudandBigData\finalproj\BeautifulSoup\twitter.txt', 'r') as f:
    print f.read()
Ended up with the same error. The text file is present in the directory specified.
Replace
fileref = open("H:\CloudandBigData\finalproj\BeautifulSoup\twitter.txt","r")
with
fileref = open(r"H:\CloudandBigData\finalproj\BeautifulSoup\twitter.txt","r")
Here, I have created a raw string (r""). This will cause things like "\t" to not be interpreted as a tab character.
Another way to do it without a raw string is
fileref = open("H:\\CloudandBigData\\finalproj\\BeautifulSoup\\twitter.txt","r")
This escapes the backslashes (i.e. "\\" => \).
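To see the difference between the three spellings, the strings can be compared directly. This is a small illustrative sketch using the path from the question; the variable names are made up here.

```python
# Only "\f" and "\t" are left unescaped here, because those are the two
# sequences Python actually recognises in the original literal; the other
# backslashes are doubled so the sketch runs without invalid-escape warnings.
broken = "H:\\CloudandBigData\finalproj\\BeautifulSoup\twitter.txt"
raw = r"H:\CloudandBigData\finalproj\BeautifulSoup\twitter.txt"
escaped = "H:\\CloudandBigData\\finalproj\\BeautifulSoup\\twitter.txt"

print(repr(broken))            # shows \x0c (form feed) and \t (tab) inside the path
print(raw == escaped)          # the raw and the escaped spellings are the same string
print(len(raw) - len(broken))  # each recognised escape collapsed two characters into one
```

The repr of the broken string matches the path shown in the IOError message, where "\f" has become \x0c.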
An even better solution is to use the os module:
import os
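# note the backslash after the drive: os.path.join('H:', 'x') would give the drive-relative path 'H:x'
filepath = os.path.join('H:\\', 'CloudandBigData', 'finalproj', 'BeautifulSoup', 'twitter.txt')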
fileref = open(filepath, 'r')
This creates your path in an os-independent way so you don't have to worry about those things.
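One detail worth checking when joining onto a drive letter: on Windows, os.path.join does not insert a separator after a bare drive such as 'H:', so the result is relative to the current directory on that drive. The sketch below uses ntpath (the Windows implementation of os.path, importable on any platform) so that it behaves the same everywhere; the variable names are illustrative.

```python
import ntpath  # the Windows flavour of os.path

parts = ("CloudandBigData", "finalproj", "BeautifulSoup", "twitter.txt")

drive_relative = ntpath.join("H:", *parts)   # no separator after the drive
absolute = ntpath.join("H:\\", *parts)       # drive plus root

print(drive_relative)  # H:CloudandBigData\finalproj\BeautifulSoup\twitter.txt
print(absolute)        # H:\CloudandBigData\finalproj\BeautifulSoup\twitter.txt
```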
One last note... in general, I think you should use the with construct you mentioned in your question... I didn't in the answer for brevity.
I was encountering the same problem. It resulted from the different file path notations.
For example, a file path in Windows is written with backslashes, like: "D:\Python\Project\file.txt"
But Python also accepts the file path with forward slashes, like: "D:/Python/Project/file.txt"
I tried r"filepath.txt", os.path.join and os.path.abspath to no avail. The os library also generates file paths in Windows notation. Then I just resorted to IDE notation.
You don't encounter this error if "file.txt" is located in the same directory, as the filename is appended to the working directory.
PS: I am using Python 3.6 with Spyder IDE on Windows machine.

Error opening a csv file in python from a specific directory

I am very new to Python and I don't have much experience in programming.
I am trying to open a CSV file from a specific directory and I get an error.
import csv
ifile = open('F:\Study\CEN\Mini Project\Data Sets\test.csv', "rb");
Error:
Traceback (most recent call last):
File "<pyshell#6>", line 1, in <module>
ifile = open('F:\Study\CEN\Mini Project\Data Sets\test.csv', "rb");
IOError: [Errno 22] invalid mode ('rb') or filename: 'F:\\Study\\CEN\\Mini Project\\Data Sets\test.csv'
What should I do?
Use forward slashes:
ifile = open('F:/Study/CEN/Mini Project/Data Sets/test.csv', "rb");
Or at least escape your backslashes:
ifile = open('F:\\Study\\CEN\\Mini Project\\Data Sets\\test.csv', "rb");
Another option: use os.path.join:
out = os.path.abspath(os.path.join('path', 'test.csv'))
Your problem is here:
'F:\Study\CEN\Mini Project\Data Sets\test.csv'
                                    ^^
Because you did not use a raw string, Python thinks \t is supposed to mean a tab character.
You can see that in the error message, by the way: Notice how Python translated all the backslashes into double backslashes (which is how a literal backslash needs to be represented in a normal string) in all the places except the one where "backslash plus letter" actually meant something special?
Use
ifile = open(r'F:\Study\CEN\Mini Project\Data Sets\test.csv', "rb")
(and remove the semicolons, you don't need them in Python) and it should work.
Your problem is with the "\t" AND a lack of exposure to the various tools in the os.path module.
The correct and easiest way to deal with this problem is to use os.path.normpath, combined with a raw string literal (the r prefix), which ensures that backslashes are not interpreted as escape characters.
(Documentation on Lexical Analysis in python can be found here: https://docs.python.org/2/reference/lexical_analysis.html)
Open interactive python by typing "python" at the command line, and do the following to see that it's dead simple.
>>> import os
>>> path = r'F:\Study\CEN\Mini Project\Data Sets\test.csv'
>>> os.path.normpath(path)
'F:\\Study\\CEN\\Mini Project\\Data Sets\\test.csv'
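normpath is useful for hardcoded paths in scripts that may have to run on both Windows and Unix (e.g. OS X). On Windows it converts forward slashes to backslashes, so a path written with forward slashes ends up with the right separators in either environment. Note that it does not convert backslashes to forward slashes on Unix.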
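A quick way to see what normpath does with each slash style is to call the Windows and POSIX implementations directly. Both ntpath and posixpath can be imported on any platform, and os.path is simply whichever one matches the running system. The variable names below are illustrative.

```python
import ntpath     # what os.path is on Windows
import posixpath  # what os.path is on Linux and OS X

# On Windows, forward slashes become backslashes and ".." is collapsed.
win = ntpath.normpath("F:/Study/CEN/../CEN/Mini Project/Data Sets/test.csv")

# On POSIX, a backslash is an ordinary character, so nothing is converted.
posix = posixpath.normpath(r"F:\Study\CEN\test.csv")

print(win)
print(posix)
```

So a hardcoded path written with forward slashes is the spelling that normalises correctly on both systems.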
On a side note, if you are working with CSV files, you should use the petl library instead of the csv module. You'll save yourself a lot of time and hassle. Install it with pip install petl

How do I allow opening of files that have Unicode characters in their filenames?

I have this Python script here that opens a random video file in a directory when run:
import glob,random,os
files = glob.glob("*.mkv")
files.extend(glob.glob("*.mp4"))
files.extend(glob.glob("*.tp"))
files.extend(glob.glob("*.avi"))
files.extend(glob.glob("*.ts"))
files.extend(glob.glob("*.flv"))
files.extend(glob.glob("*.mov"))
file = random.choice(files)
print "Opening file %s..." % file
cmd = "rundll32 url.dll,FileProtocolHandler \"" + file + "\""
os.system(cmd)
Source: An answer in my Super User post, 'How do I open a random file in a folder, and set that only files with the specified filename extension(s) should be opened?'
This is called by a BAT file, with this as its script:
C:\Python27\python.exe "C:\Programs\Scripts\open-random-video.py" cd
I put this BAT file in the directory I want to open random videos of.
In most cases it works fine. However, I can't make it open files with Unicode characters (like Japanese or Korean characters in my case) in their filenames.
This is the error message when the BAT file and Python script are run on a directory and open a file with Unicode characters in its filename:
C:\TestDir>openrandomvideo.BAT
C:\TestDir>C:\Python27\python.exe "C:\Programs\Scripts\open-random-video.py" cd
The filename, directory name, or volume label syntax is incorrect.
Note that the filename of the .FLV video file in that log is changed from its original filename (소시.flv) to '∩╗┐' in the command line log.
EDIT: I learned that the above command line error message is due to saving the BAT file as 'UTF-8 with BOM'. Saving it as 'ANSI or UTF-16' shows the following message instead, but still does not open the file:
C:\TestDir>openrandomvideo.BAT
C:\TestDir>C:\Python27\python.exe "C:\Programs\Scripts\open-random-video.py" cd
Opening file ??.flv...
Now, the filename of the .FLV video file in that log is changed from its original filename (소시.flv) to '??.flv.' in the command line log.
I'm using Python 2.7 on Windows 7, 64-bit.
How do I allow opening of files that have Unicode characters in their filenames?
Just use Unicode literals e.g., u".mp4" everywhere. IO functions in Python will return Unicode filenames back if you give them Unicode input (internally they might use Unicode-aware Windows API):
import os
import random
videodir = u"." # get videos from current directory
extensions = tuple(u".mkv .mp4 .tp .avi .ts .flv .mov".split())
files = [file for file in os.listdir(videodir) if file.endswith(extensions)]
if files: # at least one video file exists
    random_file = random.choice(files)
    os.startfile(os.path.join(videodir, random_file)) # start the video
else:
    print('No %s files found in "%s"' % ("|".join(extensions), videodir,))
If you want to emulate how your web browser would open video files then you could use webbrowser.open() instead of os.startfile() though the former might use the latter internally on Windows anyway.
The error when running the BAT file is because the BAT file itself is saved as "UTF-8 with BOM". The "∩╗┐" characters are not a corrupted filename; they are the literal first bytes stored in the BAT file (the UTF-8 byte order mark). Re-save the BAT file as ANSI or UTF-16, which are the only encodings supported for BAT files.
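The connection between the byte order mark and those three characters can be checked from Python. The sketch below assumes the console uses code page 437 (the usual OEM code page for US English Windows); other code pages would display the same bytes differently. The sample command line is shortened for illustration.

```python
import codecs

# The UTF-8 byte order mark is the three bytes EF BB BF.
bom = codecs.BOM_UTF8

# Decoding those bytes with the assumed console code page gives the
# three characters seen in the command line log.
mojibake = bom.decode("cp437")
print(mojibake == u"\u2229\u2557\u2510")  # the characters ∩ ╗ ┐

# A file saved as "UTF-8 with BOM" starts with those bytes; the
# "utf-8-sig" codec strips them when decoding.
data = bom + b"C:\\Python27\\python.exe"
print(data.startswith(codecs.BOM_UTF8))
print(data.decode("utf-8-sig"))
```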
Either use Unicode literals as described by J. F. Sebastian, or use Python 3, which always uses Unicode.
(For Python 3, your script will need a minor modification: print is a function now, so you have to put parentheses around the parameter list.)
Please get into the habit of adding # -*- coding: utf-8 -*- at the top of your source code,
so that Python knows how to interpret the non-ASCII characters in it.
