Openpyxl - trouble naming workbook - python

I have a python script that analyses a file tree and records its findings in an xlsx.
Analysis is going fine, but when I try to record my results, I get an error:
Traceback (most recent call last):
File ".\call_validation.py", line 103, in <module>
wb.save(wb_name)
File "C:\Python\lib\site-packages\openpyxl\workbook\workbook.py", line 298, i
save_workbook(self, filename)
File "C:\Python\lib\site-packages\openpyxl\writer\excel.py", line 196, in sav
writer.save(filename, as_template=as_template)
File "C:\Python\lib\site-packages\openpyxl\writer\excel.py", line 178, in sav
archive = ZipFile(filename, 'w', ZIP_DEFLATED)
File "C:\Python\lib\zipfile.py", line 923, in __init__
self.fp = io.open(file, modeDict[mode])
OSError: [Errno 22] Invalid argument: 'move_generated-2015-05-07 10:08:26.xlsx'
I am generating my filename using datetime.datetime.now() like so:
save_time = str(datetime.datetime.now()).split(".")[0]
wb_name = "move_generated-" + save_time + ".xlsx"
wb.save(wb_name)
I don't believe the filename is too long, since it's only in C:\code\call_flow, and I've tried stripping all the non-alphanumeric characters out of the name. Any ideas?
EDIT: The solution ended up being that I had failed to strip the colons from the time. Building on what @nivix zixer said, I fixed it by replacing
save_time = str(datetime.datetime.now()).split(".")[0]
with
save_time = str(datetime.datetime.now()).split(".")[0].replace(':', '_')

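Perhaps the problem is a character in the filename that Windows does not allow? The timestamp contains a space and colons, and Windows rejects ':' in file names.
Replace str(datetime.datetime.now()).split(".")[0] with this: str(datetime.datetime.now()).split(".")[0].replace(' ', '_').replace(':', '_').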
Glad I could help Will!
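As an alternative to chaining replace() calls, the timestamp can be formatted with strftime so that it never contains a colon or a space in the first place. This is only a sketch: the helper name timestamped_name and its parameters are made up for illustration, and the separator choice is an assumption.

```python
import datetime


def timestamped_name(prefix="move_generated-", ext=".xlsx", now=None):
    """Build a file name whose timestamp avoids ':' and ' ' (hypothetical helper)."""
    now = now or datetime.datetime.now()
    # strftime lets us pick the separators ourselves, so nothing needs stripping later.
    return prefix + now.strftime("%Y-%m-%d_%H-%M-%S") + ext


if __name__ == "__main__":
    print(timestamped_name())
```

The resulting string can then be passed to wb.save() as before.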

Related

Why can't I edit my .xlsx file with openpyxl?

I am encountering a problem when running with openpyxl the code below
import openpyxl
import os
wb = openpyxl.load_workbook('example.xlsx')
sheet = wb.get_sheet_by_name('Sheet1')
sheet["A1"].value
sheet["A1"].value == None
sheet["A1"].value = 42
sheet["A3"].value = 'Hello'
os.chdir("/Users/mac/Desktop")
wb.save('exceeeel.xlsx')
The error is
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/openpyxl/reader/excel.py", line 312, in load_workbook
reader = ExcelReader(filename, read_only, keep_vba,
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/openpyxl/reader/excel.py", line 124, in __init__
self.archive = _validate_archive(fn)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/openpyxl/reader/excel.py", line 96, in _validate_archive
What am I doing wrong? I am using the current version of the openpyxl library.
I can't provide a confident answer because the question only includes a partial traceback. That being said, it looks like the traceback one would get for a FileNotFoundError:
C:\Python37\python.exe C:/Users/user/PycharmProjects/scratch/scratch2.py
Traceback (most recent call last):
File "C:/Users/user/PycharmProjects/scratch/scratch2.py", line 3, in <module>
wb = openpyxl.load_workbook('example.xlsx')
File "C:\Python37\lib\site-packages\openpyxl\reader\excel.py", line 313, in load_workbook
data_only, keep_links)
File "C:\Python37\lib\site-packages\openpyxl\reader\excel.py", line 124, in __init__
self.archive = _validate_archive(fn)
File "C:\Python37\lib\site-packages\openpyxl\reader\excel.py", line 96, in _validate_archive
archive = ZipFile(filename, 'r')
File "C:\Python37\lib\zipfile.py", line 1207, in __init__
self.fp = io.open(file, filemode)
FileNotFoundError: [Errno 2] No such file or directory: 'example.xlsx'
Process finished with exit code 1
This error is raised when the path you provided to openpyxl.load_workbook does not point to an existing file. Since the only argument you provided in your call to that function is 'example.xlsx', that probably means there is no such file in the folder you are running this script from.
If 'example.xlsx' is in a different folder, then you'll want to either specify the relative path to that file as your argument or move the file into the folder you run the script from.
If this isn't what's going on then you'll need to provide the full traceback that you are seeing on your end in order to get a better answer.
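One way to make this kind of failure easier to diagnose is to resolve the path explicitly and check that it exists before handing it to the loader. The sketch below uses only the standard library; the helper name resolve_input and its base_dir parameter are hypothetical, not part of openpyxl.

```python
from pathlib import Path


def resolve_input(filename, base_dir=None):
    """Return an absolute path to filename, or raise a descriptive error."""
    # Relative names are interpreted against base_dir, or the current directory.
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    candidate = (base / filename).resolve()
    if not candidate.is_file():
        raise FileNotFoundError(
            f"{candidate} does not exist (current directory: {Path.cwd()})"
        )
    return candidate
```

For example, openpyxl.load_workbook(resolve_input('example.xlsx', Path(__file__).parent)) would look for the workbook next to the script rather than in whatever directory the script happens to be launched from.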

Python zipfile: file name with new line characters

Somebody managed somehow to add a new line character \r\n to the name of a file in a zip, and that makes ZipFile fail when it tries to extract the zip:
2019-07-23 14:05:12,285 - __main__ - ERROR - Error desconocido: [Errno 22] Invalid argument: 'descargados\\03_26298_19\\ANEXO\r\n.pdf'. Saliendo.
Traceback (most recent call last):
File "motor.py", line 51, in main
procesar_descarga(zip_object, ruta_temp, ruta_final)
File "C:\Users\david\pycharmProjects\descargueitor2\volcado.py", line 90, in procesar_descarga
zip_object.extractall(str(ruta_temp))
File "C:\Users\david\Anaconda3\lib\zipfile.py", line 1616, in extractall
self._extract_member(zipinfo, path, pwd)
File "C:\Users\david\Anaconda3\lib\zipfile.py", line 1670, in _extract_member
open(targetpath, "wb") as target:
OSError: [Errno 22] Invalid argument: 'descargados\\03_26298_19\\ANEXO\r\n.pdf'
I tried the same file with several programs:
The built-in compressed files reader in Windows explorer just ignores the file: it is not listed nor extracted.
WinZip lists the file, but throws an error when opening or extracting the file.
7Zip can read and extract the file: it just converts the bad characters to underscores.
Is there any way to deal with this in Python? It looks like files in a zip cannot be renamed using the library.
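One possible approach, sketched here under assumptions, is to skip extractall() and copy each member out by hand under a cleaned-up name, much like the 7Zip behaviour described above. The function names and the exact set of replaced characters are choices made for this example, not something zipfile prescribes.

```python
import re
import shutil
import zipfile
from pathlib import Path

# Characters Windows rejects in file names, plus ASCII control characters
# such as \r and \n. This list is an assumption for the sketch.
_BAD_CHARS = re.compile(r'[<>:"|?*\x00-\x1f]')


def safe_member_path(name):
    """Turn an archive member name into a relative path with safe components."""
    parts = [p for p in name.replace("\\", "/").split("/") if p not in ("", ".", "..")]
    return Path(*[_BAD_CHARS.sub("_", p) for p in parts])


def extract_sanitized(zip_path, dest_dir):
    """Extract every file member, replacing bad characters with underscores."""
    dest = Path(dest_dir)
    written = []
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            rel = safe_member_path(info.filename)
            if info.is_dir() or not rel.parts:
                continue
            target = dest / rel
            target.parent.mkdir(parents=True, exist_ok=True)
            # Read by ZipInfo so the original (bad) name is never used on disk.
            with zf.open(info) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            written.append(target)
    return written
```

Note that two members whose names differ only in the replaced characters would overwrite each other with this scheme, so a real script may want to detect collisions.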

cannot write file with full path in Python

I am using pandas on a Mac to read and write a CSV file. Oddly, writing fails when I use a full path but works when I use just a file name. The code that works, the code that fails, and the detailed error messages are below. Does anyone have any good ideas?
sourceDf = pd.read_csv(path_to_csv)
sourceDf['nameFull'] = sourceDf['nameFirst'] + ' ' + sourceDf['nameLast']
sourceDf.to_csv('newMaster.csv') # working
sourceDf.to_csv('~/Downloads/newMaster.csv') # not working
Traceback (most recent call last):
File "/Users/foo/PycharmProjects/DataWranglingTest/CSVTest1.py", line 36, in <module>
add_full_name(path_to_csv, path_to_new_csv)
File "/Users/foo/PycharmProjects/DataWranglingTest/CSVTest1.py", line 28, in add_full_name
sourceDf.to_csv('~/Downloads/newMaster.csv')
File "/usr/local/Cellar/python/2.7.8/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/frame.py", line 1189, in to_csv
formatter.save()
File "/usr/local/Cellar/python/2.7.8/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/format.py", line 1442, in save
encoding=self.encoding)
File "/usr/local/Cellar/python/2.7.8/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/common.py", line 2831, in _get_handle
f = open(path, mode)
IOError: [Errno 2] No such file or directory: '~/Downloads/newMaster.csv'
I also tried the r prefix, but it did not help:
path_to_csv = r'~/Downloads/Master.csv'
path_to_new_csv = r'~/Downloads/Master_new.csv'
File "/usr/local/Cellar/python/2.7.8/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/frame.py", line 1189, in to_csv
formatter.save()
File "/usr/local/Cellar/python/2.7.8/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/format.py", line 1442, in save
encoding=self.encoding)
File "/usr/local/Cellar/python/2.7.8/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/common.py", line 2831, in _get_handle
f = open(path, mode)
IOError: [Errno 2] No such file or directory: '~/Downloads/Master_new.csv'
thanks in advance,
Lin
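Try using os.path.join() together with os.path.expanduser(), since open() does not expand '~' on its own.
import os
(...)
output_filename = 'newMaster.csv'
# expanduser('~') returns the home directory as an absolute path
output_path = os.path.join(os.path.expanduser('~'), 'Downloads', output_filename)
(...)
sourceDf.to_csv(output_path)
Use the same methodology to point pandas.read_csv() in the right direction.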
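You didn't specify your Python version (the traceback shows 2.7).
On 3.4+ you can use pathlib; otherwise use os.path.join() together with os.path.expanduser():
sourceDf.to_csv(os.path.expanduser('~/Downloads/newMaster.csv'))
Notice the expanduser() call. The r prefix does not help here: it only affects backslash escapes such as \n, and '/n' is just two ordinary characters.
The problem is that '~' is a shell shorthand that open() does not expand, so the literal path '~/Downloads/newMaster.csv' does not exist.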
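To illustrate the home-directory expansion discussed in these answers, here is a small sketch using only the standard library. The helper name expand_home is hypothetical; os.path.expanduser and Path.expanduser are the standard calls it wraps.

```python
import os
from pathlib import Path


def expand_home(path):
    """Replace a leading '~' with the current user's home directory."""
    return os.path.expanduser(path)


if __name__ == "__main__":
    # Both spellings should point at the same absolute location.
    print(expand_home("~/Downloads/newMaster.csv"))
    print(Path("~/Downloads/newMaster.csv").expanduser())
```

The expanded string can be passed to sourceDf.to_csv() or pd.read_csv() in place of the '~' form.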

IOError: [Errno 22] invalid mode ('rb') using pandas.read_excel

I keep getting the following error. The file name is correct, and this pandas method works in my other .py files. Please help!
tablecouleurs is an Excel table with no special characters.
import pandas as pd
colors=pd.read_excel('C:\Users\paul\tablecouleurs.xlsx', index_col=0, has_index_names=True)
and error:
runfile('C:/Users/paul/Documents/colors.py', wdir='C:/Users/pauldufosse/Documents')
Traceback (most recent call last):
File "", line 1, in <module>
runfile('C:/Users/paul/Documents/colors.py', wdir='C:/Users/pauldufosse/Documents')
File "C:\Users\paul\Anaconda\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 685, in runfile
execfile(filename, namespace)
File "C:\Users\paul\Anaconda\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 71, in execfile
exec(compile(scripttext, filename, 'exec'), glob, loc)
File "C:/Users/paul/Documents/colors.py", line 12, in <module>
colors=pd.read_excel('C:\Users\pauldufosse\tablecouleurs.xlsx', index_col=0, has_index_names=True)
File "C:\Users\paul\Anaconda\lib\site-packages\pandas\io\excel.py", line 151, in read_excel
return ExcelFile(io, engine=engine).parse(sheetname=sheetname, **kwds)
File "C:\Users\paul\Anaconda\lib\site-packages\pandas\io\excel.py", line 188, in __init__
self.book = xlrd.open_workbook(io)
File "C:\Users\paul\Anaconda\lib\site-packages\xlrd\__init__.py", line 394, in open_workbook
f = open(filename, "rb")
IOError: [Errno 22] invalid mode ('rb') or filename: 'C:\Users\paul\tablecouleurs.xlsx'
I had the same problem. You can solve it by escaping the backslashes in your path: in a normal string literal, '\t' (as in '\tablecouleurs') is read as a tab character.
The error message says:
IOError: [Errno 22] invalid mode ('rb') or filename: 'C:\Users\pauldufosse\tablecouleurs.xlsx'
Just do:
foo = pd.ExcelFile('C:\\Users\\pauldufosse\\tablecouleurs.xlsx')
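The effect of the backslash escapes can be checked without pandas at all. The snippet below is a small illustration (the variable names are arbitrary) of why the unescaped path breaks and which spellings avoid the problem.

```python
# In a normal string literal "\t" is an escape sequence, so this string
# starts with a TAB character rather than a backslash followed by 't'.
broken_tail = "\tablecouleurs.xlsx"

# Three ways to write the intended Windows path safely:
escaped = "C:\\Users\\paul\\tablecouleurs.xlsx"   # doubled backslashes
raw = r"C:\Users\paul\tablecouleurs.xlsx"         # raw string literal
forward = "C:/Users/paul/tablecouleurs.xlsx"      # forward slashes

if __name__ == "__main__":
    print(repr(broken_tail))
    print(escaped == raw)
```

The escaped and raw spellings produce the same string; the forward-slash form is a different string that Windows generally accepts as the same path.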
This worked for me
open_workbook f = open(filename, 'rb')
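If you check the Python library you will see single quotes used there instead of double quotes, although the two are interchangeable in Python, so the quote style alone should not cause this error.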

Using os.path.join with os.path.getsize, returning FileNotFoundError

In conjunction with my last question, I'm now printing the filenames with their sizes next to them in a sort of list. Basically, I am reading filenames from one file (they are added by the user), joining each filename onto the path of the working directory, and printing its size one by one. However, I'm having an issue with the following block:
print("\n--- Stats ---\n")
with open('userdata/addedfiles', 'r') as read_files:
    file_lines = read_files.readlines()
    # get path for each file and find in trackedfiles
    # use path to get size
    print(len(file_lines), "files\n")
    for file_name in file_lines:
        # the actual files should be in the same working directory
        cwd = os.getcwd()
        fpath = os.path.join(cwd, file_name)
        fsize = os.path.getsize(fpath)
        print(file_name.strip(), "-- size:", fsize)
which is returning this error:
tolbiac wpm-public → ./main.py --filestatus
--- Stats ---
1 files
Traceback (most recent call last):
File "./main.py", line 332, in <module>
main()
File "./main.py", line 323, in main
parseargs()
File "./main.py", line 317, in parseargs
tracking()
File "./main.py", line 204, in tracking
fsize = os.path.getsize(fpath)
File "/usr/lib/python3.4/genericpath.py", line 50, in getsize
return os.stat(filename).st_size
FileNotFoundError: [Errno 2] No such file or directory: '/home/tolbiac/code/wpm-public/file.txt\n'
tolbiac wpm-public →
So it looks like something is adding a \n to the end of file_name. I'm not sure if that's something done inside getsize; I tried this with os.stat, but it did the same thing.
Any suggestions? Thanks.
When you're reading in a file, you need to be aware of how the data is separated. In this case, the input file has one filename per line, and readlines() keeps the trailing \n character on each line. You need to strip it before you use it.
for file_name in file_lines:
    file_name = file_name.strip()
    # rest of for loop
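Putting the fix together, a minimal self-contained version might look like the sketch below. The function name file_sizes and the decision to skip blank lines are assumptions made for this example; the key point is that each line is stripped before it is joined into a path.

```python
import os


def file_sizes(list_path, base_dir=None):
    """Map each filename listed in list_path to its size in bytes."""
    base_dir = base_dir or os.getcwd()
    sizes = {}
    with open(list_path, "r") as listing:
        for line in listing:
            name = line.strip()  # drop the trailing newline kept by file iteration
            if not name:
                continue  # ignore blank lines
            sizes[name] = os.path.getsize(os.path.join(base_dir, name))
    return sizes
```

Calling file_sizes('userdata/addedfiles') would then return a dictionary that can be printed in whatever format is needed.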
