Rename multiple files using a list of names on excel - python

I have a bunch of PDF files with random names, like 95456356.pdf, 7896548965.pdf and so on. I also have an Excel file with a list of names in one column. Can I write code that reads that Excel file and then renames all the PDF files in the same order, so that the first file is renamed to the name in the first row of the Excel file, and so on? I can copy and paste that column into a .txt file if that makes it easier.

You could use the xlrd package to read your Excel file. A nice tutorial on how to read Excel files can be found here: https://www.geeksforgeeks.org/reading-excel-file-using-python/
With this package it should be relatively easy to write code with the desired behaviour:
You should be able to read one cell with the name you want to use for renaming the files
You should be able to read the filenames of the pdf files
You should be able to rename files using the os package (see here: How to rename a file using Python)
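The steps above can be sketched with the standard library alone, assuming the names column has been pasted into a .txt file with one name per line, as the question suggests. The function names `read_names` and `rename_pdfs` are hypothetical, and "the same order" is assumed here to mean sorted filename order, which may not match the order you see in a file manager.

```python
from pathlib import Path


def read_names(txt_path):
    """Read one new name per line, skipping blank lines."""
    lines = Path(txt_path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


def rename_pdfs(folder, new_names):
    """Rename the PDFs in `folder`, taken in sorted filename order."""
    pdfs = sorted(Path(folder).glob("*.pdf"))
    if len(pdfs) != len(new_names):
        raise ValueError(f"{len(pdfs)} PDF files but {len(new_names)} names")
    renamed = []
    for pdf, name in zip(pdfs, new_names):
        target = pdf.with_name(f"{name}.pdf")
        if target.exists():
            # Refuse to overwrite an existing file.
            raise FileExistsError(target)
        pdf.rename(target)
        renamed.append(target)
    return renamed
```

A call might look like rename_pdfs("pdfs", read_names("names.txt")), where both paths are placeholders. It is worth trying this on a copy of the folder first, since a rename is not easily undone.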

Related

How to make a file with combination of multiple files with different extensions(xlsx, csv)?

Hey, I'm looking for an answer that can solve my issue.
1. I have CSV files in one folder
2. Excel files in another folder
3. I want to combine the files from these two folders into a single file
Note: the data in both folders has the same columns
For file handling I recommend the built-in pathlib module. Use its glob method to fetch all files with a given extension: .csv and .xlsx
Next you can use pandas to open the .csv and .xlsx files (pd.read_csv for CSV files, pd.read_excel for Excel files).
Once you have loaded the data into dataframes, you can combine them into one dataframe. If necessary, do some data manipulation on the columns.
Lastly, you can export the combined dataframe to a CSV file with the DataFrame.to_csv() method.
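A minimal sketch of those four steps, assuming every file has the same columns (as the question states) and that pandas, plus an Excel engine such as openpyxl, is installed. The function name `combine_folders` and its parameters are hypothetical.

```python
from pathlib import Path

import pandas as pd


def combine_folders(csv_dir, excel_dir, out_path):
    """Stack all CSV and XLSX files into one dataframe and write it as CSV."""
    frames = [pd.read_csv(p) for p in sorted(Path(csv_dir).glob("*.csv"))]
    frames += [pd.read_excel(p) for p in sorted(Path(excel_dir).glob("*.xlsx"))]
    if not frames:
        raise ValueError("no input files found")
    # ignore_index=True renumbers the rows of the combined frame from 0.
    combined = pd.concat(frames, ignore_index=True)
    combined.to_csv(out_path, index=False)
    return combined
```

pd.read_excel reads only the first sheet by default, so workbooks with several sheets would need the sheet_name argument.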

Is there a way to append data to an excel file without reading its contents, in python?

I have a huge master data dump Excel file. I have to append data to it on a regular basis. The data to be appended is stored as a pandas dataframe. Is there a way to append this data to the master dump file without having to read its contents?
The dump file is huge and takes a considerable amount of time for the program to load (using pandas).
I have already tried openpyxl and XlsxWriter, but neither worked.
It isn't possible to just append to an xlsx file like a text file. An xlsx file is a collection of XML files in a Zip container so to append data you would need to unzip the file, read the XML data, add the new data, rewrite the XML file(s) and then rezip them.
This is effectively what OpenPyXL does.
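For reference, the load, append, save cycle with openpyxl might look like the sketch below. It does not avoid reading the file, which the answer above says is not possible; it only skips the pandas parsing step. The function name `append_rows` is hypothetical.

```python
import openpyxl


def append_rows(xlsx_path, rows, sheet_name=None):
    """Append rows below the existing data and save the workbook in place."""
    wb = openpyxl.load_workbook(xlsx_path)  # this still reads the whole file
    ws = wb[sheet_name] if sheet_name else wb.active
    count = 0
    for row in rows:
        ws.append(list(row))  # written after the last used row
        count += 1
    wb.save(xlsx_path)
    wb.close()
    return count
```

Rows from a dataframe could be passed as df.itertuples(index=False, name=None), assuming the dataframe's column order matches the sheet.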

Modifying and writing data in an existing excel file using Python

I have an Excel file (xlsx) that already has lots of data in it. Now I am trying to use Python to write new data into this Excel file. I looked at xlwt, xlrd, xlutils, and openpyxl; all of these modules require you to load the data of the Excel sheet, apply changes, and then save to a new Excel file. Is there any way to just change the data in the existing Excel sheet rather than loading the workbook or saving to a new file?
This is not possible because XLSX files are zip archives and cannot be modified in place. In theory it might be possible to edit only the part of the archive that makes up an OOXML package, but in practice this is almost impossible because the relevant data may be spread across different files.
Please check Openpyxl once again. You can load the data, do things with python, write your results in a new sheet in the same file or same sheet and save it (as everything is happening in memory).
For example:
import openpyxl
# Load the workbook. With data_only=True, formulas are read as their cached values,
# so saving afterwards drops the formulas; omit it if you need to keep them.
wb = openpyxl.load_workbook("file.xlsx", data_only=True)
# ... manipulate the data with Python ...
# Create a new sheet (added at the end by default).
some_sheet = wb.create_sheet("someSheet")
# ... write to the sheet, e.g. some_sheet["A1"] = "value" ...
# Save the file. Close it in Excel first, otherwise saving raises PermissionError.
wb.save("file.xlsx")
See the openpyxl tutorial:
https://openpyxl.readthedocs.io/en/default/tutorial.html

Reading excel file from zip archive using python and openpyxl

I have a password protected zip archive containing some excel spreadsheets with some confidential data. I'd like to take the password from the user, open the zip, and analyze the spreadsheet inside, without actually extracting the excel file. Is it possible to construct openpyxl's workbook from the zip entry directly and do some analysis on the data in the workbook? I am trying to avoid extracting the excel to file system to avoid potential security problems (e.g. undeleted temp files).
Is this possible to do? Is it possible, for example, to treat the zip archive as some pseudo file system?
Thanks in advance!
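One possible approach, offered as a sketch rather than a confirmed answer: read the zip entry into memory with the standard zipfile module and hand the bytes to openpyxl as a file-like object. The function name `load_workbook_from_zip` is hypothetical. As far as I know, the standard zipfile module can only decrypt the legacy ZIP encryption scheme, not AES-encrypted archives, so this assumes the archive uses the former.

```python
import io
import zipfile

import openpyxl


def load_workbook_from_zip(zip_path, member, password=None):
    """Open an xlsx stored inside a zip without writing it to disk."""
    pwd = password.encode("utf-8") if password else None
    with zipfile.ZipFile(zip_path) as archive:
        # read() returns the (decrypted) bytes of the entry, held in memory only.
        data = archive.read(member, pwd=pwd)
    return openpyxl.load_workbook(io.BytesIO(data))
```

Nothing is extracted to the file system here, although the decrypted bytes do live in the process memory for as long as the workbook is in use.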

Combine tab-separated value (TSV) files into an Excel 2007 (XLSX) spreadsheet

I need to combine several tab-separated value (TSV) files into an Excel 2007 (XLSX) spreadsheet, preferably using Python. There is not much cleverness needed in combining them - just copying each TSV file onto a separate sheet in Excel will do. Of course, the data needs to be split into columns and rows same as Excel does when I manually copy-paste the data into the UI.
I've had a look at the raw XML file Excel 2007 generates and it's huge and complex, so writing that from scratch doesn't seem realistic. Are there any libraries available for this?
Looks like xlwt may serve your needs -- you can read each TSV file with Python's standard library csv module (which DOES do tab-separated as well as comma-separated etc, don't worry!-) and use xlwt (maybe via this cheatsheet;-) to create an XLS file, make sheets in it, build each sheet from the data you read via csv, etc. Not sure about XLSX vs plain XLS support but maybe the XLS might be enough...?
The best python module for directly creating Excel files is xlwt, but it doesn't support XLSX.
As I see it, your options are:
1. If you only have "several", you could just do it by hand.
2. Use pythonwin to control Excel through COM. This requires you to run the code on a Windows machine with Excel 2007 installed.
3. Use python to do some preprocessing on the TSV to produce a format that will make step (1) easier. I'm not sure if Excel reads TSV, but it will certainly read CSV files directly.
Note that Excel 2007 will quite happily read "legacy" XLS files (those written by Excel 97-2003 and by xlwt). You need XLSX files because .....?
If you want to go with the defaults that Excel will choose when deciding whether each piece of your data is a number, a date, or some text, use pythonwin to drive Excel 2007. If the data is in a fixed layout such that other than a possible heading row, each column contains data that is all of one known type, consider using xlwt.
You may wish to approach xlwt via http://www.python-excel.org which contains an up-to-date tutorial for xlrd, xlwt, and xlutils.
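The csv-module approach from the first answer can be sketched as below. Note the swap: this uses openpyxl for writing instead of xlwt, because the answers above point out that xlwt does not produce XLSX. The function name `tsv_files_to_xlsx` is hypothetical, and each sheet is named after its file, which assumes the file names are valid sheet titles.

```python
import csv
from pathlib import Path

import openpyxl


def tsv_files_to_xlsx(tsv_paths, out_path):
    """Copy each TSV file onto its own sheet of a new XLSX workbook."""
    tsv_paths = [Path(p) for p in tsv_paths]
    if not tsv_paths:
        raise ValueError("no TSV files given")
    wb = openpyxl.Workbook()
    wb.remove(wb.active)  # drop the default empty sheet
    for tsv_path in tsv_paths:
        # Excel limits sheet titles to 31 characters.
        ws = wb.create_sheet(title=tsv_path.stem[:31])
        with tsv_path.open(newline="", encoding="utf-8") as handle:
            for row in csv.reader(handle, delimiter="\t"):
                ws.append(row)
    wb.save(str(out_path))
```

The csv module returns every field as a string, so numbers and dates land in the workbook as text. Getting Excel's own type guessing would need an explicit conversion step, or the COM route described above.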
