Getting Last Modified by Name of xlsx file - python

I have an Excel file that gets modified by a group of people, and we need to keep track of when the file was last modified and by whom.
I was able to retrieve the file properties through .properties, but I am trying to figure out how to isolate lastModifiedBy and insert its value into a column.
from openpyxl import load_workbook
wb = load_workbook('Rec1.xlsx')
wb.properties.lastModifiedBy
It gets me the information I need, but I am stumped on how to create a new column "lastmodifiedby" with the information provided in properties.

From the documentation: https://openpyxl.readthedocs.io/en/stable/usage.html#write-a-workbook
Perhaps something like this?
ws1 = wb.active
ws1.cell(column=1, row=1, value=wb.properties.lastModifiedBy)
wb.save(filename='Rec1.xlsx')
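If the goal is a whole new column rather than a single cell, the same idea can probably be extended as below. This is a minimal sketch, not part of the answer above: the helper name add_last_modified_by_column and the header text are assumptions made for illustration, and it assumes row 1 holds the column headers. The demo builds a small workbook in memory so it runs without Rec1.xlsx.

```python
from openpyxl import Workbook


def add_last_modified_by_column(wb, header="lastmodifiedby"):
    """Append a column to the active sheet holding wb.properties.lastModifiedBy.

    Hypothetical helper: assumes row 1 is the header row.
    Returns the 1-based index of the new column.
    """
    ws = wb.active
    author = wb.properties.lastModifiedBy
    last_row = ws.max_row
    new_col = ws.max_column + 1  # first column after the existing data
    ws.cell(row=1, column=new_col, value=header)
    for row in range(2, last_row + 1):
        ws.cell(row=row, column=new_col, value=author)
    return new_col


# Demo on an in-memory workbook (stand-in for load_workbook('Rec1.xlsx')).
demo_wb = Workbook()
demo_wb.properties.lastModifiedBy = "alice"
demo_ws = demo_wb.active
demo_ws.append(["id", "amount"])
demo_ws.append([1, 10.5])
demo_ws.append([2, 7.25])
demo_col = add_last_modified_by_column(demo_wb)
```

With a real file, load it with load_workbook, call the helper, and then save the workbook.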

Is this what you are looking for? In VBA:
Dim lastModifiedBy
lastModifiedBy = ThisWorkbook.BuiltinDocumentProperties("Last Author")

Related

How to save excel file with openpyxl and preserve pivot table as is?

I have an Excel file: one sheet is used for writing data with Python, and another sheet contains a pivot table. I want to keep the pivot table exactly the same as in the source file.
The problem is that after saving the new workbook with openpyxl, when I open the Excel file and refresh the pivot table, it loses the 'Field settings..' -> 'Repeat items label' checkbox, and I need to turn it on manually each time. That is not very efficient; I would rather solve this with Python.
The sample file has it checked, but the checkbox seems to disappear after saving a new file with openpyxl.
from openpyxl import load_workbook
from pathlib import Path
from datetime import date
import os
sample_file_path = Path('sample_excel.xlsx') # source excel
result_folder_path = Path('results')
wb = load_workbook(sample_file_path)
ws = wb["t_mm"] # worksheet with pivot table I want to preserve as is
# some manipulations to other worksheet
xlsx_filename = "test_my_file_%s.xlsx" % date.today().strftime('%d%m%Y')
completename = os.path.join(result_folder_path, xlsx_filename)
wb.save(completename)
I read the documentation https://openpyxl.readthedocs.io/en/stable/api/openpyxl.pivot.table.html, but couldn't figure out how to keep that checkbox. I am not an Excel or pivot table expert. I think the parameter I need is "showMultipleLabel=True", but from the docs I understand that it is "True" by default, so my checkbox should remain intact. Maybe it is another parameter?

Is there a way to save data in named Excel cells using Python?

I have used openpyxl for writing values to Excel from my Python code. However, I now find myself in a situation where the cell locations in the Excel file may change depending on the user. To avoid any problems with the program, I want to name the cells that the code should save its output to. Is there any way to have Python interact with named ranges in Excel?
For a workbook-level defined name:
import openpyxl
wb = openpyxl.load_workbook("c:/tmp/SO/namerange.xlsx")
ws = wb["Sheet1"]
mycell = wb.defined_names['mycell']
# destinations yields (sheet title, cell coordinate) pairs
for title, coord in mycell.destinations:
    ws = wb[title]
    ws[coord] = "Update"
wb.save('updated.xlsx')
print("{} {} updated".format(ws,coord))
I was able to find the parameters of the named range using defined_names. After that I just worked like it was a normal Excel cell.
from openpyxl import load_workbook
openWB=load_workbook('test.xlsx')
rangeDestination = openWB.defined_names['testCell']
print(rangeDestination)
sheetName=str(rangeDestination.attr_text).split('!')[0]
cellName = str(rangeDestination.attr_text).split('!')[1]
sheetToWrite=openWB[sheetName]
cellToWrite=sheetToWrite[cellName]
sheetToWrite[cellName]='TEST-A3'
print(sheetName)
print(cellName)
openWB.save('test.xlsx')
openWB.close()
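The split('!') step above works for simple names, but sheet names that contain spaces are written in single quotes in a reference (for example 'My Sheet'!$B$2), and an apostrophe inside a quoted name is doubled. A slightly more defensive version of that parsing step might look like the sketch below. The helper name split_reference is an assumption for illustration, and it only handles a single-area reference into the same workbook; the destinations attribute used in the first answer already does this kind of parsing.

```python
def split_reference(ref):
    """Split a reference such as "Sheet1!$A$3" into ("Sheet1", "A3").

    Hypothetical helper: handles one sheet-qualified area only.
    """
    # Split on the last '!' so a '!' inside a quoted sheet name is tolerated.
    sheet, sep, cell = ref.rpartition("!")
    if not sep:
        raise ValueError("no sheet name in reference: %r" % ref)
    # Quoted sheet names: drop the outer quotes and undo doubled apostrophes.
    if len(sheet) >= 2 and sheet.startswith("'") and sheet.endswith("'"):
        sheet = sheet[1:-1].replace("''", "'")
    # Drop the absolute-reference markers so the result is a plain coordinate.
    return sheet, cell.replace("$", "")
```

The returned pair can then be used the same way as sheetName and cellName in the answer above.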

Reading cell value without redefining it with Openpyxl

I need to read this .xlsm database, and some of the cell values I need are derived from Excel functions. To accomplish this I used:
from openpyxl import load_workbook
wb = load_workbook('file.xlsm', data_only=True, keep_vba=True)
ws = wb['Plan1']
And then, for every cell I wanted to read:
ws.cell(row=row, column=column).value
This works fine for getting the data out. But the problem comes with saving. When I do:
wb.save('file.xlsm')
It saves the file, but all the formulas inside the sheets are lost.
My dilemma is how to read the cells' displayed values on one of the database's sheets without modifying them, write the code's output to a new sheet, and save it.
Read the file twice: once in read-only and data-only mode to look at the values, and once more without data_only, keeping the VBA around, for writing. Then save under a different name.
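The two-pass approach could be sketched roughly as follows. The function name copy_values_to_new_sheet, its parameters, and the "output" sheet name are assumptions for illustration, not something from the question. Note that data_only=True returns the values Excel cached the last time it saved the file, so a formula cell in a file that Excel has never calculated reads as None.

```python
from openpyxl import load_workbook


def copy_values_to_new_sheet(src, dst, sheet_name, out_sheet="output", keep_vba=False):
    """Read displayed values from src and write them to a new sheet in dst.

    Hypothetical helper; pass keep_vba=True for .xlsm files.
    """
    # Pass 1: read-only and data-only, used purely to look at the values.
    values_wb = load_workbook(src, read_only=True, data_only=True)
    rows = [list(r) for r in values_wb[sheet_name].iter_rows(values_only=True)]
    values_wb.close()

    # Pass 2: normal load, so the formulas are still formulas when saved.
    wb = load_workbook(src, keep_vba=keep_vba)
    out = wb.create_sheet(out_sheet)
    for r in rows:
        out.append(r)

    # Save under a different name so the source file is left untouched.
    wb.save(dst)
    return rows
```

For the file in the question, a call might look like copy_values_to_new_sheet('file.xlsm', 'file_out.xlsm', 'Plan1', keep_vba=True), where the output file name is only a placeholder.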

Import excel file in python and identify cells of which the content is strikethrough

I want to read in many Excel documents and I would like to receive at least one important bit of information on the format. However, I am afraid that there is no tool for it, so my hope is on you!
Each Excel file that I am reading in contains a few cells whose content is struck through. For those who don't know the word (I didn't know it either), strikethrough means that there is a horizontal line through the content.
I have figured out that I will need to read in my documents with xlrd to be able to identify the fonts. However, I have been going over a list of possibilities and none of them contains a check on strikethrough.
You have to open the workbook with the formatting_info keyword argument set to True. Then get the XF object of the cell and, from it, the Font object. The struck_out attribute is what you're looking for. An example:
import xlrd
# filename, sheet, row and col are placeholders to fill in
workbook = xlrd.open_workbook(filename, formatting_info=True)
sh = workbook.sheet_by_name(sheet)
xf = workbook.xf_list[sh.cell_xf_index(row, col)]
font = workbook.font_list[xf.font_index]
if font.struck_out:
    print(row, col)
For .xlsx files, openpyxl exposes the same information as cell.font.strike:
from openpyxl import load_workbook
book = load_workbook('xyz.xlsx')
sheet = book.sheetnames[0]  # first sheet of the workbook (Sheet1 here)
ws = book[sheet]
for row in ws.iter_rows():
    for cell in row:
        if cell.font.strike:
            print(cell.value)
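To handle many documents, it may be more convenient to collect the struck-through cells than to print them. The sketch below wraps the openpyxl loop in a function; the name struck_cells is an assumption for illustration, and the demo uses an in-memory workbook instead of xyz.xlsx.

```python
from openpyxl import Workbook
from openpyxl.styles import Font


def struck_cells(ws):
    """Return the coordinates of all cells whose font is struck through."""
    return [
        cell.coordinate
        for row in ws.iter_rows()
        for cell in row
        if cell.font.strike  # None or False for ordinary cells
    ]


# Demo: one ordinary cell and one struck-through cell.
demo_book = Workbook()
demo_sheet = demo_book.active
demo_sheet["A1"] = "keep"
demo_sheet["A2"] = "drop"
demo_sheet["A2"].font = Font(strike=True)
demo_struck = struck_cells(demo_sheet)
```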

Reading xls file with Python

import xlrd
cord = xlrd.open_workbook('MT_coordenadas_todas.xls')
id = cord.sheet_by_index(0)
print(id)
When I run my code in the terminal, I get:
<xlrd.sheet.Sheet object at 0x7f897e3ecf90>
I want to take the first column, so what should I change in my code?
id is a reference to the sheet object. You need to use values = id.col_values(0) to read the values from the first column of that sheet.
