Increment substring numbers by +1 - python

I have a Python script running on AWS Lambda (serverless). When triggered it needs to add a new row to a Google Spreadsheet via the API and that involves copying some of the formulas from the row above to the new row. But for the formulas to be valid the cell row numbers need to be incremented by +1 before copying to the new row.
For example, say I have a cell with the formula =if(countif(G198:G201,"<=3")<=3,0,1). I need to change this to =if(countif(G199:G202,"<=3")<=3,0,1) and add it to the new row.
What's the easiest way to find the row numbers of the cell references in the formula and increment them to create the new formula?

import re

def update_formula(formula):
    # Rewrite every cell reference in one pass so that a reference which has
    # already been incremented (e.g. G198 -> G199) is never matched again.
    def increment_row(match):
        column, row_number = match.group(1), match.group(2)
        return column + str(int(row_number) + 1)
    return re.sub('([A-Z]+)([0-9]+)', increment_row, formula)
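
If the formulas can also contain absolute references such as G$198, or function names that end in digits such as LOG10(, a slightly stricter pattern helps. The sketch below is one possible approach, not a full formula parser: shift_rows and CELL_REF are hypothetical names, and it assumes A1-style references with upper-case column letters. Text inside a string literal that happens to look like a cell reference would still be shifted.

```python
import re

# Optional "$", 1-3 column letters, optional "$" before the row, then the row digits.
# The lookarounds skip names such as LOG10( and anything glued to other identifiers.
CELL_REF = re.compile(
    r'(?<![A-Za-z0-9_$])(\$?[A-Z]{1,3})(\$?)([0-9]+)(?![A-Za-z0-9_(])'
)

def shift_rows(formula, offset=1):
    """Return formula with every relative row number shifted by offset."""
    def bump(match):
        column, row_anchor, row = match.groups()
        if row_anchor:
            # "$" before the row marks it as absolute, so leave it unchanged
            return match.group(0)
        return f"{column}{int(row) + offset}"
    return CELL_REF.sub(bump, formula)

print(shift_rows('=if(countif(G198:G201,"<=3")<=3,0,1)'))
# =if(countif(G199:G202,"<=3")<=3,0,1)
print(shift_rows('=G$5+G5+$G5'))
# =G$5+G6+$G6
```

Because the substitution happens in a single pass, a range like G198:G199 becomes G199:G200 rather than having its first reference incremented twice.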

Related

Using openpyxl in Python, how do I return the current row number?

I am trying to create Python code that reads data from a certain cell in an Excel file and returns a specified value based on a dictionary key.
I am using a for loop to iterate over each row, checking if the key exists within the cell.value located in column 'D' and want it to then assign the value to a column 'H' and row x where x is the current row.
logging.debug(wb.sheetnames)
ws = wb['Sheet1']
for cell in ws['D']:
    for k, v in dictKey.items():
        if k in cell.value:
            logging.debug(v)
            ws['H1'] = v
I have shown H1 as a static example of what I am trying to achieve, but I want it to be H2, H3, H4 and so on for each row being iterated.
I thought perhaps I could create a simple:
count = 0
count += 1
at the start of the for loop and then concatenate the value using:
ws['H' + str(count)] = v
But I thought there could be a more elegant solution using a built-in function in openpyxl?
I could not find a clean solution after a quick search and am wondering if the above would work/be good code?
Thanks
I believe you're looking for ws.cell(row, column).value, which lets you get and set a value by index (indices start at 1 in openpyxl).
https://openpyxl.readthedocs.io/en/stable/api/openpyxl.cell.cell.html?highlight=cell.value
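
Each cell that openpyxl yields while iterating over a column carries its own row attribute, so no manual counter is needed. A minimal sketch, assuming an in-memory workbook; label_rows is a hypothetical helper, and the sample data and lookup dictionary are made up for illustration:

```python
from openpyxl import Workbook

def label_rows(ws, lookup, source_col="D", target_col=8):
    """Write the matching label into column H (index 8) of the same row."""
    for cell in ws[source_col]:
        if cell.value is None:
            continue  # empty cells cannot contain a key
        for key, label in lookup.items():
            if key in str(cell.value):
                # cell.row is the 1-based row number of the cell being visited
                ws.cell(row=cell.row, column=target_col, value=label)
                break

wb = Workbook()
ws = wb.active
ws["D1"] = "red apple"
ws["D2"] = "green pear"
ws["D3"] = "plain"
label_rows(ws, {"apple": "fruit-A", "pear": "fruit-P"})
print(ws["H1"].value, ws["H2"].value, ws["H3"].value)
```

The same effect can be had with ws['H' + str(cell.row)] = v if a string address is preferred.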

Updating cells in a range with a for loop

I'm pretty new to coding so apologies if this is easy.
I have two columns in google sheets and I want to add a formula into a third column that is something like this:
=(E3*90)+(F3*10) - the values in the columns are grades and the 90 and 10 are weightings that are fixed.
I created a for loop to iterate through range(3,90) so that it updates each cell in the column.
It writes a formula in every cell, but it's always the one from the last iteration: '=(E89*90)+(F89*10)'
I managed to get this working by adding report.update_acell('E'+str(i),'=(E'+str(i)+'*90)+(F'+str(i)+'*10)') to the for loop, but this creates too many API calls and causes problems.
sh = client.open("grading")
report = sh.worksheet("Report")
weighted = report.range('G3:G89')
for cell in weighted:
    for i in range(3,90):
        cell.value = '=(E'+str(i)+'*90)+(F'+str(i)+'*10)'
report.update_cells(weighted, value_input_option='USER_ENTERED')
What I'd like to see is every cell in the 'weighted' range be updated with a formula that looks at the two cells next to them and adds them into the formula so that a result is visible in weighted column.
eg.
row 3 should be =(E3*90)+(F3*10)
row 4 should be =(E4*90)+(F4*10) and so on until the range is completed.
I fixed this after a lot of trial and error. For anyone who is trying to do the same here is my solution:
sh = client.open("grading")
report = sh.worksheet("Report")
weighted = report.range('G3:G89')
for i, cell in enumerate(weighted,3):
    cell.value = '=(E'+str(i)+'*90)+(F'+str(i)+'*10)'
report.update_cells(weighted, value_input_option='USER_ENTERED')
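
Why the nested loop left the row 89 formula in every cell can be seen without calling the Sheets API. The sketch below uses plain SimpleNamespace objects as stand-ins for the cells returned by report.range('G3:G89') (they are not gspread's own cell class), and weighted_formula is a hypothetical helper:

```python
from types import SimpleNamespace

def weighted_formula(row):
    """Build the weighting formula for one spreadsheet row."""
    return f"=(E{row}*90)+(F{row}*10)"

def make_cells():
    # Stand-ins for the 87 cells of G3:G89
    return [SimpleNamespace(row=r, value="") for r in range(3, 90)]

# Nested loop from the question: for every cell the inner loop runs to the
# end, so each cell keeps only the value from the final iteration (i = 89).
nested = make_cells()
for cell in nested:
    for i in range(3, 90):
        cell.value = weighted_formula(i)

# enumerate pairs each cell with exactly one row number, starting at 3.
paired = make_cells()
for i, cell in enumerate(paired, 3):
    cell.value = weighted_formula(i)

print(nested[0].value)   # =(E89*90)+(F89*10)
print(paired[0].value)   # =(E3*90)+(F3*10)
print(paired[-1].value)  # =(E89*90)+(F89*10)
```

All values are set locally first and sent in one update_cells call, which is what keeps the number of API requests low.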

Grouping data and comparing it from excel in python

I am working on a project using python to select certain values from an excel file. I am using the xlrd library and openpyxl library to do this.
The way the Python program should work is:
Group all the data point entries that belong to a certain Card Task. These are marked in column E. For example, all of the entries between row 26 and row 28 are in Card Task A, and hence they should be grouped together. All entries without a “Card Task” value in column E should be ignored.
Next…
Look at the value in column N (lastExecTime) of a row and compare that time with the following value in column M.
If the times overlap (column M is less than the previous N value), increment a variable called “count”. Count stores the number of times a procedure overlaps.
Finally…
As for the output, the goal is to create a separate text file that displays which tasks are overlapping, and how many tasks overlap in a certain Card Task.
The problem that I am running into is that I cannot pair the data from a card task.
Here is a sample of the Excel data (two linked screenshots of the sheet).
And here is the code that I have written that tells me if there are multiple procedures going on:
from openpyxl import load_workbook
book = load_workbook('LearnerSummaryNoFormat.xlsx')
sheet = book['Sheet1']
for row in sheet.rows:
    if ((row[4].value[:9]) != 'Card Task'):
        print ("Is not a card task: " + str(row[1].value))
Essentially my problem is that I am not able to compare all the values from one card task with each other.
I would read through the data once like you have already but store all rows with 'Card Task' in a separate list. Once you have a list of only card task items you can compare.
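card_task_row_object_list = []
count = 0
for row in sheet.rows:
    # row[4] is the cell in column E; skip empty cells before testing the text
    if row[4].value and 'Card Task' in str(row[4].value):
        card_task_row_object_list.append(row)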
From here you would want to compare the time values. What do you need to check: whether two different card task times overlap?
(index 12, column M: start; index 13, column N: end)
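def compare_times(card_task_row_object_list):
    count = 0
    for i, row in enumerate(card_task_row_object_list):
        # Compare each row only with the rows after it, so no pair is counted twice
        for comparison_row in card_task_row_object_list[i + 1:]:
            # Two intervals overlap when each one starts before the other ends
            if (comparison_row[12].value <= row[13].value
                    and comparison_row[13].value >= row[12].value):
                count += 1
    return count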
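
The grouping step described in the question can be sketched separately from the spreadsheet access. This assumes the rows have already been reduced to (card_task, start, end) tuples in sheet order, with start taken from column M and end from column N, and that the time values are directly comparable (numbers or datetime objects). count_overlaps_by_task is a hypothetical helper, and it counts an overlap only when an entry starts before the previous entry in the same Card Task has ended:

```python
from collections import defaultdict

def count_overlaps_by_task(entries):
    """Return {card_task: number of consecutive entries that overlap}."""
    groups = defaultdict(list)
    for task, start, end in entries:
        # Ignore rows whose column E value is empty or not a Card Task
        if task and str(task).startswith("Card Task"):
            groups[task].append((start, end))

    counts = {}
    for task, intervals in groups.items():
        count = 0
        # Pair every entry with the one that follows it in the same group
        for (_, prev_end), (next_start, _) in zip(intervals, intervals[1:]):
            if next_start < prev_end:
                count += 1
        counts[task] = count
    return counts

sample = [
    ("Card Task A", 0, 10),
    ("Card Task A", 5, 12),   # starts before the previous entry ended
    ("Card Task A", 12, 15),
    (None, 1, 2),             # no Card Task value, ignored
    ("Card Task B", 0, 3),
    ("Card Task B", 4, 6),
]
print(count_overlaps_by_task(sample))
# {'Card Task A': 1, 'Card Task B': 0}
```

The resulting dictionary can then be written to a text file line by line.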

Imitate the copy function of Excel or LibreOffice Calc with openpyxl and python3 (copy with properties to new position)

Is there a way to imitate the copy function of "Excel" or "LibreOffice Calc" using openpyxl and python3?
I would like to specify a certain range (e.g. "A1:E5") and copy the "border", "alignment", "number_format", "value", "merged_cells", ... properties of each cell to another position (and probably to another worksheet), whereby the formulas used should be updated automatically to the new position. The new formulas are intended to refer to cells within the target worksheet and not to the old cells in the original worksheet.
My project:
I generate a workbook for every month. This workbook contains yield monitoring tables that list all working days.
Although the tables differ from month to month, all have the same structure within a workbook, so I would like to create a template and paste it into the individual worksheets.
Copying the entire worksheet is not really a solution because I would also like to specify the position of the table individually for every worksheet. So the position in the target sheet could differ from the position in the template.
My current code (where the formulas are not automatically updated):
import copy
# The tuple "topLeftCell" represents the assumed new position in the target worksheet. It is zero-based. (e.g. (0,0) or (7,3))
# "templateSheet" is the template from which to copy.
# "worksheet" is the target worksheet
# Create the same "merged_cells" as in the template at the new position
for cell_range in templateSheet.merged_cells.ranges:
    startCol, startRow, endCol, endRow = cell_range.bounds
    worksheet.merge_cells(start_column=topLeftCell[0] + startCol,
                          start_row=topLeftCell[1] + startRow,
                          end_column=topLeftCell[0] + endCol,
                          end_row=topLeftCell[1] + endRow)
colNumber = topLeftCell[0] + 1 # 1 is added because topLeftCell is zero-based.
rowNumber = topLeftCell[1] + 1 # 1 is added because topLeftCell is zero-based.
# Copy the properties of the old cells into the target worksheet
for row in templateSheet[templateSheet.dimensions]:
    tmpCol = colNumber # sets the column back to the original value
    for cell in row:
        ws_cell = worksheet.cell(column=tmpCol, row=rowNumber)
        ws_cell.alignment = copy.copy(cell.alignment)
        ws_cell.border = copy.copy(cell.border)
        ws_cell.font = copy.copy(cell.font)
        ws_cell.number_format = copy.copy(cell.number_format)
        ws_cell.value = cell.value
        tmpCol += 1 # move one column further
    rowNumber += 1 # moves to the next line
Since copying ranges is actually a common task, I assumed that openpyxl provides a function or method for doing so. Unfortunately, I could not find one so far.
I'm using openpyxl version 2.5.1 (and Python 3.5.2).
Best regards
AFoeee
openpyxl will let you copy entire worksheets within a workbook. That should be sufficient for your work but if you need any more you will need to write your own code.
Since it seems that openpyxl does not provide a general solution for my problem, I proceeded as follows:
I created a template with the properties of the cells set (borders, alignment, number format, etc.). Although the formulas are entered in the respective cells, columns and rows are replaced by placeholders. These placeholders indicate the offset to the "zero point".
The template area is copied as described above, but when copying "cell.value", the placeholder is used to calculate the actual position in the target worksheet.
Best regards
AFoeee
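
The placeholder syntax is not given in the answer, so the following is only a sketch under an assumed format: {c+N,r+M} stands for the cell N columns and M rows away from the top-left corner of the pasted area. resolve_formula, column_letter and PLACEHOLDER are hypothetical names, and top_left is a zero-based (column, row) tuple like topLeftCell in the question.

```python
import re

# Assumed placeholder format: {c+N,r+M} with signed integer offsets
PLACEHOLDER = re.compile(r"\{c([+-]\d+),r([+-]\d+)\}")

def column_letter(index):
    """Convert a 1-based column index to letters (1 -> A, 27 -> AA)."""
    letters = ""
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters

def resolve_formula(template, top_left):
    """Replace every placeholder with the A1-style reference it points to."""
    def to_reference(match):
        # +1 converts the zero-based position into 1-based sheet coordinates
        col = top_left[0] + int(match.group(1)) + 1
        row = top_left[1] + int(match.group(2)) + 1
        return f"{column_letter(col)}{row}"
    return PLACEHOLDER.sub(to_reference, template)

template = "=SUM({c+0,r+1}:{c+0,r+4})"
print(resolve_formula(template, (0, 0)))  # =SUM(A2:A5)
print(resolve_formula(template, (7, 3)))  # =SUM(H5:H8)
```

In the copy loop, the call would wrap cell.value whenever the value is a string starting with "=".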

Looping and XLSXwriter formatting of a row

I have a workbook with a number of sheets that I want to format after it's created, and I want to alter the colors of the header row based on column. For example, I want the first 9 columns to be one color, then column 10 should be another, then all the rest should be a third color.
This is what I am looping through...it sort of works, but all the cells in row 0 end up the same color; the last color assigned always overwrites the previous columns.
visitFormat = mtbook.add_format({'bg_color':'#e9ccfc'})
cognotesFormat = mtbook.add_format({'bg_color':'#d2eff2'})
filedateFormat = mtbook.add_format({'bg_color':'#8cbcff'})
for worksheet in mtbook.worksheets():
    print(worksheet)
    # for every column
    for i in range(len(subreportCols)):
        # set header bgcolor based on current column (i)
        if [i] in range(0,11):
            useheader = visitFormat
        elif [i] == 10:
            useheader = cognotesFormat
        else:
            useheader = filedateFormat
        # Write the value from cell (first row, column=1) back into that cell with formatting applied
        worksheet.write(0, i, subreportCols[i], useheader)
I'm confused by this, since I thought it was writing each column separately. Do I need to do this cell by cell somehow?
Thank you!
Solved it through troubleshooting, leaving up in case it helps someone else (there is an "Answer Your Question" button, after all).
In this line:
if [i] in range(0,11):
...what I thought I was doing was using [i] as a reference to the i'th value in my list, but [i] actually creates a new one-element list, which never matches a number. I swapped out [i] for just i, and that worked fine.
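
The difference between [i] and i can be checked in isolation. The sketch below also shows one way to express the column rule from the question (first nine columns, then the tenth, then the rest); header_style is a hypothetical helper that returns plain names rather than xlsxwriter format objects, and the boundaries are taken from the prose description, not from the original range(0,11):

```python
def header_style(col_index):
    """Pick a header style name for a zero-based column index."""
    if col_index < 9:      # first nine columns (indexes 0-8)
        return "visit"
    if col_index == 9:     # tenth column
        return "cognotes"
    return "filedate"      # every remaining column

i = 3
print([i] in range(0, 11))  # False: a one-element list never equals an int
print(i in range(0, 11))    # True
print([header_style(c) for c in (0, 8, 9, 10)])
# ['visit', 'visit', 'cognotes', 'filedate']
```

Note that range(0,11) also contains 10, so with that range the elif branch for column 10 is never reached even after [i] is replaced by i.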
