Writing new excel file without data from previous workbook - python

I am trying to write a program that compares strings from a fixed matrix against two specific columns of an Excel file. As a first step, I want to find the rows in which a match occurs. So far, comparing a single string from the matrix works.
import openpyxl as xl
from IDM import idm_matrix
wb = xl.load_workbook('Auswertung_C33.xlsx')
sheet = wb['TriCad_Format']
for row in range(2, sheet.max_row + 1):
    cell = sheet.cell(row, 8)
    if idm_matrix[0][0] in cell.value:
        sheet.cell(row=2, column=1).value = cell.value
wb.save('Auswertung.xlsx')
Question: How can I save the matching values in a new file that does NOT contain the data of the workbook loaded above?
For further help with this project I will get back to you as soon as I am facing more difficulties with the matrix comparison.
Thanks for your help.
Regards, Alex

You will need to create a new workbook to save your result (the comparison result), something like the code below. Hope this is helpful.
import openpyxl as xl
from IDM import idm_matrix
wb = xl.load_workbook('Auswertung_C33.xlsx')
result_wb = xl.Workbook()  # new workbook to hold your result
result_sheet = result_wb.active  # active sheet of the new workbook
sheet = wb['TriCad_Format']
for row in range(2, sheet.max_row + 1):
    # collect the values (not the Cell objects) of the current row
    row_list = []
    for col in range(1, sheet.max_column + 1):
        cell = sheet.cell(row, col)
        row_list.append(cell.value)
    # idm_matrix[0][0] is the single entry tested in the question; adjust the
    # indices, or add another loop over the matrix, to test further entries
    if idm_matrix[0][0] in row_list:
        result_sheet.append(row_list)
result_wb.save('Auswertung.xlsx')  # save only the result workbook

#henjiFire: This is what the code looks like right now:
for row in range(2, sheet.max_row + 1):
    row_list = []
    for col in range(1, sheet.max_column + 1):
        cell = sheet.cell(row, col)
        row_list.append(cell.value)
    # adjust row,col offset to match your matrix index below, e.g. row-2, col-1. you might need another loop to loop through your matrix.
    if idm_matrix[0][0] in row_list:
        if row_list[14] is not None and idm_matrix[0][1] in row_list[14]:
            result_sheet.append(row_list)
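To extend the check to every entry of the matrix, one option is to keep the matching separate from the workbook handling. The following is only a minimal sketch, not the asker's actual code. It assumes that idm_matrix is a list of (first, second) string pairs and that both strings are tested as substrings. The helper names matches_pair, filter_rows and write_result are hypothetical. Column positions are 0-based indices into the row, so 7 is column H and 14 is column O, and every row is assumed to have at least 15 cells.

```python
def matches_pair(row, pair, first_col=7, second_col=14):
    """Return True if both strings of `pair` occur in the two chosen cells."""
    first, second = row[first_col], row[second_col]
    # Empty cells come back as None; non-string cells are skipped as well.
    if not isinstance(first, str) or not isinstance(second, str):
        return False
    return pair[0] in first and pair[1] in second


def filter_rows(rows, pairs, first_col=7, second_col=14):
    """Keep every row that matches at least one pair of the matrix."""
    return [
        list(row)
        for row in rows
        if any(matches_pair(row, pair, first_col, second_col) for pair in pairs)
    ]


def write_result(src_path, sheet_name, dst_path, pairs):
    """Copy matching rows into a brand-new workbook (needs openpyxl)."""
    import openpyxl  # imported here so the helpers above work without it

    source = openpyxl.load_workbook(src_path)[sheet_name]
    result_wb = openpyxl.Workbook()
    result_sheet = result_wb.active
    # min_row=2 skips the header row, as in the loops above.
    rows = source.iter_rows(min_row=2, values_only=True)
    for row in filter_rows(rows, pairs):
        result_sheet.append(row)
    result_wb.save(dst_path)
```

With the file names from the question, a call such as write_result('Auswertung_C33.xlsx', 'TriCad_Format', 'Auswertung.xlsx', idm_matrix) should then write only the matching rows to the new file.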

Related

Search specific text(text pattern) in excel and copy all resulting rows to another sheet in same workbook using openpyxl

I have an Excel file with multiple sheets. The 3rd column of sheet3 (around 500 rows) contains various names. I want to search column 3 for specific text and, if it matches, copy the whole row along with the header row to a new sheet within the same Excel file.
The issue with the "name column" is that most of the texts refer to the same item but follow different naming conventions, so:
अपर तहसीलदार,
नायाब तहसीलदार,
नायब तहसीलदार,
अतिरिक्त तहसीलदार
refer to the same item but are written differently, so I have to search for all variants.
I have no prior Python or openpyxl background so what I've got so far is:
import openpyxl
wb = openpyxl.load_workbook(r'C:/Users/Anas/Downloads/rcmspy.xlsx')
# active worksheet data
ws = wb.active
def wordfinder(searchString):
    for i in range(1, ws.max_row + 1):
        for j in range(1, ws.max_column + 1):
            if searchString == ws.cell(i, j).value:
                print("found")
                print(ws.cell(i, j))
wordfinder("अपर तहसीलदार")
It does not show any error, but it does not print anything either.
The excel sheet looks something like this:
I'm not certain, but I would suggest something along the lines of:
variants = {'alpha', 'alfa', 'elfa'}
data = []
rowCount = 0
for row in ws.values:
    # each row is a tuple of cell values
    if rowCount == 0:
        # header row
        data.append(row)
    elif row[2] in variants:
        data.append(row)
    rowCount += 1
wsNew = wb.create_sheet('Variations')
for line in data:
    wsNew.append(line)
wb.save('newWorkbook.xlsx')
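One possible reason why an exact comparison finds nothing is stray whitespace around the cell text, or a different Unicode normalization form than the search string. That is only a guess, since the sheet itself is not shown. The sketch below normalizes both sides before comparing. The helper names normalize and select_rows are hypothetical, and column_index=2 assumes the names are in the third column.

```python
import unicodedata


def normalize(text):
    """Strip surrounding whitespace and apply NFC normalization."""
    if not isinstance(text, str):
        return text  # leave None and numbers untouched
    return unicodedata.normalize("NFC", text.strip())


def select_rows(rows, variants, column_index=2):
    """Return the header row plus every row whose cell matches a variant."""
    wanted = {normalize(v) for v in variants}
    selected = []
    for position, row in enumerate(rows):
        # position 0 is the header row, which is always kept
        if position == 0 or normalize(row[column_index]) in wanted:
            selected.append(row)
    return selected
```

With the workbook from the question, select_rows(ws.values, {"अपर तहसीलदार", "नायब तहसीलदार"}) would give rows that can be appended one by one to a sheet created with wb.create_sheet('Variations').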

How to copy data from one sheet to another while skipping empty cells - Python and Openpyxl

I'm using Python to format an Excel spreadsheet. I need to copy data from column L in sheet #1, "Main", and paste it into column A in sheet #2, "Data". I've gotten this working, but I also want to skip empty cells, which occur randomly in sheet #1, and here I ran into trouble.
I tried:
for i in range(2, 50):
    for j in range(12, 13):
        if cell.value != None:
            data.cell(row=i, column=j-11).value = main.cell(row=i, column=j).value
However I get the error message "NameError: name 'value' is not defined"
Any ideas?
This is the code we got working (see the comments for the back and forth):
import os
import openpyxl
wb = openpyxl.load_workbook('/Users/path/.xlsx')
main = wb['Sheet1']
wb.create_sheet(title='Formatted Data')
data = wb['Formatted Data']
for i in range(2, 50):
    for j in range(12, 13):
        # copy the cell only if it is not empty
        if main.cell(i, j).value is not None:
            data.cell(data.max_row + 1, column=j - 11).value = main.cell(row=i, column=j).value
Wherever possible you should avoid using your own counters and let openpyxl do the work for you. For a new worksheet this is pretty easy.
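# 11 blank cells put the value in column L; drop the padding to write to column A
empty_row = [None] * 11
for row in main.iter_rows(min_col=12, max_col=12, min_row=2, values_only=True):
    if row[0] is not None:
        # each row is a tuple, so convert it before joining it to the list
        data.append(empty_row + list(row))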
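If the values should end up in column A of the new sheet, as the question asks, each appended row can simply contain one cell. A minimal sketch under that assumption follows. The helper names non_empty_rows and copy_column are hypothetical, and main and data are expected to be openpyxl worksheets like the ones above.

```python
def non_empty_rows(values):
    """Turn a column of values into one-cell rows, dropping empty cells."""
    return [[value] for value in values if value is not None]


def copy_column(main, data, source_col=12, first_row=2):
    """Append the non-empty cells of one column of `main` to column A of `data`."""
    column = main.iter_rows(
        min_col=source_col, max_col=source_col, min_row=first_row, values_only=True
    )
    # Each item yielded by iter_rows is a 1-tuple, so take its only element.
    for row in non_empty_rows(cells[0] for cells in column):
        data.append(row)
```

Only None counts as empty here, so a cell that holds 0 or an empty string would still be copied.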

Python-Excel: How to write lines of a row to an existing excel file?

I have searched the site but I could not find anything related to the following question.
I have an existing spreadsheet that I am going to pull data from on a daily basis; the information in the spreadsheet will change every day.
What I want to do is create a file that tracks certain information from it: I want to pull the data from the spreadsheet and write it to another spreadsheet. Adding the data to the new spreadsheet should not overwrite the existing data. I would really appreciate the help on this. See the code below:
import os
import openpyxl
import xlrd
wb = openpyxl.load_workbook('Test_shorts.xlsx', 'r')
sheet = wb.active
rows = sheet.max_row
col = sheet.max_column
rows = rows + 1
print(rows)
new = []
for x in range(2, 3):
    for y in range(1, 10):
        z = sheet.cell(row=x, column=y).value
        new.append(z)
print(new)
If you want to copy the whole worksheet, you can use the copy_worksheet() function directly. It creates a copy of your active worksheet.
I don't know your data, but I am sure you can finish it yourself. Hope this helps.
from openpyxl import load_workbook
file_name = "Test_shorts.xlsx"
wb = load_workbook(file_name)
sheet = wb.active
target = wb.copy_worksheet(sheet)  # copy of the active sheet in the same workbook
# you can code to append new data here
new = wb[target.title]  # to get copied sheet
for x in range(2, 3):
    row_values = []
    for y in range(1, 10):
        print(x, y)
        row_values.append(sheet.cell(row=x, column=y).value)
    new.append(row_values)  # append() takes a whole row, not a single value
wb.save(file_name)
As commented, a loop over the cells is required, so I altered your code a little.
from openpyxl import load_workbook
file_name = "Test_shorts.xlsx"
wb = load_workbook(file_name)
current_sheet = wb.active
new_sheet = wb.create_sheet("New", 1)
for row in current_sheet.rows:
    col = 0  # reset the column counter at the start of each row
    for cell in row:
        col += 1  # older openpyxl versions return a letter from cell.column, so col is counted manually
        new_sheet.cell(cell.row, col, cell.value)
wb.save(file_name)
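Since the question asks for a separate tracking file that keeps growing, another option is to append one row per run to a second workbook. The following is only a sketch under assumptions. The helper names stamped_row and append_to_log and the log_path argument are hypothetical, the date prefix is an addition for illustration, and the row and column range mirror the loops in the question (row 2, columns 1 to 9).

```python
import os
from datetime import date


def stamped_row(values, day=None):
    """Prefix a row of values with an ISO date so each run stays identifiable."""
    day = day or date.today()
    return [day.isoformat(), *values]


def append_to_log(source_path, log_path, first_col=1, last_col=9, row=2):
    """Append one row of the source sheet to a separate log workbook (needs openpyxl)."""
    import openpyxl  # imported here so stamped_row works without it

    source = openpyxl.load_workbook(source_path).active
    values = [
        source.cell(row=row, column=col).value
        for col in range(first_col, last_col + 1)
    ]
    # Reuse the log file if it already exists, otherwise start a new one.
    if os.path.exists(log_path):
        log_wb = openpyxl.load_workbook(log_path)
    else:
        log_wb = openpyxl.Workbook()
    # append() adds the row at the bottom of the sheet, so earlier entries are kept.
    log_wb.active.append(stamped_row(values))
    log_wb.save(log_path)
```

Running append_to_log('Test_shorts.xlsx', some_log_file) once a day would then add one new line per day, where some_log_file stands for whatever path you choose for the tracking workbook.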

Iterate through worksheets adding data to each iteration

I have an Excel file in which all data is listed in rows (first image). I need to take this data and list it in column A of individual worksheets in a newly created workbook (it needs to look like the second image). I am having trouble getting the 'for' loop right so that the data is written to each separate worksheet. My code currently writes all the data to the same worksheet.
import openpyxl
import os
import time
wb = openpyxl.load_workbook('IP-Results.xlsx') #load input file
sheet = wb.get_sheet_by_name('IP-Results-32708') #get sheet from input file
wbOutput = openpyxl.Workbook() #open a new workbook
wbOutput.remove_sheet(wbOutput.get_sheet_by_name('Sheet')) #remove initial worksheet named 'sheet'
for cell in sheet['A']: #iterate through firewall names in column A and make those the title of the sheets in new workbook
    value = cell.value
    wbOutput.create_sheet(title=cell.value)
inputwb = wb
inputsheet = inputwb.active
outputwb = wbOutput
outputsheet = outputwb.active
maxRow = inputsheet.max_row
maxCol = inputsheet.max_column
for i in range(1, max(maxRow, maxCol) +1):
    for j in range(1, min(maxRow, maxCol) + 1):
        for sheet in outputwb.get_sheet_names():
            outputsheet.cell(row=i, column=j).value = inputsheet.cell(row=j, column=i).value
            outputsheet.cell(row=j, column=i).value = inputsheet.cell(row=i, column=j).value
wbOutput.save("Decom-" + time.strftime("%m-%d-%Y")+ ".xlsx")
'outputsheet' is assigned to refer to the first (the default) sheet in wbOutput:
outputwb = wbOutput
outputsheet = outputwb.active
Then the main loop writes to outputsheet which always refers to the same original worksheet, causing all your data to appear on the same sheet:
for i in range(1, max(maxRow, maxCol) +1):
    for j in range(1, min(maxRow, maxCol) + 1):
        for sheet in outputwb.get_sheet_names():
            outputsheet.cell(row=i, column=j).value = inputsheet.cell(row=j, column=i).value  # <-- always the same sheet
            outputsheet.cell(row=j, column=i).value = inputsheet.cell(row=i, column=j).value  # <-- always the same sheet
The easiest solution would be dropping the third inner loop and using get_sheet_by_name:
for i in range(1, maxRow + 1):
    # column 1 of each input row holds the name of its output sheet
    sheet_name = inputsheet.cell(row=i, column=1).value
    a_sheet = outputwb.get_sheet_by_name(sheet_name)
    for j in range(1, maxCol + 1):
        # input row i becomes column A of its own sheet
        a_sheet.cell(row=j, column=1).value = inputsheet.cell(row=i, column=j).value
I can't test the code at the moment but the general idea should work.
edit
Although it might be worth redesigning to something like this pseudo code:
for each inputwb_row in inputworkbook:
    new_sheet = create a new_sheet in outputworkbook
    set new_sheet.title = inputworkbook.cell[row,1].value
    for each column in inputwb_row:
        new_sheet.cell[column, 1].value = inputworkbook.cell[inputwb_row ,column].value
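The pseudo code above could be turned into Python roughly as follows. This is an untested-against-your-data sketch with some assumptions. The helper names rows_to_columns and write_sheets are hypothetical, the first cell of each row is taken as the sheet title and is also written to the sheet, empty cells are skipped, and the titles are assumed to be valid, unique Excel sheet names.

```python
def rows_to_columns(rows):
    """Map each input row to (sheet title, list of one-cell rows for column A)."""
    result = []
    for row in rows:
        if not row or row[0] is None:
            continue  # a sheet needs a title, so skip rows without one
        cells = [[value] for value in row if value is not None]
        result.append((str(row[0]), cells))
    return result


def write_sheets(input_sheet, output_path):
    """Create one worksheet per input row in a new workbook (needs openpyxl)."""
    import openpyxl  # imported here so rows_to_columns works without it

    output_wb = openpyxl.Workbook()
    output_wb.remove(output_wb.active)  # drop the default empty sheet
    rows = input_sheet.iter_rows(values_only=True)
    for title, cells in rows_to_columns(rows):
        new_sheet = output_wb.create_sheet(title=title)
        for cell_row in cells:
            new_sheet.append(cell_row)  # one value per row, so it lands in column A
    output_wb.save(output_path)
```

Because every sheet is created and filled inside the same loop iteration, there is no shared outputsheet variable that could point at the wrong worksheet.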

XLS with formula in more than one cells within a column with Python

After a long day playing with lots of variants I was left with this code:
from xlrd import open_workbook
from xlwt import Workbook, Formula
from xlutils.copy import copy
rb = open_workbook("test.xls")
wb = copy(rb)
s = wb.get_sheet(0)
s.write(2,4, Formula('D3-B3') )
wb.save('test.xls')
This works for editing an XLS file and lets me enter a formula in a cell. Now I'm stuck on how to put the formula into more than one cell of a column, so that each cell uses the data from its own row: as with D3-B3, the row number should change in each cell to match that row.
With a simple loop:
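s = wb.get_sheet(0)
last_row = 10  # change to your last required row (1-based, as shown in Excel)
for i in range(3, last_row + 1):
    # write() takes 0-based indices: Excel row i is index i - 1, column E is index 4
    s.write(i - 1, 4, Formula('D{row}-B{row}'.format(row=i)))
wb.save('test.xls')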
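The off-by-one between Excel's 1-based row numbers and the 0-based indices passed to write() is easy to get wrong, so it can help to build the formula list separately and check it before writing. A small sketch follows. The helper names difference_formulas and fill_column are hypothetical, and target_col=4 assumes the formulas go into column E as in the question.

```python
def difference_formulas(first_row, last_row, target_col=4):
    """Return (row_index, col_index, formula_text) for Excel rows first_row..last_row.

    Row numbers are 1-based as shown in Excel; the returned indices are
    0-based, which is what the write() call in the question uses.
    """
    return [
        (row - 1, target_col, "D{0}-B{0}".format(row))
        for row in range(first_row, last_row + 1)
    ]


def fill_column(path, first_row, last_row):
    """Write the formulas into the first sheet (needs xlrd, xlwt and xlutils)."""
    from xlrd import open_workbook
    from xlwt import Formula
    from xlutils.copy import copy

    wb = copy(open_workbook(path))
    sheet = wb.get_sheet(0)
    for row_index, col_index, text in difference_formulas(first_row, last_row):
        sheet.write(row_index, col_index, Formula(text))
    wb.save(path)
```

For example, difference_formulas(3, 5) gives the entries for E3 to E5, starting with (2, 4, 'D3-B3'), which matches the single write() call from the question.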
