How to get Python script to write to existing sheet - python

I am writing a Python script and stuck on one of the early steps. I am opening an existing sheet and want to add two columns so I have used this:
#import the writer
import xlwt
#import the reader
import xlrd
#open the sussex results spreadsheet
book = xlrd.open_workbook('sussex.xlsx')
#open the first sheet
first_sheet = book.sheet_by_index(0)
#print the values in the second column of the first sheet
print(first_sheet.col_values(1))
#in row 0, column 6 (seventh cell of the first row) write "NIF"
sheet1.write(0, 6, "NIF")
#in row 0, column 7 (eighth cell of the first row) write "Points scored"
sheet1.write(0, 7, "Points scored")
On line 12 I get an error:
name 'sheet1' is not defined
How do I define sheet 1 within the sheet that I have already opened?

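sheet1 is never declared; the sheet you opened is bound to the name first_sheet. Renaming gets rid of the NameError, but keep in mind that xlrd sheets are read-only, so actually writing the cells needs a library that can write (for example the pandas approach in the edit further down). Try changing it to: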
#import the writer
import xlwt
#import the reader
import xlrd
#open the sussex results spreadsheet
book = xlrd.open_workbook('sussex.xlsx')
#open the first sheet
first_sheet = book.sheet_by_index(0)
#print the values in the second column of the first sheet
print(first_sheet.col_values(1))
#in row 0, column 6 (seventh cell of the first row) write "NIF"
first_sheet.write(0, 6, "NIF")
#in row 0, column 7 (eighth cell of the first row) write "Points scored"
first_sheet.write(0, 7, "Points scored")
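To actually persist the two new header cells in an .xlsx file, one option is openpyxl, which can load and modify an existing workbook. This is a minimal sketch and not the approach used in the answer above: openpyxl, the helper names, and the 1-based column numbers 7 and 8 (the equivalents of zero-based columns 6 and 7) are assumptions for illustration.

```python
def header_targets(first_col, titles):
    """Return (row, column, title) triples for header cells in row 1 (1-based, openpyxl style)."""
    return [(1, first_col + offset, title) for offset, title in enumerate(titles)]


def add_header_cells(path, first_col, titles):
    """Load the workbook at `path`, write `titles` into row 1 of the first sheet, and save it."""
    # Imported here so header_targets stays usable without openpyxl installed.
    from openpyxl import load_workbook

    wb = load_workbook(path)
    ws = wb.worksheets[0]  # first sheet, like book.sheet_by_index(0)
    for row, col, title in header_targets(first_col, titles):
        ws.cell(row=row, column=col, value=title)
    wb.save(path)  # overwrites the file in place


# Usage (file name taken from the question):
# add_header_cells('sussex.xlsx', 7, ["NIF", "Points scored"])
```

Saving back to the same path overwrites the original file, so working on a copy first may be safer.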
Edit: You could also use pandas to read and write Excel files:
import pandas as pd
import numpy as np
#open the sussex results spreadsheet, first sheet is used automatically
df = pd.read_excel('sussex.xlsx')
#print the values in the second column of the first sheet
print(df.iloc[:,1])
#create column 'NIF'
df['NIF'] = np.nan #I don't know what you want to do with this column, so I filled it with NaNs
#create column 'Points scored'
df['Points scored'] = np.nan #I don't know what you want to do with this column, so I filled it with NaNs
#... do whatever calculations you want with NIF and Points scored ...
#write output (index=False keeps the row index out of the sheet)
df.to_excel('sussex.xlsx', index=False)

I guess you need something like
sheet1 = book.sheet_by_index(0)
because sheet1 is currently not defined.
Also, the document is opened with xlrd, which is only a reader. To write values you also need a writer such as xlwt.

Related

Format and manipulate data across multiple Excel sheets in Python using openpyxl before converting to Dataframe

I need some help with editing the sheets within my Excel workbook in python, before I stack the data using pd.concat(). Each sheet (~100) within my Excel workbook is structured identically, with the unique identifier for each sheet being a 6-digit code that is found in line 1 of the worksheet.
I've already done the following steps to import the file, unmerge rows 1-4, and insert a new column 'C':
import openpyxl
import pandas as pd
wb = openpyxl.load_workbook('data_sheets.xlsx')
for sheet in wb.worksheets:
    #unmerge every merged range on the sheet (iterate over a copy, since unmerging changes the collection)
    for merge in list(sheet.merged_cells):
        sheet.unmerge_cells(range_string=str(merge))
    #insert one empty column at position 3 (column C)
    sheet.insert_cols(3, 1)
    print(sheet)
wb.save('workbook_test.xlsx')
#concat once worksheets have been edited
df = pd.concat(pd.read_excel('workbook_test.xlsx', sheet_name=None), ignore_index=True)
Before stacking the data, however, I would like to make the following additional (sequential) changes to every sheet:
Extract the right 8 characters from row 1 (the Excel equivalent would be =RIGHT(A1, 8)). This pulls the unique code off each sheet, which will look like '(000000)'.
Populate column C from rows 6-282 with the unique code.
Delete rows 1-5.
The end result would make each sheet within the workbook look like this:
Is this possible to do with openpyxl, and if so, how? Any direction or assistance with this would be much appreciated - thank you!
Here is a pure openpyxl approach to achieve what you're looking for:
from openpyxl import load_workbook
wb = load_workbook("workbook_test.xlsx")
for ws in wb:
    ws.unmerge_cells("A1:O1") #unmerge first row up to column O
    ws_uid = ws.cell(row=1, column=1).value[-8:] #get the sheet's UID
    for num_row in range(6, 283): #rows 6-282 inclusive
        ws.cell(row=num_row, column=3).value = '="{}"'.format(ws_uid) #write UID in column C
    ws.delete_rows(1, 5) #delete first 5 rows
wb.save("workbook_test.xlsx")
NB: This assumes there is already an empty column (C).
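Two details matter in this approach: the slice that pulls the code out of the title, and the fact that range() excludes its stop value. A small pure-Python sketch, using a made-up title string (the real content of A1 is not shown in the question) and hypothetical helper names:

```python
def extract_uid(title):
    """Return the last 8 characters of the title, the Python analogue of =RIGHT(A1, 8)."""
    return title[-8:]


def rows_inclusive(first, last):
    """Row numbers from first to last, both included (range() excludes its stop value)."""
    return range(first, last + 1)


title = "Quarterly results (123456)"  # hypothetical content of cell A1
uid = extract_uid(title)
rows = rows_inclusive(6, 282)
print(uid)                           # (123456)
print(rows[0], rows[-1], len(rows))  # 6 282 277
```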

Formatting of Excel sheets in Python

In my project I am opening an Excel file with multiple sheets. I want to manipulate "sheet2" in Python (which works fine) and then overwrite the old "sheet2" with the new one but KEEP the formatting, so something like this:
import pandas as pd
update_sheet2 = pd.read_excel(newest_isaac_file, sheet_name='sheet2')
#do stuff with the sheet
with pd.ExcelWriter(filepath, engine='openpyxl', if_sheet_exists='replace', mode='a',
                    KEEP_FORMATTING = True) as writer:
    df.to_excel(writer, sheet_name=sheetname, index=index)
In other words: Is there a way to get the formatting from an existing Excel sheet?
I could not find anything about that. I know I can manually set the formatting in Python but the formatting of the existing sheet is really complicated and has to stay the same.
thanks for your help!
As per your comment, try this code. It opens a file (Sample.xlsx), goes to a sheet (Sheet1), inserts a new row at 15, then copies the text and formatting from row 22 and pastes them into the empty row (row 15). Code and final screenshot attached.
import openpyxl
from copy import copy
wb=openpyxl.load_workbook('Sample.xlsx') #Load workbook
ws=wb['Sheet1'] #Open sheet
ws.insert_rows(15, 1) #Insert one row at 15 and move everything one row downwards
for row in ws.iter_rows(min_row=22, max_row=22, min_col=1, max_col=ws.max_column): # Read values from row 22
    for cell in row:
        ws.cell(row=15, column=cell.column).value = cell.value #Copy value from row 22 into new row 15
        ws.cell(row=15, column=cell.column)._style = copy(cell._style) #Copy formatting
wb.save('Sample.xlsx')
How excel looks after running the code

Write an excel formula all column with python

I have an existing Excel document and want to update column M according to column A. I want to start from the second row so that the first row (the header) is kept.
Here is my code:
import openpyxl
wb = openpyxl.load_workbook('D:\Documents\Desktop\deneme/formula.xlsx')
ws=wb['Sheet1']
for i, cellObj in enumerate(ws['M'], 1):
    cellObj.value = '=_xlfn.ISOWEEKNUM(A2)'.format(i)
wb.save('D:\Documents\Desktop\deneme/formula.xlsx')
When I run that code:
- the first row (the header) changes.
- every cell in the column gets "ISOWEEKNUM(A2)", but I want the reference to follow the row number (A3, A4, A5..., giving "ISOWEEKNUM(A3), ISOWEEKNUM(A4), ISOWEEKNUM(A5)...").
Edit:
I have now worked around the ISOWEEKNUM issue with the code below, by changing A2 to A2:A5.
import openpyxl
wb = openpyxl.load_workbook('D:\Documents\Desktop\deneme/formula.xlsx')
ws=wb['Sheet1']
for i, cellObj in enumerate(ws['M'], 1):
    cellObj.value = '=_xlfn.ISOWEEKNUM(A2:A5)'.format(i)
wb.save('D:\Documents\Desktop\deneme/formula.xlsx')
But it still starts from the first row.
Here is an answer using pandas.
Let us consider the following spreadsheet:
First import pandas:
import pandas as pd
Then load the third sheet of your excel workbook into a dataframe called df:
df = pd.read_excel(r'D:\Documents\Desktop\deneme/formula.xlsx', sheet_name='Sheet3')
Update column 'column_to_update' using column 'deneme'. The line below converts the dates in the 'deneme' column from strings to datetime objects and then returns the ISO week of the year for each of those dates.
df['column_to_update'] = pd.to_datetime(df['deneme']).dt.isocalendar().week
You can then save your dataframe to a new excel document:
df.to_excel('./newspreadsheet.xlsx', index=False)
Here is the result:
You can see that the values in 'column_to_update' got updated from 1, 2 and 3 to 12, 12 and 18.
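If the formula should stay in the workbook, as in the question's openpyxl attempt, the row number has to be substituted into the formula text and the loop has to start at row 2 so the header is left alone. A minimal sketch, assuming the dates are in column A and the formulas go in column M; the helper names are hypothetical:

```python
def isoweek_formulas(first_row, last_row, source_col="A"):
    """Map each row number to an ISOWEEKNUM formula pointing at that row's source cell."""
    return {
        row: "=_xlfn.ISOWEEKNUM({}{})".format(source_col, row)
        for row in range(first_row, last_row + 1)
    }


def write_formulas(ws, target_col="M", first_row=2):
    """Fill target_col of an openpyxl worksheet from first_row down to ws.max_row."""
    for row, formula in isoweek_formulas(first_row, ws.max_row).items():
        ws["{}{}".format(target_col, row)] = formula


print(isoweek_formulas(2, 4))
```

With a workbook loaded through openpyxl.load_workbook, calling write_formulas(wb['Sheet1']) and then wb.save(...) would apply the formulas; row 1 is never touched because the loop starts at first_row.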

How to write printed value from one sheet to newly created one

I am new to Python so only just getting to grips with it and would really appreciate some help as I can't figure out how to write the values from file A into file B.
I would like to:
filter the values of column D from 'mf_mar_2018.xls' (filter of 'saxon')
write the found values into a new file called 'saxons.xls'
I am able to get the non-filtered values and print them in Terminal.
My script is below:
#import the writer
import xlwt
#open the spreadsheet
workbook = xlwt.Workbook()
#add a sheet named "Club BFA ranking"
worksheet1 = workbook.add_sheet("Club BFA ranking")
#in cell 0,0 (first cell of the first row) write "Ranking"
worksheet1.write(0, 0, "Ranking")
#in cell 0,1 (second cell of the first row) write "Name"
worksheet1.write(0, 1, "Name")
#save and create the spreadsheet file
workbook.save("saxons.xls")
#import the reader
import xlrd
#open the rankings spreadsheet
book = xlrd.open_workbook('mf_mar_2018.xls')
#open the first sheet
first_sheet = book.sheet_by_index(0)
#print the values in the second column of the first sheet
print(first_sheet.col_values(1))
Try something like this:
name = []
rank = []
for i in range(first_sheet.nrows):
    #print(first_sheet.cell_value(i,3))
    if 'Saxon' in first_sheet.cell_value(i,3):
        name.append(first_sheet.cell_value(i,2))
        rank.append(first_sheet.cell_value(i,8))
        print('a')
for j in range(len(name)):
    worksheet1.write(j+1,0,rank[j])
    worksheet1.write(j+1,1,name[j])
workbook.save("saxons.xls")
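The filtering step can be tried out without any spreadsheet file. This sketch works on plain lists standing in for the rows xlrd would return. The column positions (club in index 3, name in index 2, rank in index 8) mirror the answer above, but the sample data is made up, and the case-insensitive match is an assumption, since the question says 'saxon' while the answer tests for 'Saxon'.

```python
def filter_rows(rows, keyword, club_col=3, name_col=2, rank_col=8):
    """Return (rank, name) pairs for rows whose club cell contains keyword, ignoring case."""
    matches = []
    for row in rows:
        club = str(row[club_col])  # str() guards against numeric cells
        if keyword.lower() in club.lower():
            matches.append((row[rank_col], row[name_col]))
    return matches


# Made-up rows with nine columns each, standing in for sheet.row_values(i)
rows = [
    ["", "", "Name", "Club", "", "", "", "", "Ranking"],
    ["", "", "A. Smith", "Saxon Fencing Club", "", "", "", "", 1.0],
    ["", "", "B. Jones", "Sussex House", "", "", "", "", 2.0],
    ["", "", "C. Brown", "saxon fencing club", "", "", "", "", 3.0],
]
print(filter_rows(rows, "saxon"))  # [(1.0, 'A. Smith'), (3.0, 'C. Brown')]
```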

dbf to xls - first non header row not writing

I would like to convert .dbf file to .xls using python. I've referenced this snippet, however I cannot get the first non header row to write using this code:
from xlwt import Workbook, easyxf
import dbfpy.dbf
dbf = dbfpy.dbf.Dbf("C:\\Temp\\Owner.dbf")
book = Workbook()
sheet1 = book.add_sheet('Sheet 1')
header_style = easyxf('font: name Arial, bold True, height 200;')
for (i, name) in enumerate(dbf.fieldNames):
    sheet1.write(0, i, name, header_style)
for row in range(1, len(dbf)):
    for col in range(len(dbf.fieldNames)):
        sheet1.row(row).write(col, dbf[row][col])
book.save("C:\\Temp\\Owner.xls")
How can I get the first non header row to write?
Thanks
You are skipping row 0 of the dbf, which is its first record. In dbf files the column names are not stored as a row, whereas row 0 of the Excel sheet holds the header. The two indexes therefore differ by one: loop over every dbf row and add 1 to the row used in the Excel worksheet.
So:
for row in range(len(dbf)):
    for col in range(len(dbf.fieldNames)):
        sheet1.row(row+1).write(col, dbf[row][col])
Note that the snippet referred to does not add the 1 either.
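The off-by-one can be seen with plain lists. In this sketch the field names and records are made up, and a dictionary keyed by (row, column) stands in for the worksheet:

```python
def to_sheet_cells(field_names, records):
    """Lay out a header in sheet row 0 and put record i in sheet row i + 1."""
    cells = {}
    for col, name in enumerate(field_names):
        cells[(0, col)] = name
    for row, record in enumerate(records):
        for col, value in enumerate(record):
            cells[(row + 1, col)] = value  # shift by one to skip the header row
    return cells


field_names = ["OWNER", "CITY"]                # hypothetical dbf field names
records = [["Ann", "Leeds"], ["Bob", "York"]]  # hypothetical dbf records
cells = to_sheet_cells(field_names, records)
print(cells[(1, 0)], cells[(2, 1)])  # Ann York
```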
