Append data into new excel sheet with openpyxl - python

I have a bunch of Excel workbooks and I would like to get cell values from them and write them to a new sheet.
My code is not appending new data. It just overwrites the cells with values from the last workbook.
(I've changed the pasted code. It was pasted incorrectly.)
Here is my code:
from openpyxl import load_workbook
booklist = ["17_02.xlsx", "17_03.xlsx",
            "17_04.xlsx", "17_05.xlsx",
            "17_06.xlsx", "17_08.xlsx",
            "17_09.xlsx", "17_10.xlsx"]
for wb in booklist:
    book = load_workbook(filename=wb, data_only=True)
    report = load_workbook(filename="dest.xlsx", data_only=True)
    print(book)
    sheet = book['Sheet']
    reportsheet = report['First']
    row_count = sheet.max_row
    column_count = sheet.max_column
    for r in range(1, row_count+1):
        for c in range(1, column_count+1):
            source = sheet.cell(row=r, column=c)
            dest = reportsheet.cell(row=r, column=c)
            dest.value = source.value
    sheet.title = 'First'
    book.save("dest.xlsx")
Edit:
After mickNeill's answer I changed the code and appending worked. But now there is another problem.
If I run the code a second time or more (after clearing the cells), it appends the data to the rows after the cleared cells.
First run:
Data appended to A1:A20
Clear the cells, save and close the workbook.
Second run:
Data appended to A21:A40 instead of A1:A20 (the cleared cells)
Every time I run the code, the value of reportRow continues to increase (1, 20, 40 ...) and the data is appended further down the sheet.
from openpyxl import load_workbook
booklist = ["17_02.xlsx", "17_03.xlsx",
            "17_04.xlsx", "17_05.xlsx",
            "17_06.xlsx", "17_08.xlsx",
            "17_09.xlsx", "17_10.xlsx"]
for wb in booklist:
    book = load_workbook(filename=wb, data_only=True)
    report = load_workbook(filename="dest.xlsx", data_only=True)
    print(book)
    sheet = book['Sheet']
    reportsheet = report['First']
    row_count = sheet.max_row
    reportRow = reportsheet.max_row
    column_count = sheet.max_column
    for r in range(1, row_count+2):
        for c in range(1, column_count+1):
            source = sheet.cell(row=r, column=c)
            dest = reportsheet.cell(row=reportRow, column=c)
            dest.value = source.value
        reportRow += 1
    report.save("dest.xlsx")

Try this. Edited: you were saving the wrong book on the last line.
from openpyxl import load_workbook
booklist = ["Book5.xlsx", "Book6.xlsx", "Book7.xlsx"]
# Load the destination once, outside the loop, so earlier appends are kept
report = load_workbook(filename="dest.xlsx", data_only=True)
for wb in booklist:
    book = load_workbook(filename=wb, data_only=True)
    #print(book)
    sheet = book['Sheet1']
    reportsheet = report['First']
    row_count = sheet.max_row
    # Start writing on the row below the last used row of the report
    reportRow = reportsheet.max_row + 1
    print(reportRow)
    column_count = sheet.max_column
    for r in range(1, row_count+1):
        for c in range(1, column_count+1):
            print(reportRow)
            source = sheet.cell(row=r, column=c)
            dest = reportsheet.cell(row=reportRow, column=c)
            dest.value = source.value
            sheet.title = 'First'
        reportRow += 1
# Save the report workbook, not the source book
report.save("dest.xlsx")
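The follow-up problem in the question (data landing below cells that were cleared) is consistent with max_row still counting rows whose cells exist in the file but hold no value. Below is a minimal sketch of one workaround, assuming rows are read as tuples of values, for example via ws.iter_rows(values_only=True). The helper names last_filled_row and append_block are hypothetical and not part of openpyxl.

```python
def last_filled_row(rows):
    """Return the 1-based index of the last row holding any non-empty value.

    `rows` is an iterable of value tuples, e.g. ws.iter_rows(values_only=True).
    Returns 0 when every cell is empty.
    """
    last = 0
    for index, row in enumerate(rows, start=1):
        if any(value is not None and value != "" for value in row):
            last = index
    return last


def append_block(ws, rows):
    """Write `rows` below the last row of `ws` that really contains data.

    `ws` is assumed to be an openpyxl worksheet; only ws.iter_rows() and
    ws.cell() are used. Returns the first row that was written.
    """
    start = last_filled_row(ws.iter_rows(values_only=True)) + 1
    for offset, row in enumerate(rows):
        for col, value in enumerate(row, start=1):
            ws.cell(row=start + offset, column=col).value = value
    return start
```

In the loop from the answer, a call such as append_block(reportsheet, sheet.iter_rows(values_only=True)) could stand in for the nested r/c loops, so cleared rows would be reused instead of skipped.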

Related

Python Excel Program Update Every Run

I have this simple code and it creates a file "example.xlsx"
I only need the A1 Cell to have an output for the first run.
This is my initial code
from openpyxl import Workbook
import requests
workbook = Workbook()
sheet = workbook.active
success= "DONE"
sheet["A1"] = requests.get('http://ip.42.pl/raw').text
workbook.save(filename="example.xlsx")
print(success)
The first output is an Excel file, example.xlsx. I am required to update the same Excel file every time the program runs.
For example, the first run fills only A1 with the output from the website http://ip.42.pl/raw, and each following run writes to A2, A3 and so on.
Thank you. I am a beginner, so please bear with me.
I modified the code, and now I think it does what you ask for:
from openpyxl import Workbook, load_workbook
import os
import requests
workbook = Workbook()
filename = "example.xlsx"
success = "DONE"
# First verifies if the file exists
if os.path.exists(filename):
    workbook = load_workbook(filename, read_only=False)
    sheet = workbook.active
    counter = 1
    keep_going = True
    while keep_going:
        cell_id = 'A' + str(counter)
        if sheet[cell_id].value is None:
            sheet[cell_id] = requests.get('http://ip.42.pl/raw').text
            keep_going = False
        else:
            counter += 1
    workbook.save(filename)
    print(success)
else:
    # If file does not exist, you have to create an empty file from excel first
    print('Please create an empty file ' + filename + ' from excel, as it throws error when created from openpyxl')
See the question "xlsx and xlsm files return badzipfile: file is not a zip file" for clarification about why you have to create an empty file from Excel before openpyxl can work with it (the line in the else: statement).
You could use sheet.max_row in openpyxl to get the length. Like so:
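import os
from openpyxl import Workbook, load_workbook
import requests
filename = "example.xlsx"
# Reuse the existing file so earlier runs are kept; create it only on the first run
workbook = load_workbook(filename) if os.path.exists(filename) else Workbook()
sheet = workbook.active
# max_row is 1 even for an empty sheet, so check A1 before moving down a row
next_row = 1 if sheet["A1"].value is None else sheet.max_row + 1
success = "DONE"
sheet.cell(row=next_row, column=1).value = requests.get('http://ip.42.pl/raw').text
workbook.save(filename=filename)
print(success)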

How can I still use functions in Excel and print their output (and not the function) in my Python program?

This is the code I'm using, but if I use a function in Excel, Python prints only the function and not its output. I imagine this is normally fixed using an intermediary like notebook, but the Arabic typeface does not transfer over. Any help you could give would be greatly appreciated.
Thanks,
William M. Hollingsworth
from openpyxl import load_workbook
def main():
    wb = load_workbook(filename="F:\\Quran Gematria.xlsx", read_only=True)
    ws = wb['Sheet1']
    bigdict = {}
    rowcount = 0
    for row in ws.rows:
        rowdict = {}
        rowdict['words'] = row[0].value
        rowdict['sum'] = row[1].value
        rowdict['prime'] = row[2].value
        rowdict['form'] = row[3].value
        rowdict['verse'] = row[4].value
        bigdict[rowcount] = rowdict
        rowcount += 1
    for wordkey in bigdict:
        if bigdict[wordkey]['form'] == 74:
            print(bigdict[wordkey])
main()
Set data_only to True when loading the workbook:
import openpyxl
wb = openpyxl.load_workbook('forecast.xlsx', data_only=True)
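With data_only=True, openpyxl returns the value Excel cached when the file was last saved, so a formula cell reads as its result (or None if Excel never calculated and saved the file). The following is a minimal sketch of the row filtering from the question on top of that, assuming rows arrive as tuples of values such as those yielded by ws.iter_rows(values_only=True). The helper names and the sample data are made up for illustration.

```python
COLUMNS = ("words", "sum", "prime", "form", "verse")  # column order used in the question


def rows_to_dicts(rows, columns=COLUMNS):
    """Turn each tuple of cell values into a dict keyed by column name."""
    return [dict(zip(columns, row)) for row in rows]


def rows_with_form(rows, form):
    """Keep only the rows whose 'form' column equals `form`."""
    return [record for record in rows_to_dicts(rows) if record["form"] == form]


# Made-up sample values standing in for ws.iter_rows(values_only=True)
sample = [
    ("alpha", 10, 2, 74, "1:1"),
    ("beta", 20, 3, 12, "1:2"),
]
matches = rows_with_form(sample, 74)
print(matches)
```

With a real workbook, the sample list would be replaced by the rows of a sheet loaded with data_only=True.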

How to write multiple sheets into a new excel, without overwriting each other?

I'm trying to write multiple excels' column A into a new excel's column A (assuming all the excels have one worksheet each.) I've written some code, which can write one excel's column A into the new excel's column A; but if there are multiple excels, the new excel's column A will be overwritten multiple times. So how could I just add all the column As to the new excel sheet one after another without overwriting each other?
Below is my code:
import os, openpyxl
path = os.getcwd()
def func(file):
    for file in os.listdir(path):
        if file.endswith('.xlsx'):
            wb = openpyxl.load_workbook(file)
            sheet = wb.active
            colA = sheet['A']
            wb = openpyxl.Workbook()
            r = 1
            for i in colA:
                sheet = wb.active
                sheet.cell(row=r, column=1).value = i.value
                r += 1
            wb.save('new.xlsx')
func(file)
Thank you so much!!
You could proceed, for example, as follows:
import os, openpyxl
path = os.getcwd()
def func(outputFile):
    c = 0
    #create output workbook
    wbOut = openpyxl.Workbook()
    sheetOut = wbOut.active
    for fName in os.listdir(path):
        if fName.endswith('.xlsx'):
            c += 1 #move to the next column in output
            wb = openpyxl.load_workbook(fName)
            sheet = wb.active #input sheet
            #for r in range(1, sheet.max_row+1):
            #    sheetOut.cell(row=r, column=c).value = sheet.cell(row = r, column = 1).value
            for r, cell in enumerate(sheet['A']):
                sheetOut.cell(row = r+1, column = c).value = cell.value
    wbOut.save(outputFile)
#"concatenate" all columns A into one single column
def funcAppend(outputFile):
    wbOut = openpyxl.Workbook()
    sheetOut = wbOut.active
    r = 1
    for fName in os.listdir(path):
        if fName.endswith('.xlsx'):
            wb = openpyxl.load_workbook(fName)
            sheet = wb.active
            for cell in sheet['A']:
                sheetOut.cell(row = r, column = 1).value = cell.value
                r += 1
    wbOut.save(outputFile)
func('test.xlsx')
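One thing to watch with os.listdir(path): on a second run the output file (test.xlsx here) sits in the same folder and would be read back in as an input, and Excel's "~$" lock files also end in .xlsx. Here is a small sketch of a filter for that, where input_workbooks is a hypothetical helper name and the file names are made up:

```python
import os


def input_workbooks(names, output_name):
    """Return the .xlsx names to read, in sorted order.

    Skips the output workbook itself and Excel's temporary '~$' lock files.
    """
    selected = []
    for name in sorted(names):
        if not name.lower().endswith(".xlsx"):
            continue
        if name.startswith("~$"):
            continue
        if os.path.normcase(name) == os.path.normcase(output_name):
            continue
        selected.append(name)
    return selected


names = ["b.xlsx", "a.xlsx", "test.xlsx", "~$a.xlsx", "notes.txt"]
print(input_workbooks(names, "test.xlsx"))
```

In func and funcAppend, the loop header could then read: for fName in input_workbooks(os.listdir(path), outputFile). Sorting also makes the column order repeatable between runs.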

Insert hyperlink to a local folder in Excel with Python

This piece of code reads an Excel file. The file holds information such as customer job numbers, customer names, sites, works descriptions, etc.
What this code will do when completed (I hope) is read the last line of the worksheet (this is taken from a counter on the worksheet at cell 'P1'), create folders based on cell content, and create a hyperlink on the worksheet to open the lowest local folder that was created.
I have extracted the info I need from the worksheet to understand what folders need to be created, but I am not able to write a hyperlink to the cell in column B on that row.
#Insert Hyperlink to folder
def folder_hyperlink(last_row_position, destination):
    cols = 'B'
    rows = str(last_row_position)
    position = cols + rows
    final_position = "".join(position)
    print(final_position) # This is just to check the value
    # The statement below should insert hyperlink in eps.xlsm > worksheet jobnoeps at column B and last completed row.
    ws.cell(final_position).hyperlink = destination
The complete code is below; the section above is the part that is meant to create the hyperlink. I have also tried the 'xlsxwriter' package with no joy. I searched the internet and the above snippet is the result of what I found.
Anyone know what I am doing wrong?
__author__ = 'Paul'
import os
import openpyxl
from openpyxl import load_workbook
import xlsxwriter
site_info_root = 'C:\\Users\\paul.EPSCONSTRUCTION\\PycharmProjects\\Excel_Jobs\\Site Information\\'
# This function returns the last row on eps.xlsm to be populated
def get_last_row(cell_ref = 'P1'): #P1 contains the count of the used rows
    global wb
    global ws
    wb = load_workbook("eps.xlsm", data_only = True) #Workbook
    ws = wb["jobnoeps"] #Worksheet
    last_row = ws.cell(cell_ref).value #Value of P1 from that worksheet
    return last_row
# This function will read the job number in format EPS-XXXX-YR
def read_last_row_jobno(last_row_position):
    last_row_data = []
    for cols in range(1, 5):
        last_row_data += str(ws.cell(column = cols, row = last_row_position).value)
    last_row_data_all = "".join(last_row_data)
    return last_row_data_all
#This function will return the Customer
def read_last_row_cust(last_row_position):
    cols = 5
    customer_name = str(ws.cell(column = cols, row = last_row_position).value)
    return customer_name
#This function will return the Site
def read_last_row_site(last_row_position):
    cols = 6
    site_name = str(ws.cell(column = cols, row = last_row_position).value)
    return site_name
#This function will return the Job Description
def read_last_row_disc(last_row_position):
    cols = 7
    site_disc = str(ws.cell(column = cols, row = last_row_position).value)
    return site_disc
last_row = get_last_row()
job_no_details = read_last_row_jobno(last_row)
job_customer = read_last_row_cust(last_row)
job_site = read_last_row_site(last_row)
job_disc = read_last_row_disc(last_row)
cust_folder = job_customer
job_dir = job_no_details + "\\" + job_site + " - " + job_disc
#Insert Hyperlink to folder
def folder_hyperlink(last_row_position, destination):
    cols = 'B'
    rows = str(last_row_position)
    position = cols + rows
    final_position = "".join(position)
    print(final_position) # This is just to check the value
    # The statement below should insert hyperlink in eps.xlsm > worksheet jobnoeps at column B and last completed row.
    ws.cell(final_position).hyperlink = destination
folder_location = site_info_root + job_customer + "\\" + job_dir
print(folder_location) # This is just to check the value
folder_hyperlink(last_row, folder_location)
Now my hyperlink function looks like this after trying xlsxwriter as advised.
##Insert Hyperlink to folder
def folder_hyperlink(last_row_position, destination):
    import xlsxwriter
    cols = 'B'
    rows = str(last_row_position)
    position = cols + rows
    final_position = "".join(position)
    print(final_position) # This is just to check the value
    workbook = xlsxwriter.Workbook('eps.xlsx')
    worksheet = workbook.add_worksheet('jobnoeps')
    print(worksheet)
    worksheet.write_url(final_position, 'folder_location')
    workbook.close()
The function overwrites the existing eps.xlsx, creates a jobnoeps worksheet and then inserts the hyperlink. I have played with the following lines but don't know how to get it to open the existing xlsx and the existing jobnoeps tab and then enter the hyperlink.
workbook = xlsxwriter.Workbook('eps.xlsx')
worksheet = workbook.add_worksheet('jobnoeps')
worksheet.write_url(final_position, 'folder_location')
The XlsxWriter write_url() method allows you to link to folders or other workbooks and worksheets as well as internal links and links to web urls. For example:
import xlsxwriter
workbook = xlsxwriter.Workbook('links.xlsx')
worksheet = workbook.add_worksheet()
worksheet.set_column('A:A', 50)
# Link to a Folder.
worksheet.write_url('A1', r'external:C:\Temp')
# Link to a workbook.
worksheet.write_url('A3', r'external:C:\Temp\Book.xlsx')
# Link to a cell in a worksheet.
worksheet.write_url('A5', r'external:C:\Temp\Book.xlsx#Sheet1!C5')
workbook.close()
See the docs linked to above for more details.
Here is the code that did the trick:
# Creates hyperlink in existing workbook...
def set_hyperlink(row, column):
    from openpyxl import load_workbook
    link = "hyperlink address"
    wb = load_workbook("filename.xlsx")
    ws = wb["sheet_name"]  # replaces the deprecated wb.get_sheet_by_name()
    ws.cell(row=row, column=column).hyperlink = link
    wb.save("filename.xlsx")
# Call with the row and column of the target cell (left unspecified in the original post)
set_hyperlink(row, column)
Tried again with openpyxl as advised.
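As a rough illustration of the openpyxl route on an existing worksheet, here is a minimal sketch, assuming ws is a worksheet obtained from load_workbook(...). It sets the cell's hyperlink attribute and makes sure the cell shows some text. The helper names folder_path and set_folder_hyperlink, and the example folder names, are hypothetical.

```python
import ntpath  # Windows path rules, regardless of the OS running the script


def folder_path(root, *parts):
    """Join a root folder and sub-folder names into one Windows path."""
    return ntpath.join(root, *parts)


def set_folder_hyperlink(ws, row, target, column=2):
    """Point the cell at (row, column) to `target`; column 2 is column B.

    `ws` is assumed to be an openpyxl worksheet. The cell keeps its existing
    value if it has one; otherwise the target path is shown as the link text.
    """
    cell = ws.cell(row=row, column=column)
    cell.hyperlink = target
    if cell.value is None:
        cell.value = target
    return cell


# Example path built from made-up folder names
print(folder_path("C:\\Site Information", "Customer", "EPS-0001-15"))
```

After the call, the workbook still has to be saved with wb.save(...). For a macro-enabled file such as eps.xlsm, loading with keep_vba=True is what preserves the macros.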

Activate second worksheet with openpyxl

I am trying to work with worksheets in two Excel workbooks and write to sheets in both of them using Python and openpyxl. I am able to load the second workbook, f, but I am unable to write the string Recon to cell G2 of my second workbook.
from openpyxl import Workbook, load_workbook
filename = 'sda_2015.xlsx'
wb = Workbook()
ws = wb.active
ws['G1'] = 'Path'
ws.title = 'Main'
adf = "Dirty Securities 04222015.xlsx"
f = "F:\\ana\\xlmacro\\" + adf
wb2 = load_workbook(f)
"""
wb22 = Workbook(wb2)
ws = wb22.active
ws['G1'] = "Recon2"
ws.title = 'Main2'
"""
print(wb2.sheetnames)
wb.save(filename)
I commented out the code which is broken.
Update
I adjusted my code following the answer below. The value in cell H1 is written to wb2 in column H, but for some reason the column is hidden. I have tried writing to other columns, but the code still hides multiple columns. There are also occurrences when the code executes and titles ws2 as Main21 although the value set in the code is Main2.
from openpyxl import Workbook, load_workbook
filename = 'sda_2015.xlsx'
wb1 = Workbook()
ws1 = wb1.active
ws1['G1'] = 'Path'
ws1.title = 'Main'
adf = "Dirty Securities 04222015.xlsx"
f = "F:\\ana\\xlmacro\\" + adf
wb2 = load_workbook(f)
ws2 = wb2.active
ws2['H1'] = 'Recon2'
ws2.title = 'Main2'
print(wb2.sheetnames)
wb1.save(filename)
wb2.save(f)
If you have two workbooks open, wb1 and wb2, you'll also need different names for the various worksheets: ws1 = wb1.active and ws2 = wb2.active.
If you're working with a file with macros, you'll need to set the keep_vba flag to True when opening it in order to preserve the macros.
I experienced the same thing with hidden cells. Eventually, I unpacked the Excel file and looked at the raw XML, and found that not all of the columns had a dimension for width. Those without a width were being hidden by Excel.
A quick fix is to do something like this...
for col in 'ABCDEFG':
    if not worksheet.column_dimensions[col].width:
        worksheet.column_dimensions[col].width = 10
