Copying image from a worksheet using openpyxl - python

I have a Python program that creates a new Excel file from worksheets taken from a few other files. The code below copies the cell values of the worksheets correctly, but it does not copy the image that is present in the worksheet. How do I copy images in an Excel worksheet to another Excel workbook using Python?
path1 = "/mnt/e/RecEasy-MVP-Python/FlaskApp/Uploaded_files/" + key
print path1
path2 = "/mnt/e/RecEasy-MVP-Python/FlaskApp/Compiled/" + current_acc_group + "_" + current_gl_account + ".xlsx"
print path2
path_to_key_sheet = "/mnt/e/RecEasy-MVP-Python/FlaskApp/Uploaded_files/" + key + "_key_sheet.txt"
print "Path to key sheet file:"
print path_to_key_sheet
wb1 = xl.load_workbook(filename=path1, read_only=True, data_only=True)
ws1 = wb1.worksheets[2]
counter = 0
for sheet in wb1:
    if (str(sheet.title) == str(content_of_key_sheet_file)):
        ws1 = wb1.worksheets[counter]
        print "Sheet selected"
        print sheet.title
    counter = counter + 1
ws2 = wb2.create_sheet(ws1.title)
print "Copying from the Excel file: " + path1
for row in ws1:
    for cell in row:
        if (cell.value != None):
            ws2[cell.coordinate].value = cell.value
wb2.save(path2)

Install Pillow first (just pip install Pillow). openpyxl needs it to handle images, but you do not have to import it in your own file.
Then:
from openpyxl.drawing.image import Image
.
.
.
img = Image('yourImg.png')
yourSheet.add_image(img, 'A2')
where 'A2' is the cell the image is anchored to.

I'd been struggling with this for a bit as most of the libraries I typically use to manipulate xlsx files seemed to not want to support this.
Fortunately, .xlsx is an OOXML format, so a workbook is just a zip archive. All you need to do is unzip the .xlsx and locate the pictures in xl/media/ inside the directory you extracted your workbook to.
from zipfile import ZipFile
archive = ZipFile('yourWorkbook.xlsx')
archive.extractall()
Now you can insert them back into your new spreadsheet:
import openpyxl
from openpyxl.drawing.image import Image
wb = openpyxl.Workbook()
ws = wb.worksheets[0]
img = Image('test.jpg')
# Anchor the image's top-left corner at cell A1
ws.add_image(img, 'A1')
wb.save('out.xlsx')
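If you only want the pictures rather than the whole unpacked workbook, the unzip step can be narrowed to the xl/media/ folder. This is a minimal sketch using only the standard library; the function name extract_media and the target_dir argument are my own, not part of openpyxl.

```python
import os
import zipfile


def extract_media(xlsx_path, target_dir):
    """Extract every file under xl/media/ from an .xlsx archive.

    Returns the list of extracted file paths, sorted by archive name.
    """
    extracted = []
    with zipfile.ZipFile(xlsx_path) as archive:
        for name in sorted(archive.namelist()):
            # Skip everything except real files inside xl/media/
            if name.startswith("xl/media/") and not name.endswith("/"):
                # extract() recreates the xl/media/ folders under target_dir
                extracted.append(archive.extract(name, target_dir))
    return extracted
```

Each returned path can then be handed to Image(...) and ws.add_image(...) in the same way as test.jpg in the snippet above. Note that this recovers the image files only, not the cells they were anchored to in the source sheet.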

Related

How could I make my python code run faster?

I have Python code which saves the content of tables (one column, the second from the right) into an XLSX file and does some editing on the content. I tested it on a folder with 38 files containing 22,000 lines in total, and it took 2.5 hours to finish.
from docx import Document
from openpyxl import Workbook
import datetime
import os
folder = os.path.dirname(os.path.abspath(__file__))
wb = Workbook()
ws = wb.active
for filename in os.listdir(folder):
    if filename.endswith(".docx"):
        doc = Document(os.path.join(folder, filename))
        for table in doc.tables:
            ws.append(["File name: " + filename])
            for row in table.rows:
                ws.append([row.cells[-2].text])
wb.save("output.xlsx")
for row in ws.iter_rows():
    for cell in row:
        if cell.value and (cell.value[0] in ["+", "-", "="]):
            cell.value = "'" + cell.value
wb.save("output.xlsx")
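The editing step in the second loop (prefixing values that start with +, - or = with an apostrophe) can be written as a small helper and applied when each value is appended, so the sheet does not have to be walked a second time. This is only a sketch: the helper name is hypothetical, it assumes the cell values are plain strings, and the post does not establish whether this pass is what makes the script slow.

```python
def escape_leading_operator(text):
    """Prefix text with an apostrophe if it starts with +, - or =."""
    if text and text[0] in "+-=":
        return "'" + text
    return text
```

In the loop above it would be used as ws.append([escape_leading_operator(row.cells[-2].text)]), after which the second iter_rows() pass and the first wb.save() call could be dropped.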

How to save transformed file into new excel file using Openpyxl Python?

I have 3 Excel files in my working directory. All 3 files have names that end with "_Updated.xlsx". I want to transform the files so that all empty rows in each file get deleted. I have created functions for it, but the issue is that I cannot save all the transformed files using the code below. I am not sure what is wrong. The reason for creating new files is that I would like to keep my raw files.
Python code
import openpyxl
import os
from openpyxl import load_workbook,Workbook
import glob
from pathlib import Path
Excel_file_path="/Excel"
for file in Path(Excel_file_path).glob('*_Updated.xlsx'):
    wb=load_workbook(file)
    wb_modified = False
    for sheet in wb.worksheets:
        max_row_in_sheet = sheet.max_row
        max_col_in_sheet = sheet.max_column
        sheet_modified = False
        if max_row_in_sheet > 1:
            first_nonempty_row = nonempty_row() # Function to find nonempty row
            sheet_modified = del_rows_before(first_nonempty_row) #Function to delete nonempty row
        wb_modified = wb_modified or sheet_modified
    if wb_modified:
        for workbook in workbooks:
            for sheet in wb.worksheets:
                new_wb = Workbook()
                ws = new_wb.active
                for row_data in sheet.iter_rows():
                    for row_cell in row_data:
                        ws[row_cell.coordinate].value = row_cell.value
                new_wb.save("/Excel/"+sheet.title+"_Transformed.xlsx")
In case anyone is still looking for an answer to my question above, below is the code that worked for me.
import openpyxl
import os
from openpyxl import load_workbook
import glob
from pathlib import Path
Excel_file_path="/Excel"
for file in Path(Excel_file_path).glob('*_Updated.xlsx'):
    wb=load_workbook(file)
    wb_modified = False
    for sheet in wb.worksheets:
        max_row_in_sheet = sheet.max_row
        max_col_in_sheet = sheet.max_column
        sheet_modified = False
        if max_row_in_sheet > 1:
            first_nonempty_row = get_first_nonempty_row() # Function to find the first nonempty row
            sheet_modified = del_rows_before(first_nonempty_row) # Function to delete the rows before it
    # Save the whole workbook once under a new name, keeping the raw file untouched
    file_name = os.path.basename(file)
    wb.save("Excel/"+file_name[:-5]+"_Transformed.xlsx")
    wb.close()
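The helpers get_first_nonempty_row and del_rows_before are not shown in the post. As a rough illustration of the logic they presumably implement, here is a sketch that works on plain row tuples, such as those produced by sheet.iter_rows(values_only=True). Both function names below are hypothetical, and treating None and "" as empty is an assumption.

```python
def first_nonempty_row_index(rows):
    """Return the 1-based index of the first row with a non-empty cell, or None."""
    for index, row in enumerate(rows, start=1):
        if any(value not in (None, "") for value in row):
            return index
    return None


def drop_leading_empty_rows(rows):
    """Return the rows with all leading empty rows removed."""
    rows = list(rows)
    first = first_nonempty_row_index(rows)
    if first is None:
        # Every row is empty, so nothing is kept
        return []
    return rows[first - 1:]
```

On a real worksheet the same index could be passed to openpyxl's sheet.delete_rows(1, first - 1) to remove the leading empty rows in place, provided first is greater than 1.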

Python Excel Program Update Every Run

I have this simple code and it creates a file "example.xlsx"
I only need the A1 Cell to have an output for the first run.
This is my initial code
from openpyxl import Workbook
import requests
workbook = Workbook()
sheet = workbook.active
success= "DONE"
sheet["A1"] = requests.get('http://ip.42.pl/raw').text
workbook.save(filename="example.xlsx")
print(success)
The first output is an Excel file, example.xlsx. I need to update the same Excel file every time the program runs. For example:
On the first run, only A1 holds the output from the website http://ip.42.pl/raw; on each following run the output should go into A2, A3 and so on.
Thank you. I am a beginner, so please bear with me.
I modified the code, and now I think it does what you ask for:
from openpyxl import Workbook, load_workbook
import os
import requests
workbook = Workbook()
filename = "example.xlsx"
success = "DONE"
# First verifies if the file exists
if os.path.exists(filename):
    workbook = load_workbook(filename, read_only=False)
    sheet = workbook.active
    counter = 1
    keep_going = True
    while keep_going:
        cell_id = 'A' + str(counter)
        if sheet[cell_id].value is None:
            sheet[cell_id] = requests.get('http://ip.42.pl/raw').text
            keep_going = False
        else:
            counter += 1
    workbook.save(filename)
    print(success)
else:
    # If file does not exist, you have to create an empty file from excel first
    print('Please create an empty file ' + filename + ' from excel, as it throws error when created from openpyxl')
See the question "xlsx and xlsm files return badzipfile: file is not a zip file" for clarification about why you have to create an empty file from Excel before openpyxl can work with it (the line in the else: branch).
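Since an .xlsx file is a zip archive, a file that is empty or was created by some other means fails to load with a BadZipFile error. One way to check for that before calling load_workbook is the standard library's zipfile.is_zipfile. This is a sketch only; the function name looks_like_xlsx is hypothetical, and the check confirms a valid zip archive, not a valid workbook.

```python
import zipfile


def looks_like_xlsx(path):
    """Return True if path exists and is a readable zip archive."""
    # is_zipfile returns False for missing, empty or non-zip files
    return zipfile.is_zipfile(path)
```

In the code above it could replace the os.path.exists(filename) test, so that an empty placeholder file is reported instead of raising an error.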
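You could use sheet.max_row in openpyxl to find the last used row, provided you load the existing file instead of creating a new workbook on every run. Like so:
import os
from openpyxl import Workbook, load_workbook
import requests
filename = "example.xlsx"
# Reuse the existing file if there is one; otherwise start a new workbook
workbook = load_workbook(filename) if os.path.exists(filename) else Workbook()
sheet = workbook.active
# max_row is 1 even for an empty sheet, so check A1 before skipping it
next_row = 1 if sheet["A1"].value is None else sheet.max_row + 1
success = "DONE"
sheet.cell(row=next_row, column=1).value = requests.get('http://ip.42.pl/raw').text
workbook.save(filename=filename)
print(success)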

Activate second worksheet with openpyxl

I am trying to open multiple Excel workbooks and write to worksheets in each of them using Python and openpyxl. I am able to load the second workbook, f, but I am unable to write the string Recon to cell G2 of that workbook.
from openpyxl import Workbook, load_workbook
filename = 'sda_2015.xlsx'
wb = Workbook()
ws = wb.active
ws['G1'] = 'Path'
ws.title = 'Main'
adf = "Dirty Securities 04222015.xlsx"
f = "F:\\ana\\xlmacro\\" + adf
wb2 = load_workbook(f)
"""
wb22 = Workbook(wb2)
ws = wb22.active
ws['G1'] = "Recon2"
ws.title = 'Main2'
"""
print wb2.get_sheet_names()
wb.save(filename)
I commented out the code that is broken.
Update
I adjusted my code following the answer below. The value for cell H1 is written to column H of wb2, but for some reason the column is hidden. I have tried writing to other columns instead, but the code still hides multiple columns. There are also occasions when the code runs and titles ws2 as Main21 even though the value in the code is Main2.
from openpyxl import Workbook, load_workbook
filename = 'sda_2015.xlsx'
wb1 = Workbook()
ws1 = wb1.active
ws1['G1'] = 'Path'
ws1.title = 'Main'
adf = "Dirty Securities 04222015.xlsx"
f = "F:\\ana\\xlmacro\\" + adf
wb2 = load_workbook(f)
ws2 = wb2.active
ws2['H1'] = 'Recon2'
ws2.title = 'Main2'
print wb2.get_sheet_names()
wb1.save(filename)
wb2.save(f)
If you have two workbooks open, wb1 and wb2, you'll also need different names for the various worksheets: ws1 = wb1.active and ws2 = wb2.active.
If you're working with a file with macros, you'll need to set the keep_vba flag to True when opening it in order to preserve the macros.
I had experienced the same thing with hidden cells. Eventually, I unpacked the Excel file and looked at the raw XML to find out that not all of the columns had a dimension for width. Those without a width were being hidden by Excel.
A quick fix is to do something like this...
for col in 'ABCDEFG':
    if not worksheet.column_dimensions[col].width:
        worksheet.column_dimensions[col].width = 10

How do I save a workbook using xlwings?

I have an Excel worksheet with some buttons and some macros, and I use xlwings to make it work. Is there a way to save the workbook through xlwings? I want to extract a specific sheet after doing an operation, but the extracted sheet reflects the workbook as it was before the operation, without the generated data.
My code for extracting the sheet I need is the following:
Set objFSO = CreateObject("Scripting.FileSystemObject")
src_file = objFSO.GetAbsolutePathName(Wscript.Arguments.Item(0))
sheet_name = Wscript.Arguments.Item(1)
dir_name = Wscript.Arguments.Item(2)
file_name = Wscript.Arguments.Item(3)
Dim objExcel
Set objExcel = CreateObject("Excel.Application")
objExcel.Visible = False
Dim objWorkbook
Set objWorkbook = objExcel.Workbooks(src_file)
objWorkbook.Sheets(sheet_name).Copy
objExcel.DisplayAlerts = False
objExcel.ActiveWorkbook.SaveAs dir_name + file_name + ".xlsx", 51
objExcel.ActiveWorkbook.SaveAs dir_name + file_name + ".csv", 6
objWorkbook.Close False
objExcel.Quit
Book.save() has now been implemented: see the docs.
