Python Openpyxl Writing in a cell - python

I cannot seem to write any value to an Excel sheet. I open two files at the same time and want to copy a value from file 1 to file 2. It gives this error:
File "C:\Python34\lib\site-packages\openpyxl\writer\dump_worksheet.py", line 214, in removed_method
    raise NotImplementedError
Only the line that does the writing gives an error. The function code is as follows:
def data_input(size):
    from openpyxl import load_workbook
    wb1 = load_workbook('150318 load matching_Storage_v4.xlsm',data_only=True)
    wb1s1 = wb1.get_sheet_by_name('Home load options')
    from openpyxl import Workbook
    wb2 = Workbook('Data',write_only=True)
    wb2s1 = wb2.create_sheet(0)
    wb2s1.title = "Consumption"
    wb2s1.cell(row = 1, column = 1).value = 4  # this line gives the error
    # what I have to write, but blocked out for now to test whether I can write at all
    '''i = 0
    r = 0
    while i < 8760:
        d = wb2s1.cell(row = r, column = 1)
        d.value = i
        i = i + 0.25
        r += 1'''
    for i in range(4,35040):
        cell_value1 = wb1s1.cell(row = i, column = (12+size)).value
        print(cell_value1)
        # cell_value1 = wb2s1.cell(row = i-3, column = 1)
    wb2.save('Data.xlsx')
I tried all the different ways in the documentation, but nothing has worked so far.
Please help.
Thank you

You are creating a write-only workbook. As the name suggests, this is designed for streaming data to a workbook, so some operations, such as looking up cells, do not work. To add data you should use the append() method. If you do need to add formatting or comments to individual cells, you can include a WriteOnlyCell in the iterable that you pass into append().
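A minimal sketch of the append() approach, assuming openpyxl is installed. The helper names quarter_hour_rows and write_rows are made up for illustration; the sheet title and the 0.25-hour step are taken from the question.

```python
def quarter_hour_rows(hours):
    """Yield one single-cell row per 15-minute step: [0.0], [0.25], [0.5], ..."""
    for step in range(hours * 4):
        yield [step * 0.25]


def write_rows(path, rows, title="Consumption"):
    """Stream rows into a new write-only workbook and save it to path."""
    # Imported here so the row generator above stays usable without openpyxl.
    from openpyxl import Workbook

    wb = Workbook(write_only=True)     # streaming mode: cell() lookups are unavailable
    ws = wb.create_sheet(title=title)
    for row in rows:
        ws.append(row)                 # each append() writes the next row of the sheet
    wb.save(path)
```

Calling write_rows("Data.xlsx", quarter_hour_rows(8760)) should produce one column with 35040 time steps. Rows can only be added in order, so values read from the first workbook would have to be collected into each row before it is appended.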

Related

How to implement the iterative way to change the filename reading and how to combine result into single excel file

I am new to Python. I have around 350 Excel files (1.xlsx to 350.xlsx) in a single folder (Videos), and I need to compute the following values for each of them. The code below works, but it is time-consuming because I have to change the file name manually for every iteration. At the end I also have to combine the results from all 350 files into a single Excel file, but my code overwrites the output on every iteration. Please help me resolve this problem.
data12 = pd.read_excel (r'C:\Users\Videos\1.xlsx')
gxt = data12.iloc [:,0]
gyan = data12.iloc [:,1]
int= gyan.iloc[98:197]
comp= gyan.iloc[197:252]
seg= gyan.iloc[252:319]
A= max(int)
B= max(comp)
C= min(comp)
D= max(seg)
s = pd.Series([A, B, C, D])
frame_data= [gyan, comp, seg, s]
result = pd.concat(frame_data)
result.to_excel("output.xlsx", sheet_name='modify_data', index=False)
Thank you for helping.
Please check the code below:
import pandas as pd
import numpy as np
import openpyxl
from openpyxl import load_workbook, Workbook
import os
# Give an excel filename and worksheet name
# (raw strings, so the backslashes in the Windows paths are not read as escapes)
output = r'C:\Users\Videos\output.xlsx'
worksheet = 'Sheet'
wb = Workbook()
# If file not present at location, then create one
if os.path.isfile(output):
    print('File Present')
else:
    print('Created new file')
    ws = wb.create_sheet(worksheet)
    wb.save(output)
# Loop for all 350 files
for i in range(1, 351):
    print('File {}:'.format(i))
    data12 = pd.read_excel(r'C:\Users\Videos\{}.xlsx'.format(i))
    gxt = data12.iloc[:, 0]
    gyan = data12.iloc[:, 1]
    int = gyan.iloc[8:19]
    comp = gyan.iloc[19:25]
    seg = gyan.iloc[25:31]
    A = max(int)
    B = max(comp)
    C = min(comp)
    D = max(seg)
    s = pd.Series([A, B, C, D])
    frame_data = [gyan, comp, seg]
    result = pd.DataFrame(pd.concat(frame_data))
    ws = wb.active
    result_list = result.to_numpy()
    print('Total rows = ', len(result_list))
    # Append this file's rows below the rows already in the sheet
    for row in result_list.tolist():
        ws.append(row)
# Save once, after all files have been processed
wb.save(output)
This will help to run through all 350 files and save it to output file.
Also make changes to frame_data accordingly. I hope this works for you.
The following code gives you all the files in a folder:
from os import listdir
filenames = listdir(r'C:\Users\Videos')
count = 1
for file in filenames:
    print(file)
    .......
    # At the end
    output = "output-" + str(count) + ".xlsx"
    count = count + 1
    result = pd.concat(frame_data)
    result.to_excel(output, sheet_name='modify_data', index=False)
For the master file, you can save the data in a pandas dataframe and keep appending each file in the loop.
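One way to read the "master file" suggestion is to collect one summary record per input file and build a single DataFrame at the end. This is only a sketch: summarize and build_master are hypothetical names, the slice boundaries are copied from the question, and it assumes pandas (with an Excel engine such as openpyxl) is installed.

```python
from pathlib import Path


def summarize(values):
    """Return the four figures the question computes from the second column."""
    inter, comp, seg = values[98:197], values[197:252], values[252:319]
    return {"A": max(inter), "B": max(comp), "C": min(comp), "D": max(seg)}


def build_master(folder, out_path):
    """Summarize every .xlsx file in folder and write one combined workbook."""
    # Imported here so summarize() can be used and tested without pandas.
    import pandas as pd

    records = []
    for path in sorted(Path(folder).glob("*.xlsx")):
        column = pd.read_excel(path).iloc[:, 1].tolist()
        records.append({"file": path.name, **summarize(column)})
    # One row per input file; written once, so nothing is overwritten per iteration.
    pd.DataFrame(records).to_excel(out_path, sheet_name="modify_data", index=False)
```

The output path should point outside the input folder, otherwise a second run would pick up the combined workbook as an input. Each file is assumed to have at least 319 data rows, since max() and min() raise ValueError on an empty slice.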

How to filter column data using openpyxl

I am trying to apply a filter to an existing Excel file, and export it to another Excel file. I would like to extract rows that only contain the value 16, then export the table to another excel file (as shown in the picture below).
I have tried reading the openpyxl documentation multiple times and googling for solutions but I still can't make my code work. I have also attached the code and files below
import openpyxl
# Load both workbooks
wb1 = openpyxl.load_workbook('test_data.xlsx')
wb2 = openpyxl.load_workbook('test_data_2.xlsx')
# Get the worksheets from the workbooks
sh1 = wb1["data_set_1"]
sh2 = wb2["Sheet1"]
sh1.auto_filter.ref = "A:A"
sh1.auto_filter.add_filter_column(0, ["16"])
sh1.auto_filter.add_sort_condition("B2:D6")
sh1_row_number = sh1.max_row
sh1_col_number = sh1.max_column
rangeSelected = []
for i in range(1, sh1_row_number+1, 1):
    rowSelected = []
    for j in range(1, sh1_col_number+1, 1):
        rowSelected.append(sh1.cell(row = i, column = j))
    rangeSelected.append(rowSelected)
    del rowSelected
for i in range(1, sh1_row_number+1, 1):
    for j in range(1, sh1_col_number+1, 1):
        sh2.cell(row = i, column = j).value = rangeSelected[i-1][j-1].value
wb1.save("test_data.xlsx")
wb2.save("test_data_2.xlsx")
The picture shows the desired result.
The auto filter doesn't actually filter the data; it only sets up the filter that Excel displays.
You probably want to filter while looping through the worksheet. Please note that this code assumes the table headers are already in the second workbook. It does not overwrite the data; it appends to the table.
import openpyxl
# Load both workbooks
wb1 = openpyxl.load_workbook('test_data.xlsx')
wb2 = openpyxl.load_workbook('test_data_2.xlsx')
# Get the worksheets from the workbooks
sh1 = wb1["data_set_1"]
sh2 = wb2["data_set_1"] # use same sheet name, different workbook
for row in sh1.iter_rows():
    if row[0].value == 16: # filter on first column with value 16
        sh2.append((cell.value for cell in row))
wb1.save("test_data.xlsx")
wb2.save("test_data_2.xlsx")
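The same idea can be sketched with the filtering kept separate from the file handling, which makes the comparison easy to check on plain tuples. The names rows_matching and copy_filtered are hypothetical, and the sketch assumes openpyxl 2.6 or later (for values_only) and a source sheet whose first row is a header.

```python
def rows_matching(rows, column_index, wanted):
    """Keep only the rows whose value at column_index equals wanted."""
    return [row for row in rows if row[column_index] == wanted]


def copy_filtered(src_path, src_sheet, dst_path, wanted=16):
    """Copy the header plus the matching rows into a new workbook."""
    # Imported here so rows_matching() can be used without openpyxl.
    from openpyxl import Workbook, load_workbook

    source = load_workbook(src_path, read_only=True)[src_sheet]
    rows = list(source.iter_rows(values_only=True))  # tuples of cell values
    header, body = rows[0], rows[1:]

    out = Workbook()
    ws = out.active
    ws.append(header)
    for row in rows_matching(body, 0, wanted):
        ws.append(row)
    out.save(dst_path)
```

The comparison is type-sensitive: a numeric cell holds 16, not "16", so filtering on the string from the question's add_filter_column call would match nothing.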

Push Multi line string to excel using python

I am trying to add a multiline string to an Excel sheet at a particular row and column. I want to do what can be done for a cell addressed by name:
worksheet.cell('A1').style.alignment.wrap_text = True
but I am unable to figure out how to do the same when passing a row number and a column number.
The code below works except for the 4th line. Can you suggest the correct way to achieve this?
import openpyxl
xfile = openpyxl.load_workbook('template.xlsx')
sheet = xfile.get_sheet_by_name(sheet_name)
sheet.cell(row=row_no, column=8).style.alignment.wrap_text = True
sheet.cell(row=row_no, column=8).value = config
xfile.save('template.xlsx')
The code below works:
import openpyxl
from openpyxl.styles import Alignment
xfile = openpyxl.load_workbook('template.xlsx')
sheet = xfile.get_sheet_by_name(sheet_name)
# Replace the whole alignment object instead of mutating style.alignment
sheet.cell(row=row_no, column=8).alignment = Alignment(wrap_text=True)
sheet.cell(row=row_no, column=8).value = config
xfile.save('template.xlsx')

openpyxl error: 'str' object has no attribute 'BLACK'

I am trying to set styles on an Excel spreadsheet using Python's openpyxl module. I keep getting this error:
'str' object has no attribute 'BLACK'
Basically, my code reads a list of known values from a .xlsx file and places them into a Python list. I use that list to check the values in a column of an Access table, to make sure the value in each cell is correct compared with the known values.
My script fails when I try to set styles using openpyxl. The weird thing is that I'm not even using BLACK in the styles anywhere, and the error seems to occur when I try to set the fill. The SearchCursor portion of the script iterates through each row, and it is on the second pass that the script fails. I have a feeling it wants to overwrite something, but I can't figure out what.
import openpyxl, arcpy
from arcpy import env
from openpyxl import Workbook
env.workspace = r"Z:\Access_Tables\Access_Table.gdb"
TableList = []
for row in arcpy.SearchCursor(r"Z:\Domains\Domains.xlsx\DOMAINS$"):
    TableList.append(row.Code)
# Create workbook for report. Openpyxl
workbook = openpyxl.Workbook()
ws = workbook.get_active_sheet()
ws.title = "Test"
workbook.remove_sheet(ws)
# List out the access tables in the workspace
for fc in arcpy.ListFeatureClasses():
    # Processing SOIL Point access table
    if fc == "SOIL":
        # List the column headings from the access table to be applied to the .xlsx table
        fieldnames = [f.name for f in arcpy.ListFields(fc)]
        # Create Sheet. Openpyxl
        new_sheet = workbook.create_sheet(None,fc)
        dictFieldnames = {}
        for num,fname in enumerate(fieldnames):
            dictFieldnames[num] = fname
            # Write to cell, openpyxl
            new_sheet.cell(None,0,num).value = fname
            col_let = openpyxl.cell.get_column_letter(num + 1)
            new_sheet.column_dimensions[col_let].width = len(fname) + 3
        # Process SOIL Field
        if "SOIL" in fieldnames:
            # Set a counter and Loop through each row of the access table
            x = 1
            for row in arcpy.SearchCursor(fc):
                for key, value in dictFieldnames.iteritems():
                    if value == "SOIL":
                        fieldKey = key
                        if not row.SOIL or len(row.SOIL.strip()) == 0:
                            # Openpyxl write. Set fill and color for cell. Write the unique id to the cell.
                            new_sheet.cell(None,x,fieldKey).style.fill.fill_type = openpyxl.style.Fill.FILL_SOLID
                            new_sheet.cell(None,x,fieldKey).style.fill.start_color.index = openpyxl.style.Color = 'FF808000'
                            new_sheet.cell(None,x,fieldKey).value = row.OBJECTID
                            x += 1
                            print 'first'
                        elif len(row.INCLUSION_TYPE) not in range(2,5):
                            # Openpyxl write. Set fill and color for cell. Write the unique id to the cell.
                            new_sheet.cell(None,x,fieldKey).style.fill.fill_type = openpyxl.style.Fill.FILL_SOLID
                            new_sheet.cell(None,x,fieldKey).style.fill.start_color.index = openpyxl.style.Color = 'FF2F4F4F'
                            new_sheet.cell(None,x,fieldKey).value = row.OBJECTID
                            x += 1
                            print 'second'
                        elif row.SOIL.upper() not in [y.upper() for y in TableList]:
                            # Openpyxl write. Set fill and color for cell. Write the unique id to the cell.
                            new_sheet.cell(None,x,fieldKey).style.fill.fill_type = openpyxl.style.Fill.FILL_SOLID
                            new_sheet.cell(None,x,fieldKey).style.fill.start_color.index = openpyxl.style.Color = 'FF00FFFF'
                            new_sheet.cell(None,x,fieldKey).value = row.OBJECTID
                            x += 1
                            print 'third'
                        print x
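The problem is in the lines where you define the colors. A chained assignment such as a = b = 'FF808000' assigns the string to both targets, so your line also replaces openpyxl.style.Color itself with a string. The first pass still works, but from then on anything that looks up Color.BLACK finds a str instead of the class, which matches the error you see on the second pass. Just assign the color to style.fill.start_color.index. For example: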
new_sheet.cell(None,x,fieldKey).style.fill.start_color.index = 'FF808000'
instead of:
new_sheet.cell(None,x,fieldKey).style.fill.start_color.index = openpyxl.style.Color = 'FF808000'
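The effect of the chained assignment can be reproduced without openpyxl. In this sketch the Color class and the style namespace are stand-ins invented for illustration; they only mimic a module attribute that holds a class.

```python
from types import SimpleNamespace


class Color:
    """Stand-in for a library class that exposes colour constants."""
    BLACK = "FF000000"


# Stand-in for a module such as openpyxl.style, holding a reference to the class.
style = SimpleNamespace(Color=Color)
cell = {}

# Chained assignment binds the same string to BOTH targets.
cell["index"] = style.Color = "FF808000"

# The namespace attribute is now a plain string, so the constant lookup fails.
try:
    style.Color.BLACK
except AttributeError as exc:
    message = str(exc)

print(message)
```

In current openpyxl releases, styles are assigned as whole objects rather than mutated in place, for example cell.fill = PatternFill(fill_type="solid", start_color="FF808000") with PatternFill imported from openpyxl.styles, so the old style.fill attributes from the question no longer apply there.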

How can I open an Excel file in Python?

How do I open a file that is an Excel file for reading in Python?
I've opened text files, for example, sometextfile.txt with the reading command. How do I do that for an Excel file?
Edit:
In the newer version of pandas, you can pass the sheet name as a parameter.
file_name = # path to file + file name
sheet = # sheet name or sheet number or list of sheet numbers and names
import pandas as pd
df = pd.read_excel(io=file_name, sheet_name=sheet)
print(df.head(5)) # print first 5 rows of the dataframe
Check the docs for examples on how to pass sheet_name: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_excel.html
Old version:
You can use the pandas package as well.
When you are working with an Excel file with multiple sheets, you can use:
import pandas as pd
xl = pd.ExcelFile(path + filename)
xl.sheet_names
# [u'Sheet1', u'Sheet2', u'Sheet3']
df = xl.parse("Sheet1")
df.head()
df.head() returns the first 5 rows of your Excel file.
If you're working with an Excel file with a single sheet, you can simply use:
import pandas as pd
df = pd.read_excel(path + filename)
print(df.head())
Try the xlrd library.
[Edit] - from what I can see from your comment, something like the snippet below might do the trick. I'm assuming here that you're just searching one column for the word 'john', but you could add more or make this into a more generic function.
from xlrd import open_workbook
book = open_workbook('simple.xls',on_demand=True)
for name in book.sheet_names():
    if name.endswith('2'):
        sheet = book.sheet_by_name(name)
        # Attempt to find a matching row (search the first column for 'john')
        rowIndex = -1
        for index, cell in enumerate(sheet.col(0)):
            # str() so that numeric cells do not break the substring test
            if 'john' in str(cell.value):
                rowIndex = index
                break
        # If we found the row, print it
        if rowIndex != -1:
            cells = sheet.row(rowIndex)
            for cell in cells:
                print(cell.value)
        book.unload_sheet(name)
This isn't as straightforward as opening a plain text file and will require some sort of external module since nothing is built-in to do this. Here are some options:
http://www.python-excel.org/
If possible, you may want to consider exporting the excel spreadsheet as a CSV file and then using the built-in python csv module to read it:
http://docs.python.org/library/csv.html
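A short sketch of the CSV route using only the standard library. The sample data and the helper name read_csv_rows are invented for illustration; with a real export you would pass a file opened with open(path, newline="") instead of the in-memory sample.

```python
import csv
import io


def read_csv_rows(handle):
    """Return the header names and the data rows (as dicts) from an open CSV file."""
    reader = csv.DictReader(handle)
    rows = list(reader)
    return reader.fieldnames, rows


# In-memory stand-in for a spreadsheet exported from Excel as CSV.
sample = io.StringIO("name,age\njohn,31\nmary,28\n")
header, rows = read_csv_rows(sample)

print(header)
print(rows[0]["name"])
```

Every value comes back as a string, so numeric columns need an explicit int() or float() conversion after reading.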
There's the openpyxl package:
>>> from openpyxl import load_workbook
>>> wb2 = load_workbook('test.xlsx')
>>> print(wb2.get_sheet_names())
['Sheet2', 'New Title', 'Sheet1']
>>> worksheet1 = wb2['Sheet1'] # one way to load a worksheet
>>> worksheet2 = wb2.get_sheet_by_name('Sheet2') # another way to load a worksheet
>>> print(worksheet1['D18'].value)
3
>>> for row in worksheet1.iter_rows():
...     print(row[0].value)
You can use xlpython package that requires xlrd only.
Find it here https://pypi.python.org/pypi/xlpython
and its documentation here https://github.com/morfat/xlpython
This may help:
This creates a node that takes a 2D list (a list of lists) and pushes it into the Excel spreadsheet. Make sure the IN[]s are present, or it will throw an exception.
This is a rewrite of the Revit Excel Dynamo node for Excel 2013, as the default prepackaged node kept breaking. I also have a similar read node. The Excel syntax in Python is touchy.
Thanks #CodingNinja, updated :)
###Export Excel - intended to replace malfunctioning excel node
import clr
clr.AddReferenceByName('Microsoft.Office.Interop.Excel, Version=15.0.0.0, Culture=neutral, PublicKeyToken=71e9bce111e9429c')
##AddReferenceGUID("{00020813-0000-0000-C000-000000000046}") ''Excel C:\Program Files\Microsoft Office\Office15\EXCEL.EXE
##Need to Verify interop for version 2015 is 15 and node attachemnt for it.
from Microsoft.Office.Interop import * ##Excel
################################Initialize FP and Sheet ID
##Same functionality as the excel node
strFileName = IN[0] ##Filename
sheetName = IN[1] ##Sheet
RowOffset= IN[2] ##RowOffset
ColOffset= IN[3] ##COL OFfset
Data=IN[4] ##Data
Overwrite=IN[5] ##Check for auto-overwtite
XLVisible = False #IN[6] ##XL Visible for operation or not?
RowOffset=0
if IN[2]>0:
    RowOffset=IN[2] ##RowOffset
ColOffset=0
if IN[3]>0:
    ColOffset=IN[3] ##COL OFfset
if IN[6]<>False:
    XLVisible = True #IN[6] ##XL Visible for operation or not?
################################Initialize FP and Sheet ID
xlCellTypeLastCell = 11 #####define special sells value constant
################################
xls = Excel.ApplicationClass() ####Connect with application
xls.Visible = XLVisible ##VISIBLE YES/NO
xls.DisplayAlerts = False ### ALerts
import os.path
if os.path.isfile(strFileName):
    wb = xls.Workbooks.Open(strFileName, False) ####Open the file
else:
    wb = xls.Workbooks.add() ####Create a new workbook. use () for object method
    wb.SaveAs(strFileName)
wb.application.visible = XLVisible ####Show Excel
try:
    ws = wb.Worksheets(sheetName) ####Get the sheet in the WB base
except:
    ws = wb.sheets.add() ####If it doesn't exist- add it. use () for object method
    ws.Name = sheetName
#################################
#lastRow for iterating rows
lastRow=ws.UsedRange.SpecialCells(xlCellTypeLastCell).Row
#lastCol for iterating columns
lastCol=ws.UsedRange.SpecialCells(xlCellTypeLastCell).Column
#######################################################################
out=[] ###MESSAGE GATHERING
c=0
r=0
val=""
if Overwrite == False : ####Look ahead for non-empty cells to throw error
    for r, row in enumerate(Data): ####BASE 0## EACH ROW OF DATA ENUMERATED in the 2D array #range( RowOffset, lastRow + RowOffset):
        for c, col in enumerate (row): ####BASE 0## Each colmn in each row is a cell with data ### in range(ColOffset, lastCol + ColOffset):
            if col.Value2 >"" :
                OUT= "ERROR- Cannot overwrite"
                raise ValueError("ERROR- Cannot overwrite")
                ##out.append(Data[0]) ##append mesage for error
############################################################################
for r, row in enumerate(Data): ####BASE 0## EACH ROW OF DATA ENUMERATED in the 2D array #range( RowOffset, lastRow + RowOffset):
    for c, col in enumerate (row): ####BASE 0## Each colmn in each row is a cell with data ### in range(ColOffset, lastCol + ColOffset):
        ws.Cells[r+1+RowOffset,c+1+ColOffset].Value2 = col.__str__()
##run macro disbled for debugging excel macro
##xls.Application.Run("Align_data_and_Highlight_Issues")
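import pandas as pd
import os
directory = 'path/to/files/directory/'
files = os.listdir(directory)
i = 0  # index of the file you want (placeholder)
desiredFile = files[i]
Ofile = os.path.join(directory, desiredFile)
# Assuming the files are Excel workbooks; use pd.read_csv instead for CSV files
xls_import = pd.read_excel(Ofile)
Now you can use the power of pandas DataFrames!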
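This code worked for me with Python 3.5.2. It opens a CSV file, which Excel can read, for writing. I am currently working on how to save data into the file, but this is the code:
import csv
# Text mode with newline="": in Python 3, csv.writer cannot write to a file opened with "wb"
excel = csv.writer(open("file1.csv", "w", newline=""))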
