I am trying to write a multiline string to a particular row and column of an Excel sheet. I want to do the equivalent of what we do for a cell addressed by name:
worksheet.cell('A1').style.alignment.wrap_text = True
But I cannot figure out how to do the same when passing a row number and a column number.
The code below works except for the 4th line. Can you suggest the correct way to achieve this?
import openpyxl
xfile = openpyxl.load_workbook('template.xlsx')
sheet = xfile.get_sheet_by_name(sheet_name)
sheet.cell(row=row_no, column=8).style.alignment.wrap_text = True
sheet.cell(row=row_no, column=8).value = config
xfile.save('template.xlsx')
The code below works. It assigns a new Alignment object to the cell's alignment attribute:
import openpyxl
from openpyxl.styles import Alignment
xfile = openpyxl.load_workbook('template.xlsx')
sheet = xfile.get_sheet_by_name(sheet_name)
sheet.cell(row=row_no, column=8).alignment = Alignment(wrapText=True)
sheet.cell(row=row_no, column=8).value = config
xfile.save('template.xlsx')
I have been building a project in Python and I have run into a small problem working with Excel. I have an Excel document with 50+ sheets (Sheet1, Sheet2, ...) and I want to find which of the sheets contain a given word. For example, I am looking for the sheets that have the word "work" in one of their cells, and as a result I want the names of those sheets (the result can be multiple sheets, like Sheet4, Sheet43, Sheet50). Thank you for reading and for the help.
I tried to find an answer by myself and failed. Then I searched the internet, but most posts discuss a different problem: finding the sheets that have a specific word in their name. That is not what I am looking for. I want to find the sheets that have a specific word in one of their cells, not in the sheet name. For context, so far I have been using pandas.
import pandas as pd
excel_data = pd.read_excel("data.xlsx")
# Convert to comma-separated values
excel_data.to_csv("data.txt")
# Open the exported file in read mode
with open("data.txt", "r") as file:
    # Read the comma-separated values
    file_str = file.read()
# Split on "," (in my case; use whatever suits your data) to create a list
file_list = file_str.split(",")
# Print True if "hello world" is in the list, otherwise False
if "hello world" in file_list:
    print("True")
else:
    print("False")
from openpyxl import load_workbook
xls = load_workbook(filename=excel_path, data_only=True)
for i in xls.sheetnames:
    ws = xls[str(i)]
    for num_row in range(1, ws.max_row + 1):
        # print(ws.max_row)
        if ws['A{}'.format(num_row)].value == 'work':
            print(str(i))
Using sheet_name=None and a list comp:
import pandas as pd
file = "path/to/file/file.xlsx"
search_for = "work"
sheet_mapping = pd.read_excel(file, sheet_name=None).items()
found_in = [sheet for sheet, df in sheet_mapping if search_for in df.values.astype(str)]
print(found_in)
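If you need a substring, case-insensitive match across every cell rather than an exact match, here is a rough sketch. The names load_sheet_rows and sheets_containing are hypothetical helpers, not part of pandas or openpyxl, and the example data is made up to stand in for a real workbook.

```python
from typing import Any, Dict, Iterable, List


def load_sheet_rows(path: str) -> Dict[str, List[tuple]]:
    """Read every sheet of an .xlsx file into plain tuples of cell values."""
    # Imported lazily so the pure-Python helper below works without openpyxl.
    from openpyxl import load_workbook

    wb = load_workbook(path, read_only=True, data_only=True)
    return {ws.title: list(ws.iter_rows(values_only=True)) for ws in wb.worksheets}


def sheets_containing(sheets: Dict[str, Iterable[Iterable[Any]]], word: str) -> List[str]:
    """Return names of sheets where any cell contains `word` (case-insensitive)."""
    needle = word.lower()
    return [
        name
        for name, rows in sheets.items()
        if any(
            cell is not None and needle in str(cell).lower()
            for row in rows
            for cell in row
        )
    ]


# Hypothetical in-memory data standing in for load_sheet_rows("data.xlsx").
example = {
    "Sheet1": [("id", "note"), (1, "holiday")],
    "Sheet4": [("id", "note"), (2, "Work from home")],
    "Sheet43": [("work", None)],
}
print(sheets_containing(example, "work"))  # ['Sheet4', 'Sheet43']
```

With a real file you would call sheets_containing(load_sheet_rows("data.xlsx"), "work"), assuming openpyxl is installed.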
Background-
The following code snippet will iterate over all worksheets in a workbook, and write a formula to every last column -
import openpyxl
filename = 'filename.xlsx'
filename_output = 'filename_output.xlsx'
wb = openpyxl.load_workbook(filename)
for sheet in wb.worksheets:
    sheet.insert_cols(sheet.max_column)
    for row in sheet.iter_rows():
        row[-1].value = "=SUMIFS(J:J,M:M,#M:M)"
wb.save(filename_output)
Question -I cannot find documentation on how to name the column. Does anyone know how to achieve this?
Context -I want this column (in each worksheet) to be called 'Calculation'.
To get the last column, you can use sheet.max_column. Once you have written the formulas, use sheet.cell(1, sheet.max_column).value = "Calculation" to set the header. Updated code below:
import openpyxl
filename = 'filename.xlsx'
filename_output = 'filename_output.xlsx'
wb = openpyxl.load_workbook(filename)
for sheet in wb.worksheets:
    sheet.insert_cols(sheet.max_column)
    for row in sheet.iter_rows():
        row[-1].value = "=SUMIFS(J:J,M:M,#M:M)"
    sheet.cell(1, sheet.max_column).value = "Calculation"  # added line, inside the outer for loop
wb.save(filename_output)
Output would look something like this.
I want to add a filter (or exception) to my code that reads an Excel file.
I have this table in Excel.
But in my result I only want the rows for machine 'S9401-1'. How can I get this?
This is my code:
import xlrd
#First open the workbook
wb = xlrd.open_workbook('Book1.xlsx')
#Then select the sheet. Replace the sheet1 with name of your sheet
sheet = wb.sheet_by_name('connx 94')
#Then get values of each column. Excuse first item which is header
machine = sheet.cell_value(1,0)
alid = sheet.cell_value(1,1)
descripcion = sheet.cell_value(1,3)
result=[machine,alid,descripcion]
print (result)
Using only the xlrd package, you could do it by brute force like this:
import xlrd
wb = xlrd.open_workbook(r'c:\debug\py.xlsx')
sheet = wb.sheet_by_name('Sheet1')
def filterdata(sh, ID):
    vals = sh.row_values
    data = [[vals(r, 0)[1], vals(r, 0)[3]] for r in range(sh.nrows) if vals(r, 0)[0] == ID]
    return data
print(filterdata(sheet, 'S9401-1'))
Making a function call, you can use different IDs:
print(filterdata(sheet,'S9401-1'))
print(filterdata(sheet,'S9401-3')) # should return an empty list
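The same filtering idea can be sketched on plain lists, which makes it easy to try without a workbook. The function name filter_rows and the table contents are made up for illustration; with xlrd the rows could come from something like (sheet.row_values(r) for r in range(sheet.nrows)).

```python
from typing import Iterable, List, Sequence


def filter_rows(rows: Iterable[Sequence], machine_id: str,
                key_col: int = 0,
                keep_cols: Sequence[int] = (0, 1, 3)) -> List[list]:
    """Keep rows whose key column equals machine_id, returning only keep_cols."""
    needed = max(key_col, *keep_cols)  # highest column index we will touch
    return [
        [row[c] for c in keep_cols]
        for row in rows
        if len(row) > needed and row[key_col] == machine_id
    ]


# Hypothetical table: machine, alarm id, an unused column, description.
table = [
    ["Machine", "ALID", "Code", "Description"],
    ["S9401-1", 101, "x", "Door open"],
    ["S9401-2", 102, "y", "Low pressure"],
    ["S9401-1", 103, "z", "Overheat"],
]
print(filter_rows(table, "S9401-1"))
# [['S9401-1', 101, 'Door open'], ['S9401-1', 103, 'Overheat']]
```

The header row drops out on its own because its first cell is "Machine", not the ID being searched for.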
I have several Excel files that use lots of comments to store information.
For example, one cell has the value 2 and a comment attached to it saying
"2008:2#2009:4". It seems that the value 2 is for the current year (2010), and the comment keeps the values for all previous years, separated by '#'. I would like to create a dictionary holding all this information, like {2008:2, 2009:4, 2010:2}, but I don't know how to read the comment attached to the cell. Does any Python module for reading Excel files support reading comments?
You can do this without an Excel COM object using openpyxl:
from openpyxl import load_workbook
workbook = load_workbook('/tmp/data.xlsx')
first_sheet = workbook.get_sheet_names()[0]
worksheet = workbook.get_sheet_by_name(first_sheet)
for row in worksheet.iter_rows():
    for cell in row:
        if cell.comment:
            print(cell.comment.text)
The parsing of the comments itself can be done the same as with Steven Rumbalski's answer.
(example adapted from here)
Normally for reading from Excel, I would suggest using xlrd, but xlrd does not support comments. So instead use the Excel COM object:
from win32com.client import Dispatch
xl = Dispatch("Excel.Application")
xl.Visible = True
wb = xl.Workbooks.Open("Book1.xls")
sh = wb.Sheets("Sheet1")
comment = sh.Cells(1,1).Comment.Text()
And here's how to parse the comment:
comment = "2008:2#2009:4"
d = {}
for item in comment.split('#'):
    key, val = item.split(':')
    d[key] = val
Often, Excel comments are on two lines with the first line noting who created the comment. If so your code would look more like this:
comment = """Steven:
2008:2#2009:4"""
_, comment = comment.split('\n')
d = {}
for item in comment.split('#'):
    key, val = item.split(':')
    d[key] = val
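To get the exact dictionary the question asks for, with integer keys and the current year included, something like the following sketch might do. The function name parse_year_comment is hypothetical, and it assumes every year and value in the comment is an integer and that an author line, if present, comes before the data line.

```python
def parse_year_comment(comment, current_year, current_value):
    """Turn '2008:2#2009:4' plus the cell's own value into {year: value}."""
    # Use the last line only, which skips an optional "Author:" first line.
    data_line = comment.strip().splitlines()[-1]
    history = {}
    for item in data_line.split("#"):
        year, value = item.split(":")
        history[int(year)] = int(value)  # assumes integer years and values
    history[current_year] = current_value
    return history


print(parse_year_comment("2008:2#2009:4", 2010, 2))
# {2008: 2, 2009: 4, 2010: 2}
print(parse_year_comment("Steven:\n2008:2#2009:4", 2010, 2))
# {2008: 2, 2009: 4, 2010: 2}
```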
After running the last posted code here, can you store that information later in a word document?
from openpyxl import load_workbook
workbook = load_workbook('/tmp/data.xlsx')
first_sheet = workbook.get_sheet_names()[0]
worksheet = workbook.get_sheet_by_name(first_sheet)
for row in worksheet.iter_rows():
    for cell in row:
        if cell.comment:
            print(cell.comment.text)
How do I open a file that is an Excel file for reading in Python?
I've opened text files, for example, sometextfile.txt with the reading command. How do I do that for an Excel file?
Edit:
In the newer version of pandas, you can pass the sheet name as a parameter.
file_name = # path to file + file name
sheet = # sheet name or sheet number or list of sheet numbers and names
import pandas as pd
df = pd.read_excel(io=file_name, sheet_name=sheet)
print(df.head(5)) # print first 5 rows of the dataframe
Check the docs for examples on how to pass sheet_name: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_excel.html
Old version:
you can use pandas package as well....
When you are working with an excel file with multiple sheets, you can use:
import pandas as pd
xl = pd.ExcelFile(path + filename)
xl.sheet_names
>>> [u'Sheet1', u'Sheet2', u'Sheet3']
df = xl.parse("Sheet1")
df.head()
df.head() will print first 5 rows of your Excel file
If you're working with an Excel file with a single sheet, you can simply use:
import pandas as pd
df = pd.read_excel(path + filename)
print(df.head())
Try the xlrd library.
[Edit] - from what I can see from your comment, something like the snippet below might do the trick. I'm assuming here that you're just searching one column for the word 'john', but you could add more or make this into a more generic function.
from xlrd import open_workbook
book = open_workbook('simple.xls', on_demand=True)
for name in book.sheet_names():
    if name.endswith('2'):
        sheet = book.sheet_by_name(name)
        # Attempt to find a matching row (search the first column for 'john')
        rowIndex = -1
        for i, cell in enumerate(sheet.col(0)):
            if 'john' in str(cell.value):
                rowIndex = i
                break
        # If we found the row, print it
        if rowIndex != -1:
            cells = sheet.row(rowIndex)
            for cell in cells:
                print(cell.value)
        book.unload_sheet(name)
This isn't as straightforward as opening a plain text file and will require some sort of external module since nothing is built-in to do this. Here are some options:
http://www.python-excel.org/
If possible, you may want to consider exporting the excel spreadsheet as a CSV file and then using the built-in python csv module to read it:
http://docs.python.org/library/csv.html
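As a rough illustration of the CSV route, the sketch below reads an exported file with the built-in csv module. The helper name read_csv_rows and the file contents are made up; the demo writes a throwaway file only so that the example is self-contained.

```python
import csv
import os
import tempfile


def read_csv_rows(path):
    """Return the rows of a CSV file as dicts keyed by the header row."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


# Demo with a temporary file standing in for a spreadsheet exported as CSV.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "export.csv")
    with open(demo_path, "w", newline="", encoding="utf-8") as handle:
        handle.write("name,score\njohn,3\nmary,5\n")
    rows = read_csv_rows(demo_path)

print(rows[0]["name"], rows[1]["score"])  # john 5
```

Note that csv returns every value as a string, so numeric columns need converting yourself.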
There's the openpyxl package:
>>> from openpyxl import load_workbook
>>> wb2 = load_workbook('test.xlsx')
>>> print(wb2.sheetnames)
['Sheet2', 'New Title', 'Sheet1']
>>> worksheet1 = wb2['Sheet1'] # one way to load a worksheet
>>> worksheet2 = wb2.get_sheet_by_name('Sheet2') # another way to load a worksheet (deprecated in newer openpyxl)
>>> print(worksheet1['D18'].value)
3
>>> for row in worksheet1.iter_rows():
...     print(row[0].value)
You can use xlpython package that requires xlrd only.
Find it here https://pypi.python.org/pypi/xlpython
and its documentation here https://github.com/morfat/xlpython
This may help:
This creates a node that takes a 2D list (a list of lists) and pushes it into the Excel spreadsheet. Make sure the IN[] inputs are all present, or it will throw an exception.
This is a rewrite of the Revit Dynamo Excel node for Excel 2013, as the default prepackaged node kept breaking. I also have a similar read node. The Excel syntax in Python is touchy.
thnx #CodingNinja - updated : )
###Export Excel - intended to replace malfunctioning excel node
import clr
clr.AddReferenceByName('Microsoft.Office.Interop.Excel, Version=15.0.0.0, Culture=neutral, PublicKeyToken=71e9bce111e9429c')
##AddReferenceGUID("{00020813-0000-0000-C000-000000000046}") ''Excel C:\Program Files\Microsoft Office\Office15\EXCEL.EXE
##Need to Verify interop for version 2015 is 15 and node attachemnt for it.
from Microsoft.Office.Interop import * ##Excel
################################Initialize FP and Sheet ID
##Same functionality as the excel node
strFileName = IN[0] ##Filename
sheetName = IN[1] ##Sheet
RowOffset= IN[2] ##RowOffset
ColOffset= IN[3] ##COL OFfset
Data=IN[4] ##Data
Overwrite=IN[5] ##Check for auto-overwtite
XLVisible = False #IN[6] ##XL Visible for operation or not?
RowOffset=0
if IN[2]>0:
    RowOffset=IN[2] ##RowOffset
ColOffset=0
if IN[3]>0:
    ColOffset=IN[3] ##COL OFfset
if IN[6]<>False:
    XLVisible = True #IN[6] ##XL Visible for operation or not?
################################Initialize FP and Sheet ID
xlCellTypeLastCell = 11 #####define special sells value constant
################################
xls = Excel.ApplicationClass() ####Connect with application
xls.Visible = XLVisible ##VISIBLE YES/NO
xls.DisplayAlerts = False ### ALerts
import os.path
if os.path.isfile(strFileName):
    wb = xls.Workbooks.Open(strFileName, False) ####Open the file
else:
    wb = xls.Workbooks.Add() ####Create a new workbook
    wb.SaveAs(strFileName)
wb.application.visible = XLVisible ####Show Excel
try:
    ws = wb.Worksheets(sheetName) ####Get the sheet in the WB base
except:
    ws = wb.sheets.add() ####If it doesn't exist- add it. use () for object method
    ws.Name = sheetName
#################################
#lastRow for iterating rows
lastRow=ws.UsedRange.SpecialCells(xlCellTypeLastCell).Row
#lastCol for iterating columns
lastCol=ws.UsedRange.SpecialCells(xlCellTypeLastCell).Column
#######################################################################
out=[] ###MESSAGE GATHERING
c=0
r=0
val=""
if Overwrite == False: ####Look ahead for non-empty cells to throw error
    for r, row in enumerate(Data): ####BASE 0## each row of data in the 2D array
        for c, col in enumerate(row): ####BASE 0## each column in each row is a cell with data
            if ws.Cells[r+1+RowOffset, c+1+ColOffset].Value2 is not None: ####Target cell already holds a value
                OUT = "ERROR- Cannot overwrite"
                raise ValueError("ERROR- Cannot overwrite")
##out.append(Data[0]) ##append message for error
############################################################################
for r, row in enumerate(Data): ####BASE 0## each row of data in the 2D array
    for c, col in enumerate(row): ####BASE 0## each column in each row is a cell with data
        ws.Cells[r+1+RowOffset, c+1+ColOffset].Value2 = col.__str__()
##run macro disbled for debugging excel macro
##xls.Application.Run("Align_data_and_Highlight_Issues")
import pandas as pd
import os
files = os.listdir('path/to/files/directory/')
i = 0  # index of the file you want in the directory listing
desiredFile = files[i]
filePath = 'path/to/files/directory/%s'
Ofile = filePath % desiredFile
xls_import = pd.read_excel(Ofile)  # use pd.read_csv(Ofile) instead for a .csv file
Now you can use the power of pandas DataFrames!
This code worked for me with Python 3.5.2. It opens a CSV file, which Excel can read, for writing. I am still working on how to save data into the file, but this is the code:
import csv
excel = csv.writer(open("file1.csv", "w", newline=""))
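For the part about saving data, here is a minimal sketch with the csv module under Python 3. The helper name write_rows and the sample rows are made up, and an in-memory buffer is used only to keep the example self-contained. For a real file you would pass a handle opened in text mode with newline="".

```python
import csv
import io


def write_rows(handle, header, rows):
    """Write a header line followed by data rows to an open text handle."""
    writer = csv.writer(handle)
    writer.writerow(header)
    writer.writerows(rows)


# In-memory stand-in for: open("file1.csv", "w", newline="")
buffer = io.StringIO(newline="")
write_rows(buffer, ["machine", "alid"], [["S9401-1", 101], ["S9401-2", 102]])
print(buffer.getvalue())
```

Opening the file in binary mode ("wb") is the Python 2 convention. In Python 3 the csv writer expects a text-mode handle.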