Excel File and Sheet Location - python

I have a list of Excel files and their corresponding sheet names. I need Python to open those sheets and find the cell location of a particular value. Thanks to "alecxe", I used the following code and it worked well.
import xlrd

value = 'Avg.'
fn = ('C:/ab1.xls', 'C:/ab2.xls', 'C:/ab3.xls', 'C:/ab4.xls', 'C:/ab5.xls')
sn = ('505840', '505608', '430645', '505464', '505084')
for name, sheet_name in zip(fn, sn):
    book = xlrd.open_workbook(name)
    sheet = book.sheet_by_name(sheet_name)
    for row in range(sheet.nrows):
        for column in range(sheet.ncols):
            if sheet.cell(row, column).value == value:
                print row, column
Later I wanted to make changes and instead of writing down the filename and sheetnumber, I wanted python to grab them from an excel sheet. But the program is not printing anything. Can anyone show me where I made the mistake? Highly appreciate your comment!
import xlrd
import glob
import os

value = 'Avg.'
sheetnumber = []
filename = []
xlfile = "C:\\Users\\tsengineer\\Desktop\\Joydip Trial\\Simple.xls"
workbook = xlrd.open_workbook(xlfile)
sheet = workbook.sheet_by_index(0)
for row in range(sheet.nrows):
    value = str(sheet.cell_value(row, 17))
    filename.append(value)
for row in range(sheet.nrows):
    value = str(sheet.cell_value(row, 15))
    sheetnumber.append(value)
fn = tuple(filename)
sn = tuple(sheetnumber)
for name, sheet_name in zip(fn, sn):
    book = xlrd.open_workbook(name)
    sheet = book.sheet_by_name(sheet_name)
    for row in range(sheet.nrows):
        for column in range(sheet.ncols):
            if sheet.cell(row, column).value == value:
                print row, column
For some reason the loop is still not working: I am getting two empty lists as output. Any thoughts?
import xlrd

value = 'Avg.'
sheetnumber = []
filename = []
rowlist = []
columnlist = []
xlfile = "C:/Users/Joyd/Desktop/Experiment/Simple_1.xls"
workbook = xlrd.open_workbook(xlfile)
sheet = workbook.sheet_by_index(0)
for row in range(sheet.nrows):
    value = str(sheet.cell_value(row, 17))
    filename.append(value)
for row in range(sheet.nrows):
    value = str(sheet.cell_value(row, 15))
    sheetnumber.append(value)
fn = tuple(filename)
sn = tuple(sheetnumber)
for fname, sname in zip(fn, sn):
    book = xlrd.open_workbook(fname)
    sheet = book.sheet_by_name(sname)
    for row in range(sheet.nrows):
        for column in range(sheet.ncols):
            if sheet.cell(row, column).value == value:
                rowlist.append(row)
                columnlist.append(column)
print rowlist
print columnlist
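A likely cause of the empty output: the search term value ('Avg.') is reassigned inside the two loops that collect the file names and sheet numbers, so by the time the search runs it holds the last sheet number rather than 'Avg.'. The following is only a minimal sketch of the fix, assuming the cell contents are available as a list of row lists (for example built from sheet.row_values(r)); find_cells, search_value and sample_rows are hypothetical names used for illustration.

```python
def find_cells(rows, target):
    """Return (row, column) index pairs of every cell equal to target."""
    hits = []
    for r, row in enumerate(rows):
        for c, cell in enumerate(row):
            if cell == target:
                hits.append((r, c))
    return hits


# Keep the search term under its own name and never reuse it as a
# scratch variable while reading file names or sheet numbers.
search_value = 'Avg.'

# Hypothetical stand-in for [sheet.row_values(r) for r in range(sheet.nrows)]
sample_rows = [
    ['Name', 'Min.', 'Avg.'],
    ['a', 1.0, 2.5],
    ['Avg.', '', ''],
]

positions = find_cells(sample_rows, search_value)
print(positions)
```

In the original script the same effect can be had by appending str(sheet.cell_value(row, 17)) directly instead of assigning it to value first. Note also that xlrd returns numeric cells as floats, so a sheet number stored as a number becomes '505840.0' after str(); converting through int() first may be needed in that case.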

Related

Save copied data to different excel files

I'm trying to save different parts of copied data from one Excel file to the corresponding new Excel files, but I'm getting everything in one new file.
What am I doing wrong?
For example, in the picture I want to have 4 new excel files, with the names AAA, BBB, CCC and DDD.
The separation is every time a blank row appears.
The data to be copied will be from that row until the next blank row. (all columns)
Thanks
wb1 = xl.load_workbook(xref_file)
ws1 = wb1.worksheets[0]
mr = ws1.max_row
mc = ws1.max_column
for row in range(1,ws1.max_row):
    if(ws1.cell(row,1).value is None):
        wb2 = openpyxl.Workbook()
        ws2 = wb2.active
        conveyor_name = ws1["A1"].value
        conveyor_name.split(' -')
        conveyor_name = conveyor_name.split(' -')[0]
        filename = conveyor_name + ".xlsx"
        destination_file = os.path.join(destination,filename)
        wb2.save(destination_file)
        # copying the cell values from source
        # excel file to destination excel file
        for i in range (1, mr+1):
            for j in range (1, mc+1):
                # reading cell value from source excel file
                c = ws1.cell(row = i, column = j)
                # writing the read value to destination excel file
                ws2.cell(row = i, column = j).value = c.value
        wb2.save(destination_file)
        ws1.delete_rows(1,mr)
        wb1.save(xref_file)
Edit 1:
The change I made to your code is an additional check for empty rows. If a row is empty, the new_sheet variable is set to True. If a row is not empty and new_sheet is True, a new workbook is created; otherwise a loop copies the content into the current workbook.
Hence the updated code should be as follows:
wb1 = openpyxl.load_workbook(destination + "/" + xref_file)
ws1 = wb1.worksheets[0]
mr = ws1.max_row
mc = ws1.max_column
new_sheet = True
# for row in range(1,ws1.max_row):
row = 1
while row < ws1.max_row:
    if(ws1.cell(row,1).value is not None):
        if new_sheet == True:
            wb2 = openpyxl.Workbook()
            ws2 = wb2.active
            conveyor_name = ws1["A" + str(row)].value
            conveyor_name = conveyor_name.split()[0]
            filename = conveyor_name + ".xlsx"
            destination_file = os.path.join(destination,filename)
            print(destination_file)
            wb2.save(destination_file)
            new_sheet = False
            row = row + 1
        else:
            # copying the cell values from source
            # excel file to destination excel file
            for i in range (row, mr+1):
                if ws1.cell(i,1).value is None:
                    break
                for j in range (1, mc+1):
                    # reading cell value from source excel file
                    c = ws1.cell(row = i, column = j)
                    # writing the read value to destination excel file
                    ws2.cell(row = i - row + 1, column = j).value = c.value
            row = i
            wb2.save(destination_file)
    else:
        new_sheet = True
        row = row + 1
Edit 0: There are several possible optimizations and errors in the code. Since I don't know the nature of the data in the Excel sheet, some of these may not apply to you.
wb1 = xl.load_workbook(xref_file)
ws1 = wb1.worksheets[0]
mr = ws1.max_row
mc = ws1.max_column
for row in range(1,ws1.max_row):
    if(ws1.cell(row,1).value is None):
        wb2 = openpyxl.Workbook()
        ws2 = wb2.active
If the if condition above is true for the first row, then A1 holds no value, so there is no name to read from it. If A1 is supposed to hold a value, you may want to start checking from the second row, i.e. row should run from 2 to ws1.max_row.
        conveyor_name = ws1["A1"].value
The first line below doesn't do anything useful, because its result is discarded and the same split is performed again on the line after it.
        conveyor_name.split(' -') # this is not required
        conveyor_name = conveyor_name.split(' -')[0]
        filename = conveyor_name + ".xlsx"
        destination_file = os.path.join(destination,filename)
        wb2.save(destination_file)
        # copying the cell values from source
        # excel file to destination excel file
        for i in range (1, mr+1):
            for j in range (1, mc+1):
                # reading cell value from source excel file
                c = ws1.cell(row = i, column = j)
                # writing the read value to destination excel file
                ws2.cell(row = i, column = j).value = c.value
        wb2.save(destination_file)
        ws1.delete_rows(1,mr)
        wb1.save(xref_file)
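The splitting step itself can be separated from the file handling. Below is a minimal sketch, assuming the rows are available as tuples of cell values (with openpyxl, for instance, from ws1.iter_rows(values_only=True)) and that a row counts as blank when its first cell is None. split_on_blank_rows and the sample data are hypothetical and only illustrate the grouping.

```python
def split_on_blank_rows(rows):
    """Group consecutive non-blank rows; a row is blank if its first cell is None."""
    blocks = []
    current = []
    for row in rows:
        if not row or row[0] is None:
            # A blank row closes the block collected so far (if any).
            if current:
                blocks.append(current)
                current = []
        else:
            current.append(row)
    if current:
        blocks.append(current)
    return blocks


# Hypothetical sample standing in for the worksheet rows
sample = [
    ('AAA - conveyor', 1),
    ('part', 2),
    (None, None),
    ('BBB - conveyor', 3),
    (None, None),
    (None, None),
    ('CCC - conveyor', 4),
]

for block in split_on_blank_rows(sample):
    # The file name is taken from the first cell of each block.
    name = block[0][0].split(' -')[0]
    print(name + '.xlsx', len(block))
```

Each block could then be written to its own workbook, one save per block, which avoids tracking row offsets by hand.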

Getting wrong value when reading an xlsx file using openpyxl

I'm trying to read values from an xlsx file containing formulas using openpyxl; however, I noticed that for some cells, I'm getting a wrong value.
Here's the XLSX example:
Here's the result I get:
The code:
wb = openpyxl.load_workbook(excel_file, data_only=True)
# getting all sheets
sheets = wb.sheetnames
print(sheets)
# getting a particular sheet
worksheet = wb["Feuil1"]
print(worksheet)
# getting active sheet
active_sheet = wb.active
print(active_sheet)
# reading a cell
print(worksheet["A1"].value)
excel_data = list()
# iterating over the rows and
# getting value from each cell in row
for row in worksheet.iter_rows():
    row_data = list()
    for cell in row:
        #cell.number_format='0.0########'
        print(cell.number_format)
        row_data.append(str(cell.value))
        print(cell.value)
    excel_data.append(row_data)
return render(request, 'myapp/index.html', {"excel_data":excel_data})
Hi. What exactly do you want from the opened Excel file, i.e. in which format do you expect to get the data?
Here is my answer for getting data from an Excel file with xlrd.
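import binascii
import tempfile
import xlrd

# selected_file (base64-encoded file content), filetype (the file suffix),
# sheet_name and header_row are placeholders that the caller must supply
fp = tempfile.NamedTemporaryFile(delete=False, suffix=filetype)
fp.write(binascii.a2b_base64(selected_file))
fp.close()  # make sure the data is on disk before xlrd opens the file
workbook = xlrd.open_workbook(fp.name)
sheet = workbook.sheet_by_name(sheet_name)
row = [c or '' for c in sheet.row_values(header_row)]
# the first row holds the column names
first_row = []
for col in range(sheet.ncols):
    first_row.append(sheet.cell_value(0, col))
# every following row becomes a dict keyed by column name
archive_lines = []
for row in range(1, sheet.nrows):
    elm = {}
    for col in range(sheet.ncols):
        elm[first_row[col]] = sheet.cell_value(row, col)
    archive_lines.append(elm)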

How do I get user input into an excel spreadsheet via input() either in a csv or xlsx spreadsheet?

So far, I have been able to access csv and xlsx files in python, but I am unsure how to put in user inputs input() to add data to the spreadsheet.
I would also want this input() to only be enterable once per day but for different columns in my spreadsheet. (this is a separate issue)
Here is my code so far, first for csv, second for xlsx, I don't need both just either will do:
# writing to a CSV file
import csv

def main():
    filename = "EdProjDBeg.csv"
    header = ("Ans1", "Ans2", "Ans3")
    data = [(0, 0, 0)]
    writer(header, data, filename, "write")
    updater(filename)

def writer(header, data, filename, option):
    with open(filename, "w", newline = "") as csvfile:
        if option == "write":
            clidata = csv.writer(csvfile)
            clidata.writerow(header)
            for x in data:
                clidata.writerow(x)
        elif option == "update":
            writer = csv.DictWriter(csvfile, fieldnames = header)
            writer.writeheader()
            writer.writerows(data)
        else:
            print("Option is not known")

# Updating the CSV files with new data
def updater(filename):
    with open(filename, newline= "") as file:
        readData = [row for row in csv.DictReader(file)]
    readData[0]['Ans2'] = 0
    readHeader = readData[0].keys()
    writer(readHeader, readData, filename, "update")

# Reading and updating xlsx files
import openpyxl

theFile = openpyxl.load_workbook(r'C:\Users\joe_h\OneDrive\Documents\Data Analysis STUDYING\Excel\EdProjDBeg.xlsx')
print(theFile.sheetnames)
currentsheet = theFile['Customer1']
print(currentsheet['B3'].value)
wb = openpyxl.load_workbook(r'C:\Users\joe_h\OneDrive\Documents\Data Analysis STUDYING\Excel\EdProjDBeg.xlsx')
ws = wb.active
i = 0
cell_val = ''
# Finds which row is blank first
while cell_val != '':
    cell_val = ws['A' + i].value
    i += 1
# Modify Sheet, Starting With Row i
wb.save(r'C:\Users\joe_h\OneDrive\Documents\Data Analysis STUDYING\Excel\EdProjDBeg.xlsx')
x = input('Prompt: ')
This works for inputting data into an xlsx file.
Just use:
ws['A1'] = "data"
to input into cell A1
See code below for example using your original code:
import openpyxl

wb = openpyxl.load_workbook('sample.xlsx')
print(wb.sheetnames)
currentsheet = wb['Sheet']
ws = currentsheet
#ws = wb.active <-- defaults to first sheet
i = 0
cell_val = ''
# Finds which row is blank first
while cell_val is not None:
    i += 1
    cell_val = ws['A' + str(i)].value
    print(cell_val)
x = input('Prompt: ')
#sets A column of first blank row to be user input
ws['A' + str(i)] = x
#saves spreadsheet
wb.save("sample.xlsx")
Also just made a few edits to your original while loop in the above code:
When a cell is blank, 'None' is returned
A1 is the first cell on the left, not A0 (moved i += 1 above finding value of cell)
Converted variable 'i' to a string when accessing the cell
See https://openpyxl.readthedocs.io/en/stable/ for the full documentation
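For the CSV variant of the question, a comparable approach is to open the file in append mode so existing rows are left untouched. This is only a minimal sketch using the standard library; append_row, read_rows and the throwaway demo file are hypothetical names chosen for illustration, and the answers are hard-coded here where a real script would collect them with input().

```python
import csv
import os
import tempfile


def append_row(filename, row):
    """Append one row to a CSV file without rewriting the existing rows."""
    with open(filename, "a", newline="") as csvfile:
        csv.writer(csvfile).writerow(row)


def read_rows(filename):
    """Read the whole CSV file back as a list of lists of strings."""
    with open(filename, newline="") as csvfile:
        return list(csv.reader(csvfile))


# Demo on a throwaway file in a temporary directory
demo_path = os.path.join(tempfile.mkdtemp(), "demo.csv")
append_row(demo_path, ("Ans1", "Ans2", "Ans3"))   # header row
append_row(demo_path, ("yes", "no", "maybe"))     # stand-in for user input
print(read_rows(demo_path))
```

In an interactive script the second call could become something like append_row(filename, [input('Ans1: '), input('Ans2: '), input('Ans3: ')]).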

How to write multiple sheets into a new excel, without overwriting each other?

I'm trying to write multiple excels' column A into a new excel's column A (assuming all the excels have one worksheet each.) I've written some code, which can write one excel's column A into the new excel's column A; but if there are multiple excels, the new excel's column A will be overwritten multiple times. So how could I just add all the column As to the new excel sheet one after another without overwriting each other?
Below is my code:
import os, openpyxl

path = os.getcwd()

def func(file):
    for file in os.listdir(path):
        if file.endswith('.xlsx'):
            wb = openpyxl.load_workbook(file)
            sheet = wb.active
            colA = sheet['A']
            wb = openpyxl.Workbook()
            r = 1
            for i in colA:
                sheet = wb.active
                sheet.cell(row=r, column=1).value = i.value
                r += 1
                wb.save('new.xlsx')

func(file)
Thank you so much!!
You could proceed, for example, as follows:
import os, openpyxl

path = os.getcwd()

def func(outputFile):
    c = 0
    #create output workbook
    wbOut = openpyxl.Workbook()
    sheetOut = wbOut.active
    for fName in os.listdir(path):
        if fName.endswith('.xlsx'):
            c += 1 #move to the next column in output
            wb = openpyxl.load_workbook(fName)
            sheet = wb.active #input sheet
            #for r in range(1, sheet.max_row+1):
            #    sheetOut.cell(row=r, column=c).value = sheet.cell(row = r, column = 1).value
            for r, cell in enumerate(sheet['A']):
                sheetOut.cell(row = r+1, column = c).value = cell.value
    wbOut.save(outputFile)

#"concatenate" all columns A into one single column
def funcAppend(outputFile):
    wbOut = openpyxl.Workbook()
    sheetOut = wbOut.active
    r = 1
    for fName in os.listdir(path):
        if fName.endswith('.xlsx'):
            wb = openpyxl.load_workbook(fName)
            sheet = wb.active
            for cell in sheet['A']:
                sheetOut.cell(row = r, column = 1).value = cell.value
                r += 1
    wbOut.save(outputFile)

func('test.xlsx')

Insert hyperlink to a local folder in Excel with Python

This piece of code reads an Excel file. The file holds information such as customer job numbers, customer names, sites, works descriptions, etc.
What this code will do when completed (I hope) is read the last line of the worksheet (this is taken from a counter on the worksheet at cell 'P1'), create folders based on cell content, and create a hyperlink on the worksheet to open the lowest local folder that was created.
I have extracted the info I need from the worksheet to understand what folders need to be created, but I am not able to write a hyperlink to the cell on the row in column B.
#Insert Hyperlink to folder
def folder_hyperlink(last_row_position, destination):
    cols = 'B'
    rows = str(last_row_position)
    position = cols + rows
    final_position = "".join(position)
    print final_position # This is just to check the value
    # The statement below should insert hyperlink in eps.xlsm > worksheet jobnoeps at column B and last completed row.
    ws.cell(final_position).hyperlink = destination
The snippet above is the section that is meant to create the hyperlink; the complete code is below. I have also tried the 'xlsxwriter' package with no joy. I searched the internet, and the snippet above is the result of what I found.
Anyone know what I am doing wrong?
__author__ = 'Paul'
import os
import openpyxl
from openpyxl import load_workbook
import xlsxwriter

site_info_root = 'C:\\Users\\paul.EPSCONSTRUCTION\\PycharmProjects\\Excel_Jobs\\Site Information\\'

# This function returns the last row on eps.xlsm to be populated
def get_last_row(cell_ref = 'P1'): #P1 contains the count of the used rows
    global wb
    global ws
    wb = load_workbook("eps.xlsm", data_only = True) #Workbook
    ws = wb["jobnoeps"] #Worksheet
    last_row = ws.cell(cell_ref).value #Value of P1 from that worksheet
    return last_row

# This function will read the job number in format EPS-XXXX-YR
def read_last_row_jobno(last_row_position):
    last_row_data = []
    for cols in range(1, 5):
        last_row_data += str(ws.cell(column = cols, row = last_row_position).value)
    last_row_data_all = "".join(last_row_data)
    return last_row_data_all

#This function will return the Customer
def read_last_row_cust(last_row_position):
    cols = 5
    customer_name = str(ws.cell(column = cols, row = last_row_position).value)
    return customer_name

#This function will return the Site
def read_last_row_site(last_row_position):
    cols = 6
    site_name = str(ws.cell(column = cols, row = last_row_position).value)
    return site_name

#This function will return the Job Description
def read_last_row_disc(last_row_position):
    cols = 7
    site_disc = str(ws.cell(column = cols, row = last_row_position).value)
    return site_disc

last_row = get_last_row()
job_no_details = read_last_row_jobno(last_row)
job_customer = read_last_row_cust(last_row)
job_site = read_last_row_site(last_row)
job_disc = read_last_row_disc(last_row)
cust_folder = job_customer
job_dir = job_no_details + "\\" + job_site + " - " + job_disc

#Insert Hyperlink to folder
def folder_hyperlink(last_row_position, destination):
    cols = 'B'
    rows = str(last_row_position)
    position = cols + rows
    final_position = "".join(position)
    print final_position # This is just to check the value
    # The statement below should insert hyperlink in eps.xlsm > worksheet jobnoeps at column B and last completed row.
    ws.cell(final_position).hyperlink = destination

folder_location = site_info_root + job_customer + "\\" + job_dir
print folder_location # This is just to check the value
folder_hyperlink(last_row, folder_location)
Now my hyperlink function looks like this after trying xlsxwriter as advised.
##Insert Hyperlink to folder
def folder_hyperlink(last_row_position, destination):
    import xlsxwriter
    cols = 'B'
    rows = str(last_row_position)
    position = cols + rows
    final_position = "".join(position)
    print final_position # This is just to check the value
    workbook = xlsxwriter.Workbook('eps.xlsx')
    worksheet = workbook.add_worksheet('jobnoeps')
    print worksheet
    worksheet.write_url(final_position, 'folder_location')
    workbook.close()
The function overwrites the existing eps.xlsx, creates a jobnoeps worksheet and then inserts the hyperlink. I have played with the following lines but don't know how to get it to open the existing xlsx file and the existing jobnoeps tab and then enter the hyperlink.
workbook = xlsxwriter.Workbook('eps.xlsx')
worksheet = workbook.add_worksheet('jobnoeps')
worksheet.write_url(final_position, 'folder_location')
The XlsxWriter write_url() method allows you to link to folders or other workbooks and worksheets as well as internal links and links to web urls. For example:
import xlsxwriter
workbook = xlsxwriter.Workbook('links.xlsx')
worksheet = workbook.add_worksheet()
worksheet.set_column('A:A', 50)
# Link to a Folder.
worksheet.write_url('A1', r'external:C:\Temp')
# Link to a workbook.
worksheet.write_url('A3', r'external:C:\Temp\Book.xlsx')
# Link to a cell in a worksheet.
worksheet.write_url('A5', r'external:C:\Temp\Book.xlsx#Sheet1!C5')
workbook.close()
See the docs linked to above for more details.
Here is the code that did the trick:-
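# Creates hyperlink in existing workbook...
def set_hyperlink(row, column):
    # row and column are the 1-based indices of the target cell
    from openpyxl import load_workbook
    link = "hyperlink address"
    wb = load_workbook("filename.xlsx")
    ws = wb["sheet_name"]  # replaces the deprecated wb.get_sheet_by_name("sheet_name")
    ws.cell(row = row, column = column).hyperlink = link
    wb.save("filename.xlsx")

# e.g. set_hyperlink(last_row, 2) for column B of the last populated row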
Tried again with openpyxl as advised.
