I'm trying to write a script that pulls one row of an Excel sheet at a time and prints it. I would like to use a method to change the row. I am able to change the row index (the variable rrowx), but when I print currentRow I get the original row and not the newly selected row.
import xlrd

class Loader(object): ## engine to load and unload spread sheets
    ## then sets them to a variable
    # set the variables
    workbook = " " # name of the file
    sheetCount = 0 # amount of sheets in the spreadsheet
    sheetNumber = 0 # current sheet (index)
    rowCount = 0 # amount of rows in the spreadsheet
    currentSheet = " " # name of current sheet
    topRow = " " # row 0 string
    currentRow = " " # row x string
    global rrowx
    rrowx = 0
    # begin the load
    workbook = xlrd.open_workbook('test.xlsx')
    sheetCount = workbook.nsheets
    sheetNames = workbook.sheet_names()
    currentSheet = workbook.sheet_by_index(sheetNumber)
    #topRow = currentSheet.row_values(rowx=rrowx, start_colx=scolx, end_colx=ecolx)
    currentRow = currentSheet.row_values(rowx=rrowx)

    # methods to navigate the sheet
    def nextrow(self):
        global rrowx
        print(rrowx)
        rrowx += 1
        print(rrowx)
        return rrowx

spreadsheet = Loader()
## Debugging prints
print(spreadsheet.sheetNames)
print(spreadsheet.sheetCount)
print("What Sheet would you like to use? (Use numbers)")
spreadsheetadjust = int(input()) # takes input as an integer
spreadsheet.currentSheet = spreadsheetadjust - 1 # takes input and -1 for index value
print('Current sheet name: %s' % spreadsheet.currentSheet) # prints current sheet name
print('top row:')
#print(spreadsheet.topRow)
print('row 1 ')
print(spreadsheet.currentRow)
print("NextRow")
spreadsheet.nextrow()
print(spreadsheet.currentRow)
I thought that changing rrowx and printing currentRow again would change the row that is printed. Instead the same row is printed twice, even though the prints I added in the method show that the value of rrowx is changing.
Disclosure: I've only been programming for a month, so sorry if this has an easy answer that I'm just missing.
I highly recommend reading up on the basics of object-oriented programming in Python. There are several issues with your code; here are some of them:
- The variables you define in the class body are class attributes, meaning all instances of the class share them. If you create multiple instances of your class, you will get unexpected behavior. The line that sets currentRow also runs only once, when the class is defined, so changing rrowx later never updates it.
- Using a global variable is not recommended, especially when you can do without it.
- You don't need to declare variables with placeholder values in Python before assigning them.
- In your implementation you have spreadsheet.currentSheet = spreadsheetadjust - 1, which replaces the sheet object with a number and will fail even once the row problem is fixed. You want to change the sheet number and then reload the sheet from it.
Here is code that works with proper usage of Python classes:
import xlrd

class Loader:
    def __init__(self, path_to_xlsx='test.xlsx'):
        self.workbook = xlrd.open_workbook(path_to_xlsx)
        self.sheetCount = self.workbook.nsheets
        self.sheetNames = self.workbook.sheet_names()
        self.select_sheet(0)

    def select_sheet(self, sheetNumber):
        # switch to another sheet and start again from its first row
        self.sheetNumber = sheetNumber
        self.rrowx = 0
        self.currentSheet = self.workbook.sheet_by_index(self.sheetNumber)
        self.currentRow = self.currentSheet.row_values(rowx=self.rrowx)

    def nextrow(self):
        # re-read the row every time the index changes
        self.rrowx += 1
        self.currentRow = self.currentSheet.row_values(rowx=self.rrowx)

spreadsheet = Loader()
print("What Sheet would you like to use? (Use numbers)")
spreadsheetadjust = int(input()) # takes input as an integer
spreadsheet.select_sheet(spreadsheetadjust - 1) # subtract 1 to get the index value
print('Current sheet name: %s' % spreadsheet.currentSheet.name) # prints current sheet name
print(spreadsheet.currentRow)
spreadsheet.nextrow()
print(spreadsheet.currentRow)
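The original printed the same row twice because currentRow was computed a single time and never refreshed when the index changed. The sketch below shows one way to avoid that, using a property that looks the row up on every access. It uses an in-memory list of rows in place of an xlrd sheet so it runs without a workbook; the RowCursor name and the sample data are made up for illustration.

```python
class RowCursor:
    """Walks over rows one at a time; stands in for a sheet reader."""

    def __init__(self, rows):
        self.rows = rows      # instance attribute: each cursor has its own data
        self.row_index = 0    # instance attribute instead of a global

    @property
    def current_row(self):
        # Looked up on every access, so it always reflects row_index.
        return self.rows[self.row_index]

    def next_row(self):
        # Advance only while another row exists.
        if self.row_index + 1 < len(self.rows):
            self.row_index += 1
        return self.current_row


rows = [["name", "qty"], ["apple", 3], ["pear", 5]]
cursor = RowCursor(rows)
print(cursor.current_row)
cursor.next_row()
print(cursor.current_row)
```

With a property there is no second attribute to keep in sync, which is the step the original code was missing.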
Hi I have written some code in which a user adds in items and the prices they bought and sold for, then it automatically adds this to an Excel sheet. However, I am having trouble thinking of a way to insert multiple rows of the same item if the quantity sold is >1 without the user entering in the same fields x amount of times. Thanks
I am using Tkinter where a Button executes the Add_Data function.
def Add_Data(Event=None):
    workbook = load_workbook('Accounting Sheet.xlsx')
    sheet = workbook.active
    sheet.insert_rows(idx=2)
    sheet['A2'] = str(Date_Entry.get())
    sheet['B2'] = str(Shoe_Name_Entry.get())
    sheet['C2'] = str(Purchase_Price_Entry.get())
    sheet['D2'] = str(Sale_Price_Entry.get())
    RRP_Float = float(Purchase_Price_Entry.get())
    Sale_Float = float(Sale_Price_Entry.get())
    sheet['E1'] = str(Sale_Float-RRP_Float)
    Shoe_Name_Entry.delete(0, 30)
    Purchase_Price_Entry.delete(0, 6)
    Sale_Price_Entry.delete(0, 6)
    workbook.save('Accounting Sheet.xlsx')
Include a textbox for 'Quantity' on the input form (defaulting to 1) and read that value before writing to Excel. When writing the values, use a loop that iterates 'quantity' times, like below.
If the Quantity is 1, one entry is made as before. If the Quantity is 2 or more, then after the first row is added the loop adds another entry with exactly the same values, and so on until the Quantity value is reached.
def Add_Data(Event=None):
    workbook = load_workbook('Accounting Sheet.xlsx')
    sheet = workbook.active
    # the leading '0' turns an empty textbox into 0 instead of raising ValueError
    quantity = int('0'+Quantity_Entry.get())
    for sale in range(quantity):
        # insert one identical row per unit sold
        sheet.insert_rows(idx=2)
        sheet['A2'] = str(Date_Entry.get())
        sheet['B2'] = str(Shoe_Name_Entry.get())
        sheet['C2'] = str(Purchase_Price_Entry.get())
        sheet['D2'] = str(Sale_Price_Entry.get())
        RRP_Float = float(Purchase_Price_Entry.get())
        Sale_Float = float(Sale_Price_Entry.get())
        sheet['E1'] = str(Sale_Float-RRP_Float)
    # clear the form once, after all rows have been written
    Shoe_Name_Entry.delete(0, 30)
    Purchase_Price_Entry.delete(0, 6)
    Sale_Price_Entry.delete(0, 6)
    workbook.save('Accounting Sheet.xlsx')
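The quantity handling can also be separated from the Tkinter and workbook code, which makes it easier to check on its own. This is only a sketch: parse_quantity and build_rows are hypothetical helper names, and falling back to 1 for empty or invalid input is an assumption based on the "defaults to 1" suggestion above.

```python
def parse_quantity(text, default=1):
    """Return a positive integer quantity, falling back to `default`."""
    try:
        quantity = int(text.strip())
    except ValueError:
        return default
    return quantity if quantity > 0 else default


def build_rows(date, name, purchase_price, sale_price, quantity_text):
    """Build one row per unit sold; every row carries the same values."""
    purchase = float(purchase_price)
    sale = float(sale_price)
    row = [date, name, purchase, sale, sale - purchase]
    return [list(row) for _ in range(parse_quantity(quantity_text))]


for row in build_rows("2020-01-01", "Runner", "100", "150", "3"):
    print(row)
```

Each returned row could then be written with openpyxl, for example with sheet.append(row), which adds rows at the bottom of the sheet rather than inserting them at row 2.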
In an Excel file I have two large tables. Table A ("Dissection", 409 rows x 25 columns) contains unique entries, each identified by a unique ID. Table B ("Damage", 234 rows x 39 columns) uses the ID of Table A in the first cell and extends it. To analyze the data in Minitab, all data for an ID must be in a single long row, meaning the values of "Damage" have to follow those of "Dissection". The whole thing looks like this:
Table A - i.e. Dissection
- ID1 [valueTabA] [valueTabA]
- ID2 [valueTabA] [valueTabA]
- ID3 [valueTabA] [valueTabA]
- ID4 [valueTabA] [valueTabA]
Table B - i.e. Damage
- ID1 [valueTabB1] [valueTabB1]
- ID1 [valueTabB2] [valueTabB2]
- ID4 [valueTabB] [valueTabB]
They are supposed to combine something like this:
Table A
- ID1 [valueTabA] [valueTabA] [valueTabB1] [valueTabB1] [valueTabB2] [valueTabB2]
- ID2 [valueTabA] [valueTabA]
- ID3 [valueTabA] [valueTabA]
- ID4 [valueTabA] [valueTabA] [valueTabB] [valueTabB]
What is the best way to do that?
The following describes my two approaches. Both use the same data in the same tables but in two different files, to be able to test both scenarios.
The first approach uses a file, where both tables are in the same worksheet, the second uses a file where both tables are in different worksheets.
Scenario 1: both tables are in the same worksheet, where I'm trying to move the row as a range
current_row = 415 # start without headers of table A
current_line = 2 # start without headers of table B
for row in ws.iter_rows(min_row=415, max_row=647):
    # loop through damage
    id_A = ws.cell(row=current_row, column=1).value
    max_col = 25
    for line in ws.iter_rows(min_row=2, max_row=409):
        # loop through dissection
        id_B = ws.cell(row=current_line, column=1).value
        if id_A == id_B:
            copy_range = ((ws.cell(row=current_line, column=2)).column_letter + str(current_line) + ":" +
                          (ws.cell(row=current_line, column=39)).column_letter + str(current_line))
            ws.move_range(copy_range, rows=current_row, cols=max_col+1)
            print("copied range: " + copy_range +" to: " + str(current_row) + ":"+str(max_col+1))
            count += 1
            break
        if current_line > 409:
            current_line = 2
        else:
            current_line += 1
    current_row += 1
-> Here I'm struggling to append the range to the right row of Table A, without overwriting the previous row (see example ID1 above)
Scenario 2: both tables are located in separate sheets
dissection = wb["Dissection"]
damage = wb["Damage"]
recovery = wb["Recovery"]
current_row, current_line = 2, 2
for row in damage.iter_rows():
    # loop through first table
    id_A = damage.cell(row=current_row, column=1).value
    for line in dissection.iter_rows():
        # loop through second table
        id_B = dissection.cell(row=current_line, column=1).value
        copyData = []
        if id_A == id_B:
            for col in range(2, 39):
                # add data to the list, skipping the ID
                copyData.append(damage.cell(row=current_line, column=col).value)
            # print(copyData) for debugging purposes
            for item in copyData:
                column_count = dissection.max_column
                dissection.cell(row=current_row, column=column_count).value = item
                column_count += 1
            current_row += 1
            break
        if not current_line > 409:
            # prevent looping out of range
            current_line += 1
        else:
            current_line = 2
-> Same problem as in 1., at some point it's not adding the damage values to copyData anymore but None instead, and finally it's just not pasting the items (cells stay blank)
I've tried everything excel related that I could find, but unfortunately nothing worked. Would pandas be more useful here or am I just not seeing something?
Thanks for taking the time to read this :)
I highly recommend using pandas for situations like this. It is still a bit unclear how your data is formatted in the excel file, but given your second option I assume that the tables are both on different sheets in the excel file. I also assume that the first row contains the table title (e.g. Table A - i.e. Dissection). If this is not the case, just remove skiprows=1:
import pandas as pd
df = pd.concat(pd.read_excel("filename.xlsx", sheet_name=None, skiprows=1, header=None), axis=1, ignore_index=True)
df.to_excel('combined_data.xlsx') # save to Excel
read_excel loads the Excel file into a pandas dataframe. sheet_name=None indicates that all sheets should be loaded into a dict of dataframes keyed by sheet name. pd.concat concatenates these dataframes into one single dataframe (axis=1 places them side by side, as additional columns). You can explore the data with df.head(), or save the dataframe to Excel with df.to_excel.
I ended up using the second scenario (one file, two worksheets), but this code should be adaptable to the first scenario (one file, one worksheet) as well.
I copied the rows of Table B using code taken from here.
And handled the offset with code from here.
I also added a few extras to my solution to make it more generic:
import openpyxl, os
from openpyxl.utils import range_boundaries

# Introduction
print("Welcome!\n[!] Advice: Always have a backup of the file you want to sort.\n[+] Please put the file to be sorted in the same directory as this program.")
print("[+] This program assumes that the value to be sorted by is located in the first column of the outgoing table.")

# File listing
while True:
    files = [f for f in os.listdir('.') if os.path.isfile(f)]
    valid_types = ["xlsx", "xltx", "xlt", "xls"]
    print("\n[+] Current directory: " + os.getcwd())
    print("[+] Excel files in the current directory: ")
    for f in files:
        # rsplit avoids an IndexError for file names without a dot
        if f.rsplit(".", 1)[-1] in valid_types:
            print(f)
    file = input("\nWhich file would you like to sort: ")
    try:
        ending = file.split(".")[1]
    except IndexError:
        print("Please only enter excel files.")
        continue
    if ending in valid_types:
        break
    else:
        print("Please only enter excel files")
wb = openpyxl.load_workbook(file)

# Handling Worksheets
print("\nAvailable Worksheets: " + str(wb.sheetnames))
print("Which sheet would you like to sort? (please copy the name without the quotes and brackets)")
outgoing_sheet = wb[input("Outgoing sheet: ")]
print("\nAvailable Worksheets: " + str(wb.sheetnames))
print("Which is the receiving sheet? (please copy the name without the quotes and brackets)")
receiving_sheet = wb[input("Receiving sheet: ")]

# Declaring functions
def copy_row(source_range, target_start, source_sheet, target_sheet):
    # Define start Range(target_start) in the new Worksheet
    min_col, min_row, max_col, max_row = range_boundaries(target_start)
    # Iterate Range you want to copy
    for row, row_cells in enumerate(source_sheet[source_range], min_row):
        for column, cell in enumerate(row_cells, min_col):
            # Copy Value from Copy.Cell to given Worksheet.Cell
            target_sheet.cell(row=row, column=column).value = cell.value

def ask_yes_no(prompt):
    """
    :param prompt: The question to be asked
    :return: Value to check
    """
    while True:
        answer = input(prompt + " (y/n): ")
        if answer == "y":
            return True
        elif answer == "n":
            return False
        print("Please only enter y or n.")

def ask_integer(prompt):
    while True:
        try:
            answer = int(input(prompt + ": "))
            break
        except ValueError:
            print("Please only enter integers (e.g. 1, 2 or 3).")
    return answer

def scan_empty(index):
    print("Scanning for empty cells...")
    scan, fill = False, False
    min_col = outgoing_sheet.min_column
    max_col = outgoing_sheet.max_column
    cols = range(min_col, max_col+1)
    break_loop = False
    count = 0
    if not scan:
        # first pass: stop at the first empty cell and ask what to do
        search_index = index
        for row in outgoing_sheet.iter_rows():
            for n in cols:
                cell = outgoing_sheet.cell(row=search_index, column=n).value
                if cell:
                    pass
                else:
                    choice = ask_yes_no("\n[!] Empty cells found, would you like to fill them? (recommended)")
                    if choice:
                        fill = input("Fill with: ")
                        scan = True
                        break_loop = True
                        break
                    else:
                        print("[!] Attention: This can lead to mismatches in the sorting algorithm.")
                        confirm = ask_yes_no("[>] Are you sure you don't want to fill them?\n[+] Hint: You can also enter spaces.\n(n)o I really don't want to\noka(y) I'll enter something, just let me sort already.\n")
                        if confirm:
                            fill = input("Fill with: ")
                            scan = True
                            break_loop = True
                            break
                        else:
                            print("You have chosen not to fill the empty cells.")
                            scan = True
                            break_loop = True
                            break
            if break_loop:
                break
            search_index += 1
    if fill:
        # second pass: write the fill value into every empty cell
        search_index = index
        for row in outgoing_sheet.iter_rows(max_row=outgoing_sheet.max_row-1):
            for n in cols:
                cell = outgoing_sheet.cell(row=search_index, column=n).value
                if cell:
                    pass
                elif cell != int(0):
                    count += 1
                    outgoing_sheet.cell(row=search_index, column=n).value = fill
            search_index += 1
        print("Filled " + str(count) + " cells with: " + fill)
    return fill, count

# Declaring basic variables
first_value = ask_yes_no("Is the first row containing values the 2nd in both tables?")
if first_value:
    current_row, current_line = 2, 2
else:
    current_row = ask_integer("Sorting table first row")
    current_line = ask_integer("Receiving table first row")
verbose = ask_yes_no("Verbose output?")
reset = current_line
rec_max = receiving_sheet.max_row
scan_empty(current_row)
count = 0
print("\nSorting: " + str(outgoing_sheet.max_row - 1) + " rows...")
for row in outgoing_sheet.iter_rows():
    # loop through first table - Table you want to sort
    id_A = outgoing_sheet.cell(row=current_row, column=1).value
    if verbose:
        print("\nCurrently at: " + str(current_row - 1) + "/" + str(outgoing_sheet.max_row - 1) + "")
        try:
            print("Sorting now: " + id_A)
        except TypeError:
            # Handling None type exceptions
            pass
    for line in receiving_sheet.iter_rows():
        # loop through second table - The receiving table
        id_B = receiving_sheet.cell(row=current_line, column=1).value
        if id_A == id_B:
            try:
                # calculate the offset
                offset = max((row.column for row in receiving_sheet[current_line] if row.value is not None)) + 1
            except ValueError:
                # typical "No idea why, but it doesn't work without it" - code
                pass
            start_paste_from = receiving_sheet.cell(row=current_line, column=offset).column_letter + str(current_line)
            copy_Range = ((outgoing_sheet.cell(row=current_row, column=2)).column_letter + str(current_row) + ":" +
                          (outgoing_sheet.cell(row=current_row, column=outgoing_sheet.max_column)).column_letter + str(current_row))
            # Don't copy the ID, alternatively set damage.min_column for the first and damage.max_column for the second
            copy_row(copy_Range, start_paste_from, outgoing_sheet, receiving_sheet)
            count += 1
            current_row += 1
            if verbose:
                print("Copied " + copy_Range + " to: " + str(start_paste_from))
            break
        if not current_line > rec_max:
            # prevent looping out of range
            current_line += 1
        else:
            current_line = reset
wb.save(file)
print("\nSorted: " + str(count) + " rows.")
print("Saving the file to: " + os.getcwd())
print("Done.")
Note: The values of table B ("Damage") are sorted according to the ID, although that is not required. However, if you choose to do so, this can be done using pandas.
import pandas as pd
# open the correct worksheet
df = pd.read_excel("excel/separated.xlsx", sheet_name="Damage")
# sort_values returns a new DataFrame, so keep the result
df = df.sort_values(by="Identification")
df.to_excel("sorted.xlsx")
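The matching step itself, appending every Table B row to the Table A row with the same ID, can be sketched without a workbook by grouping Table B in a dictionary first. This is a minimal illustration on plain lists, not the script above; the widen function name and the sample values are made up, and it assumes the ID is the first element of every row.

```python
from collections import defaultdict


def widen(table_a, table_b):
    """Append every Table B row to the Table A row sharing its ID.

    Both tables are sequences of rows whose first element is the ID.
    """
    extras = defaultdict(list)
    for row in table_b:
        row_id, values = row[0], row[1:]
        extras[row_id].extend(values)  # repeated IDs accumulate in order

    # IDs without a Table B entry keep only their Table A values.
    return [list(row) + extras.get(row[0], []) for row in table_a]


table_a = [["ID1", "a1", "a2"], ["ID2", "a3", "a4"], ["ID4", "a5", "a6"]]
table_b = [["ID1", "b1", "b2"], ["ID1", "b3", "b4"], ["ID4", "b5", "b6"]]
for row in widen(table_a, table_b):
    print(row)
```

Rows in this shape can be read from openpyxl with ws.iter_rows(values_only=True). A single pass over each table also avoids the manual current_row and current_line bookkeeping.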
I am writing a bot using gspread and IMDbPy. The script right now takes input(a movie title), it then grabs the movie ID, finds the movie's rating on IMDB.com, then posts the rating onto a spreadsheet into a specific cell.
There is a function named "update_cell" that updates a specific cell based on the given row and column parameters. Once the bot is complete, I don't want to have to keep going into the code to update the row parameter. I want it to increase by 1 each time the bot executes.
Is there a way to do this? I'll post the code below:
import time

import gspread
import imdb
from oauth2client.service_account import ServiceAccountCredentials

ia = imdb.IMDb()

def take_input():
    fd = open('movielist.txt',"w")
    print("Input your movie please: \n")
    inp = input()
    fd.write(inp)
    fd.close()

take_input()
# Wed 8/28/19 - movie_list is a list object. Must set it equal to our ia.search_movies
# Need to find out where to put movie_list = ia.search_movies in the code, and what to
# remove or keep.
a = int(52)
b = int(18)

def Main():
    c = """Python Movie Rating Scraper by Nickydimebags"""
    print(c)
    time.sleep(2)
    f1 = open('movielist.txt')
    movie_list = []
    for i in f1.readlines():
        movie_list.append(i)
        movie_list = ia.search_movie(i)
        movie_id = movie_list[0].movieID
        print(movie_id)
        m = ia.get_movie(movie_id)
        print(m)
        rating = m['rating']
        print(rating)
    scope = ["https://spreadsheets.google.com/feeds",'https://www.googleapis.com/auth/spreadsheets', "https://www.googleapis.com/auth/drive.file","https://www.googleapis.com/auth/drive"]
    creds = ServiceAccountCredentials.from_json_keyfile_name("creds.json", scope)
    client = gspread.authorize(creds)
    sheet = client.open("Movie Fridays").sheet1
    sheet.update_cell(a, b, rating) #updates specific cell

Main()
^ The a variable is what I need to increase by 1 every time the bot runs
I am guessing the a variable tracks the row index. You could get the index of the next empty row cell in the column you are adding the values to.
def next_available_row(worksheet, col):
    return len(worksheet.col_values(col)) + 1

sheet = client.open("Movie Fridays").sheet1
sheet.update_cell(next_available_row(sheet, b), b, rating)
You are going to need to save the current or next value of your a variable somewhere and update it every time the script runs.
You could abuse a cell in the spreadsheet for this, or write it out to a file.
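Writing the value out to a file could look roughly like this. It is only a sketch: the next_row function name and the counter file name are hypothetical, and the starting row of 52 is taken from the a variable in the question.

```python
import os
import tempfile


def next_row(counter_path, first_row=1):
    """Return the row to write to, then store the following row on disk."""
    try:
        with open(counter_path) as handle:
            row = int(handle.read().strip())
    except (FileNotFoundError, ValueError):
        row = first_row  # no usable counter yet: start at first_row
    with open(counter_path, "w") as handle:
        handle.write(str(row + 1))
    return row


# Demonstration in a temporary directory; the file name is hypothetical.
demo_path = os.path.join(tempfile.mkdtemp(), "next_row.txt")
print(next_row(demo_path, first_row=52))
print(next_row(demo_path, first_row=52))
```

In the bot, the returned value would take the place of a in sheet.update_cell(a, b, rating), so each run writes one row further down.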
I am just starting to learn Python and am looking for some direction on a script I am working on to text out daily pickups for my drivers. The vendor name is entered into a spreadsheet along with a purchase order # and notes. What I would like to do is cycle through column "A", find all instances of a vendor name, grab the corresponding B & C cell values and save all the info to a text file. I can get it to work if I name the search string explicitly, but not if it's a variable. Here is what I have so far:
TestList=[]
TestDict= {}
LineNumber = 0
for i in range(1, maxrow + 1):
    VendorName = sheet.cell(row = i, column = 1)
    if VendorName.value == "CERTIFIED LETTERING":#here is where im lost
        #print (VendorName.coordinate)
        VendLoc = str(VendorName.coordinate)
        TestList.append(VendLoc)
        TestDict[VendorName.value]=[TestList]
test = (TestDict["CERTIFIED LETTERING"][0])
ListLength = (len(test))
ListPo = []
List_Notes = []
number = 0
for i in range (0, ListLength):
    PO = (str('B'+ test[number][1]))
    Note = (str('C'+ test[number][1]))
    ListPo.append(PO)
    List_Notes.append(Note)
    number = number + 1
number = 0
TestVend =(str(VendorName.value))
sonnetFile = open('testsaveforpickups.txt', 'w')
sonnetFile.write("Pick up at:" + '\n')
sonnetFile.write(str(VendorName.value)+'\n')
for i in range (0, ListLength):
    sonnetFile.write ("PO# "+ str(sheet[ListPo[number]].value)+'\n'
                      +"NOTES: " + str(sheet[List_Notes[number]].value)+'\n')
    number = number + 1
sonnetFile.close()
the results are as follows:
Pick up at:
CERTIFIED LETTERING
PO# 1111111-00
NOTES: aaa
PO# 333333-00
NOTES: ccc
PO# 555555-00
NOTES: eee
I've tried everything I could think of to change the hard-coded string "CERTIFIED LETTERING" to a variable, including creating a list of all vendors in column A and using that as a dictionary to go off of. Any help or ideas to point me in the right direction would be appreciated. And I apologise for any formatting errors. I'm new to posting here.
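One possible direction is to pass the vendor name into a function and collect the matching rows directly, instead of building cell coordinates. The sketch below works on plain (vendor, PO, notes) tuples so it can run without a workbook. The pickups_for and format_pickups names, the second vendor and the sample rows are made up, and normalizing case and whitespace is only a guess at why a variable might fail to match.

```python
def pickups_for(rows, vendor):
    """Return (po, notes) pairs for every row whose first cell matches vendor."""
    wanted = vendor.strip().upper()
    matches = []
    for name, po, notes in rows:
        # Compare normalized text so stray spaces or case do not hide a match.
        if name is not None and str(name).strip().upper() == wanted:
            matches.append((po, notes))
    return matches


def format_pickups(vendor, matches):
    """Build the text in the same layout as the output shown above."""
    lines = ["Pick up at:", vendor]
    for po, notes in matches:
        lines.append("PO# " + str(po))
        lines.append("NOTES: " + str(notes))
    return "\n".join(lines) + "\n"


rows = [
    ("CERTIFIED LETTERING", "1111111-00", "aaa"),
    ("OTHER VENDOR", "222222-00", "bbb"),
    ("certified lettering ", "333333-00", "ccc"),
]
vendor = "CERTIFIED LETTERING"
print(format_pickups(vendor, pickups_for(rows, vendor)))
```

With openpyxl, rows in this shape could come from sheet.iter_rows(min_col=1, max_col=3, values_only=True), and the formatted text could be written to the file in one call.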
I am writing code that should compare values from 2 xls files. One of the files has more than 1 sheet, and I always have to read the data only from the last sheet. I really don't know how to manage this. Below is my code:
#! /usr/bin/python
import xlrd #Import the package to read from excel
#start with station report
station_rep = xlrd.open_workbook("/home/fun/data/Station.xls",encoding_override='utf8') #Open the station report.xls
station_sheet = station_rep.sheet_by_index(0) #should get the last sheet
station_vn = station_sheet.col_values(5, start_rowx=1, end_rowx=None) #List of vouchers in station report
#start with billing export
billing_rep = xlrd.open_workbook("/home/fun/data/Export.xls",encoding_override='utf8') #Open billing report xls
billing_sheet = billing_rep.sheet_by_index(0) #get the current sheet
billing_vn = billing_sheet.col_values(1, start_rowx=0, end_rowx=None)#list of vouchers in billing reports
for vn in station_vn: #For every voucher in station report
    if vn: #if there is data
        vnb=vn[1:] #change data
        vnb=float(vnb) #change data type to float
        if vnb in billing_vn: # check if voucher exist in billing report
            row=station_vn.index(vn)+1 #take the row of current voucher
            station_vn_data = station_sheet.row_values(row, start_colx=0, end_colx=15) #take the data for current row from station report
            billing_vn_data = billing_sheet.row_values(billing_vn.index(vnb),start_colx=0, end_colx=15) #take the data for current voucher from billing report
            if float(station_vn_data[5])==billing_vn_data[1]: #check if vouchers are equal
                print "voucher number", station_vn_data[5], billing_vn_data[1]
                if round(station_vn_data[10],3)<>round(billing_vn_data[5],3): #check for differences in unit price
                    print "Difference in unit price", round(station_vn_data[10],3),"-" , round(billing_vn_data[5],3),"=", round(station_vn_data[10]-billing_vn_data[5],3)
                if station_vn_data[11]<>billing_vn_data[4]: #check for difference in quantity
                    print "quantity", round(station_vn_data[11],4),"-", round(billing_vn_data[4],4),"=",round(station_vn_data[11]-billing_vn_data[4],4) #if there are differences, the quantities are printed
                if station_vn_data[12]<>billing_vn_data[6]:# check for difference in total sum
                    print "total sum", round(station_vn_data[12],3),"-", round(billing_vn_data[6],3),"=",round(station_vn_data[12]-billing_vn_data[6],3)
                else:
                    print "voucher is OK"
                print " " #print empty row for more clear view
        else: #if voucher does not exist in billing
            if vnb:
                print vnb, "does not exist in billing report" #print the voucher number which doesn't exist
station_sheet = station_rep.sheet_by_index(0) #should get the last sheet
There is no reason this should get the last sheet; Python indices are zero-based, so 0 is the first element in a sequence:
>>> [1, 2, 3][0]
1
If you want the last worksheet, note that Python allows negative indexing from the end of a sequence:
>>> [1, 2, 3][-1]
3
On that basis, I think you want:
station_sheet = station_rep.sheet_by_index(-1) # get the last sheet
#                                          ^ note index
I managed to fix it with this code:
for id in station_rep.sheet_names():
    sheet_id=station_rep.sheet_names().index(id)
# after the loop, sheet_id holds the index of the last sheet
station_sheet = station_rep.sheet_by_index(sheet_id) #get the last sheet