I am writing code that should compare values from 2 xls files. One of the files has more than 1 sheet, and I always have to read the data only from the last sheet. I really don't know how to manage this. Below is my code:
#! /usr/bin/python
import xlrd #Import the package to read from excel
#start with station report
station_rep = xlrd.open_workbook("/home/fun/data/Station.xls",encoding_override='utf8') #Open the station report.xls
station_sheet = station_rep.sheet_by_index(0) #should get the last sheet
station_vn = station_sheet.col_values(5, start_rowx=1, end_rowx=None) #List of vouchers in station report
#start with billing export
billing_rep = xlrd.open_workbook("/home/fun/data/Export.xls",encoding_override='utf8') #Open billing report xls
billing_sheet = billing_rep.sheet_by_index(0) #get the current sheet
billing_vn = billing_sheet.col_values(1, start_rowx=0, end_rowx=None) #list of vouchers in billing reports
for vn in station_vn: #For every voucher in station report
    if vn: #if there is data
        vnb=vn[1:] #drop the leading character
        vnb=float(vnb) #change data type to float
        if vnb in billing_vn: #check if voucher exists in billing report
            row=station_vn.index(vn)+1 #take the row of current voucher
            station_vn_data = station_sheet.row_values(row, start_colx=0, end_colx=15) #take the data for current row from station report
            billing_vn_data = billing_sheet.row_values(billing_vn.index(vnb),start_colx=0, end_colx=15) #take the data for current voucher from billing report
            if float(station_vn_data[5])==billing_vn_data[1]: #check if vouchers are equal
                print "voucher number", station_vn_data[5], billing_vn_data[1]
                if round(station_vn_data[10],3)!=round(billing_vn_data[5],3): #check for a difference in unit price
                    print "Difference in unit price", round(station_vn_data[10],3),"-" , round(billing_vn_data[5],3),"=", round(station_vn_data[10]-billing_vn_data[5],3)
                if station_vn_data[11]!=billing_vn_data[4]: #check for a difference in quantity
                    print "quantity", round(station_vn_data[11],4),"-", round(billing_vn_data[4],4),"=",round(station_vn_data[11]-billing_vn_data[4],4) #if there are differences, the quantities are printed
                if station_vn_data[12]!=billing_vn_data[6]: #check for a difference in total amount
                    print "total amount", round(station_vn_data[12],3),"-", round(billing_vn_data[6],3),"=",round(station_vn_data[12]-billing_vn_data[6],3)
                else:
                    print "voucher is OK"
                print " " #print empty row for a clearer view
        else: #if voucher does not exist in billing
            if vnb:
                print vnb, "does not exist in billing report" #print the voucher number which doesn't exist
station_sheet = station_rep.sheet_by_index(0) #should get the last sheet
There is no reason this should get the last sheet; Python indices are zero-based, so 0 is the first element in a sequence:
>>> [1, 2, 3][0]
1
If you want the last worksheet, note that Python allows negative indexing from the end of a sequence:
>>> [1, 2, 3][-1]
3
On that basis, I think you want:
station_sheet = station_rep.sheet_by_index(-1) # get the last sheet
#                                          ^ note index
I managed to fix it with this code:
for id in station_rep.sheet_names():
    sheet_id=station_rep.sheet_names().index(id)
station_sheet = station_rep.sheet_by_index(sheet_id) #get the last sheet
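The loop above works because sheet_id still holds the index of the final sheet name when it finishes. A more direct alternative is sketched below, assuming only that the workbook object exposes xlrd's nsheets attribute and sheet_by_index method. The helper name last_sheet is hypothetical and not part of xlrd.

```python
def last_sheet(book):
    """Return the last worksheet of an xlrd-style workbook.

    Assumes `book` has an `nsheets` count and a `sheet_by_index` method,
    as xlrd's Book does. `last_sheet` is a hypothetical helper name.
    """
    if book.nsheets == 0:
        raise ValueError("workbook has no sheets")
    # An explicit non-negative index avoids relying on negative indexing.
    return book.sheet_by_index(book.nsheets - 1)
```

With the workbook from the question this would be called as station_sheet = last_sheet(station_rep).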
Related
Hi I have written some code in which a user adds in items and the prices they bought and sold for, then it automatically adds this to an Excel sheet. However, I am having trouble thinking of a way to insert multiple rows of the same item if the quantity sold is >1 without the user entering in the same fields x amount of times. Thanks
I am using Tkinter where a Button executes the Add_Data function.
def Add_Data(Event=None):
    workbook = load_workbook('Accounting Sheet.xlsx')
    sheet = workbook.active
    sheet.insert_rows(idx=2)
    sheet['A2'] = str(Date_Entry.get())
    sheet['B2'] = str(Shoe_Name_Entry.get())
    sheet['C2'] = str(Purchase_Price_Entry.get())
    sheet['D2'] = str(Sale_Price_Entry.get())
    RRP_Float = float(Purchase_Price_Entry.get())
    Sale_Float = float(Sale_Price_Entry.get())
    sheet['E1'] = str(Sale_Float-RRP_Float)
    Shoe_Name_Entry.delete(0, 30)
    Purchase_Price_Entry.delete(0, 6)
    Sale_Price_Entry.delete(0, 6)
    workbook.save('Accounting Sheet.xlsx')
Include a textbox for 'Quantity' on the input form (which defaults to 1) and obtain that value prior to writing the values to Excel. When writing the values to Excel, include a loop that iterates 'quantity' times, as below.
If the user enters Quantity 1, one entry is made as before. If the Quantity is 2 or more, then after the first row is added to Excel the loop adds another entry with exactly the same values a second time, and so on until the Quantity value is reached.
def Add_Data(Event=None):
    workbook = load_workbook('Accounting Sheet.xlsx')
    sheet = workbook.active
    quantity = int('0'+Quantity_Entry.get()) # a blank field gives 0, so no rows are added
    for sale in range(quantity):
        sheet.insert_rows(idx=2)
        sheet['A2'] = str(Date_Entry.get())
        sheet['B2'] = str(Shoe_Name_Entry.get())
        sheet['C2'] = str(Purchase_Price_Entry.get())
        sheet['D2'] = str(Sale_Price_Entry.get())
        RRP_Float = float(Purchase_Price_Entry.get())
        Sale_Float = float(Sale_Price_Entry.get())
        sheet['E2'] = str(Sale_Float-RRP_Float) # E2 rather than E1, so the profit lands on the new row
    Shoe_Name_Entry.delete(0, 30)
    Purchase_Price_Entry.delete(0, 6)
    Sale_Price_Entry.delete(0, 6)
    workbook.save('Accounting Sheet.xlsx')
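The repetition logic can also be kept separate from Tkinter and openpyxl, which makes it easier to check on its own. The sketch below assumes the field values arrive as plain strings; parse_quantity and build_sale_rows are hypothetical helper names, and the sample values are made up. Unlike int('0' + text), a blank quantity here falls back to 1.

```python
def parse_quantity(text, default=1):
    """Convert the text of a quantity field to a positive int.

    A blank field falls back to `default`; anything else must be a whole
    number of at least 1. (Hypothetical helper, not part of Tkinter.)
    """
    text = text.strip()
    if not text:
        return default
    quantity = int(text)  # raises ValueError for non-numeric input
    if quantity < 1:
        raise ValueError("quantity must be at least 1")
    return quantity


def build_sale_rows(date, name, purchase_price, sale_price, quantity):
    """Return `quantity` identical rows: date, name, prices and profit."""
    profit = float(sale_price) - float(purchase_price)
    row = (date, name, purchase_price, sale_price, str(profit))
    return [row for _ in range(quantity)]


# Made-up sample values standing in for the Entry widgets' contents.
rows = build_sale_rows("01/02/2020", "Runner", "100", "150", parse_quantity("2"))
for row in rows:
    print(row)
```

Each returned row could then be written to the sheet inside the loop, in the same way as the cell assignments above.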
I have 13 multi-sheet Excel files in one folder, used to determine degree eligibility.
I read all files using openpyxl.
Check the missing values.
Then convert Grade to Grade point value.
Sort by descending order and remove lower grade duplicates while keeping highest grade.
Extract Course code last digit as subject credit.
Then check whether the course code is applicable for GPA or not using credit multiplier.
Then calculate GPA = (Grade point value * Subject credit * Credit multiplier) / Total subject credit.
Write the results to a new sheet in the same file.
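The calculation steps listed above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the grade scale in GRADE_POINTS is hypothetical (the real mapping is not given), the products are summed over all courses before dividing, and the total counts the credits of every retained course, as the posted program does. The sample course codes are made up.

```python
# Hypothetical grade scale; the real mapping is not given in the question.
GRADE_POINTS = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "E": 0.0}
# Credit multipliers by course-code prefix (subset of the list in the question).
CREDIT_MULTIPLIER = {"AMAT": 1, "PMAT": 1, "PHYS": 1, "DELT": 0}


def gpa(records):
    """Compute a GPA from (course_code, grade) pairs."""
    best = {}
    for code, grade in records:
        points = GRADE_POINTS[grade]
        # Keep only the highest grade for a repeated course code.
        if code not in best or points > best[code]:
            best[code] = points
    weighted = 0.0
    total_credits = 0
    for code, points in best.items():
        credit = int(code[-1])                    # last digit = subject credit
        multiplier = CREDIT_MULTIPLIER[code[:4]]  # 0 excludes the course from the GPA
        weighted += points * credit * multiplier
        total_credits += credit
    return weighted / total_credits


# Made-up records: AMAT1013 appears twice, so only its higher grade counts.
records = [("AMAT1013", "B"), ("AMAT1013", "A"), ("PHYS1022", "C"), ("DELT1011", "A")]
print(gpa(records))
```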
My program shows output only for the last sheet of the last file, not for all 13 files. What is missing here?
Program as follows:
import os
import pandas as pd

# `grades` (a dict mapping each letter grade to its grade point value) is assumed to be defined earlier
dir=os.path.join('file_path')
dir
file_found = False
for files in os.listdir(dir):
    print(f"processing file: '{files}'")
    if files[-4::] == 'xlsx':
        file_found = True
    else:
        print(f"current file does not end with xlsx. Its last 4 chars are: '{files[-4:]}'")
if not file_found:
    print("ERROR: There were no files with ending xlsx")
file_1 = pd.ExcelFile(os.path.join(dir,files),engine='openpyxl')
print('Path of File: ', os.path.join(dir,files))
print('Student: ', pd.read_excel(file_1, sheet_name=0).iloc[0,1])
sheets_names = ['Yr1', 'Yr2', 'Yr3','Subjects']
for names in sheets_names:
    sheet = file_1.sheet_names.index(names)
    print('Sheet: ', file_1.sheet_names[sheet])
    file_original = pd.read_excel(file_1, sheet_name=sheet,engine='openpyxl')
    file_copy = file_original.copy()
    print(file_copy.columns)
    for i in range(len(file_copy)):
        file_copy.loc[i,'Grades'] = grades[file_copy.loc[i,'Grade']]
    #sortedby=file_copy.sort_values(file_copy.loc[:,'Course Code'])
    # Rows Repeated
    dupli = file_copy.loc[file_copy.duplicated(['Course Code'], keep='first')].reset_index()
    cont = 0
    rows_to_delete = []
    for i in range(int(len(dupli))):
        if dupli.loc[cont,'Grades'] >= dupli.loc[cont+1,'Grades']:
            rows_to_delete.append(dupli.loc[cont+1,'index'])
        else:
            rows_to_delete.append(dupli.loc[cont,'index'])
        cont += 2
    file_copy.drop(index=rows_to_delete, inplace=True)
    file_copy.reset_index(drop=True, inplace=True)
    # subject credit (all last digits of Course Code)
    file_copy.loc[:,'subject_credits'] = [int(i[-1]) for i in file_copy.loc[:,'Course Code']]
    print('')
    file_copy.loc[:,'Credit_Multiplier'] = [str(i[0:4]) for i in file_copy.loc[:,'Course Code']]
    credits=[("AMAT",1),("BFIN",1),("DELT",0),("ELEC",1),("MGMT",1),("PHYS",1),("PMAT",1),("COST",1),("MAPS",1),("COSC",1),("STAT",1),("BOTA",1)]
    file_copy.loc[:,'course_unit']=file_copy.loc[:,'Credit_Multiplier'].map(dict(credits))
    file_copy.loc[:,'subject_credits'].sum()#need to give specific cell
    for index in range(len(file_copy)):
        #total_credits[index] = last digit of Course Code
        gpv= file_copy.loc[:,'Grades']*file_copy.loc[:,'subject_credits'] *file_copy.loc[:,'course_unit']
        GPA=gpv/file_copy.loc[:,'subject_credits'].sum()
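One likely cause, assuming the per-file work runs after the for files in os.listdir(dir) loop rather than inside it: once that loop ends, files holds only the last name, so only that workbook is opened. A minimal sketch of a loop structure that handles every file is below. process_file is a hypothetical stand-in for the per-sheet work, and the demonstration uses empty placeholder files in a temporary folder.

```python
import os
import tempfile


def process_file(path):
    """Hypothetical stand-in for the per-sheet GPA work on one workbook."""
    return os.path.basename(path)


def process_folder(folder):
    """Run process_file on every .xlsx file in `folder`, one at a time."""
    results = []
    for name in sorted(os.listdir(folder)):
        if not name.endswith(".xlsx"):
            print("skipping non-xlsx file: %r" % name)
            continue
        # The per-file work is inside the loop, so it runs for each file,
        # not only for the last value left in the loop variable.
        results.append(process_file(os.path.join(folder, name)))
    if not results:
        print("ERROR: there were no files ending with .xlsx")
    return results


# Demonstration with empty placeholder files in a temporary folder.
demo_dir = tempfile.mkdtemp()
for name in ("a.xlsx", "b.xlsx", "notes.txt"):
    open(os.path.join(demo_dir, name), "w").close()
processed = process_folder(demo_dir)
print(processed)
```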
I am writing a bot using gspread and IMDbPy. The script right now takes input(a movie title), it then grabs the movie ID, finds the movie's rating on IMDB.com, then posts the rating onto a spreadsheet into a specific cell.
There is a function named "update_cell" that updates a specific cell based on the given row and column parameters. Once the bot is complete, I don't want to have to keep going into the code to update the row parameter. I want it to increase by 1 each time the bot executes.
Is there a way to do this? I'll post the code below:
import time
import imdb
import gspread
from oauth2client.service_account import ServiceAccountCredentials

ia = imdb.IMDb()
def take_input():
    fd = open('movielist.txt',"w")
    print("Input your movie please: \n")
    inp = input()
    fd.write(inp)
    fd.close()
take_input()
# Wed 8/28/19 - movie_list is a list object. Must set it equal to our ia.search_movies
# Need to find out where to put movie_list = ia.search_movies in the code, and what to
# remove or keep.
a = int(52)
b = int(18)
def Main():
    c = """Python Movie Rating Scraper by Nickydimebags"""
    print(c)
    time.sleep(2)
    f1 = open('movielist.txt')
    movie_list = []
    for i in f1.readlines():
        movie_list.append(i)
        movie_list = ia.search_movie(i)
        movie_id = movie_list[0].movieID
        print(movie_id)
        m = ia.get_movie(movie_id)
        print(m)
        rating = m['rating']
        print(rating)
    scope = ["https://spreadsheets.google.com/feeds",'https://www.googleapis.com/auth/spreadsheets', "https://www.googleapis.com/auth/drive.file","https://www.googleapis.com/auth/drive"]
    creds = ServiceAccountCredentials.from_json_keyfile_name("creds.json", scope)
    client = gspread.authorize(creds)
    sheet = client.open("Movie Fridays").sheet1
    sheet.update_cell(a, b, rating) #updates specific cell
Main()
^ The a variable is what I need to update by 1 every time the bot runs
I am guessing the a variable tracks the row index. You could get the index of the next empty cell in the column you are adding values to.
def next_available_row(worksheet, col):
    return len(worksheet.col_values(col)) + 1
sheet = client.open("Movie Fridays").sheet1
sheet.update_cell(next_available_row(sheet, b), b, rating)
You are going to need to save the current or next value of your a variable somewhere and update it every time the script runs.
You could abuse a cell in the spreadsheet for this, or write it out to a file.
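A minimal sketch of the file-based option, assuming a small text file that holds a single integer. The helper name next_row and the file name row_counter.txt are hypothetical, and the starting value 52 is taken from the a variable in the question.

```python
import os
import tempfile


def next_row(counter_path, start=52):
    """Return the row to use for this run and store the following one.

    `counter_path` is a hypothetical text file holding a single integer;
    `start` is used the first time, when the file does not exist yet.
    """
    if os.path.exists(counter_path):
        with open(counter_path) as f:
            row = int(f.read().strip())
    else:
        row = start
    with open(counter_path, "w") as f:
        f.write(str(row + 1))  # value for the next run
    return row


# Demonstration with a temporary file standing in for the real counter file.
counter_file = os.path.join(tempfile.mkdtemp(), "row_counter.txt")
first = next_row(counter_file)
second = next_row(counter_file)
print(first, second)
```

In the bot this could replace the fixed a, for example sheet.update_cell(next_row("row_counter.txt"), b, rating).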
I am trying to parse data from CSV files. The files are in a folder and I want to extract data and write them to the db. However the csvs are not set up in a table format. I know how to import csvs into the db with the for each loop container, adding data flow tasks, and importing with OLE DB Destination.
The problem is just getting one value out of these csvs. The format of the file is as follows:
Title Title 2
Date saved ##/##/#### ##:## AM
Comment
[ Main ]
No. Measure Output Unit of measure
1 Name 8 µm
Count 0 pcs
[ XY Measure ]
X
Y
D
[ Area ]
No. Area Unit Perimeter Unit
All I want is the output, which is "8", plus the name of the file (to use as the name of the result or to add to a column) and the date and time, each in its own column.
I am not sure which direction to head in, and I hope someone has some things for me to look into. Originally, I wasn't sure if I should do the parsing externally (Python) before using SQL Server. If anyone knows another way I should use to get this done, please let me know. Sorry for the unclear post earlier.
The expected outcome:
Filename Date Time Outcome
jnnnnnnn ##/##/#### ##:## 8
I'd try this:
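import io
import os

path = 'measurement.csv' # hypothetical path of the file you're parsing
filename = os.path.splitext(os.path.basename(path))[0]
date_saved = time_saved = outcome = None
with io.open(path, encoding='utf-8') as csv_file: # assumes the file is UTF-8
    for row in csv_file:
        if row.find('Date saved') >= 0:
            # "Date saved ##/##/#### ##:## AM" -> date, then time
            fields = row.replace('Date saved', '').split()
            date_saved, time_saved = fields[0], fields[1]
        elif u"\u00B5" in row or u"\u03BC" in row: # micro sign or Greek mu
            split_row = row.split() # assumes whitespace-separated fields
            outcome = split_row[2]
# add filename,date_saved,time_saved,outcome to data that will go in DB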
Using Python 2.7 on Mac OSX Lion with xlrd
My problem is relatively simple and straightforward. I'm trying to match a string to an Excel cell value, in order to ensure that the other data in the matched row is the correct value.
So, say for instance that player = 'Andrea Bargnani' and I want to match a row that looks like this:
Draft Player Team
1 Andrea Bargnani - Toronto Raptors
I do:
num_rows = draftSheet.nrows - 1
cur_row = -1
while cur_row < num_rows:
    cur_row += 1
    row = draftSheet.row(cur_row)
    if row[1] == player:
        ranking == row[0]
The problem is that the value of row[1] is text:u'Andrea Bargnani', as opposed to just Andrea Bargnani.
I know that Excel, after Excel 97, is all Unicode. But even if I do player = u'Andrea Bargnani' there is still the preceding text:. So I tried player = 'text:'u'Andrea Bargnani', but when the variable is called it ends up looking like u'text: Andrea Bargnani' and still does not produce a match.
I would then like to just strip the text:u' off the returned row[1] value in order to get an appropriate match.
You need to get a value from the cell.
I've created a sample Excel file with the text "Andrea Bargnani" in cell A1. Here is code showing the difference between printing the cell and its value:
import xlrd
book = xlrd.open_workbook("input.xls")
sheet = book.sheet_by_index(0)
print sheet.cell(0, 0) # prints text:u'Andrea Bargnani'
print sheet.cell(0, 0).value # prints Andrea Bargnani
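Applied to the loop in the question, the comparison would use the cell's value rather than the Cell object. The sketch below assumes the sheet offers xlrd's nrows attribute and cell_value(rowx, colx) method; find_ranking is a hypothetical helper name. It also assigns the result with = where the question's loop has ranking == row[0], which only compares.

```python
def find_ranking(sheet, player, player_col=1, rank_col=0):
    """Return the value in `rank_col` for the row whose `player_col` equals `player`.

    Assumes `sheet` offers xlrd's `nrows` and `cell_value(rowx, colx)`;
    `find_ranking` itself is a hypothetical helper. Returns None if no row matches.
    """
    for rowx in range(sheet.nrows):
        # cell_value gives the plain value, not the "text:u'...'" Cell repr.
        if sheet.cell_value(rowx, player_col) == player:
            return sheet.cell_value(rowx, rank_col)
    return None
```

With the names from the question this would be ranking = find_ranking(draftSheet, player).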
Hope that helps.