Reading xls file with Python

import xlrd
cord = xlrd.open_workbook('MT_coordenadas_todas.xls')
id = cord.sheet_by_index(0)
print(id)
When I run my code in the terminal, I get:
<xlrd.sheet.Sheet object at 0x7f897e3ecf90>
I want to read the first column, so what should I change in my code?

id is a reference to the sheet object. You need to use values = id.col_values(0) to read the values from the first column of that sheet.
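A minimal sketch of that suggestion, written for Python 3. The helper name first_column_values and the option to drop blank cells are assumptions added for illustration; only col_values comes from the answer. The function accepts any sheet object that offers col_values, such as the one returned by sheet_by_index(0).

```python
def first_column_values(sheet, skip_blank=True):
    """Return the values of the first column of an xlrd-style sheet.

    `sheet` is expected to offer col_values(colx), as xlrd's Sheet does.
    """
    values = sheet.col_values(0)  # column index 0 is the first column
    if skip_blank:
        # xlrd reports empty cells as empty strings; drop them if asked
        values = [v for v in values if v != ""]
    return values
```

With the question's workbook this would be called as first_column_values(cord.sheet_by_index(0)).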

Related

Add data in the next empty row in python pandas

I'm making a small, simple program that puts one name under another in an Excel file, and I don't know how to find the next empty row.
I have this Excel table:
Name
Carl
I'm writing a program to add new names. Here is the function:
import openpyxl
def modifyexcel():
    book = openpyxl.load_workbook(r'C:\Users\usuario\Desktop\prueba.xlsx')
    sheet = book["a"]
    # entrada1 is defined elsewhere in the program (not shown here)
    sheet["a3"] = str(entrada1.get())
    book.save(r'C:\Users\usuario\Desktop\prueba.xlsx')
But instead of always modifying cell "a3", I need to write to the next empty row, so that every new name is placed below the last one.
You can use Google Colab to modify your Excel file.
Mount your Google Drive, or upload the CSV or Excel file through the sidebar.
Then copy the file path and pass it to pandas read_csv (read_excel works the same way).
https://colab.research.google.com/
from google.colab import files
import pandas as pd
# First, read the original data ('path' is the file path copied from the sidebar)
df = pd.read_csv('path')
# Then create a variable to store the new name
name = "new name"  #@param {type:"string"}
# Append the new name as a new row; assigning a longer list back to
# the existing column would fail with a length-mismatch error
df = pd.concat([df, pd.DataFrame({'name': [name]})], ignore_index=True)
# Write the result to CSV and download it
df.to_csv('newsheet.csv', index=False)
files.download('newsheet.csv')
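For the original openpyxl function, a sketch that stays in openpyxl might look like the following. It scans column A for the first cell without a value. The helper name next_empty_row and the new name "Ana" are hypothetical, and the in-memory workbook only stands in for prueba.xlsx.

```python
from openpyxl import Workbook

def next_empty_row(sheet, column=1):
    """Return the 1-based index of the first row whose cell in `column` is empty."""
    row = 1
    while sheet.cell(row=row, column=column).value not in (None, ""):
        row += 1
    return row

# Demo on an in-memory workbook that mimics the table from the question
book = Workbook()
sheet = book.active
sheet["A1"] = "Name"
sheet["A2"] = "Carl"

row = next_empty_row(sheet)                   # first free row in column A
sheet.cell(row=row, column=1, value="Ana")    # hypothetical new name
```

In the question's function, the assignment to sheet["a3"] could then become sheet.cell(row=next_empty_row(sheet), column=1, value=str(entrada1.get())).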

How to append data to the last row (every time) of an Excel file?

I am looking for a way to append data from a Python program to an Excel sheet. I chose the openpyxl library for this.
My problem is how to add new data after the last row of the sheet without losing the existing data. I looked through the documentation but did not find an answer.
I do not know whether the library has a method for adding new data or whether I need to write the logic myself.
The last row of the sheet can be found using the max_row property:
from openpyxl import load_workbook
myFileName = r'C:\DemoFile.xlsx'
# load the workbook, and put the sheet into a variable
wb = load_workbook(filename=myFileName)
ws = wb['Sheet1']
# max_row is a worksheet property that holds the index of the last row in the sheet
newRowLocation = ws.max_row + 1
# write to the cell you want, specifying row, column, and value
ws.cell(column=1, row=newRowLocation, value="aha! a new entry at the end")
wb.save(filename=myFileName)
wb.close()
What you're looking for is the Worksheet.append method:
Appends a group of values at the bottom of the current sheet.
If it’s a list: all values are added in order, starting from the first column
If it’s a dict: values are assigned to the columns indicated by the keys (numbers or letters)
So no need to check for the last row. Just use this method to always add the data at the end.
ws.append(["some", "test", "data"])
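To make the list and dict forms concrete, here is a small sketch on an in-memory workbook. The sample values are made up, and nothing is saved to disk.

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active

ws.append(["Name", "City", "Age"])   # row 1: values fill columns A, B, C in order
ws.append(["Carl", "Lima", 30])      # row 2: lands below the last used row
ws.append({"A": "Ana", "C": 25})     # row 3: dict keys pick the columns; B stays empty

# Read everything back as plain values to see where each append landed
rows = [list(r) for r in ws.iter_rows(values_only=True)]
```

Each call writes to the row after the current last one, so the caller never has to compute a row index.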

Getting Last Modified by Name of xlsx file

I have an Excel file that is modified by a group of people, and we need to keep track of when the file was last modified and by whom.
I was able to retrieve the file properties through .properties, but I am trying to figure out how to isolate lastModifiedBy and insert its value into a column.
from openpyxl import load_workbook
wb = load_workbook('Rec1.xlsx')
wb.properties.lastModifiedBy
This gives me the information I need, but I am stumped on how to create a new column "lastmodifiedby" containing the value from the properties.
From the documentation: https://openpyxl.readthedocs.io/en/stable/usage.html#write-a-workbook
Perhaps something like this?
ws1 = wb.active
ws1.cell(column=1, row=1, value=wb.properties.lastModifiedBy)
wb.save(filename='Rec1.xlsx')
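Building on that suggestion, a sketch that adds the value as a new column next to existing data could look like this. The in-memory workbook, the sample rows, and the author name "jdoe" are made up; with a real file, wb would come from load_workbook('Rec1.xlsx') and the property would already be set.

```python
from openpyxl import Workbook

wb = Workbook()                          # stand-in for load_workbook('Rec1.xlsx')
wb.properties.lastModifiedBy = "jdoe"    # made-up value for the demo
ws = wb.active
ws.append(["Item", "Amount"])            # made-up existing data
ws.append(["Widget", 3])

new_col = ws.max_column + 1              # first unused column
ws.cell(row=1, column=new_col, value="lastModifiedBy")   # header cell
for row in range(2, ws.max_row + 1):
    # repeat the property value on every data row
    ws.cell(row=row, column=new_col, value=wb.properties.lastModifiedBy)
```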
Is this what you are looking for? (This is VBA, to be run inside Excel.)
Dim lastModifiedBy
lastModifiedBy = ThisWorkbook.BuiltinDocumentProperties("Last Author")

How to edit Excel (xlsx and xlsm) in python

I am very new to Python and this is my first Python project.
What I am doing is:
1. Retrieve the data from SQL Server.
2. Put the data into a predefined Excel template (a specific worksheet).
3. If there is already data in this sheet, it should be replaced; only the column names should remain.
4. Another sheet in the Excel template contains a pivot table of the data from step 2.
5. I need to refresh this pivot with the new data from sheet1.
6. The number of rows in sheet1 can change depending on the data from the database.
I am fine with step 1 but unable to perform the Excel operations.
I tried openpyxl but could not understand much of it.
https://openpyxl.readthedocs.io/en/stable/
code:
from openpyxl import load_workbook
wb2 = load_workbook('CnA_Rec.xlsx')
print (wb2.sheetnames)
rawsheet = wb2.get_sheet_by_name('RawData')
print (rawsheet.cell_range)
Error with above code:
AttributeError: 'Worksheet' object has no attribute 'cell_range'
I can access an individual cell but not a range.
I need to select the current range and replace it with new data.
ref link: https://openpyxl.readthedocs.io/en/stable/api/openpyxl.worksheet.cell_range.html
Can anyone point me to an online example or some sample code for this?
So let's go for it with openpyxl. Where exactly is your problem? This is a very basic start; we can change the script as we go.
import openpyxl
wb = openpyxl.load_workbook('hello_world.xlsx')
# do magic with openpyxl here and save
ws = wb.worksheets[0]
ws.cell(row=1, column=3).value = 'Hello' # example
ws.cell(row=2, column=3).value = 'World' # example
for i in range(2,20):
    ws.cell(row=i,column=1).value = 'Row:' + str(i)
data = [ws.cell(row=i,column=1).value for i in range(1,11)]
print(data)
wb.save('hello_world.xlsx')
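For the "replace the data but keep the column names" step, one possible sketch is below. It deletes every row under the header and appends the new rows. The helper name replace_data_rows and the sample rows are hypothetical, the in-memory workbook stands in for CnA_Rec.xlsx, and the sketch does not touch or refresh the pivot table.

```python
from openpyxl import Workbook

def replace_data_rows(ws, rows):
    """Delete everything below the header row, then append the new rows."""
    if ws.max_row > 1:
        ws.delete_rows(2, ws.max_row - 1)   # start at row 2, remove all remaining rows
    for row in rows:
        ws.append(row)

wb = Workbook()                 # stand-in for load_workbook('CnA_Rec.xlsx')
ws = wb.active
ws.title = "RawData"
ws.append(["id", "amount"])     # header row that must survive
ws.append([1, 10])
ws.append([2, 20])

# made-up rows standing in for the database result
replace_data_rows(wb["RawData"], [(7, 70), (8, 80), (9, 90)])
result = [list(r) for r in ws.iter_rows(values_only=True)]
```

Because the row count is taken from max_row at call time, the same helper works whether the new result set is shorter or longer than the old one.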

Iterating over rows in a column with XLRD

I have been able to output the values of the column as a separate list. However, I need to retain these values and use them one by one to perform an Amazon lookup. The Amazon lookup is not the problem; getting xlrd to give me one value at a time is. Is there also an efficient way of setting a timer in Python? The only answer I have found to the timer issue is recording the time the process started and counting from there. I would prefer just a timer. This question has two parts; here is what I have done so far.
I load the spreadsheet with xlrd using argv[1] and copy it to a new spreadsheet name using argv[2]. argv[3] needs to be the timer value, but I am not that far yet.
I have tried:
import sys
import datetime
import os
import xlrd
from xlrd.book import colname
from xlrd.book import row
import xlwt
import xlutils
import shutil
import bottlenose
AMAZON_ACCESS_KEY_ID = "######"
AMAZON_SECRET_KEY = "####"
print "Executing ISBN Amazon Lookup Script -- Please be sure to execute it python amazon.py input.xls output.xls 60(seconds between database queries)"
print "Copying original XLS spreadsheet to new spreadsheet file specified as the second arguement on the command line."
print "Loading Amazon Account information . . "
amazon = bottlenose.Amazon(AMAZON_ACCESS_KEY_ID, AMAZON_SECRET_KEY)
response = amazon.ItemLookup(ItemId="row", ResponseGroup="Offer Summaries", SearchIndex="Books", IdType="ISBN")
shutil.copy2(sys.argv[1], sys.argv[2])
print "Opening copied spreadsheet and beginning ISBN extraction. . ."
wb = xlrd.open_workbook(sys.argv[2])
print "Beginning Amazon lookup for the first ISBN number."
for row in colname(colx=2):
    print amazon.ItemLookup(ItemId="row", ResponseGroup="Offer Summaries", SearchIndex="Books", IdType="ISBN")
I know this is a little vague. Should I perhaps try something like column = colname(colx=2) and then for row in column:? Any help or direction is greatly appreciated.
The use of colname() in your code is simply going to return the name of the column (e.g. 'C' by default in your case unless you've overridden the name). Also, the use of colname is outside the context of the contents of your workbook. I would think you would want to work with a specific sheet from the workbook you are loading, and from within that sheet you would want to reference the values of a column (2 in the case of your example), does this sound somewhat correct?
wb = xlrd.open_workbook(sys.argv[2])
sheet = wb.sheet_by_index(0)
for row in sheet.col(2):
    print amazon.ItemLookup(ItemId="row", ResponseGroup="Offer Summaries", SearchIndex="Books", IdType="ISBN")
Although I think looking at the call to amazon.ItemLookup() you probably want to refer to row and not to "row" as the latter is simply a string and the former is the actual contents of the variable named row from your for loop.
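Putting both parts of the question together, a Python 3 sketch might look like this. Note that sheet.col(2) yields Cell objects, so the sketch uses col_values instead to get plain values. The helper names isbn_values and lookup_all are hypothetical, lookup is any callable supplied by the caller, and the fixed pause via time.sleep is just one simple way to space out the queries.

```python
import time

def isbn_values(sheet, colx=2):
    """Yield the non-empty values of one column, one at a time."""
    for value in sheet.col_values(colx):
        if value != "":
            yield value

def lookup_all(sheet, lookup, delay_seconds=60.0, colx=2):
    """Call `lookup` once per value, sleeping between consecutive calls."""
    results = []
    for i, isbn in enumerate(isbn_values(sheet, colx)):
        if i > 0:
            time.sleep(delay_seconds)   # simple fixed pause between queries
        results.append(lookup(isbn))
    return results
```

With the objects from the question, lookup could wrap amazon.ItemLookup with ItemId=isbn, and delay_seconds could be read from sys.argv[3].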
