Open and edit excel via python - python

I want to open an existing Excel file and edit it. When I copy the workbook and try to work on the copy, I get errors. Writing to the copy works without errors, but reading a value back from a cell fails.
import xlsxwriter
from xlrd import open_workbook
from xlwt import Workbook, easyxf
import xlwt
from xlutils.copy import copy
workbook=open_workbook("month.xlsx")
sheet=workbook.sheet_by_index(0)
print sheet.nrows
book = copy(workbook)
w_sheet=book.get_sheet(0)
print w_sheet.cell(0,0).value
Error: Traceback (most recent call last):
File "excel.py", line 18, in <module>
print w_sheet.cell(0,0).value
AttributeError: 'Worksheet' object has no attribute 'cell'

I haven't used this library, but looking at the documentation I think you are trying to do something it doesn't support. The worksheet documentation lists its functionality, and cell() is not there.
I think this library is for writing Excel files only, not reading them.
Perhaps try pandas read_excel() to read the Excel documents you create?
You can then use pandas iloc on the resulting dataframe to get the value you want:
value=pd.read_excel("file.xlsx", sheet_name="sheet").iloc[0,0]
I think that's correct, although I can't run the code to check just now...

Related

what is the correct way to read a csv file into a pandas dataframe?

I am doing a data analysis project and while importing the csv file into spyder I am facing this error. Please help me to debug this as I am new to programming.
#import library
>>>import pandas as pd
#read the data from csv as a pandas dataframe
>>>df = pd.read.csv('/Documents/Melbourne_housing_FULL.csv')
This is the error shown when I use the pd.read.csv command:
File "C:/Users/mylaptop/.spyder-py3/temp.py", line 4, in <module>
df = pd.read.csv('/Documents/Melbourne_housing_FULL.csv')
AttributeError: module 'pandas' has no attribute 'read'
You should use:
df = pd.read_csv('/Documents/Melbourne_housing_FULL.csv')
See the docs.
You need to use pandas.read_csv() instead of pandas.read.csv(). The error is literally telling you that pandas has no attribute named read.
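To see the difference without the original file, here is a small sketch that reads CSV text from memory. The column names and values are invented for illustration, and io.StringIO stands in for the file path:

```python
import io

import pandas as pd

# In-memory stand-in for the CSV file; the columns here are made up.
csv_text = "Suburb,Price\nAbbotsford,1480000\nRichmond,1035000\n"

# read_csv (with an underscore) is a function in the pandas namespace.
df = pd.read_csv(io.StringIO(csv_text))
print(df.shape)

# pandas has no attribute called "read", which is what the traceback reports.
print(hasattr(pd, "read"))
```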

Getting the error AttributeError: 'Worksheet' object has no attribute 'delete_rows' in openpyxl

I'm writing code for a tool that performs GIS functions on an input Excel sheet. Sometimes the sheet comes in with two separate rows across the top for its attribute fields, and when there are two, I need to delete the top row. The value of cell A1 will be "Naming" if I need to do this.
I tried writing code to check for this and delete the row, as below:
openpyxl
import arcpy, os, sys, csv, openpyxl
from arcpy import env
env.workspace = r"C:\Users\myname\Desktop\Yanko's tool"
arcpy.env.overwriteOutput = True
excel = r"C:\Users\myname\Desktop\Yanko's tool\Yanko's Duplicate tool\Construction_table_Example.xlsx"
layer = r"C:\Users\myname\Desktop\Yanko's tool\Yanko's Duplicate tool\Example_Polygons.shp"
output = r"C:\Users\myname\Desktop\Yanko's tool\\Yanko's Duplicate tool"
book = openpyxl.load_workbook(excel)
book.get_sheet_by_name("Construction Table format")
if ws.cell(row=1, column=1).value == "Naming":
    ws.delete_rows(1, 1)
book.save
book.close
It should just delete the first row if the if condition is true, but I get this error:
Warning
(from warnings module):
File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\openpyxl\reader\worksheet.py", line 310
warn(msg)
UserWarning: Data Validation extension is not supported and will be removed
Traceback (most recent call last):
File "C:\Users\ronan.corrigan\Desktop\Yanko's tool\Yanko's Duplicate tool\Yanko's Tool.py", line 31, in <module>
ws.delete_rows(1, 1)
AttributeError: 'Worksheet' object has no attribute 'delete_rows'
any help in figuring out what I've done wrong would be greatly appreciated
thanks
First of all, according to the docs, the get_sheet_by_name function is deprecated, and you should index the workbook with the sheet name to get the sheet:
book["Construction Table format"]
Another thing to note: in your code I don't see you setting the ws value, which should be set to whatever sheet object is returned. If you're setting it somewhere else, it may be possible that you are using a different sheet object which doesn't have that function.
ws=book["Construction Table format"]
Other than that, you'd have to share the full stack trace to give a better understanding of what's breaking.
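Putting those two points together, a minimal sketch might look like the following. It assumes an openpyxl version that provides delete_rows() (if the AttributeError persists, an older installed version is one thing worth checking), and it builds the workbook in memory as a stand-in for openpyxl.load_workbook(excel); the row contents are made up:

```python
from openpyxl import Workbook

# Stand-in for openpyxl.load_workbook(excel); the rows below are invented.
book = Workbook()
sheet = book.active
sheet.title = "Construction Table format"
sheet.append(["Naming", "Geometry"])  # extra top row that should be removed
sheet.append(["Name", "Shape"])       # the real header row
sheet.append(["Road 1", "Polygon"])

# Keep a reference to the sheet returned by indexing the workbook.
ws = book["Construction Table format"]
if ws.cell(row=1, column=1).value == "Naming":
    ws.delete_rows(1, 1)  # delete one row, starting at row 1

print(ws.cell(row=1, column=1).value)

# To persist the change, call save with parentheses and a path, e.g.
# book.save(excel). A bare `book.save` only names the method without calling it.
```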

Python xlrd to import .xlsx template, then use openpyxl to edit and re-save .xlsx file

I have an .xlsx file with specific formatting and objects that I am using as a template for reports, which I plan to produce on a large scale using Python. I was originally using openpyxl to load a copy of the template (openpyxl.load_workbook()), write a pandas dataframe to the file (dataframe_to_rows() from openpyxl.utils.dataframe), then save the file for future distribution. I found out that openpyxl.load_workbook() does not load the formatting or objects, so they are removed from the new file. I then tried xlrd to open the file (xlrd.open_workbook()), which loaded the formatting and objects properly. However, openpyxl will no longer write to the file, so I end up with empty copies of the template file. Is there another package that will handle the reading and writing by itself, or a package I can use instead of openpyxl? Xlsxwriter didn't work either. See the code sample below.
from xlrd import open_workbook
from openpyxl.utils.dataframe import dataframe_to_rows
import pandas as pd
import shutil
shutil.copy2('template.xlsx', 'new_report.xlsx')
book = open_workbook('new_report.xlsx')
writer = pd.ExcelWriter(book, engine='openpyxl')
ws = book.sheet_by_name('Sheet1')
for r in dataframe_to_rows(result, index=False, header=False):
    ws.cell(colx=1, rowx=1)
    ws.append(r)
book.save('new_report.xlsx')
I'm also getting the errors: "AttributeError: 'Book' object has no attribute 'save'" and "AttributeError: 'Sheet' object has no attribute 'append'" from the code if anyone has suggestions for those problems.
I ended up using formulas to recreate the formatting I had in the existing Excel file after pasting in the new data. I'm still missing the objects (e.g. shapes), but my reports will live without them until I can find another workaround.
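On the two AttributeErrors: the Book and Sheet objects returned by xlrd are read-only, so save() and append() have to come from an openpyxl workbook instead. A minimal sketch that stays within openpyxl, assuming pandas and openpyxl are installed. The workbook is created in memory as a stand-in for openpyxl.load_workbook('new_report.xlsx'), the dataframe is hypothetical data standing in for `result`, and this does not address the missing shapes:

```python
import pandas as pd
from openpyxl import Workbook
from openpyxl.utils.dataframe import dataframe_to_rows

# Stand-in for openpyxl.load_workbook('new_report.xlsx').
wb = Workbook()
ws = wb.active
ws.title = "Sheet1"
ws.append(["name", "score"])  # pretend this header row came from the template

# Hypothetical data standing in for the `result` dataframe in the question.
result = pd.DataFrame({"name": ["a", "b"], "score": [1, 2]})

# openpyxl worksheets provide append(); each row lands below the existing data.
for row in dataframe_to_rows(result, index=False, header=False):
    ws.append(row)

print(ws.max_row)

# wb.save('new_report.xlsx') would then write the workbook to disk.
```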

openpyxl read excel with filtered data

With openpyxl, I am reading an excel file which has some filters applied already.
from openpyxl import load_workbook
wb = load_workbook(r'C:\Users\dsivaji\Downloads\testcases.xlsx')
ws = wb['TestCaseList']
print ws['B3'].value
My goal is to loop through the contents of column 'B'. With the code above I can read the content of cell 'B3'. If filters are applied, though, I don't want to start from the initial cell.
That is, I want to fetch only the rows that are visible in Excel after the filters are applied.
After searching the web for some time, I found that ws.row_dimensions can help with a visible property, but still no luck.
>>> ws.row_dimensions[1]
<openpyxl.worksheet.dimensions.RowDimension object at 0x03EF5B48>
>>> ws.row_dimensions[2]
<openpyxl.worksheet.dimensions.RowDimension object at 0x03EF5B70>
>>> ws.row_dimensions[3].visible
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'RowDimension' object has no attribute 'visible'
How can I achieve this?
You are almost there. The name of the attribute is hidden. If you replace visible in your code with hidden, it should work.
openpyxl is a library for the OOXML file format (.xlsx) and not a replacement for an application like Microsoft Excel. As such support for filters is limited to reading and writing their definitions but not applying them.
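A short sketch of the hidden check, assuming openpyxl is installed and that the filtered-out rows were stored as hidden when the file was saved. The workbook is built in memory with made-up contents, and visible_values is a hypothetical helper name, not part of openpyxl:

```python
from openpyxl import Workbook

# In-memory stand-in for the filtered workbook; contents are invented.
wb = Workbook()
ws = wb.active
ws.title = "TestCaseList"
for name in ["header", "case 1", "case 2", "case 3"]:
    ws.append(["id", name])

# Mark row 3 as hidden, the way a filtered-out row would be stored.
ws.row_dimensions[3].hidden = True


def visible_values(sheet, column):
    """Return the values in `column` for rows that are not hidden."""
    return [
        cell.value
        for cell in sheet[column]
        if not sheet.row_dimensions[cell.row].hidden
    ]


print(visible_values(ws, "B"))
```

Since openpyxl does not apply filters itself, this only reflects the row state saved in the file.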

Why won't pandas.read_excel run?

I am trying to use pandas.read_excel but I keep getting " 'module' object has no attribute 'read_excel' " as an error in my terminal as shown
File "read.py", line 9, in <module>
cols = pd.read_excel('laucnty12', 'Poverty Data', index_col='State', na_values=['NA'])
AttributeError: 'module' object has no attribute 'read_excel'
I have tried pd.read_excel() and pd.io.parsers.read_excel() but get the same error. I have python 2.7 installed and other parts of pandas work fine such as xls.parse and read_csv. My code is below:
import pandas as pd
from pandas import *
xls = pd.ExcelFile('laucnty12.xls')
data = xls.parse('laucnty12', index_col=None, na_values=['NA'])
cols = pd.read_excel('laucnty12', 'Poverty Data', index_col='State', na_values=['NA'])
print cols
df = pd.read_excel(filepath + 'Result.xlsx')
Check whether the extension of the Excel file is .xls or .xlsx, then include that extension in the file name you pass. I tried this and it is working fine now.
You probably mean pd.io.excel.read_excel()
The problem is that your script is called "read.py". The Python file that defines read_excel already imports another module called "read" - so when you try and run your "read.py" script, it squashes the old "read" module that pandas is using, and thus breaks read_excel. This problem can happen with other "common" short names for scripts, like "email.py".
Try renaming your script.
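As a rough way to check a script name before using it, the sketch below asks Python's import system whether a module with the same name can already be found. It is written for Python 3, shadowed_module is a hypothetical helper name, and it should be run from a directory other than the script's own, since otherwise the script itself is what gets found:

```python
import importlib.util
from pathlib import Path


def shadowed_module(script_path):
    """Return the importable module name the script's file name collides with, or None."""
    name = Path(script_path).stem
    if not name.isidentifier():
        return None  # e.g. "my-report.py" can never be imported by name
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    return name if spec is not None else None


# "email" is a standard library module, so a script named email.py collides with it.
print(shadowed_module("email.py"))
print(shadowed_module("read_poverty_data.py"))
```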
