Installing the latest version of XLRD - python

I am already using an xlrd package. The code I am working on always returns an error message:
Traceback (most recent call last):
File "diffoct8.py", line 17, in <module>
row = rs.get(row_number)
AttributeError: 'Sheet' object has no attribute 'get'
What could be the problem?
Is there a newer version of xlrd? If so, how can I install it on Ubuntu?

You can get the latest xlrd package here: https://pypi.python.org/pypi/xlrd
From my understanding, you just want to get information from a row in a sheet. I assume there are 10 elements in a row.
Try this:
...
element_num = 10
row = []
for i in xrange(element_num):
    row.append(rs.cell(row_number, i).value)
...

The method get() does not exist (it was used purely to show the approach you should take and where the problem was in your previous question). I've updated my answer to that question to show how you should use the row() method, as instructed in the documentation.
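A minimal sketch of that approach, assuming rs is an xlrd Sheet. Instead of hard-coding 10 elements, it takes the column count from the sheet's ncols attribute. The helper names read_row and read_row_from_file are made up for illustration; open_workbook, sheet_by_index, ncols and cell are xlrd calls.

```python
def read_row(sheet, row_number):
    """Return the cell values of one row as a list.

    Works with any object that exposes `ncols` and `cell(rowx, colx)`,
    which an xlrd Sheet does.
    """
    return [sheet.cell(row_number, col).value for col in range(sheet.ncols)]


def read_row_from_file(path, row_number, sheet_index=0):
    """Open a workbook with xlrd and read one row from the chosen sheet."""
    import xlrd  # imported here so read_row itself has no hard dependency

    book = xlrd.open_workbook(path)
    return read_row(book.sheet_by_index(sheet_index), row_number)
```

xlrd also offers sheet.row_values(row_number), which returns the same list in a single call.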

Related

AttributeError: 'Worksheet' object has no attribute 'set_default_row'

I am getting an error that seems weird. According to the docs, the Worksheet object does have a set_default_row() function. I'm not sure what I am missing here.
I got this project from someone who wrote it and has been running it for a long time. We are using different Python versions: he's on 3.10 and I am on 3.9, but I don't see why that would be the cause.
Error:
Traceback (most recent call last):
File "C:\Users\ajoshi\my_folder\misc\quick tools\CI-TestRun-Report-Generator\FileProvider.py", line 31, in create
worksheet.set_default_row(20)
AttributeError: 'Worksheet' object has no attribute 'set_default_row'
Code:
s = data.style.applymap(FileProvider.color_negative_red)
s.to_excel(writer, sheet_name=plan["name"], header=True, index=False)
workbook = writer.book
worksheet = writer.sheets[plan["name"]]
worksheet.set_default_row(20)
worksheet.set_row(0, 40)
The issue is that you are calling an xlsxwriter method, but most probably that module isn't installed, so Pandas defaults to creating an openpyxl worksheet object, which has a different API and doesn't have that method. Try setting up your Pandas xlsx writer like this:
writer = pd.ExcelWriter('filename.xlsx', engine='xlsxwriter')
If that fails then you need to install xlsxwriter.
If you are already using engine='xlsxwriter' then the issue could be that you have a very old version installed that doesn't support the set_default_row() method. In which case you should upgrade xlsxwriter.
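One way to make the failure obvious up front is to check that the engine module is importable before creating the writer. This is only a sketch: require_engine is a hypothetical helper, not part of Pandas; importlib.util.find_spec is from the standard library.

```python
import importlib.util


def require_engine(name="xlsxwriter"):
    """Return `name` if that module can be imported, otherwise raise ImportError."""
    if importlib.util.find_spec(name) is None:
        raise ImportError("Excel engine %r is not installed" % name)
    return name


# Intended use (needs pandas and xlsxwriter installed):
#   writer = pd.ExcelWriter('filename.xlsx', engine=require_engine())
```

With the engine named explicitly, a missing xlsxwriter shows up as an ImportError at writer creation rather than as an AttributeError later on.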

getting the error; attributeerror: 'Worksheet' object has no attribute 'delete_rows' openpyxl

I'm writing code for a tool that performs GIS functions on an input Excel sheet. Sometimes the Excel sheet comes in with 2 separate rows across the top for its attribute fields, and when there are 2, I need to delete the top row. Cell A1 will contain "Naming" when I need to do this.
I tried writing code to check for this and delete the row, as below:
import arcpy, os, sys, csv, openpyxl
from arcpy import env
env.workspace = r"C:\Users\myname\Desktop\Yanko's tool"
arcpy.env.overwriteOutput = True
excel = r"C:\Users\myname\Desktop\Yanko's tool\Yanko's Duplicate tool\Construction_table_Example.xlsx"
layer = r"C:\Users\myname\Desktop\Yanko's tool\Yanko's Duplicate tool\Example_Polygons.shp"
output = r"C:\Users\myname\Desktop\Yanko's tool\\Yanko's Duplicate tool"
book = openpyxl.load_workbook(excel)
book.get_sheet_by_name("Construction Table format")
if ws.cell(row=1, column=1).value == "Naming":
    ws.delete_rows(1, 1)
book.save
book.close
It should just delete the first row when the if condition is true, but I get this error:
Warning (from warnings module):
File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\openpyxl\reader\worksheet.py", line 310
warn(msg)
UserWarning: Data Validation extension is not supported and will be removed
Traceback (most recent call last):
File "C:\Users\ronan.corrigan\Desktop\Yanko's tool\Yanko's Duplicate tool\Yanko's Tool.py", line 31, in <module>
ws.delete_rows(1, 1)
AttributeError: 'Worksheet' object has no attribute 'delete_rows'
any help in figuring out what I've done wrong would be greatly appreciated
thanks
First of all, according to the docs, the get_sheet_by_name function is deprecated, and you should just use the sheet name as a key to get the sheet:
book["Construction Table format"]
Another thing to note: in your code I don't see you setting the ws value, which should be set to whatever sheet object is returned. If you're setting it somewhere else, you may be using a different sheet object that doesn't have that function.
ws=book["Construction Table format"]
Other than that you'd have to share the stack trace to give a better understanding of what's breaking
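Putting those two points together, a hedged sketch of the fix might look like this. drop_top_row_if_marked is a hypothetical helper name, and it assumes the marker text is "Naming" as in the question. The hasattr check is there because delete_rows only exists in newer openpyxl releases (2.5 and later, as far as I know), so an older bundled copy would raise exactly this AttributeError.

```python
def drop_top_row_if_marked(ws, marker="Naming"):
    """Delete row 1 when cell A1 equals `marker`; return True if a row was removed."""
    if ws.cell(row=1, column=1).value != marker:
        return False
    if not hasattr(ws, "delete_rows"):
        # Older openpyxl versions have no Worksheet.delete_rows.
        raise RuntimeError("this openpyxl has no delete_rows(); upgrade openpyxl")
    ws.delete_rows(1, 1)  # delete 1 row, starting at row 1
    return True


# Intended use (needs openpyxl; `excel` is the path from the question):
#   book = openpyxl.load_workbook(excel)
#   ws = book["Construction Table format"]   # keep the returned sheet
#   if drop_top_row_if_marked(ws):
#       book.save(excel)   # save is a method call and needs a filename
```

Note that book.save without parentheses, as written in the question, only refers to the method and never writes the file.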

AttributeError: 'module' object has no attribute 'TimeSeries' after python Validation.py

Just starting Computational Investing by Tucker Balch. I'm using virtualbox and installed Ubuntu. After installing QSTK, I ran python Validation.py (Step 7). I keep getting an:
AttributeError: 'module' object has no attribute 'TimeSeries'
There are many similar questions, so I believe the problem is the use of the same name as the file somewhere in the code. I was wondering if anyone had a solution specific to this class and QSTK.
The full error is:
Traceback (most recent call last):
File "Validation.py", line 122, in <module>
import QSTK.qstkutil.tsutil as tsu
File "usr/local/lib/python2.7/dist-packages/QSTK-0.2.8-py2.7.egg/QSTK/qstkutil/tsutil.py", line 19, in <module>
from QSTK.qstkutil import qsdateutil
File "usr/local/lib/python2.7/dist-packages/QSTK-0.2.8-py2.7.egg/QSTK/qstkutil/qsdateutil.py", line 38, in <module>
GTS_DATES = _cache_dates()
File "usr/local/lib/python2.7/dist-packages/QSTK-0.2.8-py2.7.egg/QSTK/qstkutil/qsdateutil.py", line 36, in _cache_dates
return pd.TimeSeries(index=dates, data=dates)
AttributeError: 'module' object has no attribute 'TimeSeries'
I encountered this issue too. It is caused by the pandas library. Go to the directory where QSTK's qsdateutil.py is located (on my machine, /Library/Python/2.7/site-packages/QSTK/qstkutil) and change every 'TimeSeries' in that file to 'Series'.
You can also get some insights from https://github.com/QuantSoftware/QuantSoftwareToolkit/issues/73
Corley is spot on. You can solve the problem by changing 2 occurrences of "TimeSeries" to "Series" in /usr/local/lib/python2.7/dist-packages/QSTK-0.2.8-py2.7.egg/QSTK/qstkutil/qsdateutil.py. "TimeSeries" also appears once in /usr/local/lib/python2.7/dist-packages/QSTK-0.2.8-py2.7.egg/QSTK/qstkutil/tsutil.py but I haven't encountered an error yet due to it.
Changing TimeSeries to Series corrects the issue for me.
Seems that
import pandas as pd;
pd.TimeSeries = pd.Series
should work, but did not for me.
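For what it's worth, an alias like that can only help if it runs in the same Python process before QSTK is imported, because qsdateutil.py calls pd.TimeSeries at import time (see the traceback). A small sketch of that ordering, using a hypothetical helper add_alias. Whether ordering is the reason it failed for the commenter is not known.

```python
def add_alias(module, old_name, new_name):
    """Expose module.<new_name> under module.<old_name> if the old name is missing."""
    if not hasattr(module, old_name):
        setattr(module, old_name, getattr(module, new_name))


# Intended order in Validation.py (needs pandas and QSTK installed):
#   import pandas as pd
#   add_alias(pd, "TimeSeries", "Series")   # must come first
#   import QSTK.qstkutil.tsutil as tsu      # now finds pd.TimeSeries
```

Unlike editing the files under dist-packages, this leaves the installed QSTK untouched, but it has to be repeated in every script that imports QSTK.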

TypeError: read_excel() takes exactly 2 arguments (1 given)

I get this problem when I try to read a file:
import numpy as np
import pandas as pd
pos = pd.read_excel('pos.xls', header=None)
The error is:
Traceback (most recent call last):
File "one-hot.py", line 4, in <module>
pos = pd.read_excel('pos.xls', header=None)
TypeError: read_excel() takes exactly 2 arguments (1 given)
To my surprise, when I run the code on my own PC in PyCharm, there is no error. I get the problem only when I use my school's Ubuntu machine (without PyCharm).
My own Python is 2.7.12, and the Python on the school's Ubuntu machine is 2.7.6.
My best guess (I can't try it on Python 2.7.6 since I don't have it) is that you are using pandas version 0.13 or below. According to the docs, you must also provide sheetname, which has a default value of 0 in later versions.
pandas.io.excel.read_excel(io, sheetname, **kwds)
This sounds like an issue with a different version of the pandas library installed. Looking back at the older documentation pages for pandas library, it seems that pandas did in fact require 2 parameters back in version 0.13.0 (and potentially other old versions, but I did not check any others). For version 0.13.0, the docs define the function as:
pandas.read_excel(io, sheetname, **kwds)
You can read those details here: http://pandas.pydata.org/pandas-docs/version/0.13.0/generated/pandas.read_excel.html?highlight=read_excel#pandas.read_excel
Chances are, it is just an issue with a different library version.
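If upgrading pandas on the school machine is not an option, passing the sheet explicitly as the second positional argument should satisfy the old signature, and later versions accept it too, where it simply restates the default of 0. A sketch: read_first_sheet is a hypothetical wrapper, and its reader parameter exists only so the function can be exercised without pandas.

```python
def read_first_sheet(path, reader=None, **kwargs):
    """Read sheet 0 of an Excel file, always passing the sheet argument explicitly."""
    if reader is None:
        import pandas as pd  # imported lazily; only needed for real use
        reader = pd.read_excel
    # The sheet goes in positionally because its keyword name differs between
    # pandas versions (sheetname in old releases, sheet_name in newer ones).
    return reader(path, 0, **kwargs)


# Intended use:
#   pos = read_first_sheet('pos.xls', header=None)
```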
I actually had a similar problem which was solved by adding '.xlsx' to the end of my proposed file name:
practicetoexcel.to_excel('Thisxldoc.xlsx', sheet_name = 'Practice')

openpyxl no attribute error

Python 3.5 openpyxl 2.4
Hi everyone, I got a simple but confusing problem here.
FYI the API doc relating to worksheet is
http://openpyxl.readthedocs.io/en/default/api/openpyxl.worksheet.worksheet.html
Here is some simple code for testing.
# -*- coding: utf-8 -*-
from openpyxl import load_workbook
wb2 = load_workbook('example.xlsx')
print (wb2.get_sheet_names())
ws = wb2.get_sheet_by_name('Sheet1')
print (type(ws))
print (ws.calculate_dimension())
list = []
for i in ws.rows:
    print ('\n')
    for cell in i:
        list.append(cell.value)
        print(str(cell.value).encode('utf-8'))
print (type(ws))
ws.get_highest_row()
Here is the resulting output:
<class 'openpyxl.worksheet.worksheet.Worksheet'>
Traceback (most recent call last):
File "script.py", line 17, in <module>
ws.get_highest_row()
AttributeError: 'Worksheet' object has no attribute 'get_highest_row'
I run into the problem where it says that get_highest_row is not an attribute.
The call looks correct to me, since the API doc lists this function under the class worksheet.worksheet, and ws is a worksheet.worksheet.Worksheet (I have no idea what that is). It may inherit some functions, since it can still call calculate_dimension(), but can someone tell me how to fix this? I want to go through one specific row or column and do some sorting, with varying numbers of columns and rows.
Any help is appreciated!
According to https://bitbucket.org/openpyxl/openpyxl/issues/278/get_highest_row-column-are-unreliable, the newest openpyxl has removed the get_highest_row and get_highest_column methods. They have been replaced by the max_row and max_column properties.
I tried it with openpyxl 2.3.5 and got the following
/usr/local/lib/python3.5/site-packages/openpyxl/worksheet/worksheet.py:350: UserWarning: Call to deprecated function or class get_highest_row (Use the max_row property).
def get_highest_row(self):
So as you are using 2.4 they probably removed it from there as it was deprecated already in 2.3.5.
EDIT: In the documentation for 2.4 this method is not mentioned any longer
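A small compatibility sketch for code that has to run against both old and new openpyxl: prefer the max_row property and fall back to the deprecated method only when the property is absent. highest_row is a hypothetical helper name.

```python
def highest_row(ws):
    """Return the last used row index of an openpyxl worksheet."""
    if hasattr(ws, "max_row"):
        return ws.max_row            # property in current openpyxl
    return ws.get_highest_row()      # method in old releases only
```

In the question's script, the last line would then become highest_row(ws), or simply ws.max_row when only openpyxl 2.4 needs to be supported.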
