'Tables' not recognizing 'isHDF5File' - python

I am writing code that creates an HDF5 file that can later be used for data analysis. I load the following packages:
import numpy as np
import tables
Then I use the tables module to determine if my file is an HDF5 file with:
tables.isHDF5File(FILENAME)
This would normally print either True or False, depending on whether the file is actually an HDF5 file. However, I get the error:
AttributeError: module 'tables' has no attribute 'isHDF5File'
So I tried:
from tables import isHDF5File
and got the error:
ImportError: cannot import name 'isHDF5File'
I've tried this code on another computer, and it ran fine. I've tried updating both numpy and tables with pip, but it reports that both are already up to date. Is there a reason 'tables' isn't recognizing 'isHDF5File' for me? I am running this code on a Mac (not working), but it worked on a PC (if this matters).

Do you have the function name right?
In [21]: import tables
In [22]: tables.is_hdf5_file?
Docstring:
is_hdf5_file(filename)
Determine whether a file is in the HDF5 format.
When successful, it returns a true value if the file is an HDF5
file, false otherwise. If there were problems identifying the file,
an HDF5ExtError is raised.
Type: builtin_function_or_method
In [23]:
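PyTables 3.0 renamed its public API to PEP 8 style, so isHDF5File became is_hdf5_file, and which spelling is available depends on the installed version. Below is a minimal sketch of a lookup that works with either spelling. The helper name resolve_hdf5_checker is hypothetical and not part of PyTables.

```python
def resolve_hdf5_checker(module):
    """Return the HDF5 check function under whichever name `module` exposes."""
    # Prefer the PEP 8 name; fall back to the legacy camelCase name.
    for name in ("is_hdf5_file", "isHDF5File"):
        func = getattr(module, name, None)
        if callable(func):
            return func
    raise AttributeError(
        f"{module!r} exposes neither is_hdf5_file nor isHDF5File"
    )


if __name__ == "__main__":
    try:
        import tables
    except ImportError:
        print("PyTables is not installed")
    else:
        check = resolve_hdf5_checker(tables)
        print(check.__name__)  # shows which spelling this installation provides
```

If the code only needs to run on current PyTables versions, calling tables.is_hdf5_file(FILENAME) directly is simpler.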

Related

Unable to view dataframes in Spyder's variable explorer "Can't get attribute '_unpickle_block' on"

I'm using Python 3.8.13 and Pandas 1.4.1 on Spyder 4.1.5.
I have no problem reading the dataframe into memory, but when I try to open the dataframe for viewing in the variable explorer I get the following:
"Spyder was unable to retrieve the value of this variable from the console. The error message was: Can't get attribute '_unpickle_block' on".
This is regardless of size (e.g. if I create a variable containing just the head of the dataset, the issue persists). Not sure why this specific error is showing as I don't believe I'm using a pickled object at any point.
I've confirmed that I can open the dataframe when using a Python 3.9 env with Spyder 5.1.5, but I need to use the former specs given they're locked in on a virtual machine. Here's my code as of now (very straightforward):
import pandas as pd
import numpy as np
filepath = "../../../projects2/"
df = pd.read_parquet(filepath + "hmda/lar/parquet/hmda_lar_2017.parquet")

When importing into Google Colab: TypeError: 'str' object is not callable

Using pandas in Google Colaboratory, I am attempting to import a .csv file named 'gifted.csv' with the following code:
df=pd.read_csv('/content/gifted.csv')
I have imported the pandas library as pd, but whenever I run the code, it fails with the following error:
TypeError: 'str' object is not callable
I don't know where the CSV is located, but try
df=pd.read_csv('content/gifted.csv')
without the '/' before content.
The error message does not point to the path, but it is worth a try.
Also check that the package was installed and imported correctly.
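This TypeError generally means that the object being called is a string rather than a function, for example because a name such as pd.read_csv was assigned a string in an earlier notebook cell. Whether that happened here is an assumption, since the question does not show the earlier cells. The sketch below reproduces the message with a hypothetical stand-in object instead of pandas.

```python
import types

# Hypothetical stand-in for the pandas module, so the sketch runs without pandas.
pd = types.SimpleNamespace(read_csv=lambda path: f"loaded {path}")

# An accidental assignment ('=' instead of a call) rebinds the name to a str.
pd.read_csv = "/content/gifted.csv"

try:
    pd.read_csv("/content/gifted.csv")  # now calls a string, not a function
except TypeError as exc:
    message = str(exc)

print(message)
```

If this is the cause, restarting the runtime and running import pandas as pd again restores the original function.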

Python - "KeyError : System.Object" - Pyadomd - Querying a SSAS Data Source

I am working on a project where I am trying to query an SSAS data source we have at work through Python. The connection currently lives in Excel files, but I am trying to reverse engineer the process with Python to automate part of the analysis I do day to day. I use the pyadomd library to connect to the data source. Here's my code:
import clr
clr.AddReference(r"C:\Program Files (x86)\Microsoft Office\root\vfs\ProgramFilesX86\Microsoft.NET\ADOMD.NET\130\Microsoft.AnalysisServices.AdomdClient.dll")
clr.AddReference('Microsoft.AnalysisServices.AdomdClient')
from Microsoft.AnalysisServices.AdomdClient import AdomdConnection , AdomdDataAdapter
from sys import path
# Raw string, so the backslashes in the Windows path are not treated as escape sequences
path.append(r'C:\Program Files (x86)\Microsoft Office\root\vfs\ProgramFilesX86\Microsoft.NET\ADOMD.NET\130\Microsoft.AnalysisServices.AdomdClient.dll')
import pyadomd
from pyadomd import Pyadomd
from pyadomd._type_code import adomd_type_map, convert
constr= "connection string"
query = "query"  # placeholder, like the connection string
with Pyadomd(constr) as conn:
    with conn.cursor().execute(query) as cur:
        print(cur.fetchall())
This works in part: I seem to be able to connect to the SSAS data source. If I run conn = Pyadomd(constr), it no longer returns an error as it did before. The issue is that when I try to execute the query with the cursor, it raises:
File "C:\Users\User\Anaconda3\lib\site-packages\pyadomd\pyadomd.py", line 71, in execute
adomd_type_map[self._reader.GetFieldType(i).ToString()].type_name
KeyError: 'System.Object'
By doing a bit of research, I found that KeyError meant that the code was trying to access a key within a dictionary in which that key isn't present. By digging through my variables and going through the code, I realized that the line:
from pyadomd._type_code import adomd_type_map
Created this dictionary of keys:values:
See dictionary here
Containing these keys: System.Boolean, System.DateTime, System.Decimal, System.Double, System.Int64, System.String. I figured that the "KeyError: System.Object" was referring to that dictionary. My issue is how can I import this System.Object key to that dictionary? From which library/module/IronPython Clr reference can I get it from?
What I tried:
clr.AddReference("System.Object")
Gave me error message saying "Unable to find assembly 'System.Object'. at Python.Runtime.CLRModule.AddReference(String name)"
I also tried:
from System import Object #no error but didn't work
from System import System.Object #error saying invalid syntax
I think it has to do with some clr.AddReference IronPython thing that I am missing, but I've been looking everywhere and can't find it.
Thanks!
Glad that the newer version solved the problem.
A few comments on the code snippet above. It can be written a bit more concisely 😊
Pyadomd will import the necessary classes from the AdomdClient, which means that the following lines can be left out.
clr.AddReference(r"C:\Program Files (x86)\MicrosoftOffice\root\vfs\ProgramFilesX86\Microsoft.NET\ADOMD.NET\130\Microsoft.AnalysisServices.AdomdClient.dll")
clr.AddReference('Microsoft.AnalysisServices.AdomdClient')
from Microsoft.AnalysisServices.AdomdClient import AdomdConnection , AdomdDataAdapter
Your code will then look like this:
import pandas as pd
from sys import path
path.append(r'C:\Program Files (x86)\MicrosoftOffice\root\vfs\ProgramFilesX86\Microsoft.NET\ADOMD.NET\130')
from pyadomd import Pyadomd
constr= "constring"
query = "query"
with Pyadomd(constr) as con:
    with con.cursor().execute(query) as cur:
        DF = pd.DataFrame(cur.fetchone(), columns = [i.name for i in cur.description])
The most important thing is to add the AdomdClient.dll to your path before importing the pyadomd package.
Furthermore, the package is mainly meant to be used with CPython version 3.6 and 3.7.
Well big problems require big solutions..
After endlessly searching the web, I went on https://pypi.org/project/pyadomd/ and directly contacted the author of the package (SCOUT). Emailed him the same question and apparently there was a bug within the code that he fixed overnight and produced a new version of the package, going from 0.0.5 to 0.0.6. In his words:
Hi,
Thanks for writing me 😊
I investigated the error, and you are correct, the type map doesn’t support converting System.Object.
That is a bug!
I have uploaded a new version of the Pyadomd package to Pypi which should fix the bug – Pyadomd will now just pass a System.Object type through as a .net object. Because Pyadomd doesn’t know the specifics of the System.Object type at runtime, you will then be responsible yourself to convert to a python type if necessary.
Please install the new version using pip.
So after running a little pip install pyadomd --upgrade, I restarted Spyder and retried the code and it now works and I can query my SSAS cube !! So hopefully it can help others.
Snippet of the code:
import pandas as pd
import clr
clr.AddReference(r"C:\Program Files (x86)\MicrosoftOffice\root\vfs\ProgramFilesX86\Microsoft.NET\ADOMD.NET\130\Microsoft.AnalysisServices.AdomdClient.dll")
clr.AddReference('Microsoft.AnalysisServices.AdomdClient')
from Microsoft.AnalysisServices.AdomdClient import AdomdConnection , AdomdDataAdapter
from sys import path
path.append(r'C:\Program Files (x86)\MicrosoftOffice\root\vfs\ProgramFilesX86\Microsoft.NET\ADOMD.NET\130\Microsoft.AnalysisServices.AdomdClient.dll')
import pyadomd
from pyadomd import Pyadomd
constr= "constring"
query = "query"
and then as indicated on his package website:
with Pyadomd(constr) as con:
    with con.cursor().execute(query) as cur:
        DF = pd.DataFrame(cur.fetchone(), columns = [i.name for i in cur.description])
and bam! 10795 rows by 39 columns DataFrame, I haven't precisely calculated time yet, but looking good so far considering the amount of data.
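The fix the package author describes, passing an unmapped .NET type through unchanged instead of raising KeyError, can be sketched roughly as follows. This is not Pyadomd's actual implementation: TYPE_MAP and convert_value are hypothetical names, and the .NET type names are plain strings so the sketch runs without pythonnet.

```python
# Hypothetical type map keyed by .NET type name, loosely modeled on
# some of the keys listed in the question. Not Pyadomd's real map.
TYPE_MAP = {
    "System.Boolean": bool,
    "System.Double": float,
    "System.Int64": int,
    "System.String": str,
}


def convert_value(type_name, value):
    """Convert `value` if its type is mapped; otherwise pass it through."""
    # dict.get returns None for a missing key, where TYPE_MAP[type_name]
    # would raise KeyError (the failure mode reported in the question).
    converter = TYPE_MAP.get(type_name)
    if converter is None:
        return value  # the caller is responsible for any further conversion
    return converter(value)


if __name__ == "__main__":
    print(convert_value("System.Int64", "42"))
    print(convert_value("System.Object", ["left", "as-is"]))
```

The trade-off is the one the author mentions: values of unmapped types arrive unconverted, so the calling code has to handle them.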

Python read pickle protocol 4 error: STACK_GLOBAL requires str

On Python 3.7.5 and Ubuntu 18.04, reading a pickle file (pickle protocol 4) gives an error.
Sample code:
import pickle as pkl
file = open("sample.pkl", "rb")
data = pkl.load(file)
Error:
UnpicklingError Traceback (most recent call last)
----> 1 data = pickle.load(file)
UnpicklingError: STACK_GLOBAL requires str
Reading from the same file object solves the problem.
Reading with pandas gives the same problem.
I also had this error; it turned out I was opening a NumPy file with pickle. ;)
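Opening a NumPy file with pickle, as described above, can be caught before loading by looking at the first bytes of the file. Below is a minimal sketch using only the standard library: .npy files start with the magic string \x93NUMPY, and pickles written with protocol 2 or later start with the byte 0x80 followed by the protocol number. The helper name sniff_format is hypothetical.

```python
import os
import pickle
import tempfile

NPY_MAGIC = b"\x93NUMPY"  # first six bytes of a .npy file
PICKLE_PROTO = 0x80       # PROTO opcode that starts protocol >= 2 pickles


def sniff_format(path):
    """Guess whether `path` holds a .npy array or a pickle, from its first bytes."""
    with open(path, "rb") as fh:
        head = fh.read(len(NPY_MAGIC))
    if head == NPY_MAGIC:
        return "npy"
    if len(head) >= 2 and head[0] == PICKLE_PROTO:
        return f"pickle protocol {head[1]}"
    return "unknown"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = os.path.join(tmp, "sample.pkl")
        with open(sample, "wb") as fh:
            pickle.dump({"a": 1}, fh, protocol=4)
        print(sniff_format(sample))
```

A file reported as npy should be read with numpy.load rather than pickle.load.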
It turns out this is a known issue; there is an issue page on GitHub.
I had this problem and just added pckl to the end of the file name.
My problem was that I was trying to pickle and un-pickle across different python environments - watch out to make sure your pickle versions match!
Perhaps this will be the solution to this error for someone.
I needed to load a NumPy array, and tried:
torch.load(file)
Loading the array this way raised the error. The fix is to load it with NumPy and turn the array into a tensor.
For example:
result = torch.from_numpy(np.load(file))

Geopandas throws driver error when reading shp file

Geopandas is throwing a driver error when reading a SHP file.
DriverError: '*PATH*/cb_2018_us_zcta510_500k.shp does not exist in the file system, and is not recognized as a supported dataset name.
All I am doing is this:
import geopandas
geopandas.read_file("*PATH*/cb_2018_us_zcta510_500k.shp")
The directory this pulls from includes all the other needed files downloaded from here:
https://www.census.gov/geographies/mapping-files/time-series/geo/carto-boundary-file.html
and the actual files are here:
https://www2.census.gov/geo/tiger/GENZ2018/shp/cb_2018_us_zcta510_500k.zip
Just to confirm that the file is not corrupt or anything I opened it up in QGis and it pulled up perfectly.
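The error message says the path itself could not be found, so one thing worth ruling out is a path that does not resolve or a missing companion file. A shapefile is a set of files sharing a base name, of which .shp, .shx and .dbf are mandatory. The sketch below checks for them using only the standard library. missing_shapefile_parts is a hypothetical helper, not a GeoPandas function.

```python
from pathlib import Path

# Mandatory components of a shapefile; they share one base name.
REQUIRED_SUFFIXES = (".shp", ".shx", ".dbf")


def missing_shapefile_parts(shp_path):
    """Return the mandatory shapefile components that are absent on disk."""
    base = Path(shp_path).expanduser()
    return [
        str(base.with_suffix(suffix))
        for suffix in REQUIRED_SUFFIXES
        if not base.with_suffix(suffix).is_file()
    ]


if __name__ == "__main__":
    # Hypothetical path, for illustration only.
    print(missing_shapefile_parts("data/cb_2018_us_zcta510_500k.shp"))
```

An empty list means all three files exist at the resolved path; otherwise the list names the files Python cannot see from the current working directory.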
In case someone else needs similar info: I, too, had a legitimate shapefile URL, and GeoPandas read_file threw a DriverError saying it was not recognized as a supported file format.
What worked for me is the following:
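import fiona
import geopandas
with fiona.open('/path/to/my_shapefile.shp') as shp:
    # Assumed step: `geo` was undefined in the original snippet, so build it from the open collection
    geo = geopandas.GeoDataFrame.from_features(shp, crs=shp.crs_wkt)
ax = geo.plot()
#...rest of code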
