Unable to manipulate netCDF4 files using xarray - python

I am following a particular Machine Learning exercise using jupyter notebooks. The link to the exercise is here. I have managed to install all the necessary packages such as netCDF4, rioxarray, xarray and tensorflow. However, I am having trouble reading the netCDF4 datasets using the open_mfdataset function of xarray. This is the error I get if I try to run the following code:
z500 = xr.open_mfdataset('geopotential_500*.nc', combine='by_coords').isel(time=slice(None, None, 12))
z500
The error:
...
ImportError: DLL load failed while importing defs: The specified procedure could not be found.
I also loaded other dependancies of xarray as explained here. However, I cannot proceed to the next step because of the above issue.
Here is my full code:
# Import packages
import xarray as xr
import numpy as np
# Load in the necessary python packages to train in the neural network
import tensorflow.keras as keras
import tensorflow as tf
# Unzip the data
import os
from zipfile import ZipFile
path = "C:/Users/gachuhi/gis800"
filepath = os.path.join(path, 'geopotential_500_5.625deg.zip')
with ZipFile(filepath, 'r') as zObject:
zObject.extractall(path=os.path.join(path, 'geopotential'))
z500 = xr.open_mfdataset('C:/Users/gachuhi/gis800/geopotential/geopotential_500*.nc', combine='by_coords').isel(time=slice(None, None, 12))
z500
The error ---------
C:\Users\gachuhi\anaconda3\envs\python-gis\lib\site-packages\xarray\backends\plugins.py:71: RuntimeWarning: Engine 'rasterio' loading failed:
DLL load failed while importing _version: The specified procedure could not be found.
warnings.warn(f"Engine {name!r} loading failed:\n{ex}", RuntimeWarning)
.....
--snip, too long--
.....
File h5py\h5.pyx:1, in init h5py.h5()
ImportError: DLL load failed while importing defs: The specified procedure could not be found.
I tried importing rasterio, netCDF4 as import netCDF4, but whatever attempt was used, it was impossible to open the dataset as intended. I think the end goal of xr.mf_dataset is to combine the netCDF4 files into one file for further analysis but I have been unable to proceed any further with the exercise from this particular point. How do I solve this?
Full path to zip file is here

Related

import error in jupyter, can't create file in destination [duplicate]

I am trying to load my saved model from s3 using joblib
import pandas as pd
import numpy as np
import json
import subprocess
import sqlalchemy
from sklearn.externals import joblib
ENV = 'dev'
model_d2v = load_d2v('model_d2v_version_002', ENV)
def load_d2v(fname, env):
model_name = fname
if env == 'dev':
try:
model=joblib.load(model_name)
except:
s3_base_path='s3://sd-flikku/datalake/doc2vec_model'
path = s3_base_path+'/'+model_name
command = "aws s3 cp {} {}".format(path,model_name).split()
print('loading...'+model_name)
subprocess.call(command)
model=joblib.load(model_name)
else:
s3_base_path='s3://sd-flikku/datalake/doc2vec_model'
path = s3_base_path+'/'+model_name
command = "aws s3 cp {} {}".format(path,model_name).split()
print('loading...'+model_name)
subprocess.call(command)
model=joblib.load(model_name)
return model
But I get this error:
from sklearn.externals import joblib
ImportError: cannot import name 'joblib' from 'sklearn.externals' (C:\Users\prane\AppData\Local\Programs\Python\Python37\lib\site-packages\sklearn\externals\__init__.py)
Then I tried installing joblib directly by doing
import joblib
but it gave me this error
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 8, in load_d2v_from_s3
File "/home/ec2-user/.local/lib/python3.7/site-packages/joblib/numpy_pickle.py", line 585, in load
obj = _unpickle(fobj, filename, mmap_mode)
File "/home/ec2-user/.local/lib/python3.7/site-packages/joblib/numpy_pickle.py", line 504, in _unpickle
obj = unpickler.load()
File "/usr/lib64/python3.7/pickle.py", line 1088, in load
dispatch[key[0]](self)
File "/usr/lib64/python3.7/pickle.py", line 1376, in load_global
klass = self.find_class(module, name)
File "/usr/lib64/python3.7/pickle.py", line 1426, in find_class
__import__(module, level=0)
ModuleNotFoundError: No module named 'sklearn.externals.joblib'
Can you tell me how to solve this?
You should directly use
import joblib
instead of
from sklearn.externals import joblib
It looks like your existing pickle save file (model_d2v_version_002) encodes a reference module in a non-standard location – a joblib that's in sklearn.externals.joblib rather than at top-level.
The current scikit-learn documentation only talks about a top-level joblib – eg in 3.4.1 Persistence example – but I do see a reference in someone else's old issue to a DeprecationWarning in scikit-learn version 0.21 about an older scikit.external.joblib variant going away:
Python37\lib\site-packages\sklearn\externals\joblib_init_.py:15:
DeprecationWarning: sklearn.externals.joblib is deprecated in 0.21 and
will be removed in 0.23. Please import this functionality directly
from joblib, which can be installed with: pip install joblib. If this
warning is raised when loading pickled models, you may need to
re-serialize those models with scikit-learn 0.21+.
'Deprecation' means marking something as inadvisable to rely-upon, as it is likely to be discontinued in a future release (often, but not always, with a recommended newer way to do the same thing).
I suspect your model_d2v_version_002 file was saved from an older version of scikit-learn, and you're now using scikit-learn (aka sklearn) version 0.23+ which has totally removed the sklearn.external.joblib variation. Thus your file can't be directly or easily loaded to your current environment.
But, per the DeprecationWarning, you can probably temporarily use an older scikit-learn version to load the file the old way once, then re-save it with the now-preferred way. Given the warning info, this would probably require scikit-learn version 0.21.x or 0.22.x, but if you know exactly which version your model_d2v_version_002 file was saved from, I'd try to use that. The steps would roughly be:
create a temporary working environment (or roll back your current working environment) with the older sklearn
do imports something like:
import sklearn.external.joblib as extjoblib
import joblib
extjoblib.load() your old file as you'd planned, but then immediately re-joblib.dump() the file using the top-level joblib. (You likely want to use a distinct name, to keep the older file around, just in case.)
move/update to your real, modern environment, and only import joblib (top level) to use joblib.load() - no longer having any references to `sklearn.external.joblib' in either your code, or your stored pickle files.
You can import joblib directly by installing it as a dependency and using import joblib,
Documentation.
Maybe your code is outdated. For anyone who aims to use fetch_mldata in digit handwritten project, you should fetch_openml instead. (link)
In old version of sklearn:
from sklearn.externals import joblib
mnist = fetch_mldata('MNIST original')
In sklearn 0.23 (stable release):
import sklearn.externals
import joblib
dataset = datasets.fetch_openml("mnist_784")
features = np.array(dataset.data, 'int16')
labels = np.array(dataset.target, 'int')
For more info about deprecating fetch_mldata see scikit-learn doc
none of the answers below works for me, with a little changes this modification was ok for me
import sklearn.externals as extjoblib
import joblib
for this error, I had to directly use the following and it worked like a charm:
import joblib
Simple
In case the execution / call to joblib is within another .py program instead of your own (in such case even you have installed joblib, it still causes error from within the calling python programme unless you change the code, i thought would be messy), I tried to create a hardlink:
(windows version)
Python> import joblib
then inside your sklearn path >......\Lib\site-packages\sklearn\externals
mklink /J ./joblib .....\Lib\site-packages\joblib
(you can work out the above using a ! or %, !mklink....... or %mklink...... inside your Python juptyter notebook , or use python OS command...)
This effectively create a virtual folder of joblib within the "externals" folder
Remarks:
Of course to be more version resilient, your code has to check for the version of sklearn is >= 0.23 again before hand.
This would be alternative to changing sklearn vesrion.
When getting error:
from sklearn.externals import joblib it deprecated older version.
For new version follow:
conda install -c anaconda scikit-learn (install using "Anaconda Promt")
import joblib (Jupyter Notebook)
I had the same problem
What I did not realize was that joblib was already installed!
so what you have to do is replace
from sklearn.externals import joblib
with
import joblib
and that is it
After a long investigation, given my computer setup, I've found that was because an SSL certificate was required to download the dataset.

How to convert a python file in which I import cv2 to a .so file by using cython

I have a python file in which I import cv2.When I do not import cv2,I can use cython to convert the python file to a .so file and I can use it in java by jni.But when I import cv2 and add relative code,the .so file produced by cython can't be use.When I use it in java,it will thrown a error which indicate progress can't import cv2.How to solve this problem?Thanks.
Here is error:
/root/miniconda3/lib/python3.8/site-packages/numpy/__init__.py:142: UserWarning: mkl-service package
failed to import, therefore Intel(R) MKL initialization ensuring its correct out-of-the box operation under condition when Gnu OpenMP had already been loaded by Python process is not assured. Please install mkl-service package, see http://github.com/IntelPython/mkl-service
from . import _distributor_init
OpenCV bindings requires "numpy" package.
Install it via command:
pip install numpy
AttributeError: module '__main__' has no attribute 'JNI_API_testFunction'
error
I use python to run my Test.py file and it work out.It means that my python can import cv2 and numpy correctly.I use this command to convert my Test.py to a .so file:
python Setup.py build_ext --inplace
Here is Setup.py:
from distutils.core import setup
from Cython.Build import cythonize
from distutils.extension import Extension
sourcefiles = ['Test.pyx', 'main.c']
extensions = [Extension("Test", sourcefiles,
include_dirs=['/root/java/jdk-18.0.2/include/',
'/root/java/jdk-18.0.2/include/linux/',
'/root/miniconda3/include/python3.8/'],
library_dirs=['/root/miniconda3/lib/'],
libraries=['python3.8']),
]
setup(ext_modules=cythonize(extensions, language_level = 3))
I use System.load() function to load my .so file in java.If I do not import cv2 and numpy and delete code about cv2 and numpy,I can call testFunction().It means that my java code is correct.But if I import cv2 and numpy and I add code about cv2 and numpy,for example,cv2.imread(),I will get a error.
I think it can find cv2 and numpy because it have run numpy's init.py file.I think the problem result from cython because I get athis information in a Chinese community:
The directory level information of the code file will be lost in the file code processed by cython. There is no difference between the code after c.py conversion and the code after M / c.py conversion.In a.py or b.py code, if there is a reference to C Py module, the loss of directory information will cause the two to report an error when executing import M.C, because the corresponding module cannot be found.
the author say the problem can be solved by modifying the source code of cython,but he don't give the method.

ModuleNotFoundError: No module named 'onnxruntime'

I'm taking a Microsoft PyTorch course and trying to implement on Kaggle Notebooks but I kept having the same error message over and over again: "ModuleNotFoundError: No module named 'onnxruntime'". I've checked everywhere possible if I could find a solution to it but none, I even tried installing it manually using pip in the notebook, but it's still not working. I've checked the official onnxruntime website and documentation but there's nowhere it states anything about something being outdated or any other issue. Someone help. My code won't run because it says "onnxruntime is not defined". Here are my imports:
%matplotlib inline
import torch
import onnxruntime
from torch import nn
import torch.onnx as onnx
import torchvision.models as models
from torchvision import datasets
from torchvision.transforms import ToTensor
and the code cell I'm trying to run
session = onnxruntime.InferenceSession(onnx_model, None)
input_name = session.get_inputs()[0].name
output_name = session.get_outputs()[0].name
result = session.run([output_name], {input_name: x.numpy()})
predicted, actual = classes[result[0][0].argmax(0)], classes[y]
print(f'Predicted: "{predicted}", Actual: "{actual}"')
And you can find the complete notebook here: https://www.kaggle.com/faisalalbasu/complete-model
the error occurs because "import" cannot find onnxruntime in any of the paths, check where import is searching and see if onnxruntime is in there.
check what path pip install installs to, that way in the future you won't have the same problem! :)

Python: Appending system path to import the module

I am trying to import a module from the prgoram called 'Power factory" in Python. The folder where power factory file located looks as follow:
I have written a script to import the powerfactory module as follow:
import sys
sys.path.append("PAth of folder")
import powerfactory as pf
When I ran the above code, it throws the following error:
ImportError: DLL load failed while importing powerfactory: The specified module could not be found.
I copied the .dll file present in the power factory folder into the Python DLL folder but no luck. Could anyone help me where am I making the mistake?
Searching the net I found this (from here)
I am not able to import powerfactory module: DLL load failed Category:
Scripting
If an error message appears when importing the powerfactory module
stating “ DLL load failed: the specified module could not be found”,
this means that Microsoft Visual C++ Redistributable for Visual Studio
2012 package is not installed on the computer.
To overcome this problem the user should add the PowerFactory
installation directory to the os path variable within his python
script.
import os
os.environ["PATH"] = r'C:\Program Files\DIgSILENT\PowerFactory 2016;' + os.environ["PATH"]
Copy the .dll file from your digsilent folder, eg. Program Files\DIgSILENT\PowerFactory 2020 SP2A\Python\3.8\boost_python38-vc141-mt-x64-1_68.dll
and place the .dll file directly in your system!
Save it to your C:\Windows\System32 folder.
&
Save it also to your C:\Windows\SysWOW64 folder.
You should be good to go.

ImportError: No module named examples.tutorials.mnist

I am very frustrated by this error, what I did is getting the code from tensor flow tutorial to import moist:
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
However when I run the python shows:
File "/Users/kevinling/Desktop/Machine Learning/tensorflow.py", line 2, in
from tensorflow.examples.tutorials.mnist import input_data
ImportError: No module named examples.tutorials.mnist
When I check into the directory, the file is perfectly there:
And the directory is:
enter image description here
The input_data.py is like:
The input_data.py
Just rename your example from "tensorflow.py" to anything else and it will work. The interpreter is trying to import the necessary files from your script.
Did you already install tensorflow? If not, follow their install instructions or simply install using pip:
pip install tensorflow
Now, make sure you are NOT currently in a folder where tensorflow is located, and try running your script.
python your_script.py

Categories