PyTorch Dataset Python script does not run in Colab

I've decided to make reusable scripts for frequently used classes. So, I made one for my image dataset and imported it in Colab. I can create the dataset object successfully but can't get data from it.
Here is my dataset code:
https://pastebin.com/XfAiTe3A
Here is how I use the script in Google Colab:
from imageDataset import customDataset

dataset = customDataset(train_data)
dataset[0]
Here is the error:
16 target = self.targets[index]
---> 17 image = io.imread(self.image_paths[index])
18
19 if self.augmentations is not None:
SystemError: <built-in function imread> returned NULL without setting an error
But if I copy-paste the code into a Jupyter cell, I can use the class as I normally do. What am I doing wrong?
Any help is appreciated, thanks.

I just had to rerun the notebook and the problem fixed itself. Coding truly amazes me.

Related

Importing zero_gradients from torch.autograd.gradcheck

I want to replicate the code here, but I get the following error when running it in Google Colab:
ImportError: cannot import name 'zero_gradients' from
'torch.autograd.gradcheck'
(/usr/local/lib/python3.7/dist-packages/torch/autograd/gradcheck.py)
Can someone help me with how to solve this?
This seems to be using a very old version of PyTorch; the function itself is not available anymore. However, if you look at this commit, you will see the implementation of zero_gradients. What it does is simply zero out the gradient of the input:
def zero_gradients(i):
    for t in iter_gradients(i):
        t.zero_()
Then zero_gradients(x) should be the same as x.zero_grad(), which is the current API, assuming x is an nn.Module!
Or it's just:
if x.grad is not None:
    x.grad.zero_()
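If you want a drop-in replacement so the linked code runs unchanged, here is a minimal sketch; I am assuming the inputs are leaf tensors with populated .grad, possibly nested in lists or tuples, which is roughly what the removed helper handled:

import torch

def zero_gradients(x):
    # recursively zero the .grad of a tensor or a (nested) list/tuple of tensors
    if isinstance(x, torch.Tensor):
        if x.grad is not None:
            x.grad.detach_()
            x.grad.zero_()
    elif isinstance(x, (list, tuple)):
        for elem in x:
            zero_gradients(elem)

With this defined, the original call sites such as zero_gradients(x) should work as before.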

Save instance of custom Python class to file

I would like to save to a file multiple instances of a custom Python class.
The class is Loess, taken from https://github.com/joaofig/pyloess, which performs localised regression.
Below is an MWE of the saving process:
import pickle
import numpy as np
from Loess import Loess

xarr = np.linspace(0, 4, 100) * np.pi
yarr = 2 * np.sin(xarr) + np.random.rand(len(xarr))
loess = Loess(xarr, yarr)

with open("localised_regression.pkl", "wb") as output:
    pickle.dump(loess, output)
and now the retrieval process:
import pickle

with open("localised_regression.pkl", "rb") as input_:
    localised_regression = pickle.load(input_)
When I do this in a Jupyter notebook (run the first snippet in one notebook and the second in another), it works perfectly.
Dumping the instance of Loess from a notebook and retrieving it from a terminal or another machine doesn't work.
I get a ModuleNotFoundError: No module named 'Loess' error message.
I even tried importing the module in the Python session where I attempt the retrieval, but nothing changes.
It seems to only work from within the same location where the dumping was performed.
I'm using Python 3.7.7 and the same conda environment for both the Python shell and the Jupyter notebook.
I examined other answers (like how to save/read class wholly in Python) but no luck.
I've tried saving to a numpy file, but it's the same story.
I've also tried dumping with marshal and json, but both complained.
Does anybody have a suggestion on how to solve this? Thank you.
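One thing worth checking: pickle stores only the import path of the class (here Loess.Loess), not its source code, so the unpickling session must be able to import Loess under exactly that name. A minimal sketch of the retrieval, assuming Loess.py sits in a directory such as /path/to/pyloess (a hypothetical path):

import pickle
import sys

# make the directory containing Loess.py importable under the module name
# that was recorded at pickle time
sys.path.insert(0, "/path/to/pyloess")

with open("localised_regression.pkl", "rb") as input_:
    localised_regression = pickle.load(input_)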

No HTML webpage shown from TensorFlow Data Validation visualize_statistics() when run from a Databricks notebook

I am trying to use TensorFlow (2.2) Data Validation (TFDV version 0.22.2) to visualize data on a Databricks GPU cluster.
From a Databricks notebook, I am running the code at:
https://nbviewer.jupyter.org/github/tensorflow/tfx/blob/master/docs/tutorials/data_validation/tfdv_basic.ipynb
But when I run
tfdv.visualize_statistics(train_stats)
I get
<IPython.core.display.HTML object>
and no HTML webpage is shown.
I have tried updating matplotlib, but it does not help.
I have also tried https://python-forum.io/Thread-How-to-display-IPython-core-display-HTML-object
and How to embed HTML into IPython output?
but still no HTML is shown.
Could anybody help me with this?
Thanks.
UPDATE
I have tried:
html = tfdv.visualize_statistics(train_stats).data
and got:
<IPython.core.display.HTML object>
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<command-2488671> in <module>
----> 1 html = tfdv.visualize_statistics(train_stats).data
AttributeError: 'NoneType' object has no attribute 'data'
This can be fixed by importing the function that generates the HTML and calling it instead of the visualize functions, then rendering the result with the Databricks displayHTML function.
from tensorflow_data_validation.utils.display_util import get_statistics_html
displayHTML(get_statistics_html(train_stats))
The issue is that the TFDV display utilities import the IPython display functionality, which shadows the Databricks display function inside the visualize functions:
try:
    # pylint: disable=g-import-not-at-top
    from IPython.display import display
    from IPython.display import HTML
except ImportError as e:
    ...
The display_anomalies function has a similar issue and can be solved by importing the get_anomalies_dataframe function directly and displaying the resulting pandas DataFrame, as sketched below.
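A minimal sketch of that variant, assuming anomalies holds the result of a tfdv.validate_statistics call:

from tensorflow_data_validation.utils.display_util import get_anomalies_dataframe

# render the anomalies as a pandas DataFrame with the native Databricks display
display(get_anomalies_dataframe(anomalies))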
This works perfectly in a Jupyter notebook, which is what is needed to render this <IPython.core.display.HTML object>.
You can get the HTML code with:
html = tfdv.visualize_statistics(train_stats).data
I was able to display the statistics with a workaround:
I copy-pasted into my Databricks notebook most of the code on this page and modified this function so that, instead of displaying the HTML, it returns it. Like this:
def visualize_statistics(
    lhs_statistics: statistics_pb2.DatasetFeatureStatisticsList,
    rhs_statistics: Optional[
        statistics_pb2.DatasetFeatureStatisticsList] = None,
    lhs_name: Text = 'lhs_statistics',
    rhs_name: Text = 'rhs_statistics',
    allowlist_features: Optional[List[types.FeaturePath]] = None,
    denylist_features: Optional[List[types.FeaturePath]] = None) -> Text:
  """Visualize the input statistics using Facets.

  Args:
    lhs_statistics: A DatasetFeatureStatisticsList protocol buffer.
    rhs_statistics: An optional DatasetFeatureStatisticsList protocol buffer to
      compare with lhs_statistics.
    lhs_name: Name of the lhs_statistics dataset.
    rhs_name: Name of the rhs_statistics dataset.
    allowlist_features: Set of features to be visualized.
    denylist_features: Set of features to ignore for visualization.

  Raises:
    TypeError: If the input argument is not of the expected type.
    ValueError: If the input statistics protos does not have only one dataset.
  """
  assert (not allowlist_features or not denylist_features), (
      'Only specify one of allowlist_features and denylist_features.')
  html = get_statistics_html(lhs_statistics, rhs_statistics, lhs_name, rhs_name,
                             allowlist_features, denylist_features)
  return html
After that you can simply do:
displayHTML(visualize_statistics(train_stats))
I know it's not ideal, but it worked.

How to create a class in Python to work with an external API (SAP2000)

I'm quite new to programming, and even more so when it comes to object-oriented programming. I'm trying to connect through Python to an external piece of software (SAP2000, a structural engineering program). This program comes with an API to connect to, and there is an example in the help (http://docs.csiamerica.com/help-files/common-api(from-sap-and-csibridge)/Example_Code/Example_7_(Python).htm).
This works pretty well, but I would like to divide the code so that I can create one function for opening the program, several functions for working with it, and another one for closing it. This would give me the flexibility to make different calculations as desired and close the program afterwards.
Here is the code I have so far, where enableloadcases() is a function that operates once the instance is created.
import os
import sys
import comtypes.client
import pandas as pd

def openSAP2000(path, filename):
    # raw string so the backslashes in the Windows path are not treated as escapes
    ProgramPath = r"C:\Program Files (x86)\Computers and Structures\SAP2000 20\SAP2000.exe"
    APIPath = path
    ModelPath = APIPath + os.sep + filename
    mySapObject = comtypes.client.GetActiveObject("CSI.SAP2000.API.SapObject")
    #start SAP2000 application
    mySapObject.ApplicationStart()
    #create SapModel object
    SapModel = mySapObject.SapModel
    #initialize model
    SapModel.InitializeNewModel()
    ret = SapModel.File.OpenFile(ModelPath)
    #run model (this will create the analysis model)
    ret = SapModel.Analyze.RunAnalysis()

def closeSAP2000():
    #ret = mySapObject.ApplicationExit(False)
    SapModel = None
    mySapObject = None

def enableloadcases(case_id):
    '''
    The function activates LoadCases for output
    '''
    ret = SapModel.Results.Setup.SetCaseSelectedForOutput(case_id)
From another module, I call the function openSAP2000() and it works fine, but when I call the function enableloadcases() an error says AttributeError: type object 'SapModel' has no attribute 'Results'.
I believe this must be done by creating a class and then calling the functions inside it, but I honestly don't know how to do it.
Could you please help me?
Thank you very much.
Thank you very much for the help. I managed to solve the problem. It was as simple and stupid as marking SapModel variable as global.
Now it works fine.
Thank you anyway.
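For reference, here is a minimal sketch of the class-based structure the question asks about; it avoids globals by keeping SapModel as instance state. The API calls are the ones from the code above, but the class and method names are my own:

import os
import comtypes.client

class SAP2000Model:
    def __init__(self, path, filename):
        # attach to a running SAP2000 instance and open the model
        self.mySapObject = comtypes.client.GetActiveObject("CSI.SAP2000.API.SapObject")
        self.mySapObject.ApplicationStart()
        self.SapModel = self.mySapObject.SapModel
        self.SapModel.InitializeNewModel()
        self.SapModel.File.OpenFile(path + os.sep + filename)

    def run_analysis(self):
        return self.SapModel.Analyze.RunAnalysis()

    def enableloadcases(self, case_id):
        # activate a load case for output
        return self.SapModel.Results.Setup.SetCaseSelectedForOutput(case_id)

    def close(self):
        self.SapModel = None
        self.mySapObject = None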

How to Deploy Amazon-SageMaker Locally in Python

I trained my model in Amazon-SageMaker and downloaded it to my local computer. Unfortunately, I don't have any idea how to run the model locally.
The Model is in a directory with files like:
image-classification-0001.params
image-classification-0002.params
image-classification-0003.params
image-classification-0004.params
image-classification-0005.params
image-classification-symbol.json
model-shapes.json
Would anyone know how to run this locally with Python, or be able to point me to a resource that could help? I am trying to avoid calling the model using the Amazon API.
Edit: The model I used was created with code very similar to this example.
Any help is appreciated, I will award the bounty to whoever is most helpful, even if they don't completely solve the question.
This is not a complete answer, as I do not have a SageMaker setup (and I do not know MXNet), so I cannot practically test this approach. Yes, as already mentioned, I do not want to call this a complete answer, but rather a probable pointer/approach to solve this issue.
The Assumption -
You mentioned that your model is very similar to the notebook link you provided. If you read the text in the notebook carefully, you will see at some point there is something like this -
"In this demo, we are using Caltech-256 dataset, which contains 30608 images of 256 objects. For the training and validation data, we follow the splitting scheme in this MXNet example."
See the mention of MXNet there? Let us assume that you did not change a lot and hence your model is built using MXNet as well.
The Approach -
Assuming what I just mentioned, if you go and search in the documentation of the AWS SageMaker Python SDK, you will see a section about serialization of the modules. Which, again, by itself starts with another assumption -
"If you train function returns a Module object, it will be serialized by the default Module serialization system, unless you've specified a custom save function."
Assuming that this is true for your case, further reading in the same document tells us that "model-shapes.json" is a JSON-serialised representation of your model, "model-symbol.json" is the serialization of the module symbols created by calling the 'save' function on the 'symbol' property of the module, and finally "module.params" is the serialized (I am not sure whether in text or binary format) form of the module parameters.
Equipped with this knowledge we go and look into the documentation of MXNet. And Voila! We see here how we can save and load models with MXNet. So as you already have those saved files, you just need to load them in a local installation of MXNet and then run them to predict the unknown.
I hope this will help you to find a direction to solve your problem.
Bonus -
I am not sure if this can also do the same job (it is also mentioned by @Seth Rothschild in the comments), but it should: the AWS SageMaker Python SDK has a way to load models from saved ones as well.
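A minimal sketch of that route, assuming the artifacts are packed into a tar.gz on S3 and that Docker is available for SageMaker local mode (the bucket, role, and entry-point names here are hypothetical):

from sagemaker.mxnet import MXNetModel

# model_data points at a tar.gz of the artifact files listed in the question (hypothetical path)
model = MXNetModel(model_data='s3://my-bucket/model.tar.gz',
                   role='my-sagemaker-role',
                   entry_point='inference.py')

# instance_type='local' runs the serving container on this machine
# instead of a SageMaker endpoint
predictor = model.deploy(initial_instance_count=1, instance_type='local')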
Following SRC's advice, I was able to get it to work by following the instructions in this question and this doc, which describe how to load an MXNet model.
I loaded the model like so:
lenet_model = mx.mod.Module.load('model_directory/image-classification', 5)
image_l = 64
image_w = 64
lenet_model.bind(for_training=False,
                 data_shapes=[('data', (1, 3, image_l, image_w))],
                 label_shapes=lenet_model._label_shapes)
Then I predicted using the slightly modified helper functions from the previously linked documentation:
import mxnet as mx
import matplotlib.pyplot as plt
import cv2
import numpy as np
from mxnet.io import DataBatch

def get_image(url, show=False):
    # download the image and optionally show it
    fname = mx.test_utils.download(url)
    img = cv2.imread(fname)
    if img is None:
        return None
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    if show:
        plt.imshow(img)
        plt.axis('off')
    # convert into format (batch, RGB, width, height)
    img = cv2.resize(img, (64, 64))
    img = np.swapaxes(img, 0, 2)
    img = np.swapaxes(img, 1, 2)
    img = img[np.newaxis, :]
    return img

def predict(url, labels):
    img = get_image(url, show=True)
    # compute the predicted probabilities
    lenet_model.forward(DataBatch([mx.nd.array(img)]))
    prob = lenet_model.get_outputs()[0].asnumpy()
    # print the top-5
    prob = np.squeeze(prob)
    a = np.argsort(prob)[::-1]
    for i in a[0:5]:
        print('probability=%f, class=%s' % (prob[i], labels[i]))
Finally, I called the prediction with this code:
labels = ['a', 'b', 'c', 'd', 'e', 'f']
predict('https://eximagesite/img_tst_a.jpg', labels)
If you want to host your trained model locally, and you are using Apache MXNet as your model framework (as you have in the above example), the simplest way is to use MXNet Model Server: https://github.com/awslabs/mxnet-model-server
Once you have installed it locally, you can start serving with:
mxnet-model-server \
--models squeezenet=https://s3.amazonaws.com/model-server/models/squeezenet_v1.1/squeezenet_v1.1.model
and then call the local endpoint with the image:
curl -O https://s3.amazonaws.com/model-server/inputs/kitten.jpg
curl -X POST http://127.0.0.1:8080/squeezenet/predict -F "data=@kitten.jpg"
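The same call from Python, for anyone scripting the check (a minimal sketch using requests against the endpoint started above):

import requests

# fetch the sample image, then POST it to the local MXNet Model Server endpoint
image = requests.get('https://s3.amazonaws.com/model-server/inputs/kitten.jpg').content
with open('kitten.jpg', 'wb') as f:
    f.write(image)

with open('kitten.jpg', 'rb') as f:
    response = requests.post('http://127.0.0.1:8080/squeezenet/predict', files={'data': f})
print(response.json())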
