ValueError and AttributeError in Pycaret - python

I am using pycaret on my new laptop, and after a gap of 6 months, so I am not sure whether this problem is due to some issue with my laptop or due to some changes in the pycaret package itself. Earlier I simply used to create an experiment using pycaret's setup method and it worked. But now it keeps raising one error after another. For example, I used the two lines below to set up an experiment:
from pycaret.classification import *
exp = setup(data=df.drop(['id'], axis=1), target='cancer', session_id=123)
But this gave an error:
ValueError: Setting a random_state has no effect since shuffle is False. You should leave random_state to its default (None), or set shuffle=True.
Then I changed my second line as below:
exp = setup(data=df.drop(['id'], axis=1), target='cancer', session_id=123, fold_shuffle=True, imputation_type='iterative')
Then it returned a new error:
AttributeError: 'Make_Time_Features' object has no attribute 'list_of_features'
I remember that earlier I never had to pass these arguments to setup. It looks like even the default argument values of pycaret's setup method are not working. Can anyone suggest how to troubleshoot this?
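For context: the ValueError quoted above originates in scikit-learn's cross-validator checks rather than in pycaret itself, and it can be reproduced directly. A minimal sketch, assuming a reasonably recent scikit-learn (0.24 or later, where this became a hard error):

```python
import numpy as np
from sklearn.model_selection import KFold

# Passing a random_state while shuffle=False triggers the same check
# that pycaret's setup() hits internally when session_id is set.
try:
    KFold(n_splits=5, shuffle=False, random_state=123)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

This suggests the behaviour changed underneath pycaret with a scikit-learn upgrade, which would explain why the same setup call worked 6 months ago.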


Got `AttributeError` from `from_pymc3` of ArviZ

I am learning Bayesian inference from the book Bayesian Analysis with Python. However, when using plot_ppc, I got an AttributeError and the warning:
/usr/local/Caskroom/miniconda/base/envs/kaggle/lib/python3.9/site-packages/pymc3/sampling.py:1689: UserWarning: samples parameter is smaller than nchains times ndraws, some draws and/or chains may not be represented in the returned posterior predictive sample
warnings.warn(
The model is
shift = pd.read_csv('../data/chemical_shifts.csv')
with pm.Model() as model_g:
    μ = pm.Uniform('μ', lower=40, upper=70)
    σ = pm.HalfNormal('σ', sd=10)
    y = pm.Normal('y', mu=μ, sd=σ, observed=shift)
    trace_g = pm.sample(1000, return_inferencedata=True)
If I use the following code:
with model_g:
    y_pred_g = pm.sample_posterior_predictive(trace_g, 100, random_seed=123)
    data_ppc = az.from_pymc3(trace_g.posterior, posterior_predictive=y_pred_g) # 'Dataset' object has no attribute 'report'
I got 'Dataset' object has no attribute 'report'.
If I use the following code:
with model_g:
    y_pred_g = pm.sample_posterior_predictive(trace_g, 100, random_seed=123)
    data_ppc = az.from_pymc3(trace_g, posterior_predictive=y_pred_g) # AttributeError: 'InferenceData' object has no attribute 'report'
I got AttributeError: 'InferenceData' object has no attribute 'report'.
ArviZ version: 0.11.2
PyMC3 Version: 3.11.2
Aesara/Theano Version: 1.1.2
Python Version: 3.9.6
Operating system: MacOS Big Sur
How did you install PyMC3: conda
You are passing return_inferencedata=True to pm.sample(), which according to the PyMC3 documentation will return an InferenceData object rather than a MultiTrace object.
return_inferencedata : bool, default=False
Whether to return the trace as an arviz.InferenceData (True) object or a MultiTrace (False). Defaults to False, but we’ll switch to True in an upcoming release.
The from_pymc3 function, however, expects a MultiTrace object.
The good news is that from_pymc3 returns an InferenceData object, so you can solve this in one of two ways:
The easiest solution is to simply remove the from_pymc3 calls, since it returns InferenceData, which you already have due to return_inferencedata=True.
Set return_inferencedata=False (you can also remove that argument, but the documentation states that in the future it will default to True, so to be future proof it's best to explicitly set it to False). This will return a MultiTrace which can be passed to from_pymc3.

How to implement refining conflict in constraint programming

I use Docplex with Python 3.7 to implement constraint programming. When the model is infeasible, how can I list the constraints that were the source of the conflict?
mdl.export_as_cpo(out="/home/..../MCP3.lp")
msol = mdl.solve(FailLimit=700000, TimeLimit=1600)
DInfos = msol.get_solver_infos()
mconflict = msol.CpoRefineConflictResult()
mconflict.get_all_member_constraints()
Error message:
mconflict=msol.CpoRefineConflictResult()
AttributeError: 'CpoSolveResult' object has no attribute 'CpoRefineConflictResult'
solve returns a SolveResult, and CpoRefineConflictResult is a class in docplex.cp.solution. So, the error message is correct: a SolveResult does not have an attribute CpoRefineConflictResult. You'd expect the CpoRefineConflictResult as the result of the conflict refiner.
You should probably read through the documentation a bit more http://ibmdecisionoptimization.github.io/docplex-doc/cp/docplex.cp.solution.py.html
You can call the .refine_conflict() method on the CpoSolver object to obtain a CpoRefineConflictResult, as documented here http://ibmdecisionoptimization.github.io/docplex-doc/cp/docplex.cp.solver.solver.py.html#detailed-description
Perhaps you can provide a minimal, reproducible example, if you need a more specific solution to your problem. https://stackoverflow.com/help/minimal-reproducible-example
I have added:
from docplex.cp.solver.solver import CpoSolver
Afterwards, I added these lines for when the model is infeasible:
mconfl = CpoSolver(model)
mconf = mconfl.refine_conflict()

"SystemError: <class 'int'> returned a result with an error set" in Python

I wanted to apply a very simple function using ndimage.generic_filter() from scipy. This is the code:
import numpy as np
import scipy.ndimage as ndimage
data = np.random.rand(400,128)
dimx = int(np.sqrt(np.size(data,0)))
dimy = dimx
coord = np.random.randint(np.size(data,0), size=(dimx,dimy))
def test_func(values):
    idx_center = int(values[4])
    weight_center = data[idx_center]
    weights_around = data[values]
    differences = weights_around - weight_center
    distances = np.linalg.norm(differences, axis=1)
    return np.max(distances)
results = ndimage.generic_filter(coord,
                                 test_func,
                                 footprint=np.ones((3, 3)))
When I execute it though, the following error shows up:
SystemError: <class 'int'> returned a result with an error set
when trying to coerce values[4] to an int. If I run test_func() on its own with a random array values, without ndimage.generic_filter(), the function works fine.
Why is this error occurring? Is there a way to make it work?
For your case:
This must be a bug in either Python or SciPy. Please file a bug at https://bugs.python.org and/or https://www.scipy.org/bug-report.html. Include the version numbers of Python and NumPy/SciPy, the full code that you have here, and the entire traceback.
(Also, if you can find a way to trigger this bug that doesn't require the use of randomness, they will likely appreciate it. But if you can't find such a method, then please do file it as-is.)
In general:
"[R]eturned a result with an error set" is something that can only be done at the C level.
In general, the Python/C API expects most C functions to do one of two things:
1. Set an exception using one of these functions and return NULL (corresponds to throwing an exception).
2. Don't set an exception and return a "real" value, usually a PyObject* (corresponds to returning a value, including returning None).
These two cases are normally incorrect:
3. Set an exception (or fail to clear one that already exists), but then return some value other than NULL.
4. Don't set an exception, but then return NULL.
Python is raising a SystemError because the implementation of int, in the Python standard library, tried to do (3), possibly as a result of SciPy doing it first. This is always wrong, so there must be a bug in either Python or the SciPy code that it called into.
I was having a very similar experience with Python 3.8.1 and SciPy 1.4.1 on Linux. A workaround was to introduce np.floor so that:
centre = int(window.size / 2) becomes centre = int(np.floor(window.size/2))
which seems to have resolved the issue.
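To illustrate the workaround pattern in a runnable form (centre_minus_mean below is a hypothetical filter function, not the question's test_func), a minimal sketch assuming recent NumPy/SciPy:

```python
import numpy as np
from scipy import ndimage

def centre_minus_mean(window):
    # np.floor before the int() call sidesteps the coercion that
    # triggered SystemError in the affected Python/SciPy versions
    centre = int(np.floor(window.size / 2))
    return window[centre] - window.mean()

data = np.arange(25.0).reshape(5, 5)
# generic_filter calls the function with each flattened 3x3 neighbourhood
result = ndimage.generic_filter(data, centre_minus_mean, size=3)
print(result[2, 2])  # interior windows are symmetric around the centre, so ~0.0
```

On current library versions the original int() form also works, so this is only relevant when hitting the bug described above.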

Running embedded R gives a NameError

I'm a python newbie trying to follow along with this great blog on finding seasonal customers:
Python Code for Identifying Seasonal Customers
However, I am stuck on one of the last steps. The code is this:
customerTS = stats.ts(dataForOwner.SENDS.astype(int),
                      start=base.c(startYear, startMonth),
                      end=base.c(endYear, endMonth),
                      frequency=12)
I get this error: NameError: name 'dataForOwner' is not defined
Edit: I should add that the last line below is also in the code block, but I still get the error without including it:
customerTS = stats.ts(dataForOwner.SENDS.astype(int),
                      start=base.c(startYear, startMonth),
                      end=base.c(endYear, endMonth),
                      frequency=12)
r.assign('customerTS', customerTS)
I have googled quite a bit and am having no luck getting it to work.
NameError: name 'dataForOwner' is not defined
is raised by Python itself to indicate that it is unable to find an object called dataForOwner in the current context. To experience it yourself, just start a new Python session and type x (a variable name that does not exist).
The issue is either with the blog you refer to (the definition of dataForOwner is missing) or that definition was skipped by a user trying to reproduce the blog.
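The behaviour is easy to reproduce in any fresh interpreter:

```python
# A fresh interpreter has no binding for dataForOwner,
# so evaluating the bare name raises NameError.
try:
    dataForOwner
except NameError as exc:
    message = str(exc)

print(message)  # name 'dataForOwner' is not defined
```

So the fix is simply to locate (or write) the code that builds the dataForOwner DataFrame before this line runs.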

PyML: graphing the decision surface

PyML has a function for graphing decision surfaces.
First you need to tell PyML which data to use. Here I use a sparsevectordata with my feature vectors. This is the one I used to train my SVM.
demo2d.setData(training_vector)
Then you need to tell it which classifier you want to use. I give it a trained SVM.
demo2d.decisionSurface(best_svm, fileName = "dec.pdf")
However, I get this error message:
Traceback (most recent call last):
**deleted by The Unfun Cat**
demo2d.decisionSurface(best_svm, fileName = "dec.pdf")
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/PyML/demo/demo2d.py", line 140, in decisionSurface
results = classifier.test(gridData)
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/PyML/evaluators/assess.py", line 45, in test
classifier.verifyData(data)
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/PyML/classifiers/baseClassifiers.py", line 55, in verifyData
if len(misc.intersect(self.featureID, data.featureID)) != len(self.featureID) :
AttributeError: 'SVM' object has no attribute 'featureID'
I'm going to dive right into the source, because I have never used PyML. I tried to find it online, but I couldn't track down the verifyData method in the PyML 0.7.2 that was online, so I had to search through the downloaded source.
A classifier's featureID is only set in the baseClassifier class's train method (lines 77-78):
if data.__class__.__name__ == 'VectorDataSet' :
    self.featureID = data.featureID[:]
In your code, data.__class__.__name__ evaluates to "SparseDataSet" (or whatever other class you are using), the condition is False, and featureID is never set.
Then in demo2d.decisionSurface:
gridData = VectorDataSet(gridX)
gridData.attachKernel(data.kernel)
results = classifier.test(gridData)
Which tries to test your classifier using a VectorDataSet. In this instance classifier.test is equivalent to a call to the assess.test method which tries to verify if the data has the same features the training data had by using baseClassifier.verifyData:
def verifyData(self, data) :
    if data.__class__.__name__ != 'VectorDataSet' :
        return
    if len(misc.intersect(self.featureID, data.featureID)) != len(self.featureID) :
        raise ValueError, 'missing features in test data'
Which then tests the class of the passed data, which is now "VectorDataSet", and proceeds to try to access the featureID attribute that was never created.
Basically, it's either a bug, or a hidden feature.
Long story short, you have to convert your data to a VectorDataSet, because SVM.featureID is not set otherwise.
Also, you don't need to pass it a trained classifier; the function trains the classifier for you.
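The failure mode can be sketched in a few lines of plain Python; the classes below are illustrative stand-ins, not actual PyML code:

```python
class SparseDataSet:
    pass

class VectorDataSet:
    featureID = ['f0', 'f1']

class Classifier:
    def train(self, data):
        # mirrors baseClassifier.train: featureID is only stored
        # when the training data is a VectorDataSet
        if data.__class__.__name__ == 'VectorDataSet':
            self.featureID = data.featureID[:]

    def verifyData(self, data):
        # mirrors baseClassifier.verifyData: only VectorDataSet
        # test data is checked against featureID
        if data.__class__.__name__ != 'VectorDataSet':
            return
        self.featureID  # AttributeError: never set during train()

clf = Classifier()
clf.train(SparseDataSet())  # condition is False, featureID never set
try:
    clf.verifyData(VectorDataSet())
    failure = None
except AttributeError as exc:
    failure = str(exc)

print(failure)  # 'Classifier' object has no attribute 'featureID'
```

Training on one dataset class and testing on another slips through the class-name checks, which is exactly the gap the traceback above falls into.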
Edit:
I would also like to bring attention to the setData method:
def setData(data_) :
    global data
    data = data_
There is no type-checking at all. So someone could potentially set data to anything, e.g. an integer, a string, etc., which will cause an error in decisionSurface.
If you are going to use setData, you must use it carefully (only with a VectorDataSet), because the code is not as flexible as you would like.
