Using huggingface fill-mask pipeline to get more than 5 suggestions

The code below gives me 5 suggestions for the masked token, but I'd like to get 10 suggestions. Does anyone know if this is possible with Hugging Face?
!pip install -q transformers
from __future__ import print_function
import ipywidgets as widgets
from transformers import pipeline
nlp_fill = pipeline('fill-mask')
nlp_fill("I am going to guess <mask> in this sentence")

I would like to add that in newer versions of transformers the parameter was renamed from topk to top_k.
It can be passed to each individual call of nlp_fill as well as to the pipeline constructor.

This is an unfortunate shortcoming of the "under construction" documentation.
If you look closely at the parameters of FillMaskPipeline (which is what pipeline('fill-mask') constructs),
you will find that it has a topk=5 parameter, which you can set to a value of your liking by specifying it in the pipeline constructor:
from transformers import pipeline
nlp_fill = pipeline('fill-mask', topk=10)
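
For newer versions of transformers, where the parameter is named top_k, a minimal sketch of a per-call version might look like the following. The names top_suggestions and run_demo are made up for illustration, and the sketch assumes that the pipeline returns a list of dicts with token_str and score keys when the input contains a single mask.

```python
def top_suggestions(fill_mask, text, k=10):
    """Return the k most likely fillers for the mask token in `text`.

    `fill_mask` is any callable with the fill-mask pipeline interface:
    it takes a string plus a `top_k` keyword and returns a list of dicts
    that contain "token_str" and "score" (assumed output layout).
    """
    results = fill_mask(text, top_k=k)
    return [(r["token_str"].strip(), r["score"]) for r in results]


def run_demo():
    # Imported lazily so the helper above works without transformers installed.
    from transformers import pipeline

    nlp_fill = pipeline("fill-mask")  # downloads the default model on first use
    sentence = "I am going to guess <mask> in this sentence"
    for token, score in top_suggestions(nlp_fill, sentence, k=10):
        print(f"{token}\t{score:.4f}")
```

Calling run_demo() in a notebook cell should print ten candidate tokens with their scores, provided the installed version accepts top_k.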

Related

Error in importing jaccard_similarity_score

I am trying to import 'jaccard_similarity_score' from the 'sklearn' package, but I am unable to do so. When I run the cell in a Jupyter Notebook, I get an error. I tried restarting the kernel (as suggested in one of the Stack Overflow posts), but that didn't work for me. I've attached a screenshot of the error.
Any help is appreciated. Thanks in advance.

In recent versions of sklearn, this function has been renamed to 'jaccard_score'.

The import has changed in recent scikit-learn releases.
Instead of writing:
from sklearn.metrics import jaccard_similarity_score
you should write:
from sklearn.metrics import jaccard_score
Note: pos_label defaults to 1, so it must be set explicitly when the labels are strings, for example:
jaccard_score(y_test, dt_yhat, pos_label="PAIDOFF")
In this dataset, the valid labels for pos_label are: array(['COLLECTION', 'PAIDOFF'], dtype='<U10')
References :
https://scikit-learn.org/stable/modules/generated/sklearn.metrics.jaccard_score.html#sklearn.metrics.jaccard_score
https://github.com/DiamondLightSource/SuRVoS/issues/103#issuecomment-731122304

Instead of importing jaccard_similarity_score, you should import jaccard_score. Keep in mind that jaccard_score takes an additional argument, pos_label, which selects the class treated as positive.
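
To make the role of pos_label concrete, here is a small pure-Python sketch of the quantity a binary Jaccard score measures for one chosen positive class. The function binary_jaccard and the toy labels are made up for illustration; this is not scikit-learn's implementation. As far as I recall, the old jaccard_similarity_score behaved like accuracy for binary and multiclass input, so the numbers may differ after switching.

```python
def binary_jaccard(y_true, y_pred, pos_label):
    """Jaccard index of the positive class: TP / (TP + FP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == pos_label and p == pos_label for t, p in pairs)
    fp = sum(t != pos_label and p == pos_label for t, p in pairs)
    fn = sum(t == pos_label and p != pos_label for t, p in pairs)
    denom = tp + fp + fn
    # Return 0.0 when the positive class never appears at all.
    return tp / denom if denom else 0.0


# Toy data (hypothetical), using the label names from the answer above.
y_test = ["PAIDOFF", "PAIDOFF", "COLLECTION", "PAIDOFF"]
dt_yhat = ["PAIDOFF", "COLLECTION", "COLLECTION", "PAIDOFF"]

score = binary_jaccard(y_test, dt_yhat, pos_label="PAIDOFF")
print(score)

# With scikit-learn installed, the corresponding library call would be:
#   from sklearn.metrics import jaccard_score
#   jaccard_score(y_test, dt_yhat, pos_label="PAIDOFF")
```

Choosing the other class as pos_label generally gives a different value, which is why the argument matters for string labels.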

'NearMiss' object has no attribute '_validate_data'

The code below produces the error:
from imblearn.under_sampling import NearMiss
nm = NearMiss()
X_res,y_res=nm.fit_sample(X,Y)

You are probably trying to undersample your imbalanced dataset. For this purpose, you can use RandomUnderSampler instead of NearMiss.
Try the following code:
from imblearn.under_sampling import RandomUnderSampler
under_sampler = RandomUnderSampler()
X_res, y_res = under_sampler.fit_resample(X, Y)
Now your dataset is balanced. If y_res is a pandas Series, you can verify this with y_res.value_counts().
Cheers!

Instead of the "imblearn" package, my conda installed a package named "imbalanced-learn", which is why it does not accept the data. It is strange, though, that the Jupyter notebook doesn't tell me that "imblearn" isn't installed.
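
For context, the distribution is published as imbalanced-learn while the module you import is imblearn, so the two names refer to the same package. One plausible cause of the '_validate_data' error (an assumption, not confirmed in this thread) is a version mismatch between imbalanced-learn and scikit-learn. A quick way to check what is installed, using only the standard library; the helper name installed_version is made up for illustration:

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Distribution names as published on PyPI / conda, not the import names.
for dist in ("imbalanced-learn", "scikit-learn"):
    print(dist, installed_version(dist))
```

If the two versions look far apart, upgrading both together is a reasonable first thing to try.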

No module named 'sklearn.utils.linear_assignment_'

I am trying to run a project from GitHub; all of the object counter applications use the SORT algorithm. I can't run any of them because of a specific error (screenshot attached). Can anyone help me fix this issue?

The linear_assignment function was deprecated in scikit-learn 0.21 and removed in 0.23, but sklearn.utils.linear_assignment_ can be replaced by scipy.optimize.linear_sum_assignment.
You can use:
from scipy.optimize import linear_sum_assignment as linear_assignment
The existing calls to linear_assignment then resolve by name. Note that the return format differs: linear_sum_assignment returns a (row_ind, col_ind) tuple rather than an array of index pairs.

Alternatively, pin scikit-learn to a release that still ships the module:
pip install scikit-learn==0.22.2

As yiakwy points out in a GitHub comment, scipy.optimize.linear_sum_assignment is not a perfect replacement:
I am concerned that linear_sum_assignment is not equivalent to linear_assignment which later implements "maximum values" matching strategy not "complete matching" strategy, i.e. in tracking problem maybe an old landmark lost and a new detection coming in. We don't have to make a complete assignment, just match as more as possible.
I found this out while trying to use it inside SORT-based YOLO tracking code, which that replacement broke (I was lucky that it did; otherwise I would have gotten wrong results from the experiments without realising it).
Instead, I suggest copying the module from the last sklearn release that still contains it and including it as a module in your code:
https://github.com/scikit-learn/scikit-learn/blob/0.22.X/sklearn/utils/linear_assignment_.py
For instance, if you copy this file into a utils directory, import it with from utils.linear_assignment_ import linear_assignment

Solution
Use pip to install lap and optionally scipy
Uncomment the import and use the following function
import numpy as np

def linear_assignment(cost_matrix):
    try:
        # Prefer the lap package when it is installed.
        import lap
        _, x, y = lap.lapjv(cost_matrix, extend_cost=True)
        return np.array([[y[i], i] for i in x if i >= 0])
    except ImportError:
        # Fall back to SciPy and convert its two index arrays into rows of pairs.
        from scipy.optimize import linear_sum_assignment
        x, y = linear_sum_assignment(cost_matrix)
        return np.array(list(zip(x, y)))

You are getting this error because you haven't installed the scikit-learn module yet.
Install scikit-learn from https://pypi.org/project/scikit-learn/

Name 'RandomUnderSampler' is not defined

I'm trying to use RandomUnderSampler. I have correctly installed the imblearn module, but I still get the error "Name 'RandomUnderSampler' is not defined". Is there any specific reason for this? Can someone please help?
from imblearn.under_sampling import RandomUnderSampler
# Random under-sampling and over-sampling with imbalanced-learn
def random_under_sampling(X, Y):
    rus = RandomUnderSampler(return_indices=True)
    X_rus, y_rus, id_rus = rus.fit_sample(X, Y)
    print('Removed indexes:', id_rus)
    plot_2d_space(X_rus, y_rus, 'Random under-sampling')
[Screenshots in the original post: the method definition and the place where the method is called]

Since it seems that you are using IPython, it is important that you first execute the line that imports from the imblearn library (e.g. with Ctrl-Enter):
from imblearn.under_sampling import RandomUnderSampler
After that, the module should be imported and the name RandomUnderSampler will be defined.
If this does not work, could you reload the notebook and execute all the statements up to the random_under_sampling function to ensure nothing was missed?
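
Separately from the import problem, recent imbalanced-learn releases, as far as I know, no longer accept return_indices=True and no longer provide fit_sample. A hedged sketch of the newer pattern follows, assuming the fit_resample method and the sample_indices_ attribute; the names undersample_with_indices and run_demo are made up for illustration.

```python
def undersample_with_indices(sampler, X, y):
    """Resample (X, y) and also return the indices of the kept samples.

    `sampler` is expected to follow the imbalanced-learn sampler interface:
    a fit_resample(X, y) method and, after fitting, a sample_indices_
    attribute (assumed to be available on RandomUnderSampler).
    """
    X_res, y_res = sampler.fit_resample(X, y)
    return X_res, y_res, sampler.sample_indices_


def run_demo(X, y):
    # Imported lazily so the helper above can be reused with any sampler.
    from imblearn.under_sampling import RandomUnderSampler

    rus = RandomUnderSampler(random_state=0)
    X_rus, y_rus, kept = undersample_with_indices(rus, X, y)
    print("Kept indexes:", kept)
    return X_rus, y_rus
```

Note that sample_indices_ lists the samples that were kept, not the ones that were removed.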

Tensorrt Plugin and caffe parser in python

I am new to TensorRT and not very familiar with C either. Is there any example that imports a Caffe model (caffeparser) and uses a plugin at the same time from Python? Plugin library example: "https://docs.nvidia.com/deeplearning/sdk/tensorrt-api/c_api/_nv_infer_plugin_8h_source.html".
I saw an example doing something like the code below. Is it necessary to modify the pluginfactory class, or has that already been done in the Python plugin API?
import tensorrt as trt
import tensorrtplugins
from tensorrt.plugins import _nv_infer_plugin_bindings as nvinferplugin
from tensorrt.parsers import caffeparser
plugin_factory = tensorrtplugins.FullyConnectedPluginFactory()
parser = caffeparser.create_caffe_parser()
parser.set_plugin_factory(plugin_factory)
engine = trt.utils.caffe_to_trt_engine(G_LOGGER,
                                       MODEL_PROTOTXT,
                                       CAFFE_MODEL,
                                       1,
                                       1 << 20,
                                       OUTPUT_LAYERS,
                                       trt.infer.DataType.FLOAT,
                                       plugin_factory
                                       )
P.S. I am trying to convert YOLO2 to TensorRT format, so some layers (e.g. kYOLOREORG and kPRELU) can only be supported through the plugin.
Another way to do this might be to add the plugin while constructing the network, using the method network.add_plugin_ext(). However, I am not sure how to specify the previous layer that is going to be imported later.
Thank you so much for your answer. Your help will be much appreciated!
