I am trying to use the pretrained SciBERT model (https://huggingface.co/allenai/scibert_scivocab_uncased) from Hugging Face to evaluate masked words in scientific/biomedical text for bias using CrowS-Pairs (https://github.com/nyu-mll/crows-pairs/). The CrowS-Pairs code works great with the built-in models like BERT.
I modified metric.py to add an option for using the SciBERT model:
import os
import csv
import json
import math
import torch
import argparse
import difflib
import logging
import numpy as np
import pandas as pd
from transformers import BertTokenizer, BertForMaskedLM
from transformers import AlbertTokenizer, AlbertForMaskedLM
from transformers import RobertaTokenizer, RobertaForMaskedLM
from transformers import AutoTokenizer, AutoModelForMaskedLM
and get the following error:
2021-06-21 17:11:38.626413: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
Traceback (most recent call last):
File "metric.py", line 15, in <module>
from transformers import AutoTokenizer, AutoModelForMaskedLM
ImportError: cannot import name 'AutoModelForMaskedLM' from 'transformers' (/usr/local/lib/python3.7/dist-packages/transformers/__init__.py)
Later in the Python file, AutoTokenizer and AutoModelForMaskedLM are used as follows:
tokenizer = AutoTokenizer.from_pretrained("allenai/scibert_scivocab_uncased")
model = AutoModelForMaskedLM.from_pretrained("allenai/scibert_scivocab_uncased")
Libraries
huggingface-hub-0.0.8
sacremoses-0.0.45
tokenizers-0.10.3
transformers-4.7.0
The error occurs with and without GPU support.
Try this:
tokenizer = BertTokenizer.from_pretrained("allenai/scibert_scivocab_uncased", do_lower_case=True)
model = BertForMaskedLM.from_pretrained("allenai/scibert_scivocab_uncased")
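SciBERT is a standard BERT checkpoint, so the BERT classes load it directly. As a sanity check, here is a minimal sketch (not the CrowS-Pairs code; the example sentence is made up) that scores the top predictions for a masked token:
import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("allenai/scibert_scivocab_uncased", do_lower_case=True)
model = BertForMaskedLM.from_pretrained("allenai/scibert_scivocab_uncased")
model.eval()

# Made-up example sentence; CrowS-Pairs supplies its own sentence pairs here.
text = f"the patient was diagnosed with {tokenizer.mask_token}."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Probabilities over the vocabulary at the masked position.
mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
top = logits[0, mask_pos].softmax(dim=-1).topk(5)
print(tokenizer.convert_ids_to_tokens(top.indices[0].tolist()))
Separately, AutoModelForMaskedLM does exist in transformers 4.7.0, so if that import still fails, printing transformers.__version__ and transformers.__file__ from the failing interpreter should reveal whether an older, shadowed installation is actually being loaded.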
Related
It seems that when I call a Python function from an XMLHttpRequest so that its output prints to the browser, the script errors out if any of these import statements (the ones commented out below) are run:
import traceback
import transformers
"""
from transformers import AutoTokenizer, AutoModel, AutoModelForSequenceClassification, TrainingArguments, Trainer
import evaluate
import torch
"""
import pandas as pd
import numpy as np
import datasets
from datasets import load_dataset, load_metric, load_from_disk, Value
import cgi, cgitb
print("Content-Type: text/html\n")
print("Hello!")
That is, if I keep them commented out, my hello statement shows in the browser; otherwise, I get an internal server error 500. Any help would be greatly appreciated!
I already tried suppressing warnings and logs!
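One way to narrow this down (a diagnostic sketch, not a fix; it assumes a CGI setup like the one described) is to surface the real traceback in the browser instead of the bare 500:
import cgitb
cgitb.enable()  # render any uncaught exception as an HTML page

print("Content-Type: text/html\n")
try:
    import transformers  # the heavy import that fails under the CGI user
    print("transformers", transformers.__version__, "loaded OK")
except Exception:
    import traceback
    print("<pre>" + traceback.format_exc() + "</pre>")
A frequent culprit is the web server running the script as a different user, or with a different Python, than the one where transformers was installed, and the surfaced traceback usually points straight at the missing package path.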
When I run datasets_utils.py from '/usr/local/lib/python3.7/dist-packages/torchtext/data/datasets_utils.py' in Google Colab, the following error occurs even with the most up-to-date versions of the Python packages:
ImportError: cannot import name 'functional_datapipe' from 'torch.utils.data' (/usr/local/lib/python3.7/dist-packages/torch/utils/data/__init__.py)
Are there any solutions to such errors? I could not find functional_datapipe even in the official torch.utils.data documentation. The following is an excerpt from datasets_utils.py in the Google Colab environment:
import functools
import inspect
import os
import io
import torch
from torchtext.utils import (
validate_file,
download_from_url,
extract_archive,
)
from torch.utils.data import functional_datapipe, IterDataPipe
from torch.utils.data.datapipes.utils.common import StreamWrapper
import codecs
It might be available only in torchdata.datapipes. An ImportError like this usually means the installed torch and torchtext versions are out of step, since each torchtext release expects a specific torch release.
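For example (a sketch assuming pip install torchdata with a torch release it supports), the decorator and base class import like this:
from torchdata.datapipes import functional_datapipe
from torchdata.datapipes.iter import IterDataPipe

@functional_datapipe("double")  # hypothetical registration name, for illustration
class DoublerIterDataPipe(IterDataPipe):
    def __init__(self, source_datapipe):
        self.source_datapipe = source_datapipe

    def __iter__(self):
        for x in self.source_datapipe:
            yield 2 * x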
I am trying to download a pretrained TensorFlow model. I am using the code below:
import numpy as np
import time
import PIL.Image
import IPython.display as display
import matplotlib.pylab as plt
import tensorflow as tf
import tensorflow_hub as hub
import datetime
from tensorflow.keras.preprocessing import image
from dateutil import parser
from keras.applications.inception_v3 import InceptionV3
model = InceptionV3()
model.summary()
I am getting the following error:
AttributeError: module 'dateutil' has no attribute 'parser'
I am using Python 3.7, TF 2.7, and python-dateutil 2.8.1.
Please help me fix this. Thank you :)
The correct import syntax is:
import dateutil.parser
and then:
parser.parse(time_string)
or:
from dateutil.parser import parse
parse(time_string)
Documentation: https://dateutil.readthedocs.io/en/stable/parser.html
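A minimal runnable check (the timestamp string is just an example):
from dateutil.parser import parse

dt = parse("2021-06-21 17:11:38")
print(dt.year, dt.month)  # 2021 6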
I have the following Python imports in a Jupyter Notebook.
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Dense, Activation
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.metrics import categorical_crossentropy
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.preprocessing import image
from tensorflow.keras.models import Model
from tensorflow.keras.applications import imagenet_utils
from sklearn.metrics import confusion_matrix
import itertools
import os
import shutil
import random
import matplotlib.pyplot as plt
%matplotlib inline
But I keep getting the following error:
ImportError: cannot import name 'imagenet_utils' from 'tensorflow.keras.applications' (C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\keras\api\_v2\keras\applications\__init__.py)
When I search for **cannot import name 'imagenet_utils' from 'tensorflow.keras.applications'** in Google, I don't get much helpful information.
Has anyone come across this at all?
Change
from tensorflow.keras.applications import imagenet_utils
to
from keras.applications import imagenet_utils
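For reference, a small sketch of what imagenet_utils provides once the import resolves (the input array and prediction vector are random stand-ins, not real data):
import numpy as np
from keras.applications import imagenet_utils

# Fake image batch standing in for real data; values in [0, 255].
x = np.random.uniform(0, 255, size=(1, 224, 224, 3)).astype("float32")
x = imagenet_utils.preprocess_input(x, mode="caffe")  # RGB->BGR, zero-center by ImageNet means

# decode_predictions maps a (1, 1000) probability vector to class labels.
fake_preds = np.random.dirichlet(np.ones(1000), size=1).astype("float32")
print(imagenet_utils.decode_predictions(fake_preds, top=3))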
I managed to solve my issue.
First I ran the following to update all modules:
conda update --all
Then I used 'from keras.applications import imagenet_utils'
instead of 'from tensorflow.keras.applications import imagenet_utils'.
This is literally all the code that I am trying to run:
from transformers import AutoModelWithLMHead, AutoTokenizer
import torch
tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-small")
model = AutoModelWithLMHead.from_pretrained("microsoft/DialoGPT-small")
I am getting this error:
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-14-aad2e7a08a74> in <module>
----> 1 from transformers import AutoModelWithLMHead, AutoTokenizer
      2 import torch
      3
      4 tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-small")
      5 model = AutoModelWithLMHead.from_pretrained("microsoft/DialoGPT-small")
ImportError: cannot import name 'AutoModelWithLMHead' from 'transformers' (c:\python38\lib\site-packages\transformers\__init__.py)
What do I do about it?
I solved it! Apparently AutoModelWithLMHead has been removed in my version.
Now you need to use AutoModelForCausalLM for causal language models, AutoModelForMaskedLM for masked language models, and AutoModelForSeq2SeqLM for encoder-decoder models.
So in my case the code looks like this:
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-small")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-small")
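For completeness, one conversational turn with that model, following the pattern on the DialoGPT model card (the prompt and generation settings are illustrative):
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-small")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-small")

# Encode one user turn, terminated by the EOS token as DialoGPT expects.
input_ids = tokenizer.encode("Does money buy happiness?" + tokenizer.eos_token, return_tensors="pt")

# Generate the reply; pad_token_id silences the missing-pad-token warning.
output_ids = model.generate(input_ids, max_length=100, pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(output_ids[0, input_ids.shape[-1]:], skip_special_tokens=True))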