fastText provides commands that can be run from the command prompt.
I have cloned the GitHub repository. For example, to change the parameters of the network for supervised learning, I use a command like
./fasttext supervised -input FT_Race_data.txt -output race_model -lr 0.4 -epoch 30 -loss hs
I am changing lr, epoch and loss. I can train and get the required output.
To do the same from a Python script, I installed the fasttext library and tried
classifier = fasttext.supervised('FT_Race_data.txt','race_model')
The model gets trained but the results are not good. In this case, I didn't define any parameters. So I tried
classifier = fasttext.supervised('FT_Race_data.txt','race_model', 0.4, 30, 'hs')
The program runs with no error but doesn't give any result. So I tried
classifier = fasttext.supervised(input = 'FT_Race_data.txt',output ='race_model', lr = 0.4,epoch= 30,loss = 'hs')
It gives an error saying that fasttext takes only two arguments.
How can I change the parameters in a Python script, as in the command prompt, to fine-tune the supervised learning?
For future reference: from the discussions here, it seems that pip install fasttext doesn't install the full set of features available in the repo.
So until the latest features are included in https://pypi.python.org/pypi/fasttext, follow the installation procedure outlined here to get Python bindings that can train models and set parameters.
git clone https://github.com/facebookresearch/fastText.git
cd fastText
pip install .
Then use train_supervised, a function that returns a model object. You can set the different parameters as in the following example from that repo.
fastText.train_supervised(input, lr=0.1, dim=100, ws=5, epoch=5, minCount=1, minCountLabel=0, minn=0, maxn=0, neg=5, wordNgrams=1, loss='softmax', bucket=2000000, thread=12, lrUpdateRate=100, t=0.0001, label='__label__', verbose=2, pretrainedVectors='')
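As a rough sketch of how the command-line flags from the question map onto these keyword arguments, the helper below turns a flag string such as "-lr 0.4 -epoch 30 -loss hs" into a dict that could be passed to train_supervised. The helper name cli_flags_to_kwargs and the small type table are made up for illustration; the parameter names come from the signature above, and the table covers only a subset of them.

```python
import shlex

# Types of a few train_supervised keyword arguments. The names come from the
# signature quoted above; this table is only an illustrative subset.
PARAM_TYPES = {
    "lr": float,
    "epoch": int,
    "dim": int,
    "wordNgrams": int,
    "minCount": int,
    "loss": str,
}


def cli_flags_to_kwargs(flags):
    """Convert '-lr 0.4 -epoch 30' style flags into a kwargs dict (hypothetical helper)."""
    tokens = shlex.split(flags)
    if len(tokens) % 2 != 0:
        raise ValueError("expected '-name value' pairs")
    kwargs = {}
    for name, value in zip(tokens[0::2], tokens[1::2]):
        key = name.lstrip("-")
        if key not in PARAM_TYPES:
            raise KeyError(f"unknown parameter: {key}")
        # Cast the string from the command line to the expected Python type.
        kwargs[key] = PARAM_TYPES[key](value)
    return kwargs


params = cli_flags_to_kwargs("-lr 0.4 -epoch 30 -loss hs")
print(params)

# With the bindings installed from the repo, the call would then look roughly
# like this (the module name depends on the installed version):
# model = fastText.train_supervised(input="FT_Race_data.txt", **params)
```

Passing the parameters by keyword, as here, avoids the silent mismatch that positional arguments caused in the question.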
I'm training a custom named entity recognition model. I created the config.cfg and train.spacy files. Among its other settings, the config uses en_core_web_lg as the pre-trained vectors:
[paths]
train = null
dev = null
vectors = "en_core_web_lg"
init_tok2vec = null
I then train the model using
!python -m spacy train config.cfg --output ./output --paths.train ./train.spacy --paths.dev ./train.spacy
This works and I can see the output model.
Then I want to train another NER model that has nothing to do with the previous one (same code different data) and I get this error:
Error: [E884] The pipeline could not be initialized because the vectors could not be found at 'en_core_web_lg'.
If your pipeline was already initialized/trained before, call 'resume_training' instead of 'initialize', or initialize only the components that are new.
It looks like it modified the base en_core_web_lg model, which can be a problem for me since I use it for different models, some fine-tuned and others just out of the box.
How can I train this NER model while making sure the downloaded en_core_web_lg model is not modified? And would this ensure that I can train several models without them interfering with each other?
When you use a model as a source of vectors, or for that matter a source for any other part of a pipeline, spaCy will not modify it under any circumstances. Something else is going on.
Are you perhaps using a different virtualenv? Does spacy.load("en_core_web_lg") work?
One thing that could be happening (but seems less likely) is that in some fields you can use either the name of an installed pipeline (resolved using entry points) or a local path. If you have a directory named en_core_web_lg in the place where you are training, that could be checked first.
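The two checks suggested above can be sketched with the standard library alone. The function name diagnose_vectors_name and the pipeline name used in the demo are made up for illustration, and the assumption is that the pipeline was installed as a regular Python package in the active environment.

```python
import tempfile
from importlib import metadata
from pathlib import Path


def diagnose_vectors_name(name, workdir="."):
    """Report whether `name` is an installed package and/or a local directory."""
    try:
        metadata.distribution(name)
        installed = True
    except metadata.PackageNotFoundError:
        installed = False
    return {
        "installed_package": installed,
        # A folder with the same name in the working directory could be
        # picked up as a local path instead of the installed pipeline.
        "local_directory": (Path(workdir) / name).is_dir(),
    }


# Demo with a throwaway directory and a made-up pipeline name.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "my_fake_pipeline").mkdir()
    demo = diagnose_vectors_name("my_fake_pipeline", workdir=tmp)
print(demo)

# In the training directory you would run:
# print(diagnose_vectors_name("en_core_web_lg"))
```

If installed_package comes back False in the environment where training fails, the virtualenv explanation is the more likely one.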
I'm trying to compare the benefits of word vectors trained on in-domain data to word vectors trained on non-domain specific data on spaCy NER.
I have built two word2vec models, each one trained on different text.
Then, I try to build and evaluate two spaCy models, each with a different txt file containing the word2vec embeddings.
This is the code that I use to initiate the model:
!python -m spacy init vectors en models/w2v/merged/word2vec_merged_w2v.txt models/w2v/merged --name en_test
It runs successfully, and the two runs create different numbers of vectors.
1st model:
Creating blank nlp object for language 'en'
[+] Successfully converted 28093 vectors
[+] Saved nlp object with vectors to output directory. You can now use the path
to it in your config as the 'vectors' setting in [initialize].
C:\Users\Ruben Weijers\models\w2v\merged
2nd model:
Creating blank nlp object for language 'en'
[+] Successfully converted 34712 vectors
[+] Saved nlp object with vectors to output directory. You can now use the path
to it in your config as the 'vectors' setting in [initialize].
C:\Users\Ruben Weijers\models\w2v\quine
The vectors are different. I then try to load and train using the following code:
nlp = spacy.load("models/w2v/merged")
nlp.add_pipe('ner')
nlp.to_disk("models/w2v/merged")
#train model
!python -m spacy train models/w2v/merged/config.cfg --output models/w2v/merged --paths.train data/train2.spacy --paths.dev data/test2.spacy --paths.vectors models/w2v/merged
I do the same for the other model, pointing to the other vector file of course.
However, the training pipeline shows that both models have exactly the same precision, recall and loss rate throughout the learning process.
Also, when I call:
#evaluate model
!python -m spacy evaluate models/w2v/merged/model-best data/val2.spacy --output models/w2v/merged/metrics.json
Both models have the same performance metrics, while the vectors are totally different.
I have looked at several videos on how to set the vectors path, and I have added the paths to the vectors to the config files, on top of adding
--paths.vectors models/w2v/merged
None of this seems to help. Many videos show how to implement word2vec, yet don't evaluate it. I'm curious why both word2vec models appear to be exactly the same. It doesn't make sense. I have checked multiple times that the paths are correct and that the files are in the correct place. That doesn't seem to be the issue, since the different counts reported when the vectors were created also show that the vector files are different.
I have created the word vectors using:
from gensim.models import Word2Vec

def train_w2v_model(model_name):
    # `sentences` is the tokenized corpus, defined elsewhere in the script
    w2v_model = Word2Vec(min_count=5,
                         window=2,
                         vector_size=500,
                         sample=6e-5,
                         alpha=0.03,
                         min_alpha=0.0007,
                         negative=20)
    w2v_model.build_vocab(sentences)
    w2v_model.train(sentences, total_examples=w2v_model.corpus_count, epochs=30)
    # Save the full model and a plain-text export of the vectors
    w2v_model.save(f"downloads/{model_name}.model")
    w2v_model.wv.save_word2vec_format(f"downloads/word2vec_{model_name}.txt")
I know it's been a long time since this question was asked, but I ran into the same problem and found the solution, so I thought sharing it could be useful for somebody else.
Instead of
nlp.add_pipe('ner')
you should do this:
!python -m spacy init config -p ner -o accuracy config.cfg
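One way to check whether a given config wires the vectors in at all is to look for the relevant settings. The sketch below reads a config with Python's configparser, on the assumption that spaCy's config format is close enough to it for simple lookups, and reports every vectors entry plus any include_static_vectors setting. My understanding is that a config generated with -o accuracy switches that setting on, whereas a pipeline assembled with nlp.add_pipe('ner') may never read the static vectors; treat this as a hypothesis to verify against your own config.cfg. The function name and the sample config text are made up for the demo.

```python
import configparser


def vectors_settings(config_text):
    """Collect 'vectors' and 'include_static_vectors' entries per section."""
    # interpolation=None keeps values such as ${paths.vectors} as raw text.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    report = {"vectors": {}, "include_static_vectors": {}}
    for section in parser.sections():
        for key, value in parser[section].items():
            if key == "vectors":
                report["vectors"][section] = value.strip('"')
            elif key == "include_static_vectors":
                report["include_static_vectors"][section] = (
                    value.strip().lower() == "true"
                )
    return report


# Made-up sample; a real config.cfg has many more sections.
SAMPLE = """
[paths]
vectors = "models/w2v/merged"

[initialize]
vectors = ${paths.vectors}

[components.tok2vec.model.embed]
include_static_vectors = false
"""

report = vectors_settings(SAMPLE)
print(report)
```

For a real file, pass the text of models/w2v/merged/config.cfg instead of SAMPLE. If no section reports include_static_vectors as True, two models trained with different vector files would be expected to behave identically.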
At the beginning of the year I started to look into Tensorflow and ML.
A friend of mine provided me with a code sample to train some classes for custom objects using FASTER_RCNN_RESNET_101. Because I had no GPU and training was taking quite some time on my machine, he ran the training on his machine and gave me the files I needed for the image detection code.
Meanwhile I have bought a GPU and tried to do all the steps on my own machine.
I am following the same steps I did with him, but now when I run the detection after training, no objects are detected.
The step for which I have no documentation is the one that generates the .pb file.
I tried to use models\research\object_detection\export_inference_graph.py to generate the file like:
python models\research\object_detection\export_inference_graph.py --input_type image_tensor --pipeline_config_path data\pipeline_vio.config --trained_checkpoint_prefix data\train_folder\model.ckpt-497416 --output_directory train_folder\exported_graphs
I encounter a problem:
Using the .pb file I generated, no object is recognized, but if I use the file he provided, the objects are recognized.
Last time I did the training using python train.py --logtostderr --train_dir=data\train_folder --pipeline_config_path=data\pipeline_vio.config
Is there a way to see the difference between the two graphs, or something else that would help me understand why one of the graphs works but not the other?
I can't figure out for the life of me how to generate text from the default model feeding in a prefix:
I have downloaded the model and here is my code:
import gpt_2_simple as gpt2
model_name = "124M"
sess = gpt2.start_tf_sess()
gpt2.generate(sess, model_name=model_name)
gpt2.generate(sess, model_name=model_name, prefix="<|My name is |>")
However, when I run it I get the following error:
tensorflow.python.framework.errors_impl.FailedPreconditionError: 2 root error(s) found.
(0) Failed precondition: Attempting to use uninitialized value model/h3/mlp/c_proj/w
[[{{node model/h3/mlp/c_proj/w/read}}]]
[[strided_slice/_33]]
(1) Failed precondition: Attempting to use uninitialized value model/h3/mlp/c_proj/w
[[{{node model/h3/mlp/c_proj/w/read}}]]
Any idea what I'm doing wrong?
You are trying to generate without loading parameters first.
It seems that the downloaded models are used for training ("finetuning") but they are not loaded for generation.
For generation, the library tries to run a previously saved Tensorflow model ("checkpoints" in TF terminology).
Finetuning
You can generate a checkpoint by training the model for a few epochs using your own dataset (or working from the dataset published by the researchers).
Otherwise, gpt-2-simple makes it easy. Get a text file with some text and train on it:
gpt_2_simple --sample_every 50 finetune yourtext.txt
Let it run for a few epochs and have a look at the result samples. A checkpoint will be saved every 100 epochs. Once you are happy, hit CTRL+C and it will save a last checkpoint.
You can then generate text using:
gpt_2_simple generate --prefix "Once upon a time" --nsamples 5
The gpt_2_simple tool accepts a -h argument for help. Have a look at the other options. Using the library from code is similar to this tool workflow.
Generating without finetuning
The author explains in this GitHub question the procedure to skip finetuning entirely. Just copy the model to the checkpoint directory (you need to download the model first, have a look at that link):
mkdir -p checkpoint/
cp -r models/345M checkpoint/run1
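The same copy step can be sketched in Python, which may be handy on Windows where mkdir -p and cp -r are not available. The function name copy_model_to_checkpoint is made up, the directory names follow the commands above, and the demo runs on a throwaway folder with a placeholder file rather than a real model.

```python
import shutil
import tempfile
from pathlib import Path


def copy_model_to_checkpoint(model_name, models_dir="models",
                             checkpoint_dir="checkpoint", run_name="run1"):
    """Copy a downloaded model folder to checkpoint/<run_name> (hypothetical helper)."""
    src = Path(models_dir) / model_name
    dst = Path(checkpoint_dir) / run_name
    if not src.is_dir():
        raise FileNotFoundError(f"model not downloaded: {src}")
    # Equivalent of: mkdir -p checkpoint/
    dst.parent.mkdir(parents=True, exist_ok=True)
    # Equivalent of: cp -r models/<name> checkpoint/run1
    # Note: copytree raises if the destination already exists.
    shutil.copytree(src, dst)
    return dst


# Demo on a temporary folder standing in for the project directory.
with tempfile.TemporaryDirectory() as tmp:
    fake_model = Path(tmp) / "models" / "124M"
    fake_model.mkdir(parents=True)
    (fake_model / "placeholder.txt").write_text("stand-in for model files")
    dst = copy_model_to_checkpoint(
        "124M",
        models_dir=Path(tmp) / "models",
        checkpoint_dir=Path(tmp) / "checkpoint",
    )
    copied_files = sorted(p.name for p in dst.iterdir())
print(dst.name, copied_files)
```

Use the name of the model you actually downloaded (124M in the question, 345M in the commands above). After the copy, the weights still have to be loaded into the session before calling gpt2.generate; in the library that is what gpt2.load_gpt2(sess) is for, though the exact arguments are best checked in its README.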
When training a Tensorflow object detection model, I want to view the mean average precision (mAP) of my test data. Running train.py and eval.py in two consoles, I can view the loss rates of the training data, and even the objects detected in the test set images through
tensorboard --logdir=model_dir/
However, no precision scalars are displayed for the test set.
I am using Python 3 on Windows 10, and I successfully installed pycocotools using:
pip install git+https://github.com/philferriere/cocoapi.git#subdirectory=PythonAPI
Cheers
I thought I had this problem, hence the upvote. But I simply wasn't waiting long enough. Leave the eval job running for at least a couple of hours alongside the train job.
@Graham Monkman, is it a must for the eval and train jobs to run together so that I can get the evaluation metrics (mAP, etc.)?