Overall goal:
I used some Python code from mmpose which can identify animals in a picture, and then deduce their pose. Great. Now, my goal is to be able to bring this to the browser with TensorFlow.js. I understand this question might require many steps.
What I've managed so far:
I used the file top_down_img_demo_with_mmdet.py which came in the demo/ folder of mmpose. Detecting objects works like a charm, the key line being mmdet_results = inference_detector(det_model, image_name) (from mmdet.apis) which returns bounding boxes of what's found. Next, it runs inference_top_down_pose_model (from mmpose.apis) which returns an array of all the coordinates of key points on the animal. Perfect. From there, it draws out to a file. Now, shifting over to TensorFlow.js, I've included their COCO-SSD model, so I can get bounding boxes of animals. Works fine.
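For reference, here is a rough sketch of that pipeline as I understand it from the demo script; the config/checkpoint paths, the device and the test image are placeholders, and the helper names may differ slightly between mmpose versions:
from mmdet.apis import init_detector, inference_detector
from mmpose.apis import (init_pose_model, inference_top_down_pose_model,
                         process_mmdet_results)

image_name = 'some_animal.jpg'  # placeholder test image
det_model = init_detector('det_config.py', 'det_checkpoint.pth', device='cpu')
pose_model = init_pose_model('pose_config.py',
                             'hrnet_w32_animalpose_256x256-1aa7f075_20210426.pth',
                             device='cpu')

# Step 1: object detection, returns bounding boxes of everything found
mmdet_results = inference_detector(det_model, image_name)
animal_results = process_mmdet_results(mmdet_results, cat_id=1)

# Step 2: top-down pose estimation inside those boxes, returns keypoint coordinates
pose_results, _ = inference_top_down_pose_model(
    pose_model, image_name, animal_results, format='xyxy')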
What I need help with:
As I understand it, to use the (large) .pth file used for animal pose identification, it must be ported to another format (.pt, maybe with an intermediate ONNX step) and then loaded as a model in TensorFlow.js, where it can run its pose-detection magic in-browser. Two problems: 1) most instructions seem to expect me to know details about the model, which I don't. Kernel size? Stride? Do I need this info? If so, how do I get it? 2) it's honestly not clear what my real end goal should be. If I end up with a .pt file, is it a simple few lines to load it as a model in TensorFlow.js and run an image through it?
TL;DR: I've got a working Python program that finds animal pose using a big .pth file. How do I achieve the same in-browser (e.g. with TensorFlow.js)?
What didn't work
This top answer does not run, since "model" is not defined. Adding model = torch.load('./hrnet_w32_animalpose_256x256-1aa7f075_20210426.pth') still failed with AttributeError: 'dict' object has no attribute 'training' (see the sketch after this list).
This GitHub project spits out a tiny saved_model.pb file, less than 0.1% the size of the .pth file, so that can't be right.
This answer produced a huge wall of text, with array values running off my screen, which turned out to be the weights anyway, not a new model file.
This article expects me to know the structure of the model.
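For what it's worth, my current best guess (based on the AttributeError above) is that the .pth is just a state-dict checkpoint, so the network has to be rebuilt from its mmpose config before any export can happen. Something like the sketch below is what I would try next; the config filename, the input size and the forward_dummy swap are assumptions on my part, and I still don't know whether the resulting ONNX file can then be converted for TensorFlow.js:
import torch
from mmpose.apis import init_pose_model

# Rebuild the network from its config and load the checkpoint weights into it
# (the config filename is a guess; it must be the one matching this checkpoint)
pose_model = init_pose_model('hrnet_w32_animalpose_256x256.py',
                             'hrnet_w32_animalpose_256x256-1aa7f075_20210426.pth',
                             device='cpu')

# Swap in a plain tensor-in/tensor-out forward for tracing, assuming this
# mmpose version provides forward_dummy
pose_model.forward = pose_model.forward_dummy

# Export with a dummy input matching the 256x256 input size in the checkpoint name
dummy_input = torch.randn(1, 3, 256, 256)
torch.onnx.export(pose_model, dummy_input, 'animalpose_hrnet.onnx', opset_version=11)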
Thank you all. Honestly, even comments about apparent misunderstandings I have about the process would be very valuable to me. Cheers.
Chris
Background
I am working on a project where I need to do coreference resolution on a lot of text. In doing so I've dipped my toe into the NLP world and found AllenNLP's coref model.
In general I have a script where I use pandas to load in a dataset of "articles" to be resolved and pass those articles to the Predictor object created with Predictor.from_path(). Because of the large number of articles that I want to resolve, I'm running this on a remote cluster (though I don't believe that is the source of this problem, as it also occurs when I run the script locally). That is, my script looks something like this:
from allennlp.predictors.predictor import Predictor
import allennlp_models.tagging
import pandas as pd
print("HERE TEST")
def predictorFunc(article):
    # Load the pre-trained SpanBERT coref model and resolve one article
    predictor = Predictor.from_path("https://storage.googleapis.com/allennlp-public-models/coref-spanbert-large-2021.03.10.tar.gz")
    resolved_object = predictor.predict(document=article)
    ### Some other interrogation of the predicted clusters ###
    return resolved_object['document']
df = pd.read_csv('articles.csv')
### Some pandas magic ###
resolved_text = predictorFunc(article_pre_resolved)
The Problem
When I execute the script the following message is printed to my .log file before anything else (for example the print("HERE TEST") that I included) -- even before the predictor object itself is called:
Some weights of BertModel were not initialized from the model checkpoint at SpanBERT/spanbert-large-cased and are newly initialized: ['bert.pooler.dense.weight', 'bert.pooler.dense.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
I understand that this message itself is to be expected as I'm using a pre-trained model, but when this message appears it completely locks up the .log file (nothing else gets printed until the script ends and everything gets printed at once). This has been deeply problematic for me as it makes it almost impossible to debug other parts of my script in a meaningful way. (It will also make tracking the final script's progress on a large dataset very difficult... :'( ) Also, I would very much like to know why the predictor object appears to be loading even before it gets called.
Though I can't tell for sure, I also think that whatever is causing this is also causing runaway memory use (even for toy examples of just a single 'article' (a couple hundred words as a string)).
Has anyone else had this problem/know why this happens? Thanks very much in advance!
I think I figured out two competing and unrelated problems in what I was doing. First, the reason for the unordered printing had to do with SLURM: using the --unbuffered option fixed the printing problem and made diagnosis much easier. The second problem (which looked like runaway memory usage) had to do with a very long article (approx. 10,000 words) that was just over the maximum length the Predictor can handle. I'm going to close this question now!
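For anyone hitting the same buffering issue without access to SLURM's --unbuffered option, forcing flushes at the Python level should have the same effect (reconfigure requires Python 3.7+):
import sys

# Switch stdout to line buffering so each print shows up in the .log immediately;
# running the script with "python -u" is another option
sys.stdout.reconfigure(line_buffering=True)

print("HERE TEST", flush=True)  # flush=True also works on individual prints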
While replicating the implementation of the NGNN "Dressing as a Whole" paper, I am stuck on one pickle file that is required to progress further, namely fill_in_blank_1000_from_test_score.pkl.
Can someone help by sharing this file, or an alternative to it?
The GitHub implementation doesn't contain it:
https://github.com/CRIPAC-DIG/NGNN
You're not supposed to use main_score.py (which requires the fill_in_blank_1000_from_test_score.pkl pickle); it's obsolete, but the authors fail to mention this in the README. The problem was raised in this issue. Long story short, use another "main": main_multi_modal.py.
One of the comments explains in detail how to proceed; I will copy it here so that it does not get lost:
1. Download the pre-processed dataset from the authors (the one on Google Drive, around 7 GB).
2. Download the "normal" dataset in order to get the text for the images (the one on GitHub, only a few MBs).
3. Change all the folder paths in the files to your corresponding ones.
4. Run "onehot_embedding.py" to create the textual features (the rest of the pre-processing was already done by the authors).
5. Run "main_multi_modal.py" to train. At the end of the file you can adjust the config of the network (Beta, d, T etc.), so the file "Config.py" is useless here.
6. If you want to train several instances in the for-loop, you need to reset the graph at the beginning of the training. Just add "tf.reset_default_graph()" at the start of the function "cm_ggnn()", as sketched below.
With this setup, I could reproduce the results fairly well, with the same accuracy as in the paper.
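A minimal sketch of that last step (the argument list and body of cm_ggnn() are placeholders here; only the reset line is the actual addition):
import tensorflow as tf

def cm_ggnn(config):
    # Reset the default graph so that training several instances in a loop
    # does not keep accumulating ops from previous runs (TF1-style API)
    tf.reset_default_graph()
    # ... rest of the original training code from main_multi_modal.py ...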
Sometimes, without a specific pattern (meaning it sometimes happens, sometimes not, with the same .jpg pictures as input), the following error is raised:
AssertionError: Image is not a np.ndarray
This happens after loading pictures in the usual way:
imgcv = cv2.imread(image_path)
and then simply trying to make predictions with a pre-trained model, or to plot the image.
Specifically, the picture is not loaded as an np.ndarray with three dimensions, e.g. (700, 700, 3). Instead, it is stored as a NoneType object from the builtins module.
What could be the reason for this error?
I am currently using:
print(cv2.__version__)
'4.0.0'
Best guess: file system issue. cv2.imread(fn) returns None when the file is not found.
I have analysis code that sometimes fails when analyzing videos stored on Synology boxes (i.e., NAS) that tend to go into sleep mode and then wake up too slowly, giving a "file not found" when I first run the analysis; when I re-run it, things work fine. Similar problems are less likely on local disks or SSDs, but I would not be surprised to see them on VMs, highly loaded machines, or in case a disk is going bad...
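A small defensive-loading sketch along those lines (the path is a placeholder): since cv2.imread() returns None silently instead of raising, checking right after the call turns the confusing assertion later on into an immediate, explicit error:
import os
import cv2

image_path = 'some_picture.jpg'  # placeholder
imgcv = cv2.imread(image_path)
if imgcv is None:
    # imread does not raise on failure, so raise something informative ourselves
    raise FileNotFoundError(
        f"cv2.imread returned None for {image_path!r} "
        f"(file exists on disk: {os.path.exists(image_path)})")
print(imgcv.shape)  # e.g. (700, 700, 3)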
After retraining my model on TensorFlow by following the method in the tutorial video by Siraj Raval,
https://www.youtube.com/watch?v=QfNvhPx5Px8
I encountered a problem when I finally tested my test image: it generated two errors, as seen in the screenshot.
There are two errors, a TypeError and a KeyError, and the root cause of both is probably DecodeJpeg/Contents:0.
If anyone can explain the errors and how to resolve them, that would be really helpful.
DecodeJpeg/Contents:0 is supposed to be a tensor, and you want to feed data to it, so you consider it an input. The problem is that it doesn't exist, which probably means you made a small mistake in the naming.
Run this before the sess.run(something, {"DecodeJpeg/Contents:0": something}) call:
tf.summary.FileWriter("name_of_a_folder", sess.graph)
This will generate a log file in that folder. Then run in the CLI:
tensorboard --logdir /name/to/that/folder/
and open your browser on the link provided in the CLI. Now you can see the graph and check the real name of the tensor. If you still have problems, feel free to share the graph image, or ask away.
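If you would rather not open TensorBoard at all, the operation names can also be listed straight from the graph (sess here is the same session as in your sess.run call):
# Print every operation name in the graph; the tensor you feed is usually
# the matching op name with ":0" appended
for op in sess.graph.get_operations():
    print(op.name)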
OK, I have a question about how to lay out code efficiently.
I have a model written in python which generates results which I use to produce graphs in matplotlib. As written, the model is contained within a single file, and I have 15 other run-files, which call on it with complicated configurations and produce graphs. It takes a while to go through and run each of these run-files, but since they all use substantially different settings for the model, I need to have complicated setup files anyway, and it all works.
I have the output set up for figures which could go in an academic paper. I have now realised that I am going to need each of these figures again in other formats - one for presentations (low dpi, medium size, different font) and one for a poster (high dpi, much bigger, different font again.)
This means I could potentially have 45-odd files to wade through every time I want to make a change to my model. I would also have to cut and paste a lot of boilerplate matplotlib code with minor alterations (each run-file would become 3 different files, one for each graph).
Can anybody explain to me how (and if) I could speed things up? At the moment, I think it's taking me much longer than it should.
As I see it there are 3 main options:
Set up 3 run-files for each actual model run (so duplicating a fair amount, and running the model more often than I need), but I can then tweak everything independently (at the risk of missing something important).
Add another layer - so save the results as .csv or equivalent and then read them into the files for producing graphs. This means more files, but I only have to run the model once per 3 graphs (which might save some time).
Keep the graph and model parameter files integrated, but add another file which sets up graphing templates, so every time I run the file it spits out 3 graphs. It might speed things up a bit, and will certainly keep the number of files down, but they will get very big (and probably much more complicated).
Something else..
Can anybody point me to a resource or provide me with some advice on how best to handle this?
Thanks!
I think you are close to finding what you want.
If calculations take some time, store results in files to process later without recalculation.
Most important: separate code from configuration, instead of copy-pasting variations of the two mixed together.
If the model takes parameters, define a Model class. Maybe instantiate the model only once, but the model knows how to load_config, read_input_data and run; it also does write_results. That way you can loop over a sequence of load_config, read_data, run, write_results for every config, and maybe every input dataset.
Write the config files by hand in INI format, for example, and use the configparser module to load them.
Do something similar for your Graph class. Put the template definition in configuration files, including output format, sizes, fonts, and so on.
In the end you will be able to "manage" the intended workflow with a single script that uses these facilities. Maybe store groups of related configuration files, output templates and input data together, one group per folder for each modelling session.
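A minimal sketch of what that layout could look like; every class name, config key and file name below is made up for illustration:
import configparser
import csv
import matplotlib.pyplot as plt

class Model:
    # Knows how to load a config, run the calculation and persist results
    def load_config(self, path):
        cfg = configparser.ConfigParser()
        cfg.read(path)
        self.params = dict(cfg['model'])  # e.g. a [model] section in run01.ini

    def run(self):
        # Placeholder for the real calculation; store whatever the graphs need
        self.results = [(x, float(self.params['slope']) * x) for x in range(10)]

    def write_results(self, path):
        # Persist results so graphs can be redrawn without re-running the model
        with open(path, 'w', newline='') as f:
            csv.writer(f).writerows(self.results)

class Graph:
    # Reads a figure template (dpi, size, ...) from an INI file
    def __init__(self, template_path):
        cfg = configparser.ConfigParser()
        cfg.read(template_path)
        self.dpi = cfg.getint('figure', 'dpi')
        self.size = (cfg.getfloat('figure', 'width'), cfg.getfloat('figure', 'height'))

    def plot(self, results, out_path):
        fig, ax = plt.subplots(figsize=self.size, dpi=self.dpi)
        ax.plot([x for x, _ in results], [y for _, y in results])
        fig.savefig(out_path)
        plt.close(fig)

# A single driver script loops over run configs and figure templates
for run_cfg in ['run01.ini', 'run02.ini']:
    model = Model()
    model.load_config(run_cfg)
    model.run()
    model.write_results(run_cfg.replace('.ini', '_results.csv'))
    for template in ['paper.ini', 'presentation.ini', 'poster.ini']:
        out_name = run_cfg.replace('.ini', '') + '_' + template.replace('.ini', '.png')
        Graph(template).plot(model.results, out_name)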