I'm new to DL and currently learning to develop a classification model using PET-CT scans. Apparently there is a series of .dcm images per patient. This might be a dumb question, but for those who have developed DL classification models using pydicom: how do you feed those images into the neural network? I am familiar with normal classification, where the input would be something like:
Fruit Classification:
train folder
    apples
        apple1.jpg
        apple2.jpg
        apple3.jpg
    other fruits...
For classification using PET-CT images:
train folder
    class 1
        patient1
            first check-up
                dcm1
                dcm2
                ...
                dcmn
            second check-up
                dcm1
                dcm2
                ...
                dcmn
        patient2
            first check-up
                dcm1
                dcm2
                ...
                dcmn
    class others...
In TensorFlow/PyTorch, would normal image loading work even though there are many subfolders and files for each patient?
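In case it helps to see the usual pattern: most pipelines read each check-up's slices with pydicom, sort them by slice position, and stack them into a single 3D volume, so the network sees one volume per sample rather than loose files. Below is a minimal PyTorch sketch; the folder layout, the `.dcm` extension, the normalization, and the `load_series`/`CheckupDataset` names are illustrative assumptions, not a fixed recipe.

```python
import os
import numpy as np
import pydicom
import torch
from torch.utils.data import Dataset

def load_series(folder):
    """Read all DICOM files in one check-up folder into a (D, H, W) volume."""
    slices = [pydicom.dcmread(os.path.join(folder, f))
              for f in os.listdir(folder) if f.endswith(".dcm")]
    # Sort by physical slice position so the volume is anatomically ordered
    # (InstanceNumber is a common fallback if this tag is missing).
    slices.sort(key=lambda s: float(s.ImagePositionPatient[2]))
    volume = np.stack([s.pixel_array.astype(np.float32) for s in slices])
    # Simple per-volume normalization; replace with HU windowing if needed.
    volume = (volume - volume.mean()) / (volume.std() + 1e-8)
    return volume

class CheckupDataset(Dataset):
    """One sample = one check-up folder -> one 3D volume + one class label."""
    def __init__(self, samples):
        # samples: list of (path_to_checkup_folder, class_index) tuples,
        # gathered by walking the train folder above.
        self.samples = samples

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        folder, label = self.samples[idx]
        volume = load_series(folder)             # (D, H, W)
        volume = torch.from_numpy(volume)[None]  # (1, D, H, W) for a 3D CNN
        return volume, label
```

Note that different check-ups usually contain different slice counts, so you would typically resample every volume to a fixed depth (or use a slice-wise/2.5D approach) before batching with a DataLoader.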
Related
The regular TimeSformer takes 3-channel input images, while I have 4-channel images (RGBD). I am struggling to find a TimeSformer (or a model similar to TimeSformer) that takes 4-channel input images and extracts features from them.
Does anybody know of such a model? Preferably, I would like to find a pretrained model with weights.
MORE CONTEXT:
I am working with RGBD video frames and have a multiclass classification problem at the end. My videos are fairly long, between 2 and 4 minutes, so classical time-series models don't work for me. So my inputs are RGBD frames/images from the video, and at the end I would like to get a class prediction.
My idea was to divide the problem into 2 stages:
1. Extract features from the video into a smaller dimension with a TimeSformer-like model. Result: I would get a new data representation (dataset).
2. Train a classification ML network on the new data to get a class prediction.
As of Jan 2023, I don't think there's any readily available TimeSformer model/code that works on 4-channel RGBD images.
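That said, a common workaround (not specific to TimeSformer) is to take a pretrained 3-channel checkpoint and "inflate" its patch-embedding weights to 4 channels: keep the RGB filters and initialize the depth channel from their mean. A sketch, assuming a ViT/TimeSformer-style model whose patch embedding is a Conv2d at `model.patch_embed.proj` (true of timm ViTs and the reference TimeSformer code, but verify for your checkpoint):

```python
import torch
import torch.nn as nn

def inflate_patch_embed(model, new_in_channels=4):
    old = model.patch_embed.proj  # pretrained Conv2d with in_channels=3
    new = nn.Conv2d(new_in_channels, old.out_channels,
                    kernel_size=old.kernel_size, stride=old.stride,
                    padding=old.padding, bias=old.bias is not None)
    with torch.no_grad():
        new.weight[:, :3] = old.weight  # keep the pretrained RGB filters
        # Initialize the depth channel with the mean of the RGB filters so
        # the pretrained features are roughly preserved when fine-tuning starts.
        new.weight[:, 3:] = old.weight.mean(dim=1, keepdim=True)
        if old.bias is not None:
            new.bias.copy_(old.bias)
    model.patch_embed.proj = new
    return model
```

After this, fine-tune on your RGBD frames as usual; only the patch embedding differs from the published architecture.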
Alternatively, if you are looking for Vision Transformers that can work with depth as well (RGBD data), you can find the full list of state-of-the-art approaches and corresponding code (where available) here.
One good approach to start with is DepthFormer: Exploiting Long-Range Correlation and Local Information for Accurate Monocular Depth Estimation. You can find the pre-trained models for this approach here.
If you're looking for 3D-CNN-based object detectors that can work on RGBD data, RGB-D Salient Object Detection via 3D Convolutional Neural Networks is a good one to start with. Code and pre-trained models can be found here.
Since I don't fully understand your problem statement or exact requirements, I've proposed a few things that I thought could be helpful.
I am currently working on a system that extracts certain features from 3D objects (voxel grids, to be precise), and I would like to compare those features against automatically learned features in terms of classification performance in a TensorFlow CNN with some other data, but that is not the point here, just background.
My idea was to take a dataset (ModelNet10), train a TensorFlow CNN to classify it, and then use what the network learned there on my own dataset, not to classify, but to extract features.
So I want to throw away everything the CNN does except for what it takes from the objects.
Is there any way to get these features, and how do I do that? I certainly have no idea.
Yes, it is possible to use models purely for feature extraction. This falls under transfer learning: you can either train your own model and then extract features from one of its intermediate layers, or you can take features from a pre-trained model and use them in your task, provided your task is similar in nature to what the pre-trained model was trained for. There is plenty of material online on these topics; the links below give details on how to go about it, and a short sketch of both routes follows them:
https://keras.io/api/applications/
https://keras.io/guides/transfer_learning/
https://machinelearningmastery.com/how-to-use-transfer-learning-when-developing-convolutional-neural-network-models/
https://www.pyimagesearch.com/2019/05/27/keras-feature-extraction-on-large-datasets-with-deep-learning/
https://www.kaggle.com/angqx95/feature-extractor-fine-tuning-with-keras
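For a concrete starting point, here is a minimal Keras sketch of both routes. The layer names, shapes, and random stand-in data are illustrative; for voxel grids you would use Conv3D layers in exactly the same way.

```python
import numpy as np
import tensorflow as tf

# Route 1: features from a pre-trained ImageNet model (first link above).
# pooling="avg" collapses the conv feature maps into one vector per image.
backbone = tf.keras.applications.ResNet50(weights="imagenet",
                                          include_top=False, pooling="avg")
images = np.random.rand(4, 224, 224, 3).astype("float32")  # stand-in batch
images = tf.keras.applications.resnet50.preprocess_input(images * 255.0)
features = backbone.predict(images)  # shape (4, 2048)

# Route 2: features from a model you trained yourself on ModelNet10.
# Re-wire the model so its output is an intermediate layer; everything the
# classifier does after that layer is discarded, which is exactly the
# "throw away everything except what it takes from the objects" part.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3, activation="relu", input_shape=(32, 32, 1)),
    tf.keras.layers.GlobalAveragePooling2D(name="feats"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
extractor = tf.keras.Model(model.input, model.get_layer("feats").output)
object_features = extractor.predict(np.random.rand(4, 32, 32, 1))  # (4, 16)
```

The `tf.keras.Model(inputs, intermediate_output)` re-wiring in route 2 is the standard Keras idiom for feature extraction and works on any trained model, including 3D-convolutional ones.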
I've just started with TensorFlow. I wrote a program that uses the Fashion-MNIST dataset to train a model, and then predicts the labels using test_images; it's working well so far.
But I am curious how I can use my own image of a shoe or a shirt for prediction, since all the test images are of shape 28×28. How can I do this?
The task you are engaged in is data preparation and preprocessing. Given a directory of images, one thing you must do is label them; for this task I recommend labelImg.
If you also need the input to have a specific size, as in your example, you can use digital image-processing software. The OpenCV library has resizing tools that work for this.
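For the 28×28 case specifically, here is a minimal OpenCV sketch, assuming a Keras model trained as in the standard Fashion-MNIST tutorial (28×28 grayscale inputs scaled to [0, 1]); `model` and the file name are placeholders for your own trained model and image.

```python
import cv2
import numpy as np

# "my_shoe.jpg" is a placeholder for your own photo.
img = cv2.imread("my_shoe.jpg", cv2.IMREAD_GRAYSCALE)  # (H, W) uint8
img = cv2.resize(img, (28, 28))                        # match training shape
img = img.astype("float32") / 255.0                    # same scaling as training
# Fashion-MNIST items are light-on-dark; invert if your photo is dark-on-light.
img = 1.0 - img
batch = img.reshape(1, 28, 28)                         # model expects a batch
probs = model.predict(batch)                           # `model` = your trained model
label = int(np.argmax(probs[0]))
```

The key point is that your image must go through the same shape, scaling, and color conventions as the training data, otherwise the predictions will be meaningless.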
I'm building a CNN that will tell me if a person has brain damage. I'm planning to use the TF Inception v3 model and the build_image_data.py script to build the TFRecords.
The dataset is composed of brain scans. Every scan has about 100 images (different head poses and angles). On some images the damage is visible, but on others it is not. I can't label all images from a scan as damage-positive (or negative), because some of them would be labeled wrong (if the scan is positive for damage but the damage is not visible in a specific image).
Is there a way to label the whole scan as positive/negative and train the network that way?
And, after training is done, pass a whole scan (not a single image) as input to the network and classify it?
It looks like multiple instance learning might be your approach. Check out these two papers:
Multiple Instance Learning Convolutional Neural Networks for Object Recognition
Classifying and Segmenting Microscopy Images with Deep Multiple Instance Learning
The last one is implemented by @dancsalo (not sure if he has a Stack Overflow account) here.
It looks like the second paper deals with very large images and breaks them into sub-images, but labels the entire image. So it is like labeling a bag of images with one label instead of having to make a label for each sub-image. In your case, you might be able to construct a matrix of images, i.e. a 10-image × 10-image master image for each of the scans...
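For intuition, here is a rough PyTorch sketch of the bag-level idea from these papers: score every image in a scan, then max-pool the scores over the scan, so a single scan-level label is enough for training. The tiny encoder is a stand-in for your Inception features, not the papers' exact architecture.

```python
import torch
import torch.nn as nn

class MILClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(            # per-image feature extractor
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.score = nn.Linear(16, 1)            # per-image damage score

    def forward(self, scan):                     # scan: (n_images, 1, H, W)
        logits = self.score(self.encoder(scan))  # (n_images, 1)
        # Max over the bag: the scan counts as positive if its most
        # suspicious image is positive, matching the MIL assumption.
        return logits.max(dim=0).values          # (1,) scan-level logit

model = MILClassifier()
scan = torch.randn(100, 1, 64, 64)               # one scan of ~100 images
scan_logit = model(scan)                         # train with BCEWithLogitsLoss
```

At inference time you pass the whole scan through the same forward pass, which is exactly the "classify the scan, not a single image" behavior you described.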
Let us know if you do this and if it works well on your data set!
I am using an ImageNet-trained network to extract features and classify my own images. My images (microscopy images) are quite different from cats and dogs, but the features extracted with the ImageNet network gave quite promising classification results.
It would be very easy for me to generate millions of small, unlabeled microscopy images. Would it somehow be possible to train my own convolutional neural network on them? My target images for classification after transfer learning are labeled. Is this somehow possible? Or can I use pseudo-labeling (e.g. classes from mean(histogram) or whatever)?
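One common way to exploit unlabeled images like these (as an alternative to pseudo-labeling) is to pretrain a convolutional autoencoder on them and then reuse its encoder for the labeled task, just as you currently reuse the ImageNet backbone. A minimal Keras sketch; the 64×64 grayscale input size and `unlabeled_images` are assumptions.

```python
import tensorflow as tf

# Encoder: this is the part you keep after pretraining.
encoder = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, strides=2, activation="relu",
                           padding="same", input_shape=(64, 64, 1)),
    tf.keras.layers.Conv2D(64, 3, strides=2, activation="relu", padding="same"),
])
# Decoder: only used to define the reconstruction objective.
decoder = tf.keras.Sequential([
    tf.keras.layers.Conv2DTranspose(32, 3, strides=2, activation="relu",
                                    padding="same"),
    tf.keras.layers.Conv2DTranspose(1, 3, strides=2, activation="sigmoid",
                                    padding="same"),
])
autoencoder = tf.keras.Sequential([encoder, decoder])
autoencoder.compile(optimizer="adam", loss="mse")
# unlabeled_images: your millions of unlabeled patches, scaled to [0, 1].
# autoencoder.fit(unlabeled_images, unlabeled_images, epochs=10)

# Afterwards: freeze or fine-tune `encoder` and add a classifier head for
# the labeled images, exactly as with an ImageNet backbone.
```

Since the encoder learns microscopy-specific features, it can be a better starting point than ImageNet weights when the domains differ this much.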