I'm trying to detect the area of the street in an image without using any deep learning method.
Say I have this image:
I am looking for any simple method to detect the street portion of the image, like the following:
Now, I know this might not be very accurate, and accuracy is not the problem at all; I am trying to achieve this without using any deep learning method.
A Hough line transform gives you straight-line measurements directly, but I don't think it will give you exactly what you want, as shown below.
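For reference, here is a minimal OpenCV sketch of that Hough-line baseline (the input file name is hypothetical), just to illustrate why it falls short:

```python
import cv2
import numpy as np

# Load the image and compute an edge map first; HoughLinesP works on binary edges.
img = cv2.imread("road.jpg")            # hypothetical file name
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 50, 150)

# Probabilistic Hough transform: returns individual line segments.
lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=80,
                        minLineLength=50, maxLineGap=10)

# Draw whatever was found; on a street scene this picks up lane markings,
# curbs, building edges and wires - not a closed road region.
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        cv2.line(img, (x1, y1), (x2, y2), (0, 0, 255), 2)
cv2.imwrite("hough_lines.jpg", img)
```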
You need much more sophisticated algorithms, such as a deep semantic segmentation model trained for this task.
Even if you don't like deep learning, traditional approaches such as variational analysis, SVM learning or AdaBoost are also very complicated, and you won't be able to use them easily. You need a much deeper understanding of those topics.
If you really want to, you can start with variational analysis: an active contour model (snake energy) for extracting the road first. Variational analysis has been shown to work on complex scenes and to extract a particular structure, as shown in the image below. Your road is the empty, low-gradient region, and all the nearby buildings and trees are high-gradient responses that you don't want.
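If you want to experiment with that idea, here is a rough scikit-image sketch of an active contour; the file name and the initial contour coordinates are assumptions you would have to adapt to your image:

```python
import numpy as np
from skimage import io, color, filters
from skimage.segmentation import active_contour

# Hypothetical input; the idea is that the road is a large, low-gradient region.
img = color.rgb2gray(io.imread("road.jpg"))
smoothed = filters.gaussian(img, sigma=3)

# Initialise the snake as a rough ellipse around where the road is expected;
# the energy minimisation then pulls it towards the strong gradients
# (building/tree boundaries) surrounding the low-gradient road surface.
s = np.linspace(0, 2 * np.pi, 200)
rows = 300 + 150 * np.sin(s)   # rough initial guess, image-specific
cols = 250 + 200 * np.cos(s)
init = np.array([rows, cols]).T   # (row, col) pairs; check your scikit-image version's convention

snake = active_contour(smoothed, init, alpha=0.015, beta=10, gamma=0.001)
# `snake` is the final contour; rasterise it (e.g. with skimage.draw.polygon) to get a road mask.
```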
My suggestion is to make your life easier by using a pre-trained model and extracting the surface model from its output. Download it, run the Python script, and that's all.
There are a few open-source implementations that you can try such as this
https://github.com/ArkaJU/U-Net-Satellite
https://github.com/Paulymorphous/Road-Segmentation
https://github.com/avanetten/cresi
Based on the predicted mask, you can then extract the road portion accurately, as shown below.
This would be the result that you are looking for
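For completeness, once you have a predicted binary mask from one of those repos, cutting out the road portion is only a few lines; a rough sketch (file names are hypothetical):

```python
import cv2
import numpy as np

# Hypothetical files: the original image and the binary mask predicted by one of the repos above.
img = cv2.imread("road.jpg")
mask = cv2.imread("road_mask.png", cv2.IMREAD_GRAYSCALE)
mask = cv2.resize(mask, (img.shape[1], img.shape[0]))
binary = (mask > 127).astype(np.uint8)

# Keep only the road pixels and draw the mask outline for inspection.
road_only = cv2.bitwise_and(img, img, mask=binary)
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cv2.drawContours(img, contours, -1, (0, 255, 0), 2)
cv2.imwrite("road_only.jpg", road_only)
cv2.imwrite("road_outline.jpg", img)
```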
Regards
Shenghai Yuan
I'm trying to stretch the edges of the image (duplicate pixels) around the data (non-transparent) areas in the picture.
for example:
before:
after - the red line is only for you to see the difference
Note: it's important not to change the dimensions of the image; it's being used for geographical purposes.
I would like to hear any ideas or suggestions from you; code snippets will be welcome.
Thank you!
This problem is called image outpainting, and you can find very advanced deep-learning-based solutions in papers on paperswithcode.com. The current state of the art is given by Basile Van Hoorick in the work Image Outpainting and Harmonization using Generative Adversarial Networks. Ready-to-use code is available on his GitHub profile; see the Usage section there to learn how to use it.
There are three models available in the Pretrained models section:
G_art.pt: Artistic
G_nat.pt: Natural
G_rec.pt: Reconstruction loss only (no adversarial loss)
I believe you'll have to use transfer learning to train and use this architecture for your use case.
I am currently looking for a (preferably fast) way to create, or use an already existing, classifier for images. The main goal is to classify whether an image is (mainly) a logo or not. This means that I am not interested in recognizing the brand/name of the company; instead, the model would have to tell how likely it is that the image is a logo.
Does such a categorizer already exist? And if not, is there any possible solution to avoid neural networks for this task?
Thanks in advance.
I am not sure whether such a project exists, but I have a couple of ideas that could work for this without neural networks. A neural network would admittedly be easier, but I think it could also be done with K-means or another clustering algorithm: if the logo images fall into one region of feature space and the other images into another, they can be separated by clustering. I haven't done anything like this before, but theoretically it seems logical.
As you can see, there is no y label in this approach.
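As a very rough sketch of that clustering idea (the folder name, image size and raw-pixel features are all assumptions; better features would likely be needed in practice):

```python
import numpy as np
from pathlib import Path
from PIL import Image
from sklearn.cluster import KMeans

# Hypothetical folder of mixed images; no labels are used anywhere.
paths = sorted(Path("images").glob("*.png"))
feats = []
for p in paths:
    img = Image.open(p).convert("RGB").resize((32, 32))
    feats.append(np.asarray(img, dtype=np.float32).ravel() / 255.0)
X = np.stack(feats)

# Two clusters: the hope is that logos (flat colours, few textures) and photos
# end up in different clusters. Whether that holds depends entirely on the data.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
for p, c in zip(paths, km.labels_):
    print(p.name, "-> cluster", c)
```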
I am training a U-Net architecture for a segmentation task, in Python using Keras. I have now run into an issue that I am trying to understand:
I have two very similar images from a microscopy image series (consecutive frames), where my current U-Net model performs very well on one but extremely poorly on the immediately following one. However, there is little visible difference between the two, and the histograms also look very much alike. On other measurements the model performs well across the whole frame range, and then this issue appears again on yet other measurements.
I am using data augmentation during training (histogram stretching, affine transformations, noise addition), and I am surprised that the model is still so brittle.
Since the U-Net is still mostly a black box to me, I want to find out which steps I can take to better understand the issue and then adjust the training/model accordingly.
I know there are ways to visualize what individual layers learn (e.g. as discussed in F. Chollet's book, see here), and I should be able to apply these to a U-Net, which is fully convolutional.
However, these kinds of methods are practically always discussed in the context of classification networks, not semantic segmentation.
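For concreteness, this is roughly how I would extract intermediate activations from the U-Net following that recipe (the model and frame paths are placeholders for my own files):

```python
import numpy as np
import tensorflow as tf

# Assumptions: a trained single-input Keras U-Net and one preprocessed frame
# with batch and channel dimensions added, e.g. shape (1, H, W, 1).
unet = tf.keras.models.load_model("unet.h5")   # hypothetical path
x = np.load("frame.npy")[None, ..., None]      # hypothetical frame

# Auxiliary model that returns every convolutional feature map,
# as in Chollet's activation-visualisation example.
conv_layers = [l for l in unet.layers if "conv" in l.name]
activation_model = tf.keras.Model(inputs=unet.input,
                                  outputs=[l.output for l in conv_layers])
activations = activation_model.predict(x)

# Inspect (or plot) the feature maps; running this for the "good" and the "bad"
# frame side by side should show where the representations start to diverge.
for layer, act in zip(conv_layers, activations):
    print(layer.name, act.shape, float(act.mean()))
```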
So my question is:
Is this the best/most direct approach to reach an understanding of how U-Net models attain a segmentation result? If not, what are better ways to understand/debug U-Nets?
I suggest you use the U-Net container on NGC https://ngc.nvidia.com/catalog/resources/nvidia:unet_industrial_for_tensorflow
I also suggest you read this: Mixed Precision Training: https://arxiv.org/abs/1710.03740
https://developer.nvidia.com/blog/mixed-precision-training-deep-neural-networks/
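Enabling mixed precision in Keras is mostly a one-liner; a minimal sketch (TF >= 2.4; the tiny model here is just a placeholder, not a full U-Net):

```python
import tensorflow as tf

# Mixed precision: compute in float16, keep variables in float32.
tf.keras.mixed_precision.set_global_policy("mixed_float16")

# Build the model as usual after setting the policy; only the final layer is
# forced back to float32 for numerical stability of the loss.
inputs = tf.keras.Input(shape=(256, 256, 1))
x = tf.keras.layers.Conv2D(16, 3, padding="same", activation="relu")(inputs)
outputs = tf.keras.layers.Conv2D(1, 1, activation="sigmoid", dtype="float32")(x)
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy")
```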
Let me know how you are progressing; if there is any public repo, I'm happy to have a look.
I'm trying to determine the quality of a person's sitting posture (e.g. sitting upright = good / sitting crouched = bad) with a webcam.
First try:
Image acquisition (with OpenCV Python bindings)
Create a dataset of images labeled good/bad
Feature detection (FAST)
Train a neural net on the dataset with those features (ANN_MLP)
The result was OK, with a few restrictions:
not invariant to webcam movements, displacement, other persons, objects, etc.
I am not sure whether FAST features are a good fit
I'm pretty new to machine learning and want to try more sophisticated approaches with TensorFlow:
Second try:
I tried human pose detection via TensorFlow PoseNet
and got a mini example working which can determine probabilities of human body-part positions. So now the challenge is to derive the quality of a person's sitting posture from the output of PoseNet.
What is a good way to proceed:
train a second TF model which takes the probabilities of human body-part positions as input and outputs good/bad posture (so PoseNet is used as a fancy feature detector)? A rough sketch of this option follows the list.
rework the PoseNet model to fit my output needs and retrain it?
transfer learning from PoseNet (I just read about it, but I have no clue how, or whether it is even applicable here)?
or maybe a completely different approach?
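For the first option, this is roughly what I have in mind (the keypoint arrays and labels are hypothetical data I would have to collect myself, e.g. 17 PoseNet keypoints x (x, y, score) = 51 values per frame):

```python
import numpy as np
import tensorflow as tf

# Hypothetical training data: one flattened PoseNet output per frame plus a
# 0/1 label (bad/good posture).
X_train = np.load("keypoints.npy")     # shape (n_samples, 51)
y_train = np.load("labels.npy")        # shape (n_samples,)

# A small dense classifier on top of the keypoints; normalising the coordinates
# relative to the torso would make it less sensitive to camera shifts.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(51,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X_train, y_train, epochs=30, validation_split=0.2)
```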
I have implemented several clustering algorithms on an image dataset.
I'm interested in deriving the success rate of the clustering. I have to detect the tumor area; in the original image I know where the tumor is located, so I would like to compare the two images and obtain a percentage of success.
Following images:
Original image: I know the position of cancer
Image after clustering algorithm
I'm using python 2.7.
Segmentation Accuracy
This is a pretty common problem addressed in image segmentation literature, e.g., here is a StackOverflow post
One common approach is to consider the ratio of "correct pixels" to "incorrect pixels," which is common in image segmentation work (e.g., Mask R-CNN, PixelNet).
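A minimal NumPy sketch of that pixel-wise comparison (the mask file names are hypothetical; IoU and Dice are included as well, since they are the usual companions of raw pixel accuracy):

```python
import numpy as np

# gt and pred are binary masks of the same shape (1 = tumor, 0 = background).
gt = np.load("ground_truth_mask.npy").astype(bool)    # hypothetical file
pred = np.load("cluster_mask.npy").astype(bool)        # hypothetical file

pixel_accuracy = float((gt == pred).mean())            # correct pixels / total pixels
intersection = float(np.logical_and(gt, pred).sum())
union = float(np.logical_or(gt, pred).sum())
iou = intersection / union                              # Jaccard index / IoU
dice = 2 * intersection / (gt.sum() + pred.sum())       # Dice coefficient
print(pixel_accuracy, iou, dice)
```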
Treating it as more of an object detection task, you could take the overlap of the hulls of the objects and just measure accuracy (commonly broken down into precision, recall, F-score, and other measures with various biases/skews). This also allows you to produce an ROC curve that can be calibrated for false positives/false negatives.
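For the precision/recall/F-score breakdown, scikit-learn's metrics work directly on flattened masks (same hypothetical arrays as above):

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score, f1_score

# Flatten the binary masks so each pixel becomes one sample.
gt = np.load("ground_truth_mask.npy").astype(int).ravel()
pred = np.load("cluster_mask.npy").astype(int).ravel()

print("precision:", precision_score(gt, pred))
print("recall:   ", recall_score(gt, pred))
print("f-score:  ", f1_score(gt, pred))
```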
There is no domain-agnostic consensus on what's correct. KITTI provides both.
Mask R-CNN is open-source, state of the art, and provides implementations in Python of
Computing image matching between segmented and original
Displaying the differences
In your domain (medicine), standard statistical rules apply. Use a holdout set. Cross validate. Etc. (*)
Note: although the literature space is dauntingly large, I'd caution you to take a look at some domain-relevant papers, as they may take fewer "statistical short cuts" than other vision (digit recognition e.g.) projects accept.
"Metrics for evaluating 3D medical image segmentation: analysis, selection, and tool" provides some summary methods in your your domain
"Current methods in image segmentation" has about 2500 citations but is a little older.
"Review of MR image segmentation techniques using pattern recognition" is a little older still and will get you safely into "traditional" vision models.
Automated Segmentation of MR Images of Brain Tumors is largely about its segmentation validation process
Python
Besides the Mask R-CNN links above, scikit-learn provides some extremely user-friendly tools and is considered part of the standard science "stack" for Python.
Implementing the difference between images in Python is trivial (using NumPy). Here's an overkill SO link.
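For example (hypothetical arrays):

```python
import numpy as np

# Hypothetical arrays: the segmented image and the original/reference mask.
a = np.load("segmented.npy").astype(float)
b = np.load("reference.npy").astype(float)

diff = np.abs(a - b)                     # per-pixel absolute difference
changed = float((diff > 0).mean())       # fraction of pixels that differ
print(changed, diff.mean())
```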
Bounding box intersection in Python is easy to implement on your own; I'd use a library like shapely if you want to measure general polygon intersection.
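A small shapely sketch with made-up box coordinates:

```python
from shapely.geometry import box

# Hypothetical bounding boxes (minx, miny, maxx, maxy) for ground truth and detection.
gt_box = box(30, 40, 120, 160)
pred_box = box(50, 60, 140, 170)

inter = gt_box.intersection(pred_box).area
union = gt_box.union(pred_box).area
print("IoU:", inter / union)
```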
Scikit-learn has some nice machine-learning evaluation tools, for example (a short sketch follows this list):
ROC curves
Cross validation
Model selection
A million others
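For instance, an ROC curve and cross-validation take only a few lines (features, labels and scores are hypothetical arrays from your own pipeline; the logistic regression is just a placeholder baseline):

```python
import numpy as np
from sklearn.metrics import roc_curve, auc
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression

# Hypothetical per-pixel (or per-region) features, ground-truth labels,
# and continuous scores produced by your method.
X = np.load("features.npy")
y = np.load("labels.npy")          # 0/1 ground truth
scores = np.load("scores.npy")     # continuous scores

fpr, tpr, thresholds = roc_curve(y, scores)
print("AUC:", auc(fpr, tpr))

# 5-fold cross-validation of a simple supervised baseline for comparison.
print(cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5).mean())
```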
Literature Searching
One reason that you may have trouble searching for the answer is because you're trying to measure performance of an unsupervised method, clustering, in a supervised learning arena. "Clusters" are fundamentally under-defined in mathematics (**). You want to be looking at the supervised learning literature for accuracy measures.
There is literature on unsupervised learning/clustering, too, which looks for topological structure, generally. Here's a very introductory summary. I don't think that is what you want.
A common problem, especially at scale, is that supervised methods require labels, which can be time consuming to produce accurately for dense segmentation. Object detection makes it a little easier.
There are some existing datasets for medicine ([1], [2], e.g.) and some ongoing research in label-less metrics. If none of these are options for you, then you may have to revert to considering it an unsupervised problem, but evaluation becomes very different in scope and utility.
Footnotes
[*] Vision people sometimes skip cross-validation even though they shouldn't, mainly because the models are slow to fit and they're a lazy bunch. Please don't skip a train/test/validation split, or your results may be dangerously useless.
[**] You can find all sorts of "formal" definitions, but never two people who agree on which one is correct or most useful. Here's some denser reading.