Very cool! We (Paperspace) just posted a tutorial on Detecting Pneumonia from X-ray scans using PyTorch for anyone interested in building some more intuitions around this type of Deep Learning: https://blog.paperspace.com/detecting-and-localizing-pneumon...
Yes, there is a whole glut of companies in this space working in close collaboration with various hospitals. I've looked at several of these and the general quality of the products is high.
I worked on this problem during my master's. I built both a deep model and a KNN model for early breast cancer detection, but I think the performance (accuracy and speed) of the KNN model is the more impressive of the two.
Code:
https://github.com/jszym/pprecogg
Sounds nice, and they even have the good stuff on GitHub, along with an explanatory how-to article on Medium.
If you're into Digital Pathology and Whole Slide Image (WSI) analysis, then there is a whole undergrowth of open source tools and projects to dive into. I especially like QuPath by Pete Bankhead (https://qupath.github.io/). It has a more accessible interface than Python (as used in the article), supports various machine learning algorithms, and also has some nice video tutorials (https://www.youtube.com/channel/UCk5fn7cjMZFsQKKdy-YWOFQ).
143 images, and within 95%. I need to read more to work out which 5% is wrong, and what the 95% really means, of course. I feel that calling this "pathologist level" is a stronger claim than I would make. I am hesitant to word things more strongly, but I feel dismay about the kind of performance claims that are often made in these contexts.
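To put a rough number on how much 143 images can pin down, here is a minimal sketch. It assumes, purely for illustration, that "95%" means 136 of 143 images classified correctly; I have not confirmed that reading. The function name `wilson_interval` is my own, and the Wilson score interval is just one common way to put a confidence interval on a proportion.

```python
import math

def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for a binomial proportion.

    z=1.96 corresponds to an approximately 95% confidence level.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p_hat = correct / total
    denom = 1 + z**2 / total
    center = (p_hat + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(
        p_hat * (1 - p_hat) / total + z**2 / (4 * total**2)
    ) / denom
    return center - half_width, center + half_width

# Hypothetical reading of the headline figure: 136 correct out of 143.
low, high = wilson_interval(136, 143)
print(f"accuracy plausibly between {low:.3f} and {high:.3f}")
```

Under that assumption the interval comes out at roughly 90% to 98%, which is a wide enough range that a small test set alone cannot settle whether the model matches a pathologist.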
One possible reason: pathology labeling can be prohibitively expensive. You're looking at paying highly-trained specialists to sit down and hand-segment regions of cell types on an entire slide.
That is definitely one bottleneck: creating the labelled datasets is very expensive. Besides that, there are a large number of privacy-sensitive issues that need to be taken care of throughout the labeling process to ensure that patient confidentiality is maintained. Large datasets exist, but in general they are not available for research for that reason; only a few hospitals have made datasets available without restriction.