The ML community has techniques for addressing low signal-to-noise ratios, class imbalance, noisy labels, and small sample sizes. I definitely agree that all of these are issues in medical datasets. Part of the exciting challenge at the intersection of medicine and machine learning is scaling data collection while respecting patient privacy.