
Hey folks! I am Daniel, one of the co-founders of Efemarai. Happy to share with the ML crowd that we've started open-sourcing our platform for testing and validating Computer Vision models: https://github.com/efemarai/efemarai

Svet (my co-founder) and I have worked in industry (from building self-driving cars, to cancer research, to various DARPA projects) and have come to the realization that, just as software/EE/aerospace work is tested and QA'd as part of the deployment process, we need more rigorous steps in the ML domain (beyond test/val datasets or a thumbs-down).

Our hope is to let ML teams build test suites for their models, embed them in their CI, and gain another layer of confidence that when you re-deploy you're not going to regress into the infamous one step forward, two steps back.

You can register at https://ci.efemarai.com and easily submit jobs through Python or the CLI (pip install efemarai) - a rough sketch of what a submission could look like follows the list below. You'll get access to:

- Operational domain: fine-tune how the images should be transformed so that they cover the variability the model is expected to see in the real world. We know datasets cannot have it all, so we're releasing a tool that has helped us a lot in encoding "business-level performance SLAs" - it should work with small objects, in darkness, under lens flare, when the face is rotated by x% or seen from this azimuth (and not just plain mAP or accuracy).

- Support for any input and output data types: not only do we support tasks such as classification, object detection, instance segmentation, keypoint detection, and regression, but also any combination thereof, with any type of input - single image, multi-image, video, text, or anything that combines those. Right now this is a major pain point with arbitrary open-source datasets and loaders, and we really hope to provide an easy way to map a team's internal data structure to something generalizable.

- High efficiency: there is so much to be gained from being thoughtful about the data that is used, rather than randomly augmenting it. With Efemarai you can find and fix failure modes of your model by looking at how its performance degrades under purposeful transformations.
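
To make the submission workflow concrete, here is a rough sketch of what a CI-style job could look like from Python. Only `pip install efemarai` comes from the description above; the `run_test_suite` call and the domain spec are illustrative placeholders, not the actual SDK API.

    # Hypothetical sketch of a CI-style check built around the Efemarai SDK.
    # Only `pip install efemarai` is taken from the post; run_test_suite and
    # the domain spec below are illustrative placeholders, not the real API.
    import sys
    import efemarai as ef  # pip install efemarai

    # Operational domain: the variability the model must handle in production
    # (small objects, low light, lens flare, rotated faces, ...).
    domain = {
        "brightness": (0.3, 1.0),    # down to 30% of nominal illumination
        "rotation_deg": (-30, 30),   # faces rotated up to 30 degrees
        "min_object_px": 16,         # still detect objects >= 16 px wide
    }

    report = ef.run_test_suite(      # hypothetical call
        model="my-detector:latest",
        dataset="validation-set",
        operational_domain=domain,
    )

    # Fail the CI job if performance inside the operational domain regresses.
    sys.exit(0 if report.passed else 1)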

We look forward to seeing you on Discord (https://discord.gg/cWQC3rrB) and chatting with you!


(I'm biased, as I'm the CTO/co-founder of Efemarai)

Yes, there are usually several changes that need to be tracked - code/model changes (those usually happen early on and then stabilise), input/code changes (e.g. pre-processing the data with either new transformations or _other_ models), and data changes (to both the training and testing data). At Efemarai we think of it this way: any change to the above should automatically trigger a test suite for the model/process. And by "test" we mean not just the various forms of unit testing the input/output formats and sizes of the model, but also unit testing the model's performance on the data you've collected, plus stress testing the model with the data it is expected to see in production.

So in reality, it's indeed nothing new, but the standard DevOps pipeline needs to be extended to work with ML-specific assumptions.
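
To illustrate the kind of tests meant here, a minimal sketch assuming pytest and PyTorch (neither is prescribed above); the model loader, the metric helper, and the 0.85 threshold are made-up placeholders.

    import torch

    from my_project import load_model, evaluate_map  # hypothetical helpers

    def test_output_shape():
        # Unit test on the input/output formats and sizes of the model.
        model = load_model("detector-latest")
        batch = torch.rand(4, 3, 224, 224)
        out = model(batch)
        assert out.shape == (4, 10)  # e.g. 10 classes expected

    def test_no_performance_regression():
        # Unit test on model performance over the collected data: fail the
        # pipeline if the metric drops below the agreed threshold.
        model = load_model("detector-latest")
        assert evaluate_map(model, split="val") >= 0.85  # made-up threshold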


That's quite interesting. We've been porting some of the properties from PBT (property-based testing) to ML [1]. It really works and makes a big difference even in the AI space, if we focus on semantically meaningful changes (which are a bit easier to replicate in the software world).

[1] https://towardsdatascience.com/why-dont-we-test-machine-lear...
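
For a flavour of what a property-based test can look like for an ML model, here is a small sketch assuming the `hypothesis` library and a hypothetical `classify` function; the property is metamorphic: a small, semantically meaningless brightness change should not flip the prediction.

    import numpy as np
    from hypothesis import given, settings
    from hypothesis import strategies as st

    from my_project import classify  # hypothetical: image array -> class id

    @settings(deadline=None, max_examples=50)
    @given(factor=st.floats(min_value=0.9, max_value=1.1))
    def test_prediction_stable_under_small_brightness_change(factor):
        # Metamorphic property: a semantically meaningless brightness change
        # should leave the predicted class unchanged.
        image = np.load("tests/fixtures/example.npy")  # fixed reference image
        perturbed = np.clip(image * factor, 0.0, 1.0)
        assert classify(image) == classify(perturbed)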


Hey HN,

I’m Daniel from the Efemarai team. Svet and I are a couple of founders with deep tech backgrounds who have spent endless hours testing and debugging ML models. During our PhDs in AI & Robotics, and in professional work later on, we’ve struggled with the limited QA techniques and tools available for ML. That’s why we’ve ended up creating Efemarai - a platform for testing and debugging ML models. With Efemarai you can easily visualize your models and data, but also specify all sorts of assertions to be automatically checked. It’s really easy to set up and takes literally 2 lines to integrate with your codebase (`import efemarai as ef` & `with ef.scan():`) - just look in your browser for results; a minimal sketch is included after the feature list below. If you don’t have any code lying around, don’t worry: after signing up you’ll get a short demo to show you how to navigate around and some of the features. Right now you can:

- Inspect and see any tensor!

- Show the full computational graph - next time you `git clone` an ML model, take a look at what you actually get.

- During training, get aggregate information about any tensor or gradient. See what causes them to explode or vanish to zero.

- Automatically catch some common issues or write your own checks.

- Write custom system or browser notifications to get your attention back to training when issues happen. We all try to multitask during model training.
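
To show how small the integration is, here is a minimal sketch around a generic PyTorch training step; only `import efemarai as ef` and `with ef.scan():` come from the description above, the rest is illustrative boilerplate.

    import torch
    import efemarai as ef  # pip install efemarai

    # Plain PyTorch boilerplate, just to have something to trace.
    model = torch.nn.Linear(10, 2)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = torch.nn.CrossEntropyLoss()

    x = torch.randn(32, 10)
    y = torch.randint(0, 2, (32,))

    with ef.scan():                 # everything inside the block gets traced
        logits = model(x)
        loss = loss_fn(logits, y)
        loss.backward()             # gradients become inspectable in the browser UI
        optimizer.step()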

We’ve built Efemarai for developers, with privacy as a top priority, so none of your data or models leave your machine. You’ll be able to visualize quite large, multi-stage models like DCGAN, but also build and dissect your own (http://bit.ly/3eqZKbf).

We are now looking into enabling users to share their model visualizations online - what do you think? Are we missing something you’d love to have to make ML model testing or debugging easier? How do you test for regressions in your ML development?

Check out Efemarai on Product Hunt as well - https://www.producthunt.com/posts/efemarai

