We actually used Streamlit in the past. Our gripe with it was how the backend event loop is managed. Basically, Streamlit re-runs your code at every user interaction to check what's changed (unless you cache specific variables, which is hard to do well). When your app works with significant data or a significant model, or has multiple pages or users, this approach fails and the app starts freezing constantly. We wanted a product that keeps the easy learning curve of Streamlit while retaining production-ready capabilities: we use callbacks for user interactions to avoid unnecessary computation, and the front end and back end run on separate threads. We also run in Jupyter notebooks, if that helps.
The script re-run (and the band-aid of caching via decorators) is exactly what I don't like about Streamlit.
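To make that concrete, here is a minimal sketch of the rerun-plus-cache pattern in question (the decorator name assumes a recent Streamlit version, and the model file is hypothetical): the whole script re-executes on every widget interaction, so anything expensive has to be wrapped in a cache decorator to survive the rerun.

```python
import streamlit as st

@st.cache_resource  # without this, the model would be reloaded on every rerun
def load_model():
    import joblib
    return joblib.load("model.pkl")  # hypothetical pre-trained model file

model = load_model()  # the script runs top to bottom on each interaction
text = st.text_input("Enter a post to classify")
if text:
    st.write(model.predict([text]))
```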
I’d love to see an example of how you use Taipy to build an LLM chat app, analogous to this SL example:
Another interesting one in this space:
Reflex (formerly known as PyneCone).
They have a ready-to-use LLM chat app, which makes it more likely that I'll check it out.
I would love to read a comparison explaining the relative advantages of each framework from an experienced practitioner who has actually built apps with each. Add in plotly dash, bokeh/panel, and voila too.
Off the top of my head, bokeh and panel are more oriented towards high performance for large datasets, but have less overall adoption.
Voila is oriented towards making existing jupyter notebooks into interactive dashboards.
I'm always curious about the runtime model for these interactive frameworks. Building interactivity into a Jupyter notebook is fairly straightforward, but it's a very different execution model from the traditional HTTP model. Jupyter notebook widgets need a separate backing kernel for each new user, whereas in the traditional HTTP server model all request state is normally rebuilt on each request from a session cookie plus database state. A complete interpreter per user makes for simpler programming, but it is much more memory- and process-intensive.
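For contrast, here is a rough sketch (my own, not from any of these frameworks) of that traditional HTTP model using Flask: no interpreter lives per user, and each request rebuilds its state from a signed session cookie plus a shared store.

```python
from flask import Flask, session, request

app = Flask(__name__)
app.secret_key = "dev-only"  # toy value; needed so the session cookie can be signed

FAKE_DB = {}  # stand-in for the database that holds per-user state

@app.route("/counter", methods=["POST"])
def counter():
    # Identify the user from the session cookie (toy identity scheme).
    user_id = session.setdefault("user_id", request.remote_addr)
    state = FAKE_DB.setdefault(user_id, {"clicks": 0})
    state["clicks"] += 1
    # No long-lived kernel holds this state; it is reconstructed on every request.
    return {"clicks": state["clicks"]}
```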
Very cool! I signed up and uploaded data for a text classifier. 3000 examples of social media posts on a binary annotation task. Got 91% initially, then looked through the annotations and corrected a few errors that had snuck in. The UI for that is great. That got it to 92%.
Easy-to-use UI, easy data upload, and the training was quick. A great tool for testing new ideas for classifiers. For bigger projects I'd be concerned about the long-term cost of pay-per-invocation pricing.
Is weak labeling via labeling functions (snorkel, skweak) something that's on the roadmap for Nyckel? Also, do you plan to add named entity recognition?
Thank you for the kind words and feedback! You basically went through most of the UI flow that we designed for. You're spot-on about testing new classifiers - answering the question "Can ML even help with my problem?" is much easier with Nyckel, and prototyping and rapid iteration start with that.
Our goal is to be cost-competitive, even for bigger projects. Given how early we are, our pricing structure is still being worked out, especially for high-volume use.
Integrating with labeling solutions is on our roadmap. In the meantime, our API should enable any data/labeling integrations.
Named entity recognition is also on the roadmap. We'd love to hear more about your use case, and we can give you access to the beta when it's ready.
Chiming in on the weak labeling question: As of right now, you can use outside libraries like skweak to create weak labels offline and then PUT those using our API (https://www.nyckel.com/docs#update-annotation). This wouldn't cost anything since we only charge for invokes, but it requires some work.
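To sketch that workflow (the endpoint URL, field names, and auth details below are placeholders and assumptions; the linked docs have the actual schema), the weak labels produced offline can be pushed one by one with a plain HTTP PUT:

```python
import requests

# Produced offline, e.g. by skweak or snorkel labeling functions (hypothetical ids/labels).
weak_labels = {"sample-123": "positive", "sample-456": "negative"}

API_TOKEN = "..."  # auth token obtained as described in the Nyckel docs
UPDATE_ANNOTATION_URL = "https://www.nyckel.com/..."  # the update-annotation endpoint from the docs link above

for sample_id, label in weak_labels.items():
    resp = requests.put(
        UPDATE_ANNOTATION_URL,  # fill in per the docs; the exact path is not shown here
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        json={"sampleId": sample_id, "label": label},  # field names are assumptions
    )
    resp.raise_for_status()
```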
We may look at adding weak labeling as a first-class feature of our site down the road, but we are not yet sure we need to. With the powerful semantic representations offered by the latest deep nets, we find that a smaller number of hand-annotated samples often suffices for the desired accuracy, which makes the whole annotation process simpler and faster. Of course, if you have data and evidence to the contrary, we'd love to take a look.
It's a cool challenge! I tried it at 90 wpm and cleared most words. Some I had to attempt 2 or 3 times to type fast enough. Then I hit my nemesis: I can't type "necessary" fast enough for 90 wpm. Tried it 20 times.
Q Insight Agency | Data Scientist | Remote or Mannheim, Germany
Q Insight Agency is a market research agency. We help our clients in consumer goods and pharma understand their customers. Our background is in qualitative research (interviews, focus groups, workshops) and now we are expanding into data science with a focus on social media. As a new team in an established company, the data science team enjoys stability and resources but also has the freedom to build.
We just launched Cosmention, an AI-powered social media monitoring tool specialized for cosmetics. It analyzes millions of social media posts from all platforms and detects mentions of brands, products, ingredients and other entities. Our stack: R, Python, Shiny, AWS, Snowflake, Docker.
The UI looks great! As others have said, I like how compact it is. It doesn't get in the way as much as MS Teams and Discord do. I also like that it is lightweight. It's important to me that the app stays performant while sharing the screen. MS Teams is too laggy.
Have you looked at Tuple? Noor seems quite similar. Could you please explain a bit more about the differences between Noor and Tuple?
Finally, as others have said: lack of Windows compatibility is a dealbreaker for me for now. A performant Windows app would be fantastic. That's also something that Tuple doesn't have.
I'm in the market for a tool like this. At the moment I'm using Prodigy but am interested in other options. Features that I (or rather my employer) would be willing to pay for:
1. Team functionality with multiple user accounts.
2. An easy-to-use workflow for double annotation where each text is annotated by exactly two annotators. The software should make sure that a text is never shown to more than 2 annotators and never shown to the same annotator twice (a toy sketch of this assignment rule follows the list).
3. Make it easy to review the two versions and resolve conflicts.
4. A smarter alternative to review would be a warning system that identifies annotations that may have errors (because a model trained on the other data predicts a different result) and automatically flags them for review by another annotator.
5. Stats on the annotators: speed, accuracy, and statistics on how frequently they assign different labels, to detect potential misunderstandings of the annotation schema.
6. A GUI with an overview of all annotation datasets, with stats like % finished annotating (with stages for double annotation and review), the types of annotation done, and label frequencies to detect imbalances.
7. Functions to mass-edit the annotations, like renaming or removing an entity type.
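As referenced in point 2, here is a toy sketch (not taken from any existing tool) of that assignment rule: every text goes to exactly two distinct annotators, and no annotator ever sees the same text twice.

```python
import itertools
from collections import defaultdict

def assign_double(texts, annotators):
    """Yield (text, annotator) pairs so every text gets exactly two annotators."""
    assert len(annotators) >= 2
    seen = defaultdict(set)             # text -> annotators who already have it
    pool = itertools.cycle(annotators)  # naive round-robin to balance workload
    for text in texts:
        while len(seen[text]) < 2:
            annotator = next(pool)
            if annotator not in seen[text]:  # never the same annotator twice
                seen[text].add(annotator)
                yield text, annotator

for text, who in assign_double(["doc1", "doc2", "doc3"], ["alice", "bob", "carol"]):
    print(who, "annotates", text)
```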
Another thing I'd be interested in is some integration with a third party annotation provider. There are companies that offer annotation as a service and it's also available on Google Cloud and AWS. Having that integrated into an annotation tool would make it very easy to get large amounts of well annotated training material.
But finally, and much more importantly: The workflow for annotators has to be perfected first, so they can work as efficiently and consistently as possible. Getting this right is more important to me than any of the other features I listed.
Mind if I ask what sort of team features you make use of with Prodigy? Are there any aspects you feel are lacking? Initial thoughts are that it'd be helpful for teams to be able to set group annotation goals, share docs / annotations / configs, view ongoing sessions, assign annotators to sessions, and view stats on each annotator (as per point 5).
> The software should make sure that a text is never shown to more than 2 annotators and never shown to the same annotator twice
For this I plan to let teams set the threshold for the number of documents that should overlap and the number of annotators a text should be shown to. In some situations it could be useful for there to be some % of overlap for all annotators to help determine the inter-annotator agreement across the entire team.
> The workflow for annotators has to be perfected first
Totally agree. My biggest concern is building out the above on top of an inefficient workflow. That's one of the primary driving forces behind the current re-write of the tool.
Love the smart flagging, mass-edit, and integrated provider ideas!
I use these team features in Prodigy: I start annotation sessions with different session_id values and with the feed_overlap flag. I run Prodigy from an EC2 instance that annotators connect to.
The Prodigy team is working on a new version called Prodigy Scale with more team features. I'm looking forward to that release! For now it feels like a hack to use Prodigy in a team.
Inter-annotator agreement is key! You could consider making that highly visible in your tool. It's something that every team should measure and strive to maximize.
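For two annotators labeling the same items, one simple way to put a number on that agreement is Cohen's kappa; a minimal sketch with scikit-learn and toy labels:

```python
from sklearn.metrics import cohen_kappa_score

# Labels from two annotators on the same five items (toy data).
annotator_a = ["POS", "NEG", "POS", "NEG", "POS"]
annotator_b = ["POS", "NEG", "NEG", "NEG", "POS"]

kappa = cohen_kappa_score(annotator_a, annotator_b)
print(f"Cohen's kappa: {kappa:.2f}")  # 1.0 = perfect agreement, ~0 = chance level
```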
For developers who use spaCy in production (like me), I imagine it would be very hard for your tool to come out on top of Prodigy. But there could be an opportunity with price-sensitive hobby users or devs who use a different NLP library.