We are working on Gossip: the future of media insights (think Meltwater, Brandwatch, etc).
It’s been a journey, but we’re getting close to launching our first version to pilot customers in August. We use an enormous amount of AI tokens every month to extract data that isn’t possible with any traditional player in the media monitoring space: benchmarking competitors, tracking impactful discussions, and delivering actionable brand insights.
If you are currently using one of the big media monitoring companies, I’d love to chat!
We are using over 50 billion LLM tokens per month for NLP/classification purposes, across a mix of self-hosted and cloud-hosted models. But I have not attempted any fine-tuning; just prompt and (perhaps more importantly) context “engineering”.
But in general we found the best course of action is to simply label everything, because our customers will want those answers, and RAG won’t really work at the scale of “all podcasts from the last 6 months: what is the trend of sentiment for Hillary Clinton, and what are the top topics and entities mentioned nearby?”. So we take a more “brute force” approach :-)
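As a rough sketch of what “label everything” buys you (the schema and numbers here are made up for illustration, not our actual pipeline): once every segment is labelled at ingest time, a question like the Hillary Clinton one above becomes a plain aggregation instead of a retrieval problem:

```python
# Hypothetical sketch of the "label everything up front" approach:
# every transcript segment gets entities/sentiment at ingest time,
# so trend queries become simple aggregations instead of RAG lookups.
from collections import defaultdict

# Pretend these rows were produced by the LLM labelling pipeline.
segments = [
    {"month": "2024-01", "entity": "Hillary Clinton",
     "sentiment": -0.2, "nearby_topics": ["election", "media"]},
    {"month": "2024-01", "entity": "Hillary Clinton",
     "sentiment": 0.1, "nearby_topics": ["books"]},
    {"month": "2024-02", "entity": "Hillary Clinton",
     "sentiment": 0.4, "nearby_topics": ["election", "podcasts"]},
]

def sentiment_trend(rows, entity):
    """Average sentiment per month, plus counts of co-mentioned topics."""
    by_month = defaultdict(list)
    topics = defaultdict(int)
    for row in rows:
        if row["entity"] == entity:
            by_month[row["month"]].append(row["sentiment"])
            for t in row["nearby_topics"]:
                topics[t] += 1
    trend = {m: sum(v) / len(v) for m, v in sorted(by_month.items())}
    return trend, dict(topics)

trend, topics = sentiment_trend(segments, "Hillary Clinton")
```

The real thing obviously runs over a proper store rather than a Python list, but the shape of the query is the same.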
I'm working on a podcast and audio media monitoring and research platform. Easily the most complex thing I've built. But still so much more to do to make it truly useful. Testing PR and public affairs use cases at the moment.
I used to work for a media monitoring company and was instantly struck by how old-fashioned everything was. The total reliance on boolean searches meant only experts could find relevant information. This still appears to be the case for most players in the industry.
So I'm building a platform that finds what is important before you look for it: novel entity linking, sentiment analysis, and speaker tracking. It has come a long way from the proof of concept. I'm focused on audio media at the moment, as it is (in my opinion) the hardest to index compared to news articles. And the hypothesis is that audio media such as podcasts contain so many juicy insights.
Next steps are converting pilot customers to paying customers, testing more markets (we're based in a tiny market now), raising a small pre-seed (bootstrapped at the moment), and quickly evolving the product based on feedback.
yes, definitely some skin-deep similarities! We both transcribe podcasts, but what we do after that is very different ;-) (also, I'm a fan of his podcast).
More generally, there are also Mention, Brand24, Meltwater, etc. in the media monitoring space. But all are generally weak when it comes to audio media.
When we stop looking at audio media like a newspaper article, things get more interesting.
What kind of problems are you facing with entity extraction on 3.5? I'm also working with 3.5 for entity extraction and entity linking. It's a fun pipeline, but I'm curious what issues you ran into.
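For what it's worth, here's a minimal sketch of the kind of defensive parsing we ended up needing (the prompt and function names are illustrative, not our actual pipeline): smaller models often wrap the JSON in prose, so you pull the JSON object out of the reply and drop any entity whose surface form doesn't actually appear in the source text, as a cheap guard against hallucinated extractions:

```python
# Hypothetical entity-extraction step: strict JSON prompt, then
# defensive parsing of whatever the model wraps around its answer.
import json
import re

PROMPT_TEMPLATE = (
    "Extract named entities from the text below. "
    'Reply ONLY with JSON: {"entities": [{"name": ..., "type": ...}]}\n\n'
    "Text: {text}"
)

def parse_entities(model_reply, source_text):
    """Pull the first JSON object out of the reply; drop entities
    whose name doesn't occur verbatim in the source text."""
    match = re.search(r"\{.*\}", model_reply, re.DOTALL)
    if not match:
        return []
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return []
    return [e for e in data.get("entities", [])
            if e.get("name", "").lower() in source_text.lower()]

# A canned reply standing in for an actual model call:
reply = ('Sure! Here you go: {"entities": ['
         '{"name": "Hillary Clinton", "type": "PERSON"}, '
         '{"name": "NATO", "type": "ORG"}]}')
text = "On the show they discussed Hillary Clinton at length."
entities = parse_entities(reply, text)  # NATO gets filtered out
```

The surface-form check is crude (it misses coreference and aliases), but it catches the most annoying 3.5 failure mode: confidently extracting entities that were never mentioned.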
I can definitely argue that mirrorless is better for most users! Thinner bodies; modern lens selections (so faster autofocus, wider apertures, etc.); real-time image previews; better low-light preview (on both the rear screen and in the viewfinder); and video recording capabilities that better match professional video cameras (by removing mirror complexity), which suits the hybrid needs of the modern camera buyer.
It had to be said, and it's true. Mirrorless camera displays give off light; the screen is brighter than ambient light. It can't get dim enough to preserve night vision if you're using a telescope, for instance. You can shoot in very low light with the high ISO speeds available on a modern camera, but if you ruin your night vision you lose awareness of everything except what you're shown in the finder. I'm sure you can think of other examples.
> Even with async/await it's still single-threaded.
That doesn't mean anything. V8 is single-threaded but Node.js I/O is non-blocking. The reason Node became popular in the first place is that companies started adopting it to fill the gaps in their existing infrastructure (e.g. Java) to offer "realtime" (i.e. web sockets or its experimental equivalents) communication.
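The point that a single thread can still service overlapping I/O isn't Node-specific, by the way; here's a minimal analogy using Python's asyncio (same event-loop idea, with a sleep standing in for a network call):

```python
# Analogy sketch (Python asyncio, same event-loop model as Node):
# one thread, but "I/O" waits overlap instead of blocking each other.
import asyncio
import time

async def fake_io(name, delay):
    await asyncio.sleep(delay)  # stands in for a non-blocking network call
    return name

async def main():
    start = time.monotonic()
    # Both waits are serviced concurrently by a single thread.
    results = await asyncio.gather(fake_io("a", 0.1), fake_io("b", 0.1))
    elapsed = time.monotonic() - start
    return results, elapsed

results, elapsed = asyncio.run(main())
# elapsed is ~0.1s, not ~0.2s: the waits overlapped on one thread.
```

Blocking CPU-bound work would of course stall the loop in both runtimes; "single-threaded" only becomes a problem when you do heavy computation on the loop thread.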
> IMO async/await is error-prone and isn't an ideal programming model.
What's the point of unsubstantiated statements like that (btw "x is error-prone" is an empirical claim, so prefixing it with "IMO" just means "I can't back this up and don't care if it's true") other than stirring up pointless language rivalries?
Just acknowledge that your off-hand comment about "callback hell" was anachronistic and don't try to come up with excuses to justify your preferences. I think Elixir is neat and hope it sees enough adoption for me to justify getting invested in it, but that doesn't justify pooh-poohing other languages, especially ones you admit you don't have up-to-date knowledge about.
My complaint isn't that you don't provide research. My complaint is that you use "IMO" to make a claim that could easily be substantiated instead.
E.g. "in my experience async/await can easily result in bugs that can be hard to detect" or "when teaching beginners, I've found that they have a harder time wrapping their head around async/await than when learning about agents" or "async/await still requires writing imperative code, requiring the programmer to pay attention to behavior that agents can abstract away through declarative code". Now, I don't know if any of those statements are true or if they reflect your experience but these are examples for what you could have said, assuming you didn't just want to say "I don't like JavaScript and I prefer Elixir or Erlang".
And yes, people just dumping strong opinions with little more substance than gut feelings is very much a problem in Software Engineering. That doesn't mean we can't work on that and practice a little more hygiene and respect for each other.
EDIT: To be clear, saying "I don't like X and I prefer Y or Z" is perfectly fine too, as long as you are honest about this being your own preference rather than some grand truth about the universe. The problem comes from insisting that everyone else is wrong for not feeling the same way.
I've always taken that as the characters not knowing which language is being spoken. If it's been established I expect to see [SPEAKING MANDARIN] or some such.
They don't subtitle it because the audience isn't expected to know what's being said.
But when it's subtitled in the movie itself (which is then covered by the TV's closed-captioning system), the audience clearly is expected to know what's being said.
https://www.gossipinsights.com/en/top-companies/us/