Hacker News
Ask HN: What AI Software Product/Software Do You Need?
13 points by iiJDSii on Dec 12, 2023 | 31 comments
Just looking for people with problems, perhaps some inspiration for something to build.

Maybe you notice an issue in your day-to-day, at work, or have an idea you want but don't have the time to do it yourself.

Please share! Cheers :)




> perhaps some inspiration for something to build.

First off, thank you for stating this explicitly. I get annoyed by the posts clearly fishing for ideas while acting like they have no specific motive for the questions. So I like your transparency!

Honestly, I have a desire for an AI digital art product that has nothing to do with LLMs, but would instead be akin to what "style transfer" work was heading towards a few years back, but never quite landed where I hoped... what I'd like to see is an AI that can look at multiple artistic works done by the user, and blend them together for new results, but all sourced on that user/artist's original works. Something that lets me say, "I love how the aesthetic of my painting turned out, but I wish the image depicted this photo I took." Style transfer, but personal styles only, and without the "deep mind" artifacts that filled that stuff back in the day.

I have no idea if people have kept working on those areas or not - I found references to various GANs in more recent years, but they all still seem to suffer from those "deep mind" artifacts that made the initial work interesting but ultimately unusable for creative pursuits.


Thanks for your response. Yeah, I don't see any reason to beat around the bush - I'm looking for ideas! And I figure people can dump something they'd like built but maybe don't have the time/effort to do. Doesn't need to be anything huge or important, just some starting point that ideally does something useful for someone.

Regarding yours, I think that's really cool but probably not within my purview :) Still appreciated though. Doing some basic research I played around with DALL-E and Midjourney, but found them very hard to "guide" in such a way. Though I'm sure something more relevant and powerful is on the horizon.


I recently read somewhere that a paper is a small part of a long, ongoing conversation. I found this description apt.

To comprehend the latest paper, I frequently find it necessary to grasp the context provided by its citations. All pertinent papers collectively form a directed acyclic graph (DAG).

Is there a tool available that would enable me to organize papers in a DAG, allowing me to formulate a structured reading plan? Currently, my PDFs are scattered across different locations, and I essentially have to rely on memory to recall the dependencies between papers.

I also want to share my organized papers with the same research group.
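
A minimal sketch of the reading-plan idea, assuming the citation links are already known: if each paper is mapped to the papers it cites, a topological sort gives an order in which every cited paper comes before the paper that cites it. The paper titles and the `cites` mapping are made up for illustration; `graphlib` ships with Python 3.9+.

```python
from graphlib import TopologicalSorter

# Hypothetical example data: each paper maps to the papers it cites
# (its prerequisites). Titles are placeholders, not real references.
cites = {
    "Paper D (latest)": {"Paper B", "Paper C"},
    "Paper C": {"Paper A"},
    "Paper B": {"Paper A"},
    "Paper A": set(),
}

def reading_plan(cites):
    """Return papers ordered so every cited paper comes before its citer."""
    # TopologicalSorter takes {node: predecessors}; static_order() yields
    # predecessors first and raises CycleError if the graph has a cycle.
    return list(TopologicalSorter(cites).static_order())

plan = reading_plan(cites)
for position, title in enumerate(plan, 1):
    print(position, title)
```

Sharing the plan with a research group could then be as simple as sharing the `cites` mapping in a file, though that part is not sketched here.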


For research papers, scite.ai makes citation contexts super obvious.

e.g. for this DOI: 10.1016/j.biopsych.2005.08.012

here is our report page: https://scite.ai/reports/association-between-amygdala-hypera...

Metadata at the top, the list of citations below with relevant contexts from those citations about your paper of interest.

There's a visualization feature as well.


That's a really good one. As a prototype, what do you think about something like:

- Paper you are currently reading taking up 2/3rds of the screen (left side)
- Right side of the screen is a graph of references from, and references to, the paper
- You can click on any node in the tree (paper titles of references and referencees) and then that paper loads up on your left side reading pane?

(Bonus: maybe nodes in the tree can be sized relatively by how cited a paper is.)

Also, would using a single site like arxiv.org be sufficient? Maybe for some fields of research but not others? As a dev I'd want to identify a single main repository as a starting point.


Fractal book summaries: I'd like a tool that allows me to put in a PDF, .epub, or .mobi of a book, and have it output a chapter by chapter summary of varying degrees of summarization. So that then I can read the book in a fractal way. I can start with a one paragraph summary of each chapter, and I can click on anything I want to see more detail of, and it'll be instant. (So, it does all the summarization one off before I start reading)
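
One way the "instant" part could work, as a rough sketch: compute every summary level for every chapter once, up front, so that expanding a chapter is just a list lookup. The `summarize` function below is only a placeholder that truncates text; a real tool would presumably call a summarization model there. All names and word budgets are assumptions for illustration.

```python
from dataclasses import dataclass, field

def summarize(text: str, max_words: int) -> str:
    """Placeholder summarizer: keeps the first max_words words.
    A real tool would call a summarization model here (assumption)."""
    words = text.split()
    suffix = " ..." if len(words) > max_words else ""
    return " ".join(words[:max_words]) + suffix

@dataclass
class Chapter:
    title: str
    text: str
    levels: list = field(default_factory=list)  # coarse -> fine summaries

def precompute(chapters, budgets=(10, 40, 160)):
    """Build every summary level up front so expanding is a list lookup."""
    for chapter in chapters:
        chapter.levels = [summarize(chapter.text, n) for n in budgets]
        chapter.levels.append(chapter.text)  # finest level is the full text
    return chapters

def expand(chapter, level):
    """Return the summary at a detail level, clamped to the full text."""
    return chapter.levels[min(level, len(chapter.levels) - 1)]

# Hypothetical chapter made of 50 placeholder words.
book = precompute([Chapter("One", " ".join(f"w{i}" for i in range(50)))])
print(expand(book[0], 0))
```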


I want something that will let me hook up the output of an LLM to a bash terminal and feed the terminal output right back into the LLM, maybe in a container. I would want to have another prompt for instructing the LLM on what its goal is. For example, make a script that prints out an ASCII picture of a cat. The LLM then gets to work in the bash terminal, using vi or whatever to bang out the script. The supervisor LLM would be able to ask questions or get additional input when it wanted to. I would want this to be sane and not awful to use. Points for free software. Big points for locally hosted.
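
The core loop could look roughly like this sketch: ask the model for the next shell command, run it, and show the model the output on the next turn. `fake_model` is a hypothetical stand-in for a real (ideally local) LLM call, and nothing here is sandboxed, so a real setup would want to run it inside a container. The supervisor prompt is not covered.

```python
import subprocess
import tempfile

def fake_model(goal, history):
    """Stand-in for an LLM call (hypothetical): returns the next shell
    command, or None when it considers the goal done."""
    if not history:
        return "echo 'meow' > cat.txt"
    if len(history) == 1:
        return "cat cat.txt"
    return None

def run_agent(goal, model, max_steps=5, cwd=None):
    """Feed each command's output back to the model until it stops."""
    history = []  # list of (command, output) pairs shown to the model
    for _ in range(max_steps):
        command = model(goal, history)
        if command is None:
            break
        # Runs in the system shell; a real setup should isolate this.
        result = subprocess.run(command, shell=True, capture_output=True,
                                text=True, timeout=10, cwd=cwd)
        history.append((command, result.stdout + result.stderr))
    return history

# Run in a throwaway directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as workdir:
    history = run_agent("print a cat", fake_model, cwd=workdir)

for command, output in history:
    print("$", command)
    print(output, end="")
```

The `max_steps` cap and the per-command `timeout` are there so a confused model cannot loop or hang forever.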


Here's a fun project that you could try. Use TTS to narrate books but make the narration feel more realistic. Give each character in the book a unique voice. Leading characters should have voices based on their personalities. Use quote extraction and character attribution to tie characters to lines. Try to convey the human qualities with EmotionML, SSML, or some kind of semantic analysis.

The best would be a TTS system at the level of OpenAI's but with voice selection like GCP TTS so you can get quality and a range of voices.

Copyright would probably spike any monetization effort but you could try. It would be nice to have an open source tool for this though! :)
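
For the quote extraction and attribution step, a very naive sketch might look like the following. It only catches the pattern of a quoted line followed by "said/asked/replied Name", and the sample text and voice names are invented; real books would need something much more robust. The output wraps each line in an SSML `<voice>` element.

```python
import re
from xml.sax.saxutils import escape

# Naive pattern: a quoted line followed by "said/asked/replied Name".
QUOTE_RE = re.compile(r'"([^"]+)"\s+(?:said|asked|replied)\s+([A-Z][a-z]+)')

def attribute_quotes(text):
    """Return (speaker, line) pairs found by the naive pattern."""
    return [(m.group(2), m.group(1)) for m in QUOTE_RE.finditer(text)]

def to_ssml(pairs, voices, default="narrator"):
    """Wrap each line in an SSML <voice> element for its speaker."""
    parts = [
        f'<voice name="{voices.get(who, default)}">{escape(line)}</voice>'
        for who, line in pairs
    ]
    return "<speak>" + "".join(parts) + "</speak>"

# Hypothetical text and voice names, invented for illustration.
text = '"Where are we going?" asked Alice. "To the harbour," said Bob.'
pairs = attribute_quotes(text)
print(to_ssml(pairs, {"Alice": "voice-a", "Bob": "voice-b"}))
```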


I think RAG is the sweet spot of current tech. I've got a client with a 50 year repository of technical reports and while my memory is great, the organization logic is abysmal.

Something that can locate files, excerpts, and timelines and do basic QA from a point-and-shoot setup would be a killer product for so many small-to-medium orgs. It's basically plug-and-play search to bring engineers, scientists, technical staff, etc. up to speed. It adds flexibility without having to "train" someone up. Basically, you bypass ever having to hire interns.
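
As a toy illustration of the retrieval half, here is a sketch that ranks documents by word-count cosine similarity and pastes the best matches into a prompt. Real systems typically use embedding models and a vector index instead; the report names and contents below are invented, and the final call to an LLM is left out.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, docs, k=2):
    """Return up to k document names, most similar to the query first."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(text))), name)
              for name, text in docs.items()]
    return [name for score, name in sorted(scored, reverse=True)[:k] if score > 0]

def build_prompt(query, docs, k=2):
    """Paste the retrieved excerpts ahead of the question for an LLM."""
    context = "\n".join(f"[{name}] {docs[name]}"
                        for name in retrieve(query, docs, k))
    return f"Answer using only these excerpts:\n{context}\n\nQuestion: {query}"

# Hypothetical report snippets, invented for illustration.
docs = {
    "report_1974.txt": "Pump station corrosion survey and pipe wall thickness measurements.",
    "report_1998.txt": "Budget summary for the fiscal year and staffing plan.",
    "report_2011.txt": "Follow-up corrosion inspection of the pump station piping and valves.",
}
print(build_prompt("pump station corrosion", docs))
```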


There are a few companies working on this, like Inkeep, Kapa AI, etc.


RAG: Retrieval-Augmented Generation


A simple YouTube thumbnail generator would be great, thank you.

Not all that complicated junk out there, but just something simple - take a photo or two, ask me to describe the thumbnail and the text to put on top, that’s it.

Yet no product out there can produce even a mediocre result.

Which makes me think… isn’t this wave of hype yet another scam?

Does anyone remember Bitcoin 5 years ago?


Just asking: what didn't you like about template-based design tools like Canva?


1. It wastes time
2. The thumbnail isn't unique or based on my specific needs
3. You can’t just tell it what you want like you presumably would with a decent AI model


A very smooth, no-BS app that does handwritten text -> Markdown.

It should do headings, pictures (as local files), and other kinds of formatting as well.


I'd recommend Typora. No-frills markdown editor with WYSIWYG support. It has a nice feature where if you copy-paste media such as images into the document it can optionally create a subfolder at the same level as the markdown file to organize the doc's associated media.


Obsidian does the same and much much more.

I want to convert handwritten notes to text - smoothly.


An AI that is local and can tap into (either through fine-tuning or RAG) the complete context of what I see, hear, think, ingest, and excrete so that I can get a better understanding of who I am and how I can improve. I want to see trends about myself that could only be discovered by an AI that knows more about me than I know myself.


Actually working on something like that right now. Hopefully have a rough prototype in a month or so.


anyone who creates this, and scales it, will take over the world

me too, I want this to be a real product

should it be a new device from a new player? or should it be from the big players (apple, google, etc), and on current devices (phones, comp, etc)?


Something like the GPTs from OpenAI that I can run locally without being hosted in the cloud


Out of curiosity what are the use cases/benefits for running models locally?

I also have this fascination because I'd like to have my LLM exist like a 'pocket calculator' that's safe from any technical fiasco in the future...

But aside from this I'm wondering why others are into this as well.


The benefit is data ownership instead of sending it off to a third-party who does God knows what to it.


Latency, customisation, privacy, cost, availability.


I wonder what AI used "for good" would look like. To take all the disinformation online and point out the contradictions and things that do not make sense in a clear way.

Of course, it remains to be seen if people would be convinced or look for their already established opinions, etc. Still, cynicism aside, I wonder what something that balances that toxicity using AI would look like. Maybe reducing news to plain facts, like news wire services?

You could train on the comments of major newspapers.


A native macOS app that uses local LLMs for writing in any application, think enhanced autocomplete / tab completion etc... 100% local and no Electron.


Markup-friendly spell-checking that doesn't suck. It should spell-check comments in C code, but not syntax. It should work well with Emacs.
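
The "comments but not syntax" part can be sketched by pulling the comments out of the source first and only checking words inside them. This is a simplified illustration: the regex does not handle comment-like text inside string literals, and `WORDS` is a tiny stand-in for a real dictionary.

```python
import re

# Matches /* block */ and // line comments. Simplification: it does not
# skip comment-like text inside string literals.
COMMENT_RE = re.compile(r"/\*.*?\*/|//[^\n]*", re.DOTALL)

def misspelled_in_comments(source, dictionary):
    """Return (line_number, word) for comment words not in the dictionary."""
    hits = []
    for match in COMMENT_RE.finditer(source):
        # Line number of the comment's first line (1-based).
        line = source.count("\n", 0, match.start()) + 1
        for offset, text in enumerate(match.group().split("\n")):
            for word in re.findall(r"[A-Za-z']+", text):
                if word.lower() not in dictionary:
                    hits.append((line + offset, word))
    return hits

# Tiny stand-in word list; a real checker would load a full dictionary.
WORDS = {"compute", "the", "sum", "of", "two", "numbers", "return", "result"}

c_source = """\
/* Compute the summ of two numbers */
int add(int a, int b) {
    return a + b; // retrun the result
}
"""
print(misspelled_in_comments(c_source, WORDS))
```

Identifiers such as `int` and `add` are never checked because they sit outside the matched comments.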


iOS/Android keyboard that is context aware and can correct your typos based on context.


Doesn't Gboard on Android already do that? On my phone, it has a setting that says "Next-word suggestions: Use previous words in making suggestions", and I can sometimes tell that it's obviously doing that.


I think some of Gboard's features depend on the phone and the language, e.g., "fix it."


Maybe, but that setting has been there for me for as long as I can remember.



