My current hypothesis here is that the way to make coding assistants as reliable as possible is to shift the balance towards making their output rely on context provided in-prompt rather than information stored in LLM weights. As all the major providers shift towards larger context windows, it seems increasingly viable to give the LLM the necessary docs for whatever libraries are being used in the current file. I've been working on an experiment in this space[0], and while it's obviously bottlenecked by the size of the documentation index, even a couple hundred documentation sources seem to help a ton when working with less-used languages/libraries.
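Roughly, the mechanism looks something like this - a toy Python sketch, where DOC_INDEX stands in for a real documentation index and the import detection is deliberately naive:

    import re

    # Stand-in for a real documentation index (library name -> docs excerpt).
    DOC_INDEX = {
        "requests": "requests.get(url, params=None, **kwargs) -> Response ...",
    }

    def build_prompt(source: str, question: str) -> str:
        # Naively detect which libraries the current file imports.
        libs = set(re.findall(r"^\s*(?:import|from)\s+(\w+)", source, re.MULTILINE))
        docs = "\n\n".join(DOC_INDEX[lib] for lib in sorted(libs) if lib in DOC_INDEX)
        # Put the docs in-prompt so the model leans on provided context
        # rather than whatever is baked into its weights.
        return f"Documentation:\n{docs}\n\nFile:\n{source}\n\nTask: {question}"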
Yeah, I've been using it with prompts that ask it to cite sources as well. Honestly, I think the best results come when I'm still interacting with the docs directly in addition to having the LLM look at them - still can't quite replace needing to RTFM!
This is the way forward imo, particularly as we've started to flesh out the relationship between model size and true context reliability. We've found that raw context-window size is not representative of what a model can consistently recall, but also that recall is reliable out to a point. I suspect more robust theoretical models of superposition will move us a long way towards understanding the limits of context reliability, rather than having to probe those limits experimentally as we do now.
I've been working on Lightrail for a couple months now, just added gpt-4-vision-preview support and figured it was as good a time as any to do a Show HN. It's still very rough around the edges, but I'm hoping it can provide a compelling alternative to siloed per-app AI assistants. So many of my favorite AI workflows involve working across apps, so I wanted to build something that made those workflows easy -- and that can support a community of new integrations and workflows/actions that are simple to throw together. Would love to hear feedback & happy to answer any questions!
Not the GP, but I've been working on an open platform [0] for integrating OpenAI APIs with other tools (VSCode, JupyterLab, bash, Chrome, etc.) that you might find interesting; the VSCode integration supports editing specific files/sections.
Also worth taking a look at GitHub Copilot Chat[1]; it's a bit limited, but in certain cases it works well for editing specific parts of files.
Check out marcel: https://marceltheshell.org, and https://github.com/geophile/marcel. Both marcel and nushell start with the idea of piping structured data instead of strings, which is incredibly powerful. (This also applies to osh. I am the author of osh and marcel.)
Marcel (and osh) rely on Python types and the Python language where typical shells have sublanguages. So instead of awk or find and their sublanguages, you just use Python. Instead of piping strings, you pipe streams of Python values.
Marcel lets you use Python on the command line. It also has an API which allows you to use shell-like commands inside Python programs.
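To make the contrast concrete (plain Python below, not marcel's actual syntax): a traditional shell pipes text and you parse columns back out with awk, whereas with value streams the rows are already structured objects:

    import os

    # String piping (traditional shell) forces you to re-parse text:
    #   ls -l | awk '$5 > 1000000 {print $9}'
    # With value streams, each "row" is already a structured Python value.
    entries = (e for e in os.scandir(".") if e.is_file())
    for e in entries:
        if e.stat().st_size > 1_000_000:
            print(e.name, e.stat().st_size)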
- Typed inputs and IntelliSense. There is very basic support for types; even so, I'd love stricter types and type inference, as in TypeScript, so my terminal and shell can give me a hint about what kind of input a command is expecting. At the same time, IntelliSense should tell me which command flags are still available and what the viable inputs are, like cd suggesting only directories, or kubectl --namespace suggesting available namespaces.
- A concept of past commands as building blocks and/or interactive data wrangling. Many times I am mucking around in zsh to find a chain of commands that reliably gets the data I need out of some CSV or other source, retyping the same few commands with some new links until it works the way I want it to.
- A command like EXPLAIN in SQL so I can see where I should rethink what I am doing so I can refactor that part of the chain. At the same time, I'd love it if I could take one of those magic snippets from Stack Overflow and have an EXPLAIN-like command pick the components apart and explain the flags via some structured docstring format.
- Snippets in the Shell for some regularly used patterns of command chains.
- The concept of transactions, like in SQL, so I can run a command or script without worrying about it failing halfway through, with the shell automatically undoing its changes. Maybe this should even work at the level of a whole shell session (rough sketch below).
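For the transaction idea, a toy Python sketch of the rollback behavior I have in mind - snapshot a working directory, restore it if any command fails. (A real version would want filesystem-level snapshots, and this gives no isolation.)

    import shutil
    import subprocess
    import tempfile
    from pathlib import Path

    def run_transaction(commands: list[str], workdir: str) -> None:
        # Snapshot the directory before running anything.
        snapshot = Path(tempfile.mkdtemp(prefix="txn-")) / "snap"
        shutil.copytree(workdir, snapshot)
        try:
            for cmd in commands:
                subprocess.run(cmd, shell=True, cwd=workdir, check=True)
        except subprocess.CalledProcessError:
            # Roll back: discard the modified tree, restore the snapshot.
            shutil.rmtree(workdir)
            shutil.copytree(snapshot, workdir)
            raise
        finally:
            shutil.rmtree(snapshot.parent)

    # run_transaction(["./step1.sh", "./step2.sh"], "build")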
This is basically a function on streams that takes an input N and filters for files that changed in the last N days. To use it:
    ls -fr | recent 3
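In plain Python terms (not marcel's actual pipeline syntax), recent amounts to a filter over a stream of files:

    import os
    import time

    def walk_files(root):
        # Rough analogue of `ls -fr`: files only, recursive.
        for dirpath, _, filenames in os.walk(root):
            for name in filenames:
                yield os.path.join(dirpath, name)

    def recent(files, n):
        # Pass through only files modified in the last n days.
        cutoff = time.time() - n * 86400
        return (f for f in files if os.path.getmtime(f) >= cutoff)

    for path in recent(walk_files("."), 3):
        print(path)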
I don't understand what EXPLAIN would do. In SQL, EXPLAIN helps you see the actual implementation chosen for a non-procedural statement. Marcel, nushell, etc. have no optimizer, so there is no invisible execution plan that EXPLAIN could make visible.
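For anyone unfamiliar, here's the sort of thing SQL's EXPLAIN surfaces (sqlite's EXPLAIN QUERY PLAN variant, via Python):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, x INTEGER)")
    # EXPLAIN QUERY PLAN reveals the access path the optimizer picked.
    for row in conn.execute("EXPLAIN QUERY PLAN SELECT x FROM t WHERE id = 1"):
        print(row)  # e.g. (..., 'SEARCH t USING INTEGER PRIMARY KEY (rowid=?)')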
Transactions: might be feasible with a shell built on some kind of COW filesystem, but I'm dubious about how much transaction isolation is really possible even then.
Basically, I think that LLMs will enable a whole new set of app UXes, and I'm trying to build a platform for those UXes. In a sense, a "shell" for LLM apps. It's a command-bar style UI with an SDK that makes it really easy to build functionality on top of LLMs / vector-dbs etc and to interact with other software/files (e.g. VSCode, Chrome, etc.). Currently very limited docs but if you're interested in building LLM workflows/tools, I'd love to collab!
Author of the piece, and I... can't really argue with any of this, tbh. I'll admit both those parts you called out were probably heavily tinged by my own desires, and your more-sober predictions are a very reasonable counterargument for what will happen in the general case. I suppose, especially regarding swappable LLMs, I _do_ only expect it to be an option for devs or sophisticated users; I assume most folks probably wouldn't care, and I'm just hoping there are enough of us who do care that at least some options offer that swappable functionality. Fwiw, I also use Linux (Fedora) as my daily driver, and I'd be more than content if the predictions from this post came true in a similar vein, e.g. as an OSS option (or family of OSS options) that some subset of users can opt to use.
OP here, as someone who does love cooking, I've gone down this route pretty heavily in the last few years - been growing my collection of physical cookbooks and definitely enjoy flipping through them in search of inspiration. So, yeah, very much endorse the cookbook UX!
I'm the author, and I don't disagree with this at all - I do use LLMs pretty meaningfully in my day-to-day engineering work, and I definitely agree that they hold a ton of promise in situations like the one you mentioned. To be clear, I'm very much bullish on the applications of LLMs to e.g. coding! The point I was trying to make there was just that for _certain_ tasks, the process of chatting with an LLM is, by nature, less precise and more arduous than a purpose-built UX. By analogy, we might have "describe-your-change" functionality in Photoshop, but it's no replacement for all the other pixel-perfect editing functionality that Photoshop provides, and I'd struggle to imagine a world where Photoshop is ever _replaced entirely_ by a chat UX.
Love it! Reminds me of the hilarious "Typing the technical interview" [1] and "Typescripting the technical interview" [2], a couple of my favorite blog posts of all time.
[0]: https://indexical.dev/