Hacker News | rkunnamp's comments

Thank you for sharing this. Sorry for a possibly noob question: how are the embeddings generated? Does it use a hosted embedding model? (I was trying to understand how semantic search is implemented.)


It, uh... generates mock embeddings? https://github.com/trvon/yams/blob/c89798d6d2de89caacdbe50d2...

(seems like there's some vague future plans for models like all-MiniLM-L6-v2, all-mpnet-base-v2)


Hmm, I wonder how much that affects the compression benefits of block-level deduplication. The mock embeddings choose vector elements from a normal distribution, so it’s far from uniform.
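
For illustration, here is a minimal Python sketch of what a mock embedding of this kind might look like. It is not the project's code: the function name `mock_embedding`, the SHA-256 seeding, the 384 dimension (the output size of all-MiniLM-L6-v2) and the L2 normalization are all assumptions made for the example.

```python
import hashlib
import math
import random
from typing import List


def mock_embedding(text: str, dim: int = 384) -> List[float]:
    """Return a deterministic pseudo-random unit vector for `text`.

    Hypothetical stand-in for a real embedding model: the values carry
    no semantic information about the text.
    """
    # Assumption: seed the PRNG from a hash of the text so the same
    # input always maps to the same vector.
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))

    # Each element is drawn from a standard normal distribution.
    vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]

    # Assumption: normalize to unit length, as many real models do.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; for unit vectors this is just the dot product."""
    return sum(x * y for x, y in zip(a, b))
```

Because the vector depends only on a hash of the text, two near-identical sentences get unrelated vectors, so similarity scores from a mock like this say nothing about meaning.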


I couldn't get the search working (there was a CORS error). But what a feat and what a writeup. Wonderstruck!


Plus 1 for GCP. Can't wait to try it out.


Asking a question here that I have long wanted to ask. Is it possible to have a Python handler function that uses DuckDB to query an S3-hosted Parquet file, uses pandas for some data manipulation, and runs as a WASM app (leveraging all the features of DuckDB, like predicate pushdown)?



Sorry for asking a possibly noob question. Don't Firecracker VMs require bare-metal instances? And does GCP support provisioning bare-metal instances? Or are you able to run Firecracker on normal VM instances in GCP?


GCP supports nested virtualisation


An IPFS-like "coordination free" local S3 replacement! Yes. That is badly needed.


Sorry for having to give the cliché answer "it depends", but it really does depend.

It depends on requirements such as:

1. Hosting: whether you need a self-hosted version or can accept a cloud offering
2. Data connectors (not all products have native connectors to all data sources)
3. Data blending (not all products have data blending capabilities to the same degree)
4. Data size (some support data in MBs, some can support data in PBs)
5. Programming knowledge (some products will have only non-technical users, some will have technical users who need finer control)
6. Visualisation options
7. Reporting options (some have a PDF export option, while others do not)

...and the list is really long.

A large number of parameters go into choosing a BI/reporting solution.

One can probably pick a particular requirement and evaluate the best in class for that requirement. There is no one solution that excels in every single field.

Tableau has been the market leader for a long time.

And the list of offerings is endless: from traditional ones like Tableau, Looker, Sisense and QlikView to modern ones like Power BI, Domo, Tableau Online, Data Studio and BIME Analytics.

From open source solutions like redash.io, Airbnb Superset, Pentaho and JasperReports to closed source solutions like Crystal Reports...

From simple tools like Cyfe to extremely complex ones.

From generic solutions to industry-specific solutions like Baremetrics (for Stripe analysis) and my own ReportDash (for marketing reports).


As someone who has been watching Data Studio for a long time, I can vouch that most of the observations here are true.


I happen to build a competing service.

Yours is a great list. I saved it instantly! Thank you for that.


Stuff that helped me:

- Staying near or staying with your closest ones
- Sleep and exercise
- Some ayurvedic pills that are known to relieve stress and help sleep ("Ashwagandha" to be specific. I won't say it is a magic pill, but when I took it, I used to get deep sleep)

