Can you "chunkify" output so that you can rate different elements independently? Like "This part of the answer is totally cool" and "Wait. This part right here includes a hallucination."
Then allow for feedback to be provided if an issue is spotted.
Briefly: "No, you cannot use a star-tree index with upserts in Apache Pinot. This is a built-in restriction that is explicitly enforced in the system."
Why This Restriction Exists
When upserts are enabled for a table in Pinot, the system explicitly checks and prevents the use of star-tree indexes. This validation happens during table configuration validation.
The code in TableConfigUtils.java explicitly enforces this restriction:
[snip]
At a cursory glance, I did not detect a hallucination. The answer was true (AFAIK), clear, and objective. I also see that you can peek into other resources to get some additional contextual information.
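To make the "chunkify" suggestion concrete, here is a rough sketch of per-chunk rating with optional feedback. Everything in it is hypothetical: the `Chunk` type, the `chunkify`, `rate`, and `flagged` helpers, the rating labels, and the naive blank-line splitting are my own assumptions for illustration, not how the product works.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Chunk:
    """One independently ratable piece of an answer (hypothetical type)."""
    text: str
    rating: Optional[str] = None    # assumed labels: "ok" or "hallucination"
    feedback: Optional[str] = None  # free-text note from the reviewer


def chunkify(answer: str) -> list[Chunk]:
    # Assumption: paragraphs separated by blank lines are the rating unit.
    return [Chunk(p.strip()) for p in answer.split("\n\n") if p.strip()]


def rate(chunks: list[Chunk], index: int, rating: str,
         feedback: Optional[str] = None) -> None:
    # Record a rating (and optional feedback) on a single chunk.
    chunks[index].rating = rating
    chunks[index].feedback = feedback


def flagged(chunks: list[Chunk]) -> list[Chunk]:
    # Return only the chunks a reviewer marked as hallucinated.
    return [c for c in chunks if c.rating == "hallucination"]
```

A reviewer would then call something like `rate(chunks, 1, "hallucination", "this class does not exist")` on just the offending paragraph and leave the rest unrated or marked "ok".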
My favorite part is how incompetent they were in handling the redaction:
"But when the Kentucky AG’s office was preparing to post their brief against TikTok, whoever was in charge of doing the redaction simply covered the relevant text with black rectangles. Even though you can’t see the text while reading the PDF, you can just use your cursor to select each black section, copy it, and then paste it into another file to read the hidden text. It is great fun to do this — try it yourself! Or just read our version of the brief in which we have done this for you."
TIL the Nika riots took place in the Roman colosseum, and the blues and greens cheered as lone individuals from each deme were sent down for slaughter. Yeah, this is hot garbage in terms of accuracy.
@Samplank2 - this may hurt to hear, but your assertions that better models and pipeline improvements will solve this are pure cope. What you really need to do here is manually curate and tune the prompts, then cherry-pick with a fine eye for detail. There’s no substitute for actual effort and knowledge, but you seem uninterested in that part.
Yeah. This is going to be bad. This retraction is driven by the stark drop-off in consumer sentiment. Intelligent billionaires like Buffett, etc., take that into consideration. You know, all that big-brained macroeconomics stuff, like IS-LM curves. Grown-up stuff of real economists.
Too-cool-for-school sociopathic Pepe avatar tech dudebros don't know how the actual economy works, and worse, don't care.
I didn't even read the article, but I love the comments on the thread.
Yes. The implementation language of a system should not matter to people in the least. However, languages are used as a badge of prestige by developers and, sometimes, as a consumer warning label by practitioners.
There's certainly some aspect of that going on, but I think mainly it's just notable when you write something in a programming language that is relatively new.
Does it matter? In theory no, since you can write pretty much anything in pretty much any language. In practice... It's not quite that black and white. Some programming languages have better tooling than others; like, if a project is written in pure Go, it's going to be a shitload easier to cross compile than a C++ project in most cases. A memory-safe programming language like Go or Rust will tell you about the likely characteristics of the program: the bugs are not likely to be memory or stack corruption bugs since most of the code can't really do that. A GC'd language like Go or Java will tell you that the program will not be ideal for very low latency requirements, most likely. Some languages, like Python, are languages that many would consider easy to hack on, but on the other hand a program written in Python probably doesn't have the best performance characteristics, because CPython is not the fastest interpreter. The discipline that is encouraged by some software ecosystems will also play a role in the quality of software; let's be honest, everyone knows that you CAN write quality software in PHP, but the fact that it isn't easy certainly says something. There's nothing wrong with Erlang but you may need to learn about deploying BEAM in production before actually using Erlang software, since it has its own unique quirks.
And this is all predicated on the idea that nobody ever introduces a project as being "written in C." While it's less common, you definitely do see projects that do this. Generally the programming language is more of a focus for projects that are earlier in their life and not as refined as finished products. I think one reason it was less common in the past is that writing that something is written in C would just be weird. Of course it's written in C, why would anyone assume otherwise? It would be a lot more notable, at that point, if it wasn't.
I get why people look at this in a cynical way but I think the cynical outlook is only part of the story. In actuality, you do get some useful information sometimes out of knowing what language something is written in.
I do know of a shop where an OSS database written in Java was chosen over one written in C++ because of the ability of the internal team to read the code, modify it, troubleshoot it, etc. That makes sense. The choice was driven by pragmatics: maintainability. Not simply bias, or aesthetics, or "rule of cool."