Hacker News | lovelearning's comments

Different subsets of commenters, I think.

I don't often comment on HN in either of these cases but... I think aspects of both things are true.

There is a serious AI bubble right now and also it is the norm in the startup/VC world to fuck over regular employees.

I'm happy for any normal people who got 1.5M here. But even in this case, I believe it has more to do with weird poaching politics (and hype building) than with genuine altruism.


I think it's a useful insight for people working on RAG using LLMs.

Devs working on RAG have to decide between parsing PDFs or using computer vision or both.

The author of the blog works on PdfPig, a framework for parsing PDFs. For its document understanding APIs, it uses a hybrid approach that combines basic image understanding algorithms with PDF metadata: https://github.com/UglyToad/PdfPig/wiki/Document-Layout-Anal...

GP's comment says a pure computer vision approach may be more effective in many real-world scenarios. That's an interesting insight, since many devs would assume pure computer vision to be the less capable and more complex approach.

As for the other comments suggesting the direct use of a parsing library's rendering APIs instead of rasterizing the end result: the reason is that detecting high-level visual objects (like tables, headings, and illustrations) and getting their coordinates is far easier with vision models than trying to infer those structures by examining hundreds of low-level line, text, glyph, and other PDF objects. I suspect those commenters have never tried to extract high-level structures from PDF object models. Try it once using PdfBox, Fitz, etc. to understand the difficulty. PDF really is a terrible format!
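To make the difficulty concrete, here's a toy sketch of the kind of work that's needed even before you get anywhere near tables or headings: grouping low-level glyph/span boxes into visual lines. The (x, y, text) spans below are made-up positions, not real PdfBox/Fitz output; real PDFs emit hundreds of such fragments per page, in arbitrary draw order.

```python
from collections import defaultdict

# Hypothetical low-level spans: (x, y, text). Note the slightly
# different y values for glyphs on the same visual line -- typical
# of real PDF output, and why naive exact-y grouping fails.
spans = [
    (72, 100.0, "Total"), (110, 100.4, ":"), (200, 100.0, "42"),
    (72, 120.0, "Tax"), (200, 120.1, "7"),
]

def group_into_lines(spans, y_tol=2.0):
    """Cluster spans whose baselines fall within y_tol points,
    then order each cluster left-to-right by x."""
    lines = defaultdict(list)
    for x, y, text in spans:
        lines[round(y / y_tol)].append((x, text))
    return [" ".join(t for _, t in sorted(items))
            for _, items in sorted(lines.items())]

print(group_into_lines(spans))  # ['Total : 42', 'Tax 7']
```

And this heuristic line grouping is the easy part; deciding that a cluster of such lines forms a table cell or a heading is where pure parsing approaches get painful, which is what makes the vision route attractive.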


Why are such articles flagged? And why do only flaggers get a flag vote but there's no unflag vote?


If this News Media Alliance put some effort into enabling per-article micropayments or a prepaid credits system valid across all its members, there'd be fewer people looking to bypass paywalls.


A website owner can publish their website's capabilities or data as "tools". AI agents and LLMs like ChatGPT, in response to user prompts, can consult these tools to figure out their next actions.

Example:

1. An author has a website for their self-published book. It currently checks book availability against their database when "Add to Cart" is clicked.

2. The website publishes "check book availability" and "add to cart" as "tools", using this MCP-B protocol.

3. A user instructs ChatGPT or some AI agent to "Buy 3 copies of author's book from https://theirbooksite"

4. The AI agent visits the site and finds that it's MCP-B compliant. Using MCP-B, it gets the list of available tools. It finds a tool called "check book availability" and uses it to figure out whether ordering 3 copies is possible. If yes, it next calls the "add to cart" tool on the website.

The website here actively cooperates with the agent/LLM by supplying structured data, instead of being a passive collection of UI elements that AI chatbots have to figure out from UI layouts or captions, which are generally very brittle approaches.
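The steps above can be simulated in a few lines. To be clear, the dictionaries and function signatures here are hypothetical illustrations of the discover-then-call pattern, not the actual MCP-B wire format; only the tool names mirror the example.

```python
STOCK = 5  # copies the site's database says are available

# The site "publishes" its capabilities as named tools (step 2).
SITE_TOOLS = {
    "check_book_availability": lambda qty: STOCK >= qty,
    "add_to_cart": lambda qty: {"cart": {"book": qty}},
}

def agent_buy(qty):
    """The agent discovers the site's tools, then acts on them (step 4)."""
    tools = list(SITE_TOOLS)                    # get the tool list
    if "check_book_availability" not in tools:
        return None                             # site isn't cooperating
    if SITE_TOOLS["check_book_availability"](qty):
        return SITE_TOOLS["add_to_cart"](qty)   # availability ok: add to cart
    return None                                 # not enough stock

print(agent_buy(3))  # stock is 5, so 3 copies go in the cart
```

The point of the pattern is that the agent never has to guess which button is "Add to Cart" from pixels or captions; it calls a named, typed capability the site chose to expose.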


I'm not very familiar with Node.js. Any idea where in isoflow's code are the graphics for those 3D-style icons? Are they SVG or what? Is it possible to add custom icons?


I was curious also. The SVGs for the isoflow icon library are in node_modules/@isoflow/isopacks/dist/isopacks.md

(yes, svg within base64 within markdown)


Thank you for the parenthetical there. When I read your first line I thought, "surely they didn't.."

They did.


Curious what's the argument for/against that here. I agree it's at least unusual.


I don't have any good argument in either direction, if I'm being honest. Just feels.. weird.



Is predictability not essential for electricity, water, or cloud services?

I don't understand why news can't run on a postpaid pay-per-use model, which I think you are implicitly referring to. Note that pay-per-use doesn't necessarily imply micro-transactions: we pay utility bills just once a month, and cloud is either postpaid pay-for-use or prepaid credits that are deducted based on usage.
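The two models can be sketched side by side. The per-article price and article IDs below are made up purely for illustration.

```python
PRICE_PER_ARTICLE = 0.25  # hypothetical price

def postpaid_monthly_bill(articles_read):
    """Utility-style: meter usage during the month, bill once at the end."""
    return len(articles_read) * PRICE_PER_ARTICLE

def prepaid_deduct(balance, article_id):
    """Cloud-credits style: deduct from a prepaid balance on each read."""
    if balance < PRICE_PER_ARTICLE:
        raise ValueError("insufficient credits; top up required")
    return balance - PRICE_PER_ARTICLE

print(postpaid_monthly_bill(["a1", "a2", "a3"]))  # 0.75
```

Either way, the reader makes one payment per billing cycle or top-up, not one transaction per click, which is the distinction from true micro-transactions.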


It is, but the services I mentioned are very hard to do without. While alternatives exist, they're not widespread and require a significant shift in operating style to roll out. Newspaper content is not a necessity, so it doesn't work the same way.


Performative virtue signaling is not always easy.


OpenAI makes statements like: [1]

1) "excel at a particular task"

2) "train on proprietary or sensitive data"

3) "Complex domain-specific tasks that require advanced reasoning", "Medical diagnosis based on history and diagnostic guidelines", "Determining relevant passages from legal case law"

4) "The general idea of fine-tuning is much like training a human in a particular subject, where you come up with the curriculum, then teach and test until the student excels."

Don't all these effectively inject new knowledge? It may happen through simultaneous destruction of some existing knowledge, but that isn't obvious to non-technical people.

OpenAI's analogy of training a human in a particular subject until they excel even arguably excludes the possibility of destruction because we don't generally destroy existing knowledge in our minds to learn new things (but some of us may forget the older knowledge over time).

I'm a dev with a hand-waving level of proficiency. I have fine-tuned self-hosted small LLMs using PyTorch. My perception of fine-tuning is that it fundamentally adds new knowledge. To what extent that involves destruction of existing knowledge has remained a bit vague to me.

My hand-waving solution, if anyone pointed out that problem, would be to 1) include some of the foundational knowledge of the target subject in my fine-tuning data to compensate for its destruction, and 2) use a gold-standard set of responses to verify the model after fine-tuning.
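Both hand-waving steps can be sketched in a data-prep script. The 10% replay ratio, the example strings, and the model stub in the test are all made up for illustration; step 1 is essentially what the continual-learning literature calls "replay".

```python
import random

def build_finetune_set(domain_examples, foundation_examples,
                       replay_ratio=0.1, seed=0):
    """Step 1: mix a slice of general 'foundational' examples back into
    the fine-tuning set so older knowledge keeps being rehearsed."""
    rng = random.Random(seed)
    n_replay = int(len(domain_examples) * replay_ratio)
    mixed = domain_examples + rng.sample(foundation_examples, n_replay)
    rng.shuffle(mixed)
    return mixed

def gold_standard_score(model_fn, gold_pairs):
    """Step 2: fraction of gold-standard prompts the tuned model
    still answers correctly after fine-tuning."""
    return sum(model_fn(q) == a for q, a in gold_pairs) / len(gold_pairs)

domain = [f"legal-{i}" for i in range(100)]
foundation = [f"general-{i}" for i in range(1000)]
print(len(build_finetune_set(domain, foundation)))  # 100 domain + 10 replay = 110
```

Whether 10% replay is enough to prevent meaningful forgetting is exactly the kind of thing the gold-standard check is meant to catch; the article's point, as I read it, is that this mitigation is less reliable than it looks.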

I for one found the article quite valuable for pointing out the problem and suggesting better approaches.

[1]: https://platform.openai.com/docs/guides/fine-tuning


> Don't all these effectively inject new knowledge?

If you mean new knowledge in the sense of "improved weights for a particular task", I guess yes, but the issue with "new knowledge" is about learning something you didn't know in the first place, rather than being able to more accurately arrive at a conclusion.

1. "excel at a particular task" -> no. A lot of what gets in the way of excelling at a particular task is extraneous knowledge that leads to "thinking" about things that are not relevant to the task. If the job is "hot dog or not hot dog", knowing about the endless "hot dog is a sandwich" debate or the people with hot dog fingers in Everything, Everywhere, All At Once tends to just get in the way of doing the job as accurately and efficiently as possible.

2. "train on proprietary or sensitive data" -> no. Training on proprietary or sensitive data might not give you any new knowledge, but it may allow for much more refined weights to drive probabilistic decisions. So, if I train on a model with thousands of examples of X-rays of potential cancer patients, it doesn't learn new ideas, but it does learn better weights for determining if it is seeing a tumour.

3. "Complex domain-specific tasks that require advanced reasoning" "Medical diagnosis based on history and diagnostic guidelines" "Determining relevant passages from legal case law"

If you fine-tuned an engine to identify species of animals, ones that it is already aware of, you could produce a model that knows with high confidence that "jaguar" is a kind of cat, and not a car or a sports team. It has this high confidence because it has learned, after lots of examples, that knowing there's a car or a sports team with that name just gets in the way of making good judgments.

> OpenAI's analogy of training a human in a particular subject until they excel even arguably excludes the possibility of destruction because we don't generally destroy existing knowledge in our minds to learn new things (but some of us may forget the older knowledge over time).

That is a pretty broad statement about the workings of the human mind. We absolutely do lose neural pathways as our brains learn. I can't remember most of what I learned when I was 2 years old.


What's a primary source in this context? The Chinese government?

