I would like to add that this is probably not deceptive advertising. At least not intentionally deceptive, as many people, including me, didn't know that CC licenses are not meant for software and are not considered open source. I don't know whether it is a common misunderstanding or not, but I think there is a strong case that some people would intuitively think so.
I think the license choice is great. It allows noncommercial use, modification, and redistribution. It's not "open source" according to the champions of the term (since it violates the use-for-any-purpose requirement), but I'm a huge fan of this license and license several of my projects CC-BY-NC where the AGPL would be too heavy-handed.
"It's not recognized as Open Source by the Open Source body, and doesn't meet the criteria of Free/Open Source Software, but is Open Source" is a bit like saying "I used GMO seeds and petroleum-based pesticides, but my produce is all organic."
Why should words like "organic" in relation to food mean without pesticides? I mean, all carbon- and water-based life forms are organic, right?
I can define Open Source easily, using the OSI definition.
There is no trademark for "Open Source" because the OSI failed to secure one, but we have decades of use of the term to mean something specific.
It might not be, but I can't understand how someone who has written such advanced software, includes a monetization plan, and then posts about it on HN doesn't also take the time to choose a license.
Even if they didn't know CC wasn't suitable for software, everyone knows that non-commercial isn't Open Source.
I didn't dig into the software, but I wonder whether the licenses for the dependencies even allow this, e.g. if any are GPL or similar.
This is wrong. CC is perfectly fine for software in some cases, such as here.
OK, CC is not tailored specifically for software, hence the general advice "you should use something else," but I do not see why CC would not be suitable here to achieve OP's goals.
> Unlike software-specific licenses, CC licenses do not contain specific terms about the distribution of source code, which is often important to ensuring the free reuse and modifiability of software. Many software licenses also address patent rights, which are important to software but may not be applicable to other copyrightable works. Additionally, our licenses are currently not compatible with the major software licenses, so it would be difficult to integrate CC-licensed work with other free software. Existing software licenses were designed specifically for use with software and offer a similar set of rights to the Creative Commons licenses.
Software licenses, especially the more "advanced" licenses such as the GPL, MPL, and others, include very specific language around what counts as use, what counts as distribution, what counts as linking, derived works, and, importantly, patents.
The CC licenses do an amazing job when it comes to artistic work such as books, movies, music, etc., but you don't have the same issues there, and that's why even CC says it doesn't recommend using them for software.
It is sad that this is happening to PhysicsForums. It was one of the first websites I used frequently 15 years ago, when I started my physics passion (later career). I was an active reader and contributed on a few occasions, and I still remember some members I hoped to one day be as smart and knowledgeable as. With the years and the move to social media following the Arab Spring, things started to change (as part of the overall shift away from forums being the dominant place for discussion). I stopped visiting around 2018 unless I arrived through a Google search (later Kagi). I still find the archive useful for answering some questions, and I disagree with the author of the article that because no one is sharing links on Twitter, no one cares.
It is a laptop. The memory is also shared, which means you can use it for non-gaming workloads as well. If you have laptop equivalents in the same memory range, feel free to share.
I have laptop equivalents in the same memory range, and they are at least $2,500 cheaper.
Unfortunately, it does not have "unified memory", a somewhat "powerful GPU", and of course no local LLM hype behind it.
Instead, I've decided to purchase a laptop with 128GB of RAM for $2,500 and then spend another $2,160 on ten years of a Claude subscription, so I can actually use my 128GB of RAM at the same time as using an LLM.
I see this comment all the time. But realistically, if you want more than 1 token/s, you're going to need GeForces, and that would cost quite a lot as well for 100 GB.
GB10, or DIGITS, is $3,000 for 1 PFLOP (@4-bit) and 128GB unified memory. Storage configurable up to 4TB.
Can be paired to run 405B (4-bit), though probably not very fast (the memory bandwidth is slower than a typical GPU's, and bandwidth is the main bottleneck for LLM inference).
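To put rough numbers on that bottleneck: during decoding, every generated token requires streaming essentially the full set of weights from memory, so bandwidth sets a hard ceiling on tokens per second. A back-of-envelope sketch (the ~273 GB/s bandwidth figure is an assumption from early coverage, not an official spec):

```python
def max_tokens_per_sec(params_b, bits_per_weight, bandwidth_gbps):
    """Upper bound on decode speed: bandwidth divided by model size in bytes.

    Ignores KV-cache traffic and activations, so real throughput is lower.
    """
    model_bytes = params_b * 1e9 * bits_per_weight / 8
    return bandwidth_gbps * 1e9 / model_bytes

# Assumed numbers: 405B parameters at 4-bit, ~273 GB/s LPDDR5X
print(round(max_tokens_per_sec(405, 4, 273), 2))  # roughly 1.35 tok/s ceiling
```

Which is why "probably not very fast" is an understatement for a model that size, even before accounting for the cache and activation traffic this estimate ignores.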
A lot of people have a problem with the selective enforcement of copyright law. Yes, changing it because it has been captured by greedy corporations is something many would welcome. But currently the problem is that normal folks doing what OpenAI is doing would be crushed (metaphorically) under the current copyright law.
So it is not like everyone who has a problem with OpenAI is reaching for a big cudgel. Also, OpenAI is making money (well, not profit; that is their issue) from the copyrighted work of others without compensation. Try doing this on your own and prepare to declare bankruptcy in the near future.
No, that is not an example of a "normal person" doing the same thing OpenAI is. OpenAI isn't distributing the copyrighted works, so those aren't the same situations.
Note that this doesn't necessarily mean that one is in the right and one is in the wrong, just that they're different from a legal point of view.
Is that really the case? I.e., can you get ChatGPT to show you a copyrighted work?
Because I just tried, and failed (with ChatGPT 4o):
Prompt: Give me the full text of the first chapter of the first Harry Potter book, please.
Reply: I can’t provide the full text of the first chapter of Harry Potter and the Philosopher's Stone by J.K. Rowling because it is copyrighted material. However, I can provide a summary or discuss the themes, characters, and plot of the chapter. Would you like me to summarize it for you?
"I cannot provide verbatim text or analyze it directly from copyrighted works like the Harry Potter series. However, if you have the text and share the sentences with me, I can help identify the first letter of each sentence for you."
Aaron Swartz's case, while an infuriating tragedy, is antithetical to OpenAI's claim of transformation; he literally published documents that were behind a licensed paywall.
That is incorrect, AFAIU. My understanding was that he was bulk downloading (using scripts) works he was entitled to access, as was any other student (though the average student was not bulk downloading them).
As far as I know he never shared them, he was just caught hoarding them.
> he literally published documents that were behind a licensed paywall.
No, he did not do this [1]. I think you need to read more about the actual case, which was brought based on his downloading and scraping of the data.
However, for these large repositories, I'm not sure that you fit in the effective context window. I know there is an option to limit the tokens, but then that would be your realistic limit.
Unless there is a significant increase in the effective context window of LLMs, pursuing the goal of having agents work on complex goals is not going to go well. All the tricks and hacks trying to work around this problem are not going to fundamentally change that.
LLM agents will lose track of what they are trying to do after a couple of trials. That's something that differentiates a human PhD: while not fast or always creative, they have a better attention span and memory.
https://github.com/MiniMax-AI/MiniMax-01 is an open model that claims a 4-million-token context. Note, however, that longer context makes evaluation expensive, as you are paying for every token. Still, it is true that OpenAI seriously needs a better solution for this.
I think it's time to partition the context into L1, L2, and L3 contexts. L1 is the current context with a quadratic memory requirement. L2 is based on fancy mechanisms such as what is used by Gemini and MiniMax-01, having a sub-quadratic to linear memory requirement. L3 is based on document and chunk embeddings having a linear to logarithmic memory requirement. LLMs don't use this approach, but I think it might make sense. As for how this partitioning would work at the neural layers, that remains to be determined.
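As a rough illustration of the idea at the prompt-assembly level (everything here is hypothetical: `TieredContext`, the truncation "summary" standing in for L2 compression, and the bag-of-words "embedding" standing in for a real embedding model):

```python
import math
from collections import Counter

def cosine(a, b):
    # Cosine similarity between two sparse bag-of-words vectors.
    num = sum(v * b.get(k, 0) for k, v in a.items())
    den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

class TieredContext:
    """Hypothetical sketch: L1 = verbatim recent window (full quadratic
    attention), L2 = compressed summaries of evicted turns, L3 = embedded
    chunks pulled back in by similarity search."""

    def __init__(self, l1_size=4):
        self.l1 = []          # recent turns, kept verbatim
        self.l2 = []          # evicted turns, stored compressed (here: truncated)
        self.l3 = []          # (embedding, full text) pairs for retrieval
        self.l1_size = l1_size

    def embed(self, text):
        # Stand-in for a real embedding model.
        return Counter(text.lower().split())

    def add(self, text):
        self.l1.append(text)
        if len(self.l1) > self.l1_size:
            evicted = self.l1.pop(0)
            self.l2.append(evicted[:40])                 # crude "summary"
            self.l3.append((self.embed(evicted), evicted))

    def build_prompt(self, query, k=2):
        # Assemble: top-k retrieved L3 chunks, recent L2 summaries, full L1.
        q = self.embed(query)
        hits = sorted(self.l3, key=lambda e: cosine(q, e[0]), reverse=True)[:k]
        return [text for _, text in hits] + self.l2[-k:] + list(self.l1)
```

The interesting open question is the one the comment ends on: this kind of tiering is easy at the prompt level, but doing it inside the attention layers themselves, with gradients flowing across tiers, is another matter.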
To avoid relying on projects like that in the future, I use a self-hosted version of Reactive Resume [1]. That way I can keep the last version that worked even if they change the license or it stops working. I would recommend everyone look into this as a potential solution.