I stumbled upon some great reddit posts this year with reading suggestions, and compiled my own "humanity is fucked" themed reading list, which included:
* Mercy of Gods by James S.A. Corey
* The Light Pirate by Lily Brooks-Dalton
* Oryx and Crake by Margaret Atwood
* Dawn by Octavia Butler
I then diverged from this list (I have more) to re-read (though it's not such a great divergence):
* If This Is a Man / The Truce by Primo Levi
Other books I enjoyed reading this year in no particular order:
* Tau Zero by Poul Anderson
* Machine Vendetta by Alastair Reynolds
* Elysium Fire by Alastair Reynolds
* Aurora Rising by Alastair Reynolds
* Shadow of the Silk Road by Colin Thubron (loved this)
* The Lord of the Rings (God knows how many times re-read)
* The Centauri Device by M. John Harrison
* Future's Edge by Gareth Powell
* Blueshift by Joshua Dalzelle
* The Heart of a Continent by Francis Younghusband (I didn't quite manage to finish it, but it was a fascinating read nonetheless)
A decoder predicts the next word (token), iteratively generating a whole sentence. An encoder masks a word in the middle of a sentence and tries to predict that masked word.
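To make the contrast concrete, here's a minimal sketch using Hugging Face's transformers pipelines, with the public bert-base-uncased and gpt2 checkpoints purely as stand-ins (my choice of models, not anything the thread specifies):

```python
from transformers import pipeline

# Encoder (BERT-style): mask a word in the middle and predict what goes there.
fill = pipeline("fill-mask", model="bert-base-uncased")
print(fill("The cat sat on the [MASK]."))  # ranked guesses for the masked word

# Decoder (GPT-style): keep predicting the next token to extend the text.
gen = pipeline("text-generation", model="gpt2")
print(gen("The cat sat on the", max_new_tokens=5))
```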
The original transformer paper from Google was encoder-decoder, but then encoder-only BERT was hot, then decoder-only GPT was hot; now encoder-decoder is hot again!
Decoders are good at generative tasks - chatbots etc.
Encoders are good at summarization.
Encoder-decoders are better at summarization. It’s steps towards “understanding” (quotes needed).
It's an alternative LLM architecture, and it actually predates modern LLMs: an encoder-decoder model was the architecture used in the "Attention Is All You Need" paper that introduced the transformer and essentially gave birth to modern LLMs.
An encoder-decoder model splits input and output. This makes sense for translation tasks, summarization, etc. They're good when there's a clear separation between "understand the task" and "complete the task", but you can use them for anything really. An example would be sending "Translate to English: Le chat est noir." to the encoder: the encoder processes everything in a single step, that is, it understands the task as a whole; then the output of the encoder is fed to the decoder, and the decoder runs one token at a time.
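As a rough sketch of that flow (using the public t5-small checkpoint as a stand-in; it happens to use "translate French to English:" as its task prefix):

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# The encoder reads the whole prompt in a single step...
inputs = tokenizer("translate French to English: Le chat est noir.", return_tensors="pt")

# ...then generate() runs the decoder one token at a time.
output_ids = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))  # e.g. "The cat is black."
```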
GPT ditches the encoder altogether and just runs the decoder with some slight changes. This makes it more parameter-efficient, but it tends to hallucinate more because past tokens contain information that might be wrong. You can see it as the encoder running on each token as it is read/generated.
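For contrast, here's a decoder-only sketch (gpt2 as an arbitrary stand-in model): the prompt and the answer live in one token stream, and the model just keeps appending its own greedy next-token guess:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

ids = tokenizer("Translate to English: Le chat est noir. Answer:", return_tensors="pt").input_ids
for _ in range(8):
    logits = model(ids).logits                            # re-reads everything so far
    next_id = logits[:, -1].argmax(dim=-1, keepdim=True)  # greedy next-token pick
    ids = torch.cat([ids, next_id], dim=-1)               # append and repeat
print(tokenizer.decode(ids[0]))
```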
Edit: On re-reading I noticed it might not be clear what I mean by past tokens containing wrong information. I mean that for each token the model generates a hidden state, and those states don't change. So, for example, an input of 100 tokens will have 100 hidden states; the states are generated all at once in the encoder model, and one token at a time in decoder models. Since the decoder doesn't have the full information yet, the hidden state will contain extra information that might not have anything to do with the task, or may even confuse it.
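Here's a small self-contained sketch of that "all at once vs. one at a time" point, again with t5-small as a stand-in (my example, not the poster's setup): the encoder emits a hidden state for every input token in one pass, while the decoder's states grow one step at a time:

```python
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tok = T5Tokenizer.from_pretrained("t5-small")
t5 = T5ForConditionalGeneration.from_pretrained("t5-small")

enc_in = tok("translate French to English: Le chat est noir.", return_tensors="pt")
enc_out = t5.encoder(**enc_in)              # hidden states for every input token, in one pass
print(enc_out.last_hidden_state.shape)      # (1, num_input_tokens, hidden_size)

dec_ids = torch.tensor([[t5.config.decoder_start_token_id]])
for _ in range(8):                          # the decoder adds one new state per step
    step = t5(encoder_outputs=enc_out, decoder_input_ids=dec_ids)
    next_id = step.logits[:, -1].argmax(dim=-1, keepdim=True)
    dec_ids = torch.cat([dec_ids, next_id], dim=-1)
print(tok.decode(dec_ids[0], skip_special_tokens=True))
```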
For example, suppose you give the model the task "Please translate this to Chinese: Thanks for the cat, he's cute. I'm trying to send it to my friend in Hong Kong." An enc-dec model would read the whole thing at once and understand that you mean Cantonese. But a decoder-only model would "read" it one token at a time and could trip in several places: 1. assume Chinese means Mandarin, not Cantonese; 2. assume the text after "cute." is also something to translate and not a clarification. That's several tokens' worth of extra information that would confuse the model. Models are trained with this in mind, so they're used to tokens having lots of different meanings embedded in them and having later tokens narrow down those meanings, but it might cause models to ignore certain tokens, or hallucinate.
Hi, I'm not on the T5Gemma team but I work on Gemma in general.
Encoder-decoder comes from the original transformer implementation way back in 2017. If you look at figure 1 you'll see what the first transformer ever looked like.
Since that time, different implementations of transformers use either just the encoder portion, just the decoder portion, or both. It's a deep topic, so it's hard to summarize here, but Gemini explains it really well! Hope this gets you started on some prompting to learn more.
The announcement of the original T5Gemma goes into some more detail [1]. I'd describe it as two LLMs stacked on top of each other: the first understands the input, the second generates the output. "Encoder-decoder models often excel at summarization, translation, QA, and more due to their high inference efficiency, design flexibility, and richer encoder representation for understanding input"
You can use anything as a radiator, but you can't use everything as a radiator efficiently enough to cool hot chips to a safe operating temperature, particularly not if that thing is a thin panel intentionally oriented to capture the sun's rays and convert them to energy. Sure, you can absolutely build a radiator in the shade of the panels (it's the most logical place), but it's going to involve extra mass.
You also want to orient those radiators at 90 degrees to the power panels, so that they don't send 50% of their radiation right back to the power panels.
I think the point is, yes, cooling is a significant engineering challenge in space; but having easy access to abundant energy (solar) and not needing to navigate difficult politically charged permitting processes makes it worthwhile. It's a big set of trade offs, and to only focus on "cooling being very hard in space" is kind of missing the point of why these companies want to do this.
Compute is severely power-constrained everywhere except China, and space-based datacenters are a way to get around that.
Of course you can build these things if you really want to.
But there is no universe in which it's possible to build them economically.
Not even close. The numbers are simply ridiculous.
And that's not even accounting for the fact that getting even one of these things into orbit is an absolutely huge R&D project that will take years - by which time technology and requirements will have moved on.
Lift costs are not quite dropping like that lately. Starship is not yet production-ready (and you need to pack it fully with payloads to achieve those numbers). What we saw was cutting off most of the artificial margins of the old launches and arriving at some economic equilibrium with sane margins. Regardless of the launch price, the space-based stuff will be much more expensive than planet-based; the only question is whether it will be, optimistically, "only" 10x more expensive, or, pessimistically, 100x more expensive.
I don't get this "inevitable" conclusion. What is even the purpose of a space datacenter in the first place? What would justify paying an order of magnitude more than conventional competitors? Especially if the server in question is a dumb number cruncher like a stack of GPUs? I might understand putting some black NSA data up there, or a drug cartel's accounting backup, but to multiply some LLM numbers you really have zero need of an extraterritorial, lawless DC. There is no business incentive for that.
You must be very young. This was well known back in the day. There were lots of articles (some even posted here a while back) ranting about cars and how they were ruining everything.
Btw, the cute one-line slam doesn't really belong here. It's an empty comment: it adds zero to the conversation and contributes nothing to the reader. It only gives a twelve-year-old a brief burst of endorphins. Please refrain.
The idea that it's faster and cheaper to launch solar panels than to get local councils to approve them is insane. The fact is those data center operators simply don't want to do it and instead want politicians to tax people to build the power infrastructure for them.
Focused on all the interesting and exciting happenings in tech here, from AI to defence to deeptech, and posting the most interesting job openings too. Did you know Europe had two space launch startups? I didn't until I started this project!
Color scheme is a bit harsh for me. I understand you're going for EU colours, but maybe a softer background like #fcfcfc and a more muted blue would be easier on the eyes?
Great idea, I'm keeping my fingers crossed for this initiative.
I believe that the main challenge would be to get more traction and build a community. Hope you find a way to encourage as many people as possible to join the website.
My very minor nitpick -- I would add some kind of background colour to the main post list, something like #FAFAFA looks fine to me.
Yes, absolutely! The guidelines for now are basically "same as HN, but Euro-centric content please" :) I'll write these down somewhere explicitly soon.
Ooh, I like this! I love Hacker News and Lobsters, but they're both very US-centric; it seems great to have a European one.
UI is very nice and simple, one tiny bit of feedback is that a 'guidelines' page would be worthwhile, especially while it's new! I thought I'd post my own project on the site - sometimes that's a little bit of a no-no though, and I couldn't find any guidelines to steer me towards what types of things to share, etc.
Edit: Tiny extra feedback: upvoting something immediately changes the rankings in the browser. It's pretty impressive speed-wise, but especially if you're a couple of pages in, you can bump something off the page you're on, which makes it a little weird to do something like 'upvote article and then check the comments'.
Thanks for the feedback and posting, I appreciate it!
I'm definitely going to go through the comments later and take everything on board. Guidelines are a great idea - for now it's basically "HN guidelines but Euro-centric content please", but I should definitely write that down.
I like to browse HN via "/front" and "Go back day" and then look at the couple of top posts for each day. I don't see such a day-by-day view on TPE.
What is the "official" acronym? TPE? TP? TecPeu?
What is the language policy? (e.g. it would be nice if people could post in any language they want, the system showed other users what language the link is in, and then offered an alternative link to a translated version. I imagine this would be hard to implement in a robust way, but maybe when users submit a link they can set the language themselves.)
> Treaty shopping is a tax strategy where companies route profits through intermediary countries with favorable tax treaties to minimize overall tax liability.
You mean the automatic normalization HN does when you submit the title? Yeah, it's still quite basic compared to the real HN. I want to validate it properly before investing in lots of features :)
Great initiative.
I was confused by the comment section design. The style of the metadata is not distinct enough from the actual comment, and it took me too long to understand that the responses to comments were not citations.
Interesting idea! I was kind of playing with the idea of doing something CrunchBase-like for the companies, jobs and funding rounds. But there's a lot of data out there publicly too, so I'm not sure if it's worth it. Will have a look at the HN clients too, thanks for the idea!
Ha, thanks for the feedback! People have made a few points about the styling, it definitely needs a harder look. Maybe a silly question but which do you find worse, the blue color or the underlines?
Hi! Are you looking for a collaborator? I had a list of European companies, divided by sector, that follow GDPR rules, with 1.2k stars on GitHub. It's currently deleted because I wanted to create a website where people can also search for jobs and projects proposed by those companies. We could make a section of your project related to it - let me know, please. I really love your idea!
> Instead, I'd love for Google to understand me well enough to show me which restaurants I would disproportionately love compared to other people based on its understanding of my taste profiles.
I mean... this sounds like the perfect use case for a third party app like "My taste restaurant finder"? There are undoubtedly apps out there like this.
I don't think Google Maps (a general purpose maps app) should try to be everything for everyone. It's good enough for what it is.