After Moore's Law: how phones are becoming open-source (wired.co.uk)
92 points by jsnell on July 11, 2015 | 48 comments


I was just thinking about the end of Moore's law. People in the industry will tell you they've been overcoming obstacles for decades, but there is a fundamental change going on. Previous obstacles were manufacturing: reduced wavelength, double patterning, immersion lithography. Some were part manufacturing and part device change: SOI, strained silicon. But now we're seeing changes to the devices themselves: tri-gate transistors and now IBM's SiGe material for 7nm.

That change to SiGe is meant to increase electron mobility. The material could have been used all along, but it has been easier to stick to tried-and-true Si. The bottom line is that plain silicon is not really viable below 10nm. And a material change that improves one parameter isn't likely to be useful for more than a node or two before the same problem arises in that material.

I'd say 14nm FinFET will be a long-lived node, just like 28nm has been. Then 7nm SiGe will be a higher-end node with higher cost. Perhaps there will be another at 5nm, but that's going to be a long time.

The equipment manufacturers will need to stay in business, so as development of new nodes stops they'll start selling equipment more cheaply to the lower-cost players. This will lead to more capacity at the advanced nodes. Every microcontroller will be made at 28nm and have an FPU (I can finally abandon fixed-point math). Every SoC with a GPU will be made at 14nm. And lastly, laptops and desktops will have 7nm parts that are very expensive due to the cost of different materials, light sources, and the number of masks.
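(For readers who haven't had to do it: a minimal sketch of the kind of Q16.16 fixed-point arithmetic an FPU lets you drop. The format and values are purely illustrative, shown in Python for brevity.)

    # Q16.16 fixed point: values are integers scaled by 2^16, and every
    # multiply needs a manual rescale -- exactly the bookkeeping an FPU avoids.
    FRAC_BITS = 16
    ONE = 1 << FRAC_BITS

    def to_fixed(x):
        return int(round(x * ONE))

    def from_fixed(x):
        return x / ONE

    def fixed_mul(a, b):
        # The product of two Q16.16 values is Q32.32; shift back down to Q16.16.
        return (a * b) >> FRAC_BITS

    a, b = to_fixed(3.25), to_fixed(0.5)
    print(from_fixed(fixed_mul(a, b)))  # 1.625, no floating-point hardware needed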

That's my guess at the market over the next few years and lasting for at least 10 more years beyond that. It will be very interesting to see how all the players cope with this. Even ARM will suffer as their CPU license cost starts to become a problem in a world of cheap chip production.


I can also see the Chinese (Loongson[1]) and Russian (Elbrus[2]) processors start to approach state-of-the-art process technology as progress slows down. It's interesting how the developed world closed the 20th century with such a big lead in technology. It seems that the rest of the world is now quickly closing the gap.

[1] https://en.wikipedia.org/wiki/Loongson

[2] https://en.wikipedia.org/wiki/Elbrus_(computer)


First, we don't need to wait; there are already microcontrollers for $1 with a floating-point unit. Also, sleep current rises greatly as one moves to new nodes, so some microcontrollers will stay on older nodes. But yes, 28nm should be popular for MCUs, especially now that we have tools to package very tiny (sub-mm^2) dies with lots of pins.

Anyway, today, and much more so in the future, microcontrollers will be commoditized (heck, for most purposes today many companies offer chips with low enough power) and extremely cheap, and the determining factor will be software and the software ecosystem.

So the company that could come up with a great improvement to the productivity and quality of embedded software development, and control that tool, would be in a very good place.


> Also, sleep current rises greatly as one moves to new nodes, so some microcontrollers will stay on older nodes.

How fundamental is this? TSMC has 5 different 28nm processes, so there's obviously some room to adjust things. Can't microcontrollers get effective power gating to keep idle power low when moving to smaller processes?


TSMC can tweak it all they like, but a smaller gate leads to higher leakage currents. Plus popular older nodes are still getting tweaked.

That said, you can still attain amazing sleep currents on modern advanced nodes. The catch is just that the best way to do that is with advanced design techniques, which require more design effort.


>> amazing sleep currents on modern advanced nodes.

How low would the sleep current on a 40nm/28nm MCU be with those advanced techniques? Can you please estimate?


Well, anywhere from full power to zero, if you design a chip that can flush its state to persistent storage and go dark. x86 chips exemplify this wide range, with many power states all the way from full power to shutting off the entire package.


You can't power-gate SRAM or device state registers; that would be like a reset. And that's a large fraction of the chip.

How fundamental it is, I'm not sure.


[deleted]


The languages that fit the web need extreme simplicity (i.e. garbage collection, very high abstraction, very easy to learn) and rapid prototyping (i.e. dynamic typing).

The languages that fit microcontrollers need no garbage collection (to support real time), abstraction without loss of performance, and high reliability (it's expensive to fix bugs after shipment).

Those differing requirements tend to produce different languages. For example, Rust, which embedded developers find very interesting, doesn't seem to attract much attention from web developers.


It will be really great when microcontrollers are made on today's advanced nodes. Besides becoming more powerful, their low power consumption will be ideal for new applications like IoT and sensors everywhere.


I expect that DES, 64-bit KASUMI, or other crypto cracking ASICs will probably want to use 14nm, right?


I agree with Linus Torvalds about Moore's law:

"It's like Moore's law - yeah, it's very impressive when something can (almost) be plotted on an exponential curve for a long time. Very impressive indeed when it's over many decades. But it's _still_ just the beginning of the "S curve". Anybody who thinks any different is just deluding themselves. There are no unending exponentials."


So we'll switch to an entirely different basis - as we did from vacuum tubes, relays, gears - I like Kurzweil's historical argument on this.

It's like Peak Oil - when the straightforward methods are exhausted, the price goes up, making it viable to explore other avenues that were comparatively inefficient before. Think about an Intel killer. Think about VCs thinking about an Intel killer. Some tech may turn out to be much better than expected. To change the figure again, those neglected minor veins may lead to greater deposits, as has happened before.

There is so. much. demand for this.

e.g. biological neurons are far superior to our best silicon.


This reminds me of Peter Thiel's idea of "indefinite optimism": we've been spoiled by a recent history of frequent cutting-edge innovation, so we assume that it will continue despite having no idea of the shape it will take. The key here is recent history -- humanity has only been advancing at its current breakneck pace for a few generations. Technological stagnation is more the normal mode.

I think we should consider the possibility that growth in processing power will level out at least for a while. There are some interesting research avenues which may yet bear fruit, but there's no guarantee that some miraculous scientific breakthrough will appear just in time to salvage exponential growth. "Moore's Law" is not some fundamental law of the universe.


Yes, some people take the approach that because one guy called Malthus was wrong before, things can't turn really bad ever. Never understood that kind of reasoning.

It seems like plenty of people, even if they don't believe in Father God, still have some kind of faith in Mother Nature or Brother Progress having our backs whatever we do.


> e.g. biological neurons are far superior to our best silicon.

By what metric?

Biological neurons misfire and make errors all the time. Biological neurons don't like -40C or 100C very much. Biological neurons get rather upset when you subject them to a couple of G's of acceleration.


We'd need artificial neurons without those limitations.

For example, just a few days ago a team from UCSD announced something like that called 'memprocessors'.


Unless it's an actual product, those announcements are a dime a dozen. We're still waiting for "memristors" and "cold fusion".


It's research, but they claim to have built some.

http://advances.sciencemag.org/content/1/6/e1500031.full


Their conclusion:

"The actual machine we built clearly suffers from technological limitations, that impair its scalability due to unavoidable noise. These limitations derive from the fact that we encode the information directly into frequencies, and so ultimately into energy. This issue could, however, be overcome either using error correcting codes or with other UMMs that use other ways to encode such information and are digital at least in their input and output."

So, it remains to be seen how feasible this machine is.


>So we'll switch to an entirely different basis - as we did from vacuum tubes, relays, gears - I like Kurzweil's historical argument on this.

The historical argument boils down to: "we've managed to do this a few times in the past, THUS we'll be able to do it indefinitely".

Hardly a logical or coherent argument.


Yeah, "argument" was the wrong word. It's odd that he doesn't actually articulate the "increasing returns" argument. It's that better technology enables you to see and do better (smaller, larger, faster, purer; whatever your trajectory). Knowing more gives you awareness of more avenues; more people (population) working on it (no longer needed for the basics of food, shelter, etc.) enables exploration of those avenues. But most of all, better technology, tools, and methods give faster iteration. Because trial and error is how we explore unknowns.

Of course, all of that is about discovery - and preconditioned on there being something useful to discover. Whether there is or not is necessarily a matter of faith... since, by definition, we don't know. Historically it's turned out that way, but rephrasing you, past performance is no guarantee of future performance.

Personally, I see Chaitin's work as showing that there is infinite pattern; and with infinity, some of it is bound to be useful. It's a matter of finding it.

Of course, in the present case we already have a proof by existence that better technology is there to be discovered: you.


> Reworking the supply and demand equation, Xiaomi limits availability through online-only flash sales. Tightly controlled and lean, this model allows the manufacturer to sell phones at near cost to the fortunate few who can catch a flash sale before supplies run out.

Why does the manufacturer want to sell at near cost?


Controlling the OS many people use, selling other services, having a great consumer brand and using it to sell other stuff (it has a plan to partner with 200 startups), building an e-commerce store/empire, etc.

Also, most consumer brands sell via a retailer (and it's hard to change that). Xiaomi sells direct-to-consumer. That gives it more options with regard to pricing.


I would suggest the same reason PayPal literally gave people money, but I don't know.


RISC-V is going to win now, it seems.

Just need a similar project for a big, wide VLIW or similar part for graphics, then plop them together. When that happens, ARM and Intel will change dramatically as companies, I imagine.


GPUs have already transitioned away from VLIW to scalar architectures. At least on the desktop front. They may still exist in mobile GPUs.


Indeed, looking at GCN, it seems like you could do as well with RISC-V itself (with a fancier SIMD extension) as the shader ISA.


The silicon lattice constant (roughly the distance between atoms) is about 0.54nm, so a 5nm process would have transistors only about 10 atoms across. It's hard to imagine a traditional process smaller than that, which means that the end of Moore's law is near.
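As a quick sketch of the arithmetic behind that estimate (taking the node name at face value as a feature size, and using the 0.543nm silicon lattice constant):

    # Back-of-envelope: how many silicon lattice constants fit across a "5nm" feature?
    lattice_nm = 0.543   # silicon lattice constant
    feature_nm = 5.0     # taking the node name as a literal feature size
    print(feature_nm / lattice_nm)   # ~9.2, i.e. roughly ten atoms across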


I think quantum computing, with quantum states in atoms serving as transistor states, will be the final progression of Moore's Law. But we are definitely feeling the effects of the end right now. I think it was Michio Kaku who said that we have about 50 years left until Moore's Law is officially done, ushering in the collapse of the industry and the global economy.


>I think it was Michio Kaku who said that we have about 50 years left until Moore's Law is officially done, ushering in the collapse of the industry and the global economy.

Not even sure what he means. The efficiency of processors, especially beyond a certain point, hardly makes any difference to the "global economy".

Heck, there are even notable economists saying that the effect of the whole "www" on the economy wasn't that special compared to things like the first telecommunications, roads, and planes.

It's difficult to see when one lives and breathes in the IT industry echo chamber, but, you know, diminishing returns.

The fact that despite 50 years of Moore's law we didn't get anything like an 8,589,934,592 [1] times larger global economy, actually not even 100x larger, should be enough proof that Moore's law doesn't have a strong influence on it...

[1] 2^((50*12)/18).
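For reference, the compounding in that footnote works out as follows (a sketch assuming one doubling every 18 months for 50 years):

    # Transistor-count multiplier implied by a doubling every 18 months for 50 years.
    months = 50 * 12
    doubling_period_months = 18
    print(2 ** (months / doubling_period_months))
    # ~1.1e10; rounding the exponent down to 33 gives the 8,589,934,592 cited above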


The effect of silicon computers was huge, however (office productivity, industrial automation, embedded systems, numerical modeling of analog systems such as biotech, airfoils, etc.).


To many experts back in the day it was hard to imagine a traditional process below 100nm.


I am not an expert, but this is elementary physics knowledge.


LOL, yep, they said something similar: "You have to break the laws of physics to get under 100nm". When you reach the single-atom limit, why can't you start building vertical structures? Moore's law specifies the number of transistors per area; it says nothing about the height of that area. Specifically, you're not restricted to a single layer.


Because:

1. Going from 2D to 3D is a one-time improvement, not a continuous law that runs for 30 years: at some future point, say, a 5nm process becomes 3D, and the number of transistors jumps from N^2 to N^3. That's a one-time change.

2. Making single-atom conventional logic gates is physically impossible: they will have quantum effects and thus will behave like a quantum computer. To make a classical (non-quantum) logic gate, you need at least 8-10 atoms (and that's pushing it).


No, going from 1xN^2 to 2xN^2 to 3xN^2 and each further stacked layer from there on is still going to be a very major challenge (getting rid of heat being a major physical challenge for a start). We're already at 32xN^2 for flash packages, but with stacked dies. For flash, heat is much easier to manage, as only a small part of the die is active at any one time. Stacking memory on top of a CPU core is also common practice in mobile SoCs (though still mostly package-on-package). Going to N^3 from today's MxN^2 is squarely (cubedly? ;)) in the realm of science fiction; there's a lot of room for improvement in between.
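To put that one-time factor in Moore's-law terms: stacking M layers buys only about log2(M) doublings' worth of density. A rough sketch, assuming the usual 18-24 month doubling period:

    import math

    # Stacking M layers is a one-time factor of M, i.e. log2(M) doublings' worth.
    for layers in (2, 8, 32):
        doublings = math.log2(layers)
        years = (doublings * 1.5, doublings * 2.0)   # at 18-24 months per doubling
        print(layers, "layers ~", doublings, "doublings ~", years, "years of scaling")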


Oddly timed article given the announcement by IBM of their successful 7 nm test chips.


Successfully making a 7nm chip doesn't mean it will ever be cost-competitive with a 28nm or 16nm chip. Per-transistor cost isn't going down anymore, so these new processes are only going to attract customers that are willing to pay more for lower power or higher performance. It's no longer the case that the whole product line will be brought forward when the fab capacity becomes available.

It was only this spring that AMD finally removed 40nm GPUs from the low end of their product line. They'll keep selling 28nm GPUs until the fab equipment breaks.


But if you add in the energy costs, it makes financial sense to move to newer nodes. Maybe we need some financial innovation that charges for part of the chip in small installments you won't feel, because of the energy savings, while also supporting Moore's law?


Then you get the option. Today you have a much more pronounced choice between cheaper, more power-hungry AMD parts and expensive, power-efficient Intel ones.

In 2020 we will very likely have the choice between expensive high-end 7nm CPUs and 14-16nm midrange parts, and budget phones/tablets/low-end CPUs will be shipping at 20nm. And we will still be using today's 24/28/32/40/45nm plants for various other circuits.


This is so wrong. They are acting as if processors can only get better by increasing clock speeds. If that were the case, today's 3GHz quad-core CPUs should match the performance of a 3GHz Pentium 4 times the number of cores, which is definitely not the case.


You're making an extremely uncharitable reading, especially given the author. The article specifically acknowledges that there are other ways of increasing performance, which is exactly what CPU manufacturers have been concentrating on for the last decade. And wherever they're getting that extra performance from (e.g. more cores, or extra IPC through larger caches, better branch prediction, extending the ISA, etc.), you're going to need more transistors.

The real point of the article is what happens to the hardware ecosystem once the density scaling stops. Not the exact mechanism by which node advances have been translating to improved performance.


Maybe I don't really get the point of this article, or the exact changes in the hardware ecosystem once density scaling stops. Because although scaling has become harder recently, I don't think we have reached the point where the yearly increase in processing performance is slowing down.

I have a problem with this paragraph from the article: "Even if it takes a couple of years and several iterations for a self-taught engineer to reproduce, the resulting product isn't terribly out of date: the iPhone has only undergone a modest increase in clock rate over the past four years, from 1GHz in the iPhone 4 to 1.4GHz in today's iPhone 6. This relative stagnation is endemic, leaving a large window of opportunity for engineers to learn from and emulate the designs of the best."

The iPhone 6 was released with iOS 8, which doesn't support the iPhone 4 anymore. Also, the CPU performance of the A8 in the iPhone 6 is about 12x that of the iPhone 4. If you still want to use current apps, an iPhone 4 is really outdated.

On the other hand, for the author's Novena laptop the same point might not hold, but that has more to do with software than hardware, because Linux in particular works very well on not-so-recent hardware. In my experience, Android and iOS software has become far more processing-power-hungry over the last four years than desktop software has.


Depending on what you mean by processing performance, the increases did slow down significantly around 2004-2005, at least for non-numerical single-threaded workloads. Had they continued at the same rate, CPU performance would be anywhere between 5x and 10x higher today.

This is a commonly used motivation for a lot of modern computer architecture research; here's one such plot: http://liberty.princeton.edu/Projects/AutoPar/
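As a rough illustration of how that gap compounds (a sketch only, assuming the often-cited ~52%/year single-thread improvement before the mid-2000s versus roughly 20%/year after; the rates are assumptions, not measurements):

    # How much faster single-thread performance might be in 2015 had the
    # pre-2005 trend continued, given the assumed growth rates above.
    years = 2015 - 2005
    old_rate, new_rate = 1.52, 1.20
    print((old_rate / new_rate) ** years)   # ~10.6x, around the top of the 5-10x range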


GPUs are a better measure of performance improvement, because they more easily add cores, and so aren't as affected by the clock-rate plateau - at least for computation that maps well onto GPUs.


I'm pretty sure that a single core of a modern CPU is much faster than a Pentium 4 at the same clock rate. As a result, I believe that properly parallelized code running on a modern quad-core CPU would be much faster than the same code running on four P4s.


This is what I meant. Looking over some benchmarks, I saw that a single Haswell thread, for example, seems to be 3-4 times faster than a Pentium 4.
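As a crude back-of-envelope under those numbers (assuming ideal parallel scaling, which real workloads rarely achieve):

    # Rough throughput estimate from the 3-4x per-thread figure above.
    cores = 4
    per_thread = (3, 4)   # one modern (e.g. Haswell) thread vs. a Pentium 4
    print(per_thread)                              # vs. four P4s: ~3-4x
    print(tuple(cores * s for s in per_thread))    # vs. a single P4: ~12-16x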



