FPGAs are definitely not a dead end. By virtue of being reconfigurable, they will never be obsolete as long as ASICs are a thing. Now, some whole new technology will come along eventually, supplanting present day ASICs and FPGAs... but until then...
"Program" as a term means something different in chip design than it does in software. One analogy: to program an FPGA is to paint a canvas. The source code in chip design is the set of instructions for how the canvas should be painted.
Another analogy: to program an FPGA is to cook a meal. The source code is the recipe for the meal. But one doesn't run a recipe on a meal.
These analogies break down because a painting and a meal are passive... they don't do anything by themselves, or react to the outside world.
So another analogy would be building a car. Here "programming" and "building" are the analogous terms. The instructions for the assembly line to construct the car are the source code. Once built, the car responds to stimulus (steering wheel, pedals) and does stuff. Same with the FPGA. It has inputs, it responds and does stuff. If you painted a picture of a CPU in your FPGA, it could run software.
There is tremendous overlap in designing for an FPGA and an ASIC. Most ASICs start life as an FPGA simply to prototype an idea.
The difference between an ASIC and an FPGA, at a high level from a design perspective, is the difference between writing with a pen vs a pencil. Learning to write applies equally to both.
It's probably not helpful to think about right now, but an FPGA is actually an ASIC.
The LUT-based architecture is starting to run out of steam. I think a CGRA (coarse-grained reconfigurable array) sort of architecture is the future, but programmable logic startups will likely fail, and there's approximately a zero percent chance that Xilinx or Altera would try anything that new.
The problem is that you still generally need simple logic to combine some coarse-grained blocks. Also, we have a lot of FPGAs that include adders, RAM, DSP cores, and other coarse-grained blocks.
Honestly, a LUT can be a pretty efficient structure for what it does. The biggest advantage of coarse-grained structures is that they are much faster, since their internal construction can use optimal routing.
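For anyone unsure what a LUT actually is: it is just a tiny memory indexed by its inputs, so any k-input boolean function fits in 2^k stored bits. A minimal Python sketch of that idea (the names `make_lut` and `lut_eval` are made up for illustration, and this models only the truth table, not timing or the surrounding routing):

```python
def make_lut(func, k):
    """Build a k-input LUT: a table of 2**k stored bits, one per input pattern."""
    table = []
    for index in range(2 ** k):
        # Bit i of the table index is the value of input i.
        bits = [(index >> i) & 1 for i in range(k)]
        table.append(func(*bits) & 1)
    return table


def lut_eval(table, *inputs):
    """Evaluate the LUT: pack the inputs into an index and read the stored bit."""
    index = 0
    for i, bit in enumerate(inputs):
        index |= (bit & 1) << i
    return table[index]


# The same structure implements any function just by changing the stored bits.
xor4 = make_lut(lambda a, b, c, d: a ^ b ^ c ^ d, 4)
maj3 = make_lut(lambda a, b, c: (a & b) | (a & c) | (b & c), 3)

print(lut_eval(xor4, 1, 0, 1, 1))
print(lut_eval(maj3, 1, 1, 0))
```

"Programming" the LUT is nothing more than filling in `table`; evaluation is a single indexed read regardless of which function was stored.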
The biggest issue with FPGAs is the programmable routing/connections. Ideally the LUTs would form a complete graph, with every LUT wired directly to every other one. However, the number of wires grows at approximately n(n-1)/2, where n is the number of LUTs. So instead the structure is more hierarchical. Even so, the majority of silicon on an FPGA is used just for routing.
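To put rough numbers on why full connectivity is hopeless: a complete graph on n nodes has n(n-1)/2 edges. A quick sketch, assuming one undirected point-to-point link per pair of LUTs (a simplification, since real interconnect is directional and segmented; `complete_graph_links` is a hypothetical helper name):

```python
def complete_graph_links(n):
    """Number of undirected point-to-point links needed to connect n nodes pairwise."""
    return n * (n - 1) // 2


for n in (10, 100, 1_000, 100_000):
    links = complete_graph_links(n)
    print(f"{n:>7} LUTs -> {links:>13,} links")
```

At 100,000 LUTs that is already about 5 billion links, which is why real devices fall back on hierarchical, shared routing.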
However, I think an array of ALUs actually could be quite useful for some applications over an FPGA.
I think GPUs, FPGAs and scalar cores will all mix into a single fabric. As you mentioned, FPGAs are getting dedicated hard blocks, GPUs are getting scalar cores and CPUs are getting LUTs.
> However, I think an array of ALUs actually could be quite useful for some applications over an FPGA.
Well, this depends on the underlying routing architecture of either system. However, you are right in general, since finer-grained logic means more things that need routing.
Nothing stops you from treating LUT outputs in groups, like a coarse-grained system, though. FPGA manufacturers could make a chip with a different routing topology that works really well for certain applications.
However, we could be making lots of devices that fit certain data flow patterns better. Doing so makes the devices simpler and faster.
Routing is pretty important. It's just that current FPGAs are built with quite flexible interconnect.
If you want to see really limited programmable interconnect, look at some old PLDs that you program by blowing fuses.
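As a rough illustration of how limited that is, here is a simplified Python model of one PAL-style output: a programmable AND plane feeding a fixed OR. Everything here is an assumption for illustration (the `pal_output` name, the fuse-map encoding); real parts have fixed term counts, optional registers and output polarity that this ignores.

```python
def pal_output(inputs, fuse_map):
    """Evaluate one PAL-style output as an OR of AND (product) terms.

    fuse_map is a list of product terms. Each term lists the
    (input_index, inverted) connections whose fuses are still intact;
    every other connection is treated as blown (disconnected).
    """
    def term_value(term):
        # A product term is true only if every connected literal is 1.
        return all((inputs[i] ^ 1) if inverted else inputs[i]
                   for i, inverted in term)

    return int(any(term_value(term) for term in fuse_map))


# XOR of two inputs: (a AND NOT b) OR (NOT a AND b)
xor_fuses = [
    [(0, False), (1, True)],
    [(0, True), (1, False)],
]

for a in (0, 1):
    for b in (0, 1):
        print(a, b, pal_output([a, b], xor_fuses))
```

The only "routing" choice is which inputs reach which AND term, which is a long way from an FPGA's general-purpose interconnect.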