
If you're at all interested in getting into hardware, buy yourself a cheap FPGA and put a RISC-V CPU on it. I'm doing it with my kids and it's been tons of fun. :)


I recommend the DE0-Nano: https://www.adafruit.com/product/451

Altera's tools are much nicer for a beginner to use than the ones offered by Xilinx. This board also has a lot of peripherals you can interface with (ADC, accelerometer, etc.) to grow your skills.


Have you tried Vivado? Xilinx ISE is hands down terrible, but supposedly Vivado was "500 man-years of engineering effort" [1]. Unfortunately, vendor lock-in means I can't try Altera's tools for my boards, but Vivado's high level synthesis is really cool. It generates a hardware design from a given C program, and then lets you tune the generated Verilog [2].

[1]: http://www.eejournal.com/archives/articles/20120501-bigdeal

[2]: http://xillybus.com/tutorials/vivado-hls-c-fpga-howto-1


Vivado is miles ahead of Quartus, especially for beginners. Altera's tools have a very steep learning curve for newcomers to hardware.


I think that is mainly a matter of opinion. Unfortunately, you can't use Vivado with Xilinx's older, cheaper chips. I think beginners are more likely to use these chips than their more advanced, expensive chips. We'll have to see if the Spartan-7 is supported by Vivado or not.


The newish Arty board has a 7-series part and is only $99.


I've been wanting to try it, but for some reason Xilinx refuses to offer Vivado support for the Spartan-6. They only support their more expensive chips.


I haven't checked out the high-level synthesis part, but I've been using Vivado for a project recently and it's absolutely horrible. To name some of the issues I've had:

- Memory leaks (grows from about 1GB to 11GB and starts swapping in a couple of minutes when editing existing IP)

- Single-threaded synthesis is slow (though that isn't limited to Vivado specifically)

- Failing after 20 minutes of synthesis because of errors that would be easy to check at the start

- Placing debug cores can result in a cycle of: synthesise, find out something went wrong, delete the debug core, re-synthesise, re-add the debug nets, synthesise again...

- Aggressive caching results in it trying to find nets which were changed and no longer exist, despite re-creating the IP in question from scratch

- Vivado creates a ridiculous amount of temporary files and is a royal pain to use with version control (there is an entire document which details the methods to use if you want to create a project to be stored under version control)

I've been playing around with IceStorm for the iCE40 device and it's an absolute joy to use: fast, stable and simple. I appreciate that there are a lot of complex tools and reports which Vivado provides, but I would much rather use an open source tool like IceStorm for synthesis alongside the advanced tools from Vivado.


What would be a use case for using one of these FPGAs rather than something like a Raspberry Pi or a traditional microcontroller?

I'm genuinely interested. I've been really curious about FPGAs, but I don't know what a good use case for them is for a hobbyist.


The use case for these boards is to learn how to use FPGAs.

Why would you use an FPGA? Mainly when you have specialized requirements that can't be met by processors. FPGAs mainly excel in parallelization. A designer can instantiate many copies of a circuit element like a processor or some dedicated hardware function and achieve higher throughput than with a normal processor. If your application doesn't require that, you might like them for the massive number of flexible I/O pins they offer.

Lastly, using FPGAs as a hobby is rewarding just like any other hobby. Contrary to popular belief, you don't "program" an FPGA with a programming language the way you program a processor. You describe the circuit functionality in a hardware description language and a synthesizer figures out how to map it to digital logic components. You get a better insight into how digital hardware works and how digital chips are designed. When you use them as a hobby, you get the feeling that you are designing a custom chip for whatever project you are working on. Indeed, FPGAs are routinely used to prototype digital chips.
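
For a taste of what that looks like, here's a minimal, untested Verilog sketch (the pin names and the 50 MHz clock are just assumptions for a board like the DE0-Nano): a free-running counter whose top bit blinks an LED. You describe the registers and wiring; the synthesizer maps it to LUTs and flip-flops.

    // Untested sketch: blink an LED from a 50 MHz clock.
    module blinky (
        input  wire clk,   // e.g. the DE0-Nano's 50 MHz oscillator
        output wire led
    );
        reg [24:0] count = 0;

        always @(posedge clk)
            count <= count + 1'b1;

        // Bit 24 toggles roughly every 0.34 s at 50 MHz (~1.5 Hz blink).
        assign led = count[24];
    endmodule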


Sounds like something that might be just as enjoyable on a simulator.


No, not really. FPGAs are by definition massively parallel. There is no way you can simulate them on a CPU at any reasonable speed (think 1 ms of CPU time to simulate one clock cycle, so your simulation tops out at around 1 kHz). That sucks all the enjoyment out of it.


Ha! I often find myself wishing I had an FPGA. It's very common that I have to control external devices that would be straightforward with logic but require all sorts of hacks and tricks using a microcontroller.

Here's just one simple example: controlling servos. Sure, you can do that with most microcontrollers simply enough, say using timer interrupts, but what if I need to control 100 of them? In logic, I can just instantiate 100 very trivial pulse controllers, whereas this typically is impossible with a microcontroller, or at the very least leaves no cycles free for any computation.
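
To make that concrete, here's a rough, untested sketch of what one of those trivial pulse controllers could look like in Verilog, replicated with a generate loop. The 1 MHz tick, the module name, and the port names are all made up for illustration; real servo timing is roughly a 1-2 ms pulse every 20 ms.

    // Rough sketch: N servo pulse generators sharing one 20 ms frame counter.
    // Assumes a 1 MHz tick, so pulse widths are in microseconds (1000..2000).
    module servo_bank #(parameter N = 100) (
        input  wire            clk_1mhz,
        input  wire [N*11-1:0] width_us,  // desired pulse width per channel
        output wire [N-1:0]    pulse
    );
        reg [14:0] frame = 0;             // counts 0..19999 microseconds

        always @(posedge clk_1mhz)
            frame <= (frame == 15'd19999) ? 15'd0 : frame + 1'b1;

        genvar i;
        generate
            for (i = 0; i < N; i = i + 1) begin : ch
                // Pulse is high while the frame counter is below the width.
                assign pulse[i] = (frame < {4'b0, width_us[i*11 +: 11]});
            end
        endgenerate
    endmodule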

Another example: you want to implement a nice little crypto, like ChaCha20. Even though ChaCha20 is efficient, it's still a lot of cycles for a microprocessor, whereas an FPGA can implement it as a nice pipeline, potentially reaching speeds like 800 MB/s, while still having ample resources left for other work.
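
As a flavour of how that maps to hardware, here's an untested sketch of the ChaCha quarter-round as combinational Verilog. A real core would register between rounds to build the pipeline; this is illustration, not a production implementation.

    // Untested sketch: one ChaCha quarter-round, purely combinational.
    // The adds, XORs and fixed rotates map directly onto FPGA logic.
    module chacha_qr (
        input  wire [31:0] a_in, b_in, c_in, d_in,
        output wire [31:0] a_out, b_out, c_out, d_out
    );
        function [31:0] rotl;
            input [31:0] x;
            input integer n;
            rotl = (x << n) | (x >> (32 - n));
        endfunction

        wire [31:0] a1 = a_in + b_in;
        wire [31:0] d1 = rotl(d_in ^ a1, 16);
        wire [31:0] c1 = c_in + d1;
        wire [31:0] b1 = rotl(b_in ^ c1, 12);
        wire [31:0] a2 = a1 + b1;
        wire [31:0] d2 = rotl(d1 ^ a2, 8);
        wire [31:0] c2 = c1 + d2;
        wire [31:0] b2 = rotl(b1 ^ c2, 7);

        assign {a_out, b_out, c_out, d_out} = {a2, b2, c2, d2};
    endmodule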

I could go on.


Great comment and examples. CPUs are optimized to try to do everything reasonably fast, a step at a time, within specific constraints, often legacy ones. FPGAs let us build just the right hardware for our computations using components that run in parallel with fewer constraints. The result is often some amazing performance/circuitry ratios.


Seeing the board running your program is a rare, beautiful moment: YOU, one hobbyist, designed the whole stack :).


I don't know if this is a stupid question, but could one design a very basic LISP machine on an FPGA? How about a diminutive JVM?


Why not? It was done on real hardware in the 80s, right? Hell, here's one:

http://www.aviduratas.de/lisp/lispmfpga/


It's very easy. Personally, I find Reduceron [0,1] far more interesting.

[0] https://www.cs.york.ac.uk/fp/reduceron/

[1] https://github.com/reduceron/Reduceron


I made a little FPGA LISP machine:

https://github.com/jbush001/LispMicrocontroller


A few ARM CPUs don't implement a JVM, but they can accelerate one by directly executing Java bytecode (Jazelle DBX).

"The Jazelle extension uses low-level binary translation, implemented as an extra stage between the fetch and decode stages in the processor instruction pipeline. Recognised bytecodes are converted into a string of one or more native ARM instructions."


Cool, I hadn't thought about that. I probably need to get an FPGA. I really liked the book "The Elements of Computing Systems" [1][2], in which one builds a computer from NAND gates upwards, then a compiler, a VM, and finally a simple OS with applications. The hardware part of the course seems to be on Coursera now as well. [3]

[1] https://mitpress.mit.edu/books/elements-computing-systems

[2] http://www.nand2tetris.org/

[3] https://www.coursera.org/learn/build-a-computer


That's some really neat stuff that I somehow missed in prior research. Thanks for the links. I'm particularly going to have to take another look at the paper that details their methodology for building systems from the ground up, especially the abstraction process and samples.


Try interfacing one of them with DRAM, at a moderate speed.

You'll learn about pipelines, caches, and why cache misses are so painful, and a whole host of CPU performance stuff that "looks" esoteric will become plain as day.


FPGAs are for pretending you have the money to fab every hardware design iteration. Small CPUs are...not?

Honestly the FPGA in data center stuff is probably mostly hype for most people, but toying with an FPGA is super fun.


Well... if you're going for something stupidly small/underpowered (ATtiny level of power consumption) but run out of cycles to handle multiple I/Os at the same time, an FPGA allows you to cheat a little bit by doing things in parallel. For example, with 10 inputs on a standard CPU you have to spend some cycles checking each one separately. With an FPGA you can have a block for each and just get a signal propagated when something "interesting" actually happens. Then again, you could just invest in bigger batteries and a better CPU instead :)
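
As a hedged illustration (the names are made up, not from any real project), the "block per input" idea is just a per-pin change detector, something like:

    // Sketch: raise a one-cycle flag whenever any of N pins changes.
    // A real design would also add two-flop synchronizers for async inputs.
    module change_detect #(parameter N = 10) (
        input  wire         clk,
        input  wire [N-1:0] pins,
        output wire         something_interesting
    );
        reg [N-1:0] pins_q = 0;

        always @(posedge clk)
            pins_q <= pins;   // previous sample of every pin, in parallel

        assign something_interesting = |(pins ^ pins_q);
    endmodule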


And here's an FPGA optimized RISC-V RV32IM core for that device (Altera Cyclone IV): https://github.com/VectorBlox/orca

I haven't tried this particular one.


I'm the principal author; I can answer any questions.


This seems like an ultra-basic question (sorry). On the VectorBlox/orca github page, it mentions that the core takes ~2,000 LUT4s. Are those numbers apples-to-apples with the 22,320 LEs given for the Cyclone IV board mentioned earlier[1]?

If so, then (naively) could one pack ~10 on that single FPGA? Or does the 'packing overhead' become a big problem? Or does the design use more (say) multiply units pro-rata, so that they become the limiting factor?

[1] https://www.adafruit.com/product/451


Yes, that is more or less true; the problem you would run into is communication between cores. When you have lots of cores, you run into problems if they all want to talk to the same memories. If they could all run more or less independently, there are no real issues. The reason I list LUT4s is that newer chips have ALMs, which is definitely apples to oranges. Also, there are Cyclone IV chips with much more than 22K LUTs.


I haven't simulated this to check, but it may be the case that IO pins become a limiting factor. The Cyclone IV has 153 IO pins, so each copy would have to use fewer than ~15 to fit 10 copies on.


Well a CPU doesn't inherently use any I/O pins so that shouldn't be a problem.

You can easily add some logic to let CPUs share pins, too.
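
For example (just a sketch, the names are invented), pin sharing can be as simple as a mux on the output and output-enable:

    // Sketch: let two soft CPUs take turns driving one physical pin.
    module pin_share (
        input  wire grant_cpu1,          // 0: cpu0 owns the pin, 1: cpu1
        input  wire cpu0_out, cpu0_oe,
        input  wire cpu1_out, cpu1_oe,
        inout  wire pin,
        output wire pin_in               // both CPUs can always read it
    );
        wire out = grant_cpu1 ? cpu1_out : cpu0_out;
        wire oe  = grant_cpu1 ? cpu1_oe  : cpu0_oe;

        assign pin    = oe ? out : 1'bz;
        assign pin_in = pin;
    endmodule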


Good point sir


There is an amazing Amiga accelerator using an Altera FPGA: http://www.kipper2k.com/accel600.html


The DE0-CV is a bit more expensive, but I think it's probably better than the Nano for the FPGA 101 type experiments. It's got more in the way of switches and LEDs, and the buttons are a lot easier to get to. I have both, and I had a lot more fun with low level learning activities with the CV than the Nano. Moving past that, I also enjoyed messing with the VGA and SD card peripherals more than the ADC and accelerometer, and also found those easier to add on after the fact.


I added a hand-made VGA "shield" to the Nano and used 3 digital I/O pins to get 3 bit color. I don't have a picture of the add-on board handy, but you can see a breakout game I made here: http://jrward.org/breakout.html


I also made a VGA shield for my Nano! I got a couple pictures at https://mobile.twitter.com/dyselon/status/648020130471899136

That project is actually kind of what convinced me to go ahead and buy the CV. It was fun, but it felt like a lot of work to make things I could already have on the board.


Can you tell us more about what you're doing and how? This seems like the kind of story HN would eat up, if you might like to write about it.


Which CPU did you try?


You may want to try PULPino: http://www.pulp-platform.org/


Any FPGAs available yet with an open(ish) toolchain? Including the bitstream generator/programmer?


Yes, there's a fully open stack for the Lattice iCE40s (the largest of which has 7680 LUTs):

http://www.clifford.at/yosys/ (Verilog synthesis)

https://github.com/cseed/arachne-pnr (Place and route)

http://www.clifford.at/icestorm/ (Bitstream generator and programmer)


I think some people had a full open source toolchain (or were close to it) for some of the Lattice FPGAs, but I think only for the really feeble ones that have tens of LUTs.



