Sure, you can do floating point in software. My point is that dedicated hardware is very likely more power-efficient at it, since a single hardware instruction replaces the many integer instructions a software routine needs. The same goes for memory: an access to off-chip memory costs much more energy than an access to on-chip memory. It would be very interesting to see energy and performance numbers for a real-world application running on both chips.