That's part of the point, in a way. The goal isn't to show people how much they don't know (though maybe it comes across that way), but to offer a hopefully fairly shocking number that gets them to rethink some of their basic assumptions about development. Using even a moderately different toolset can yield dramatically different (and I'd argue usually better) results.
Is there any rigorous study more recent than Lutz Prechelt's "An Empirical Comparison of Seven Programming Languages"? It estimates a factor-of-two difference in lines of code between Java/C/C++ and Python/Perl/Tcl, albeit for a rather small project.
So I've been using "factor of two" as a rough benchmark. Of course, this assumes appropriate library support; if most of the C code implements a priority queue while C++ has that as part of the standard library, then the comparison is invalid.
For this Grails comparison, I don't know whether the extra factor of two comes from Groovy or from the effectiveness of Grails over whatever the Java programmers would use for web app development.