
"So this optimization really does proceed because strlen is a builtin function, and does not depend on strlen being marked as pure or const."

No. It's exactly the other way around. We could not give a crap whether it is a built-in function, only whether it is pure or const. We do not special-case built-in functions anywhere near this optimization.

  ~/sources/gcc/gcc (git)-[master]- :) $ grep BUILT_IN tree-ssa-sccvn.c
      && DECL_BUILT_IN (TREE_OPERAND (op->op0, 0))
      && gimple_call_builtin_p (def_stmt, BUILT_IN_MEMSET)
	   && (gimple_call_builtin_p (def_stmt, BUILT_IN_MEMCPY)
	       || gimple_call_builtin_p (def_stmt, BUILT_IN_MEMPCPY)
	       || gimple_call_builtin_p (def_stmt, BUILT_IN_MEMMOVE))

  ~/sources/gcc/gcc (git)-[master]- :) $ grep BUILT_IN tree-ssa-pre.c
  ~/sources/gcc/gcc (git)-[master]- :( $

The special casing you see in the first grep output is trying to constant-fold a few built-in calls in a utility function, and trying to see through memcpy calls when tracking memory state.

The optimization proceeds because strlen gets marked as pure by the compiler if there is no non-pure definition that overrides it.

Basically, the compiler defines a function named "strlen" that is pure and nothrow behind your back, but you can override it by providing your own definition. This is unrelated to whether it is a builtin (because the builtin version is __builtin_strlen).
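
To make the pure-function point concrete, here is a minimal sketch. The names my_len, count_spaces, count_spaces_pure and count_spaces_hoisted are made up for illustration and are not GCC or libc code. The attribute is the GCC/Clang `pure` function attribute. Whether a particular compiler version actually hoists the call depends on flags and on what it can prove about the loop, so treat the last function as a description of the intended effect, not a guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the attribute promises the compiler that the
   result depends only on the argument and on memory it reads, and
   that the call has no side effects. */
__attribute__((pure))
size_t my_len(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* As written by the programmer: strlen sits in the loop condition.
   The loop body never writes to memory, so nothing can change the
   result of strlen(s) between iterations. */
size_t count_spaces(const char *s)
{
    size_t count = 0;
    for (size_t i = 0; i < strlen(s); i++)
        if (s[i] == ' ')
            count++;
    return count;
}

/* Same loop, but calling the explicitly pure helper instead. */
size_t count_spaces_pure(const char *s)
{
    size_t count = 0;
    for (size_t i = 0; i < my_len(s); i++)
        if (s[i] == ' ')
            count++;
    return count;
}

/* What hoisting amounts to: the length is computed once. */
size_t count_spaces_hoisted(const char *s)
{
    assert(s != NULL);
    const size_t len = my_len(s);
    size_t count = 0;
    for (size_t i = 0; i < len; i++)
        if (s[i] == ' ')
            count++;
    return count;
}
```

All three loops compute the same thing. The only question is whether the compiler is allowed to turn the first two into the third, and that is decided by the pure marking plus what it knows about memory in the loop.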




"strlen gets marked as pure by the compiler if there is no non-pure definition that overrides it"

That's what I meant when I wrote that "the compiler recognizes strlen, and optimizes it specially." It was not my intent to say that only builtins or all builtins may be hoisted in this way, although now I see how someone could interpret that. That was bad wording on my part.

The point I was trying to communicate is that gcc can do special optimizations on functions that it recognizes as builtins - for example, replacing printf() with fputs(). One illustration is how strlen is treated as pure, even if it is not marked as pure. As someone with far more gcc expertise than me, would you agree with that point?

"Basically, the compiler defines a function named "strlen" that is pure and nothrow behind your back, but you can override it by providing your own definition. This is unrelated to whether it is a builtin"

Well, the optimization is defeated by -fno-builtin, so I assumed that the underlying mechanism is that a call to strlen is replaced by a call to the builtin. Was this wrong?


It is wrong, but only because of the weirdness of how this works. The call to strlen is not replaced with __builtin_strlen. Instead, the compiler defines both a __builtin_strlen and a strlen. Both are marked pure and nothrow.

Due to some wonderful oddness around freestanding environments, if you use -fno-builtin, it will define neither.


" ... if it can prove the global memory state does not otherwise change between calls in a way that impacts that pure call"

Would it still work with -fno-strict-aliasing? I guess it should, for functions that only work on the stack and get their parameters from the stack, right?


Yes. -fno-strict-aliasing only disables type-based analysis, not points-to or other memory disambiguation.
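
A rough illustration of the difference, as a sketch only. Both function names are hypothetical, and nothing here claims what any specific GCC version does with them. In the first function the only stores go to a local array whose address never escapes, so no type information is needed to see that they cannot touch the string. In the second, the stores go through a pointer that might point into the string itself, so the length has to be assumed to change.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stores only hit `seen`, a stack array whose address never leaves
   this function. Points-to analysis alone can tell that those stores
   cannot modify the bytes strlen reads through `s`. */
size_t count_distinct(const char *s)
{
    unsigned char seen[256] = {0};
    size_t distinct = 0;
    for (size_t i = 0; i < strlen(s); i++) {
        unsigned char c = (unsigned char)s[i];
        if (!seen[c]) {
            seen[c] = 1;
            distinct++;
        }
    }
    return distinct;
}

/* Stores go through `dst`, which may alias `s` for all the compiler
   knows. Each store could change what strlen(s) returns, so the call
   cannot simply be treated as loop-invariant. */
void fill_x(char *dst, const char *s)
{
    assert(dst != NULL && s != NULL);
    for (size_t i = 0; i < strlen(s); i++)
        dst[i] = 'x';
}
```

Note that type-based analysis would not help in the second case anyway, since the stores are through a character type, which is allowed to alias anything.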



