strace is fantastic. Since it captures the detail of (nearly) all interactions between the process and the outside world, you can use it to answer many questions.
Why isn't this process invocation picking up my changed lib file? (strace, see if the changed file is being opened)
What are the exact HTTP requests/responses being made during the problem? (strace the server or proxy with a large -s value to see the full read/write/sendmsg/recvmsg buffers)
This tool fails when run as user X, probably a perms problem but which file? (strace, look for EACCES or EPERM failures, probably from open())
Which /proc files are necessary to the operation of tool X (useful when checking what will and won't run in a sandbox like dotcloud)?
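For the permissions question, a saved trace can be filtered mechanically. Here is a minimal Python sketch, assuming the trace is in strace's default text format; the function name `failed_opens` and the sample lines are made up for illustration, and paths containing escaped quotes are not handled.

```python
import re

# Matches lines such as:
#   openat(AT_FDCWD, "/etc/shadow", O_RDONLY) = -1 EACCES (Permission denied)
# optionally prefixed with a pid, as strace prints when run with -f.
FAILED_OPEN = re.compile(
    r'^(?:\[pid\s+\d+\]\s+|\d+\s+)?'
    r'(?:open|openat)\((?:[^,"]+, )?"(?P<path>[^"]*)".*'
    r'= -1 (?P<errno>E[A-Z0-9]+)'
)


def failed_opens(lines, errnos=("EACCES", "EPERM")):
    """Yield (path, errno) for open()/openat() calls that failed with one of errnos."""
    for line in lines:
        m = FAILED_OPEN.match(line)
        if m and m.group("errno") in errnos:
            yield m.group("path"), m.group("errno")


# Hand-written sample lines (hypothetical paths), not output from a real run.
sample = [
    'openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3',
    'openat(AT_FDCWD, "/home/x/.toolrc", O_RDONLY) = -1 ENOENT (No such file or directory)',
    'openat(AT_FDCWD, "/var/lib/tool/state.db", O_RDWR) = -1 EACCES (Permission denied)',
]

for path, err in failed_opens(sample):
    print(path, err)
```

Passing `errnos=("ENOENT",)` instead turns the same filter into a rough answer to the "is my changed lib file being opened" question, since it lists the paths the process looked for and did not find.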
Main restrictions that I know of (in practice, only the first is occasionally a problem for me):
- HTTP over SSL hides the buffer contents from 'strace -s', since the reads and writes only carry encrypted bytes. Another good reason for SSL offloading :-)
- IO can occur via memory reads/writes after mmap(), which strace can't see
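To illustrate the mmap() point, here is a small Python sketch (the helper name `write_via_mmap` is made up). Run under strace, I would expect the trace to show the mmap and msync calls but no write() carrying the new bytes, because the change is a plain memory store.

```python
import mmap
import os
import tempfile


def write_via_mmap(path, offset, data):
    """Overwrite bytes in an existing, non-empty file through a memory mapping."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as mm:
            # Plain memory store: no write()/pwrite() syscall carries these bytes.
            mm[offset:offset + len(data)] = data
            # flush() asks the kernel to sync the mapping (msync on Unix);
            # the call is visible to strace, the data is not.
            mm.flush()


# Set up a scratch file. This part uses an ordinary write(), which strace does see.
fd, path = tempfile.mkstemp()
os.write(fd, b"hello world\n")
os.close(fd)

write_via_mmap(path, 0, b"HELLO")

with open(path, "rb") as f:
    result = f.read()
os.unlink(path)
print(result)
```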
ltrace is a pretty nice complement too (trace inside dynamically loaded libs).
Actually, this tool is not only for sysadmins; it is very helpful for developers too. I know there are other tools like gdb, but if your program has to call other programs, or if a library you use does something strange, then strace is a shortcut.
From this article I learned that strace can be called to analyse an already running process (-p).
I've always been trying to find a program that will parse the strace output and create a call graph showing which programs call which programs and how long they run for.
Has anyone seen anything like that? I think I've searched pretty extensively.
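Not an existing tool, but a rough Python sketch of the idea, assuming the trace was captured with `strace -f -ttt -o trace.txt <command>` so that every line starts with a pid and an epoch timestamp. The names `build_tree` and `format_tree` are hypothetical and the sample lines are hand-written. It treats every successful clone/fork/vfork as a child (so threads show up too), ignores execve calls that strace split into unfinished/resumed lines, and approximates run time as last timestamp minus first timestamp per pid.

```python
import re
from collections import defaultdict

LINE = re.compile(r'^(?P<pid>\d+)\s+(?P<ts>\d+\.\d+)\s+(?P<rest>.*)$')
# A successful clone/fork/vfork returns the child's pid; also accept the
# "<... clone resumed>" form strace prints when the call was interrupted.
SPAWN = re.compile(r'^(?:<\.\.\. )?(?:clone3?|fork|vfork)\b.*= (?P<child>\d+)$')
EXEC = re.compile(r'^execve\("(?P<prog>[^"]*)".*= 0$')


def build_tree(lines):
    """Return (children, info): pid -> child pids, pid -> program and timestamps."""
    children = defaultdict(list)
    info = {}
    for line in lines:
        m = LINE.match(line)
        if not m:
            continue
        pid, ts, rest = int(m.group("pid")), float(m.group("ts")), m.group("rest")
        rec = info.setdefault(pid, {"prog": None, "first": ts, "last": ts})
        rec["last"] = ts
        spawn = SPAWN.match(rest)
        if spawn:
            children[pid].append(int(spawn.group("child")))
        execed = EXEC.match(rest)
        if execed:
            rec["prog"] = execed.group("prog")  # last successful execve wins
    return children, info


def format_tree(pid, children, info, depth=0):
    """Return indented lines describing pid and its descendants."""
    rec = info.get(pid, {"prog": None, "first": 0.0, "last": 0.0})
    name = rec["prog"] or "(no exec)"
    out = ["  " * depth + f"{name} (pid {pid}, {rec['last'] - rec['first']:.3f}s)"]
    for child in children.get(pid, []):
        out.extend(format_tree(child, children, info, depth + 1))
    return out


# Hand-written sample in the -f -ttt -o layout (hypothetical pids and times).
sample = [
    '100 1700000000.000000 execve("/usr/bin/make", ["make"], 0x7ffc /* 20 vars */) = 0',
    '100 1700000000.100000 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|SIGCHLD, child_tidptr=0x7f) = 101',
    '101 1700000000.200000 execve("/usr/bin/gcc", ["gcc", "-c", "a.c"], 0x55 /* 20 vars */) = 0',
    '101 1700000001.200000 +++ exited with 0 +++',
    '100 1700000001.500000 +++ exited with 0 +++',
]

children, info = build_tree(sample)
print("\n".join(format_tree(100, children, info)))
```

Feeding the output lines to something like Graphviz would give an actual graph; the tree printout is just the shortest way to check that the parsing works.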
DTrace is originally from Solaris. FreeBSD (and possibly other BSDs) also has ktrace, which has been around a long time and has about the same learning curve as strace, but is not as powerful as DTrace.