Has anyone heard of a program that will take strace (or dtrace) output and create a pretty diagram showing which commands call which commands and which files they read or create?
We've got a fairly complicated bioinformatics pipeline that calls about 100 other programs and creates or reads about 100 different files. I'd love a way to create a picture of what's going on: which files each program uses, which programs call which, and so on.
If such a program doesn't exist, would that be worth building? Could it be something I could potentially sell?
That sounded like an interesting script, so I wrote it in about half an hour. You're welcome to sell it if you want. One caveat: walkpid is recursive, and with warnings enabled Perl complains about deep recursion once a sub nests more than 100 levels. That is a warning rather than a hard limit, but you can unroll the recursion in walkpid into a loop if your process tree is that deep. Collect logs with 'strace -o pids.log -e trace=process -f [specify your process here]', then run 'perl printpids.pl < pids.log'.
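#!/usr/bin/perl -w
# printpids.pl: print an indented process tree from 'strace -f -e trace=process' output.
$|=1;
use strict;

# %pidmap: pid => { -parent => parent pid, -exec => [programs exec'd] }
# @order:  pids in the order they first appear in the log
my (%pidmap, @order);

while ( <> ) {
    chomp;
    # Each line looks like: PID syscall(args...) = return value
    if ( /^(\d+)\s+(\w+)(.*)$/ ) {
        my ($pid, $syscall, $args) = ($1, $2, $3);
        # clone/clone3/fork/vfork returning a positive pid: a child was created
        if ( $syscall =~ /^(?:clone3?|v?fork)$/ and $args =~ / = (\d+)$/ and $1 > 0 ) {
            my $clonepid = $1;
            $pidmap{$clonepid} = { -parent => $pid };
            push(@order, $clonepid);
        }
        # exec* returning 0: this pid is now running a new program
        elsif ( $syscall =~ /^exec/ and $args =~ / = (\d+)$/ and $1 == 0 ) {
            my $exec = $args;
            @order = ($pid) if !@order;    # the first exec belongs to the root process
            $exec =~ s/^\("([^"]+?)",.*$/$1/g;    # keep only the program path
            push( @{ $pidmap{$pid}->{-exec} } , $exec );
        }
    }
}

# Print each program indented by its depth in the process tree.
foreach my $pid ( @order ) {
    my @execs = @{ $pidmap{$pid}->{-exec} || [] };
    next unless @execs;    # forked but never exec'd: nothing to show
    my $spaces = walkpid($pid);
    print " " x $spaces . join("\n" . (" " x $spaces), map { $_ . " ($pid)" } @execs ) . "\n";
}

# Return the number of ancestors of $pid recorded in %pidmap.
sub walkpid {
    my $pid = shift;
    my $c = shift || 0;
    if ( exists $pidmap{$pid}->{-parent} ) {
        return walkpid($pidmap{$pid}->{-parent}, $c+1);
    }
    return $c;
}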
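Since the question asked for a picture and for file usage, here is a rough sketch of how the same parsing idea could be pushed a bit further to emit Graphviz DOT instead of indented text. It is not part of the script above. The sub names (parse_strace, to_dot, dot_escape) are made up for this example, and it assumes the log was collected with something like 'strace -f -o trace.log -e trace=process,file [your command]', that lines have the usual 'PID syscall(args) = ret' shape, and that split calls show up as '<... clone resumed>' lines. Only open/openat are looked at, and a file counts as written if it was opened with O_WRONLY, O_RDWR or O_CREAT, which is a simplification.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse 'strace -f -o' lines into three tables:
#   parent: child pid => parent pid        (clone/clone3/fork/vfork)
#   exec:   pid => [programs exec'd]       (successful exec*)
#   files:  pid => { path => 'r' or 'w' }  (successful open/openat)
sub parse_strace {
    my @lines = @_;
    my (%parent, %exec, %files);
    for my $line (@lines) {
        # Accept both 'PID call(args) = ret' and 'PID <... call resumed>args) = ret'.
        next unless $line =~
            /^(\d+)\s+(?:<\.\.\.\s+)?(\w+)(?:\s+resumed>|\()(.*)\)\s+=\s+(-?\d+)/;
        my ($pid, $syscall, $args, $ret) = ($1, $2, $3, $4);
        if ($syscall =~ /^(?:clone3?|v?fork)$/ and $ret > 0) {
            $parent{$ret} = $pid;
        }
        elsif ($syscall =~ /^exec/ and $ret == 0) {
            push @{ $exec{$pid} }, $1 if $args =~ /^"([^"]+)"/;
        }
        elsif ($syscall =~ /^open(?:at)?$/ and $ret >= 0) {
            next unless $args =~ /"([^"]+)",\s*([A-Z_|]+)/;
            my ($path, $flags) = ($1, $2);
            my $mode = $flags =~ /O_(?:WRONLY|RDWR|CREAT)/ ? 'w' : 'r';
            # A write anywhere wins over earlier reads of the same path.
            $files{$pid}{$path} = $mode
                unless ($files{$pid}{$path} // '') eq 'w';
        }
    }
    return (\%parent, \%exec, \%files);
}

# Escape double quotes and backslashes for use inside a DOT string.
sub dot_escape {
    my ($s) = @_;
    $s =~ s/(["\\])/\\$1/g;
    return $s;
}

# Build a DOT graph: boxes are processes, ellipses are files.
# Edges: parent -> child, file -> reader, writer -> file.
sub to_dot {
    my ($parent, $exec, $files) = @_;
    my %pids = map { ($_ => 1) }
        keys %$parent, values %$parent, keys %$exec, keys %$files;
    my @out = ('digraph pipeline {', '  rankdir=LR;');
    for my $pid (sort { $a <=> $b } keys %pids) {
        # '\n' is left literal on purpose: DOT turns it into a line break.
        my $label = join '\n', map { dot_escape($_) }
            "pid $pid", @{ $exec->{$pid} || [] };
        push @out, sprintf('  p%d [shape=box, label="%s"];', $pid, $label);
    }
    for my $child (sort { $a <=> $b } keys %$parent) {
        push @out, sprintf('  p%d -> p%d;', $parent->{$child}, $child);
    }
    for my $pid (sort { $a <=> $b } keys %$files) {
        for my $path (sort keys %{ $files->{$pid} }) {
            my $node = '"' . dot_escape($path) . '"';
            push @out, $files->{$pid}{$path} eq 'w'
                ? sprintf('  p%d -> %s;', $pid, $node)
                : sprintf('  %s -> p%d;', $node, $pid);
        }
    }
    push @out, '}';
    return join("\n", @out) . "\n";
}

# Only run when given a log file on the command line.
if (@ARGV) {
    my @lines = <>;
    print to_dot(parse_strace(@lines));
}
```

Saved as, say, strace2dot.pl (a made-up name), it could be run as 'perl strace2dot.pl trace.log | dot -Tpng -o trace.png'. With trace=file the graph will also pick up every shared library and config file each program opens, so for a 100-program pipeline you would probably want to filter paths before drawing.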
Valgrind's callgrind tool will profile all calls a program makes. You can then feed the output to kcachegrind (or qcachegrind for the Qt version), which will nicely visualize the profiling run.