I am currently making a shared library out of some existing C code, for eventual inclusion in Debian. Because the author wasn't thinking about things like ABIs and APIs, the code is not too careful about what symbols it exports, and I decided to clean up some of the more obviously private exported symbols.
I wrote the following simple script because I got tired of running grep by hand. If you run it with
grep-symbols symbolfile *.c
it will print the symbols sorted by the number of the remaining arguments (files) in which each one occurs.
#!/usr/bin/perl
use strict;
use File::Slurp;
my $symfile=shift(@ARGV);
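# use "or" rather than "||": "||" binds to the filename string, so die would never run
open(SYMBOLS, '<', $symfile) or die "$symfile: $!";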
# "parse" the symbols file
my %count=();
# skip first line;
$_=<SYMBOLS>;
while(<SYMBOLS>){
    chomp();
    s/^\s*([^\@]+)\@.*$/$1/;
    $count{$_}=0;
}
# check the rest of the command line arguments for matches against symbols. Omega(n^2), sigh.
foreach my $file (@ARGV){
    my $string=read_file($file);
    foreach my $sym (keys %count){
        # \Q...\E makes sure the symbol is matched literally
        if ($string =~ m/\b\Q$sym\E\b/){
            $count{$sym}++;
        }
    }
}
print "Symbol\tCount\n";
foreach my $sym (sort {$count{$a} <=> $count{$b}} (keys %count)){
    print "$sym\t$count{$sym}\n";
}
- Updated: thanks to Peter Pöschl for pointing out that the file slurp should not be in the inner loop.
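The comment in the script grumbles about the nested loop over files and symbols. One possible way around it, sketched here and not part of the original script, is to join all the symbols into a single alternation regex and scan each file once. The helper name count_symbol_files is hypothetical, and the sketch assumes the symbols are plain C identifiers, so that \b behaves the same way as in the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): for each symbol, count
# how many of the given texts mention it at least once, with one regex
# scan per text instead of one match per symbol per text.
sub count_symbol_files {
    my ($symbols, $texts) = @_;
    my %count = map { $_ => 0 } @$symbols;
    # An empty alternation would match at every word boundary, so bail out.
    return \%count unless @$symbols;

    # One alternation of all symbols, each quoted so it matches literally.
    my $alt = join '|', map { quotemeta } @$symbols;
    my $re  = qr/\b($alt)\b/;

    foreach my $text (@$texts) {
        my %seen;
        # Collect every distinct symbol that appears in this text.
        $seen{$1} = 1 while $text =~ /$re/g;
        $count{$_}++ foreach keys %seen;
    }
    return \%count;
}

# Made-up inputs, for illustration only.
my $counts = count_symbol_files(
    [qw(foo_init foo_private helper)],
    ["int foo_init(void) { return helper(); }", "void helper(void) {}"],
);
foreach my $sym (sort { $counts->{$a} <=> $counts->{$b} || $a cmp $b } keys %$counts) {
    print "$sym\t$counts->{$sym}\n";
}
```

In the real script the texts would come from read_file on each command line argument, exactly as before. Whether this is actually faster depends on how well Perl's regex engine handles the alternation, so it is worth timing on real data before switching.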