I recently decided to try maintaining a Debian package (bibutils) without committing any patches to Git. One disadvantage of this approach is that the patches meant for upstream are no longer neatly collected in ./debian/patches, so I wrote a little tool to work out which commits should be sent upstream. I'm not too happy about its length, or about the name "git-classify", but I'm posting it in case someone has suggestions, or finds it useful.
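#!/usr/bin/perl
use strict;
use warnings;

# -u: print only the hashes of commits that touch no file under debian/
my $upstreamonly = 0;
if (@ARGV && $ARGV[0] eq "-u") {
    $upstreamonly = 1;
    shift(@ARGV);
}

# List-form open bypasses the shell, so git-log arguments need no extra quoting.
open(my $git, "-|", "git", "log", "-z", "--format=%n%x00%H", "--name-only", @ARGV)
    or die "git-classify: cannot run git log: $!\n";

# throw away blank line at the beginning.
my $first = <$git>;

my $sha = "";
LINE: while (<$git>) {
    chomp();
    next LINE if (m/^\s*$/);
    if (m/^\x00([0-9a-fA-F]+)/) {
        # A NUL followed by a hash starts a new commit.
        $sha = $1;
    } else {
        # Otherwise the line holds the NUL-separated names of the changed files.
        my $debian = 0;
        my $upstream = 0;
        foreach my $word (split(/\x00/, $_)) {
            if ($word =~ m@^debian/@) {
                $debian++;
            } elsif (length($word) > 0) {
                $upstream++;
            }
        }
        if (!$upstreamonly) {
            print "$sha\t";
            print "MIXED" if ($upstream > 0 && $debian > 0);
            print "upstream" if ($upstream > 0 && $debian == 0);
            print "debian" if ($upstream == 0 && $debian > 0);
            print "\n";
        } else {
            print "$sha\n" if ($upstream > 0 && $debian == 0);
        }
    }
}
close($git);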

=pod

=head1 Name

git-classify - Classify commits as upstream, debian, or MIXED

=head1 Synopsis

=over

=item B<git classify> [B<-u>] [I<arguments for git-log>]

=back

=head1 Description

Classify a range of commits (specified as for git-log) as I<upstream>
(touching only files outside ./debian), I<debian> (touching only files
inside ./debian) or I<MIXED> (touching both). Commits of this last kind
should presumably be discouraged.

=head2 Options

=over

=item B<-u>

Output only the SHA1 hashes of upstream commits (as defined above).

=back

=head1 Examples

Generate all likely patches to send upstream:

    git classify -u $SHA..HEAD | xargs -L1 git format-patch -1

=cut
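
To check a single commit by hand, the classification rule from the Description can also be sketched as a small shell function. This is only an illustration, not part of git-classify: the function name classify_paths is made up, and it assumes one path per line on standard input rather than the NUL-separated output the script parses.

```shell
#!/bin/sh
# classify_paths (hypothetical helper): read one path per line on stdin and
# print "upstream", "debian" or "MIXED"; print nothing if no paths were given.
classify_paths() {
    debian=0
    upstream=0
    while IFS= read -r path; do
        case "$path" in
            "") ;;                                # ignore blank lines
            debian/*) debian=$((debian + 1)) ;;   # packaging file
            *) upstream=$((upstream + 1)) ;;      # anything outside debian/
        esac
    done
    if [ "$upstream" -gt 0 ] && [ "$debian" -gt 0 ]; then
        echo MIXED
    elif [ "$upstream" -gt 0 ]; then
        echo upstream
    elif [ "$debian" -gt 0 ]; then
        echo debian
    fi
}
```

One way to feed it is `git diff-tree --no-commit-id --name-only -r HEAD | classify_paths`. By default git diff-tree prints nothing for merge commits and needs --root for a root commit, so those cases would need extra handling.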