sort order for Bio::Tree::Node each_Descendent

I’ve updated the code for Bio::Tree::Node so that each_Descendent can return the nodes in alphabetical order. This is achieved by passing in the string ‘alpha’ (for alphabetical) or ‘revalpha’ (for reverse alphabetical). Internal nodes are sorted by the alphabetically smallest or largest node in their sub-clade.
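
To make the internal-node rule concrete, here is a minimal sketch in plain Perl that does not use BioPerl at all. The tree representation (a leaf is a string, an internal node is an array reference) and the names leaf_names and ordered_children are hypothetical, made up for illustration. It also assumes that ‘alpha’ keys each child on its smallest leaf name and ‘revalpha’ on its largest; the module itself may differ in detail.

```perl
use strict;
use warnings;
use List::Util qw(minstr maxstr);

# Hypothetical stand-in for a tree: a leaf is a plain string,
# an internal node is an array reference of children.
sub leaf_names {
    my ($node) = @_;
    return ($node) unless ref $node;
    return map { leaf_names($_) } @$node;
}

# Order the direct children of $node; $mode is 'alpha' or 'revalpha'.
sub ordered_children {
    my ($node, $mode) = @_;
    if ($mode eq 'revalpha') {
        # Assumption: key each child on its largest leaf name, descending.
        return map  { $_->[1] }
               sort { $b->[0] cmp $a->[0] }
               map  { [ maxstr(leaf_names($_)), $_ ] } @$node;
    }
    # Assumption: key each child on its smallest leaf name, ascending.
    return map  { $_->[1] }
           sort { $a->[0] cmp $b->[0] }
           map  { [ minstr(leaf_names($_)), $_ ] } @$node;
}

my $tree = [ 'mouse', [ 'zebra', 'ant' ], 'cat' ];
print join(',', map { ref $_ ? '(' . join(',', @$_) . ')' : $_ }
                ordered_children($tree, 'alpha')), "\n";
```

With ‘alpha’ the clade holding ant comes first because its smallest leaf sorts before cat and mouse; with ‘revalpha’ the same clade would again come first, this time because of zebra.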

In addition, you can control the order in which Bio::TreeIO::newick writes nodes by passing the -order_by flag, which can be ‘alpha’, ‘revalpha’, ‘height’, or ‘creation’.

So you can use it like this:


use strict;
use Bio::TreeIO;

my $treeio = Bio::TreeIO->new(-format => 'newick',
                              -file => 'file.tre');
my $tree = $treeio->next_tree;
for my $internal ( grep { ! $_->is_Leaf } $tree->get_nodes ) {
  # this will get the nodes in alphabetical order
  my @subclade = $internal->each_Descendent('alpha');
}
my $out = Bio::TreeIO->new(-format =>'newick',
                           -order_by => 'alpha',
                           -file => '>sorted.tre'); # leading '>' opens the file for writing
$out->write_tree($tree);

This code is in CVS now.

Also, I fixed the code reference option so that arbitrary functions can be passed in. Here is an example which prints out the nodes from a nexus formatted tree file.


use Bio::TreeIO;
use strict;
my $treeio = Bio::TreeIO->new(-format => 'nexus',
			      -file   => shift);

my $tree = $treeio->next_tree;

# pass a reference to the comparator; a bare &my_sort_routine would call it instead
print join(",",map { $_->id }
             $tree->get_root_node->each_Descendent(\&my_sort_routine)), "\n";

# Comparator: order two nodes by the alphabetically first leaf id
# found at or below each of them.
sub my_sort_routine {
  my ($aa,$bb) = @_;
  if( $aa->is_Leaf && $bb->is_Leaf ) {
    return $aa->id cmp $bb->id;
  } elsif( $aa->is_Leaf ) {
    # leaf against the first leaf in $bb's sub-clade
    my ($right) = sort { $a->id cmp $b->id }
                  grep { $_->is_Leaf } $bb->get_all_Descendents;
    return $aa->id cmp $right->id;
  } elsif( $bb->is_Leaf ) {
    # first leaf in $aa's sub-clade against the leaf
    my ($left) = sort { $a->id cmp $b->id }
                 grep { $_->is_Leaf } $aa->get_all_Descendents;
    return $left->id cmp $bb->id;
  } else {
    # both internal: compare the first leaf of each sub-clade
    my ($left) = sort { $a->id cmp $b->id }
                 grep { $_->is_Leaf } $aa->get_all_Descendents;
    my ($right) = sort { $a->id cmp $b->id }
                  grep { $_->is_Leaf } $bb->get_all_Descendents;
    return $left->id cmp $right->id;
  }
}
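
The four branches of the comparator above all look for the same thing, the first leaf id at or below a node, so they can be folded into one helper. The following is only a sketch of that idea: MockNode is a hypothetical stand-in that implements just id, is_Leaf and get_all_Descendents so the example runs without BioPerl, and first_leaf_id and by_first_leaf are made-up names, not part of the module.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Bio::Tree::Node, implementing only the three
# methods the comparator needs, so the sketch runs without BioPerl.
package MockNode;
sub new {
    my ($class, $id, @children) = @_;
    return bless { id => $id, children => [@children] }, $class;
}
sub id      { $_[0]{id} }
sub is_Leaf { !@{ $_[0]{children} } }
sub get_all_Descendents {
    my ($self) = @_;
    return map { ($_, $_->get_all_Descendents) } @{ $self->{children} };
}

package main;

# Alphabetically first leaf id at or below a node (hypothetical helper).
sub first_leaf_id {
    my ($node) = @_;
    return $node->id if $node->is_Leaf;
    my ($first) = sort { $a cmp $b }
                  map  { $_->id }
                  grep { $_->is_Leaf } $node->get_all_Descendents;
    return $first;
}

# Same ordering as the longer comparator, written with the helper.
sub by_first_leaf {
    my ($aa, $bb) = @_;
    return first_leaf_id($aa) cmp first_leaf_id($bb);
}

my $root = MockNode->new('root',
    MockNode->new('mouse'),
    MockNode->new('inner', MockNode->new('zebra'), MockNode->new('ant')),
    MockNode->new('cat'));
my @sorted = sort { by_first_leaf($a, $b) } @{ $root->{children} };
print join(',', map { $_->id } @sorted), "\n";
```

With a real tree the comparator would be handed to each_Descendent as a code reference, \&by_first_leaf, rather than used in a sort block directly.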


2 thoughts on “sort order for Bio::Tree::Node each_Descendent”

  1. marino says:

    Great job Jason!

    I have been trying to order nodes by similarity and the method ‘height’ works very well in most instances. It would be nice to have a ‘revheight’ method as well. I presume that the two methods would be similar to the “Ladderise” left and right from TreeView.

    I also think that it would be useful to have a leaf ordering algorithm similar to the one published by Bar-Joseph Z, Gifford DK and Jaakkola TS. Fast optimal leaf ordering for hierarchical clustering. Bioinformatics. 2001;17 Suppl 1:S22-9.

  2. Sounds good – don’t let me stop you from writing it… =)

    I think what is best is to add some of these comments to the module page on the wiki when someone has time so we can build a todo list on that page which everyone can edit and update when finished.
