BioPerl core 1.6.1 PPM available

October 10, 2009 – 1:18 am
BioPerl 1.6.1 is now available for ActivePerl as a PPM; instructions for downloading can be found on the BioPerl wiki. This has been tested only with ActivePerl 5.10 and above, so any feedback on older versions of ActivePerl would be greatly appreciated.

First 1.6.1 alphas of BioPerl-Run, BioPerl-DB, BioPerl-Network

October 1, 2009 – 12:40 pm
Running a bit late on this, so just a quick note that the first alphas for BioPerl-Run, BioPerl-DB, and BioPerl-Network have been uploaded to CPAN: BioPerl-Run, BioPerl-DB, BioPerl-Network.
They can also be downloaded from the BioPerl website: http://bioperl.org/DIST/RC/
This is the first run where we've switched to a regular Module::Build installation, so expect some initial bumps! There are a few initial problems that I plan on addressing soon. The main one is that none of the modules are assigned version numbers (this may be a consequence of not pulling the version from a specific module). The other, more serious one is that the Build.PL script checks for DBI but doesn't check for any compatible DBD::* adaptors for BioPerl-DB (so it fails tests if DBI is installed). We have code in core to check for DBI drivers, so I may adapt that for BioPerl-DB.
Enjoy! chris

BioPerl 1.6.1 released

September 29, 2009 – 1:55 pm
We are pleased to announce the immediate availability of BioPerl 1.6.1, the latest release of BioPerl's core code. You can grab it here:
Via CPAN: http://search.cpan.org/~cjfields/BioPerl-1.6.1/
Via the BioPerl website:
http://bioperl.org/DIST/BioPerl-1.6.1.tar.bz2
http://bioperl.org/DIST/BioPerl-1.6.1.tar.gz
http://bioperl.org/DIST/BioPerl-1.6.1.zip
The PPM for Windows should also finally be available this week, ActivePerl problems permitting (we will post more information when it becomes available).
Tons of bug fixes and changes have been incorporated into this release. For a more complete change list please see the 'Changes' file included with the distribution. A few highlights:
- FASTQ parsing and interconversion of the three FASTQ variants (Sanger, Illumina, Solexa) now works (a concerted OBF effort!)
- Significant refactoring of Bio::Restriction methods
- Complete refactoring of Bio::Search-related tiling code, including HOWTO documentation
- GBrowse-related fixes:
  - berkeleydb database now autoindexes wig files and locks correctly
  - add Pg, SQLite, and faster BerkeleyDB implementations
- Infernal 1.0 output is now parsed
- New SearchIO-based parser for gmap -f9 output
- BLAST XML parsing essentially complete
- Installation via CPANPLUS should now work
For those using Strawberry ...

Working with FASTQ files in Biopython when speed matters

September 25, 2009 – 7:49 am
Biopython 1.51 onward includes support for Sanger, Solexa and Illumina 1.3+ FASTQ files in Bio.SeqIO, which allows a lot of neat tricks very concisely. For example, the tutorial (PDF) has examples of finding and removing primer or adaptor sequences. However, because the Bio.SeqIO interface revolves around SeqRecord objects, there is often a speed penalty. For example, with FASTQ files the quality string gets turned into a list of integers on parsing, and then re-encoded back to ASCII on writing. The new Bio.SeqIO.convert(...) function in Biopython 1.52 onwards makes converting from FASTQ to FASTA, or between the FASTQ variants, about five times faster. It can do this because it doesn't bother with creating any objects - it just uses Python strings. You can use the same approach in your own scripts. For example, suppose you have a Solexa FASTQ file where you want to trim all the reads, taking just the first 21 bases ...
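To make the strings-only idea concrete, here is a minimal sketch that trims every read to its first 21 bases using nothing but the Python standard library. It is not Biopython's own code: the function names iter_fastq and trim_fastq are hypothetical, and it assumes the simple layout where each record occupies exactly four lines (no wrapped sequence or quality lines). Because the quality string is only sliced and never decoded, the same code should work for Sanger, Solexa and Illumina files alike.

```python
from io import StringIO


def iter_fastq(handle):
    """Yield (title, sequence, quality) as plain strings.

    Assumption: each record is exactly four lines (no line wrapping).
    """
    while True:
        title = handle.readline()
        if not title:
            return  # end of file
        if not title.strip():
            continue  # tolerate blank lines between records
        seq = handle.readline().rstrip("\n")
        plus = handle.readline()
        qual = handle.readline().rstrip("\n")
        if not title.startswith("@") or not plus.startswith("+"):
            raise ValueError("Not a simple four-line FASTQ record")
        if len(seq) != len(qual):
            raise ValueError("Sequence and quality lengths differ")
        yield title[1:].rstrip("\n"), seq, qual


def trim_fastq(in_handle, out_handle, length=21):
    """Write each read cut to its first `length` bases; return the read count."""
    count = 0
    for title, seq, qual in iter_fastq(in_handle):
        # Slice the two strings in step; the quality encoding is never decoded.
        out_handle.write("@%s\n%s\n+\n%s\n" % (title, seq[:length], qual[:length]))
        count += 1
    return count


if __name__ == "__main__":
    # Made-up example read of 25 bases with a constant quality character.
    example = StringIO("@read1\nACGTACGTACGTACGTACGTACGTA\n+\n" + "h" * 25 + "\n")
    trimmed = StringIO()
    trim_fastq(example, trimmed)
    print(trimmed.getvalue(), end="")
```

For real files you would pass handles from open(...) instead of StringIO objects. If your FASTQ files wrap long lines, this simple reader will not cope and a proper parser is the safer choice.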

Biopython CVS to git migration

September 24, 2009 – 9:00 am
The release of Biopython 1.52 earlier this week marked the end of an era: it was our last release using CVS for source code control. As of now, Biopython is using a git repository, hosted on github.com, who kindly provide git hosting for open source projects free of charge. The BioRuby project has been using github for some time, so we are in good company. Our existing OBF hosted CVS repository will be maintained in the short to medium term as a backup, but will not be updated. Although many people have been involved in this move, we'd like to thank Bartek Wilczynski in particular for handling the CVS to git conversion, and for mirroring our CVS updates to git during the transition period. In the next few weeks hopefully we'll get our git usage wiki pages perfected, as we start using git for real.

Biopython 1.52 released

September 22, 2009 – 12:26 pm
We are pleased to announce the availability of Biopython 1.52, a new stable release of the Biopython library. It may only have been one month since the last release, but in that time we've added enough useful features to warrant a new release. In particular, Biopython 1.52 includes more substantial support for population genetics, and adds new functions that will be useful for people working with next generation sequencing. Tiago Antao's work on the Population Genetics module brings a command line wrapper for GenePop which allows the estimation of F-statistics, null allele frequencies and migration rates as well as tests for isolation by distance (IBD) and deviation from Hardy-Weinberg equilibrium. Bio.SeqIO and Bio.AlignIO both now have a new convert() function that allows for simple (and potentially optimized) conversion between file formats. Bio.SeqIO also gets a new index() function which allows random access to sequences in a file without reading every record ...

Simpler, optimized format conversion with Biopython

September 22, 2009 – 6:24 am
As per Peter's recent post we are using this space to show off a couple of the new features in Biopython 1.52 before it is released. In this post we'll look at the new convert() function that both Bio.SeqIO and Bio.AlignIO will get in Biopython 1.52. No one has ever complained that bioinformatics just doesn't have enough file formats - you probably frequently find yourself converting sequence files to suit particular applications with Bio.SeqIO. At the moment this is usually a two step process, something like this:
>>> records = SeqIO.parse(in_handle, "genbank")
>>> SeqIO.write(records, out_handle, "fasta")
As of Biopython 1.52, you'll be able to achieve the same result in a single step:
>>> SeqIO.convert(in_handle, "genbank", out_handle, "fasta")
In fact, it's even easier than that because the convert function will accept filename strings as well as file handles for both input and output. Adding the convert function to Bio.SeqIO and Bio.AlignIO will make your scripts more readable and ...

Indexing sequence files with Biopython

September 21, 2009 – 1:37 pm
The forthcoming release of Biopython 1.52 will include a couple of nice improvements to the Bio.SeqIO module, and here we're going to introduce the new index function. This will of course be covered in the Biopython Tutorial & Cookbook (PDF) once this code is released. Suppose you have a large sequence file with many many individual sequences in it. This could be next generation sequence data for example, maybe a FASTQ, FASTA or QUAL file. Or, it might be a big annotation rich file, such as the whole of UniProt, or a chunk of GenBank. The Bio.SeqIO.parse(...) function lets you iterate over all the records in a file, one by one. This allows you to process each sequence in turn, keeping only one in memory at a time. This approach is very valuable for dealing with big files. However, sometimes you can't just loop over the records in the order found in the ...
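As a rough illustration of what random access to a big sequence file can look like, the sketch below scans a FASTA file once, remembers the byte offset at which each record starts, and later seeks straight to a requested record. This is only the general idea, written with the standard library; it is not a description of how Bio.SeqIO.index works internally, and the function names build_fasta_index and fetch_fasta_record are hypothetical.

```python
from io import BytesIO


def build_fasta_index(handle):
    """Map each record identifier to the byte offset of its '>' line.

    `handle` must be opened in binary mode so that tell()/seek() offsets
    are plain byte positions.
    """
    index = {}
    while True:
        offset = handle.tell()
        line = handle.readline()
        if not line:
            break  # end of file
        if line.startswith(b">"):
            fields = line[1:].split(None, 1)
            if not fields:
                raise ValueError("FASTA header without an identifier")
            # The identifier is the first word after '>'.
            index[fields[0].decode("ascii")] = offset
    return index


def fetch_fasta_record(handle, index, record_id):
    """Seek to one record and return (description, sequence) as strings."""
    handle.seek(index[record_id])
    header = handle.readline().decode("ascii").rstrip()
    seq_lines = []
    for line in iter(handle.readline, b""):
        if line.startswith(b">"):
            break  # reached the next record
        seq_lines.append(line.decode("ascii").strip())
    return header[1:], "".join(seq_lines)


if __name__ == "__main__":
    # Made-up two-record file held in memory for demonstration.
    data = BytesIO(b">alpha first\nACGT\nAC\n>beta second\nGGCC\n")
    idx = build_fasta_index(data)
    print(fetch_fasta_record(data, idx, "beta"))
```

Only the identifiers and offsets are kept in memory, so the cost of the index grows with the number of records rather than with the size of the sequences themselves.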

BioRuby 1.3.1 released

September 2, 2009 – 9:47 am
We are pleased to announce the release of BioRuby 1.3.1. This new release fixes many bugs that existed in 1.3.0. Here is a brief summary of changes.
- Refactoring of BioSQL support.
- Bio::PubMed bug fixes.
- Bio::NCBI::REST bug fixes.
- Bio::GCG::Msf bug fixes.
- Bio::Fasta::Report bug fixes and added support for multiple query sequences.
- Bio::Sim4::Report bug fixes.
- Added unit tests for Bio::GCG::Msf and Bio::Sim4::Report.
- License of BioRuby is clarified.
In addition, many other changes have been made, mainly bug fixes. For more information, see the ChangeLog.
The archive is available at: http://bioruby.org/archive/bioruby-1.3.1.tar.gz
We have also put the RubyGems package on RubyForge, as always. You can easily install it by using RubyGems.
% sudo gem install bio
You can also obtain the bioruby gem file from bioruby.org: http://bioruby.org/archive/gems/bio-1.3.1.gem
Hope you enjoy.

Biopython 1.51 released

August 17, 2009 – 7:52 am
We are pleased to announce the release of Biopython 1.51. This new stable release enhances version 1.50 (released in April) by extending the functionality of existing modules, adding a set of application wrappers for popular alignment programs and fixing a number of minor bugs. In particular, the SeqIO module can now write GenBank files that include features, and deal with FASTQ files created by Illumina 1.3+. Support for this format allows interconversion between FASTQ files using Solexa, Sanger and Illumina variants using conventions agreed upon with the BioPerl and EMBOSS projects. Biopython 1.51 is the first stable release to include the Align.Applications module which allows users to define command line wrappers for popular alignment programs including ClustalW, Muscle and T-Coffee. Bio.Fasta and the application tools ApplicationResult and generic_run() have been marked as deprecated - Bio.Fasta has been superseded by SeqIO's support for the Fasta format and we provide ...