Tag Archives: FASTQ

Illumina FASTQ files – Read Segment Quality Control Indicator

In another quirk in the FASTQ story, recent Illumina FASTQ files don’t actually use the full range of PHRED scores, and a score of 2 has a special meaning: the Read Segment Quality Control Indicator (RSQCI, encoded as ‘B’). … Continue reading
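
As a rough illustration of what that means in practice, here is a minimal sketch in plain Python. It assumes the Illumina 1.3+ style encoding (ASCII offset 64, so a score of 2 becomes ‘B’) and assumes the indicator shows up as a run of ‘B’ characters at the 3' end of the read. The helper name `trim_rsqci_tail` is made up for this example and is not part of Biopython.

```python
ILLUMINA_OFFSET = 64  # Illumina 1.3+ FASTQ stores each score as chr(score + 64)
RSQCI_CHAR = chr(2 + ILLUMINA_OFFSET)  # a score of 2 is written as 'B'


def trim_rsqci_tail(seq, qual):
    """Drop the trailing run of 'B' quality characters (hypothetical helper).

    Assumes the flagged segment sits at the end of the read.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    keep = len(qual.rstrip(RSQCI_CHAR))
    return seq[:keep], qual[:keep]


# The last three bases carry the indicator and are removed.
seq, qual = trim_rsqci_tail("ACGTACGTAC", "hhhgfedBBB")
print(seq, qual)
```

Whether to trim, mask, or simply ignore the flagged segment depends on the downstream tool; the sketch only shows how to spot it.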

Posted in Biopython, Blogroll, Code, Community, Development, Documentation, HOWTO, OBF, OBF Projects

BioRuby 1.4.0 released

We are pleased to announce the release of BioRuby 1.4.0. This new release contains many new features, in addition to bug fixes and improvements. PhyloXML support: Support for reading and writing the PhyloXML file format has been added, developed by Diana Jaunzeikare, … Continue reading

Posted in BioRuby, Code, Development, OBF, OBF Projects

Sanger FASTQ format and the Solexa/Illumina variants

I’m delighted to announce an open access publication in Nucleic Acids Research describing the FASTQ file format based on the conventions agreed by the OBF projects: The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ … Continue reading
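
For readers new to the variants, the key difference is in how a quality character maps to a score. The sketch below is my own summary rather than code from the paper: Sanger FASTQ uses PHRED scores with an ASCII offset of 33, the Illumina 1.3+ variant uses PHRED scores with an offset of 64, and the older Solexa variant uses an offset of 64 with Solexa scores, which are defined from error odds rather than error probability. The function names are chosen for this example only.

```python
import math


def solexa_to_phred(q_solexa):
    """Convert a Solexa score to the equivalent PHRED score."""
    return 10 * math.log10(10 ** (q_solexa / 10) + 1)


def phred_to_solexa(q_phred):
    """Convert a PHRED score to the equivalent Solexa score.

    Only defined for q_phred > 0, since a PHRED score of 0 has no
    finite Solexa equivalent.
    """
    return 10 * math.log10(10 ** (q_phred / 10) - 1)


# The two scales differ noticeably for poor reads and converge for good ones.
for q in (-5, 0, 10, 40):
    print(q, round(solexa_to_phred(q), 2))
```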

Posted in BioJava, BioPerl, Biopython, BioRuby, Blogroll, Community, Development, Documentation, General, OBF, OBF Projects

Interleaving paired FASTQ files with Biopython

This post is about paired end data (FASTA or FASTQ) and manipulating it with Biopython’s Bio.SeqIO module (see also FASTQ conversions & speeding up FASTQ).
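
To give a flavour of the task, here is a minimal sketch of interleaving in plain Python. It is not the Bio.SeqIO code from the post: it handles FASTQ only, assumes each record is exactly four lines (no line wrapping), and assumes both inputs list the reads in the same order. The names `fastq_records` and `interleave` are invented for this example.

```python
import io
from itertools import islice, zip_longest


def fastq_records(handle):
    """Yield each FASTQ record as a list of its four lines (no line wrapping)."""
    while True:
        record = list(islice(handle, 4))
        if not record:
            return
        if len(record) != 4 or not record[0].startswith("@"):
            raise ValueError("truncated or malformed FASTQ record")
        yield record


def interleave(forward, reverse, out):
    """Write forward/reverse records alternately; return the number of pairs."""
    count = 0
    for fwd, rev in zip_longest(fastq_records(forward), fastq_records(reverse)):
        if fwd is None or rev is None:
            raise ValueError("the two inputs hold different numbers of reads")
        out.writelines(fwd)
        out.writelines(rev)
        count += 1
    return count


# In-memory stand-ins for the two paired files.
forward = io.StringIO("@r1/1\nACGT\n+\nIIII\n@r2/1\nTTTT\n+\nIIII\n")
reverse = io.StringIO("@r1/2\nGGCC\n+\nIIII\n@r2/2\nAAAA\n+\nIIII\n")
out = io.StringIO()
print(interleave(forward, reverse, out))
print(out.getvalue())
```

With real data you would pass open file handles instead of the `io.StringIO` objects; a stricter version would also check that the paired identifiers match.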

Posted in Biopython, Blogroll, Code, Community, Development, Documentation, HOWTO, OBF Projects

BioPerl 1.6.1 released

We are pleased to announce the immediate availability of BioPerl 1.6.1, the latest release of BioPerl’s core code. You can grab it here:

Via CPAN: http://search.cpan.org/~cjfields/BioPerl-1.6.1/

Via the BioPerl website:
http://bioperl.org/DIST/BioPerl-1.6.1.tar.bz2
http://bioperl.org/DIST/BioPerl-1.6.1.tar.gz
http://bioperl.org/DIST/BioPerl-1.6.1.zip

The PPM for Windows should also finally … Continue reading

Posted in BioPerl, Community, Development, Documentation, General, OBF, OBF Projects

Working with FASTQ files in Biopython when speed matters

Biopython’s SeqIO interface revolves around SeqRecord objects, which can impose a speed penalty. For FASTQ files the quality string gets turned into a list of integers on parsing, and is then re-encoded as ASCII on writing. Working directly with the raw strings is less flexible, but much faster. Continue reading
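
The round trip being described can be sketched in a few lines of plain Python. This is an illustration of the idea rather than Biopython’s own code, and it assumes Sanger-style FASTQ (ASCII offset 33); the function names are made up for the example. Trimming a read via decoded scores and trimming the raw strings give the same answer, but the second route skips the per-character conversion in both directions.

```python
SANGER_OFFSET = 33  # Sanger FASTQ stores each PHRED score as chr(score + 33)


def decode_quality(qual):
    """Turn a quality string into a list of integer PHRED scores."""
    return [ord(char) - SANGER_OFFSET for char in qual]


def encode_quality(scores):
    """Turn a list of integer PHRED scores back into a quality string."""
    return "".join(chr(score + SANGER_OFFSET) for score in scores)


def trim_via_scores(seq, qual, start):
    """Trim the first `start` bases by decoding and re-encoding the qualities."""
    scores = decode_quality(qual)
    return seq[start:], encode_quality(scores[start:])


def trim_raw(seq, qual, start):
    """Trim the first `start` bases by slicing the raw strings directly."""
    return seq[start:], qual[start:]


seq, qual = "ACGTACGTAC", "IIII5555!!"
print(decode_quality(qual))
print(trim_via_scores(seq, qual, 4) == trim_raw(seq, qual, 4))
```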

Posted in Biopython, Blogroll, Code, Community, Development, Documentation, HOWTO, OBF Projects

Simpler, optimized format conversion with Biopython

In this post we’ll look at the new convert() function that both Bio.SeqIO and Bio.AlignIO will get in Biopython 1.52. This allows easier file conversion, and internally provides a mechanism for specific optimisations, such as for FASTQ conversions. Continue reading
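
To show the kind of shortcut a FASTQ-specific conversion can take, here is a minimal sketch in plain Python. It is not the code behind convert(); it only assumes that an Illumina 1.3+ quality string (offset 64, scores 0 to 62) can be turned into a Sanger one (offset 33) by remapping characters, with no need to build a list of integers. The names `_ILLUMINA_TO_SANGER` and `illumina_to_sanger_quality` are invented for this example.

```python
# Lookup table for str.translate: valid Illumina 1.3+ characters (ASCII 64-126)
# map to the Sanger character for the same score; every other ASCII character
# maps to None, which deletes it so that bad input can be detected afterwards.
_ILLUMINA_TO_SANGER = {
    code: (chr(code - 64 + 33) if 64 <= code <= 126 else None)
    for code in range(128)
}


def illumina_to_sanger_quality(qual):
    """Re-encode an Illumina 1.3+ quality string as Sanger (hypothetical helper).

    Assumes the input is ASCII, as FASTQ quality strings are.
    """
    converted = qual.translate(_ILLUMINA_TO_SANGER)
    if len(converted) != len(qual):
        raise ValueError("quality string contains characters outside the Illumina range")
    return converted


# Scores 40, 40, 40, 2 and 0 under each encoding.
print(illumina_to_sanger_quality("hhhB@"))  # prints III#!
```

The sequence and identifier lines pass through untouched in this kind of conversion, which is why a string-level remapping is enough.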

Posted in Biopython, Blogroll, Code, Development, Documentation, HOWTO, OBF, OBF Projects