Open Source, Open Door: increasing diversity in the bioinformatics open source community

The Bioinformatics Open Source Conference (BOSC) has always been about community. Launched in 2000, BOSC aims to provide a forum for both bioinformatics developers and users to share ideas and code and learn about the latest developments in open source bioinformatics and open science.

Our goal this year is to welcome even greater participation, opening the door even wider to participants who have historically been underrepresented in the world of open source bioinformatics and, therefore, at BOSC. This includes (but is by no means limited to) women, people who aren’t white, older people, people from outside North America and Europe, and non-programmers.

During a Birds of a Feather (BoF) session held at BOSC 2014, we discussed ways to increase the diversity of BOSC attendees, and gathered many useful suggestions from the participants, some of which we have already acted upon.

One of the suggestions from the 2014 BoF was to add someone to the organizing committee to focus on outreach and community-building. In January 2015, we welcomed Dr. Sarah Hird as our new Outreach Coordinator. Sarah is currently a UC Davis Chancellor’s Postdoctoral Fellow with Jonathan Eisen in the UC Davis Genome Center, where her research interests lie at the intersection of phylogeography, bioinformatics and microbial diversity. Sarah is also known for her focus on promoting diversity in STEM. “I am personally and professionally interested in how we can make ‘the Academy’ a more representative sample of the world around us,” she says.

During the 2014 BoF, we were asked whether BOSC planned to adopt a Code of Conduct. We felt that this should be an ISCB-wide effort, not one that is limited to a single SIG. Our advocacy efforts with the ISCB were successful: a code of conduct has been published on the ISMB/ECCB 2015 website. We are very pleased that ISCB joins us in wanting to foster a collegial and productive environment for everyone who attends the conferences. The code of conduct will also be announced in the ISCB April Newsletter.

The high price of travel and registration can make it hard for some people to attend BOSC. We are trying to lower this barrier by offering free or half-price registration to a limited number of accepted speakers – please indicate in the Comments section of your abstract submission if you would like to apply for this. We also award Student Travel Fellowships to the authors of the three best student abstracts each year; these provide $250 to offset travel costs, as well as granting free registration to BOSC.

Every year, the agenda at BOSC includes a panel that gives all participants the opportunity to engage each other in discussion. This year, our panel discussion will focus on increasing diversity in our community and at our conferences. The panel will be chaired by Monica Munoz-Torres and will include panellists Holly Bik and Jason Williams (see bios below).

  • Dr. Monica Munoz-Torres (Twitter: @monimunozto) is the lead biocurator at Berkeley Bioinformatics Open-Source Projects (BBOP). She is part of the development teams for Web Apollo (a web-based annotation editor designed to support community-based curation of genomes) and the tools of the Gene Ontology (GO) Consortium. She co-leads the Community Curation group within the global initiative to sequence and annotate the genomes of 5,000 arthropods (i5K Initiative), and is a member of the Executive Committee of the International Society for Biocuration (ISB).
  • Dr Holly Bik (Twitter: @hollybik) is a Birmingham Fellow (assistant professor) in the School of Biosciences at the University of Birmingham, UK. Her research uses high-throughput environmental sequencing approaches (rRNA surveys, metagenomics) to explore biodiversity and biogeographic patterns in microbial eukaryote assemblages, with an emphasis on nematodes in marine sediments. Through active collaborations with computer scientists and participation in software development projects, her long-term research aims to address existing bottlenecks encountered in –Omic analyses focused on microbial eukaryotes.
  • Jason Williams (Twitter: @JasonWilliamsNY) is the Lead of the iPlant Collaborative’s Education, Outreach, Training (EOT) group, based at Cold Spring Harbor Laboratory, where he has worked for over 10 years. He is also a Lead Instructor of “The Science Institute” at Yeshiva University High School for Girls, and the Treasurer of the Software Carpentry Foundation. His background is in molecular biology and bioinformatics.

We are looking for two more panellists, and have some ideas – but your suggestions are welcome! Please email the BOSC committee or just tweet panellist ideas at @OBF_BOSC.

Finally, please spread the word about BOSC! The deadline for submitting abstracts for regular-length talks is tomorrow (Friday, April 3 – update: extended to Tuesday, April 7 due to Easter/Passover weekend), but there will also be opportunities for last-minute lightning talks and posters.


Posted in Blogroll, BOSC/ISMB, Community, OBF

GSoC project Sambamba published in scientific journal

(This is a repost of a blog post from Google Open Source news about Google’s open source student programs and software releases)

One of our goals with GSoC is to inspire young developers to participate in open source development, hopefully continuing well beyond the summer. Pjotr Prins from the Open Bioinformatics Foundation shared this story with us about a GSoC 2012 student who has continued leading the development of a software tool used in laboratories around the world. That tool, Sambamba, was recently featured in an Oxford University Press scientific journal.
The Open Bioinformatics Foundation (OBF) participated in Google Summer of Code (GSoC) in 2012 and again in 2014. One of our projects, Sambamba, enables users to rapidly process large sequence alignment files in the SAM, BAM and CRAM formats using parallel processing. Sambamba, which means “parallel” in Swahili, was recently the subject of a paper published in Bioinformatics Journal by GSoC alumnus Artem Tarasov. Since the tool is now used in DNA sequencing centres around the world, Artem has become well known in the bioinformatics community as Sambamba’s creator.

When we participated in GSoC 2012, we accepted five students, one of whom was Artem. His project was to “write the fastest parallelized BAM parser in D” as an alternative to the existing SAMtools software written in single-threaded C. I consider the D language to be particularly well-suited to bioinformatics given its modern hybrid OOP/functional syntax with close-to-the-metal performance optimizations.

Even before GSoC started that year, Artem was doing research and cranking out code. In his blog, he wrote about learning the D language, dealing with parallel executing code, and the sometimes-buggy compiler and garbage collector. The file formats he was working with are complicated and contain many assumptions, but he made wise choices which led to a very effective piece of software: people tend to rave about Sambamba when they use it the first time. Artem and I continued working on Sambamba after GSoC and before long, I found that he was the one mentoring me!

Since then, Artem has been invited to visit the Cuppen sequencing lab in the Netherlands where he added depth analysis to Sambamba. This is also when we started work on the manuscript for the Bioinformatics Journal. Later, the OBF was able to sponsor a second trip to the European Bioinformatics Institute in Cambridge, UK where he and I took part in a Codefest and met with other bioinformatics researchers and developers, including some OBF contributors.

Artem isn’t our only GSoC student who has continued making a difference in open source. Four of our five GSoC 2012 students are still active FOSS committers on GitHub, with three of them continuing in the bioinformatics space. Although GSoC can be competitive and we haven’t been accepted into the program every year, we’re grateful for the opportunities it has given us. Organizations like OBF and SciRuby are proof that GSoC and scientific projects work really well together. Without GSoC, Artem and I would probably not have ever met. He and I both hope to introduce more students to scientific open source projects in the future.
By Pjotr Prins, Sambamba GSoC Mentor

Posted in BioRuby, Google Summer of Code, OBF

BOSC 2015 Keynote Speakers

Announcing the keynote speakers for the Bioinformatics Open Source Conference, BOSC 2015:

Holly Bik

Dr Holly Bik is a Birmingham Fellow (assistant professor) in the School of Biosciences at the University of Birmingham, UK. She obtained her Ph.D. in molecular phylogenetics at the University of Southampton, UK (working in conjunction with the Natural History Museum, London), followed by subsequent postdoctoral appointments at the Hubbard Center for Genome Studies at the University of New Hampshire and the UC Davis Genome Center.

Her research uses high-throughput environmental sequencing approaches (rRNA surveys, metagenomics) to explore biodiversity and biogeographic patterns in microbial eukaryote assemblages, with an emphasis on nematodes in marine sediments. Through active collaborations with computer scientists and participation in software development projects, her long-term research aims to address existing bottlenecks encountered in –Omic analyses focused on microbial eukaryotes.
Holly’s keynote talk topic is “Bioinformatics: Still a scary world for biologists”.

Many biological disciplines remain staunchly traditional, where high-throughput DNA sequencing and bioinformatics have not yet become widely adopted. In this talk, I’ll discuss the ongoing challenges and barriers facing biologists in the age of ‘Omics, based on my experiences in transitioning from nematode taxonomy to computational biology research.

Homepage: Holly Bik, Twitter: @hollybik

Ewan Birney

Dr Ewan Birney is Joint Associate Director of EMBL-EBI, as well as Interim Head of the Centre for Therapeutic Target Validation. Together with Dr Rolf Apweiler, he has strategic responsibility and oversight for bioinformatics services at EMBL-EBI.

Ewan played a vital role in annotating the genome sequences of the human, mouse, chicken and several other organisms; this work has had a profound impact on our understanding of genomic biology. He led the analysis group for the ENCODE project, which is defining functional elements in the human genome. He was also one of the leaders of the BioPerl project. Ewan’s main areas of research include functional genomics, assembly algorithms, statistical methods to analyse genomic information (in particular information associated with individual differences) and compression of sequence information.

He has received a number of prestigious awards including the 2003 Francis Crick Award from the Royal Society, the 2005 Overton Prize from the International Society for Computational Biology and the 2005 Benjamin Franklin Award for contributions in Open Source Bioinformatics. He was elected a Fellow of the Royal Society in 2014.

Ewan was a cofounder of the Open Bioinformatics Foundation, the organization that sponsors BOSC, and has been involved in BOSC since the first conference in 2000. He chaired the meeting in 2001, and gave one of the keynote talks in 2002. We are delighted to have him back as a keynote speaker for 2015.

Ewan’s talk topic will be announced soon.

Homepage: Ewan Birney, Twitter: @ewanbirney

The BOSC 2015 call for abstracts is currently open, and BOSC/ISMB/ECCB 2015 registration has also just opened. We hope to see you in Dublin!

Posted in Blogroll, BOSC/ISMB, Community, General, OBF, Website

BOSC 2015 call for Abstracts

Call for Abstracts for the 16th Annual Bioinformatics Open Source Conference (BOSC 2015), a Special Interest Group (SIG) of ISMB/ECCB 2015.

Important Dates:


The Bioinformatics Open Source Conference (BOSC) covers the wide range of open source bioinformatics software being developed, and encompasses the growing movement of Open Science, with its focus on transparency, reproducibility, and data provenance. We welcome submissions relating to all aspects of bioinformatics and open science software, including new computational methods, reusable software components, visualization, interoperability, and other approaches that help to advance research in the biomolecular sciences. We particularly wish to invite those who have not participated in previous BOSCs to join us this year!

Two full days of talks, posters, panel discussions, and informal discussion groups will enable BOSC attendees to interact with other developers and share ideas and code, as well as learning about some of the latest developments in the field of open source bioinformatics. BOSC is sponsored by the Open Bioinformatics Foundation, a non-profit, volunteer-run group dedicated to promoting the practice and philosophy of Open Source software development and Open Science within the biological research community.

We invite you to submit one-page abstracts for talks and posters. As mentioned, any topics relevant to open source bioinformatics and open science are welcome. Here are some potential session topics (but please don’t feel limited to these!):

  • Open Science and Reproducible Research
  • Standards and Interoperability
  • Data Science
  • Visualization
  • Translational Bioinformatics
  • Bioinformatics Open Source Libraries and Projects

If your company or organization is interested in being a sponsor for BOSC 2015, please contact us! Sponsors of BOSC 2014 included Google, Eagle Genomics, GigaScience, and Curoverse – we thank them for their support.

BOSC 2015 Organizing Committee:
Nomi Harris and Peter Cock (co-chairs), Raoul Jean Pierre Bonnal, Brad Chapman, Robert Davey, Christopher Fields, Sarah Hird, Karsten Hokamp, Hilmar Lapp, Monica Munoz-Torres.

Posted in Blogroll, BOSC/ISMB, Community, Development, General, OBF

Sadly OBF not accepted for GSoC 2015

Last year’s Google Summer of Code 2014 was very productive for the OBF, with six students working on Bio* and related bioinformatics projects. We applied to be part of GSoC 2015, but unfortunately we were not accepted this year.

Google’s program is enormously popular, and over-subscribed, meaning Google has had to rotate organisation membership. The OBF is grateful to have been accepted in 2010, 2011, 2012 and 2014. This year any participation will be down to individual projects to find a willing umbrella group from the organisations accepted for GSoC 2015. For example, a Biopython project was included under NESCent for GSoC 2013.

Other organizations with bioinformatics as a keyword are Ruby Science Foundation, Department of Biomedical Informatics, Stony Brook University, OncoBlocks, University of Nebraska – Helikar Lab. Other organizations related to sciences are ASCEND, BRL-CAD, Debian Project, HPCC Systems®, International Neuroinformatics Coordinating Facility, lmonade: scientific software distribution, OSGeo – Open Source Geospatial Foundation, The Concord Consortium, The Visualization Toolkit. Languages: Python, Scala, Apache Foundation. Last but not least: Global Alliance for Genomics & Health.

On behalf of the OBF, we would like to thank our volunteer GSoC Administrators, Raoul Bonnal and Francesco Strozzi, for organising our application – and all our potential mentors across the Bio* projects who put forward potential project suggestions.

Posted in BioJava, BioLib, BioPerl, Biopython, BioRuby, Blogroll, Community, Development, General, Google Summer of Code, OBF, OBF Projects

OBF Google Summer of Code 2014 Wrap-up

In 2014, OBF had six students in the Google Summer of Code 2014™ (GSoC) program mentored under its umbrella of Bio* and related open-source bioinformatics community projects: Loris Cro (BioRuby) with mentors Francesco Strozzi and Raoul Bonnal; Evan Parker (Biopython) with mentors Wibowo Arindrarto and Peter Cock; Sarah Berkemer (BioHaskell) with mentors Christian Höner zu Siederdissen and Ketil Malde; and three students contributed to JSBML: Victor Kofia (mentors: Alex Thomas and Sarah Keating), Ibrahim Vazirabad (mentors: Andreas Dräger and Alex Thomas), and Leandro Watanabe (mentors: Nicolas Rodriguez and Chris Myers).

As a change from earlier years in which OBF participated in GSoC as a mentoring organization, in 2014 we purposefully defined our umbrella as much more inclusive of the wider bioinformatics open-source community, bringing it more in line with the annual Bioinformatics Open-Source Conference (BOSC). In part this was also motivated by “paying it forward”, a concept central to growing healthy open-source communities, after the larger domain-agnostic language projects such as SciRuby and PSF had extended an open hand to OBF mentors when OBF did not get admitted as a GSoC mentoring organization in 2013. In the end, four out of the six successful student applications were for projects outside of the traditional core Bio* projects, a result with which everyone won: we had a terrific crop of students, our community grew larger and stronger, and open-source bioinformatics was advanced in a more diverse way than would have been possible otherwise.

Posted in Biopython, BioRuby, Community, Google Summer of Code, OBF

BOSC welcomes Sarah Hird as Outreach Coordinator

The BOSC 2015 Organizing Committee is pleased to welcome Sarah Hird as our new Outreach Coordinator. BOSC is eager to increase the participation of individuals and groups that have been historically underrepresented at our conferences, and Sarah will be spearheading this effort.

Sarah is currently a UC Davis Chancellor’s Postdoctoral Fellow with Jonathan Eisen in the UC Davis Genome Center, where her research interests lie at the intersection of phylogeography, bioinformatics and microbial diversity. She earned her PhD in biology and bioinformatics at LSU. Sarah is also known for her focus on promoting diversity in STEM. “I am personally and professionally interested in how we can make ‘the Academy’ a more representative sample of the world around us,” she says.

Please join us in welcoming Sarah to the BOSC organizing committee, and stay tuned for more information about BOSC 2015 (which will take place July 10-11, 2015, in Dublin).

Posted in BOSC/ISMB, Community

Biopython 1.65 released

Dear Biopythoneers,

Source distributions and Windows installers for Biopython 1.65 are now available from the downloads page on the official Biopython website and from the Python Package Index (PyPI).

This release of Biopython supports Python 2.6, 2.7, 3.3 and 3.4. It is also tested on PyPy 2.0 to 2.4, PyPy3 version 2.4, and Jython 2.7b2.

The most visible change is that the Biopython sequence objects now use string comparison, rather than Python’s object comparison. This has been planned for a long time, with warning messages in place under Python 2 (sadly, the warnings were missing under Python 3).
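To make the difference concrete, here is a minimal sketch using two toy classes. It is not Biopython's implementation: `IdentitySeq` and `StringSeq` are hypothetical names invented for illustration, and accepting a plain `str` on the right-hand side is an assumption of this sketch. The first class relies on Python's default object comparison; the second compares by its letters.

```python
class IdentitySeq:
    """Toy sequence (hypothetical) that keeps Python's default object comparison."""

    def __init__(self, data):
        self.data = data


class StringSeq:
    """Toy sequence (hypothetical) that compares by its letters, like a string."""

    def __init__(self, data):
        self.data = data

    def __eq__(self, other):
        # Compare letters against another StringSeq or a plain str.
        if isinstance(other, StringSeq):
            return self.data == other.data
        if isinstance(other, str):
            return self.data == other
        return NotImplemented

    def __hash__(self):
        # Defining __eq__ removes the default hash in Python 3,
        # so hash by the letters to stay usable in sets and dicts.
        return hash(self.data)


old_a, old_b = IdentitySeq("ACGT"), IdentitySeq("ACGT")
new_a, new_b = StringSeq("ACGT"), StringSeq("ACGT")

identity_equal = old_a == old_b      # distinct objects, so not equal
string_equal = new_a == new_b        # same letters, so equal
equals_plain_str = new_a == "ACGT"   # also equal to the matching str
distinct_in_set = len({new_a, new_b})
```

In this sketch, two objects with the same letters also collapse to a single set member or dictionary key, which is the kind of place where a switch from object comparison to string comparison tends to surface in existing code.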

The Bio.KEGG and Bio.Graphics modules have been expanded with support for the online KEGG REST API, and parsing, representing and drawing KGML pathways.

The Pterobranchia Mitochondrial genetic code has been added to Bio.Data (and the translation functionality), which is the new NCBI genetic code table 24.

The Bio.SeqIO parser for the ABI capillary file format now exposes all the raw data in the SeqRecord’s annotation as a dictionary. This allows further in-depth analysis by advanced users.

Bio.SearchIO QueryResult objects now allow Hit retrieval using its alternative IDs (any IDs listed after the first one, for example as used with the NCBI BLAST NR database).

Bio.SeqUtils.MeltingTemp has been rewritten with new functionality.

The new experimental module Bio.CodonAlign has been renamed Bio.codonalign (and similar lower case PEP8 style module names have been used for the sub-modules within this).

Bio.SeqIO.index_db(…) and Bio.SearchIO.index_db(…) now store any relative filenames relative to the index file, rather than (as before) relative to the current directory at the time the index was built. This makes the indexes less fragile, so that they can be used from other working directories. NOTE: This change is backward compatible (old index files work as before), however relative paths in new indexes will not work on older versions of Biopython!
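The effect of the path change can be sketched with plain path arithmetic. This is only an illustration, not the code Biopython uses: the `resolve` helper and all paths below are hypothetical, and `posixpath` is used so the results look the same on every platform.

```python
import posixpath


def resolve(stored_name, base_dir):
    """Join a stored relative filename onto a base directory (hypothetical helper)."""
    return posixpath.normpath(posixpath.join(base_dir, stored_name))


# Hypothetical layout: the index lives in one directory, the data in a sibling.
index_path = "/data/project/indexes/reads.idx"

# New behaviour: the stored name is relative to the index file itself,
# so it resolves the same way from any working directory.
new_stored = "../fasta/reads.fasta"
new_style = resolve(new_stored, posixpath.dirname(index_path))

# Old behaviour: the stored name was relative to the working directory
# at build time, so it only resolves correctly from that directory.
old_stored = "fasta/reads.fasta"
from_build_dir = resolve(old_stored, "/data/project")
from_other_dir = resolve(old_stored, "/home/user")
```

Here `new_style` and `from_build_dir` both point at `/data/project/fasta/reads.fasta`, while `from_other_dir` points somewhere else entirely, which is the fragility the release note describes.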

Behind the scenes, we have done a lot of work applying PEP8 coding styles to Biopython, and improving the formatting of the source code documentation (PEP257 docstrings).

Many thanks to the Biopython developers and community for making this release possible, especially the following contributors:

  • Alan Du (first contribution)
  • Carlos Pena (first contribution)
  • Colin Lappala (first contribution)
  • Christian Brueffer
  • David Bulger (first contribution)
  • Eric Talevich
  • Evan Parker (first contribution)
  • Hongbo Zhu
  • Kai Blin
  • Kevin Wu (first contribution)
  • Leighton Pritchard
  • Leszek Pryszcz (first contribution)
  • Markus Piotrowski
  • Matt Shirley (first contribution)
  • Mike Cariaso (first contribution)
  • Peter Cock
  • Seth Sims (first contribution)
  • Tiago Antao
  • Travis Wrightsman (first contribution)
  • Tyghe Vallard (first contribution)
  • Vincent Davis
  • Wibowo ‘Bow’ Arindrarto
  • Zheng Ruan

This is a longer list of contributors and changes than usual, but it was also a longer gap since our last release.

Posted in Biopython, Blogroll, Code, Development, OBF, OBF Projects

BOSC 2015 will be in Dublin with ISMB/ECCB 2015

We have asked you, and you have spoken! 59 past and/or future BOSC attendees participated in our survey, answering questions about what they liked at BOSC 2014, what changes they’d like to see, and (most importantly) what they thought about the proposal to hold BOSC 2015 in Norwich (UK) rather than as an ISMB/ECCB SIG in Dublin (Ireland).

Under this plan, BOSC 2015 would have been shortly before ISMB/ECCB, but in Norwich. We would have been hosted by The Genome Analysis Centre (TGAC) just after and co-located with the Galaxy Community Conference 2015 (GCC 2015, hosted by The Sainsbury Laboratory). Although some survey participants indicated that they would be more likely to attend BOSC 2015 if it were co-located with GCC, the majority preferred BOSC to remain an ISMB SIG, so we will hold BOSC 2015 in Dublin right before ISMB/ECCB 2015.

Here is the summary of responses to the questions about the location of BOSC 2015:



Although the survey is now closed, we are always happy to hear your suggestions for BOSC 2015. (We are particularly interested in increasing diversity at BOSC, and welcome suggestions of people to invite.) You can reach us at

Nomi Harris and Peter Cock
Co-Chairs, BOSC 2015

Posted in Blogroll, BOSC/ISMB, Community, OBF

BOSC 2014 video recording

We’re pleased to publicly announce that we aim to video record all the talks at BOSC 2014, and the panel discussion, to be made freely available online after the conference. This is on an opt-out basis, and thus far none of our speakers have declined to be filmed.

Last year we managed to record many of the talks – including both keynotes, which you can watch via the YouTube links on the BOSC 2013 Schedule. This year we are hiring a professional from Next Day Video (@NextDayVideo on Twitter).

This is thanks to very generous support from Google’s Open Source Programs Office (who also run the amazing Google Summer of Code program which the OBF and its member projects have regularly taken part in), a new sponsor for BOSC this year.

Posted in OBF