Design Overview

Design Aims

BioJava is a general-purpose bioinformatics toolkit that provides a framework for building everything from simple scripts to complete applications. Because BioJava is designed to be used as a library, we must:

  • Design by interface, but provide working implementations, so that you can always extend or replace the default behaviour.
  • Provide extensive API documentation as well as a clear overview of how it all fits together.
  • Give simple examples that show how to use the APIs.
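
As a rough illustration of the first aim, the sketch below shows an interface, a working default implementation, and a drop-in replacement. All of the names here (SequenceScorer, SimpleSequenceScorer, AtContentScorer, ScorerClient, ResidueFractions) are hypothetical and are not part of the BioJava API; they only show the pattern.

```java
import java.util.List;
import java.util.Locale;

/** Hypothetical interface: assigns a numeric score to a DNA sequence. */
interface SequenceScorer {
    /**
     * Scores a sequence.
     *
     * @param sequence the residues as text, for example "ACGT"; must not be null
     * @return the score; its meaning is defined by the implementation
     */
    double score(String sequence);
}

/** Shared helper (hypothetical) used by both implementations below. */
final class ResidueFractions {
    private ResidueFractions() {}

    /** Fraction of residues in sequence that appear in symbols (case-insensitive). */
    static double fraction(String sequence, String symbols) {
        if (sequence.isEmpty()) {
            return 0.0;
        }
        int hits = 0;
        for (char c : sequence.toUpperCase(Locale.ROOT).toCharArray()) {
            if (symbols.indexOf(c) >= 0) {
                hits++;
            }
        }
        return (double) hits / sequence.length();
    }
}

/** Working default implementation: the fraction of residues that are G or C. */
class SimpleSequenceScorer implements SequenceScorer {
    @Override
    public double score(String sequence) {
        return ResidueFractions.fraction(sequence, "GC");
    }
}

/** A replacement with different behaviour: the fraction of residues that are A or T. */
class AtContentScorer implements SequenceScorer {
    @Override
    public double score(String sequence) {
        return ResidueFractions.fraction(sequence, "AT");
    }
}

/** Client code that depends only on the interface, never on a concrete class. */
class ScorerClient {
    /** Returns the highest-scoring sequence; ties go to the earliest one. */
    static String best(List<String> sequences, SequenceScorer scorer) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String s : sequences) {
            double current = scorer.score(s);
            if (current > bestScore) {
                bestScore = current;
                best = s;
            }
        }
        return best;
    }
}
```

Because ScorerClient.best only sees the SequenceScorer interface, swapping SimpleSequenceScorer for AtContentScorer changes the result without touching the client code.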

The code base is open source, and we encourage you to modify it or fix bugs, as the LGPL allows. If you plan a major change, you will probably want to discuss it on the mailing list first.

What we cover now

Currently, there are objects for:

  • Sequences and features
    • IO
    • Processing, storing, manipulating
    • Visualising
  • Dynamic programming
    • Single-sequence and pair-wise HMMs
    • Viterbi-path, Forward and Backward algorithms
    • Training models
    • Sampling sequences from models
  • External file formats and programs
    • GFF
    • Blast
    • Meme
  • Sequence Databases
    • BioCorba interoperability
    • ACeDB client
    • DAS client
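
For readers who have not met the dynamic programming entries above, here is a minimal plain-Java sketch of the Viterbi-path algorithm for a small HMM. It does not use BioJava's own classes; ViterbiSketch, decodeToy and the two-state model with its probabilities are made up purely for illustration.

```java
class ViterbiSketch {
    /**
     * Returns the most probable state path for the observed symbols.
     * start[s], trans[from][to] and emit[s][symbol] are plain probabilities;
     * the scores are kept in log space to avoid underflow on long sequences.
     */
    static int[] viterbi(int[] obs, double[] start, double[][] trans, double[][] emit) {
        int n = obs.length;
        int k = start.length;
        if (n == 0) {
            return new int[0];
        }
        double[][] score = new double[n][k]; // best log-probability of a path ending in state s at position t
        int[][] back = new int[n][k];        // predecessor state on that best path

        for (int s = 0; s < k; s++) {
            score[0][s] = Math.log(start[s]) + Math.log(emit[s][obs[0]]);
        }
        for (int t = 1; t < n; t++) {
            for (int s = 0; s < k; s++) {
                int bestPrev = 0;
                double best = Double.NEGATIVE_INFINITY;
                for (int p = 0; p < k; p++) {
                    double candidate = score[t - 1][p] + Math.log(trans[p][s]);
                    if (candidate > best) {
                        best = candidate;
                        bestPrev = p;
                    }
                }
                score[t][s] = best + Math.log(emit[s][obs[t]]);
                back[t][s] = bestPrev;
            }
        }

        // Pick the best final state, then trace the back-pointers to the start.
        int last = 0;
        for (int s = 1; s < k; s++) {
            if (score[n - 1][s] > score[n - 1][last]) {
                last = s;
            }
        }
        int[] path = new int[n];
        path[n - 1] = last;
        for (int t = n - 1; t > 0; t--) {
            path[t - 1] = back[t][path[t]];
        }
        return path;
    }

    /**
     * Decodes a DNA string (A, C, G, T only) with a toy model:
     * state 0 is "AT-rich", state 1 is "GC-rich". All numbers are invented.
     */
    static int[] decodeToy(String dna) {
        double[] start = {0.5, 0.5};
        double[][] trans = {{0.9, 0.1}, {0.1, 0.9}};
        double[][] emit = {
            {0.4, 0.1, 0.1, 0.4}, // state 0: emission probabilities for A, C, G, T
            {0.1, 0.4, 0.4, 0.1}  // state 1: emission probabilities for A, C, G, T
        };
        int[] obs = new int[dna.length()];
        for (int i = 0; i < obs.length; i++) {
            obs[i] = "ACGT".indexOf(dna.charAt(i));
        }
        return viterbi(obs, start, trans, emit);
    }
}
```

With this toy model, decodeToy("AATTGCGC") labels the first four residues with state 0 and the last four with state 1, because a single state switch costs less than emitting four unlikely symbols.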

We hope to flesh out each of these areas as the project matures. We also aim to develop code for:

  • Expression data
  • Gene networks
  • Many more programs and file formats

Style guide

  • All java files must contain the license header.
  • Place an @author tag in every file that you edit. The 'maintainer' (either the original author, or the person currently overseeing the code) should be first, and then all other authors follow. Don't be shy - anything from a spelling correction in the Javadoc through to rewriting a whole method counts.
  • Always indent with spaces, not tabs. This makes life much easier for those of us with tab-paranoid editors.
  • Javadoc all interface methods fully. An interface is defined by the method signatures, clear documentation and a reference implementation.
  • Javadoc class methods when they are not implementing an interface method. Javadoc methods that implement an interface method only if clarification is needed; otherwise, trust the documentation inheritance.
  • Methods should nearly always specify types by interface, not concrete implementations. This makes it easier to extend the code later.
  • For every interface X that defines a useful object, provide an implementation named SimpleX in the same package that is a plain, pure-Java reference version. This gives other people a clearer idea of what the interface is meant to encapsulate. It also often makes it obvious if something is missing.
  • Discuss things with the biojava list - we may have a cunning plan.
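
Put together, a file that follows these rules might look roughly like the sketch below. ResidueCounter, SimpleResidueCounter and the author names are hypothetical placeholders, and the license text itself is deliberately not reproduced.

```java
/*
 * (The project's license header goes here; its text is not reproduced in this sketch.)
 */

import java.util.Map;
import java.util.TreeMap;

/**
 * Counts how often each residue symbol occurs in a sequence.
 * Hypothetical interface, used only to illustrate the style guide.
 *
 * @author A. Maintainer
 * @author B. Contributor
 */
interface ResidueCounter {
    /**
     * Counts the residues in a sequence.
     *
     * @param sequence the residues as text, for example "ACGT"; must not be null
     * @return a map from each residue that occurs to its number of occurrences
     */
    Map<Character, Integer> count(CharSequence sequence);
}

/**
 * Plain, pure-Java reference implementation of ResidueCounter.
 *
 * @author A. Maintainer
 */
class SimpleResidueCounter implements ResidueCounter {
    // No Javadoc on this method: the documentation is inherited from the interface.
    @Override
    public Map<Character, Integer> count(CharSequence sequence) {
        // Parameter and return types are interfaces (CharSequence, Map),
        // so callers never depend on the concrete TreeMap used here.
        Map<Character, Integer> counts = new TreeMap<>();
        for (int i = 0; i < sequence.length(); i++) {
            counts.merge(sequence.charAt(i), 1, Integer::sum);
        }
        return counts;
    }
}
```

The maintainer is listed first in each @author block, the indentation uses spaces only, and the SimpleResidueCounter name follows the SimpleX convention for the reference implementation of an interface X.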