Bio::ToolBox

Advanced Installation

This guide describes an advanced, complete installation of BioToolBox and its dependencies.

TLDR Brief guide

This is a no-nonsense, quick guide for those who already know what they’re doing on an established Linux system with a modern Perl installation, and know how to adjust accordingly for their system. If that doesn’t describe you, skip ahead to the Detailed guide.

Detailed guide

This assumes installation on a Linux workstation with standard compilation tools available. Installation on MacOS (x86_64) is also possible with Xcode Command Line Tools installed; see MacOS Notes for additional guidance.

Perl installations and locations

As a Perl package, BioToolBox needs to be installed under a Perl installation. It is not dependent on a specific Perl release version, although later releases (5.16 or newer) are preferred. Nearly every unix-like OS (Linux, MacOS) includes a system Perl installation. If not, one can be installed, either from the OS package manager or from source.
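As a quick check before choosing an option, the following sketch reports whether the perl found on your PATH meets the preferred 5.16 release. The variable names perl_bin and perl_status are illustrative only and are not part of BioToolBox.

```shell
# Which perl to test; override with PERL=/path/to/perl (illustrative variable).
perl_bin="${PERL:-perl}"

if ! command -v "$perl_bin" >/dev/null 2>&1; then
    perl_status="missing"
elif "$perl_bin" -e 'exit($] >= 5.016 ? 0 : 1)'; then
    # $] holds the numeric Perl version, e.g. 5.016000 for release 5.16.0
    perl_status="ok"
else
    perl_status="old"
fi
echo "perl status: $perl_status"
```

A status of old or missing suggests using the Custom installation option rather than the system Perl.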

Follow one of these options.

Home library

When you want to use the system-installed Perl (often /usr/bin/perl), but do not have write permissions to the system, you can install packages in your home directory. To do this, first install local::lib, which sets up a perl5 directory for local (home) module installations. The path is set by adding a statement to your home .profile or equivalent file, as described in the local::lib documentation. This can also be used for targeted, standalone installations; adjust accordingly. For example, the following command will install local::lib and the CPAN Minus application:

curl -L https://cpanmin.us | perl - -l $HOME/perl5 local::lib App::cpanminus \
&& echo 'eval "$(perl -I$HOME/perl5/lib/perl5 -Mlocal::lib)"' >> ~/.profile \
&& . ~/.profile

Custom installation

When the system Perl is old (many vendor OS Perl installations are sadly out of date), or you want a newer Perl but cannot or do not want to overwrite the system Perl, you should install a separate Perl version. It can be installed anywhere you have read/write access, including your home directory. While a new Perl version can be manually downloaded and installed from the main Perl site, there are easier ways.

An alternate package manager may be used to install a Perl version in a generally available location. For example, MacOS users can easily install a modern Perl using Homebrew. Similarly, Linux (and evidently Microsoft Windows Subsystem for Linux) users can use Linuxbrew. These typically install the latest production release with a single command.

To install a Perl in your home directory (or other location) with a simple, but powerful, tool, use the excellent PerlBrew. This tool can painlessly compile, install, and manage one or more Perl release versions side-by-side, allowing you to easily switch between releases with a simple command. It also manages multiple local::lib installations, in case you want to isolate packages.
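A minimal sketch of a PerlBrew setup is shown here, assuming a bash shell and the default PerlBrew location under your home directory. The function name setup_perlbrew is hypothetical, and the release name is only an example; choose one from the output of perlbrew available.

```shell
# Hypothetical helper; nothing runs until the function is called.
setup_perlbrew() {
    # Example release name only; pass a different one as the first argument.
    release="${1:-perl-5.38.2}"
    # Official installer; places perlbrew under ~/perl5/perlbrew by default.
    curl -L https://install.perlbrew.pl | bash || return 1
    # Make the perlbrew command available in the current (bash) session.
    . "$HOME/perl5/perlbrew/etc/bashrc"
    # Compile and install the requested release, then make it the default.
    perlbrew install "$release" || return 1
    perlbrew switch "$release"
    # Install cpanm for use with the perlbrew-managed Perl.
    perlbrew install-cpanm
}
```

Calling setup_perlbrew with no argument uses the example release; compiling Perl can take a while.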

BioToolBox does not utilize threading (it uses forks for parallel execution), so if you have a choice, compile a non-threaded Perl for a (very) slight performance gain. For those adventurous enough to try, BioToolBox does work under cperl, although installing some prerequisite modules is a trying experience (many failed tests and partial functionality).
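To see whether the Perl you are using was compiled with thread support, you can query its build configuration through the core Config module. This is a small sketch; the variable name perl_threads is illustrative.

```shell
# useithreads is set in %Config only when perl was built with ithreads.
perl_threads=$(perl -MConfig -e 'print $Config{useithreads} ? "threaded" : "non-threaded"' 2>/dev/null) \
    || perl_threads="unknown"
echo "perl build: $perl_threads"
```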

System installation

For privileged installations (requiring root access or sudo privilege) you probably already know what to do. You can use the --sudo or -S option to cpanm. Note that installing lots of packages in the OS vendor system perl is generally not recommended, as it could interfere with other vital OS functions or programs that expect certain versions or modules to be present. It’s best to use one of the other two methods.

External libraries

There are two external C libraries that are required for reading Bam and BigWig files. These are commonly used bioinformatics tools maintained by separate organizations, and the Perl modules only provide the XS bindings to these libraries. As such, it’s best to install these up front separately before attempting the Perl module installation. Note that both Perl modules Bio::DB::HTS and Bio::DB::Big include INSTALL.pl scripts within their bundles that can compile these external libraries for you in a semi-automated fashion. Proceed here if you wish to have more control over what and where these are installed.

Perl modules

Using a simple CPAN package installer such as CPAN Minus, i.e. cpanm, is highly recommended for ease and simplicity in installing modules from CPAN. It can install directly from CPAN or take a URL or downloaded archive file. Other CPAN package managers are available too.

The following Perl packages should be explicitly installed. Most of these will bring along a number of dependencies (which in turn bring along more dependencies). In the end you will have installed dozens of packages.

An example of installing these Perl modules with cpanm is below. This assumes that you have local::lib or a writable Perl installation in your $PATH. Adjust accordingly.

cpanm Module::Build https://github.com/tjparnell/bioperl-live/releases/download/minimal-v1.7.8/Minimal-BioPerl-1.7.8.tar.gz
cpanm --configure-args="--htslib $HOME" Bio::DB::HTS
curl -o bio-db-big-master.zip -L https://codeload.github.com/Ensembl/Bio-DB-Big/zip/master
cpanm --configure-args="--libbigwig $HOME" bio-db-big-master.zip
cpanm --notest Data::Swap
cpanm Parallel::ForkManager Set::IntervalTree Set::IntSpan::Fast Bio::ToolBox
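After installation, one way to confirm that the modules can be loaded by the Perl in your PATH is sketched below. The function name check_perl_modules is hypothetical; it simply asks perl to load each named module and reports the result.

```shell
# Try to load each named module; report status and fail if any is missing.
check_perl_modules() {
    missing=0
    for module in "$@"; do
        if perl -M"$module" -e 1 >/dev/null 2>&1; then
            echo "ok       $module"
        else
            echo "MISSING  $module"
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
}

# Example call with the modules installed above:
# check_perl_modules Bio::DB::HTS Bio::DB::Big Data::Swap \
#     Parallel::ForkManager Set::IntervalTree Set::IntSpan::Fast Bio::ToolBox
```

A module reported as MISSING may have failed to build, or may have been installed under a different Perl than the one currently in your PATH.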

External applications

Some programs, for example bam2wig.pl and data2wig.pl, require external utilities for converting text formats to binary formats, for example wig files to bigWig. External utilities are preferred because they’re more efficient and spread the load on modern multi-CPU environments. You may download these from the UCSC Genome Browser utilities section for either Linux or macOS. Copy them to a bin directory in your PATH, for example $HOME/bin, $HOME/perl5/bin, or /usr/local/bin. Be sure to make them executable by running chmod +x on each file.

An example for downloading on Linux:

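# Create the target directory first; curl -f fails on HTTP errors instead of saving an error page
mkdir -p "$HOME/bin"
for name in wigToBigWig bedGraphToBigWig bigWigToWig bedToBigBed; do
    curl -f -o "$HOME/bin/$name" "http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/$name" \
    && chmod +x "$HOME/bin/$name"
done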
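To confirm that the utilities are visible on your PATH afterwards, a small check such as the following can be used. The function name missing_tools is hypothetical; it prints the name of each utility that the shell cannot find, and prints nothing when all are present.

```shell
# Print the name of every given command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing_tools wigToBigWig bedGraphToBigWig bigWigToWig bedToBigBed
```

If a name is printed, check that the directory holding the download is listed in your PATH and that the file is executable.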

Legacy Perl modules

These are additional legacy Perl modules that are supported (for example, if you still have a GBrowse installation), but are either not required or have been superseded by other modules.

Some notes are below for anyone who may need to install these.

Database support

Bio::DB::SeqFeature::Store is a convenient SeqFeature annotation database backed by a SQL engine. It was part of the BioPerl distribution prior to release 1.7.3, but is now split into its own distribution. If you wish to use annotation databases, you will need a SQL driver, such as DBD::SQLite (recommended for individuals) or DBD::mysql (for fancy multi-user installations).

Sam library

The Bio::DB::Sam library only works with the legacy Samtools version, which included the C libraries, headers, and executables; use version 0.1.19 for best results. You will need to compile the Samtools code, but you do not have to install it (the library is not linked). Before compiling, edit the Makefile to include the cflags -fPIC and (most likely) -m64 for a 64-bit OS. Export the SAMTOOLS environment variable to the path of the Samtools build directory, and then you can proceed to build the Perl module; it should find the necessary files using the SAMTOOLS environment variable. You may obtain the latest source from here.
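The steps above can be sketched as follows. This is only an illustration under stated assumptions: the Samtools Makefile is assumed to define CFLAGS on a single line beginning with CFLAGS, the function names add_cflags and build_bio_db_sam are hypothetical, and cpanm is assumed to be installed. Drop -m64 from the call if it does not apply to your system.

```shell
# Append extra flags to the CFLAGS line of a Makefile (a .bak copy is kept).
# Assumes CFLAGS is defined on a single line starting with "CFLAGS".
add_cflags() {
    makefile="$1"
    shift
    sed -i.bak "s|^\(CFLAGS[[:space:]]*=.*\)\$|\1 $*|" "$makefile"
}

# Hypothetical helper: compile legacy Samtools in place, then build Bio::DB::Sam.
build_bio_db_sam() {
    samtools_dir="$1"
    if [ ! -f "$samtools_dir/Makefile" ]; then
        echo "no Samtools Makefile in: $samtools_dir" >&2
        return 1
    fi
    add_cflags "$samtools_dir/Makefile" -fPIC -m64 || return 1
    # Compile only; Samtools itself does not need to be installed.
    (cd "$samtools_dir" && make) || return 1
    # The Perl module build looks for the library and headers via SAMTOOLS.
    SAMTOOLS="$(cd "$samtools_dir" && pwd)"
    export SAMTOOLS
    cpanm Bio::DB::Sam
}

# Example call (the path is a placeholder for your unpacked source):
# build_bio_db_sam "$HOME/src/samtools-0.1.19"
```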

UCSC BigFile library

The Bio::DB::BigWig and Bio::DB::BigBed modules are part of the same distribution, Bio-BigFile. Only use the code from the GitHub repository, as it should be compatible with recent UCSC libraries, whereas the distribution on CPAN is out of date.

You will need the UCSC source code; the userApps source code is sufficient, rather than the entire browser code. Version 375, at the time of this writing, works. Compiling the required library needs at least the OpenSSL and libpng libraries; on MacOS, these need to be installed independently (see Homebrew for example). Other requirements, such as MySQL client libraries, are only needed if you also want to compile the actual command line utilities.

For purposes here, only the library needs to be compiled. It does not need to be installed, as nothing is linked. Therefore, you can safely ignore the main Makefile commands. Below are the steps for compiling just the requisite C library for installing the Perl module.

Edit the file kent/src/inc/common.mk, and insert -fPIC into the CFLAGS variable. If you have installed any libraries in non-standard locations, e.g. OpenSSL installed via Homebrew on MacOS, then add these paths to the HG_INC variable. Save the file.

To simplify compilation, you can skip the main Makefile and simply compile only the libraries that you need. First, export the MACHTYPE environment variable to an acceptable simple value, usually x86_64.

Next, move to the included kent/src/htslib directory, and compile this library by issuing the make command.

Move to the kent/src/lib directory, and compile the library by issuing the make command. If it compiles successfully, you should get a jkweb.a file in the lib/x86_64 directory.

Finally, you can return to the Perl module. First, set the KENT_SRC environment variable to the full path of the kent/src build directory (otherwise you will need to interactively provide the Perl module Build script this path). Then issue the standard Build.PL commands to build, test, and install the Perl modules.
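The compilation steps above can be sketched as a single helper. This assumes that kent/src/inc/common.mk has already been edited by hand as described, and that x86_64 is the right MACHTYPE for your machine. The function name build_jkweb and the example paths are hypothetical.

```shell
# Hypothetical helper: compile only the kent libraries needed by Bio-BigFile.
build_jkweb() {
    kent_src="$1"    # full path to the kent/src directory of the UCSC source
    [ -d "$kent_src/htslib" ] && [ -d "$kent_src/lib" ] || {
        echo "kent source not found under: $kent_src" >&2
        return 1
    }
    export MACHTYPE=x86_64
    # Compile the bundled htslib first, then the main library.
    (cd "$kent_src/htslib" && make) || return 1
    (cd "$kent_src/lib" && make) || return 1
    [ -f "$kent_src/lib/$MACHTYPE/jkweb.a" ] || {
        echo "jkweb.a was not built" >&2
        return 1
    }
    # The Perl module's Build script reads this instead of prompting for the path.
    export KENT_SRC="$kent_src"
}

# Example (placeholder paths), followed by the standard Build.PL sequence
# run from within the Bio-BigFile source directory:
# build_jkweb "$HOME/src/userApps/kent/src" \
#     && perl Build.PL && ./Build && ./Build test && ./Build install
```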