anthill_sri011

description

Multivariate Analysis of Transcript Splicing (MATS). MATS is a computational tool that detects differential alternative splicing events from RNA-Seq data. Software page: http://rnaseq-mats.sourceforge.net/

software version

This page describes and uses rMATS release 4.0.2 (04/25/2018); see footnote 1.
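The rMATS user guide section 'Which version to use' ties the choice of build to the Unicode width of the local Python interpreter. A minimal sketch of that check follows; the helper name ucs_build is hypothetical, and the mapping assumes the usual sys.maxunicode values (1114111 for a UCS4 interpreter, 65535 for a UCS2 one).

```shell
# Map Python's sys.maxunicode to the matching rMATS-turbo build suffix.
# ucs_build is an illustrative helper, not part of rMATS.
ucs_build() {
    case "$1" in
        1114111) echo "UCS4" ;;     # wide-Unicode interpreter
        65535)   echo "UCS2" ;;     # narrow-Unicode interpreter
        *)       echo "unknown" ;;  # unexpected value, check by hand
    esac
}

# Usage on the cluster:
#   ucs_build "$(python -c 'import sys; print(sys.maxunicode)')"
```

If the helper prints UCS4, the rMATS-turbo-Linux-UCS4 directory used later on this page is the matching one.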

prepare software

Log in to anthill23 and download rMATS together with a test dataset and GTF files:

@anthill23:~$

mkdir -p ~/anthill23_rmats/{testData,gtf}
cd ~/anthill23_rmats
wget https://sourceforge.net/projects/rnaseq-mats/files/MATS/rMATS.4.0.2.tgz
tar xzf rMATS.4.0.2.tgz
cd ~/anthill23_rmats/testData
wget https://sourceforge.net/projects/rnaseq-mats/files/MATS/testData.tgz
tar xzf testData.tgz
cd ~/anthill23_rmats/gtf
wget ftp://ftp.ensembl.org/pub/grch37/current/gtf/homo_sapiens/Homo_sapiens.GRCh37.87.gtf.gz
gunzip Homo_sapiens.GRCh37.87.gtf.gz
wget ftp://ftp.ensembl.org/pub/release-91/gtf/homo_sapiens/Homo_sapiens.GRCh38.91.gtf.gz
gunzip Homo_sapiens.GRCh38.91.gtf.gz
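Before moving on, it can be worth confirming that the downloaded annotation files unpacked correctly. The sketch below is one possible sanity check, not part of rMATS: the helper name check_gtf is hypothetical, and it only tests that the file is non-empty and that the first non-comment line has the nine tab-separated columns of the GTF format.

```shell
# Return 0 if the file exists, is non-empty, and its first non-comment line
# has nine tab-separated columns. check_gtf is an illustrative helper.
check_gtf() {
    [ -s "$1" ] || return 1
    awk -F '\t' '!/^#/ { ok = (NF == 9); exit } END { exit !ok }' "$1"
}

# Usage:
#   check_gtf ~/anthill23_rmats/gtf/Homo_sapiens.GRCh37.87.gtf && echo "GRCh37 GTF looks fine"
#   check_gtf ~/anthill23_rmats/gtf/Homo_sapiens.GRCh38.91.gtf && echo "GRCh38 GTF looks fine"
```

This does not validate the whole file; it only catches truncated or empty downloads early.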

prepare gsl lib

rMATS uses the GSL library, which needs to be prepared separately (see footnote 2). The library has to be compiled on a computing node, so this is done in interactive mode (see SLURM, batch and interactive mode):

@anthill23:~$

srun -J gsl_compile -N1 -n4 --pty bash -l

cd ~/anthill23_rmats/
wget http://gnu.mirror.vexxhost.com/gsl/gsl-2.4.tar.gz
tar xzf gsl-2.4.tar.gz
cd gsl-2.4
./configure --prefix="/home/users/${USER}/anthill23_rmats/gsl-2.4"
make -j ${SLURM_NTASKS}
make install 

cd ~/anthill23_rmats/gsl-2.4/lib
ln -s libgsl.so libgsl.so.0

exit
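The symlink step above gives the loader a libgsl.so.0 name to look up. A small check that the link actually resolves can save a failed test run later; the following is a sketch only, and the helper name check_gsl_link is hypothetical.

```shell
# Return 0 if <libdir>/libgsl.so.0 exists and resolves to a real file.
# [ -e ] follows symlinks, so a dangling link fails the check.
check_gsl_link() {
    [ -e "$1/libgsl.so.0" ]
}

# Usage:
#   check_gsl_link "/home/users/${USER}/anthill23_rmats/gsl-2.4/lib" && echo "gsl link OK"
```

If the check fails, the most likely causes are that make install did not finish or that the link was created in a different directory.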

simple test

If the steps above (prepare software and prepare gsl lib) finished successfully, this simple test should succeed as well. It is also run in interactive mode (see SLURM, batch and interactive mode):

@anthill23:~$

srun -J rMATS_test --pty bash -l

cd /home/users/${USER}/anthill23_rmats
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/users/${USER}/anthill23_rmats/gsl-2.4/lib
python rMATS.4.0.2/rMATS-turbo-Linux-UCS4/rmats.py --b1 testData/b1.txt --b2 testData/b2.txt --gtf gtf/Homo_sapiens.GRCh37.87.gtf --od bam_test -t paired --readLength 50 --cstat 0.0001 --libType fr-unstranded

...
#read output and exit when finished
exit
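The --b1 and --b2 arguments point to small text files that, to the best of our reading of the rMATS user guide, each hold a comma-separated list of BAM file paths for one sample group. A sketch for writing such a file from your own BAM files follows; the helper name join_bams and the example paths are hypothetical.

```shell
# Join the given BAM paths with commas on a single line.
# join_bams is an illustrative helper, not part of rMATS.
join_bams() {
    local IFS=,
    echo "$*"
}

# Usage (placeholder paths):
#   join_bams /path/to/rep1.bam /path/to/rep2.bam > b1.txt
#   join_bams /path/to/rep3.bam /path/to/rep4.bam > b2.txt
```

Setting IFS locally makes "$*" expand with commas between the arguments, so no trailing comma is written.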

sbatch example: bam

The example below was taken from the rMATS User Guide. It assumes that the steps above (prepare software, prepare gsl lib, and simple test) were done beforehand; if not, remember to modify the script accordingly.

File rmats_bam.batch:

#!/bin/bash -l
#SBATCH --partition=test
#SBATCH --ntasks=4
#SBATCH --mem 4G

cd /home/users/${USER}/anthill23_rmats
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/users/${USER}/anthill23_rmats/gsl-2.4/lib

python rMATS-turbo-xxx-UCSx/rmats.py --b1 b1.txt --b2 b2.txt --gtf gtf/Homo_sapiens.GRCh38.91.gtf --od bam_test -t paired --readLength 50 --cstat 0.0001 --libType fr-unstranded
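The batch script requests four tasks, but the rmats.py command line does not say how many threads to use. rmats.py accepts an --nthread option, so one possible approach is to derive the thread count from the SLURM allocation. This is a sketch under the assumption that all requested tasks land on a single node; the helper name rmats_threads is hypothetical.

```shell
# Print the thread count to pass to rmats.py: SLURM_NTASKS inside a job,
# otherwise 1. rmats_threads is an illustrative helper.
rmats_threads() {
    echo "${SLURM_NTASKS:-1}"
}

# Usage inside rmats_bam.batch, appended to the rmats.py command line:
#   ... --libType fr-unstranded --nthread "$(rmats_threads)"
```

Outside a SLURM job the variable is unset and the helper falls back to a single thread.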

Run the computation with sbatch rmats_bam.batch. The results are written to the /home/users/${USER}/anthill23_rmats/bam_test folder.
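Once the job has finished, the output folder can be checked for the per-event result tables. The sketch below assumes the file naming used by rMATS 4.0.x as we understand it (one [event].MATS.JC.txt and one [event].MATS.JCEC.txt table for each of SE, MXE, A3SS, A5SS and RI); the helper name missing_rmats_outputs is hypothetical.

```shell
# Print the names of expected result tables that are missing or empty in
# directory $1; return 0 only when none are missing.
# The expected file names are an assumption based on rMATS 4.0.x output.
missing_rmats_outputs() {
    local dir="$1" missing=0 event kind
    for event in SE MXE A3SS A5SS RI; do
        for kind in JC JCEC; do
            if [ ! -s "$dir/$event.MATS.$kind.txt" ]; then
                echo "$event.MATS.$kind.txt"
                missing=1
            fi
        done
    done
    return "$missing"
}

# Usage:
#   missing_rmats_outputs "/home/users/${USER}/anthill23_rmats/bam_test" && echo "all tables present"
```

A non-empty list usually means the job was still running or stopped early; the SLURM output file of the job is the first place to look.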

performance tests

...
1)
rMATS is available in two 'versions'. The rMATS-turbo-xxx-UCS4 version is the one to use; for details see the section 'Which version to use' at http://rnaseq-mats.sourceforge.net/user_guide.htm
2)
This takes about five minutes.
anthill_sri011.1562413874.txt.gz · Last modified: 2023/08/01 06:38 (external edit)