

anthill_sri011

This is an old revision of the document!


description

Multivariate Analysis of Transcript Splicing (MATS) is a computational tool that detects differential alternative splicing events from RNA-Seq data. Software page: http://rnaseq-mats.sourceforge.net/

software version

The rMATS 4.0.2 release 1) (04/25/2018) is described and used below.

prepare software

Log in to anthill23 and download rMATS along with a test dataset and GTF files:

@anthill23:~$

mkdir -p ~/anthill23_rmats/{testData,gtf}
cd ~/anthill23_rmats
wget https://sourceforge.net/projects/rnaseq-mats/files/MATS/rMATS.4.0.2.tgz
tar xzf rMATS.4.0.2.tgz
cd ~/anthill23_rmats/testData
wget https://sourceforge.net/projects/rnaseq-mats/files/MATS/testData.tgz
tar xzf testData.tgz
cd ~/anthill23_rmats/gtf
wget ftp://ftp.ensembl.org/pub/grch37/current/gtf/homo_sapiens/Homo_sapiens.GRCh37.87.gtf.gz
gunzip Homo_sapiens.GRCh37.87.gtf.gz
wget ftp://ftp.ensembl.org/pub/release-91/gtf/homo_sapiens/Homo_sapiens.GRCh38.91.gtf.gz
gunzip Homo_sapiens.GRCh38.91.gtf.gz
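
Before moving on, it may help to confirm that the downloads unpacked where the later steps expect them. The sketch below is only an illustration: the function name missing_rmats_files is hypothetical, and the relative paths are assumptions taken from the commands used further down this page, so adjust them if your archives unpack differently.

```shell
# Sketch: report which of the expected files are missing under a base directory.
# Assumption: the relative paths match the commands used on this page.
missing_rmats_files() {
    local base="$1"
    local rel
    local missing=0
    for rel in \
        "rMATS.4.0.2/rMATS-turbo-Linux-UCS4/rmats.py" \
        "testData/b1.txt" \
        "testData/b2.txt" \
        "gtf/Homo_sapiens.GRCh37.87.gtf" \
        "gtf/Homo_sapiens.GRCh38.91.gtf"
    do
        # Print every missing file instead of stopping at the first one.
        if [ ! -f "${base}/${rel}" ]; then
            echo "missing: ${rel}"
            missing=1
        fi
    done
    # Exit status 0 means everything was found, 1 means something is missing.
    return "${missing}"
}
```

For example, missing_rmats_files ~/anthill23_rmats && echo "layout looks complete" prints the confirmation only when all five files are present.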

prepare gsl lib

rMATS uses the GSL library, which has to be prepared separately 2). The library has to be compiled on a computing node, so this is done in interactive mode (see SLURM, batch and interactive mode):

@anthill23:~$

srun -J gsl_compile -N1 -n4 --pty bash -l

cd ~/anthill23_rmats/
wget http://gnu.mirror.vexxhost.com/gsl/gsl-2.4.tar.gz
tar xzf gsl-2.4.tar.gz
cd gsl-2.4
./configure --prefix="/home/users/${USER}/anthill23_rmats/gsl-2.4"
make -j ${SLURM_NTASKS}
make install 

cd ~/anthill23_rmats/gsl-2.4/lib
ln -s libgsl.so libgsl.so.0

exit
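
A quick check that the build left the library files in place can save a failed rMATS run later. This is a minimal sketch, not part of GSL or rMATS: check_gsl_lib is a hypothetical helper, and it only tests that libgsl.so and the libgsl.so.0 link created above both resolve to existing files.

```shell
# Sketch: confirm that the GSL install produced the files the symlink step relies on.
# check_gsl_lib is a hypothetical helper name.
check_gsl_lib() {
    local libdir="$1"
    # libgsl.so must exist, because libgsl.so.0 is created as a link to it.
    if [ ! -e "${libdir}/libgsl.so" ]; then
        echo "libgsl.so not found in ${libdir}"
        return 1
    fi
    # -e follows symlinks, so a dangling libgsl.so.0 link fails here.
    if [ ! -e "${libdir}/libgsl.so.0" ]; then
        echo "libgsl.so.0 missing or dangling in ${libdir}"
        return 1
    fi
    echo "gsl libraries found in ${libdir}"
}
```

It could be called as check_gsl_lib ~/anthill23_rmats/gsl-2.4/lib, either inside the interactive session or after leaving it.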

simple test

If the steps above (prepare software and prepare gsl lib) finished successfully, this simple test should succeed as well. It is also run in interactive mode (see SLURM, batch and interactive mode):

@anthill23:~$

srun -J rMATS_test --pty bash -l

cd /home/users/${USER}/anthill23_rmats
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/users/${USER}/anthill23_rmats/gsl-2.4/lib
python rMATS.4.0.2/rMATS-turbo-Linux-UCS4/rmats.py --b1 testData/b1.txt --b2 testData/b2.txt --gtf gtf/Homo_sapiens.GRCh37.87.gtf --od bam_test -t paired --readLength 50 --cstat 0.0001 --libType fr-unstranded

...
#read output and exit when finished
exit
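
The export line used in the test appends to LD_LIBRARY_PATH directly, which leaves a leading ':' when the variable starts out empty. The dynamic linker generally treats such an empty entry as the current directory, which is rarely intended. A minimal sketch of a safer append follows; append_lib_path is a hypothetical helper, not something provided by rMATS or SLURM.

```shell
# Sketch: append a directory to LD_LIBRARY_PATH without leaving an empty entry.
# append_lib_path is a hypothetical helper name.
append_lib_path() {
    local dir="$1"
    # ${VAR:+text} expands to text only when VAR is set and non-empty,
    # so no leading ':' is produced when LD_LIBRARY_PATH starts out empty.
    export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}${dir}"
}
```

With this helper, the export line could be written as append_lib_path "/home/users/${USER}/anthill23_rmats/gsl-2.4/lib".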

sbatch example: bam

The example below was taken from the rMATS User Guide. This description assumes that the steps above (prepare software, prepare gsl lib and simple test) were done beforehand; if not, remember to adjust the paths in the script accordingly.

File rmats_bam.batch:

#!/bin/bash -l
#SBATCH --partition=test
#SBATCH --ntasks=4
#SBATCH --mem 4G

cd /home/users/${USER}/anthill23_rmats
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/users/${USER}/anthill23_rmats/gsl-2.4/lib

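# Replace rMATS-turbo-xxx-UCSx and the b1.txt/b2.txt paths with your own
# (the simple test above used rMATS.4.0.2/rMATS-turbo-Linux-UCS4 and testData/).
python rMATS-turbo-xxx-UCSx/rmats.py --b1 b1.txt --b2 b2.txt --gtf gtf/Homo_sapiens.GRCh38.91.gtf --od bam_test -t paired --readLength 50 --cstat 0.0001 --libType fr-unstranded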

Run the computation with sbatch rmats_bam.batch. You will find the results in the /home/users/${USER}/anthill23_rmats/bam_test folder.
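
Once the job has finished, a quick look at the size of the result tables shows whether rMATS produced any events. The sketch below rests on assumptions: that the output folder contains tables named like SE.MATS.JC.txt and that each starts with one header line. The function name count_rmats_events is hypothetical; adjust the pattern if your output differs.

```shell
# Sketch: print the number of data rows in each rMATS result table.
# Assumption: tables are named like SE.MATS.JC.txt and start with one header line.
count_rmats_events() {
    local outdir="$1"
    local f
    local rows
    for f in "${outdir}"/*.MATS.JC.txt; do
        # Skip the unexpanded pattern when no file matches.
        [ -f "${f}" ] || continue
        # Count all lines, then subtract the header line.
        rows=$(( $(wc -l < "${f}") - 1 ))
        echo "$(basename "${f}"): ${rows}"
    done
}
```

For the batch example above, this could be called as count_rmats_events "/home/users/${USER}/anthill23_rmats/bam_test".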

performance tests

...
1)
rMATS is available in two 'versions'. The rMATS-turbo-xxx-UCS4 version is the one to use; for details, see the 'Which version to use' section of http://rnaseq-mats.sourceforge.net/user_guide.htm
2)
takes about five minutes
anthill_sri011.1562413851.txt.gz · Last modified: 2023/08/01 06:38 (external edit)