anthill_sri018

Differences

This shows you the differences between two versions of the page.

Old revision: anthill_sri018 [2023/11/04 08:33] – created tmatejuk
New revision: anthill_sri018 [2023/11/07 07:33] (current) tmatejuk
Line 1: Line 1:
 ==== description ====
-The preseq package is aimed at predicting the yield of distinct reads from genomic library from an initial sequencing experiment. Software page : [[https://github.com/smithlabcode/preseq]].
 +[[https://odelaneau.github.io/GLIMPSE/|GLIMPSE2]] is a set of tools for low-coverage whole genome sequencing imputation. GLIMPSE2 is based on the GLIMPSE model and designed for reference panels containing hundreds of thousands of reference samples, with special focus on rare variants.
 +  * **GLIMPSE2_chunk** splits the genome into chunks for imputation and phasing 
 +  * **GLIMPSE2_split_reference** creates the reference panel representation used by GLIMPSE2_phase. It allows major speedups for large reference panels. 
 +  * **GLIMPSE2_phase** imputes and phases low-coverage sequencing data
 +  * **GLIMPSE2_ligate** concatenates phased chunks of data into chromosome-wide phased files 
 +=== citation === 
 +If you use GLIMPSE in your research work, please cite the papers listed on the [[https://odelaneau.github.io/GLIMPSE/|GLIMPSE website]].
  
 ==== software version ====
-Bellow description covers preseq version 2.3..
 +The description below covers preparation and compilation of GLIMPSE2_phase v2.0.0 ((Software is also available as static binaries, see https://odelaneau.github.io/GLIMPSE/docs/installation/static_binaries))
  
-==== installation ==== 
-=== static binaries === 
-Software is available as static binaries : https://odelaneau.github.io/GLIMPSE/docs/installation/static_binaries . 
  
-=== compile GLIMPSE2 ===
 +==== prepare software ====
 +Log in to anthill23 and download the required sources ((this will take 8 minutes)):
 +  mkdir -p ~/anthill23_soft/glimpse 
 +   
 +  cd ~/anthill23_soft/glimpse 
 +  git clone https://github.com/samtools/htslib htslib_src 
 +  cd htslib_src 
 +  git submodule update --init --recursive 
 +   
 +  cd ~/anthill23_soft/glimpse 
 +  wget https://boostorg.jfrog.io/artifactory/main/release/1.73.0/source/boost_1_73_0.tar.bz2 
 +  tar --bzip2 -xf boost_1_73_0.tar.bz2 
 +   
 +  cd ~/anthill23_soft/glimpse 
 +  git clone https://github.com/odelaneau/glimpse.git
  
  
-<note warning>This is not finished. Work in progress (2023.11.04)</note>
 +Start an interactive session in which to compile htslib, boost and GLIMPSE2_phase ((this will take 20 minutes)):
  
-<note warning>This is not finished. Work in progress (2023.11.04)</note>
 +  @anthill23:~$ srun --cpus-per-task 8 --mem-per-cpu 20G  -J glimpse_prep --partition=medium --pty --preserve-env bash
 +  
  
-
-<note warning>This is not finished. Work in progress (2023.11.04)</note>
-
-==== prepare software ====
-Login to anthill23, start interactive session in which download and compile preseq ((this will take minutes)):
-
-  @anthill23:~$ srun -n 2 --mem=4G --pty bash -l
 +compile htslib ((this will take minutes)):
 +  cd ~/anthill23_soft/glimpse/htslib_src
 +  autoreconf -
 +  ./configure --enable-libcurl --prefix=/home/users/${USER}/anthill23_soft/glimpse/htslib
 +  make
 +  make install
      
-  mkdir ~/anthill23_soft/ 
-  cd ~/anthill23_soft/ 
-  wget -O preseq-3.1.2.tar.gz https://github.com/smithlabcode/preseq/releases/download/v3.1.2/preseq-3.1.2.tar.gz 
-  tar xvzf preseq-3.1.2.tar.gz 
-  cd preseq-3.1.2 
      
-  mkdir build
-  cd build
 +compile boost ((this will take 1 minute)):
 +  cd ~/anthill23_soft/glimpse/boost_1_73_0
 +  ./bootstrap.sh --with-libraries=iostreams,program_options,serialization --prefix=../boost 
 +  ./b2 install
      
-  mkdir -/home/users/${USER}/anthill23_preseq/preseq-3.1.2
-  ../configure --prefix="/home/users/${USER}/anthill23_preseq/preseq-3.1.2" --enable-hts
 +compile GLIMPSE2_phase ((this will take 3 minutes)):
 +  cd ~/anthill23_soft/glimpse/glimpse/phase
 +  sed -i 's/desktop: HTSSRC.*/desktop: HTSSRC=\/home\/users\/${USER}\/anthill23_soft\/glimpse\/htslib/' makefile 
 +  sed -i 's/desktop: HTSLIB_INC.*/desktop: HTSLIB_INC=\$(HTSSRC)\/include/' makefile 
 +  sed -i 's/desktop: HTSLIB_LIB.*/desktop: HTSLIB_LIB=\$(HTSSRC)\/lib\/libhts.a/' makefile 
 +  sed -i 's/desktop: BOOST_INC.*/desktop: BOOST_INC=\/home\/users\/${USER}\/anthill23_soft\/glimpse\/boost\/include/' makefile 
 +  sed -i 's/desktop: BOOST_LIB_IO.*/desktop: BOOST_LIB_IO=\/home\/users\/${USER}\/anthill23_soft\/glimpse\/boost\/lib\/libboost_iostreams.a/' makefile 
 +  sed -i 's/desktop: BOOST_LIB_PO.*/desktop: BOOST_LIB_PO=\/home\/users\/${USER}\/anthill23_soft\/glimpse\/boost\/lib\/libboost_program_options.a/' makefile 
 +  sed -i 's/desktop: BOOST_LIB_SE.*/desktop: BOOST_LIB_SE=\/home\/users\/${USER}\/anthill23_soft\/glimpse\/boost\/lib\/libboost_serialization.a/' makefile 
 +  make desktop
      
-  make -j${SLURM_NPROCS} && make install
 +compile GLIMPSE2_chunk / ligate / split_reference
 +  #repeat the sed and make steps described above for GLIMPSE2_phase, but in the folder of each tool
 +  cd ~/anthill23_soft/glimpse/glimpse/chunk           # for GLIMPSE2_chunk 
 +  cd ~/anthill23_soft/glimpse/glimpse/ligate          # for GLIMPSE2_ligate 
 +  cd ~/anthill23_soft/glimpse/glimpse/split_reference # for GLIMPSE2_split_reference 
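
The repetition for the three remaining tools could be scripted roughly as follows. This is a minimal sketch, not part of the original procedure: ''GLIMPSE_ROOT'' and ''patch_makefile'' are hypothetical names introduced here, and it assumes each tool folder contains a ''makefile'' with the same ''desktop:'' variable lines as the phase folder. Unlike the sed commands above, it writes the fully expanded path into the makefile instead of a literal ''${USER}''.

```shell
#!/bin/bash
# Sketch: apply the "desktop" target overrides to each remaining tool and build it.
# GLIMPSE_ROOT and patch_makefile are hypothetical names used only in this example.
GLIMPSE_ROOT="${GLIMPSE_ROOT:-$HOME/anthill23_soft/glimpse}"

patch_makefile() {
  # $1 = path to a makefile; rewrites the desktop-target library paths in place.
  # "|" is used as the sed delimiter so that slashes in paths need no escaping.
  local mk="$1"
  sed -i \
    -e "s|desktop: HTSSRC.*|desktop: HTSSRC=${GLIMPSE_ROOT}/htslib|" \
    -e 's|desktop: HTSLIB_INC.*|desktop: HTSLIB_INC=$(HTSSRC)/include|' \
    -e 's|desktop: HTSLIB_LIB.*|desktop: HTSLIB_LIB=$(HTSSRC)/lib/libhts.a|' \
    -e "s|desktop: BOOST_INC.*|desktop: BOOST_INC=${GLIMPSE_ROOT}/boost/include|" \
    -e "s|desktop: BOOST_LIB_IO.*|desktop: BOOST_LIB_IO=${GLIMPSE_ROOT}/boost/lib/libboost_iostreams.a|" \
    -e "s|desktop: BOOST_LIB_PO.*|desktop: BOOST_LIB_PO=${GLIMPSE_ROOT}/boost/lib/libboost_program_options.a|" \
    -e "s|desktop: BOOST_LIB_SE.*|desktop: BOOST_LIB_SE=${GLIMPSE_ROOT}/boost/lib/libboost_serialization.a|" \
    "$mk"
}

for tool in chunk ligate split_reference; do
  dir="${GLIMPSE_ROOT}/glimpse/${tool}"
  if [ -f "${dir}/makefile" ]; then
    patch_makefile "${dir}/makefile"
    # Build in a subshell so the working directory of the caller is unchanged.
    (cd "${dir}" && make desktop)
  else
    echo "skipping ${tool}: ${dir}/makefile not found" >&2
  fi
done
```

Folders that are missing are skipped with a message on stderr, so the loop can be rerun safely after a partial checkout.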
 +  
 +Test the created binary; you should get output similar to the following:
 +  cd ~/anthill23_soft/glimpse/glimpse/phase 
 +  ./bin/GLIMPSE2_phase 
      
-  #read output
 +  [GLIMPSE2] Phase and impute low coverage sequencing data
 +    * Authors              : Simone RUBINACCI & Olivier DELANEAU, University of Lausanne 
 +    * Contact              : simone.rubinacci@unil.ch & olivier.delaneau@unil.ch 
 +    * Version              : GLIMPSE2_phase v2.0.0 / commit = 0fcf590 / release = 2023-10-09 
 +    * Citation             : BiorXiv, (2022). DOI: https://doi.org/10.1101/2022.11.28.518213 
 +    *                      : Nature Genetics 53, 120–126 (2021). DOI: https://doi.org/10.1038/s41588-020-00756-0 
 +    * Run date             : 05/11/2023 - 08:17:29
      
 +  ERROR: You must specify input files using one of the following options: --bam, --bam-list or --input-gl
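
The manual check above can also be scripted. The following is a sketch under assumptions: ''check_glimpse_binary'' is a hypothetical helper name, and it only assumes that the banner contains a ''Version'' line like the one shown above. Because the tool prints an ERROR when run without input options, its exit status is ignored and only the banner text is inspected.

```shell
#!/bin/bash
# check_glimpse_binary is a hypothetical helper introduced for this example.
check_glimpse_binary() {
  # $1 = path to the binary, $2 = expected version tag (e.g. v2.0.0)
  local bin="$1" want="$2" banner
  [ -x "$bin" ] || { echo "not executable: $bin" >&2; return 1; }
  # Run without arguments; the tool may exit non-zero after printing its
  # banner, so the exit status is deliberately ignored here.
  banner="$("$bin" 2>&1 || true)"
  if printf '%s\n' "$banner" | grep -q "Version .*${want}"; then
    echo "OK: $bin reports ${want}"
  else
    echo "unexpected banner from $bin" >&2
    return 1
  fi
}

# Check the freshly built binary only if it exists at the path used above.
bin="${HOME}/anthill23_soft/glimpse/glimpse/phase/bin/GLIMPSE2_phase"
if [ -x "$bin" ]; then check_glimpse_binary "$bin" v2.0.0; fi
```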
 +
 +finish
   #exit interactive mode when finished with exit command
   exit
Line 45: Line 84:
  
 ==== sbatch example ====
-This description assumes that above step ( prepare software ) were done beforehand\\
-Bellow slurm job creates temporary folder in your home folder (like ''tmp.AtrQrrJi3W''), then downloads file ''wgEncodeLicrRnaSeqBmdmCellPapFAdult8wksC57bl6AlnRep1.bam'' and run preseq on it ((this will take 2 minutes)). \\
-Inspect job output ex.: ''slurm-1949.out'' and ''yield_estimates.txt'' file for preseq output messages and output.
-
-File ''preseq_test.sbatch''
-
-<code>
-#!/bin/bash -l
-#SBATCH --partition=test
-#SBATCH --ntasks=2
-#SBATCH --mem 4G
-#SBATCH --job-name preseq_test
-
-#prepare temporary folder
-cd `mktemp -d -p ~/`
-
-#download bam filefq file
-wget --no-check-certificate http://hgdownload-test.cse.ucsc.edu/goldenPath/mm9/encodeDCC/wgEncodeLicrRnaSeq/wgEncodeLicrRnaSeqBmdmCellPapFAdult8wksC57bl6AlnRep1.bam
-
-#export path to preseq-3.1.2 binary
-export PATH=$PATH:/home/users/${USER}/anthill23_preseq/preseq-3.1.2/bin
-
-#predict the yield of a future experiment if the input file is in .bam format
-preseq lc_extrap -B -o yield_estimates.txt wgEncodeLicrRnaSeqBmdmCellPapFAdult8wksC57bl6AlnRep1.bam
-
-</code>
 +...FIXME
  
 ==== performance tests ====
-None at this time.
 +... FIXME
  
  
  
  
anthill_sri018.1699083236.txt.gz · Last modified: 2023/11/04 08:33 by tmatejuk