Jellyfish

Jellyfish is a tool for fast, memory-efficient counting of k-mers in DNA. A k-mer is a substring of length k, and counting the occurrences of all such substrings is a central step in many analyses of DNA sequence. Jellyfish can count k-mers using an order of magnitude less memory, and running an order of magnitude faster, than other k-mer counting packages. It achieves this by using an efficient encoding of a hash table and by exploiting the “compare-and-swap” CPU instruction to increase parallelism.

Jellyfish is a command-line program that reads FASTA and multi-FASTA files containing DNA sequences. It outputs its k-mer counts in a binary format, which can be translated into a human-readable text format using the “jellyfish dump” command, or queried for specific k-mers with “jellyfish query”. See the documentation for details.

If you use Jellyfish in your research, please cite: Guillaume Marcais and Carl Kingsford, A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics (2011) 27(6): 764-770 (first published online January 7, 2011) doi:10.1093/bioinformatics/btr011
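To make the definition of a k-mer concrete, the following is a minimal Python sketch of naive k-mer counting. It is only an illustration of what is being counted, not how Jellyfish works internally (Jellyfish uses a lock-free hash table, as described above). The function name count_kmers and the choice to skip windows containing characters other than A, C, G and T are assumptions made for this example.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count every length-k substring (k-mer) of a DNA sequence.

    Hypothetical helper for illustration only; windows containing
    anything other than A, C, G or T are skipped (an assumption).
    """
    sequence = sequence.upper()
    valid_bases = set("ACGT")
    counts = Counter()
    # Slide a window of width k across the sequence, one base at a time.
    for start in range(len(sequence) - k + 1):
        kmer = sequence[start:start + k]
        if set(kmer) <= valid_bases:
            counts[kmer] += 1
    return counts


if __name__ == "__main__":
    # The 3-mers of ACGTACGT are ACG, CGT, GTA, TAC, ACG, CGT.
    for kmer, n in sorted(count_kmers("ACGTACGT", 3).items()):
        print(kmer, n)
```

For real data sets this approach becomes slow and memory-hungry very quickly, which is the problem a dedicated counter such as Jellyfish addresses.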

Version: 2.3.0

Availability: GALILEO100

Target: all

Official web site: https://github.com/gmarcais/Jellyfish

Related Commands:

To load the application you should run:

module load profile/bioinf
module load autoload jellyfish

 jellyfish <cmd> [options] arg...

 

Example

 

$ module load profile/bioinf

$ module load autoload jellyfish

$ jellyfish count -m 21 -s 100M -t 10 -C -o mer_counts.jf reads.fasta

$ jellyfish dump mer_counts.jf > mer_counts_dumps.fa

Help and Documentation:

You can find documentation on the system, with the command

module help jellyfish

To see a list of all available subcommands, run

 jellyfish --help

Or visit the documentation in the official repository: https://github.com/gmarcais/Jellyfish/tree/master/doc

CINECA consultants can be reached through the address: superc@cineca.it