Next-generation DNA sequencing (NGS) has greatly accelerated biological and biomedical research by allowing the comprehensive analysis of genomes, transcriptomes and interactomes. Managing the huge amount of data produced by new sequencing platforms requires non-trivial skills, strong computational power and storage capacity, which are generally not available in most research labs. Our consortium has been recognized as the big data and HPC analysis center for the Italian epigenomics flagship project Epigen and is a member of ELIXIR-IIB, the Italian node of the European Infrastructure for Bioinformatics.
The CINECA centralized bioinformatics core facility provides shared resources to meet these computational and IT requirements.
Italian and European researchers can obtain access to these resources through the ELIXIR-IIB HPC@CINECA service.
Whole Exome Sequencing (WES) analysis is now available for several research purposes. A frequently updated pipeline, WEP, calls variants, both SNPs and indels. Variants are then filtered against public databases including dbSNP, the 1000 Genomes Project, HapMap exomes and more. Variants are prioritized by comparing disease samples with healthy controls and by functional annotation (e.g. the functional relevance of a protein variant is assessed with the SIFT software). For family-based samples, advanced analyses such as haplotype phasing and the detection of compound heterozygous or homozygous mutations are also available.
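The filtering and case/control comparison described above can be illustrated with a minimal sketch. This is not the WEP implementation: the `Variant` record, the `prioritize` function, the `known_freq` lookup (standing in for frequencies taken from public databases) and the 1% frequency cutoff are all hypothetical names and values chosen for illustration.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    """Hypothetical minimal variant record (not a WEP data structure)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def prioritize(case_variants, control_variants, known_freq, max_freq=0.01):
    """Keep variants seen in cases, absent from healthy controls, and
    rare or absent in public databases.

    known_freq maps a Variant to its population allele frequency;
    variants missing from the mapping are treated as frequency 0.0.
    The max_freq cutoff is an assumed example value.
    """
    controls = set(control_variants)
    kept = []
    for variant in case_variants:
        if variant in controls:
            continue  # also found in healthy controls
        if known_freq.get(variant, 0.0) > max_freq:
            continue  # common in the general population
        kept.append(variant)
    return kept
```

In practice the frequency lookup would be built from annotated VCF files rather than an in-memory dictionary.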
A new sequencing platform, the Illumina MiSeq sequencer, makes it possible to identify known causative mutations by producing ultra-deep coverage of a selected list of targeted genomic regions (Ultra-Deep Targeted Sequencing, UDT-Seq). UDT-Seq is becoming particularly suitable for clinical diagnostic applications, since it implies full coverage of the sequenced regions and guarantees that no other mutation is missed by the analysis. ODESSA (Online Deep Exome Sequencing Software Analysis) is a new automated high-performance bioinformatics web platform, developed for analyzing targeted genes at high coverage through deep sequencing, designed for maximum usability and focused on rational diagnosis for targeted therapies. It identifies Single Nucleotide Variants (SNVs) and Deletion/Insertion Variants (DIVs), classified by several scores (e.g. depth of coverage).
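The requirement of full coverage over a targeted region can be checked with a few lines of code. The sketch below is only an illustration and does not reflect how ODESSA works internally; the function name, the per-position depth dictionary and the minimum depth of 100 reads are assumptions.

```python
def uncovered_positions(depth, start, end, min_depth=100):
    """Return the positions in [start, end) whose read depth is below
    min_depth. Positions missing from the depth mapping count as 0.

    depth maps a genomic position to its read depth; min_depth is an
    assumed example threshold, not a value taken from ODESSA.
    """
    return [p for p in range(start, end) if depth.get(p, 0) < min_depth]
```

An empty result means every base of the region reaches the requested depth, so a variant call (or the absence of one) can be reported for each position.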
RNA-Seq (transcriptome) analysis is now available for structural analysis and quantification of the transcriptome. It allows the identification of known or novel expressed transcript variants and their quantification.
RNA-Seq, unlike microarrays, does not require prior knowledge of the genome and therefore offers several advantages. Our facility, RAP, profiles the transcriptome of each sample and performs differential gene expression analysis as well as the detection of cassette exons, chimeric transcripts and polyA sites.
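As a rough illustration of what differential gene expression analysis compares, the sketch below normalizes raw read counts to counts per million (CPM) and computes a log2 fold change between two samples. It is a deliberately simplified stand-in, not the statistical method used by RAP; the function names and the pseudocount of 1.0 are assumptions.

```python
import math


def cpm(counts):
    """Scale raw read counts per gene to counts per million (CPM)."""
    total = sum(counts.values())
    return {gene: n * 1e6 / total for gene, n in counts.items()}


def log2_fold_change(counts_a, counts_b, pseudocount=1.0):
    """Log2 ratio of CPM in sample B over sample A for every gene.

    The pseudocount (an assumed value) avoids division by zero for
    genes that are not detected in one of the two samples.
    """
    a, b = cpm(counts_a), cpm(counts_b)
    genes = sorted(set(a) | set(b))
    return {
        g: math.log2((b.get(g, 0.0) + pseudocount) / (a.get(g, 0.0) + pseudocount))
        for g in genes
    }
```

A real analysis would use replicates and a statistical model of count variability rather than a single ratio per gene.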
RNA editing is a post-transcriptional mechanism challenging the central dogma of molecular biology. Nowadays, the term RNA editing is also used to indicate post-transcriptional changes due to specific base substitutions. Such alterations may affect coding as well as non-coding RNAs located in different cellular compartments and occur in a variety of organisms. ExpEdit is a web application for assessing RNA editing in humans at known or user-specified sites, supported by transcript data obtained from RNA-Seq experiments. Either mapping data or raw sequence reads can be provided as input to carry out a comparative analysis against a large collection of known editing sites from the DARNED database, as well as other user-provided, potentially edited positions. Results are shown as dynamic tables containing links to the University of California, Santa Cruz (UCSC) Genome Browser for a quick examination of the genomic context.
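To make the idea of assessing editing at a given site concrete, the sketch below estimates an editing level from the bases observed in RNA-Seq reads covering that site. It assumes A-to-G substitutions (the read-level signature of A-to-I editing) as the default case; the function name and the input format are hypothetical and are not taken from ExpEdit.

```python
def editing_level(base_counts, ref="A", edited="G"):
    """Fraction of reads supporting the edited base at one site.

    base_counts maps an observed base to the number of reads carrying
    it. Reads with other bases are ignored as likely sequencing errors.
    Returns None when no read supports either base.
    """
    ref_reads = base_counts.get(ref, 0)
    edited_reads = base_counts.get(edited, 0)
    total = ref_reads + edited_reads
    return edited_reads / total if total else None
```

For example, a site covered by 30 reads with A and 10 reads with G would have an estimated editing level of 0.25.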
ChIP-Seq is widely used to analyze DNA-protein interactions. It combines chromatin immunoprecipitation (ChIP) with massively parallel DNA sequencing to identify the binding sites of DNA-associated proteins, and can be used to precisely map genome-wide binding sites for any protein of interest. Our bioinformatics service, CAST, provides the genome-wide distribution of ChIP sequencing reads, peak identification and differential analysis across samples.
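The principle behind peak identification can be sketched with a naive window-based enrichment test: reads are binned into fixed-size windows and a window is reported when the ChIP sample is enriched over the control. This is not the algorithm used by CAST, and dedicated peak callers rely on proper statistical models; the function name, window size, fold threshold and pseudocount are all assumptions.

```python
from collections import Counter


def call_windows(chip_positions, control_positions, window=200,
                 min_fold=4.0, pseudocount=1.0):
    """Return (start, end, fold) for windows enriched in the ChIP sample.

    Positions are read start coordinates on a single chromosome.
    All thresholds are assumed example values.
    """
    chip = Counter(p // window for p in chip_positions)
    ctrl = Counter(p // window for p in control_positions)
    peaks = []
    for w in sorted(chip):
        # Counter returns 0 for windows without control reads
        fold = (chip[w] + pseudocount) / (ctrl[w] + pseudocount)
        if fold >= min_fold:
            peaks.append((w * window, (w + 1) * window, fold))
    return peaks
```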
Variants of clinical interest should be the central focus of direct and indirect collaborations. In the era of the social web, web applications make it easy to provide tools for worldwide collaboration, so we are about to launch a new project, DR.CREVV, which uses the NGS data in our cloud infrastructure to connect hospitals and laboratories around a common task. Users will be able to comment on variants and link them to other results and secondary databases. Featured information includes Sanger validations and clinical statistics such as how many variants are found and their frequency across all samples. Any additional information provided by a user is statistically aggregated as a metadata attribute. Privacy is ensured by anonymous data collection.
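The kind of aggregated statistic mentioned above, how often each variant is found across all samples, could be computed along the following lines. This is a hypothetical sketch and not part of DR.CREVV; the function name and the data layout (anonymous sample identifiers mapped to sets of variant keys) are assumptions.

```python
from collections import Counter


def variant_frequencies(sample_variants):
    """Count, for each variant, how many samples carry it and the
    corresponding fraction of all samples.

    sample_variants maps an anonymous sample identifier to the set of
    variant keys found in that sample. Only aggregated counts are
    returned, never the sample identifiers themselves.
    """
    n_samples = len(sample_variants)
    carriers = Counter(
        variant
        for variants in sample_variants.values()
        for variant in set(variants)
    )
    return {v: (count, count / n_samples) for v, count in carriers.items()}
```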