Welcome to ViTA User Manual

1. About

Vaccination is one of the most efficacious medical interventions. Viral vaccines save millions of people from the severity of diseases and death each year. Design of effective vaccine targets heavily depends on the sequence dynamics of viruses. Viral sequence diversity is a major challenge in effective vaccine design. The Vaccine Target Analyser (ViTA) is part of the ViVA ecosystem that aims to provide users with shortlisted vaccine target candidates for a given virus of interest. The candidates are highly conserved, functional, and immune-relevant sequences of the virus that potentially provide a broad coverage of the viral variants and are applicable to the human population at large.

Browser Compatability

browserc

Accessibility

ViTA is publicly available at: https://dim-devel.bioinfo.perdanauniversity.edu.my.

FrontEnd/BackEnd Frameworks

Python FastAPI is utilized for ViTA Backend. Typescript with ReactJS is utilized for ViTA Frontend.

2. Workflow

workflow

How does it work?

User uploads aligned FASTA file of the virus of interest to ViTA. ViTA then uses the DiMA tool (https://doi.org/10.48550/arxiv.2205.13915), embedded within the ViVA system, to extract historically conserved sequences (HCS). Briefly, DiMA is designed to facilitate the dissection of sequence diversity dynamics for viruses. As a base, DiMA provides a quantitative overview of sequence diversity by use of Shannon’s entropy, applied via a user-defined k-mer sliding window to an input alignment file. Distinctively, the key feature is that DiMA interrogates diversity dynamics by dissecting each k-mer position to various diversity motifs, defined based on the incidence of distinct sequences ([see next section])(Section_diversity). DiMA identifies HCSs across the input alignment length, selected based on a user-defined index incidence threshold (default: 100%). Selected index sequences that overlap or are immediately adjacent in their k-mer positions are concatenated as longer sequences. An HCS of 80% or higher incidence may be attractive for further consideration as an intervention target. The HCS are then further annotated by using of various bioinformatics tools: structure-function annotation, immune-relevance, virus coverage (intra-species analysis), and virus specificity (inter-species analysis).

Defining diversity motifs

For a given sequence alignment, all sequences at each of the aligned k-mer positions are quantified for distinct sequences and ranked-classified into diversity motifs based on their incidences, as described in Hu et al. (2013)(PMID: 23593157).

diversitymotifs

Figure 1. Definitions of diversity motifs. The ‘‘Index’’ nonamer is the most prevalent sequence, present in 8 of the 20 isolates. The ‘‘Major’’ variant is the most common variant of the index (5/20). ‘‘Minor’’ variants are multiple different repeated sequences, each with incidences less than the major variant. ‘‘Unique’’ variants are those represented by a single aligned sequence. Distinct variant sequences at a given nonamer position are the different sequence at the position; in this example one of major, two of minor, and three of unique.

3. Input file and parameters

Input File

ViTA only uses multiple sequence alignment (protein sequences) in (aligned) FASTA (.afa or .fas) format. Any existing, published alignment tool can be used to produce the MSA, such as MAFFT or MUSCLE, as long as the aligned sequences are provided in (aligned) FASTA format. It is the user’s responsibility to manually check and fix for any misalignment.

Parameters

Sample Name

Name of the sequence to be analysed.

Low support threshold

The support is defined as the number of sequences at a given k-mer position that do not harbor a gap and/or an unknown/ ambiguous amino acid residue. Positions below a statistical support of 30 sequences (default) are defined as of low support. The user has the flexibility to set the threshold for low support.

K-mer lenght

Select a k-mer window size that is appropriate for the analysis. While the minimum applicable size is 3, the maximum can equal to the alignment length of the uploaded input file. By default, a window size of nine (9; nonamer; 9-mer) is applied to evaluate the viral diversity with respect to cellular immune response.

4. How to interpret the results

How to interpret the results?

Download

5. FAQs and Support

FAQs

1.How to cite? For now: https://dim-devel.bioinfo.perdanauniversity.edu.my

Support

Please don’t hesitate to reach out to the developers for your questions, comments, or other feedback through mailing makhan@bezmialem.edu.tr.