Glossary of Terms

  • Amino Acid Composition (AAC): The proportion of each amino acid in a protein sequence, expressed as a percentage, used to calculate properties like hydrophobicity.
  • Aromaticity: The proportion of aromatic amino acids (phenylalanine, tyrosine, tryptophan) in a protein, affecting stability and interactions.
  • B-cell Epitope: A protein region recognized by B-cell receptors or antibodies, predicted using the Bepipred method.
  • Bepipred: A method for predicting linear B-cell epitopes based on sequence features like hydrophilicity.
  • Binder Class: Classification of MHC-binding peptides as Strong, Weak, or Intermediate based on binding affinity.
  • Charge: The net charge of a protein at pH 7, calculated from ionizable amino acids.
  • cs_probability: The probability of a signal peptide cleavage site, predicted by SignalP.
  • EL_Rank: Percentile rank of a peptide’s eluted-ligand (EL) prediction score for an MHC molecule; lower ranks indicate stronger predicted binding.
  • Encoded Dataset: A dataset with categorical features converted to numerical values for machine learning.
  • Epitope Probability: A Bepipred score indicating the likelihood of a protein region being a B-cell epitope.
  • Exposed/Buried: Classification of residues as exposed (E) or buried (B) in the protein structure, from Bepipred.
  • Full Dataset: A dataset combining MHC-I, MHC-II, B-cell, physicochemical, and signal peptide features.
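  • GRAVY: Grand Average of Hydropathicity, the mean hydropathy value over all residues in a protein; positive values indicate a hydrophobic protein, negative values a hydrophilic one.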
  • Hydrophobic Moment: A measure of amphiphilicity, indicating how asymmetrically hydrophobic residues are distributed around a helix or other periodic structure.
  • Hydrophobicity: A protein’s tendency to avoid water, based on amino acid hydrophobicity values.
  • Identity: A unique identifier for each protein sequence in the input FASTA file.
  • Instability Index: An estimate of protein stability computed from dipeptide composition; values above 40 suggest an unstable protein.
  • Isoelectric Point (pI): The pH at which a protein has no net charge.
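
The sequence-level terms above (Amino Acid Composition, Aromaticity, GRAVY) can be illustrated with a minimal sketch in plain Python. It is not the pipeline's own code: the function names are hypothetical, and the hydropathy values are assumed to be the Kyte-Doolittle scale, which is the scale conventionally used for GRAVY.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values (assumed scale for GRAVY).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Aromatic residues: phenylalanine, tyrosine, tryptophan.
AROMATIC = set("FYW")


def amino_acid_composition(seq):
    """Percentage of each amino acid that occurs in seq."""
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.items()}


def aromaticity(seq):
    """Fraction of residues in seq that are aromatic."""
    return sum(1 for aa in seq if aa in AROMATIC) / len(seq)


def gravy(seq):
    """Mean hydropathy over all residues in seq."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


if __name__ == "__main__":
    example = "MKFLWAY"  # hypothetical sequence, for illustration only
    print(amino_acid_composition(example))
    print(aromaticity(example))
    print(gravy(example))
```

The functions expect a non-empty sequence of the 20 standard one-letter codes; ambiguous codes such as X or B would need separate handling.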

  • MHC-I: Major Histocompatibility Complex Class I molecules that present peptides to cytotoxic T-cells, analyzed by netMHCpan.
  • MHC-II: Major Histocompatibility Complex Class II molecules that present peptides to helper T-cells, analyzed by netMHCIIpan.
  • Molecular Weight: The mass of a protein, calculated from its amino acids, in Daltons.
  • netMHCpan: A tool for predicting peptide binding to MHC class I molecules.
  • netMHCIIpan: A tool for predicting peptide binding to MHC class II molecules.
  • Peptide Count: The number of peptides analyzed for MHC binding, reflecting sequence coverage.
  • Physicochemical Properties: Protein characteristics like molecular weight and GRAVY, calculated using Biopython's ProteinAnalysis class.
  • Potential Vaccine Candidate: A protein predicted as a likely vaccine target by the machine learning model.
  • Protein Label: A binary indicator of strong MHC binders, used in machine learning.
  • Scaled Dataset: A dataset with standardized numerical features for machine learning.
  • Secondary Structure Fraction: The proportion of residues that tend to form alpha helices, beta sheets, or turns, estimated from the sequence.
  • Signal Peptide: A short N-terminal sequence that directs a protein to the secretory pathway, predicted by SignalP.
  • SignalP: A tool for predicting signal peptides and their cleavage sites.
  • Stacking Classifier: An ensemble machine learning model that combines the predictions of several base algorithms through a final meta-model, used here to predict vaccine candidates.
  • Strong Count: The number of peptides classified as strong MHC binders.
  • Strong Fraction: The proportion of a protein’s peptides that are strong MHC binders.
  • Structural Class: The predicted structural category of a protein (e.g., alpha, beta).
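
To show how the per-protein MHC summary terms (Peptide Count, Strong Count, Strong Fraction, Protein Label) relate to each other, here is a minimal sketch in plain Python. The function name and dictionary keys are hypothetical, and the rule that a protein is labeled 1 when it has at least one strong binder is an assumption; the glossary does not state the exact rule.

```python
def summarize_binders(binder_classes):
    """Aggregate the per-peptide Binder Class values of one protein.

    binder_classes: list of strings such as "Strong", "Weak",
    or "Intermediate", one entry per analyzed peptide.
    """
    peptide_count = len(binder_classes)
    strong_count = sum(1 for c in binder_classes if c == "Strong")
    # Guard against proteins for which no peptides were analyzed.
    strong_fraction = strong_count / peptide_count if peptide_count else 0.0
    # Assumption: the label is 1 if any peptide is a strong binder.
    protein_label = int(strong_count > 0)
    return {
        "peptide_count": peptide_count,
        "strong_count": strong_count,
        "strong_fraction": strong_fraction,
        "protein_label": protein_label,
    }


if __name__ == "__main__":
    # Hypothetical binder classes for a protein with four peptides.
    print(summarize_binders(["Strong", "Weak", "Strong", "Intermediate"]))
```

A different labeling rule, for example a minimum Strong Fraction, could be swapped in by changing the single line that sets `protein_label`.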