The purpose of this exercise is to demonstrate the value of
MHC-peptide binding predictions in exploring HLA-disease associations.
Background: Read Kiepiela et al and Kosmrlj et al (especially the first
about the following issues as a preparation for the exercise:
Main research question: What is the mechanism behind the
fact that certain MHC molecules are associated with slow progression to AIDS?
Try to answer this question (the subquestions below will help
you to do that) as much as possible using your own analysis.
If you are stuck, refer to the hints at the end of the page.
- Which CTL responses seem to be associated with slow
disease progression in HIV-1 infection?
- Which HLA alleles are associated with slow and fast
progression to AIDS?
Figure 2 of
et al again. What does this
result tell you about the relationship between CD8 responses to different HIV-1
proteins and viral loads?
perform an analysis that can show that B*57 and B*18
target different HIV-1 proteins, i.e., they have different numbers (or
from different HIV-1 proteins using the peptide
binding predictions (NetMHC-3.2 server, hints 1 & 2).
Why do you think it might be beneficial to
present HIV-1 Gag?
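
If you prefer a script to a spreadsheet for the comparison above, the counting step can be sketched as follows. This is only a minimal sketch: it assumes you have already reshaped the prediction output into a tab-separated table with the hypothetical columns `protein`, `peptide` and `affinity_nM` (the real NetMHC-3.2 output sheet is laid out differently), and it uses 500 nM as the binder cut-off, which is a common convention but still an assumption here. The peptides in the example are toy data, not real predictions.

```python
import csv
import io
from collections import Counter


def count_binders(table_text, threshold_nm=500.0):
    """Count predicted binders per protein.

    Assumes a tab-separated table with the (hypothetical) columns
    protein, peptide and affinity_nM. A peptide counts as a binder
    when its predicted affinity is at or below threshold_nm.
    Returns {protein: (n_binders, n_peptides, fraction)}.
    """
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    totals = Counter()
    binders = Counter()
    for row in reader:
        protein = row["protein"]
        totals[protein] += 1
        if float(row["affinity_nM"]) <= threshold_nm:
            binders[protein] += 1
    return {p: (binders[p], totals[p], binders[p] / totals[p]) for p in totals}


# Toy table with made-up peptides and affinities, for illustration only.
toy_table = "\n".join([
    "protein\tpeptide\taffinity_nM",
    "Gag\tAAAAAAAAA\t35.0",
    "Gag\tCCCCCCCCC\t12000.0",
    "Gag\tDDDDDDDDD\t480.0",
    "Nef\tEEEEEEEEE\t25000.0",
    "Nef\tGGGGGGGGG\t90.0",
])

summary = count_binders(toy_table)
for protein, (n_bind, n_total, frac) in sorted(summary.items()):
    print(f"{protein}: {n_bind}/{n_total} predicted binders ({frac:.2f})")
```

Running the same function once per allele (e.g. on a B*57 table and a B*18 table) gives both the number and the fraction of binders per protein, so long proteins do not dominate the comparison simply because they yield more peptides.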
For HLA-B*57, Kosmrlj et al
propose a different mechanism to explain its protective
effect. Try to
estimate the binding fraction of B*57 for viruses other than HIV-1 (proteome sequences
of viruses are available from the EBI web page). How
do the numbers you get compare with the number estimated for self by Kosmrlj
et al (Supplementary Material)? Is self the only proteome that would be
presented poorly by the B*5701 molecule?
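
The binding fraction asked for above can be computed along the following lines. This is a sketch under stated assumptions, not the method of Kosmrlj et al: `is_binder` is a hypothetical stand-in for whatever prediction you use (in practice a lookup of NetMHC predictions against a threshold), the fraction is taken over unique 9mers, and the sequences and the toy rule below are made up for illustration.

```python
def read_fasta(text):
    """Parse FASTA text into {name: sequence}.

    The name is the first word of each identifier line.
    """
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            words = line[1:].split()
            name = words[0] if words else f"seq{len(records) + 1}"
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def kmers(sequence, k=9):
    """Return all overlapping peptides of length k in a sequence."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def binding_fraction(proteome, is_binder, k=9):
    """Fraction of unique k-mers in a proteome that is_binder accepts."""
    peptides = {p for seq in proteome.values() for p in kmers(seq, k)}
    if not peptides:
        return 0.0
    return sum(1 for p in peptides if is_binder(p)) / len(peptides)


# Toy proteome and toy rule (NOT a real binding predictor).
toy_fasta = ">protA toy example\nACDEFGHIKLMW\n>protB\nWWWWWWWWWW\n"
toy_proteome = read_fasta(toy_fasta)
toy_fraction = binding_fraction(toy_proteome, lambda p: p.endswith("W"))
print(f"binding fraction: {toy_fraction:.2f}")
```

Counting unique peptides is a design choice; if you would rather count every occurrence, replace the set with a list.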
Study the binding motif of B*5701 (a database of known MHC ligands is IEDB, hint 3). Can the amino acid preference at
the anchor positions explain the observed phenomenon? (hint 4)
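
As a complement to drawing a logo, the anchor positions can also be spotted numerically. The sketch below computes, for each position in a list of equal-length ligands, the information content log2(20) minus the Shannon entropy, which is the quantity a sequence logo displays as stack height (here without any small-sample correction). The peptides are toy data chosen so that positions 2 and 9 are fully conserved; they are not B*5701 ligands.

```python
import math
from collections import Counter


def position_information(peptides):
    """Per-position information content (bits) for equal-length peptides.

    Returns a list of (position, information_bits, most_common_residue),
    with positions numbered from 1.
    """
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("all peptides must have the same length")
    result = []
    for i in range(length):
        counts = Counter(p[i] for p in peptides)
        n = sum(counts.values())
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        top_residue = counts.most_common(1)[0][0]
        result.append((i + 1, math.log2(20) - entropy, top_residue))
    return result


# Toy ligands, for illustration only.
toy_ligands = ["AKAAAAAAL", "CKCCCCCCL", "DKDDDDDDL", "EKEEEEEEL"]
for position, bits, residue in position_information(toy_ligands):
    print(f"P{position}: {bits:.2f} bits, most common residue {residue}")
```

Positions with the highest information content are the candidates for anchor positions; with few ligands the values are inflated, so treat them as a rough guide only.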
Which one of the two explanations for why B*5701 is protective in HIV-1 infection
do you think is more likely?
- To find which alleles target which HIV-1
proteins, it is useful to use a data set that contains all HIV-1
proteins. We have prepared such a file for you here.
Alternatively, you can generate your own HIV-1 protein data set in many ways, e.g. you can search for the
HIV-1 genome at NCBI and,
via the protein coding regions, download all protein sequences (use the
"FASTA" option under DISPLAY at the bottom of the page). Make sure you
don't use concatenated proteins (e.g. Gag-Pol) for the analysis.
- To analyze the output of NetMHC-3.2,
use the "download output sheet" option, save the output (using the right
mouse button and the "save as" option), and view it in Excel. You can sort
the affinities to find which peptides are the best binders. To sort,
select all columns that contain data and then, in the pop-up menu,
choose the column that holds the predicted affinity values as the sort key. Remember:
the smaller the predicted binding value, the better the binding. Your FASTA
files should contain the protein names as the first word of the
identifier line. As NetMHC-3.2 does not offer predictions for
B*5703 (the allele mentioned in Kiepiela et al), please use B*5701 in
your analysis.
- You can find binding peptides and eluted ligands for a particular
MHC molecule in the IEDB.
The "Search" menu offers shortcuts to "MHC binding search" and "MHC ligand
elution search". Download your results using the "Epitopes" link
instead of the "Assays" link to get unique ligands. Search for HLA-B5701
explicitly, and download only 9mer ligands, i.e., the peptides
of length 9. When you export the data to Excel, it sometimes goes wrong
and all your data ends up in one column. In this case use the "Text to Columns"
option, select the column you want to split, use the comma as the delimiter, and
the data will be formatted properly. Alternatively, in the SYFPEITHI
database use the Find motif, Ligand or epitope
option to find the ligands of a specific MHC molecule.
- Make logos of the ligands/binders you find using WebLogo.
WebLogo takes a list of peptides as input. You can
change the logo dimensions to a suitable size (e.g. 8x15 cm). Study
which positions are anchor positions and what amino acids are
found at the anchor positions. Alternatively, you
can have a look at predicted binding motifs at the MHC
- You can find information on amino acid
frequencies in the Swissprot
statistics (scroll down to the bottom of the page).
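
Once you know which amino acids are preferred at the anchor positions, the background frequencies let you estimate how often a random peptide would satisfy those preferences. The sketch below multiplies, over the anchor positions, the summed background frequency of the allowed residues. It assumes that positions are independent, and both the uniform background and the anchor sets are placeholders: replace them with the Swissprot frequencies and with the preferences you read off your own logo.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def anchor_match_probability(background, anchors):
    """Probability that a random peptide fits all anchor preferences.

    background: {amino_acid: frequency}, frequencies summing to about 1.
    anchors: {position: string of allowed amino acids}.
    Assumes residues at different positions are independent.
    """
    probability = 1.0
    for allowed in anchors.values():
        probability *= sum(background[aa] for aa in set(allowed))
    return probability


# Placeholder inputs: a uniform background and hypothetical anchor sets.
uniform_background = {aa: 1 / 20 for aa in AMINO_ACIDS}
toy_anchors = {2: "KR", 9: "L"}

p_match = anchor_match_probability(uniform_background, toy_anchors)
print(f"expected fraction of matching peptides: {p_match:.4f}")
```

An allele whose anchors prefer rare amino acids will give a small value here, which is one way to relate the binding motif to the binding fractions you estimated earlier.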