HIV-1 Gag responses
Learning objectives:
  1. Get familiar with MHC-peptide binding prediction tools.
  2. Demonstrate how certain assumptions can drastically affect the results of a computational analysis.
  3. Get a better understanding of MHC-disease associations via an HIV-1 example.
Background: Read Kiepiela et al. beforehand. Think about the following issues in preparation for the exercise:
  1. Which CTL responses seem to be associated with slow disease progression in HIV-1 infection?
  2. Which HLA alleles are associated with slow and fast progression to AIDS?
Main research question: Why are certain MHC molecules associated with slow progression to AIDS? Try to answer this question as far as possible using your own analysis; the subquestions below will help you do that. If you are stuck, refer to the hints at the end of the page.

HLA-disease associations
1. Study Figure 2 of Kiepiela et al again. What does this result tell you about the relationship between CD8 responses to different HIV-1 proteins and viral loads?
2. Design and perform an analysis that can show that B*57 and B*18 target different HIV-1 proteins, i.e. do they have different numbers (or densities) of epitopes from different HIV-1 proteins? Use the peptide binding predictions of the NetMHC-4.0 server (see hints 1&2). Focus on peptides of length 9 for this exercise and assume that the top 1% of the peptides (sorted by binding affinity) is the set of predicted epitopes for each HLA molecule.
3. Compare your results to the expected number of epitopes from each protein. In the NetMHC output you can easily see how many peptides of length 9 there are per HIV-1 protein. Remember: we are using the top 1% threshold to define potential epitopes.
4. Study the binding motifs of B*5701 and B*1801 using the Sequence motifs link on the NetMHC page. Can you now understand why these two HLA molecules present different HIV-1 peptides?
Why do you think it might be beneficial to present HIV-1 Gag? To answer this question, search the internet/PubMed for more information on the mutability of HIV-1 proteins.
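
The comparison in steps 2 and 3 can also be done in a script instead of Excel. The following is only a minimal sketch: it assumes you have already extracted (protein, peptide, affinity) triples for one HLA molecule from the NetMHC output, and the function name epitope_counts and the toy data are made up for illustration. The "expected" number is computed under the assumption that predicted epitopes are spread over the proteins in proportion to their number of 9-mers.

```python
from collections import Counter


def epitope_counts(rows, fraction=0.01):
    """Return {protein: (observed, expected)} for the top `fraction` of binders.

    rows: list of (protein, peptide, affinity) tuples for ONE HLA molecule,
    where a smaller affinity value means stronger predicted binding.
    """
    ranked = sorted(rows, key=lambda row: row[2])       # best binders first
    n_top = max(1, round(len(ranked) * fraction))       # size of the top 1%
    observed = Counter(protein for protein, _, _ in ranked[:n_top])
    totals = Counter(protein for protein, _, _ in rows)  # 9-mers per protein
    # Expected count: top peptides distributed in proportion to protein size.
    return {
        protein: (observed[protein], n_top * total / len(rows))
        for protein, total in totals.items()
    }


# Toy data (NOT real predictions): 100 "Gag" and 300 "Pol" peptides,
# with all of the strongest binders placed in Gag on purpose.
toy_rows = [("Gag", f"gag_pep_{i}", 1.0 + 10 * i) for i in range(100)]
toy_rows += [("Pol", f"pol_pep_{i}", 1000.0 + 10 * i) for i in range(300)]

for protein, (observed, expected) in epitope_counts(toy_rows).items():
    print(f"{protein}: observed {observed}, expected {expected:.1f}")
```

Run this once per HLA molecule and compare the observed and expected columns; a protein with many more observed than expected epitopes is preferentially targeted by that molecule. Try changing the fraction argument to see how strongly the 1% assumption influences the outcome.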


Figure: Scanning electron micrograph of HIV-1 budding from a cultured lymphocyte (source: Wikipedia)

Hints
  1. To find which alleles target which HIV-1 proteins, it is useful to have a data set that contains all HIV-1 proteins. We have prepared such a file for you here. Alternatively, you can generate your own HIV-1 protein data set in many ways. For example, search for the HIV-1 genome at NCBI and download all protein sequences via the protein coding regions: use the "Protein" link on the right-hand side of the page under the "Related Information" header, then, under the Summary option at the top of the page, choose the "FASTA (text)" format to download all the sequences at once. Make sure you do not use concatenated proteins (e.g. Gag-Pol) for the analysis. If you have such sequences, check the NCBI entry to find out which positions contain Gag and which contain Pol.
  2. To analyze the output of NetMHC-4.0, use the "Save output in XLS format" option. The link is at the end of the output page; right-click it and choose "save as" to save the output, then view it in Excel. Sometimes the import into Excel goes wrong and all your data ends up in one column. In that case, select the column you want to split and use the "text to column" option with the right delimiters, and the data will be formatted properly. You can sort the affinities to find which peptides are the best binders: select all columns that contain data and then, in the pop-up menu, choose the column with the predicted affinity values to sort by. Remember: the smaller the predicted binding value, the better the binding. Your FASTA files should contain the protein names as the first word of the identifier line. As NetMHC-4.0 does not offer predictions for B*5703 (the allele mentioned in Kiepiela et al.), please use B*5701 in this exercise.
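
If you prefer scripting to Excel, the two hints above can be sketched in Python as follows. This is an illustration under stated assumptions, not a description of the actual NetMHC file layout: the column names ("ID", "Peptide", "nM") are hypothetical and you should check the header of your own saved file, the sequences are arbitrary strings rather than real HIV-1 proteins, and all function names are made up for this example.

```python
import csv
import io


def read_fasta(text):
    """Return {name: sequence}; the name is the first word after '>'."""
    proteins = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            proteins[name] = ""
        elif name is not None:
            proteins[name] += line
    return proteins


def count_kmers(sequence, k=9):
    """Number of overlapping peptides of length k in a sequence."""
    return max(0, len(sequence) - k + 1)


def slice_protein(sequence, start, end):
    """Cut out positions start..end (1-based, inclusive), e.g. to separate
    Gag from a concatenated Gag-Pol entry using coordinates from NCBI."""
    return sequence[start - 1:end]


def sort_by_affinity(tsv_text, column="nM"):
    """Read tab-separated text and sort rows so the best binders come first."""
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    return sorted(rows, key=lambda row: float(row[column]))


# Toy inputs (arbitrary strings, hypothetical column names).
FASTA_TEXT = ">ProtA toy protein\nACDEFGHIKLMNPQRSTVWY\n>ProtB toy protein\nACDEFGHIKL\nMNPQR\n"
TSV_TEXT = (
    "ID\tPeptide\tnM\n"
    "ProtA\tACDEFGHIK\t520.0\n"
    "ProtB\tCDEFGHIKL\t35.5\n"
    "ProtA\tDEFGHIKLM\t4100.2\n"
)

proteins = read_fasta(FASTA_TEXT)
for name, sequence in proteins.items():
    print(name, "has", count_kmers(sequence), "peptides of length 9")

best = sort_by_affinity(TSV_TEXT)[0]
print("best binder:", best["Peptide"], "from", best["ID"])
```

Reading the file as tab-separated text avoids the problem of all data ending up in one column, and the per-protein 9-mer counts are what you need to compute the expected number of epitopes.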

Writing up this computer exercise as your research project
If you decide to write your research project report on this computer exercise, we require that you perform the analysis suggested above and, in addition, an extra analysis. For HLA-B*57, Kosmrlj et al. propose a different mechanism to explain its protective effect. Study this paper and, if possible, perform your own extra analysis to find evidence for this alternative mechanism. Which of the two explanations for why B*5701 is protective in HIV-1 infection do you think is more likely?