Description: |
Most genetic variation in humans occurs in a pattern consistent with neutral evolution, but a small subset is maintained by balancing selection. Identifying loci under balancing selection is important not only for understanding the processes that explain variation in the genome, but also for identifying loci with alleles that affect response to the environment and disease. Several genome scans using genetic variation data have identified the 5′ end of the DMBT1 gene as a region undergoing balancing selection. DMBT1 encodes the pattern-recognition glycoprotein DMBT1, also known as SALSA, gp340 or salivary agglutinin. It binds to a wide variety of pathogens through a tandemly arranged scavenger receptor cysteine-rich (SRCR) domain, with the number of SRCR domains varying in humans. Here we use expression analysis, linkage in pedigrees, and long-range single-transcript sequencing to show that the signal of balancing selection is driven by one haplotype usually carrying shorter SRCR repeats, and another usually carrying a longer SRCR repeat, within the coding region of DMBT1. The DMBT1 protein size isoform encoded by a shorter SRCR domain repeat allele showed complete loss of binding of a cariogenic and invasive Streptococcus mutans strain, in contrast to the isoform encoded by the long SRCR allele. Taken together, our results suggest that balancing selection at DMBT1 is due to host-microbe interactions of encoded SRCR tandem repeat alleles. |