VENN is a software program that maps sequence conservation onto the three-dimensional structures of proteins.
Proteins are sequences of amino acids covalently linked by peptide bonds. They can be very short peptides, like enkephalin, or very long, like titin. Proteins carry out many functions in the cell: enzymes catalyze chemical reactions, hormones signal between cells, structural proteins form the cytoskeleton, and so on. To identify important regions in proteins, scientists align the sequences of two or more proteins and look for the amino acids that are identical. When two or more proteins from the same species have high sequence similarity, they belong to the same gene family. For example, Rac1, Rac2, Rac3, RhoA, RhoB, RhoC, RhoD, RhoG, and Cdc42 are all members of the Rho GTPase family. It is also useful to align the most homologous proteins from one or more species, which are thought to be orthologs or paralogs. Applying these concepts has vastly advanced our understanding of the functions of many proteins. However, a sequence alignment is a two-dimensional abstraction of a protein. A protein structure model is a better abstraction because it captures how the protein polymer folds back upon itself in three-dimensional space. Since the three-dimensional fold of a protein is most often necessary for its function, we thought it would be better to examine sequence conservation in the context of the native three-dimensional fold, in the hope of identifying conserved patches in the protein. To enable this, VENN maps sequence conservation onto the three-dimensional structures of proteins. VENN was named after John Venn, the inventor of Venn diagrams, because it can be used to explore the interaction of sequence and structure to identify important functional regions in proteins. VENN aligns the protein sequences, calculates a conservation heatmap across the amino acid positions, and plots the heatmap onto the protein structure.
Different strategies for selecting groups of sequences can be used to identify functional regions and specificity determinants in gene families.
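
The conservation-scoring step described above can be illustrated with a minimal Python sketch. It is not VENN's actual implementation. It assumes the sequences are already aligned (gaps written as "-"), scores each column as the fraction of sequences sharing the most common residue, and maps that score to a simple blue-to-red color. The function names, the scoring rule, the color scale, and the toy sequences are all hypothetical choices made for illustration.

```python
from collections import Counter


def column_conservation(aligned_seqs):
    """Score each alignment column as the fraction of sequences that share
    the most common residue (an assumed, simple identity-based measure)."""
    if not aligned_seqs:
        raise ValueError("need at least one aligned sequence")
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("aligned sequences must all have the same length")

    scores = []
    for column in zip(*aligned_seqs):
        residues = [r for r in column if r != "-"]  # gaps never count as matches
        if not residues:
            scores.append(0.0)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        # Divide by the total number of sequences so gaps lower the score.
        scores.append(top_count / len(column))
    return scores


def score_to_rgb(score):
    """Map a 0..1 conservation score to a color from blue (variable)
    to red (conserved). The color scale is an arbitrary choice."""
    score = min(max(score, 0.0), 1.0)
    return (round(255 * score), 0, round(255 * (1.0 - score)))


# Toy, made-up alignment of three short sequences (not real proteins).
toy_alignment = ["MKTAY", "MKSAY", "MRT-Y"]
scores = column_conservation(toy_alignment)
colors = [score_to_rgb(s) for s in scores]

for position, (s, rgb) in enumerate(zip(scores, colors), start=1):
    print(f"position {position}: conservation={s:.2f} color={rgb}")
```

In a real workflow, each per-position color would be assigned to the corresponding residue of the structure in a molecular viewer. Changing which sequences go into the alignment changes the heatmap, which is how different sequence-selection strategies can highlight different regions.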