

An ankyrin repeat domain (ARD) is an alpha-solenoid repeat structure formed by a tandem series of ankyrin repeat units. The repeat units within a domain share low sequence similarity but are highly conserved structurally. ARDs serve as protein–protein interaction platforms, and they have been found to influence the hypoxia response through hydroxylation interactions with Factor Inhibiting HIF (FIH) enzymes, which repress HIF under normoxic conditions. In this study, we designed a sequence-based method that incorporates secondary structure features to predict the boundaries of all internal repeats within an ARD protein; the binding positions for hydroxylation were also identified through pattern matching. The proposed prediction system achieved a sensitivity of 73.1%, a specificity of 99.3%, and an accuracy of 94.1% for ARD recognition. In addition, we constructed a comprehensive web database containing a total of 15,322 ARD candidates identified from the genomes of all 63 model species in Ensembl (release 73). We believe that the proposed prediction system and the accompanying database can help biologists further explore ARD-related research on protein–protein interaction mechanisms.
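
For readers less familiar with the reported metrics, the following minimal Python sketch shows how sensitivity, specificity, and accuracy are conventionally computed from confusion-matrix counts. The function name `classification_metrics` and the counts passed to it are illustrative assumptions; they are not code or data from this study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp/fn: ARD proteins predicted as ARD / missed by the predictor.
    tn/fp: non-ARD proteins correctly rejected / wrongly predicted as ARD.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    sensitivity = tp / (tp + fn)                 # fraction of true ARDs recovered
    specificity = tn / (tn + fp)                 # fraction of non-ARDs rejected
    accuracy = (tp + tn) / (tp + fp + tn + fn)   # fraction of all calls correct
    return sensitivity, specificity, accuracy


# Hypothetical counts chosen only to exercise the function.
sens, spec, acc = classification_metrics(tp=80, fp=5, tn=95, fn=20)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

Because accuracy weights both classes by their size, a test set dominated by non-ARD proteins can show high accuracy and specificity alongside noticeably lower sensitivity.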