Candidate sequences for inclusion in the database will be identified by searching publicly available protein databases (NCBI Protein) with keyword filters, and the resulting sequences will then be manually curated. A panel of academic peer reviewers will consider the identified sequences and their associated published literature to determine whether they should be included in the 2017 database. The entire process, including the filtering algorithm, will be transparent and documented step by step. Both the filtering criteria and the criteria for inclusion or exclusion will be developed and made publicly available.
Professional scientific staff at HESI will provide management oversight for the sequence search (to be conducted by informatics experts), academic peer review panel, and public release of the final database. A public-private steering team convened by HESI will provide input into matters of process but will not have any influence on decisions regarding sequence inclusion/exclusion in the database.
As genomic sequencing technology has become widespread, the number of sequences to be filtered has grown rapidly. The COMPARE process will accommodate this growth by implementing an automated, high-throughput bioinformatics platform that identifies a meaningful subset of sequences for scientific review by a diverse group of recognized allergy experts. The COMPARE process will also meet contemporary needs regarding the criteria for populating a well-documented and sustainable allergen database.
The COMPARE database relies on the contribution of scientific expertise as well as in-kind and direct financial support from both public and private scientific organizations to develop this public resource. If you would like to learn more about how you or your organization can contribute, please contact us here.
The first iteration of the database (COMPARE 2017) was released and made publicly available on this website on 03 February 2017. The database will be updated annually, with a new version released at the beginning of each year.
An international nomenclature group, the WHO-IUIS Allergen Nomenclature Subcommittee (http://www.allergen.org/), is responsible for assigning names to allergens and, where necessary, renaming already listed allergens when inconsistencies arise or the biological names of organisms change. An allergen name consists of the first three letters of the genus, a space, the first letter of the species, another space, and a number reflecting the order of discovery. For example, the first allergen from the organism Blomia tropicalis was named Blo t 1. Subsequent, structurally distinct allergens from the same organism are then named sequentially: Blo t 2, Blo t 3, etc.
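The basic naming rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated here, not the subcommittee's own tooling. The function names `allergen_name` and `allergen_name_from_binomial` are hypothetical, and the sketch does not handle any exceptions or renaming decisions made by the subcommittee.

```python
def allergen_name(genus: str, species: str, number: int) -> str:
    """Build a name such as 'Blo t 1' from genus, species, and discovery number.

    Hypothetical helper: applies only the basic rule (first three letters of
    the genus, first letter of the species, then the number).
    """
    genus = genus.strip()
    species = species.strip()
    if len(genus) < 3 or not species:
        raise ValueError("genus needs at least three letters and species at least one")
    if number < 1:
        raise ValueError("number must be a positive integer")
    # Genus abbreviation is capitalized; species initial is lower case.
    prefix = genus[:3].capitalize()
    return f"{prefix} {species[0].lower()} {number}"


def allergen_name_from_binomial(binomial: str, number: int) -> str:
    """Convenience wrapper that accepts a binomial such as 'Blomia tropicalis'."""
    genus, species = binomial.split()[:2]
    return allergen_name(genus, species, number)


if __name__ == "__main__":
    # The first three allergens described for Blomia tropicalis.
    for n in (1, 2, 3):
        print(allergen_name_from_binomial("Blomia tropicalis", n))
```

Because the number reflects the order of discovery within one organism, it has to be supplied by the caller; it cannot be derived from the organism name alone.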
Homologous allergens in different species are intended to share the same number, but the system has some inconsistencies because a number may already have been used for another allergen that was discovered earlier. For example, the peanut homologue of the major birch pollen allergen Bet v 1 is known as Ara h 8, because the name Ara h 1 was already taken.
Note: the IUIS database is not a comprehensive database of clinically reviewed allergens; its purpose is the standardization and regulation of allergen nomenclature.