Algorithms, Vol. 14, No. 1 (2021)

Capturing Protein Domain Structure and Function Using Self-Supervision on Domain Architectures

Damianos P. Melidis and Wolfgang Nejdl    

Abstract

Protein sequence embeddings have been shown to improve the prediction of biological properties of unseen proteins. However, these embeddings have a caveat: biological metadata do not exist for each amino acid, so the quality of each learned embedding vector cannot be measured separately. As a result, current sequence embeddings cannot be intrinsically evaluated, in a quantitative manner, on how much biological information they capture. We address this drawback with our approach, dom2vec, which learns vector representations for protein domains rather than for individual amino acids, because biological metadata do exist for each domain. To perform a reliable quantitative intrinsic evaluation in terms of biological knowledge, we selected the metadata related to the most distinctive biological characteristics of a domain: its structure, enzymatic function, and molecular function. Notably, dom2vec obtains an adequate level of performance in the intrinsic assessment; therefore, we can draw an analogy between the local linguistic features in natural languages and the domain structure and function information in domain architectures. Moreover, we demonstrate the applicability of dom2vec to protein prediction by comparing it with state-of-the-art sequence embeddings in three downstream tasks. We show that dom2vec outperforms sequence embeddings for toxin and enzymatic function prediction and is comparable with sequence embeddings in cellular location prediction.
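The analogy between natural language and domain architectures can be made concrete with a small sketch that treats each domain architecture as a "sentence" and each domain as a "word". The abstract does not state the training objective, so the skip-gram style pairing, the `skipgram_pairs` helper, the `window` parameter, and the `DOM_*` identifiers below are all illustrative assumptions rather than the authors' implementation.

```python
from typing import List, Tuple


def skipgram_pairs(architecture: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Return (center, context) domain pairs from one domain architecture.

    Assumption: an architecture is an ordered list of domain identifiers,
    analogous to a sentence of words in a word-embedding model.
    """
    pairs = []
    for i, center in enumerate(architecture):
        # Context = neighbouring domains within `window` positions.
        lo = max(0, i - window)
        hi = min(len(architecture), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, architecture[j]))
    return pairs


# Hypothetical architectures; the identifiers are placeholders, not real domains.
architectures = [
    ["DOM_A", "DOM_B", "DOM_C"],
    ["DOM_B", "DOM_C"],
]

# Pairs like these would be the training examples for an embedding model.
all_pairs = [p for arch in architectures for p in skipgram_pairs(arch, window=1)]
```

Under these assumptions, domains that repeatedly appear next to each other across architectures yield many shared training pairs, which is what would pull their vectors together in a word2vec-style model.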
