‘ChemXtract’ A System for Extraction of Chemical Events from Patent Documents

Pattabhi RK Rao, Sobha Lalitha Devi


Abstract
ChemXtract's main goal is to extract chemical events from patent documents. Event extraction requires first identifying the names of the chemical compounds involved in the events. This work therefore performs two extractions: (a) the names of chemical compounds and (b) the events that identify the specific involvement of those compounds in a chemical reaction. The first task, extraction of the essential elements of a chemical reaction, is generally treated as Named Entity Recognition (NER): it extracts the compounds, conditions, and yields, and assigns each a label according to the role it plays within the reaction. Event extraction, in turn, identifies the chemical event relations between the identified compounds. We use Neural Conditional Random Fields (NCRF), which combine the power of artificial neural networks (ANNs) and CRFs, with features at different levels, including linguistic, orthographic, and lexical clues. The results obtained are encouraging.
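To make the notion of orthographic and lexical clues concrete, the following is a minimal sketch of token-level feature extraction of the kind commonly fed to a CRF-style sequence tagger. The abstract does not list the actual feature set, so every feature name here (`has_digit`, `suffix3`, `prev_lower`, and so on), the function names, and the example sentence are illustrative assumptions, not the authors' implementation.

```python
def orthographic_features(token):
    """Surface-form clues for one token (hypothetical feature set)."""
    return {
        "lower": token.lower(),                      # lexical clue: normalized form
        "is_capitalized": token[:1].isupper(),       # first character is uppercase
        "has_digit": any(ch.isdigit() for ch in token),
        "has_hyphen": "-" in token,
        "has_bracket": any(ch in "()[]" for ch in token),
        "suffix3": token[-3:].lower(),               # e.g. "ine", "ide", "ate"
    }


def sentence_features(tokens):
    """Add neighbouring-token context to each token's features."""
    features = []
    for i, token in enumerate(tokens):
        feats = orthographic_features(token)
        # Sentinel strings mark the sentence boundaries (an assumption).
        feats["prev_lower"] = tokens[i - 1].lower() if i > 0 else "<BOS>"
        feats["next_lower"] = (
            tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>"
        )
        features.append(feats)
    return features


# Hypothetical patent-style sentence, already tokenized.
tokens = ["2-Bromopyridine", "was", "dissolved", "in", "THF"]
feats = sentence_features(tokens)
```

Each token yields one feature dictionary, and the resulting list could be passed to a sequence labeller that assigns role labels such as compound, condition, or yield to each token.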
Anthology ID:
2023.ranlp-1.106
Volume:
Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing
Month:
September
Year:
2023
Address:
Varna, Bulgaria
Editors:
Ruslan Mitkov, Galia Angelova
Venue:
RANLP
Publisher:
INCOMA Ltd., Shoumen, Bulgaria
Pages:
988–995
URL:
https://aclanthology.org/2023.ranlp-1.106
Cite (ACL):
Pattabhi RK Rao and Sobha Lalitha Devi. 2023. ‘ChemXtract’ A System for Extraction of Chemical Events from Patent Documents. In Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing, pages 988–995, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria.
Cite (Informal):
‘ChemXtract’ A System for Extraction of Chemical Events from Patent Documents (RK Rao & Lalitha Devi, RANLP 2023)
PDF:
https://aclanthology.org/2023.ranlp-1.106.pdf