An Approach to Ontology-Driven Processing of Scientific Texts and User Natural Language Queries
Keywords:
ontology engineering, semantic web technologies, scientific texts, knowledge graph, OWL ontology, RDF triples, logical inference
Abstract
This research presents a method and system architecture for ontology-driven processing of scientific natural-language texts that integrates a linguistic processor, a domain OWL ontology, a reasoning engine, and a knowledge graph. The system constructs a consistent knowledge graph suitable for the automatic interpretation of structurally complex user queries with subsequent transformation into SPARQL queries. The proposed pipeline includes formal mappings of linguistic analysis results to ontology classes, properties, and assertions; consistency checking; fact materialization; and an explanation mechanism that derives minimal sets of axioms and facts to justify the reasoning engine’s conclusions or to identify the causes of inconsistency. The output is a set of consistency-compliant subject–predicate–object triples. An ontology-consistency evaluation metric is introduced as the proportion of triples in the knowledge graph that do not violate ontological constraints, and the impact of logical processing on the harmonic mean of precision and recall for triple extraction is evaluated on a representative corpus of scientific texts in the field of knowledge engineering.
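The two measures named in the abstract can be illustrated with a minimal sketch. It assumes that the ontology-consistency metric is the share of triples for which a constraint check reports no violation, and that extraction quality is the usual F1 score (the harmonic mean of precision and recall) against a gold set of triples. The predicate names, the `PERSONS` set, the `violates` check, and the toy data are hypothetical and are not taken from the paper; a real system would delegate the check to a reasoning engine.

```python
from typing import Callable, Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def consistency_ratio(triples: Iterable[Triple],
                      violates: Callable[[Triple], bool]) -> float:
    """Proportion of triples that do not violate the given constraint check."""
    items = list(triples)
    if not items:
        return 1.0  # assumption: an empty graph is treated as fully consistent
    consistent = sum(1 for t in items if not violates(t))
    return consistent / len(items)


def f1_score(extracted: Set[Triple], gold: Set[Triple]) -> float:
    """Harmonic mean of precision and recall for a set of extracted triples."""
    true_positives = len(extracted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(extracted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical range constraint: the object of "hasAuthor" must be a known person.
PERSONS = {"Alice", "Bob"}


def violates(triple: Triple) -> bool:
    _, predicate, obj = triple
    return predicate == "hasAuthor" and obj not in PERSONS


# Toy data, invented for illustration only.
extracted = {
    ("paper1", "hasAuthor", "Alice"),
    ("paper1", "hasAuthor", "OWL"),  # extraction error, breaks the constraint
    ("paper1", "usesMethod", "reasoning"),
    ("paper2", "hasAuthor", "Bob"),
}
gold = {
    ("paper1", "hasAuthor", "Alice"),
    ("paper1", "usesMethod", "reasoning"),
    ("paper2", "hasAuthor", "Bob"),
    ("paper2", "usesMethod", "SPARQL"),
}

ratio = consistency_ratio(extracted, violates)
f1_before = f1_score(extracted, gold)

# Logical processing is approximated here by dropping violating triples.
filtered = {t for t in extracted if not violates(t)}
f1_after = f1_score(filtered, gold)

print(f"consistency={ratio:.2f}  F1 before={f1_before:.3f}  F1 after={f1_after:.3f}")
```

In this toy case, removing the one violating triple raises precision without changing recall, so F1 increases; whether the same effect holds on a real corpus is what the evaluation described in the abstract measures.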
License
International Journal of Computing is an open access journal. Authors who publish with this journal agree to the following terms:
• Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
• Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
• Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.