USING MULTIPLE SEMANTIC MEASURES FOR COREFERENCE RESOLUTION IN ONTOLOGY POPULATION
Keywords: coreference resolution, ontology properties, semantic measure, ontology population, semantic text analysis.
Abstract
Populating an ontology means adding new, domain-specific content to it from an input that is expressed, in particular, in natural language. We focus on an important aspect of the ontology population process: finding and resolving coreferences, i.e., similar mentions of entities in the input text. Our contribution is a novel formal framework that extends state-of-the-art approaches to coreference resolution by using multiple semantic similarity properties in the resolution process; that is, we extend the list of ontological properties used for coreference resolution with additional properties such as inverse, symmetry, intersection, and union. We use the proposed framework to improve our previously proposed coreference resolution algorithm, which is part of our general approach to text analysis and information extraction for populating subject domain ontologies. We describe a multi-agent implementation of our information extraction system and show that using additional semantic similarity measures to evaluate coreferential candidates improves the quality of coreference resolution, especially for complex objects, whose coreference has not yet been studied in detail.
License
International Journal of Computing is an open access journal. Authors who publish with this journal agree to the following terms:
• Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
• Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
• Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.