Abstract:
The traditional web is gradually evolving through the adoption of newer techniques, including the Semantic Web. Integrating web content through ontologies in a language-independent manner is a required feature of this process. To make better use of resources, the ontology, which serves as a central knowledge repository, must itself be language independent. Beyond language independence, the richer the source data set of the ontology, the more options there are for incorporating new knowledge.
In this thesis, we introduce a framework for a multilingual ontology that can adapt to the addition of new languages as well as the addition of new data to the existing sources. The framework, itself an extension of the framework currently in use, augments the domain of the current ontology. This augmentation is achieved by introducing a universal technique for integrating any number of languages into the multilingual ontology. After elaborating the framework, we highlight the significance of efficient techniques for extracting data from the ontology.
This thesis also introduces a way to improve the extraction technique by consolidating multiple data sources into a single source. We present a process in which the machine-readable properties of individual entities are filtered through intelligent techniques to generate a precise knowledge source, thereby sub-merging multiple knowledge bases into a single, richer data set.
Lastly, we present the results obtained from an experimental implementation of the sub-merging mechanism to demonstrate the magnitude of the enhancement and its contribution to our ultimate goal of augmenting Wikipedia entries.