DSpace Repository

Modified inverted files and algorithms for phrase query and not query

dc.contributor.advisor Kabir, Dr. Md. Humayun
dc.contributor.author Paul, Tuhin
dc.date.accessioned 2015-12-26T10:31:34Z
dc.date.available 2015-12-26T10:31:34Z
dc.date.issued 2009-12
dc.identifier.uri http://lib.buet.ac.bd:8080/xmlui/handle/123456789/1559
dc.description.abstract Inverted files, equivalent to database indices, are used to speed up the search of both Hyper Text Markup Language (HTML) and eXtensible Markup Language (XML) files on the web. Searching XML files differs from searching HTML files in two ways: inverted files for XML need to be compressed because of their large size, and query evaluation against XML files requires keyword searching both in the structure and in the values. XML queries are often composed of multiple keywords with logical relations. XML queries with conjunction, disjunction, ancestor-descendant, and preceding-following relations among the multiple keywords have already been evaluated successfully. Multiple keywords often appear in XML queries as a phrase. Phrase queries against a single XML document have already been evaluated. However, no method exists to evaluate a phrase query against a collection of XML documents, whether large or small. Additionally, many applications require a special type of query in which given keywords or phrases must not be present in the retrieved XML documents. As far as our study shows, no method exists to evaluate these NOT queries either. XML document retrieval will not be complete without evaluating these two important types of queries, so new solutions are required to process both phrase and NOT queries efficiently. In this thesis, we introduce methods to evaluate both phrase and NOT queries, proposing the necessary changes in the inverted file structure and in the query processing algorithms. We have used a pull parser to parse the XML documents. We have developed a prototype query processor that is capable of creating inverted files and evaluating all types of queries, including phrase and NOT queries. Our experimental results using this prototype query processor show the effectiveness of our proposed query evaluation methods. en_US
dc.language.iso en en_US
dc.publisher Department of Computer Science and Engineering, BUET en_US
dc.subject XML (Document markup language) en_US
dc.title Modified inverted files and algorithms for phrase query and not query en_US
dc.type Thesis-MSc en_US
dc.contributor.id 100605048 P en_US
dc.identifier.accessionNumber 107529
dc.contributor.callno 005.72/PAU/2009 en_US
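
Illustrative note: the abstract describes evaluating phrase and NOT queries over a collection of documents by means of inverted files. The record does not give the thesis's modified file structure or algorithms, so the following is only a minimal sketch of the general idea using a standard positional inverted index. All names here (build_index, phrase_query, not_query, the sample file names) are hypothetical, and the sketch assumes each document has already been reduced to a plain string of whitespace-separated words, ignoring XML structure and compression entirely.

```python
from collections import defaultdict


def build_index(docs):
    """Build a positional inverted index: term -> {doc_id -> [word positions]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def phrase_query(index, phrase):
    """Return the ids of documents in which the words of `phrase` occur consecutively."""
    terms = phrase.lower().split()
    if not terms:
        return set()
    postings = [index.get(term, {}) for term in terms]
    # A matching document must contain every word of the phrase.
    candidates = set(postings[0])
    for posting in postings[1:]:
        candidates &= set(posting)
    hits = set()
    for doc_id in candidates:
        later = [set(posting[doc_id]) for posting in postings[1:]]
        # The k-th later word must sit exactly k + 1 positions after the first word.
        for start in postings[0][doc_id]:
            if all(start + k + 1 in positions for k, positions in enumerate(later)):
                hits.add(doc_id)
                break
    return hits


def not_query(index, all_doc_ids, phrase):
    """Return the ids of documents that do NOT contain `phrase` (set complement)."""
    return set(all_doc_ids) - phrase_query(index, phrase)


# Hypothetical sample collection, with markup already stripped.
docs = {
    "a.xml": "inverted files speed up search",
    "b.xml": "files inverted by mistake",
    "c.xml": "compressed inverted files for xml",
}
index = build_index(docs)
matches = phrase_query(index, "inverted files")
excluded = not_query(index, docs.keys(), "inverted files")
```

In this sketch a NOT query is simply the complement of the matching set, which requires knowing every document id in the collection; the thesis may handle this differently.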