A Hash Map based Binary Matrix Approach for Text Document Classification
Abstract
Conventional models classify text documents using a sequential approach. This paper proposes a new approach to text document classification that preserves the sequence of words occurring in a document. The data structure used to preserve the word sequences is called a “Binary Matrix”. A classification technique for text documents is also proposed. Terms are indexed with a hash map, in which each term is associated with the list of class labels of the documents that contain it.
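For illustration only, the two structures named in the abstract can be sketched in Python. The abstract does not specify the layout of the Binary Matrix or the decision rule of the classifier, so the following are assumptions rather than the method of the paper: the matrix has one row per word position and one column per vocabulary term, and classification is a simple vote over the class labels stored in the hash map. The function names `build_term_index`, `binary_matrix`, and `classify` are hypothetical.

```python
from collections import defaultdict


def build_term_index(labeled_docs):
    """Hash map: term -> set of class labels of the documents containing it."""
    index = defaultdict(set)
    for text, label in labeled_docs:
        for term in text.lower().split():
            index[term].add(label)
    return index


def binary_matrix(text, vocabulary):
    """Assumed layout: rows are word positions, columns are vocabulary terms.

    Entry [i][j] is 1 when the i-th word of the document is vocabulary[j],
    so the order of the rows preserves the word sequence.
    """
    terms = text.lower().split()
    column = {term: j for j, term in enumerate(vocabulary)}
    matrix = [[0] * len(vocabulary) for _ in terms]
    for i, term in enumerate(terms):
        if term in column:
            matrix[i][column[term]] = 1
    return matrix


def classify(text, index):
    """Assumed rule: each known term votes for every label it is indexed under."""
    votes = defaultdict(int)
    for term in text.lower().split():
        for label in index.get(term, ()):
            votes[label] += 1
    # Return None when no term of the document is present in the index.
    return max(votes, key=votes.get) if votes else None


# Toy training data, invented for this sketch.
training = [
    ("the match ended in a draw", "sports"),
    ("the market fell sharply", "finance"),
]
index = build_term_index(training)
predicted = classify("market fell today", index)
```

A hash map gives expected constant-time lookup of a term's class labels, which is presumably why it is used for indexing; the voting rule above is only one simple way to combine those labels.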