
A Neural Network Approach for Text Document Classification and Semantic Text Analytics

Arpit Kakade, Kunal Dhumal, Sachin Das, Shikhar Jain, N. M. Ranjan

Abstract


The volume of data produced daily is vast, and most of it is unstructured. It is therefore necessary to organize this data into categories so that meaningful knowledge can be derived from it. The proposed methodology consists of a feature selection component followed by a neural network classifier. The neural network is trained on a large variety of text documents so that it can correctly predict the type of a document presented as input. A machine learning algorithm is designed to select the terms that serve as the basis for differentiating between topic categories. The algorithm also analyses synonyms so that redundant information is kept under the same label.
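The pipeline described above (synonym merging, term selection, then a neural network classifier) can be illustrated with a minimal sketch. The abstract does not specify the selection criterion, the synonym source, or the network architecture, so everything below is an assumption made for illustration: the `SYNONYMS` table is hypothetical, terms are ranked by how much their document frequency differs between classes, and the classifier is a single-layer softmax network rather than the authors' model.

```python
import math
import random
from collections import Counter

# Hypothetical synonym table (assumption): maps variants to one canonical label.
SYNONYMS = {"automobile": "car", "vehicle": "car", "film": "movie", "cinema": "movie"}

def tokenize(text):
    """Lowercase, split on whitespace, and merge synonyms under one label."""
    return [SYNONYMS.get(w, w) for w in text.lower().split()]

def select_terms(docs, labels, k):
    """Keep the k terms whose per-class document frequency differs the most."""
    classes = sorted(set(labels))
    df = {c: Counter() for c in classes}
    for doc, label in zip(docs, labels):
        df[label].update(set(tokenize(doc)))
    vocab = set().union(*df.values())
    n = Counter(labels)

    def spread(term):
        rates = [df[c][term] / n[c] for c in classes]
        return max(rates) - min(rates)

    # Ties are broken alphabetically so the result is deterministic.
    return sorted(vocab, key=lambda t: (-spread(t), t))[:k]

def vectorize(doc, terms):
    """Binary presence vector over the selected terms."""
    tokens = set(tokenize(doc))
    return [1.0 if t in tokens else 0.0 for t in terms]

def softmax(W, b, x):
    scores = [sum(w * xj for w, xj in zip(row, x)) + bc for row, bc in zip(W, b)]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def train(X, y, n_classes, epochs=200, lr=0.5, seed=0):
    """Single-layer softmax network trained with stochastic gradient descent."""
    rng = random.Random(seed)
    W = [[rng.uniform(-0.01, 0.01) for _ in X[0]] for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        for x, target in zip(X, y):
            probs = softmax(W, b, x)
            for c in range(n_classes):
                grad = probs[c] - (1.0 if c == target else 0.0)
                b[c] -= lr * grad
                for j, xj in enumerate(x):
                    W[c][j] -= lr * grad * xj
    return W, b

def predict(W, b, x):
    probs = softmax(W, b, x)
    return probs.index(max(probs))

# Toy corpus (made up for this sketch).
docs = ["the automobile engine needs repair", "a fast car on the road",
        "the vehicle has new tires", "a film about love",
        "the movie won an award", "cinema tickets are cheap"]
labels = ["auto", "auto", "auto", "film", "film", "film"]

classes = sorted(set(labels))
terms = select_terms(docs, labels, k=4)
X = [vectorize(d, terms) for d in docs]
y = [classes.index(label) for label in labels]
W, b = train(X, y, len(classes))
prediction = classes[predict(W, b, vectorize("a vehicle on the road", terms))]
print(terms, prediction)
```

Because "automobile" and "vehicle" are merged into "car" before term selection, the three spellings contribute to a single feature instead of three sparse ones, which is the effect the abstract attributes to synonym analysis.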


