Text based Machine Learning Using Discriminative Classifiers

Rongon Chatterjee; Vasundhara Acharya; Krishna Prakasha; R. Vijaya Arjunan

Abstract

Ever since the invention of the computer, there has been curiosity about whether it can be made to learn. If we understood how to program computers to learn and to improve automatically with experience, the impact would be dramatic. A successful understanding of how to make computers learn would open up many new uses of computers and new levels of competence and customization. In this paper, two applications of Machine Learning are explored. In the first, linear regression is used to understand the correlation of the feature columns with the output and to make predictions based on the "line of best fit". In the second, discriminative classifiers are proposed for analyzing and segregating text-based data. On applying regression analysis to advertising data, it is observed that TV advertising has the strongest linear correlation with sales. In the later section, text-based machine learning is performed using the scikit-learn library for Python. Multiple contemporary classifiers are applied to a set of SMS messages to perform spam detection. The performance of the classifiers is evaluated using suitable accuracy metrics. The results show that the Naïve Bayes algorithm is much faster than other algorithms such as Logistic Regression. Using a Bayesian probabilistic approach, a spam ratio is attached to every token in the input set. The proposed work proves to be helpful in the field of advertising and in spam detection systems.
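
The idea of attaching a spam ratio to each token can be illustrated with a minimal sketch. This is not the authors' implementation: the toy messages, the function name token_spam_ratios, the Laplace smoothing parameter alpha, and the assumption of equal class priors are all hypothetical choices made for illustration, and the sketch uses only the Python standard library rather than scikit-learn.

```python
from collections import Counter

# Hypothetical toy messages; the paper's SMS data set is not reproduced here.
MESSAGES = [
    ("spam", "win a free prize now"),
    ("spam", "free entry claim your prize"),
    ("ham", "are we meeting for lunch now"),
    ("ham", "call me when you are free"),
]


def token_spam_ratios(messages, alpha=1.0):
    """Return an estimate of P(spam | token) for every token seen.

    Assumes equal class priors and Laplace smoothing with parameter alpha;
    both are illustrative assumptions, not taken from the paper.
    """
    spam_counts, ham_counts = Counter(), Counter()
    for label, text in messages:
        target = spam_counts if label == "spam" else ham_counts
        target.update(text.lower().split())

    vocab = set(spam_counts) | set(ham_counts)
    spam_total = sum(spam_counts.values())
    ham_total = sum(ham_counts.values())

    ratios = {}
    for token in vocab:
        # Smoothed likelihoods P(token | class)
        p_spam = (spam_counts[token] + alpha) / (spam_total + alpha * len(vocab))
        p_ham = (ham_counts[token] + alpha) / (ham_total + alpha * len(vocab))
        # Bayes' rule with equal priors: the priors cancel out
        ratios[token] = p_spam / (p_spam + p_ham)
    return ratios


ratios = token_spam_ratios(MESSAGES)
# Print the three tokens with the highest spam ratio
for token in sorted(ratios, key=ratios.get, reverse=True)[:3]:
    print(f"{token}: {ratios[token]:.2f}")
```

In this sketch, a token that appears mostly in spam messages receives a ratio above 0.5, while a token seen only in legitimate messages falls below 0.5. A full Naïve Bayes classifier would combine these per-token estimates across all tokens in a message.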

Volume 11 | Issue 7

Pages: 32-41
