Text Classification Review 2020

QIAN Li et al. wrote a review of text classification in 2020 ^_^

Overview

Pipeline Of Text Classification

Method Changes

The biggest difference between shallow learning (which relies on feature engineering) and deep learning is that deep learning extracts features automatically.

Methods

Shallow Learning Models

Pipeline

  1. Preprocessing: for example, word segmentation, data cleaning, and data statistics.
  2. Text representation: expresses the preprocessed text in a form that is easy to compute on, such as Bag-of-Words (BOW), N-gram, Term Frequency-Inverse Document Frequency (TF-IDF), word2vec, or GloVe.
  3. Classification: the represented text is fed into the classifier according to the selected features.
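The first two pipeline steps can be sketched in a few lines of plain Python. This is only an illustration under simple assumptions: the tokenizer is a naive regex split (real word segmentation is language-dependent), the TF-IDF variant is the unsmoothed textbook form, and the names `preprocess`, `tfidf_vectors`, and `corpus` are made up for this example rather than taken from the review.

```python
import math
import re
from collections import Counter


def preprocess(text):
    """Step 1: lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Step 2: map each tokenized document to a {term: TF-IDF weight} dict."""
    n_docs = len(docs)
    # Document frequency: in how many documents does each term appear?
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            # term frequency * inverse document frequency (no smoothing)
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Toy corpus, invented for illustration only.
corpus = [
    "The movie was great",
    "The movie was terrible",
    "Great acting, great plot",
]
tokens = [preprocess(text) for text in corpus]
vectors = tfidf_vectors(tokens)
```

Terms that occur in few documents (such as "terrible") receive a higher weight than terms shared across documents (such as "the"). The resulting sparse vectors are what step 3 would pass to a classifier.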

The rest of this section covers representative classifiers.

PGM-based Methods

Probabilistic graphical models (PGMs) express the conditional dependencies among features as graphs.

  1. Naive Bayes
  2. Hidden Markov Model
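As an illustration of the first item, here is a minimal multinomial Naive Bayes sketch in plain Python. It assumes the text is already tokenized, uses add-one (Laplace) smoothing, and works in log space to avoid underflow. The class name `NaiveBayesText` and the toy training data are hypothetical and not from the review.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesText:
    """Multinomial Naive Bayes over token lists, with add-one smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)       # documents per class
        self.word_counts = defaultdict(Counter)   # token counts per class
        self.vocab = set()
        for doc_tokens, label in zip(docs, labels):
            self.word_counts[label].update(doc_tokens)
            self.vocab.update(doc_tokens)
        return self

    def predict(self, doc_tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # log prior P(class)
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for tok in doc_tokens:
                # log likelihood P(token | class), smoothed so that
                # unseen tokens do not zero out the whole product
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training set, invented for illustration only.
train_docs = [["great", "movie"], ["loved", "it"],
              ["terrible", "movie"], ["hated", "it"]]
train_labels = ["pos", "pos", "neg", "neg"]
model = NaiveBayesText().fit(train_docs, train_labels)
```

The "naive" part is the assumption that tokens are conditionally independent given the class, which is why the per-token log likelihoods can simply be summed.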

KNN(K-Nearest Neighbors)-based Method

SVM(Support Vector Machine)-based Method

DT(Decision Tree)-based Method

Integration-based Method

Integration-based (ensemble) methods aim to aggregate the results of multiple algorithms; examples include RF (Random Forest), XGBoost, and stacking.
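The aggregation idea can be shown with the simplest possible combiner, majority voting. Note that this is a stand-in and not how Random Forest, XGBoost, or stacking work internally: those build their base learners by bagging, boosting, or a meta-learner respectively. The three rule-based classifiers below are hypothetical toys made up for this sketch.

```python
from collections import Counter


def majority_vote(classifiers, sample):
    """Return the label predicted by the most base classifiers."""
    votes = [clf(sample) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical base classifiers: each maps a token list to "pos" or "neg".
def by_keyword(tokens):
    return "pos" if "great" in tokens else "neg"


def by_negation(tokens):
    return "neg" if "not" in tokens else "pos"


def by_length(tokens):
    return "pos" if len(tokens) <= 3 else "neg"


ensemble = [by_keyword, by_negation, by_length]
prediction = majority_vote(ensemble, ["not", "great", "at", "all"])
```

Here `by_keyword` is fooled by the word "great", but the other two classifiers outvote it, so the ensemble is more robust than its weakest member. An odd number of voters avoids ties in the two-class case.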

Deep Learning Models