Voting Classifier

Sanchita Mangale
May 18, 2019


A collection of several models working together on the same dataset is called an ensemble, and the method is called Ensemble Learning. It is often much more useful to combine several different models than to rely on any single one.

Why ensembles?

  1. Lower error
  2. Less over-fitting
  3. Taste great

Voting is one of the simplest ways of combining the predictions from multiple machine learning algorithms. A voting classifier isn’t an actual classifier but a wrapper for a set of different classifiers that are trained and evaluated in parallel in order to exploit the different peculiarities of each algorithm.

Voting Classifier

We can train a dataset using different algorithms and ensemble them to predict the final output. The final prediction is taken by majority vote, according to two different strategies:

1. Hard voting / majority voting: Hard voting is the simplest case of majority voting. In this case, the class that receives the highest number of votes N_c(y_t) is chosen; that is, we predict the class label ŷ via the majority vote of the individual classifiers.

Hard voting formula:

ŷ = mode{C1(x), C2(x), …, Cm(x)}

Assuming that we combine three classifiers that classify a training sample as follows:

  • classifier 1 -> class 0
  • classifier 2 -> class 0
  • classifier 3 -> class 1

ŷ = mode{0, 0, 1} = 0

Via majority vote, we would classify the sample as “class 0”.
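As a minimal sketch in plain Python (the votes mirror the example above; names are illustrative), the hard vote is simply the mode of the individual predictions:

    from collections import Counter

    # Class predictions from the three classifiers above.
    votes = [0, 0, 1]

    # Hard / majority vote: the most frequent class label wins.
    y_hat = Counter(votes).most_common(1)[0][0]
    print(y_hat)  # -> 0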

2. Soft voting: In this case, the probability vectors for each predicted class (from all classifiers) are summed up and averaged. The winning class is the one with the highest averaged probability (only recommended if the classifiers are well calibrated).

Soft voting formula:

ŷ = argmax_i Σ_j w_j · p_ij

where p_ij is the probability that classifier j assigns to class i, and w_j is the weight given to classifier j.

Assuming the example in the previous section was a binary classification task, our ensemble could make the following prediction:

  • C1(x) → [0.9, 0.1]
  • C2(x) → [0.8, 0.2]
  • C3(x) → [0.4, 0.6]

With equal weights, the averaged probability vector is [0.7, 0.3], so the ensemble would predict class 0. However, assigning the weights {0.1, 0.1, 0.8} yields the weighted average [0.49, 0.51] and hence the prediction ŷ = 1.
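A minimal NumPy sketch of the same calculation (the probability vectors mirror the example above; everything else is illustrative):

    import numpy as np

    # Probability vectors [P(class 0), P(class 1)] from the three classifiers.
    probas = np.array([[0.9, 0.1],
                       [0.8, 0.2],
                       [0.4, 0.6]])

    # Unweighted soft vote: average the vectors, then take the argmax.
    print(probas.mean(axis=0))           # -> [0.7 0.3], predicts class 0

    # Weighted soft vote with weights {0.1, 0.1, 0.8}.
    weights = np.array([0.1, 0.1, 0.8])
    weighted = np.average(probas, axis=0, weights=weights)
    print(weighted)                      # -> [0.49 0.51], predicts class 1
    print(weighted.argmax())             # -> 1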

For example, consider three classifiers: a logistic regression, a decision tree (with the default Gini impurity), and an SVM classifier (with a polynomial kernel and probability=True in order to generate the probability vectors). The models are trained in parallel and their predictions are combined by voting, using the VotingClassifier class from sklearn.ensemble.

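A minimal sketch of this setup with scikit-learn, assuming an illustrative synthetic dataset (the data, split, and random_state are assumptions, not the article’s originals):

    from sklearn.datasets import make_classification
    from sklearn.ensemble import VotingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC
    from sklearn.tree import DecisionTreeClassifier

    # Illustrative synthetic dataset standing in for the article's data.
    X, y = make_classification(n_samples=500, n_features=10, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

    log_clf = LogisticRegression(max_iter=1000)
    tree_clf = DecisionTreeClassifier()             # default Gini impurity
    svm_clf = SVC(kernel="poly", probability=True)  # probability vectors for soft voting

    voting_clf = VotingClassifier(
        estimators=[("lr", log_clf), ("dt", tree_clf), ("svc", svm_clf)],
        voting="hard",  # set voting="soft" to average predicted probabilities instead
    )

    # Compare each individual model against the ensemble on the held-out set.
    for clf in (log_clf, tree_clf, svm_clf, voting_clf):
        clf.fit(X_train, y_train)
        print(clf.__class__.__name__, clf.score(X_test, y_test))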

As we can see, the ensemble takes advantage of the different algorithms and can yield better performance than any single one. Make sure to include diverse classifiers, so that models which fall prey to similar types of errors do not simply aggregate those errors. Similarly, we can repeat the experiment with soft voting.

A voting classifier can be a good choice whenever a single strategy is not able to reach the desired accuracy threshold. In short, a voting classifier allows mixing different classifiers and adopting a majority vote to decide which class should be considered the winner during a prediction.
