<<AI Artificial Intelligence: PyTorch Self-Study>> 7.7 TorchEnsemble Model Ensemble Library - HCHUNGW's Blog

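7.7 TorchEnsemble Model Ensemble Library

As the saying goes, three cobblers with their wits combined can match Zhuge Liang, and the same holds for machine learning models. Model ensembling is an important research direction in machine learning, and it has become standard practice both in classical machine learning and in data science competitions. This section therefore introduces model ensemble techniques that are practical for production use.

The PyTorch ecosystem already has a library for ensembling. This section covers how to use torchensemble, the logic behind each ensemble method, the code structure of torchensemble, and how to adapt your own code.

Introduction to ensemble-pytorch

TorchEnsemble is a member of the PyTorch ecosystem and provides a unified set of ensemble modules for PyTorch. The currently supported methods are: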

Ensemble Name	Type	Source Code	Problem
Fusion	Mixed	fusion.py	Classification / Regression
Voting[1]	Parallel	voting.py	Classification / Regression
Neural Forest	Parallel	voting.py	Classification / Regression
Bagging[2]	Parallel	bagging.py	Classification / Regression
Gradient Boosting[3]	Sequential	gradient_boosting.py	Classification / Regression
Snapshot Ensemble[4]	Sequential	snapshot_ensemble.py	Classification / Regression
Adversarial Training[5]	Parallel	adversarial_training.py	Classification / Regression
Fast Geometric Ensemble[6]	Sequential	fast_geometric.py	Classification / Regression
Soft Gradient Boosting[7]	Parallel	soft_gradient_boosting.py	Classification / Regression

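A brief theoretical introduction to each method can be found in the documentation.

(For the theory behind model ensembling, please refer to an introductory machine learning text.)

Analysis of the Principles

torchensemble provides training examples. This section organizes and annotates them; see the accompanying code.

The code snippets show that torchensemble provides ensemble classes in which all base learners share the same architecture (for example LeNet-5). Training is done through the ensemble class's fit() function, and evaluation through evaluate().

The key question in this part is how each ensemble method combines the outputs of its base learners. Below we look at the forward function in the source code to see how each method does it.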

fusion

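Principle: average first, then probabilities.

The forward function of the FusionClassifier class shows that it averages the base learners' outputs element-wise and then applies softmax to produce the class probability vector.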

class FusionClassifier(BaseClassifier):
    def _forward(self, *x):
        """
        Implementation on the internal data forwarding in FusionClassifier.
        """
        # Average
        outputs = [estimator(*x) for estimator in self.estimators_]
        output = op.average(outputs)
        return output

    @torchensemble_model_doc(
        """Implementation on the data forwarding in FusionClassifier.""",
        "classifier_forward",
    )
    def forward(self, *x):
        output = self._forward(*x)
        proba = F.softmax(output, dim=1)
        return proba
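The "average first, then probabilities" order can be tried out on toy models. The following is a minimal sketch, not torchensemble code: the three nn.Linear layers stand in for real base learners, and the plain `sum(...) / len(...)` is only assumed to behave like op.average. It also computes the opposite order (softmax first, then average, as used by the voting method in the next subsection) to show that the two orders generally give different probabilities.

```python
import torch
import torch.nn.functional as F
from torch import nn

torch.manual_seed(0)

# Three toy base learners sharing one structure (stand-ins for e.g. LeNet-5).
estimators = [nn.Linear(4, 3) for _ in range(3)]
x = torch.randn(2, 4)  # a batch of 2 samples with 4 features each

with torch.no_grad():
    logits = [est(x) for est in estimators]       # each tensor: (2, 3)

    # Fusion: element-wise average of the raw outputs, then softmax.
    avg_logits = sum(logits) / len(logits)
    fusion_proba = F.softmax(avg_logits, dim=1)

    # Opposite order: softmax per learner, then average the probabilities.
    voting_proba = sum(F.softmax(l, dim=1) for l in logits) / len(logits)

print(fusion_proba)
print(voting_proba)
```

Both results are valid probability vectors (each row sums to 1), but because softmax is nonlinear they are not equal in general.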

voting

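voting: probabilities first, then average.

Voting first applies softmax to each base learner's output, then averages the resulting probability vectors.

sklearn offers several choices for voting: the vote mode can be soft or hard. torchensemble uses the soft mode, that is, it returns the average of the class probabilities from the individual classifiers. For comparison, the sklearn VotingClassifier documentation describes the values returned by transform as follows:

If voting='soft' and flatten_transform=True:
    returns ndarray of shape (n_samples, n_classifiers * n_classes), being class probabilities calculated by each classifier.
If voting='soft' and flatten_transform=False:
    ndarray of shape (n_classifiers, n_samples, n_classes)
If voting='hard':
    ndarray of shape (n_samples, n_classifiers), being class labels predicted by each classifier.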

class VotingClassifier(BaseClassifier):
    @torchensemble_model_doc(
        """Implementation on the data forwarding in VotingClassifier.""",
        "classifier_forward",
    )
    def forward(self, *x):
        # Average over class distributions from all base estimators.
        outputs = [
            F.softmax(estimator(*x), dim=1) for estimator in self.estimators_
        ]
        proba = op.average(outputs)
        return proba
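Soft and hard voting can disagree on the final label. The sketch below uses made-up probabilities for three classifiers, laid out as (n_classifiers, n_samples, n_classes); it is an illustration of the two vote modes and not code taken from torchensemble or sklearn.

```python
import torch

# Made-up class probabilities from 3 classifiers for 1 sample and 2 classes.
probas = torch.tensor([
    [[0.9, 0.1]],   # classifier 0 is very confident in class 0
    [[0.4, 0.6]],   # classifier 1 slightly prefers class 1
    [[0.4, 0.6]],   # classifier 2 slightly prefers class 1
])  # shape: (n_classifiers, n_samples, n_classes)

# Soft voting: average the probability vectors, then take the arg max.
soft_proba = probas.mean(dim=0)               # (n_samples, n_classes)
soft_label = soft_proba.argmax(dim=1)         # tensor([0])

# Hard voting: each classifier casts one vote, the majority label wins.
hard_votes = probas.argmax(dim=2)             # (n_classifiers, n_samples)
hard_label = torch.mode(hard_votes, dim=0).values  # tensor([1])

print(soft_label, hard_label)
```

Soft voting picks class 0 because the one confident classifier outweighs the two lukewarm ones, while hard voting picks class 1 because two of three votes go there.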

bagging

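bagging: probabilities first, then average. This is the same as voting.

The main idea of bagging is that each base model is trained on different data, which yields different base models. The torchensemble documentation notes that in deep learning more data generally gives a better model, so it does not split the data with a K-Fold style scheme.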

"bagging further uses sampling with replacement on each batch of data. Notice that sub-sampling is not typically used when training neural networks, because the neural networks typically achieve better performance with more training data."

class BaggingClassifier(BaseClassifier):
    @torchensemble_model_doc(
        """Implementation on the data forwarding in BaggingClassifier.""",
        "classifier_forward",
    )
    def forward(self, *x):
        # Average over class distributions from all base estimators.
        outputs = [F.softmax(estimator(*x), dim=1) for estimator in self.estimators_]
        proba = op.average(outputs)
        return proba
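The quoted "sampling with replacement on each batch of data" can be sketched as follows. The helper name bootstrap_batch is hypothetical and the details of torchensemble's own implementation may differ; the sketch only shows how each base learner could receive its own resampled view of the same batch.

```python
import torch

torch.manual_seed(0)


def bootstrap_batch(x, y):
    """Resample a batch with replacement (hypothetical helper)."""
    n = x.size(0)
    # Draw n indices from [0, n): some rows repeat, others are left out.
    idx = torch.randint(high=n, size=(n,))
    return x[idx], y[idx], idx


batch_x = torch.arange(8, dtype=torch.float32).unsqueeze(1)  # 8 samples, 1 feature
batch_y = torch.arange(8)

n_estimators = 3
for k in range(n_estimators):
    sub_x, sub_y, idx = bootstrap_batch(batch_x, batch_y)
    # Base learner k would take its optimizer step on (sub_x, sub_y) here.
    print(k, idx.tolist())
```

Because every learner sees a different resampling of each batch, the learners diverge even though they share one architecture and one data loader.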

GradientBoostingClassifier

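GradientBoostingClassifier: sum first, then probabilities.

The sum comes first because Gradient Boosting is an additive model: the final result is the sum of the outputs of N learners. Why? The second learner learns the gap between the first learner's output and the target, the third learner learns the gap between the sum of the first two learners' outputs and the target, and so on. This is the logic behind the code in the sum_with_multiplicative function.

The paragraph above is a brief summary of the gradient boosting algorithm and is hard to follow without the basic concepts of ensemble learning. Gradient boosting is also less intuitive than bagging, so please fill in the machine learning fundamentals on your own.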

# forward method of GradientBoostingClassifier
def forward(self, *x):
    output = [estimator(*x) for estimator in self.estimators_]
    output = op.sum_with_multiplicative(output, self.shrinkage_rate)
    proba = F.softmax(output, dim=1)
    return proba

# helper called above as op.sum_with_multiplicative
def sum_with_multiplicative(outputs, factor):
    """
    Compute the summation on a list of tensors, and the result is multiplied
    by a multiplicative factor.
    """
    return factor * sum(outputs)
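The additive idea is easiest to see on a regression toy problem with squared error, where the "gap" each new learner fits is simply the residual. The sketch below makes that assumption and uses one-split regression stumps (the helper fit_stump is hypothetical) instead of neural networks; it does not reproduce how torchensemble trains its base learners. The final prediction has the same form as sum_with_multiplicative: a shrinkage factor times the sum of all learner outputs.

```python
import torch


def fit_stump(x, r):
    """Fit a one-split regression stump to residuals r on 1-D inputs x."""
    best = None
    for t in x.unique()[:-1]:                 # skip the max so both sides are non-empty
        left, right = r[x <= t], r[x > t]
        loss = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or loss < best[0]:
            best = (loss, t, left.mean(), right.mean())
    _, t, left_value, right_value = best
    return lambda z: torch.where(z <= t, left_value, right_value)


x = torch.linspace(0, 1, 50)
y = torch.sin(6 * x)

shrinkage = 0.5
stumps = []
pred = torch.zeros_like(y)
for _ in range(20):
    residual = y - pred                       # gap between current sum and target
    stump = fit_stump(x, residual)            # the next learner fits that gap
    stumps.append(stump)
    pred = pred + shrinkage * stump(x)

# Same form as sum_with_multiplicative: factor * sum(outputs).
ensemble_out = shrinkage * sum(s(x) for s in stumps)

mse_first = ((y - shrinkage * stumps[0](x)) ** 2).mean()
mse_all = ((y - ensemble_out) ** 2).mean()
print(float(mse_first), float(mse_all))
```

The error with all 20 learners is lower than with the first learner alone, since every added learner corrects part of what the previous sum still gets wrong.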

SnapshotEnsembleClassifier

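SnapshotEnsembleClassifier: average first, then probabilities.

SnapshotEnsembleClassifier is an ensemble method invented after deep learning models appeared, and it is tied to how deep learning models are trained. The idea is to save several locally optimal models and then ensemble their outputs.

This idea is quite novel. One of the core questions in ensemble learning is how to obtain multiple base learners; the usual approach is to vary the data, the parameters, or the model type to get base learners that perform differently. Snapshot Ensemble instead saves several locally optimal states from a single training run as the base learners. This is efficient, and the snapshots usually do not misclassify the same samples, because each snapshot continues training from the previous one and converges to a different local optimum.

[Gao Huang, Yixuan Li, Geoff Pleiss, et al., "Snapshot Ensembles: Train 1, Get M for Free." ICLR, 2017.]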

def _forward(self, *x):
    """
    Implementation on the internal data forwarding in snapshot ensemble.
    """
    # Average
    results = [estimator(*x) for estimator in self.estimators_]
    output = op.average(results)
    return output
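How the snapshots are collected during a single training run can be sketched with the cyclic cosine learning rate from the Snapshot Ensembles paper: the rate restarts at its maximum at the start of each cycle, decays towards zero, and a copy of the model is saved at the end of the cycle. The code below is an illustration under simplifying assumptions (a toy linear model, full-batch SGD, a hypothetical helper snapshot_lr) and is not torchensemble's implementation; on such a tiny convex problem the snapshots end up similar, so it only demonstrates the mechanics.

```python
import copy
import math

import torch
from torch import nn


def snapshot_lr(step, total_steps, n_cycles, lr_max):
    """Cyclic cosine annealing: restart at lr_max at the start of every cycle."""
    steps_per_cycle = math.ceil(total_steps / n_cycles)
    phase = (step % steps_per_cycle) / steps_per_cycle
    return lr_max / 2 * (math.cos(math.pi * phase) + 1)


torch.manual_seed(0)
x = torch.randn(64, 4)
y = (x[:, 0] > 0).long()                      # toy binary labels

model = nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
total_steps, n_cycles = 60, 3
steps_per_cycle = math.ceil(total_steps / n_cycles)

snapshots = []
for step in range(total_steps):
    for group in optimizer.param_groups:
        group["lr"] = snapshot_lr(step, total_steps, n_cycles, lr_max=0.1)
    optimizer.zero_grad()
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    optimizer.step()
    if (step + 1) % steps_per_cycle == 0:     # end of a cycle: keep a snapshot
        snapshots.append(copy.deepcopy(model).eval())

# Average first, then probabilities, as in _forward above followed by softmax.
with torch.no_grad():
    logits = sum(m(x) for m in snapshots) / len(snapshots)
    proba = torch.softmax(logits, dim=1)
print(len(snapshots), proba.shape)
```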

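For more methods, see the official documentation.

The official demo provides some comparison experiments, which are summarized here for reference.

[Figure: summary of the comparison experiments from the official demo]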

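Code Structure

The torchensemble library wraps model training, inference, and ensembling at a high level and exposes a unified interface. How can complex and varied models be written against one unified API?

Here we draw a simplified UML class diagram to lay out the code structure.

[Figure: simplified UML class diagram of torchensemble]

The class diagram shows that every class ultimately inherits from nn.Module, which makes clear how important nn.Module from Chapter 4 is.

The code structure teaches a general lesson: any complex module can be distilled and abstracted into a basic BaseClass that defines the most essential and most general attributes and methods. Here these are the base learner, the number of base learners, the container of learners, and forward(), fit(), predict().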
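The base-class pattern can be sketched in a few lines. The class names TinyBaseEnsemble and TinyVoting and the make_estimator callable are hypothetical and chosen for illustration; torchensemble's real base class has more responsibilities (fit, logging, saving) and a different constructor.

```python
import torch
import torch.nn.functional as F
from torch import nn


class TinyBaseEnsemble(nn.Module):
    """Hypothetical base class holding what every ensemble shares."""

    def __init__(self, make_estimator, n_estimators):
        super().__init__()
        self.n_estimators = n_estimators
        # nn.ModuleList registers the learners, so parameters(), to() and
        # state_dict() cover all of them.
        self.estimators_ = nn.ModuleList(
            [make_estimator() for _ in range(n_estimators)]
        )

    def forward(self, x):
        # Each ensemble method defines its own way of combining outputs.
        raise NotImplementedError

    @torch.no_grad()
    def predict(self, x):
        self.eval()
        return self.forward(x)


class TinyVoting(TinyBaseEnsemble):
    """Subclass that only supplies the combination rule."""

    def forward(self, x):
        probas = [F.softmax(est(x), dim=1) for est in self.estimators_]
        return sum(probas) / len(probas)


ensemble = TinyVoting(lambda: nn.Linear(4, 3), n_estimators=5)
proba = ensemble.predict(torch.randn(2, 4))
print(proba.shape)
```

The subclass contains nothing but its forward rule; everything shared lives in the base class, which is the design the class diagram describes.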

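Implementing a Model Ensemble by Hand

Although torchensemble offers a rich set of ensemble methods, it is sometimes not a good fit for multiple models that were trained by hand. Below are two ways to implement a model ensemble (voting) manually.

1. Using nn.ModuleList()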

import torch
from torch import nn

class Ensemble(nn.Module):
    def __init__(self, models, num_classes):
        super().__init__()
        # Use nn.ModuleList: an optimizer does not see the parameters of
        # models that are kept in a plain Python list.
        self.models = nn.ModuleList(models)
        self.num_classes = num_classes

    def forward(self, x):
        # Forward x through every model and sum the outputs.
        output = torch.zeros(x.size(0), self.num_classes, device=x.device)
        for model in self.models:
            output += model(x)
        return output

Source: https://www.kaggle.com/code/georgiisirotenko/pytorch-fruits-transferlearing-ensemble-test99-18/notebook

2. Using a list together with Counter

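from collections import Counter

import numpy as np
import torch
from tqdm import tqdm

# Assumes mlps (list of trained models), testloader, device, ep and
# valid_data_size are defined by the surrounding training script.
pre = []                            # per-model predictions for the current batch
mlps_correct = [0] * len(mlps)      # correct count of each individual model
vote_correct = 0                    # correct count of the majority vote
for mlp in mlps:
    mlp.eval()
with torch.no_grad():
    for img, label in tqdm(testloader):
        img, label = img.to(device), label.to(device)
        label_np = label.cpu().numpy()
        for i, mlp in enumerate(mlps):
            out = mlp(img)
            _, prediction = torch.max(out, 1)  # arg max over each row
            pre_num = prediction.cpu().numpy()
            mlps_correct[i] += (pre_num == label_np).sum()
            pre.append(pre_num)
        arr = np.array(pre)         # shape: (n_models, batch_size)
        pre.clear()
        # Use the real batch size: the last batch may be smaller than BATCHSIZE.
        result = np.array([Counter(arr[:, i]).most_common(1)[0][0] for i in range(arr.shape[1])])
        vote_correct += (result == label_np).sum()
print("epoch:" + str(ep) + " ensemble accuracy: " + str(vote_correct / valid_data_size))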

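Source: https://blog.csdn.net/weixin_42237113/article/details/108970920

Experimental Results of ResNet on CIFAR-10

To observe the effect of model ensembling, code was written to train ResNet on CIFAR-10 and compare the performance of each base learner. The benefit of ensembling is immediately visible; see the figure below.

[Figure: training and validation accuracy curves of the three ResNet base learners and the ensemble on CIFAR-10]

This experiment uses a voting ensemble of 3 learners, so 7 curves are plotted: each learner has 2 curves, one for training and one for validation, and the ensemble's result is reported as valid_acc (blue). The figure shows that, compared with any of the three base learners, the ensemble raises classification accuracy by a little over 3 percentage points, which is a substantial gain.

To draw this figure, the code defines a custom class MyEnsemble(VotingClassifier) and overrides its fit function so that information from the training process can be recorded.

Summary

This section briefly introduced torchensemble, a library of model ensemble methods for PyTorch. See the official documentation for details, and keep an eye on Kaggle solutions as well. Ensemble learning is a must-have in competitions and is also widely used in industrial projects, so it deserves careful study.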

Posted by HCHUNGW on PIXNET
