(Experimental) Quantized Transfer Learning for Computer Vision Tutorial

Original: https://pytorch.org/tutorials/intermediate/quantized_transfer_learning_tutorial.html

Author: Zafar Takhirov

Reviewed by: Raghuraman Krishnamoorthi

Edited by: Jessica Lin

This tutorial builds on the original PyTorch Transfer Learning tutorial, written by Sasank Chilamkurthy.

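Transfer learning refers to techniques that apply a pretrained model to a different dataset. There are two main ways to use transfer learning:

1. ConvNet as a fixed feature extractor: Here, you "freeze" the weights of all the parameters in the network except those of the final several layers (also known as "the head", usually fully connected layers). These last layers are replaced with new ones initialized with random weights, and only these layers are trained.
2. Fine-tuning the ConvNet: Instead of random initialization, the model is initialized from a pretrained network, after which training proceeds as usual but on a different dataset. Usually the head (or part of it) is also replaced if the number of outputs differs. It is common in this method to set the learning rate to a smaller value, because the network is already trained and only minor changes are needed to "fine-tune" it to the new dataset.

You can also combine the two methods: first freeze the feature extractor and train the head. After that, unfreeze the feature extractor (or part of it), set the learning rate to a smaller value, and continue training.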
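The two strategies, and their combination, can be sketched on a toy network. This is a minimal illustration, not the tutorial's ResNet-18: the `backbone` and `head` modules, their sizes, and the learning rates are all made up for the example.

```python
import torch
from torch import nn, optim

# Hypothetical stand-ins for a pretrained feature extractor and its head.
backbone = nn.Sequential(nn.Linear(8, 16), nn.ReLU())
head = nn.Linear(16, 2)  # freshly initialized head for 2 classes
model = nn.Sequential(backbone, head)

def count_trainable(module):
    """Number of scalar parameters that will receive gradients."""
    return sum(p.numel() for p in module.parameters() if p.requires_grad)

# Stage 1: fixed feature extractor. Freeze the backbone, train only the head.
for p in backbone.parameters():
    p.requires_grad = False
optimizer = optim.SGD(head.parameters(), lr=0.01, momentum=0.9)
trainable_stage1 = count_trainable(model)  # head only: 16 * 2 + 2

# Stage 2: fine-tuning. Unfreeze everything and use a smaller learning rate.
for p in backbone.parameters():
    p.requires_grad = True
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
trainable_stage2 = count_trainable(model)  # backbone (8 * 16 + 16) plus head

print(trainable_stage1, trainable_stage2)
```

In stage 1 only the head's parameters are handed to the optimizer; in stage 2 a new optimizer is built over all parameters with a learning rate ten times smaller.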

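In this part you will use the first method: extracting the features using a quantized model.

Part 0. Prerequisites

Before diving into transfer learning, let us review the "prerequisites", such as installations and data loading/visualization.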

# Imports
import copy
import matplotlib.pyplot as plt
import numpy as np
import os
import time
plt.ion()

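Installing the Nightly Build

Because you will be using the experimental parts of PyTorch, it is recommended to install the latest versions of torch and torchvision. The most recent instructions for local installation are on the PyTorch website. For example, to install without GPU support: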

pip install numpy
pip install --pre torch torchvision -f https://download.pytorch.org/whl/nightly/cpu/torch_nightly.html
## For CUDA support use https://download.pytorch.org/whl/nightly/cu101/torch_nightly.html

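Load Data

Note

This section is identical to the original transfer learning tutorial.

We will use the torchvision and torch.utils.data packages to load the data.

The problem you are going to solve today is classifying ants and bees from images. The dataset contains about 120 training images each for ants and bees, and 75 validation images for each class. This is a very small dataset to generalize from. However, since we are using transfer learning, we should be able to generalize reasonably well.

This dataset is a very small subset of imagenet.

Note

Download the data from https://download.pytorch.org/tutorial/hymenoptera_data.zip and extract it to the data directory.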

import torch
from torchvision import transforms, datasets
## Data augmentation and normalization for training
## Just normalization for validation
data_transforms = {
    'train': transforms.Compose([
        transforms.Resize(224),
        transforms.RandomCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    'val': transforms.Compose([
        transforms.Resize(224),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
}
data_dir = 'data/hymenoptera_data'
image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x),
                                          data_transforms[x])
                  for x in ['train', 'val']}
dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=16,
                                              shuffle=True, num_workers=8)
              for x in ['train', 'val']}
dataset_sizes = {x: len(image_datasets[x]) for x in ['train', 'val']}
class_names = image_datasets['train'].classes
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

Visualize a few images

Let's visualize a few training images to get a sense of the data augmentations.

import torchvision
def imshow(inp, title=None, ax=None, figsize=(5, 5)):
  """Imshow for Tensor."""
  inp = inp.numpy().transpose((1, 2, 0))
  mean = np.array([0.485, 0.456, 0.406])
  std = np.array([0.229, 0.224, 0.225])
  inp = std * inp + mean
  inp = np.clip(inp, 0, 1)
  if ax is None:
    fig, ax = plt.subplots(1, figsize=figsize)
  ax.imshow(inp)
  ax.set_xticks([])
  ax.set_yticks([])
  if title is not None:
    ax.set_title(title)
## Get a batch of training data
inputs, classes = next(iter(dataloaders['train']))
## Make a grid from batch
out = torchvision.utils.make_grid(inputs, nrow=4)
fig, ax = plt.subplots(1, figsize=(10, 10))
imshow(out, title=[class_names[x] for x in classes], ax=ax)
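The `imshow` helper above undoes the `transforms.Normalize` step before plotting. A small NumPy sketch of that round trip follows, using a random array as a stand-in for an image (the array is made up; the mean and std are the ones used in this tutorial).

```python
import numpy as np

mean = np.array([0.485, 0.456, 0.406])
std = np.array([0.229, 0.224, 0.225])

# Stand-in for an image in H x W x C layout with values in [0, 1).
rng = np.random.default_rng(0)
image = rng.random((4, 4, 3))

# Per-channel normalization, as transforms.Normalize applies it: (x - mean) / std.
normalized = (image - mean) / std

# What imshow does to get displayable values back: x * std + mean, then clip.
restored = np.clip(normalized * std + mean, 0, 1)

print(np.allclose(restored, image))
```

The clip to [0, 1] only guards against small floating-point overshoot; matplotlib expects float images in that range.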

Support Function for Model Training

Below is a generic function for model training. This function also:

Schedules the learning rate
Saves the best model

def train_model(model, criterion, optimizer, scheduler, num_epochs=25, device='cpu'):
  """
  Support function for model training.
  Args:
    model: Model to be trained
    criterion: Optimization criterion (loss)
    optimizer: Optimizer to use for training
    scheduler: Instance of ``torch.optim.lr_scheduler``
    num_epochs: Number of epochs
    device: Device to run the training on. Must be 'cpu' or 'cuda'
  """
  since = time.time()
  best_model_wts = copy.deepcopy(model.state_dict())
  best_acc = 0.0
  for epoch in range(num_epochs):
    print('Epoch {}/{}'.format(epoch, num_epochs - 1))
    print('-' * 10)
    # Each epoch has a training and validation phase
    for phase in ['train', 'val']:
      if phase == 'train':
        model.train()  # Set model to training mode
      else:
        model.eval()   # Set model to evaluate mode
      running_loss = 0.0
      running_corrects = 0
      # Iterate over data.
      for inputs, labels in dataloaders[phase]:
        inputs = inputs.to(device)
        labels = labels.to(device)
        # zero the parameter gradients
        optimizer.zero_grad()
        # forward
        # track history if only in train
        with torch.set_grad_enabled(phase == 'train'):
          outputs = model(inputs)
          _, preds = torch.max(outputs, 1)
          loss = criterion(outputs, labels)
          # backward + optimize only if in training phase
          if phase == 'train':
            loss.backward()
            optimizer.step()
        # statistics
        running_loss += loss.item() * inputs.size(0)
        running_corrects += torch.sum(preds == labels.data)
      if phase == 'train':
        scheduler.step()
      epoch_loss = running_loss / dataset_sizes[phase]
      epoch_acc = running_corrects.double() / dataset_sizes[phase]
      print('{} Loss: {:.4f} Acc: {:.4f}'.format(
        phase, epoch_loss, epoch_acc))
      # deep copy the model
      if phase == 'val' and epoch_acc > best_acc:
        best_acc = epoch_acc
        best_model_wts = copy.deepcopy(model.state_dict())
    print()
  time_elapsed = time.time() - since
  print('Training complete in {:.0f}m {:.0f}s'.format(
    time_elapsed // 60, time_elapsed % 60))
  print('Best val Acc: {:4f}'.format(best_acc))
  # load best model weights
  model.load_state_dict(best_model_wts)
  return model
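`train_model` stores `copy.deepcopy(model.state_dict())` rather than the state dict itself. The sketch below, on a throwaway `nn.Linear` layer that is not part of the tutorial, suggests why: the tensors in a state dict share storage with the live parameters, so without the deep copy the "best" weights would silently follow later updates.

```python
import copy
import torch
from torch import nn

layer = nn.Linear(2, 1)  # throwaway layer, for illustration only

snapshot_ref = layer.state_dict()                  # shares storage with the layer
snapshot_copy = copy.deepcopy(layer.state_dict())  # independent copy

# Simulate a later optimizer step that changes the weights in place.
with torch.no_grad():
    layer.weight.add_(1.0)

# Does each snapshot still match the (now modified) live weights?
ref_follows = torch.equal(snapshot_ref['weight'], layer.weight)
copy_follows = torch.equal(snapshot_copy['weight'], layer.weight)
print(ref_follows, copy_follows)
```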

Support Function for Visualizing the Model Predictions

A generic function to display predictions for a few images.

def visualize_model(model, rows=3, cols=3):
  was_training = model.training
  model.eval()
  current_row = current_col = 0
  fig, ax = plt.subplots(rows, cols, figsize=(cols*2, rows*2))
  with torch.no_grad():
    for idx, (imgs, lbls) in enumerate(dataloaders['val']):
      imgs = imgs.cpu()
      lbls = lbls.cpu()
      outputs = model(imgs)
      _, preds = torch.max(outputs, 1)
      for jdx in range(imgs.size()[0]):
        imshow(imgs.data[jdx], ax=ax[current_row, current_col])
        ax[current_row, current_col].axis('off')
        ax[current_row, current_col].set_title('predicted: {}'.format(class_names[preds[jdx]]))
        current_col += 1
        if current_col >= cols:
          current_row += 1
          current_col = 0
        if current_row >= rows:
          model.train(mode=was_training)
          return
    model.train(mode=was_training)

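Part 1. Training a Custom Classifier based on a Quantized Feature Extractor

In this section you will use a "frozen" quantized feature extractor and train a custom classifier head on top of it. Unlike floating point models, you don't need to set requires_grad=False for the quantized model, as it has no trainable parameters. Please refer to the documentation for more details.

Load a pretrained model: for this exercise you will be using ResNet-18.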

import torchvision.models.quantization as models
## You will need the number of filters in the `fc` for future use.
## Here the size of each output sample is set to 2.
## Alternatively, it can be generalized to nn.Linear(num_ftrs, len(class_names)).
model_fe = models.resnet18(pretrained=True, progress=True, quantize=True)
num_ftrs = model_fe.fc.in_features

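At this point you need to modify the pretrained model. The model has quantize/dequantize blocks at the beginning and the end. However, because you will only use the feature extractor, the dequantization layer has to move to right before the linear layer (the head). The easiest way to do that is to wrap the model in an nn.Sequential module.

The first step is to isolate the feature extractor in the ResNet model. Although in this example you are asked to use all layers except fc as the feature extractor, in practice you can take as many parts as you need. This is useful if you would like to replace some of the convolutional layers as well.

Note:

When separating the feature extractor from the rest of the quantized model, you have to manually place the quantizer/dequantizer at the beginning and the end of the parts you want to keep quantized.

The function below creates a model with a custom head.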

from torch import nn
def create_combined_model(model_fe):
  # Step 1. Isolate the feature extractor.
  model_fe_features = nn.Sequential(
    model_fe.quant,  # Quantize the input
    model_fe.conv1,
    model_fe.bn1,
    model_fe.relu,
    model_fe.maxpool,
    model_fe.layer1,
    model_fe.layer2,
    model_fe.layer3,
    model_fe.layer4,
    model_fe.avgpool,
    model_fe.dequant,  # Dequantize the output
  )
  # Step 2. Create a new "head"
  new_head = nn.Sequential(
    nn.Dropout(p=0.5),
    nn.Linear(num_ftrs, 2),
  )
  # Step 3. Combine, and don't forget the quant stubs.
  new_model = nn.Sequential(
    model_fe_features,
    nn.Flatten(1),
    new_head,
  )
  return new_model
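The layout used by `create_combined_model` can be tried out without downloading pretrained weights. The toy model below is a stand-in built on assumptions: the layer sizes are invented, and `QuantStub`/`DeQuantStub` are the float-mode placeholders from `torch.quantization`, which pass tensors through unchanged until a model is converted. It only shows where the stubs sit relative to the float head.

```python
import torch
from torch import nn
from torch.quantization import QuantStub, DeQuantStub

# Toy feature extractor: the part that would be kept quantized.
toy_features = nn.Sequential(
    QuantStub(),                                  # float -> quantized boundary
    nn.Conv2d(3, 4, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    DeQuantStub(),                                # quantized -> float boundary
)

# Float head, placed after the dequantization point.
toy_head = nn.Sequential(nn.Dropout(p=0.5), nn.Linear(4, 2))

toy_model = nn.Sequential(toy_features, nn.Flatten(1), toy_head)
toy_model.eval()

with torch.no_grad():
    logits = toy_model(torch.randn(5, 3, 8, 8))
print(logits.shape)
```

Everything between the two stubs is what a later quantization step would turn into quantized modules; the head stays in floating point and remains trainable.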

Warning

Currently, quantized models can only be run on CPU. However, it is possible to send the non-quantized parts of the model to a GPU.

import torch.optim as optim
new_model = create_combined_model(model_fe)
new_model = new_model.to('cpu')
criterion = nn.CrossEntropyLoss()
## Note that we are only training the head.
optimizer_ft = optim.SGD(new_model.parameters(), lr=0.01, momentum=0.9)
## Decay LR by a factor of 0.1 every 7 epochs
exp_lr_scheduler = optim.lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)
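The `StepLR` settings above multiply the learning rate by `gamma` once every `step_size` epochs. In closed form this is `lr0 * gamma ** (epoch // step_size)`; the plain-Python helper below (`step_lr` is a hypothetical name, not a PyTorch function) evaluates that expression for the values used here.

```python
def step_lr(base_lr, gamma, step_size, epoch):
    """Learning rate at a given epoch under a step-decay schedule."""
    return base_lr * gamma ** (epoch // step_size)

# Values used for the classifier head in this part.
schedule = [step_lr(0.01, 0.1, 7, epoch) for epoch in range(25)]

# The rate stays flat within each 7-epoch window and drops tenfold between them.
for epoch in (0, 6, 7, 14, 21, 24):
    print(epoch, schedule[epoch])
```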

Train and evaluate

This step takes around 15-25 minutes on CPU. Because the quantized model can only run on the CPU, you cannot run the training on GPU.

new_model = train_model(new_model, criterion, optimizer_ft, exp_lr_scheduler,
                        num_epochs=25, device='cpu')
visualize_model(new_model)
plt.tight_layout()

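Part 2. Fine-tuning the Quantizable Model

In this part, we fine-tune the feature extractor used for transfer learning, and then quantize it. Note that in both Part 1 and Part 2 the feature extractor is quantized. The difference is that in Part 1 we use a pretrained quantized model, whereas in this part we create a quantized feature extractor after fine-tuning on the dataset of interest. This is a way to get better accuracy with transfer learning while keeping the benefits of quantization. Note that in our specific example the training set is really small (120 images), so the benefit of fine-tuning the entire model is not apparent. However, the procedure shown here can improve accuracy for transfer learning with larger datasets.

The pretrained feature extractor must be quantizable. To make sure it is quantizable, perform the following steps:

1. Fuse (Conv, BN, ReLU), (Conv, BN), and (Conv, ReLU) using torch.quantization.fuse_modules.
2. Connect the feature extractor with a custom head. This requires dequantizing the output of the feature extractor.
3. Insert fake-quantization modules at appropriate locations in the feature extractor to mimic quantization during training.

For step (1), we use models from torchvision/models/quantization, which have a member method fuse_model. This function fuses all the conv, bn, and relu modules. For custom models, this would require calling the torch.quantization.fuse_modules API manually with the list of modules to fuse.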
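As background on what fusing a convolution with batch norm means, the NumPy sketch below folds batch-norm statistics into the weights of a preceding linear map (standing in for a 1x1 convolution). All numbers are random and the variable names are made up. It shows the inference-time arithmetic only; during quantization-aware training the fused module still contains the batch-norm statistics.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(10, 3))          # 10 samples, 3 input channels

# "Convolution" parameters (a 1x1 conv is a per-pixel linear map).
W = rng.normal(size=(4, 3))
b = rng.normal(size=4)

# Batch-norm parameters and running statistics for the 4 output channels.
gamma = rng.normal(size=4)
beta = rng.normal(size=4)
running_mean = rng.normal(size=4)
running_var = rng.random(4) + 0.5     # strictly positive
eps = 1e-5

# Unfused: conv followed by batch norm in eval mode.
y = x @ W.T + b
unfused = gamma * (y - running_mean) / np.sqrt(running_var + eps) + beta

# Fused: fold the batch-norm scale and shift into the conv weights and bias.
factor = gamma / np.sqrt(running_var + eps)
W_fused = W * factor[:, None]
b_fused = (b - running_mean) * factor + beta
fused = x @ W_fused.T + b_fused

print(np.allclose(unfused, fused))
```

Because the two paths agree, a fused layer can be quantized as a single operation instead of three.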

Step (2) is performed by the create_combined_model function used in the previous section.

Step (3) is achieved by using torch.quantization.prepare_qat, which inserts fake-quantization modules.
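Conceptually, a fake-quantization module rounds values to the grid an 8-bit tensor could represent and immediately converts them back to float, so the network trains against quantization error. The helper below is a plain-Python sketch of that arithmetic for a single number; `fake_quantize` and the chosen `scale` and `zero_point` are illustrative assumptions, not the values `prepare_qat` would pick.

```python
def fake_quantize(x, scale, zero_point, qmin=0, qmax=255):
    """Quantize x to an integer grid, then map it straight back to float."""
    q = round(x / scale) + zero_point      # nearest integer level
    q = max(qmin, min(qmax, q))            # clamp to the representable range
    return (q - zero_point) * scale        # dequantize

scale, zero_point = 0.05, 128              # made-up quantization parameters

inside = fake_quantize(1.234, scale, zero_point)    # snaps to the nearest step
too_big = fake_quantize(100.0, scale, zero_point)   # saturates at the top level

print(inside, too_big)
```

Values inside the representable range move by at most half a step (`scale / 2`), while values outside it are clamped to the nearest end of the range.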

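As step (4), you can start "fine-tuning" the model, and after that convert it to a fully quantized version (step 5).

To convert the fine-tuned model into a quantized model, you can call the torch.quantization.convert function (in our case only the feature extractor is quantized).

Note:

Because of the random initialization, your results might differ from the results shown in this tutorial.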

# Notice `quantize=False`
model = models.resnet18(pretrained=True, progress=True, quantize=False)
num_ftrs = model.fc.in_features

# Step 1
model.train()
model.fuse_model()
# Step 2
model_ft = create_combined_model(model)
model_ft[0].qconfig = torch.quantization.default_qat_qconfig  # Use default QAT configuration
# Step 3
model_ft = torch.quantization.prepare_qat(model_ft, inplace=True)

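Fine-tuning the model

In the current tutorial the whole model is fine-tuned. In general, this leads to higher accuracy. However, because the training set used here is small, we end up overfitting to the training set.

Step 4. Fine-tune the model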

for param in model_ft.parameters():
  param.requires_grad = True
model_ft.to(device)  # We can fine-tune on GPU if available
criterion = nn.CrossEntropyLoss()
## Note that we are training everything, so the learning rate is lower
## Notice the smaller learning rate
optimizer_ft = optim.SGD(model_ft.parameters(), lr=1e-3, momentum=0.9, weight_decay=0.1)
## Decay LR by a factor of 0.3 every several epochs
exp_lr_scheduler = optim.lr_scheduler.StepLR(optimizer_ft, step_size=5, gamma=0.3)
model_ft_tuned = train_model(model_ft, criterion, optimizer_ft, exp_lr_scheduler,
                             num_epochs=25, device=device)

Step 5. Convert to quantized model

from torch.quantization import convert
model_ft_tuned.cpu()
model_quantized_and_trained = convert(model_ft_tuned, inplace=False)

Let's see how the quantized model performs on a few images.

visualize_model(model_quantized_and_trained)
plt.ioff()
plt.tight_layout()
plt.show()
