Building custom architectures with HuggingFace 🤗

Community Article · Published April 22, 2024


Baseline

In this section we create a baseline model and train it. For this example, we train a simple CNN model on the MNIST dataset.

import torch
from torch import nn, optim
import torchvision
from torchvision import datasets, transforms
import torch.nn.functional as F
from torch.utils.data import DataLoader

train_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transforms.ToTensor())

# Define batch size and number of workers (if any) for data loading
batch_size = 64
num_workers = 2

# Create a DataLoader for the training dataset with specified batch size and number of workers
train_dataloader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True, num_workers=num_workers)

Then we define our model:

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        self.conv2_drop = nn.Dropout2d()
        self.fc1 = nn.Linear(320, 50)
        self.fc2 = nn.Linear(50, 10)
        self.softmax = nn.Softmax(dim=-1)

    def forward(self, x):
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
        x = x.view(-1, 320)
        x = F.relu(self.fc1(x))
        x = F.dropout(x, training=self.training)
        x = self.fc2(x)
        output = self.softmax(x)

        return output

Then we train our model and save its weights:

model = Net()
criterion = nn.CrossEntropyLoss()
learning_rate = 0.01
optimizer = optim.Adam(model.parameters(), lr=learning_rate)
epochs = 10
for epoch in range(epochs):
    running_loss = 0.0
    for i, data in enumerate(train_dataloader, 0):
        inputs, labels = data[0], data[1]
        optimizer.zero_grad()

        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
        if i % 20 == 19:    # print every 20 mini-batches
            print('Epoch [%d/%d], Step [%d/%d], Loss: %.3f' %
                  (epoch + 1, epochs, i + 1, len(train_dataloader), running_loss / 20))
            running_loss = 0.0


# Save the entire model and other necessary information
checkpoint = {
    'state_dict': model.state_dict(),
}
# Specify the file path where you want to save the model
torch.save(checkpoint, 'model.pth')

Custom model

To create a 🤗-friendly custom architecture, we need 3 files:

  1. MyConfig.py: the file that defines the configuration
  2. MyModel.py: the file that defines the model architecture
  3. MyPipe.py: the file that defines the pipeline

All of these files must be defined outside the main Python interpreter. We do this because it is what allows our dependencies and custom architecture to be uploaded automatically.
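For instance, from a notebook you could lay the package out like this (a minimal sketch; the actual file contents are covered in the next sections):

from pathlib import Path

# create the package folder that will hold the custom code
Path("MyFolder").mkdir(exist_ok=True)
# an empty __init__.py makes MyFolder importable as a package
Path("MyFolder/__init__.py").touch()
# MyConfig.py, MyModel.py, and MyPipe.py will go inside MyFolder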

Configuration

The configuration file stores the information about the architecture that is used to instantiate the model. In my case I chose to store only two parameters, for the conv1 and conv2 layers; you can choose to add more.

from transformers import PretrainedConfig

class MnistConfig(PretrainedConfig):
    # since we have an image classification task
    # we need to put a model type that is close to our task
    # don't worry this will not affect our model
    model_type = "MobileNetV1"
    def __init__(
        self,
        conv1=10,
        conv2=20,
        **kwargs):
        self.conv1 = conv1
        self.conv2 = conv2
        super().__init__(**kwargs)

.
├── MyFolder
│   ├── __init__.py
│   └── MyConfig.py
└── model.pth
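You can sanity-check the configuration locally: save_pretrained writes a config.json that records our two parameters and the model_type (a quick check, assuming the files above):

from MyFolder.MyConfig import MnistConfig

conf = MnistConfig(conv1=10, conv2=20)
# writes checkpoint/config.json containing conv1, conv2, and model_type
conf.save_pretrained("checkpoint")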

Model

For the model, we need to inherit from the PreTrainedModel class and pass the previously defined configuration to config_class. Don't forget to instantiate the model using the config parameter.

from transformers import PreTrainedModel
from .MyConfig import MnistConfig # local import
from torch import nn
import torch.nn.functional as F

class MnistModel(PreTrainedModel):
    # pass the previously defined config class to the model
    config_class = MnistConfig

    def __init__(self, config):
        # instantiate the model using the configuration
        super().__init__(config)
        # use the config to instantiate our model
        self.conv1 = nn.Conv2d(1, config.conv1, kernel_size=5)
        self.conv2 = nn.Conv2d(config.conv1, config.conv2, kernel_size=5)
        self.conv2_drop = nn.Dropout2d()
        self.fc1 = nn.Linear(320, 50)
        self.fc2 = nn.Linear(50, 10)
        self.softmax = nn.Softmax(dim=-1)
        self.criterion = nn.CrossEntropyLoss()

    def forward(self, x, labels=None):
        # the labels parameter allows us to finetune our model
        # with the Trainer API easily
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
        x = x.view(-1, 320)
        x = F.relu(self.fc1(x))
        x = F.dropout(x, training=self.training)
        x = self.fc2(x)
        logits = self.softmax(x)
        if labels is not None:
            # this will make your AI compatible with the Trainer API
            loss = self.criterion(logits, labels)
            return {"loss": loss, "logits": logits}
        return logits

The labels parameter makes your model compatible with the Trainer API. Here is a notebook that shows how to use it; a minimal sketch also follows below.
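For illustration, here is a minimal sketch of finetuning with the Trainer API. The MnistDictDataset wrapper is a hypothetical helper (not part of this article's repo): Trainer feeds each batch to the model as keyword arguments, so items must be dicts whose keys match forward(x, labels).

import torch
from transformers import Trainer, TrainingArguments
from MyFolder.MyConfig import MnistConfig
from MyFolder.MyModel import MnistModel

class MnistDictDataset(torch.utils.data.Dataset):
    # hypothetical wrapper: turns (image, label) tuples into the
    # {"x": ..., "labels": ...} dicts that forward(x, labels) expects
    def __init__(self, ds):
        self.ds = ds
    def __len__(self):
        return len(self.ds)
    def __getitem__(self, i):
        image, label = self.ds[i]
        return {"x": image, "labels": label}

trainer = Trainer(
    model=MnistModel(MnistConfig()),
    args=TrainingArguments(output_dir="mnist-trainer", num_train_epochs=1, report_to="none"),
    train_dataset=MnistDictDataset(train_dataset),  # the MNIST dataset from the baseline section
)
trainer.train()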

.
├── MyFolder
│   ├── __init__.py
│   ├── MyConfig.py
│   └── MyModel.py
└── model.pth

Push to the Hub 🤗

First we need to log in using a TOKEN with write access:

from huggingface_hub import notebook_login
notebook_login()

Then load the model and register it to an auto class:

from MyFolder.MyConfig import MnistConfig
from MyFolder.MyModel import MnistModel
import torch

conf = MnistConfig()
HF_Model = MnistModel(conf) # instantiate the model using the config

# load the weights
weights = torch.load("model.pth")
HF_Model.load_state_dict(weights['state_dict'])

conf.register_for_auto_class()
HF_Model.register_for_auto_class("AutoModelForImageClassification")

Finally, push our config and model to the Hub 🤗:

conf.push_to_hub('MyRepo')
HF_Model.push_to_hub('MyRepo')

Your model should now be available in your own repository.

Custom pipeline

Understanding the workflow

Let's call the model we defined earlier and use it to classify a new image:

from transformers import AutoModelForImageClassification
model = AutoModelForImageClassification.from_pretrained("not-lain/MyRepo", trust_remote_code=True)

# download an image from the web
import requests
url = "https://huggingface.co/datasets/not-lain/dependencies/resolve/main/7.webp" 
response = requests.get(url, stream=True)
response.raise_for_status()  # Raise an HTTPError for bad responses (4xx and 5xx)

# Open a local file to save the image
with open("image.png", "wb") as f:
    for chunk in response.iter_content(chunk_size=8192):
        f.write(chunk)
print("image saved as image.png")

# load and process the image
from PIL import Image
import torchvision.transforms as transforms
import torch
img = Image.open("image.png") # read image
gray = img.convert('L') # convert to grayscale if needed
print(gray.size) # get image dimensions
# >> (1490, 1480)
# process input
transform = transforms.Compose(
    [transforms.ToTensor(), # convert to a torch tensor
     transforms.Resize((28,28), antialias=True) # resize img
     ])
tensor = transform(gray) # apply to input
tensor = tensor.unsqueeze(0) # add extra dimensionality, think batch_size = 1
with torch.no_grad():
  out = model(tensor) # calculate the output
label = torch.argmax(out, axis=-1) # get class
print(label.tolist()[0]) # extract the label
# >> 7

Creating the pipeline

Let's automate this process with a custom pipeline, and build a slightly more elaborate one to cover most use cases:

from transformers import Pipeline
import requests
from PIL import Image
import torchvision.transforms as transforms
import torch

class MnistPipe(Pipeline):
    def __init__(self, **kwargs):

        # self.tokenizer = (...) # code if you want to instantiate more parameters

        Pipeline.__init__(self, **kwargs)  # self.model automatically instantiated here

        self.transform = transforms.Compose(
            [transforms.ToTensor(),
             transforms.Resize((28, 28), antialias=True)]
        )

    def _sanitize_parameters(self, **kwargs):
        # decide where each parameter goes
        preprocess_kwargs = {}
        postprocess_kwargs = {}
        if "download" in kwargs:
            preprocess_kwargs["download"] = kwargs["download"]
        if "clean_output" in kwargs:
            postprocess_kwargs["clean_output"] = kwargs["clean_output"]
        return preprocess_kwargs, {}, postprocess_kwargs

    def preprocess(self, inputs, download=False):
        if download:
            # call the download_img method and save the image as "image.png"
            self.download_img(inputs)
            inputs = "image.png"

        # open and process the image
        img = Image.open(inputs)
        gray = img.convert('L')
        tensor = self.transform(gray)
        tensor = tensor.unsqueeze(0)
        return tensor

    def _forward(self, tensor):
        with torch.no_grad():
            # the model has been automatically instantiated
            # in the __init__ method
            out = self.model(tensor)
        return out

    def postprocess(self, out, clean_output=True):
        if clean_output:
            label = torch.argmax(out, axis=-1)  # get class
            return label.tolist()[0]
        return out

    def download_img(self, url):
        # when download=True, fetch the image and save it as image.png
        response = requests.get(url, stream=True)

        with open("image.png", "wb") as f:
            for chunk in response.iter_content(chunk_size=8192):
                f.write(chunk)
        print("image saved as image.png")

Let's walk through our pipeline:

  • When the model is instantiated with pipe = pipeline(...), the parameters are passed to the __init__ method
  • When the previously defined pipeline is called with pipe(...), the parameters are passed to the _sanitize_parameters method, which splits them up and routes them to one of the following:
    • the preprocess method: usually used to clean up the input; in our case it loads the image, converts it to grayscale, and turns it into a torch tensor
    • the _forward method: mainly used to call our model and predict the output
    • the postprocess method: usually used to clean up our output; in our case, if the clean_output parameter is not True it returns the raw output, otherwise it applies argmax and extracts the label for us
    • the download_img method: a custom method I added to our architecture; it is not required when creating a pipeline. In the example above, if the download parameter is True, the preprocess method calls it to download the image

When using pipe(...), the following methods are called in order (a rough sketch follows the list):

  1. _sanitize_parameters: decides where each keyword argument goes
  2. preprocess: cleans up the input
  3. _forward: runs the AI
  4. postprocess: cleans up the output
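As a rough illustration (this skips the batching and device handling that the real Pipeline.__call__ adds), once the pipeline is instantiated as in the next section, a call is routed through the stages approximately like this:

# rough sketch: pipe("image.png", clean_output=False) behaves approximately like
preprocess_kwargs, forward_kwargs, postprocess_kwargs = pipe._sanitize_parameters(clean_output=False)
tensor = pipe.preprocess("image.png", **preprocess_kwargs)  # download defaults to False
out = pipe._forward(tensor, **forward_kwargs)               # runs the model under no_grad
result = pipe.postprocess(out, **postprocess_kwargs)        # raw logits, since clean_output=False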

Don't forget to save your code in external files, since this automates the code-pushing process for us:

.
├── MyFolder
│   ├── __init__.py
│   ├── MyConfig.py
│   ├── MyModel.py
│   └── MyPipe.py
└── model.pth

Push to the Hub 🤗

You will need transformers>=4.40.0:

pip install "transformers>=4.40.0"

from MyFolder.MyPipe import MnistPipe
from transformers.pipelines import PIPELINE_REGISTRY
from transformers import pipeline, AutoModelForImageClassification


# register pipeline
PIPELINE_REGISTRY.register_pipeline(
    "image-classification", # or any other custom task 
    pipeline_class=MnistPipe,
    pt_model=AutoModelForImageClassification,
    # Optional parameters :
    # select a default revision/branch/commit_hash for the model
    # default={"pt": ("not-lain/MyRepo", "dba8d15072d743b6cb4a707246f801699897fb72")},
    type="image",  # current support type: text, audio, image, multimodal
)
# call the pipeline
pipe = pipeline(
    # Optional: pass the task used above here
    # "image-classification",
    model="not-lain/MyRepo",
    trust_remote_code=True,
)
# upload to 🤗
pipe.push_to_hub('not-lain/MyRepo')

All done! Now you can use your new pipeline:

from transformers import pipeline
# no need to specify what task we are using
pipe = pipeline(model="not-lain/MyRepo", trust_remote_code=True)
pipe(
    "https://huggingface.co/datasets/not-lain/dependencies/resolve/main/7.webp",
    download=True,       # will call the download_img method
    clean_output=False,  # will be passed as postprocess_kwargs
)
# >> image saved as image.png
# >> tensor([[0., 0., 0., 0., 0., 0., 0., 1., 0., 0.]])

pipe("image.png")
# >> 7

pipe.download_img("https://huggingface.co/datasets/not-lain/dependencies/resolve/main/7.webp")
# >> image saved as image.png

Finally, add a README.md file to your repository so everyone knows how to use your custom architecture 🥳
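If you like, you can push one from code as well; here is a minimal sketch using huggingface_hub's ModelCard (the card content is just a placeholder):

from huggingface_hub import ModelCard

# placeholder content; a real card should describe usage, license, etc.
card = ModelCard(
    "# MyRepo\n\n"
    "A small MNIST CNN with a custom architecture and a custom pipeline.\n\n"
    "Load it with pipeline(model='not-lain/MyRepo', trust_remote_code=True).\n"
)
card.push_to_hub("not-lain/MyRepo")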

Resources

| Repository | Custom code | Custom pipeline | Notes |
|---|---|---|---|
| not-lain/MyRepo | ✅ | ✅ | small, easy-to-understand code |
| vikhyatk/moondream1 | ✅ | ❌ | large architecture, pipeline can be found here |
| microsoft/phi-2 | ✅ | 🟡 | large architecture, working pipeline |
| Qwen/Qwen-VL-Chat | ✅ | ❌ | large architecture, no pipeline yet |
| tiiuae/falcon-7b | ✅ | 🟡 | large architecture, working pipeline |
| briaai/RMBG-1.4 | ✅ | ✅ | large architecture, working pipeline |

📺 YouTube: https://www.youtube.com/watch?v=9gZ7LvEJRBo

🌐 How to reach me: https://not-lain.github.io/
