
Building templates_Follow Qingyuan Release_Download and install the latest Baidu app_Internet Marketer official site

2024/12/23 0:41:50  Source: https://blog.csdn.net/qq_64693987/article/details/143311287

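        The YOLOv11 network is built from stacked convolutional and pooling layers that progressively extract image features and run detection at several scales. It also relies on techniques such as multi-scale training to handle objects of different sizes and shapes.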

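1. The Multi-Scale Attention Aggregation (MSAA) module

        MSAA refines the features coming from the backbone. It processes them along two paths, one spatial and one channel-wise, so that the output feature map carries stronger information in both the spatial and the channel dimension.

        1. The spatial refinement path sums the outputs of convolutions with different kernel sizes and then applies a series of spatial feature aggregation operations, which fuses spatial information across scales.

        2. The channel aggregation path applies global average pooling, convolution, and an activation to produce a channel attention map. This map is combined with the spatially refined map, which fuses multi-scale information along the channel dimension.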
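The gating performed by the two paths can be illustrated with a small pure-Python sketch. It is a simplification rather than the module itself: the learned convolutions are left out (treated as identity), so only the pooling, the sigmoid, and the broadcasting of the attention maps are shown. The names `channel_gate`, `spatial_gate`, and `apply_gates` are made up for this illustration.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def channel_gate(x):
    """One weight per channel from global average pooling (learned convs omitted)."""
    gates = []
    for channel in x:
        values = [v for row in channel for v in row]
        gates.append(sigmoid(sum(values) / len(values)))
    return gates  # length C, broadcast over H x W


def spatial_gate(x):
    """One weight per pixel from the mean and max over channels (learned conv omitted)."""
    height, width = len(x[0]), len(x[0][0])
    gate = []
    for i in range(height):
        row = []
        for j in range(width):
            column = [channel[i][j] for channel in x]
            row.append(sigmoid(sum(column) / len(column) + max(column)))
        gate.append(row)
    return gate  # H x W, broadcast over channels


def apply_gates(x):
    """Scale every value by its channel weight and its pixel weight."""
    cg, sg = channel_gate(x), spatial_gate(x)
    return [
        [[x[c][i][j] * cg[c] * sg[i][j] for j in range(len(x[c][i]))]
         for i in range(len(x[c]))]
        for c in range(len(x))
    ]


# A toy feature map with 2 channels and 2 x 2 spatial size (assumed values)
features = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 2.0], [2.0, 0.0]],
]
refined = apply_gates(features)
```

The output keeps the shape of the input. Each value is only rescaled by a factor between 0 and 1, which is why attention of this kind can be inserted without changing the surrounding layers.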

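2. Combining YOLOv11 with MSAA

        The original paper places the MSAA module between the encoder and the decoder, where it connects the two and strengthens feature passing. This article therefore places the module right after each Concat layer in the neck, to compensate for insufficient feature extraction or weak multi-scale fusion when cross-layer features are concatenated.

3. MSAA code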

import torch
import torch.nn as nn


class ChannelAttentionModule(nn.Module):
    """Channel attention: pooled descriptors pass through a shared 1x1-conv bottleneck."""

    def __init__(self, in_channels, reduction=4):
        super(ChannelAttentionModule, self).__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.max_pool = nn.AdaptiveMaxPool2d(1)
        self.fc = nn.Sequential(
            nn.Conv2d(in_channels, in_channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels // reduction, in_channels, 1, bias=False),
        )
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        avg_out = self.fc(self.avg_pool(x))
        max_out = self.fc(self.max_pool(x))
        out = avg_out + max_out
        return self.sigmoid(out)  # shape (B, C, 1, 1)


class SpatialAttentionModule(nn.Module):
    """Spatial attention: channel-wise mean and max maps pass through one conv."""

    def __init__(self, kernel_size=7):
        super(SpatialAttentionModule, self).__init__()
        self.conv1 = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        avg_out = torch.mean(x, dim=1, keepdim=True)
        max_out, _ = torch.max(x, dim=1, keepdim=True)
        x = torch.cat([avg_out, max_out], dim=1)
        x = self.conv1(x)
        return self.sigmoid(x)  # shape (B, 1, H, W)


class MSAA(nn.Module):
    def __init__(self, in_channels, out_channels, factor=4.0):
        super(MSAA, self).__init__()
        dim = int(out_channels // factor)  # reduced working width
        self.down = nn.Conv2d(in_channels, dim, kernel_size=1, stride=1)
        self.conv_3x3 = nn.Conv2d(dim, dim, kernel_size=3, stride=1, padding=1)
        self.conv_5x5 = nn.Conv2d(dim, dim, kernel_size=5, stride=1, padding=2)
        self.conv_7x7 = nn.Conv2d(dim, dim, kernel_size=7, stride=1, padding=3)
        self.spatial_attention = SpatialAttentionModule()
        self.channel_attention = ChannelAttentionModule(dim)
        self.up = nn.Conv2d(dim, out_channels, kernel_size=1, stride=1)
        # Defined in the original code but not used in forward()
        self.down_2 = nn.Conv2d(in_channels, dim, kernel_size=1, stride=1)

    def forward(self, x):
        x = self.down(x)  # 1x1 conv: reduce channels to dim
        x = x * self.channel_attention(x)  # channel re-weighting
        x_3x3 = self.conv_3x3(x)
        x_5x5 = self.conv_5x5(x)
        x_7x7 = self.conv_7x7(x)
        x_s = x_3x3 + x_5x5 + x_7x7  # sum the multi-scale branches
        x_s = x_s * self.spatial_attention(x_s)  # spatial re-weighting
        x_out = self.up(x_s + x)  # residual add, then restore the channel count
        return x_out


if __name__ == '__main__':
    model = MSAA(256, 256)  # do not reuse the name MSAA, which would shadow the class
    # Create an input tensor
    batch_size = 8
    input_tensor = torch.randn(batch_size, 256, 64, 64)
    # Run the model and print the input and output shapes
    output_tensor = model(input_tensor)
    print("Input shape:", input_tensor.shape)
    print("Output shape:", output_tensor.shape)
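To get a feel for the cost of the block, the parameter count of the listing above can be worked out by hand. The sketch below does that arithmetic in plain Python, assuming standard dense convolutions (groups=1) and the bias settings used in the listing. The helper names `conv2d_params` and `msaa_param_count` are introduced here only for illustration.

```python
def conv2d_params(c_in, c_out, k, bias=True):
    """Weights of a dense k x k convolution (groups=1), plus the optional bias."""
    return c_in * c_out * k * k + (c_out if bias else 0)


def msaa_param_count(in_channels, out_channels, factor=4.0, reduction=4):
    """Per-layer parameter counts for the MSAA listing above."""
    dim = int(out_channels // factor)
    return {
        "down": conv2d_params(in_channels, dim, 1),
        "conv_3x3": conv2d_params(dim, dim, 3),
        "conv_5x5": conv2d_params(dim, dim, 5),
        "conv_7x7": conv2d_params(dim, dim, 7),
        # 7x7 conv over the stacked mean/max maps, no bias
        "spatial_attention": conv2d_params(2, 1, 7, bias=False),
        # two bias-free 1x1 convs forming the bottleneck
        "channel_attention": conv2d_params(dim, dim // reduction, 1, bias=False)
        + conv2d_params(dim // reduction, dim, 1, bias=False),
        "up": conv2d_params(dim, out_channels, 1),
        # defined in __init__ but never called in forward()
        "down_2": conv2d_params(in_channels, dim, 1),
    }


counts = msaa_param_count(256, 256)
total = sum(counts.values())
for name, value in counts.items():
    print(f"{name:18s} {value:8d}")
print(f"{'total':18s} {total:8d}")
```

For `MSAA(256, 256)` this gives 391,842 parameters, of which the 7x7 branch alone accounts for about half. The two attention modules are almost free by comparison.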

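4. Adding MSAA to YOLOv11

First: copy the core MSAA code from section 3 into the D:\bilibili\model\YOLO11\ultralytics-main\ultralytics\nn directory.

Second: import MSAA in tasks.py.

Third: in the model-configuration (parsing) section of tasks.py, add the code that handles MSAA.

Fourth: copy the model configuration below into the YOLOv11 YAML file.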
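The configuration in the fourth step writes MSAA with an empty argument list (`[-1, 1, MSAA, []]`), so the parser has to supply the channel counts itself. The sketch below shows one way that could be done. It is an assumption, not the code from tasks.py: `resolve_msaa_args` is a hypothetical helper, `ch` stands for the list of output channels of the layers built so far, and `f` is the layer's `from` index.

```python
def resolve_msaa_args(ch, f, args):
    """Hypothetical helper: derive MSAA(in_channels, out_channels) from its input layer."""
    c1 = ch[f]  # channels produced by the layer MSAA reads from
    c2 = c1     # keep the channel count so the following C3k2 sees the same width
    return c2, [c1, c2, *args]


# Illustrative values only: a Concat that joins 256 and 128 channels outputs 384
ch = [256, 128, 384]
c2, msaa_args = resolve_msaa_args(ch, -1, [])
print(c2, msaa_args)
```

Keeping the output width equal to the input width means the layers after each MSAA need no changes to their own arguments.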

# Ultralytics YOLO 🚀, AGPL-3.0 license
# YOLO11 object detection model with P3-P5 outputs. For Usage examples see https://docs.ultralytics.com/tasks/detect

# Parameters
nc: 80 # number of classes
scales: # model compound scaling constants, i.e. 'model=yolo11n.yaml' will call yolo11.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.50, 0.25, 1024] # summary: 319 layers, 2624080 parameters, 2624064 gradients, 6.6 GFLOPs
  s: [0.50, 0.50, 1024] # summary: 319 layers, 9458752 parameters, 9458736 gradients, 21.7 GFLOPs
  m: [0.50, 1.00, 512] # summary: 409 layers, 20114688 parameters, 20114672 gradients, 68.5 GFLOPs
  l: [1.00, 1.00, 512] # summary: 631 layers, 25372160 parameters, 25372144 gradients, 87.6 GFLOPs
  x: [1.00, 1.50, 512] # summary: 631 layers, 56966176 parameters, 56966160 gradients, 196.0 GFLOPs

# YOLO11n backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]] # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]] # 1-P2/4
  - [-1, 2, C3k2, [256, False, 0.25]]
  - [-1, 1, Conv, [256, 3, 2]] # 3-P3/8
  - [-1, 2, C3k2, [512, False, 0.25]]
  - [-1, 1, Conv, [512, 3, 2]] # 5-P4/16
  - [-1, 2, C3k2, [512, True]]
  - [-1, 1, Conv, [1024, 3, 2]] # 7-P5/32
  - [-1, 2, C3k2, [1024, True]]
  - [-1, 1, SPPF, [1024, 5]] # 9
  - [-1, 2, C2PSA, [1024]] # 10

# YOLO11n head
head:
  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 6], 1, Concat, [1]] # cat backbone P4
  - [-1, 1, MSAA, []] # 13 MSAA after the P4 concat
  - [-1, 2, C3k2, [512, False]] # 14

  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 4], 1, Concat, [1]] # cat backbone P3
  - [-1, 1, MSAA, []] # 17 MSAA after the P3 concat
  - [-1, 2, C3k2, [256, False]] # 18 (P3/8-small)

  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 14], 1, Concat, [1]] # cat head P4
  - [-1, 1, MSAA, []] # 21 MSAA after the head P4 concat
  - [-1, 2, C3k2, [512, False]] # 22 (P4/16-medium)

  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 10], 1, Concat, [1]] # cat head P5
  - [-1, 1, MSAA, []] # 25 MSAA after the head P5 concat
  - [-1, 2, C3k2, [1024, True]] # 26 (P5/32-large)

  - [[18, 22, 26], 1, Detect, [nc]] # Detect(P3, P4, P5)
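Inserting four extra layers shifts every later layer index, so the absolute `from` references in the head (14, 10, and the Detect inputs 18, 22, 26) are worth double-checking. The sketch below rebuilds the indices from the head listing in plain Python. It copies only the `from` field and the module name of each entry, and the variable and function names are illustrative.

```python
# Head of the configuration as (from, module) pairs; layers 0-10 are the backbone.
HEAD = [
    (-1, "nn.Upsample"),
    ([-1, 6], "Concat"),
    (-1, "MSAA"),
    (-1, "C3k2"),
    (-1, "nn.Upsample"),
    ([-1, 4], "Concat"),
    (-1, "MSAA"),
    (-1, "C3k2"),
    (-1, "Conv"),
    ([-1, 14], "Concat"),
    (-1, "MSAA"),
    (-1, "C3k2"),
    (-1, "Conv"),
    ([-1, 10], "Concat"),
    (-1, "MSAA"),
    (-1, "C3k2"),
    ([18, 22, 26], "Detect"),
]
BACKBONE_LAYERS = 11

# Absolute layer index -> module name for the head
index_of = {BACKBONE_LAYERS + i: module for i, (_, module) in enumerate(HEAD)}


def absolute_sources(layer_index, source):
    """Resolve negative (relative) from-indices to absolute layer indices."""
    sources = source if isinstance(source, list) else [source]
    return [layer_index + s if s < 0 else s for s in sources]


detect_index = max(index_of)
detect_inputs = absolute_sources(detect_index, HEAD[-1][0])
# The layer feeding each MSAA (its from is -1, so the previous layer)
msaa_inputs = {i: index_of[i - 1] for i, m in index_of.items() if m == "MSAA"}

print("Detect is layer", detect_index, "and reads", [(i, index_of[i]) for i in detect_inputs])
print("MSAA inputs:", msaa_inputs)
```

Under this numbering Detect becomes layer 27, its three inputs are the C3k2 blocks at 14-step offsets 18, 22, and 26, and every MSAA sits directly after a Concat, which matches the placement described in section 2.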

Fifth: train the model and confirm that it runs.


from ultralytics import YOLO

if __name__ == "__main__":
    # Build the model from the custom YOLOv11 YAML file, load the pretrained weights, and train
    model = YOLO(r"D:\bilibili\model\YOLO11\ultralytics-main\ultralytics\cfg\models\11\yolo11_MSAA.yaml").load(
        r"D:\bilibili\model\YOLO11\ultralytics-main\yolo11n.pt"
    )  # build from YAML and transfer weights
    results = model.train(
        data=r"D:\bilibili\model\ultralytics-main\ultralytics\cfg\datasets\VOC_my.yaml",
        epochs=100,
        imgsz=640,
        batch=8,
        amp=False,
    )
