APPLICATION

YOLOv12交通拥堵检测

在这次教程中，我们将探讨YOLOv12的独特之处，并实现一个交通检测管道，该管道可以统计车辆数量并根据新加坡—柔佛长堤的交通状况进行分类，使用Google Colab。

admin

Apr 23, 2025 • 15 min read

新加坡—柔佛长堤是世界上最繁忙的陆地海关之一。每年有近1亿至1.4亿人通过这个陆地海关。如果有一个物体检测系统可以帮助建立一个检测系统，用来统计车辆数量，一旦超过阈值，就会发送关于道路状况的通知，那岂不是很好？

通过利用像YOLOv12这样的目标检测模型，我们可以自动化识别车辆、计数以及分类交通状况（拥堵或不拥堵）。这些实时信息可以用于优化交通流量、提高事故响应速度并增强整体交通管理。

1、YOLOv12简介

在2025年2月，YOLOv12引入了以注意力为中心的架构，标志着对传统卷积神经网络（CNN）依赖性的突破（YOLOv12 GitHub）。

与YOLOv11的主要区别包括：区域注意力（A²）模块和残差高效层聚合网络（R-ELAN）

YOLOv12在延迟-准确性（左）和浮点运算次数（FLOPs）-准确性（右）之间的权衡上比之前的对象检测模型更具性能优势来源

YOLOv12-N实现了40.6%的平均精度均值（mAP），比YOLOv11-N高出1.2%，同时保持相似的推理延迟，如YOLOv12论文所述。其基于注意力的架构可能在复杂场景中增强了泛化能力，相比YOLOv11。此外，YOLOv12在实时检测器如RT-DETR和RT-DETRv2方面表现出色，实现了42%更快的推理速度，同时减少了参数和计算需求。(YOLOv12论文)

1.1 mAP vs 延迟

YOLOv12在竞争或更好的延迟下实现了更高的mAP，如红色曲线所示。较大的YOLOv12-X模型达到了55.2%的mAP，比YOLOv11-X提高了0.6%，反映了准确性和效率的提升。这些进步使YOLOv12在需要快速且准确检测的时间关键应用中非常有效。

1.2 mAP vs FLOPs（每秒浮点运算次数）

代表YOLOv12的红色曲线始终位于其他模型之上，表明在等效或更低的FLOPs下具有更高的准确性。这种效率对于部署在资源受限设备上的模型（如边缘和移动平台）至关重要。YOLOv12在不同模型大小上有效扩展，优于之前模型，具有相似或更少的FLOPs。例如，YOLOv12-L在88.9 GFLOPs下达到53.7%的mAP，而YOLOv11-L在86.9 GFLOPs下达到53.3%的mAP，突显了其架构优化的优势。(YOLOv12论文)

2、YOLOv12架构

(a) YOLOv12目标检测模型的架构，集成了区域注意力（A²）模块、R-ELAN块和精简的检测头。来源

主要创新：

区域注意力（A²）模块：通过分割特征图以实现局部但广泛的注意力，减少普通注意力的复杂性。
残差高效层聚合网络（R-ELAN）：增强特征重用和梯度流动，同时最小化计算开销。
闪存注意力：通过消除位置编码、减少内存访问开销和最小化操作数来提升性能。

2.1 区域注意力模块（A²）

YOLOv12中的A²模块解决了传统普通注意力机制的计算效率问题，其复杂度为O(L²d)，其中输入序列长度为L，特征维度为d。通过将尺寸为(H, W)的特征图沿水平或垂直方向分为l个相等且非重叠的部分（默认l=4），A²模块将复杂度降低到O((L/l)²d)。

这种分割使得每个区域可以独立处理，显著降低了计算开销，同时保留了广泛的接收场。

与全局注意力机制不同，全局注意力机制在整个图像上处理交互，A²模块通过简单的重塑操作将特征图分为部分（默认为4）。

这种方法避免了其他局部注意力方法（如Shifted Window、Criss-Cross Attention或Axial Attention）的复杂性，这些方法通常增加计算需求。来源

YOLOv12论文通过视觉对比这些变体，显示交叉注意力限制了行和列模式的交互，窗口注意力限制了上下文到固定区域，轴向注意力按顺序处理维度，而A²模块的简单划分在保持较大接收场的同时具有最小的计算成本。

研究中的热图显示，这种机制改善了对物体边界和关键特征的关注，使YOLOv12能够有效处理大规模图像，同时保持高检测准确性。来源

这种效率和稳健性能的平衡使A²模块特别适合需要全局依赖建模的对象检测任务。

2.2 闪存注意力

YOLOv12集成了闪存注意力，通过消除位置编码、减少内存访问开销和最小化操作数来增强计算效率。

这种优化提升了处理速度，同时保持检测准确性，维持了一个大的接收场以捕捉广泛的上下文信息，尽管是分块处理。为了进一步提高性能，YOLOv12将MLP比率从4调整为1.2（或某些尺度为2），在注意力和前馈网络之间平衡计算，减少堆叠块的深度以简化优化，并最大化卷积算子的使用以提高效率。来源

然而，闪存注意力需要特定的CUDA配置，这可能限制其在非GPU硬件上的适用性。来源

通过集成这些注意力模块，YOLOv12增强了对关键图像区域的关注，显著提高了对象检测任务的检测准确性。

2.3 残差高效层聚合网络（R-ELAN）

与流行的模块架构比较，包括(a): CSPNet [55]，(b) ELAN [56]，© C3K2（GELAN的一个案例）[28, 58]，以及(d) 提出的R-ELAN（残差高效层聚合网络）。它展示了在每次迭代中梯度传播、特征重用和计算效率的改进。来源

YOLOv12中的R-ELAN架构增强了特征聚合和梯度流动，解决了CSPNet、ELAN和C3K2等先前模块的局限性。

CSPNet（跨阶段部分网络）：CSPNet通过将特征图分成两部分来增强梯度流动，一部分通过卷积处理，另一部分保持不变，然后合并它们。这减少了冗余计算，同时保留了表示能力。来源

以前的YOLO模型使用了高效层聚合网络（ELAN），它将1×1卷积层的输出分成多个并行流进行特征融合，然后再合并在一起。来源

然而，这种2输入部分的方法引入了两个主要缺点：梯度阻塞和优化困难。这些问题在更深的模型（如L和X模型）中尤为明显，缺乏直接残差连接使输入和输出之间的有效梯度传播受阻，导致训练缓慢或不稳定。收敛。来源

C3K2（GELAN的一个变体）：C3K2通过对ELAN进行额外的变换进行了修改，但继承了其梯度阻断问题，未能完全解决深度网络中的优化挑战。来源

R-ELAN通过引入具有缩放因子（默认值为0.01）的残差快捷方式和修订后的特征聚合策略克服了这些问题，在最小计算开销的情况下改善了梯度传播，防止在训练过程中丢失图像中的重要细节。

R-ELAN应用一个过渡层来提前调整通道维度，通过瓶颈层处理统一的特征图，然后将结果拼接起来。这种瓶颈结构减少了计算成本和参数数量，同时保持性能。来源

3、YOLOv12的附加限制

基于注意力机制的YOLOv12架构，包括Area Attention (A²)模块、Flash Attention以及Residual Efficient Layer Aggregation Networks (R-ELAN)，虽然提供了卓越的准确性，但在资源受限的环境中也引入了一些限制。

内存瓶颈：由于广泛的特征图和矩阵乘法，基于注意力的架构需要更高的VRAM使用量。来源
推理延迟：YOLOv12相比Vision Transformers减少了注意力开销，但在边缘硬件上仍落后于基于CNN的YOLO模型。来源

现在我们已经了解了YOLOv12的所有基础知识，我们可以回到流程中。你可以在Google Colab中同时运行代码并跟随流程。

让我们开始吧！

4、交通拥堵检测系统

为了实现我们的交通拥堵检测系统，我们将使用Google Colab，这是一个提供免费GPU访问的云平台（记得打开它！）以加速计算。

以下是设置环境的关键步骤：

4.1 安装必要的库

!pip install -q git+https://github.com/sunsmarterjie/yolov12.git flash-attn supervision  
!pip install -q ultralytics==8.3.107

4.2 导入库

import os  
import requests  
from bs4 import BeautifulSoup  
from ultralytics import YOLO  
from IPython.display import Image  
import matplotlib.pyplot as plt  
from PIL import Image  
import locale  
import cv2  
  
locale.getpreferredencoding = lambda: "UTF-8"

4.3 加载YOLOv12模型

接下来，我们加载预训练的YOLOv12模型，以便您可以尝试不同的模型大小，平衡准确性和计算资源。

# 加载YOLOv12模型进行比较  
n_model = YOLO("yolo12n.pt")  
s_model = YOLO("yolo12s.pt")  
m_model = YOLO("yolo12m.pt")  
l_model = YOLO("yolo12l.pt")  
x_model = YOLO("yolo12x.pt")

4.4 获取实时交通图片

新加坡陆路交通管理局（LTA）已在网上提供了这些信息，便于访问。

website_url = "https://onemotoring.lta.gov.sg/content/onemotoring/home/driving/traffic_information/traffic-cameras/woodlands.html"  
  
image_src = None  
print(f"Fetching HTML from: {website_url}")  
  
try:  
    # 发送GET请求到网站  
    response = requests.get(website_url, timeout=10) # 添加超时  
    response.raise_for_status() # 对于不良状态码（4xx或5xx）引发异常  
      
    # 获取HTML内容  
    html_content = response.text  
      
    # 解析HTML  
    soup = BeautifulSoup(html_content, 'html.parser')  
    img_tag = soup.find('img', alt="View from Woodlands Causeway (Towards Johor)")   
      
    if img_tag and img_tag.has_attr('src'):  
        image_src = img_tag['src']  
        if image_src.startswith('//'):  
            image_src = 'https:' + image_src  
        elif not image_src.startswith('http'):  
            from urllib.parse import urljoin  
            image_src = urljoin(website_url, image_src)  
              
        print(f"Extracted Image Source: {image_src}")  
    else:  
        print("Image tag with the specified criteria not found on the page.")  
  
except requests.exceptions.RequestException as e:  
    print(f"Error fetching website content: {e}")  
except Exception as e:  
    print(f"An error occurred during parsing: {e}")  
  
if image_src:  
    IMAGE_DIR = "traffic_images"   
    os.makedirs(IMAGE_DIR, exist_ok=True)  
      
    try:  
        print(f"\nDownloading image from: {image_src}")  
        img_response = requests.get(image_src, timeout=15)  
        img_response.raise_for_status()  
        # 创建文件名  
        img_filename = os.path.basename(image_src.split('?')[0])  
        if not img_filename:  
             img_filename = "downloaded_traffic_image.jpg"  
        save_path = os.path.join(IMAGE_DIR, img_filename)   
        # 保存图片  
        with open(save_path, 'wb') as f:  
            f.write(img_response.content)  
        print(f"Image downloaded successfully to: {save_path}")  
          
    except requests.exceptions.RequestException as e:  
        print(f"Failed to download image: {e}")  
    except Exception as e:  
        print(f"An error occurred saving or displaying the image: {e}")

这确保我们有现实世界的测试数据。

运行夜间道路条件下的图像分析，长周末期间。尝试所有不同大小的模型。

original_img = Image.open(save_path)  
  
fig, axes = plt.subplots(2, 3, figsize=(15, 10))  
axes[0, 0].imshow(original_img)  
axes[0, 0].set_title('Original Image')  
axes[0, 0].axis('off')  
  
models = [n_model, s_model, m_model, l_model, x_model]  
model_names = ['YOLOv12n', 'YOLOv12s', 'YOLOv12m', 'YOLOv12l', 'YOLOv12x']  
positions = [(0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]  
  
# 循环遍历模型并绘制结果  
for model, name, (row, col) in zip(models, model_names, positions):  
    results = model(save_path)  
    result_img = results[0].plot()  
  
    axes[row, col].imshow(result_img)  
    axes[row, col].set_title(f'{name} Detection Results')  
    axes[row, col].axis('off')  
  
plt.tight_layout()  
plt.show()

确实，我们可以看到尽管图像有些模糊，可能是因为早晨持续下雨，YOLOv12能够捕捉到图像中的小汽车和公交车，特别是对于YOLOv12L和YOLOv12X来说。

4.5 车辆检测与计数函数

我们将定义一个名为get_vehicle_class_ids的函数，以获取相应类别的ID，以便在后续函数中进行计数。

def get_vehicle_class_ids(model, vehicle_classes=['car', 'bus', 'truck']):  
    """从模型的类别名称中获取车辆类别的ID。"""  
    names = model.names  
    if isinstance(names, dict):  
        names_list = list(names.values())  
    else:  
        names_list = names  
    return [names_list.index(cls) for cls in vehicle_classes if cls in names_list]

系统的核心在于detect_and_count_vehicles函数。它接受图像路径、YOLOv12模型以及车辆类别名称作为输入。该函数处理图像，检测车辆，计数它们，并计算车辆密度。

def detect_and_count_vehicles(image_path, model, vehicle_classes=['car', 'bus', 'truck'], confidence_threshold=0.01):  
    if not isinstance(image_path, str) or not os.path.exists(image_path):  
        print(f"Error: Invalid or non-existent image path provided: {image_path}")  
        return None  
    try:  
        image = cv2.imread(image_path)  
        if image is None:  
            print(f"Error: Could not read image {image_path}")  
            return None  
  
        height, width = image.shape[:2]  
        image_area = height * width  
  
        if 'model' not in globals():  
            print("Error: YOLO model not loaded.")  
            return None  
  
        vehicle_class_ids = get_vehicle_class_ids(model, vehicle_classes)  
  
        results = model(image, conf=confidence_threshold, classes=vehicle_class_ids)  
        vehicle_counts = {cls: 0 for cls in vehicle_classes}  
        total_vehicle_area = 0  
        annotated_img = image.copy()  
  
        if results and len(results) > 0:  
            boxes = results[0].boxes  
            for box in boxes:  
                cls_id = int(box.cls.item())  
                cls_name = model.names[cls_id]  
                if cls_name in vehicle_classes:  
                    vehicle_counts[cls_name] += 1  
                    x1, y1, x2, y2 = map(int, box.xyxy[0].cpu().numpy())  
                    conf = float(box.conf.item())  
                    box_area = (x2 - x1) * (y2 - y1)  
                    total_vehicle_area += box_area  
                    cv2.rectangle(annotated_img, (x1, y1), (x2, y2), (0, 255, 0), 2)  
                    label = f"{cls_name}: {conf:.2f}"  
                    cv2.putText(annotated_img, label, (x1, y1 - 10),  
                                cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)  
  
        total_vehicles = sum(vehicle_counts.values())  
        vehicle_density = total_vehicle_area / image_area if image_area > 0 else 0  
  
        return {  
            "counts": vehicle_counts,  
            "total_count": total_vehicles,  
            "density": vehicle_density,  
            "annotated_image": annotated_img  
        }  
    except Exception as e:  
        print(f"An error occurred processing {image_path}: {e}")  
        return None

classify_traffic函数接受检测结果并应用基于阈值的规则来确定交通状况：

def classify_traffic(detection_results, count_threshold=10):
    if detection_results is None:
        return "Error"
    total_count = detection_results['total_count']
    density = detection_results['density']
    if total_count >= count_threshold:
        return "JAM"
    else:
        return "NO JAM"

4.6 运行检测和分类

最后，笔记本演示了如何使用这些函数处理下载的交通图像：

model = x_model
if 'save_path' in locals() and os.path.exists(save_path):
    print(f"\n--- Analyzing downloaded image: {os.path.basename(save_path)} ---")
    COUNT_THRESHOLD = 1

    detection_result = detect_and_count_vehicles(save_path, model)
    if detection_result:
        traffic_status = classify_traffic(detection_result, COUNT_THRESHOLD)
        print(f"  Detections: {detection_result['counts']}")
        print(f"  Density: {detection_result['density']:.4f}")
        print(f"  Status: {traffic_status}")

        result_data = {
            'Image': os.path.basename(save_path),
            'Total Vehicles': detection_result['total_count'],
            'Status': traffic_status
        }
        result_data.update(detection_result['counts'])

        annotated_image_bgr = detection_result['annotated_image']
        status_color = (0, 0, 255) if traffic_status == "JAM" else (0, 255, 0)
        cv2.putText(annotated_image_bgr, f"Status: {traffic_status}", (40, 80),
                    cv2.FONT_HERSHEY_SIMPLEX, 3, status_color, 4)
        annotated_image_rgb = cv2.cvtColor(annotated_image_bgr, cv2.COLOR_BGR2RGB)
        plt.figure(figsize=(12, 8))
        plt.imshow(annotated_image_rgb)
        plt.title(f"Analysis: {os.path.basename(save_path)} - Status: {traffic_status}")
        plt.axis('off')
        plt.show()
    else:
        print("Failed to process the downloaded image.")
else:
    print("\nSkipping analysis: 'save_path' not defined or file does not exist.")
    print("Please ensure the previous cell successfully downloaded an image.")

虽然YOLOv12在挑战条件下能够强力检测车辆，但在新加坡-柔佛长堤进行实际部署时，将受益于专门定制的标注数据集。通过仔细调整和资源考虑，该模型在智能交通监控和管理方面具有巨大潜力。

原文链接：YOLOv12: Using Attention Mechanisms for Traffic Detection on the Singapore-JB Causeway + Collab Code

汇智网翻译整理，转载请标明出处