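Introduction to val.py

The val.py file is mainly used at the end of each training epoch to validate the current model and compute metrics such as mAP and the confusion matrix.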

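mAP: Mean Average Precision, the mean of the per-class AP values. It is the standard accuracy metric in object detection.

AP: Average Precision, a common metric for judging the quality of an object detection algorithm. It is used as the evaluation metric for algorithms such as Faster R-CNN and SSD.

AP is the average of the precision values as recall ranges from 0 to 1, that is, the area under the precision-recall curve.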
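To make the definition concrete, here is a minimal sketch of AP as the area under the precision-recall curve, and of mAP as the mean of per-class AP. The function names are hypothetical, and the all-point interpolation used here is an assumption; it is not necessarily the exact interpolation that val.py applies.

```python
def average_precision(recall, precision):
    """Area under the precision-recall curve (all-point interpolation).

    `recall` must be sorted in increasing order, with `precision` aligned to it.
    """
    # Add sentinel points so the curve spans recall 0..1.
    r = [0.0] + list(recall) + [1.0]
    p = [1.0] + list(precision) + [0.0]
    # Build the precision envelope: make precision non-increasing from right to left.
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Sum the rectangle areas between consecutive recall points.
    return sum((r[i + 1] - r[i]) * p[i + 1] for i in range(len(r) - 1))


def mean_average_precision(per_class_ap):
    """mAP is the mean of the AP values of all classes."""
    return sum(per_class_ap) / len(per_class_ap) if per_class_ap else 0.0


ap = average_precision([0.5, 1.0], [1.0, 0.5])
```

With the two points above, precision is 1.0 up to recall 0.5 and 0.5 from there to recall 1.0, so the area is 0.5 * 1.0 + 0.5 * 0.5.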

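Confusion matrix: also called an error matrix, it is a standard format for reporting accuracy, written as a matrix with n rows and n columns. The concrete metrics derived from it include overall accuracy, producer's accuracy, and user's accuracy, each of which reflects the accuracy of image classification from a different angle. In artificial intelligence the confusion matrix is a visualization tool used mainly in supervised learning; in unsupervised learning it is usually called a matching matrix. In image accuracy assessment it is used to compare the classification results with the measured ground truth, so that the accuracy of the classification can be displayed in a single matrix. It is computed by comparing the location and class of each ground-truth pixel with the corresponding location and class in the classified image.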
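As an illustration of the n-by-n layout, the sketch below builds a confusion matrix from two label lists and derives the overall accuracy from its diagonal. The function names are hypothetical, and the convention of rows for true classes and columns for predicted classes is an assumption made for this example; other tools may transpose it.

```python
def confusion_matrix(true_labels, predicted_labels, num_classes):
    """Count how often each true class (row) was predicted as each class (column)."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix


def overall_accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. correctly classified."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


cm = confusion_matrix([0, 0, 1, 1, 2], [0, 1, 1, 1, 2], num_classes=3)
acc = overall_accuracy(cm)
```

Here one sample of class 0 is predicted as class 1, so it lands off the diagonal in row 0, column 1.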

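In practice, the most common way to use this script is for train.py to call its run function, rather than executing val.py directly. The run function is therefore the most important part of the script.

opt arguments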

def parse_opt():
    parser = argparse.ArgumentParser()
    parser.add_argument('--data', type=str, default=ROOT / 'data/ship.yaml', help='dataset.yaml path')
    parser.add_argument('--weights', nargs='+', type=str, default='runs/train/exp4/weights/best.pt', help='model.pt path(s)')
    parser.add_argument('--batch-size', type=int, default=2, help='batch size')
    parser.add_argument('--imgsz', '--img', '--img-size', type=int, default=640, help='inference size (pixels)')
    parser.add_argument('--conf-thres', type=float, default=0.001, help='confidence threshold')
    parser.add_argument('--iou-thres', type=float, default=0.6, help='NMS IoU threshold')
    parser.add_argument('--task', default='val', help='train, val, test, speed or study')
    parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
    parser.add_argument('--workers', type=int, default=8, help='max dataloader workers (per RANK in DDP mode)')
    parser.add_argument('--single-cls', action='store_true', help='treat as single-class dataset')
    parser.add_argument('--augment', action='store_true', help='augmented inference')
    parser.add_argument('--verbose', action='store_true', help='report mAP by class')
    parser.add_argument('--save-txt', action='store_true', help='save results to *.txt')
    parser.add_argument('--save-hybrid', action='store_true', help='save label+prediction hybrid results to *.txt')
    parser.add_argument('--save-conf', action='store_true', help='save confidences in --save-txt labels')
    parser.add_argument('--save-json', action='store_true', help='save a COCO-JSON results file')
    parser.add_argument('--project', default=ROOT / 'runs/val', help='save to project/name')
    parser.add_argument('--name', default='exp', help='save to project/name')
    parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
    parser.add_argument('--half', action='store_true', help='use FP16 half-precision inference')
    parser.add_argument('--dnn', action='store_true', help='use OpenCV DNN for ONNX inference')
    opt = parser.parse_args()
    opt.data = check_yaml(opt.data)  # check YAML
    opt.save_json |= opt.data.endswith('coco.yaml')
    opt.save_txt |= opt.save_hybrid
    print_args(FILE.stem, opt)
    return opt

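opt arguments in detail:

data: path to the dataset configuration file, which contains the dataset paths, number of classes, class names, download URL, and so on
weights: path(s) to the model weights file; the default in the code above is runs/train/exp4/weights/best.pt
batch-size: batch size for the forward pass; the default in the code above is 2 (run() itself defaults to 32)
imgsz: resolution of the images fed to the network, default 640
conf-thres: object confidence threshold, default 0.001
iou-thres: IoU threshold used during NMS, default 0.6
task: type of run, one of train, val, test, speed or study, default val
device: device to run on
workers: maximum number of dataloader workers, default 8
single-cls: whether to treat the dataset as having a single class, default False
augment: whether to use TTA (Test Time Augmentation), default False
verbose: whether to print the mAP of each class, default False
The next three arguments are for auto-labelling (somewhat like teacher forcing in RNNs):
save-txt: traditional auto-labelling
save-hybrid: save hybrid autolabels, combining existing labels with new predictions before NMS (existing labels are given confidence=1.0 before NMS)
save-conf: add confidences to any of the above commands
save-json: whether to save the predicted boxes in COCO JSON format and evaluate them with cocoapi (this requires labels in the same COCO JSON format), default False
project: parent directory for saved results, default runs/val
name: name of the run directory, default exp, so results are saved under runs/val/exp
exist-ok: whether an existing project/name directory may be reused, default False; it is normally left unset, so a new incremented directory is created for each run
half: whether to use FP16 half-precision inference, default False
dnn: whether to use OpenCV DNN for ONNX inference, default False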
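The two `|=` lines at the end of parse_opt() are easy to overlook, so here is a small self-contained sketch of how they behave. It keeps only four of the flags, uses a hypothetical `parse_demo` helper, and assumes a placeholder default for `--data`; it is not the full parser.

```python
import argparse


def parse_demo(argv):
    """Parse a reduced set of val.py-style flags from an explicit argument list."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--data', type=str, default='data/coco128.yaml')  # placeholder default
    parser.add_argument('--save-txt', action='store_true')
    parser.add_argument('--save-hybrid', action='store_true')
    parser.add_argument('--save-json', action='store_true')
    opt = parser.parse_args(argv)
    # A dataset file named ...coco.yaml switches COCO-JSON saving on automatically.
    opt.save_json |= opt.data.endswith('coco.yaml')
    # Hybrid labels are written as *.txt files, so --save-hybrid implies --save-txt.
    opt.save_txt |= opt.save_hybrid
    return opt


hybrid_opt = parse_demo(['--save-hybrid'])
coco_opt = parse_demo(['--data', 'data/coco.yaml'])
```

Because `store_true` flags default to False, `|=` only ever turns an option on; it never turns off a flag that was set on the command line.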

The main() function

def main(opt):
    global x
    check_requirements(requirements=ROOT / 'requirements.txt', exclude=('tensorboard', 'thop'))

    if opt.task in ('train', 'val', 'test'):  # run normally
        if opt.conf_thres > 0.001:  # https://github.com/ultralytics/yolov5/issues/1466
            LOGGER.info(f'WARNING: confidence threshold {opt.conf_thres} >> 0.001 will produce invalid mAP values.')
        run(**vars(opt))

    else:
        weights = opt.weights if isinstance(opt.weights, list) else [opt.weights]
        opt.half = True  # FP16 for fastest results
        if opt.task == 'speed':  # speed benchmarks
            # python val.py --task speed --data coco.yaml --batch 1 --weights yolov5n.pt yolov5s.pt...
            opt.conf_thres, opt.iou_thres, opt.save_json = 0.25, 0.45, False
            for opt.weights in weights:
                run(**vars(opt), plots=False)

        elif opt.task == 'study':  # speed vs mAP benchmarks
            # python val.py --task study --data coco.yaml --iou 0.7 --weights yolov5n.pt yolov5s.pt...
            for opt.weights in weights:
                f = f'study_{Path(opt.data).stem}_{Path(opt.weights).stem}.txt'  # filename to save to
                x, y = list(range(256, 1536 + 128, 128)), []  # x axis (image sizes), y axis
                for opt.imgsz in x:  # img-size
                    LOGGER.info(f'\nRunning {f} --imgsz {opt.imgsz}...')
                    r, _, t = run(**vars(opt), plots=False)
                    y.append(r + t)  # results and times
                np.savetxt(f, y, fmt='%10.4g')  # save
            os.system('zip -r study.zip study_*.txt')
            plot_val_study(x=x)  # plot

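This function has three branches depending on opt.task: [train, val, test], [speed], and [study]. The main branch is

opt.task in ('train', 'val', 'test')

This condition means that if task is one of 'train', 'val' or 'test', the training, validation or test split is evaluated normally.
Execution usually enters this first branch and calls run().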
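The branching can be summarized with the sketch below. `select_branch` and `study_image_sizes` are hypothetical helpers written for this explanation; only the conditions and the `range(256, 1536 + 128, 128)` expression are taken from main().

```python
def select_branch(task):
    """Return which branch of main() a given task value falls into."""
    if task in ('train', 'val', 'test'):
        return 'normal'  # one call to run() on the chosen split
    if task == 'speed':
        return 'speed'   # one run() per weights file, plots disabled
    if task == 'study':
        return 'study'   # one run() per weights file and image size
    return 'none'        # any other value matches no branch


def study_image_sizes(start=256, stop=1536, step=128):
    """Image sizes swept by the study task: start, start + step, ..., stop."""
    return list(range(start, stop + step, step))


sizes = study_image_sizes()
```

The study task therefore evaluates each weights file at 11 image sizes, from 256 to 1536 in steps of 128.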

The run() function

if RANK in [-1, 0]:
  # mAP
  callbacks.run('on_train_epoch_end', epoch=epoch)
  ema.update_attr(model, include=['yaml', 'nc', 'hyp', 'names', 'stride', 'class_weights'])
  final_epoch = (epoch + 1 == epochs) or stopper.possible_stop
  if not noval or final_epoch:  # Calculate mAP
      results, maps, _ = val.run(data_dict,
                                 batch_size=batch_size // WORLD_SIZE * 2,
                                 imgsz=imgsz,
                                 model=ema.ema,
                                 single_cls=single_cls,
                                 dataloader=val_loader,
                                 save_dir=save_dir,
                                 plots=False,
                                 callbacks=callbacks,
                                 compute_loss=compute_loss)

As shown above, run() is called from train.py to validate the current model after each epoch.

Loading the arguments

def run(data,
        weights=None,  # model.pt path(s)
        batch_size=32,  # batch size
        imgsz=640,  # inference size (pixels)
        conf_thres=0.001,  # confidence threshold
        iou_thres=0.6,  # NMS IoU threshold
        task='val',  # train, val, test, speed or study
        device='',  # cuda device, i.e. 0 or 0,1,2,3 or cpu
        workers=8,  # max dataloader workers (per RANK in DDP mode)
        single_cls=False,  # treat as single-class dataset
        augment=False,  # augmented inference
        verbose=False,  # verbose output
        save_txt=False,  # save results to *.txt
        save_hybrid=False,  # save label+prediction hybrid results to *.txt
        save_conf=False,  # save confidences in --save-txt labels
        save_json=False,  # save a COCO-JSON results file
        project=ROOT / 'runs/val',  # save to project/name
        name='exp',  # save to project/name
        exist_ok=False,  # existing project/name ok, do not increment
        half=True,  # use FP16 half-precision inference
        dnn=False,  # use OpenCV DNN for ONNX inference
        model=None,
        dataloader=None,
        save_dir=Path(''),
        plots=True,
        callbacks=Callbacks(),
        compute_loss=None,
        ):

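Explanation of the parameters:

data: path to the dataset configuration file, which contains the dataset paths, number of classes, class names, download URL, and so on; train.py passes data_dict
weights: path(s) to the model weights file; None when called from train.py, a weights path when val.py is run directly
batch_size: batch size for the forward pass, default 32; train.py passes batch_size // WORLD_SIZE * 2
imgsz: resolution of the images fed to the network, default 640; train.py passes imgsz
conf_thres: object confidence threshold, default 0.001
iou_thres: IoU threshold used during NMS, default 0.6
task: type of run, one of train, val, test, speed or study, default val
device: device to run on
workers: maximum number of dataloader workers, default 8
single_cls: whether to treat the dataset as having a single class, default False; train.py passes single_cls
augment: whether to use TTA (Test Time Augmentation), default False
verbose: whether to print the mAP of each class, default False; the per-epoch call in train.py shown above does not pass it
save_txt: whether to save the predicted box coordinates as txt files, default False
save_hybrid: whether to save label+prediction hybrid results to *.txt, default False
save_conf: whether to save the confidence of each predicted object in the txt files, default False
save_json: whether to save the predicted boxes in COCO JSON format and evaluate them with cocoapi (this requires labels in the same COCO JSON format), default False; the per-epoch call in train.py shown above does not pass it
project: parent directory for saved results, default runs/val
name: name of the run directory, default exp, so results are saved under runs/val/exp
exist_ok: whether an existing project/name directory may be reused, default False; it is normally left unset, so a new incremented directory is created for each run
half: whether to use FP16 half-precision inference, default True in this signature (the --half flag in parse_opt() defaults to False)
dnn: whether to use OpenCV DNN for ONNX inference, default False
model: the model; None when val.py is run directly, ema.ema (the EMA model) when called from train.py
dataloader: the data loader; None when val.py is run directly, val_loader when called from train.py
save_dir: directory for saved files; Path('') when val.py is run directly, save_dir (runs/train/expN) when called from train.py
plots: whether to produce plots, default True; the per-epoch call in train.py shown above passes plots=False
callbacks: logging hooks, default Callbacks(); train.py passes its own callbacks (this takes the place of the wandb_logger argument described for older versions)
compute_loss: loss function, default None; train.py passes compute_loss
return: a 3-tuple whose first element is (Precision, Recall, mAP@0.5, mAP@0.5:0.95, box_loss, obj_loss, cls_loss); train.py unpacks the tuple as results, maps, _ and main() unpacks it as r, _, t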
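The expression `batch_size // WORLD_SIZE * 2` that train.py passes as the validation batch size is worth a worked example. The helper name below is hypothetical; the arithmetic is the one from the call shown earlier.

```python
def val_batch_size(train_batch_size, world_size):
    """Validation batch size as passed by train.py: per-process batch size, doubled."""
    # Integer division first (per-process share), then the factor of two.
    return train_batch_size // world_size * 2


single_gpu = val_batch_size(16, 1)  # total batch 16 on one process
four_gpus = val_batch_size(64, 4)   # total batch 64 split over four processes
```

Both calls give 32: validation needs no gradients, so each process can afford a batch twice as large as its training share.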

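Initializing/loading the model and selecting the device

When called from train.py: take the device from the model that was passed in and set the model precision.
When val.py is run directly: select the device, build the save_dir path, make the directory, load the model, check imgsz, then load and check the data configuration.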

# Initialize/load model and set device
global stride, ap50
training = model is not None
if training:  # called by train.py
    device, pt, jit, engine = next(model.parameters()).device, True, False, False  # get model device, PyTorch model

    half &= device.type != 'cpu'  # half precision only supported on CUDA
    model.half() if half else model.float()
else:  # called directly
    device = select_device(device, batch_size=batch_size)

    # Directories
    save_dir = increment_path(Path(project) / name, exist_ok=exist_ok)  # increment run
    (save_dir / 'labels' if save_txt else save_dir).mkdir(parents=True, exist_ok=True)  # make dir

    # Load model
    model = DetectMultiBackend(weights, device=device, dnn=dnn, data=data)
    stride, pt, jit, onnx, engine = model.stride, model.pt, model.jit, model.onnx, model.engine
    imgsz = check_img_size(imgsz, s=stride)  # check image size
    half &= (pt or jit or onnx or engine) and device.type != 'cpu'  # FP16 supported on limited backends with CUDA
    if pt or jit:
        model.model.half() if half else model.model.float()
    elif engine:
        batch_size = model.batch_size
    else:
        half = False
        batch_size = 1  # export.py models default to batch-size 1
        device = torch.device('cpu')
        LOGGER.info(f'Forcing --batch-size 1 square inference shape(1,3,{imgsz},{imgsz}) for non-PyTorch backends')

    # Data
    data = check_dataset(data)  # check

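Adjusting the model

half &= device.type != 'cpu'  # half precision only supported on CUDA
model.half() if half else model.float()

This step switches the model to half precision (FP16) when half is set and the device is not the CPU, and keeps FP32 otherwise. Model pruning and Conv+BN fusion also belong to this stage, although the excerpt here only shows the half-precision switch.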

Configuring the model for validation

model.eval()
is_coco = isinstance(data.get('val'), str) and data['val'].endswith('coco/val2017.txt')  # COCO dataset
nc = 1 if single_cls else int(data['nc'])  # number of classes
iouv = torch.linspace(0.5, 0.95, 10).to(device)  # iou vector for mAP@0.5:0.95
niou = iouv.numel()

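This block puts the model in evaluation mode and determines whether the dataset is COCO (is_coco), the number of classes (nc), and the parameters needed for mAP: the vector of IoU thresholds iouv and its length niou.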
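The role of the IoU vector is easier to see with plain numbers. The sketch below rebuilds the ten thresholds of mAP@0.5:0.95 without torch and checks one example IoU against all of them; `iou_thresholds` is a hypothetical helper and the value 0.72 is an arbitrary illustration.

```python
def iou_thresholds(start=0.5, stop=0.95, steps=10):
    """Evenly spaced IoU thresholds, the pure-Python analogue of torch.linspace(0.5, 0.95, 10)."""
    step = (stop - start) / (steps - 1)
    return [round(start + i * step, 2) for i in range(steps)]


thresholds = iou_thresholds()
# A prediction whose IoU with its ground-truth box is 0.72 (illustrative value)
# counts as correct at every threshold that does not exceed 0.72.
correct_at = [0.72 >= t for t in thresholds]
```

The prediction is a true positive at 0.5, 0.55, 0.6, 0.65 and 0.7, and a miss at the five stricter thresholds, which is why mAP@0.5:0.95 is always at most mAP@0.5.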

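Loading the val dataset

When called from train.py: the val dataset does not need to be loaded here, because train.py passes in its val_loader.
When val.py is run directly: the dataset for the chosen task is loaded here with create_dataloader.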

# Dataloader
if not training:
    model.warmup(imgsz=(1 if pt else batch_size, 3, imgsz, imgsz), half=half)  # warmup
    pad = 0.0 if task in ('speed', 'benchmark') else 0.5
    rect = False if task == 'benchmark' else pt  # square inference for benchmarks
    task = task if task in ('train', 'val', 'test') else 'val'  # path to train/val/test images
    dataloader = create_dataloader(data[task], imgsz, batch_size, stride, single_cls, pad=pad, rect=rect,
                                   workers=workers, prefix=colorstr(f'{task}: '))[0]
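The three one-line conditionals in the dataloader block decide the padding, the rectangular-inference flag, and the dataset split. The sketch below isolates them in a hypothetical helper so that the outcome for each task value is easy to check; `pt` stands for "the model is a PyTorch model", as in the code above.

```python
def dataloader_settings(task, pt=True):
    """Reproduce the pad / rect / split decisions made before create_dataloader()."""
    pad = 0.0 if task in ('speed', 'benchmark') else 0.5  # no padding for speed tests
    rect = False if task == 'benchmark' else pt           # square inference for benchmarks
    split = task if task in ('train', 'val', 'test') else 'val'  # fall back to the val split
    return pad, rect, split


val_settings = dataloader_settings('val')
speed_settings = dataloader_settings('speed')
study_settings = dataloader_settings('study')
```

Note that speed and study are not dataset splits, so both end up reading the val images.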

Initializing the configuration

seen = 0
confusion_matrix = ConfusionMatrix(nc=nc)
names = {k: v for k, v in enumerate(model.names if hasattr(model, 'names') else model.module.names)}
class_map = coco80_to_coco91_class() if is_coco else list(range(1000))
s = ('%20s' + '%11s' * 6) % ('Class', 'Images', 'Labels', 'P', 'R', 'mAP@.5', 'mAP@.5:.95')
dt, p, r, f1, mp, mr, map50, map = [0.0, 0.0, 0.0], 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0
loss = torch.zeros(3, device=device)
jdict, stats, ap, ap_class = [], [], [], []
pbar = tqdm(dataloader, desc=s, bar_format='{l_bar}{bar:10}{r_bar}{bar:-10b}')  # progress bar

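This block initializes the confusion matrix, the dataset class names, the COCO class index mapping, the tqdm progress bar, the metrics p, r, f1, mp, mr, map50 and map together with the timing list dt, the validation loss, and the containers jdict, stats, ap and ap_class.

Starting validation

for batch_i, (im, targets, paths, shapes) in enumerate(pbar):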
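The header string `s` is built with old-style `%` formatting, which can look cryptic. The sketch below reproduces it and adds a matching row; the row format and the numbers in it are placeholders invented for this illustration, not output of a real run.

```python
# Header exactly as in the code above: one 20-wide column and six 11-wide columns.
header = ('%20s' + '%11s' * 6) % ('Class', 'Images', 'Labels', 'P', 'R', 'mAP@.5', 'mAP@.5:.95')

# Hypothetical row format with the same column widths (assumption for this example).
row_format = '%20s' + '%11i' * 2 + '%11.3g' * 4
row = row_format % ('all', 128, 929, 0.7, 0.6, 0.65, 0.43)  # placeholder values

print(header)
print(row)
```

Because every column has a fixed width, header and row line up regardless of the values printed.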

Preprocessing the images and targets

t1 = time_sync()
if pt or jit or engine:
    im = im.to(device, non_blocking=True)
    targets = targets.to(device)
im = im.half() if half else im.float()  # uint8 to fp16/32
im /= 255  # 0 - 255 to 0.0 - 1.0
nb, _, height, width = im.shape  # batch size, channels, height, width
t2 = time_sync()
dt[0] += t2 - t1

Model forward inference

# Inference
out, train_out = model(im) if training else model(im, augment=augment, val=True)  # inference, loss outputs
dt[1] += time_sync() - t2

Computing the validation loss

# Loss
if compute_loss:
    loss += compute_loss([x.float() for x in train_out], targets)[1]  # box, obj, cls

Running NMS

# NMS
targets[:, 2:] *= torch.Tensor([width, height, width, height]).to(device)  # to pixels
lb = [targets[targets[:, 0] == i, 1:] for i in range(nb)] if save_hybrid else []  # for autolabelling
t3 = time_sync()
out = non_max_suppression(out, conf_thres, iou_thres, labels=lb, multi_label=True, agnostic=single_cls)
dt[2] += time_sync() - t3

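First, the xywh values of the ground-truth boxes in targets are mapped to the pixel size of the input image, because the labels produced with labelimg are stored in normalized form.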
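The scaling step can be sketched without torch as follows. The helper name and the sample target are hypothetical; the column layout (image index, class, x, y, w, h) and the multiplication by width, height, width, height follow the code above.

```python
def targets_to_pixels(targets, width, height):
    """Scale normalized xywh columns of each target row to pixel units."""
    scaled = []
    for img_idx, cls, x, y, w, h in targets:
        # x and w scale with the image width, y and h with the image height.
        scaled.append((img_idx, cls, x * width, y * height, w * width, h * height))
    return scaled


# One target on image 0, class 1, centered, a quarter of the width and half the height.
pixel_targets = targets_to_pixels([(0, 1, 0.5, 0.5, 0.25, 0.5)], width=640, height=480)
```

Only columns 2 to 5 change, which is what the slice `targets[:, 2:]` expresses in the tensor version.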

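save_hybrid: adding the dataset labels to the model predictions before NMS

That is, the dataset labels (targets) are added to the model predictions before NMS.
This allows other objects in the dataset to be labelled automatically (autolabelling, with the ground truth mixed into pred), and the mAP then reflects the new hybrid labels.
targets: [num_target, img_index + class_index + xywh], for example [31, 6]
lb: {list: bs}, for example targets of shape [17, 5] for the first image, [1, 5] for the second, and [7, 5] for the third

Collecting the ground-truth and prediction information for each image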
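To show why mixing labels into the predictions changes the result, here is a simplified greedy NMS in pure Python. It is a class-agnostic sketch for a single image with hypothetical boxes, not the batched per-class implementation behind non_max_suppression; the only detail taken from the text is that existing labels enter NMS with confidence 1.0.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(detections, iou_thres=0.6):
    """Keep detections (x1, y1, x2, y2, conf) in descending confidence order,
    dropping any box that overlaps an already kept box by more than iou_thres."""
    kept = []
    for det in sorted(detections, key=lambda d: d[4], reverse=True):
        if all(box_iou(det[:4], k[:4]) <= iou_thres for k in kept):
            kept.append(det)
    return kept


predictions = [(10, 10, 110, 110, 0.80), (300, 300, 400, 400, 0.30)]
labels_as_detections = [(12, 12, 112, 112, 1.0)]  # existing label, confidence fixed to 1.0
hybrid = greedy_nms(labels_as_detections + predictions)
```

The label overlaps the first prediction with an IoU of about 0.92, so the label survives and that prediction is suppressed, while the second prediction, which has no matching label, is kept as a new automatic label.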

# Metrics
for si, pred in enumerate(out):
    labels = targets[targets[:, 0] == si, 1:]
    nl = len(labels)
    tcls = labels[:, 0].tolist() if nl else []  # target class
    path, shape = Path(paths[si]), shapes[si][0]
    seen += 1

    if len(pred) == 0:
        if nl:
            stats.append((torch.zeros(0, niou, dtype=torch.bool), torch.Tensor(), torch.Tensor(), tcls))
        continue

    # Predictions
    if single_cls:
        pred[:, 5] = 0
    predn = pred.clone()
    scale_coords(im[si].shape[1:], predn[:, :4], shape, shapes[si][1])  # native-space pred