
An Algorithm for Computing the Label Box Size Distribution of a YOLO Dataset

Source: 花匠小妙招  Date: 2024-12-27 16:21

Label Box Size Distribution in a YOLO Dataset

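Training a YOLO object detector requires data in the YOLO dataset format. In that format, each label box is defined by five parameters: the class label, the x coordinate of the box center, the y coordinate of the box center, the box width, and the box height. This article shows how to use Python to compute the size distribution of the label boxes in a YOLO dataset.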

Preparation

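First, prepare data in the YOLO dataset format. The dataset should contain the images and their corresponding label files. Each label file lists every label box in its image, one box per line. Taking the first line of a label file as an example, the format is: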

    0 0.345 0.678 0.123 0.456

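The first number is the class label. The next four are the x coordinate of the box center, the y coordinate of the box center, the box width, and the box height. These four values are fractions of the image dimensions (x and width relative to the image width, y and height relative to the image height) and lie between 0 and 1.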
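To make the format concrete, here is a minimal sketch that parses one label line and converts the normalized values to pixel corners. The names `YoloBox`, `parse_label_line`, and `to_pixels` are hypothetical and do not appear in the original script, and the 800×800 image size is an assumption borrowed from the statistics code later in this article.

```python
from dataclasses import dataclass


@dataclass
class YoloBox:
    """One YOLO label line (hypothetical helper type, not part of the original script)."""
    class_id: int
    x_center: float  # fraction of the image width
    y_center: float  # fraction of the image height
    width: float     # fraction of the image width
    height: float    # fraction of the image height


def parse_label_line(line):
    """Parse 'class x_center y_center width height' into a YoloBox."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    return YoloBox(int(parts[0]), *(float(p) for p in parts[1:]))


def to_pixels(box, img_w, img_h):
    """Return the (x1, y1, x2, y2) pixel corners of a normalized box."""
    x1 = (box.x_center - box.width / 2) * img_w
    y1 = (box.y_center - box.height / 2) * img_h
    x2 = (box.x_center + box.width / 2) * img_w
    y2 = (box.y_center + box.height / 2) * img_h
    return x1, y1, x2, y2


box = parse_label_line("0 0.345 0.678 0.123 0.456")
corners = to_pixels(box, 800, 800)  # 800x800 is an assumed image size
print(box)
print(corners)
```

For the example line, the box is roughly 98 pixels wide and 365 pixels high on an 800×800 image.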

Computing the Label Box Size Distribution

First, define a function traverse_folder that walks a folder and returns the names of all image and label files in it as a sorted list:

    import os


    def traverse_folder(folder_path):
        # Return the sorted names of the image and label files in this folder
        names = []
        for file in os.listdir(folder_path):
            if file.endswith(('.jpg', '.png', '.jpeg', '.txt')):
                names.append(file)
        names.sort()
        return names

Next, define a function traverse_file that reads a label file and returns its lines as a list:

    def traverse_file(file_path):
        # Read a label (.txt) file and return its lines as a list
        txt_files = []
        with open(file_path, 'r') as f:
            for line in f:
                txt_files.append(line)
        return txt_files

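Define a function update_lables that recomputes the size and position of each label box when four images are tiled into a single 2×2 image. The tag parameter is 1, 2, 3, or 4. In the code, tags 2 and 4 shift x by 0.5 and tags 3 and 4 shift y by 0.5, so the tags correspond to the top-left, top-right, bottom-left, and bottom-right positions. The function returns the updated labels as a list of strings. The statistics loop that follows does not call it; it belongs to the image-tiling helpers in the complete script.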

    def update_lables(lables_info, tag):
        # tag 1, 2, 3, 4: top-left, top-right, bottom-left, bottom-right
        resl = []
        for lable_info in lables_info:
            lable_info = lable_info.split()
            # Halve the coordinates: each source image covers half of the tiled image per axis
            x = float(lable_info[1]) / 2
            y = float(lable_info[2]) / 2
            if tag == 2 or tag == 4:
                x = x + 400 / 800  # shift into the right half
            if tag == 3 or tag == 4:
                y = y + 400 / 800  # shift into the bottom half
            w = float(lable_info[3]) / 2
            h = float(lable_info[4]) / 2
            resl.append(lable_info[0] + ' ' + str(x) + ' ' + str(y) + ' ' + str(w) + ' ' + str(h))
        return resl
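As a sanity check on the remapping arithmetic in update_lables, the sketch below applies the same halving and offsets to a box centered in its source image. `remap_to_quadrant` is a hypothetical name. The tag order is read off the code (x shift for tags 2 and 4, y shift for tags 3 and 4), and the fixed 0.5 offset assumes four equally sized source images.

```python
def remap_to_quadrant(x, y, w, h, tag):
    """Map a normalized box from a source image into one quadrant of a 2x2 tiling.

    tag: 1 = top-left, 2 = top-right, 3 = bottom-left, 4 = bottom-right.
    Assumes all four source images have the same size.
    """
    x_offset = 0.5 if tag in (2, 4) else 0.0
    y_offset = 0.5 if tag in (3, 4) else 0.0
    return x / 2 + x_offset, y / 2 + y_offset, w / 2, h / 2


# A box centered in its source image should land at the center of its quadrant
for tag in (1, 2, 3, 4):
    print(tag, remap_to_quadrant(0.5, 0.5, 0.2, 0.4, tag))
```

The centers come out as (0.25, 0.25), (0.75, 0.25), (0.25, 0.75), and (0.75, 0.75) for tags 1 to 4, and the width and height are halved in every case.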

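Finally, the following code computes the size distribution of the label boxes. It converts each box's normalized area to pixels, assuming 800×800 images, and assigns the box to one of six size ranges by comparing that area with the squares of the range bounds: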

    lable_path = '/Users/aoxin/CODE/GraduationProject/plusDataSet/labels/train/'
    lables_name = traverse_folder(lable_path)

    # One counter list per size range; the index is the class ID (at most 20 classes)
    num_0_50 = [0] * 20
    num_50_100 = [0] * 20
    num_100_200 = [0] * 20
    num_200_300 = [0] * 20
    num_300_400 = [0] * 20
    num_400_800 = [0] * 20

    for lable_name in lables_name:
        if not lable_name.endswith('.txt'):
            continue  # traverse_folder also returns image files
        content = traverse_file(lable_path + lable_name)
        for row in content:
            row = row.split()
            if len(row) < 5:
                continue  # skip blank or malformed lines
            cls = int(row[0])
            # Box area in pixels, assuming 800x800 images
            square = float(row[3]) * float(row[4]) * 800 * 800
            if square < 50 * 50:
                num_0_50[cls] += 1
            elif square < 100 * 100:
                num_50_100[cls] += 1
            elif square < 200 * 200:
                num_100_200[cls] += 1
            elif square < 300 * 300:
                num_200_300[cls] += 1
            elif square < 400 * 400:
                num_300_400[cls] += 1
            else:
                # Everything from 400x400 up to the full 800x800 image
                num_400_800[cls] += 1

    print(num_0_50)
    print(num_50_100)
    print(num_100_200)
    print(num_200_300)
    print(num_300_400)
    print(num_400_800)

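The code above prints six lists, one per size range. In each list, the index is the class ID and the value is the number of label boxes of that class that fall in the range.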
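The six hand-written counters can also be expressed as a lookup over a list of bin edges. The following is a sketch, not the original author's code: `EDGES`, `BIN_NAMES`, `size_bin`, and `count_sizes` are hypothetical names, the 800-pixel image side is the same assumption the script makes, and the last bin is left open-ended so that no box goes uncounted.

```python
import bisect
import math
from collections import Counter

# Bin edges in pixels, measured as the side of the square with the same area as the box
EDGES = [50, 100, 200, 300, 400]
BIN_NAMES = ["0-50", "50-100", "100-200", "200-300", "300-400", "400+"]


def size_bin(w_norm, h_norm, img_size=800):
    """Return the bin name for a normalized box, assuming a square image."""
    side = math.sqrt(w_norm * h_norm) * img_size
    # bisect_right keeps the intervals half-open: [low, high)
    return BIN_NAMES[bisect.bisect_right(EDGES, side)]


def count_sizes(label_lines, img_size=800):
    """Count boxes per (class_id, bin name) from raw YOLO label lines."""
    counts = Counter()
    for line in label_lines:
        parts = line.split()
        if len(parts) < 5:
            continue  # skip blank or malformed lines
        name = size_bin(float(parts[3]), float(parts[4]), img_size)
        counts[(int(parts[0]), name)] += 1
    return counts


lines = ["0 0.5 0.5 0.05 0.05", "0 0.5 0.5 0.2 0.2", "1 0.5 0.5 0.9 0.9"]
counts = count_sizes(lines)
for key in sorted(counts):
    print(key, counts[key])
```

Using a `Counter` keyed by class and bin removes the fixed limit of 20 classes, and changing the ranges only requires editing `EDGES` and `BIN_NAMES`.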

Conclusion

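This article showed how to use Python to compute the size distribution of the label boxes in a YOLO dataset. Box sizes matter in object detection, and knowing how they are distributed helps in understanding how a detection model performs.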

Complete script, including the helpers for drawing boxes and tiling four images:

    import os

    import cv2


    def write_list_to_txt(lst, file_path):
        # Write each list item to the file on its own line
        with open(file_path, 'w') as f:
            for item in lst:
                f.write("%s\n" % item)


    def traverse_folder(folder_path):
        # Return the sorted names of the image and label files in this folder
        names = []
        for file in os.listdir(folder_path):
            if file.endswith(('.jpg', '.png', '.jpeg', '.txt')):
                names.append(file)
        names.sort()
        return names


    def traverse_file(file_path):
        # Read a label (.txt) file and return its lines as a list
        txt_files = []
        with open(file_path, 'r') as f:
            for line in f:
                txt_files.append(line)
        return txt_files


    def draw_bounding_box(img1, lables_info):
        # Draw each box on the image; a label is "class x_center y_center width height"
        img_h, img_w = img1.shape[:2]  # shape is (height, width, channels)
        for label_info in lables_info:
            label_info = label_info.split()
            # Box center and size in pixels: x and width scale with the image width,
            # y and height with the image height
            center_x = float(label_info[1]) * img_w
            center_y = float(label_info[2]) * img_h
            box_w = float(label_info[3]) * img_w
            box_h = float(label_info[4]) * img_h
            # Top-left and bottom-right corners
            x1 = int(center_x - box_w / 2)
            y1 = int(center_y - box_h / 2)
            x2 = int(center_x + box_w / 2)
            y2 = int(center_y + box_h / 2)
            cv2.rectangle(img1, (x1, y1), (x2, y2), (0, 255, 0), 2)
        cv2.imshow('image', img1)
        cv2.waitKey(2000)
        cv2.destroyAllWindows()


    def contact_4_img(imgs):
        # Tile four images 2x2: imgs[0] and imgs[1] on top, imgs[2] and imgs[3] below
        h1 = cv2.hconcat([imgs[0], imgs[1]])
        h2 = cv2.hconcat([imgs[2], imgs[3]])
        result1 = cv2.vconcat([h1, h2])
        # Scale to 50% so the tiled image matches the size of one source image
        scale_percent = 50
        width = int(result1.shape[1] * scale_percent / 100)
        height = int(result1.shape[0] * scale_percent / 100)
        dim = (width, height)
        resized = cv2.resize(result1, dim, interpolation=cv2.INTER_AREA)
        # The caller can save the result, e.g. cv2.imwrite('output.jpg', resized)
        return resized


    def update_lables(lables_info, tag):
        # tag 1, 2, 3, 4: top-left, top-right, bottom-left, bottom-right
        resl = []
        for lable_info in lables_info:
            lable_info = lable_info.split()
            x = float(lable_info[1]) / 2
            y = float(lable_info[2]) / 2
            if tag == 2 or tag == 4:
                x = x + 400 / 800  # shift into the right half
            if tag == 3 or tag == 4:
                y = y + 400 / 800  # shift into the bottom half
            w = float(lable_info[3]) / 2
            h = float(lable_info[4]) / 2
            resl.append(lable_info[0] + ' ' + str(x) + ' ' + str(y) + ' ' + str(w) + ' ' + str(h))
        return resl


    # Count label boxes per class in each size range
    lable_path = '/Users/aoxin/CODE/GraduationProject/plusDataSet/labels/train/'
    lables_name = traverse_folder(lable_path)

    # One counter list per size range; the index is the class ID (at most 20 classes)
    num_0_50 = [0] * 20
    num_50_100 = [0] * 20
    num_100_200 = [0] * 20
    num_200_300 = [0] * 20
    num_300_400 = [0] * 20
    num_400_800 = [0] * 20

    for lable_name in lables_name:
        if not lable_name.endswith('.txt'):
            continue  # traverse_folder also returns image files
        content = traverse_file(lable_path + lable_name)
        for row in content:
            row = row.split()
            if len(row) < 5:
                continue  # skip blank or malformed lines
            cls = int(row[0])
            # Box area in pixels, assuming 800x800 images
            square = float(row[3]) * float(row[4]) * 800 * 800
            if square < 50 * 50:
                num_0_50[cls] += 1
            elif square < 100 * 100:
                num_50_100[cls] += 1
            elif square < 200 * 200:
                num_100_200[cls] += 1
            elif square < 300 * 300:
                num_200_300[cls] += 1
            elif square < 400 * 400:
                num_300_400[cls] += 1
            else:
                # Everything from 400x400 up to the full 800x800 image
                num_400_800[cls] += 1

    print(num_0_50)
    print(num_50_100)
    print(num_100_200)
    print(num_200_300)
    print(num_300_400)
    print(num_400_800)