Where's Wally? (GPT-4o, code feature)

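First, I tried "Where's Wally?" with a simple method.

This time I used code to analyze a "Where's Wally?" image and pick Wally out of the crowd. I prepared a large image that contains Wally and a small template image of Wally, then used template matching to locate him in the large image.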

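Template matching computes a similarity score against the Wally template at every position in the image. With cv2.TM_CCOEFF_NORMED this score is a normalized correlation coefficient between -1 and 1, which this post loosely refers to as a probability. I turned these scores into a heatmap to see the candidate locations.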
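To make the score concrete, here is a minimal pure-Python sketch of a zero-mean normalized cross-correlation, the kind of score cv2.TM_CCOEFF_NORMED produces. The function name `ccoeff_normed`, the toy grayscale grids, and the choice to return 0.0 for flat patches are my own assumptions for illustration. This is not OpenCV's implementation.

```python
from math import sqrt

def ccoeff_normed(image, template):
    """Zero-mean normalized cross-correlation at every valid position.

    `image` and `template` are 2D lists of grayscale values (toy data).
    Scores fall in [-1, 1]; flat patches get 0.0 (an assumption of this sketch).
    """
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    t_vals = [v for row in template for v in row]
    t_mean = sum(t_vals) / len(t_vals)
    t_dev = [v - t_mean for v in t_vals]
    t_norm = sum(d * d for d in t_dev)

    scores = []
    for y in range(ih - th + 1):
        row_scores = []
        for x in range(iw - tw + 1):
            # Flatten the patch under the template in the same row-major order
            patch = [image[y + dy][x + dx] for dy in range(th) for dx in range(tw)]
            p_mean = sum(patch) / len(patch)
            p_dev = [v - p_mean for v in patch]
            p_norm = sum(d * d for d in p_dev)
            denom = sqrt(t_norm * p_norm)
            num = sum(a * b for a, b in zip(t_dev, p_dev))
            row_scores.append(num / denom if denom > 0 else 0.0)
        scores.append(row_scores)
    return scores

# Toy 4x5 "scene" with the 2x2 template embedded at x=1, y=1
image = [
    [0, 0, 0, 0, 0],
    [0, 9, 1, 0, 0],
    [0, 1, 9, 0, 0],
    [0, 0, 0, 0, 0],
]
template = [
    [9, 1],
    [1, 9],
]
scores = ccoeff_normed(image, template)
# Highest score together with its (x, y) position
best = max((s, x, y) for y, row in enumerate(scores) for x, s in enumerate(row))
print(best)
```

The score map is smaller than the image (here 3 rows by 4 columns) because the template has to fit entirely inside the image. Some positions get negative scores, which is why the values are better read as match scores than as true probabilities.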

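Next, I drew bounding boxes around every region with a match score of 0.2 (20%) or higher and printed the score next to each box. At first there were so many overlapping boxes that it was hard to tell where Wally actually was. To fix this, I applied Non-Max Suppression (NMS), which removes overlapping boxes and keeps only the highest-scoring box in each overlapping group.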
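The overlap problem and its fix can be sketched in a few lines of pure Python. This is only an illustration under my own assumptions. The helper names `iou` and `greedy_nms` and the three sample boxes are hypothetical, and the +1 pixel convention is chosen to mirror the NMS function in the full listing.

```python
def iou(a, b):
    """Intersection-over-Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    iw = max(0, ix2 - ix1 + 1)
    ih = max(0, iy2 - iy1 + 1)
    inter = iw * ih
    area_a = (a[2] - a[0] + 1) * (a[3] - a[1] + 1)
    area_b = (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
    return inter / (area_a + area_b - inter)

def greedy_nms(boxes, scores, iou_threshold=0.3):
    """Keep the best box, drop boxes that overlap it too much, repeat."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep

# Two near-duplicate boxes shifted by one pixel, plus one distant box
boxes = [(10, 10, 49, 49), (11, 10, 50, 49), (200, 80, 239, 119)]
scores = [0.85, 0.80, 0.40]
kept = greedy_nms(boxes, scores)
print(kept)
```

Template matching scores every pixel offset, so a good match at one position usually comes with good matches one or two pixels away. Those neighbors overlap almost completely (IoU around 0.95 for the first two boxes here), which is why only one of them survives.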

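Finally, I varied the highlighting of likely regions according to their score. Regions scoring 80% or higher get a thicker border so they are easy to spot. With this analysis I was able to find Wally's location accurately, and I plan to share the process on social media as a fun "Where's Wally?" experience.

The code written by ChatGPT is shown below.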

import cv2
import numpy as np
import matplotlib.pyplot as plt

# Define the non-maximum suppression function
def non_max_suppression(boxes, scores, threshold=0.3):
    """
    Applies Non-Maximum Suppression (NMS) to bounding boxes.
    Parameters:
    - boxes: Array of bounding boxes (x1, y1, x2, y2)
    - scores: Array of confidence scores for each bounding box
    - threshold: Intersection-over-Union (IoU) threshold for suppression
    Returns:
    - List of indices of boxes to keep
    """
    if len(boxes) == 0:
        return []

    # Convert to float if necessary
    boxes = boxes.astype(float)

    # Coordinates of bounding boxes
    x1 = boxes[:, 0]
    y1 = boxes[:, 1]
    x2 = boxes[:, 2]
    y2 = boxes[:, 3]

    # Area of the boxes and order by score
    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    order = scores.argsort()[::-1]

    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)

        # Compute IoU of the remaining boxes with the box with the highest score
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])

        w = np.maximum(0, xx2 - xx1 + 1)
        h = np.maximum(0, yy2 - yy1 + 1)

        inter = w * h
        iou = inter / (areas[i] + areas[order[1:]] - inter)

        # Suppress boxes with IoU above threshold
        inds = np.where(iou <= threshold)[0]
        order = order[inds + 1]

    return keep

# Image file paths
main_image_path = "path_to_main_image.png"
template_image_path = "path_to_template_image.png"

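# Read the images
main_img_cv = cv2.imread(main_image_path)
template_img_cv = cv2.imread(template_image_path)

# cv2.imread returns None instead of raising when a file cannot be read
if main_img_cv is None or template_img_cv is None:
    raise FileNotFoundError("Could not read the main image or the template image")

# Perform template matching
result = cv2.matchTemplate(main_img_cv, template_img_cv, cv2.TM_CCOEFF_NORMED)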

# Visualize the score distribution as a heatmap
plt.figure(figsize=(10, 6))
plt.imshow(result, cmap='hot', interpolation='nearest')
plt.colorbar(label='Matching Probability')
plt.title("Template Matching Probability Distribution")
plt.axis('off')
plt.show()

# Find locations with a score of at least 0.2 and build bounding boxes
threshold = 0.2
locations = np.where(result >= threshold)
boxes = []
scores = []

# np.where returns (rows, cols); reverse it to iterate over (x, y) points
for pt in zip(*locations[::-1]):
    match_score = result[pt[1], pt[0]]
    # Define bounding box as (x1, y1, x2, y2)
    box = [pt[0], pt[1], pt[0] + template_img_cv.shape[1], pt[1] + template_img_cv.shape[0]]
    boxes.append(box)
    scores.append(match_score)

# Convert to numpy arrays
boxes_np = np.array(boxes)
scores_np = np.array(scores)

# Apply non-maximum suppression
keep_indices = non_max_suppression(boxes_np, scores_np, threshold=0.3)

# Draw the bounding boxes and scores on the result image
main_img_cv = cv2.imread(main_image_path)
for idx in keep_indices:
    box = boxes_np[idx]
    score = scores_np[idx]
    top_left = (int(box[0]), int(box[1]))
    bottom_right = (int(box[2]), int(box[3]))
    
    # Adjust the line thickness according to the score
    thickness = 5 if score >= 0.8 else 2
    cv2.rectangle(main_img_cv, top_left, bottom_right, (0, 255, 0), thickness)
    
    # Show the match score above the box (clamped so the text stays inside the image)
    text_y = max(top_left[1] - 10, 15)
    cv2.putText(main_img_cv, f"{score:.2%}", (top_left[0], text_y),
                cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255), 1, cv2.LINE_AA)

# Display the result image
main_img_rgb = cv2.cvtColor(main_img_cv, cv2.COLOR_BGR2RGB)
plt.figure(figsize=(10, 8))
plt.imshow(main_img_rgb)
plt.axis("off")
plt.show()

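Here is an explanation of the code.

The non_max_suppression function removes overlapping bounding boxes.
Template matching stores the match score for each coordinate in the result array.
For coordinates with a match score of 20% or higher, a bounding box is created, and Non-Max Suppression is applied to remove duplicate boxes.
In the final image, boxes with a score of 80% or higher are drawn with a thick border, and each region's match score is shown as text.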
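The step from score map to bounding boxes is easy to get wrong, because the score map is indexed as [row][column], that is [y][x], while boxes are written as (x, y). The following pure-Python sketch shows that conversion on a tiny hand-made score map. The function name `boxes_above_threshold`, the sample scores, and the template size are assumptions for illustration only.

```python
def boxes_above_threshold(score_map, template_w, template_h, threshold=0.2):
    """Turn every score >= threshold into a box (x1, y1, x2, y2).

    Each score sits at the top-left corner of where the template was placed,
    so the box extends by the template size from that point.
    """
    boxes, scores = [], []
    for y, row in enumerate(score_map):       # row index is the y coordinate
        for x, score in enumerate(row):       # column index is the x coordinate
            if score >= threshold:
                boxes.append((x, y, x + template_w, y + template_h))
                scores.append(score)
    return boxes, scores

# Hypothetical 2x3 score map: only two positions reach the 0.2 threshold
score_map = [
    [0.05, 0.10, 0.15],
    [0.10, 0.90, 0.25],
]
boxes, scores = boxes_above_threshold(score_map, template_w=4, template_h=6)
print(boxes, scores)
```

In the listing above, `zip(*locations[::-1])` does the same swap: np.where returns the row indices first and the column indices second, so reversing the tuple yields (x, y) pairs.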

So far I have found Wally using simple template matching. Next, I will try to find him without knowing the template image in advance.