[FastAPI + PyTorch] Building an Image Analysis API Server with a Computer Vision Model

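Today I built an API server that classifies images with the EfficientNet B0 model and stores the results in MongoDB.

The stack is FastAPI + PyTorch + MongoDB: you upload an image, and the server returns the Top-K classification results inferred by the model.

This was the subject of today's exam!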

1. Project structure

test/
├── app/
│   ├── main.py       # FastAPI app and API endpoints
│   ├── models.py     # EfficientNet B0 inference logic
│   ├── schemas.py    # Pydantic response schemas
│   └── db.py         # Async MongoDB connection
├── uploads/          # Stored uploaded images
└── requirements.txt

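The key idea is separation of concerns.

models.py → only loads the model and runs inference
db.py → only handles the DB connection
schemas.py → defines the response format
main.py → ties all of these together into the API endpoints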

2. Model setup: EfficientNet B0

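import torch
from torchvision import models
from torchvision.models import EfficientNet_B0_Weights

# Assumed device selection; the original snippet does not show how `device` is defined.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load the ImageNet-pretrained weights and switch the model to eval mode.
weights = EfficientNet_B0_Weights.DEFAULT
model = models.efficientnet_b0(weights=weights).to(device).eval()

# Preprocessing pipeline and class labels that ship with the weights.
process = weights.transforms()
categories = list(weights.meta["categories"])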

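EfficientNet B0 is a lightweight image classification model developed by Google. It is pretrained on ImageNet, so it can classify 1,000 classes.

The points worth noting here:

weights.transforms() → fetches the preprocessing that belongs to these weights (no need to write the resize, normalization, etc. by hand)
weights.meta["categories"] → gives you the 1,000 class labels directly
torch.inference_mode() → turns off gradient tracking during inference, which saves memory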
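
To get a feel for what the normalization part of weights.transforms() does, here is a pure-Python sketch for a single pixel. It assumes the usual ImageNet mean and standard deviation that torchvision's classification presets use; the helper name normalize_pixel is made up for this example, and the real transform works on whole tensors after resizing and cropping.

```python
# ImageNet channel statistics (RGB order).
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb):
    """Scale an 8-bit RGB pixel to [0, 1], then standardize each channel."""
    return tuple(
        (value / 255.0 - mean) / std
        for value, mean, std in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)
    )


# A hypothetical pixel close to the ImageNet average color.
print(normalize_pixel((124, 116, 104)))
```

A pixel near the dataset's average color lands close to zero in every channel, which is the input range the pretrained model expects.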

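def infer_topk_efficientnet_b0(pil_img, top_k=3):
    with torch.inference_mode():
        # Preprocess, add a batch dimension, and move to the model's device.
        x = process(pil_img).unsqueeze(0).to(device)
        logits = model(x)
        probs = torch.softmax(logits, dim=1)[0]
        scores, indices = torch.topk(probs, top_k)

    # Convert tensors to plain Python values so the result can be serialized.
    return [
        {"label": categories[idx], "score": score}
        for score, idx in zip(scores.tolist(), indices.tolist())
    ]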

softmax turns the logits into probabilities, and topk keeps only the K highest ones, so the function stays short and clean.
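
To see what softmax and topk actually compute without loading the model, here is a pure-Python sketch of the same two steps. The logits and labels are made up for illustration, and the helper functions are simplified stand-ins for torch.softmax and torch.topk, not their real implementations.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def topk(probs, k):
    """Return (score, index) pairs for the k largest probabilities."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return [(score, idx) for idx, score in ranked[:k]]


# Hypothetical logits and labels, for illustration only.
labels = ["cat", "dog", "bird", "fish"]
logits = [1.0, 3.0, 0.5, 2.0]

probs = softmax(logits)
result = [{"label": labels[idx], "score": score} for score, idx in topk(probs, 2)]
print(result)
```

The output has the same shape as the API response: a list of label/score pairs ordered from most to least likely.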

3. API endpoints

POST /inference: image upload and inference

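from fastapi import FastAPI, File, Form, UploadFile

app = FastAPI()

@app.post("/inference")
async def create_inference(
    file: UploadFile = File(...),   # uploaded image
    model_name: str = Form(...),    # required form field
    top_k: int = Form(3),           # optional, defaults to 3
):
    ...  # validation, inference, and the MongoDB insert (omitted in this excerpt)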

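The endpoint receives the image file and the parameters together as multipart/form-data.

I added thorough validation:

model_name is not efficientnet_b0 → 400 error
top_k is 0 or less, or greater than 50,000 → 400 error
Content-Type is not image/* → 400 error
The file cannot be opened with PIL → 400 error

When inference succeeds, the result is saved to MongoDB, and the response returns the ID of the created document along with the classification results.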
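
The first three checks above can be sketched as one small pure function. This is only an illustration: validation_error, SUPPORTED_MODELS, and MAX_TOP_K are names I made up here, and the PIL check is left out because it needs the actual file bytes.

```python
from typing import Optional

SUPPORTED_MODELS = {"efficientnet_b0"}
MAX_TOP_K = 50_000  # upper bound stated in the post


def validation_error(
    model_name: str, top_k: int, content_type: Optional[str]
) -> Optional[str]:
    """Return an error message for a bad request, or None when the input is valid."""
    if model_name not in SUPPORTED_MODELS:
        return f"unsupported model_name: {model_name}"
    if top_k <= 0 or top_k > MAX_TOP_K:
        return f"top_k must be between 1 and {MAX_TOP_K}"
    if not content_type or not content_type.startswith("image/"):
        return "file must be an image"
    return None


print(validation_error("efficientnet_b0", 3, "image/jpeg"))
print(validation_error("resnet50", 3, "image/jpeg"))
```

Inside the endpoint, a non-None message would typically be raised as HTTPException(status_code=400, detail=message). Since the model only outputs 1,000 classes and torch.topk cannot return more items than the dimension holds, a tighter upper bound than 50,000 may be worth considering.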

GET /inference/{id}: single-result lookup

@app.get("/inference/{id}")
async def get_inference(id: str):
    oid = _safe_objectid(id)  # validate the ObjectId format
    doc = await col.find_one({"_id": oid})

This looks up a specific inference result by its MongoDB ObjectId. A malformed ID returns 400, and an ID that does not exist returns 404.
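
The body of _safe_objectid is not shown in this post, so the following is only my guess at the kind of format check it performs, written with the standard library. The names looks_like_objectid and lookup_status are made up; in the real code, bson.ObjectId.is_valid or constructing an ObjectId and catching bson.errors.InvalidId would be the usual route.

```python
import re

# An ObjectId is written as exactly 24 hexadecimal characters.
_OBJECTID_HEX = re.compile(r"[0-9a-fA-F]{24}")


def looks_like_objectid(value: str) -> bool:
    """Check the string form only; this does not query the database."""
    return isinstance(value, str) and _OBJECTID_HEX.fullmatch(value) is not None


def lookup_status(value: str, known_ids: set) -> int:
    """Map an id to the HTTP status the endpoint returns: 400, 404, or 200."""
    if not looks_like_objectid(value):
        return 400
    if value.lower() not in known_ids:
        return 404
    return 200


known = {"679e1a2b3c4d5e6f7a8b9c0d"}  # stand-in for the collection
print(lookup_status("not-an-id", known))
print(lookup_status("000000000000000000000000", known))
print(lookup_status("679e1a2b3c4d5e6f7a8b9c0d", known))
```

Checking the format first means a malformed ID never reaches the database at all.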

GET /inference: list with pagination

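from typing import Optional
from fastapi import Query

@app.get("/inference")
async def list_inference(
    skip: int = Query(0, ge=0),               # number of documents to skip
    limit: int = Query(10, ge=1, le=50),      # page size, capped at 50
    model_name: Optional[str] = Query(None),  # optional filter
):
    ...  # query, sort, and response building (omitted in this excerpt)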

It supports skip/limit pagination and filtering by model_name. Results are sorted newest first by created_at.
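
The filter, sort, skip, and limit steps are easy to try on an in-memory list. The sketch below uses made-up documents and a hypothetical list_results helper; it mirrors the order of operations, not the actual database query.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def list_results(docs, skip: int = 0, limit: int = 10, model_name: Optional[str] = None):
    """Filter by model_name, sort newest first, then apply skip and limit."""
    if model_name is not None:
        docs = [d for d in docs if d["model_name"] == model_name]
    ordered = sorted(docs, key=lambda d: d["created_at"], reverse=True)
    return ordered[skip : skip + limit]


# Hypothetical documents, one per minute, oldest first.
start = datetime(2025, 2, 9, 12, 0, tzinfo=timezone.utc)
docs = [
    {"n": i, "model_name": "efficientnet_b0", "created_at": start + timedelta(minutes=i)}
    for i in range(5)
]

page = list_results(docs, skip=1, limit=2)
print([d["n"] for d in page])
```

With Motor, the same shape would typically be col.find(query).sort("created_at", -1).skip(skip).limit(limit), followed by await cursor.to_list(length=limit).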

4. MongoDB connection: the Motor async driver

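from motor.motor_asyncio import AsyncIOMotorClient

# Assumed connection string for illustration; the original does not show how `url` is set.
url = "mongodb://localhost:27017"

client = AsyncIOMotorClient(url)
database = client['vision_exam']

def get_collection():
    # Index the database with a plain string name to get the collection.
    return database['inference_results']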

[Problem]

At first I used pymongo.MongoClient, and it raised the error below.

TypeError: name must be an instance of str

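[Root cause]

There were two problems.

Synchronous driver: the FastAPI handlers call the database asynchronously, as in await col.insert_one(doc), but pymongo's MongoClient is a synchronous driver, so its return values cannot be awaited.
Wrong collection access: the old code used client[database][collection], but the database and collection variables were already objects rather than strings, which raised the name must be an instance of str error.

[Fix]

Switched from pymongo.MongoClient to motor.motor_asyncio.AsyncIOMotorClient
get_collection() now returns the collection directly via database['inference_results']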
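
The first problem can be reproduced without MongoDB at all. The sketch below uses two made-up stand-in classes (not the real pymongo or Motor classes): one whose insert_one returns a plain value and one whose insert_one is a coroutine. Awaiting the plain value fails with a TypeError, while the coroutine version works.

```python
import asyncio


class SyncCollection:
    """Stand-in for a synchronous driver: insert_one returns a plain value."""

    def insert_one(self, doc):
        return {"inserted": True}


class AsyncCollection:
    """Stand-in for an async driver: insert_one is a coroutine."""

    async def insert_one(self, doc):
        return {"inserted": True}


async def save(col, doc):
    # Same call shape as the handler: await col.insert_one(doc)
    return await col.insert_one(doc)


try:
    asyncio.run(save(SyncCollection(), {"x": 1}))
    sync_error = None
except TypeError as exc:
    sync_error = str(exc)

async_result = asyncio.run(save(AsyncCollection(), {"x": 1}))
print(sync_error)
print(async_result)
```

Note that in the sync case the insert itself has already run by the time await fails, so with a real synchronous driver the document may be written even though the handler crashes.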

5. MongoDB schema

Database: vision_exam / Collection: inference_results

{
  "_id": ObjectId("679e1a2b3c4d5e6f7a8b9c0d"),
  "original_filename": "dog.jpg",
  "saved_path": "uploads/ea4f262aa1034a1b872bb67955e7b509.jpg",
  "model_name": "efficientnet_b0",
  "topk": [
    { "label": "golden retriever", "score": 0.8523 },
    { "label": "Labrador retriever", "score": 0.0712 },
    { "label": "cocker spaniel", "score": 0.0298 }
  ],
  "created_at": ISODate("2025-02-09T12:00:00.000Z")
}

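Field	Description
`original_filename`	Original name of the uploaded file
`saved_path`	Path where the file is stored on the server (renamed with a UUID)
`model_name`	Name of the model used
`topk`	Top-K classification results (label + score)
`created_at`	Time the inference was created (UTC)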
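
Here is a minimal sketch of how a document with these fields might be assembled before the insert. The build_document helper is hypothetical; it assumes the saved file keeps the original extension and gets a random UUID hex name, which matches the saved_path in the example above.

```python
import uuid
from datetime import datetime, timezone
from pathlib import Path


def build_document(original_filename: str, model_name: str, topk: list) -> dict:
    """Assemble a document shaped like the inference_results schema."""
    suffix = Path(original_filename).suffix.lower()  # keep the extension, e.g. ".jpg"
    saved_path = f"uploads/{uuid.uuid4().hex}{suffix}"  # random name avoids collisions
    return {
        "original_filename": original_filename,
        "saved_path": saved_path,
        "model_name": model_name,
        "topk": topk,
        "created_at": datetime.now(timezone.utc),  # timezone-aware UTC timestamp
    }


doc = build_document(
    "dog.jpg",
    "efficientnet_b0",
    [{"label": "golden retriever", "score": 0.8523}],
)
print(doc["saved_path"])
```

The _id field is not set here because it is added when the document is inserted.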

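[ Today's takeaways ]

Motor vs PyMongo: when the handlers of an async framework like FastAPI await their database calls, you need an async MongoDB driver such as Motor. pymongo's MongoClient is a synchronous driver, so using it with await raises an error.
TorchVision conveniences: with weights.transforms() and weights.meta["categories"], you do not have to implement preprocessing or label mapping yourself.
Why validation matters: checking the model name, file type, and top_k range up front avoids running inference that was never going to succeed.
Separation of concerns: keeping the model, DB, schemas, and routes in separate modules makes the code much easier to manage.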

