
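To prepare for the machine learning models I will implement in later posts, I wrote a dataset loader function.
It loads the dataset images (training and test) and returns the images together with their labels (the digit each image shows).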
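
The loader assumes one sub-folder per digit under `training_set` and `test_set` (for example `training_set/7/`), and it uses the folder name as the label. As a minimal sketch of how those label folders could be discovered in a stable order, here is a helper named `list_label_dirs`. The name and the decision to skip non-numeric folders are assumptions for illustration, not part of the original code.

```python
import os


def list_label_dirs(root):
    """Return sorted (label, folder_path) pairs for digit-named sub-folders.

    Hypothetical helper, not from the original post. Folders whose names are
    not plain decimal digits are skipped, and the result is sorted by label
    because os.listdir() returns entries in arbitrary order.
    """
    pairs = []
    for name in os.listdir(root):
        folder = os.path.join(root, name)
        if os.path.isdir(folder) and name.isdecimal():
            pairs.append((int(name), folder))
    return sorted(pairs)
```

Sorting here only makes the traversal order reproducible between runs; it does not change which images are loaded.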

# MNIST dataset loader
# Created by netcanis on 2023/07/20.

import cv2
import os
import numpy as np

def load_dataset(path):
    # Dataset paths
    training_set_path = os.path.join(path, 'training_set')
    test_set_path = os.path.join(path, 'test_set')
    # Load the images from the training set
    training_images = []
    training_labels = []
    for digit_folder in os.listdir(training_set_path):
        if os.path.isdir(os.path.join(training_set_path, digit_folder)):
            label = int(digit_folder)
            for index, image_file in enumerate(os.listdir(os.path.join(training_set_path, digit_folder))):
                if image_file.endswith('.png') or image_file.endswith('.jpg'):
                    image = cv2.imread(os.path.join(training_set_path, digit_folder, image_file))
                    if image is None:
                        continue  # Skip files that OpenCV cannot decode
                    image = cv2.resize(image, (28, 28))

                    # Convert color image to grayscale if necessary
                    if image.ndim == 3:
                        image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

                    # Print image path
                    print(str(index).zfill(5) + " " + os.path.join(training_set_path, digit_folder, image_file))

                    # Store the image and its label
                    training_images.append(image)
                    training_labels.append(label)

    # Load the images from the test set
    test_images = []
    test_labels = []
    for digit_folder in os.listdir(test_set_path):
        if os.path.isdir(os.path.join(test_set_path, digit_folder)):
            label = int(digit_folder)
            for index, image_file in enumerate(os.listdir(os.path.join(test_set_path, digit_folder))):
                if image_file.endswith('.png') or image_file.endswith('.jpg'):
                    image = cv2.imread(os.path.join(test_set_path, digit_folder, image_file))
                    if image is None:
                        continue  # Skip files that OpenCV cannot decode
                    image = cv2.resize(image, (28, 28))

                    # Convert color image to grayscale if necessary
                    if image.ndim == 3:
                        image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

                    # Print image path
                    print(str(index).zfill(5) + " " + os.path.join(test_set_path, digit_folder, image_file))

                    # Store the image and its label
                    test_images.append(image)
                    test_labels.append(label)

    # Convert lists to numpy arrays
    training_images = np.array(training_images)
    training_labels = np.array(training_labels)
    test_images = np.array(test_images)
    test_labels = np.array(test_labels)

    # Print the image shapes (every image is resized to 28x28 above)
    # Training Images shape: (60000, 28, 28) for the full MNIST training set
    # Test Images shape: (10000, 28, 28) for the full MNIST test set
    print("Training Images shape:", training_images.shape)
    print("Test Images shape:", test_images.shape)

    return training_images, training_labels, test_images, test_labels
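
The arrays returned by the loader hold `uint8` pixel values with shape `(N, 28, 28)`. Many classifiers expect floating-point inputs with one row per sample, so a small preprocessing step is often applied before training. The following is a minimal sketch under that assumption. The helper name `prepare_for_model` and the choice of scaling to the range [0, 1] are illustrative and not from the original post, and the demo array stands in for the real output of `load_dataset`.

```python
import numpy as np


def prepare_for_model(images, flatten=True):
    """Scale uint8 images to float32 in [0, 1] and optionally flatten each one.

    Hypothetical helper, not from the original post.
    """
    scaled = np.asarray(images, dtype=np.float32) / 255.0
    if flatten:
        # (N, 28, 28) -> (N, 784); each row holds one image
        n_samples = scaled.shape[0]
        n_features = int(np.prod(scaled.shape[1:]))
        scaled = scaled.reshape(n_samples, n_features)
    return scaled


# Stand-in for the image array returned by load_dataset():
# one all-black and one all-white 28x28 image
demo_images = np.zeros((2, 28, 28), dtype=np.uint8)
demo_images[1, :, :] = 255

features = prepare_for_model(demo_images)
print(features.shape, features.dtype)
```

Passing `flatten=False` keeps the `(N, 28, 28)` layout, which may suit models that work on 2D images directly.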
