MNIST 데이터셋 로더

개발/AI,ML,ALGORITHM 2023. 7. 19. 11:38

이후 구현할 머신러닝 모델 구현을 위해 데이터 셋 로더 함수를 만들어 보았다.
이 함수는 데이터 셋 이미지들(트레이닝, 테스트)을 로딩하여 이미지, 라벨(해당 이미지 숫자) 리스트를 반환하도록 만들어 보았다.

//
//  MNIST 데이터셋 로더
//
//  Created by netcanis on 2023/07/20.
//


import cv2
import os
import numpy as np

#='data/MNIST'
def load_dataset(path):
    # 데이터셋 경로
    training_set_path = os.path.join(path, 'training_set')
    test_set_path = os.path.join(path, 'test_set')
    
    # Load the images from the training set
    training_images = []
    training_labels = []
    for digit_folder in os.listdir(training_set_path):
        if os.path.isdir(os.path.join(training_set_path, digit_folder)):
            label = int(digit_folder)
            for index, image_file in enumerate(os.listdir(os.path.join(training_set_path, digit_folder))):
                if image_file.endswith('.png') or image_file.endswith('.jpg'):
                    image = cv2.imread(os.path.join(training_set_path, digit_folder, image_file))
                    image = cv2.resize(image, (28, 28))

                    # Convert color image to grayscale if necessary
                    if image.shape[2] > 1:
                        image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

                    # Print image path
                    print(str(index).zfill(5) + " " + os.path.join(training_set_path, digit_folder, image_file))

                    training_images.append(image)
                    training_labels.append(label)

    # Load the images from the test set
    test_images = []
    test_labels = []
    for digit_folder in os.listdir(test_set_path):
        if os.path.isdir(os.path.join(test_set_path, digit_folder)):
            label = int(digit_folder)
            for index, image_file in enumerate(os.listdir(os.path.join(test_set_path, digit_folder))):
                if image_file.endswith('.png') or image_file.endswith('.jpg'):
                    image = cv2.imread(os.path.join(test_set_path, digit_folder, image_file))
                    image = cv2.resize(image, (28, 28))

                    # Convert color image to grayscale if necessary
                    if image.shape[2] > 1:
                        image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

                    # Print image path
                    print(str(index).zfill(5) + " " + os.path.join(test_set_path, digit_folder, image_file))

                    test_images.append(image)
                    test_labels.append(label)

    # Convert lists to numpy arrays
    training_images = np.array(training_images)
    training_labels = np.array(training_labels)
    test_images = np.array(test_images)
    test_labels = np.array(test_labels)

    # Print the image shapes
    # Training Images shape: (60000, 28, 28) or (20000, 32, 32)
    # Test Images shape: (10000, 28, 28) or (4000, 32, 32)
    print("Training Images shape:", training_images.shape)
    print("Test Images shape:", test_images.shape)

    return training_images, training_labels, test_images, test_labels

2023.07.19 - [AI] - MNIST 데이터셋 다운로드

2023.07.19 - [AI] - MNIST 데이터셋을 이미지 파일로 복원

2023.07.19 - [AI] - MNIST 데이터셋 로더

2023.07.19 - [AI] - MNIST 모델 테스터

2023.07.19 - [AI] - MINST - SVC(Support Vector Classifier)

2023.07.19 - [AI] - MNIST - RandomForestClassifier

2023.07.19 - [AI] - MNIST - Keras

2023.07.19 - [AI] - MNIST - TensorFlowLite

저작자표시 비영리 동일조건 (새창열림)

'개발 > AI,ML,ALGORITHM' 카테고리의 다른 글

MINST - SVC(Support Vector Classifier) (0)	2023.07.19
MNIST 모델 테스터 (0)	2023.07.19
MNIST 데이터셋을 이미지 파일로 복원 (0)	2023.07.19
MNIST 데이터셋 다운로드 (0)	2023.07.19
Neural Network (XOR) (0)	2022.11.18

SKY STORY

일	월	화	수	목	금	토
		1	2	3	4	5
6	7	8	9	10	11	12
13	14	15	16	17	18	19
20	21	22	23	24	25	26
27	28	29	30	31

MNIST 데이터셋 로더

'개발 > AI,ML,ALGORITHM' 카테고리의 다른 글

공지사항

카테고리

태그목록

글 보관함

달력

링크

SKY STORY

LATEST FROM OUR BLOG

LATEST COMMENTS

BLOG VISITORS

티스토리툴바