GoogLeNet (Inception Module)
- The network is built by stacking Inception Modules in sequence.
- It keeps the benefit of a large receptive field while using far fewer weight parameters.
- The inception module combines filters of several sizes in a single block.
- Features extracted with different filter sizes are concatenated, yielding a more powerful feature extractor layer.
- 1x1 convolutions are applied to cut the parameter count: a 1x1 convolution acts as a dimensionality-reduction step that compactly summarizes the features of the preceding layer while reducing the number of parameters.
1x1 convolution
- The spatial size is unchanged, so why apply it? It adds extra non-linearity (each 1x1 convolution is followed by an activation).
- It reduces (compresses) the number of channels in the feature map, which in turn reduces both computation and parameter count.
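To make the saving concrete, the short calculation below compares a direct 5x5 convolution against the 1x1-reduce-then-5x5 pattern, using the same channel counts as the 5x5 branch of the inception_3a block built later in this notebook (192 input channels, 16 reduce filters, 32 output filters); biases are ignored.
in_ch, reduce_ch, out_ch = 192, 16, 32
# Direct 5x5 convolution on all 192 input channels.
direct = 5 * 5 * in_ch * out_ch
# 1x1 bottleneck down to 16 channels, then the 5x5 convolution.
bottleneck = 1 * 1 * in_ch * reduce_ch + 5 * 5 * reduce_ch * out_ch
print(direct, bottleneck)  # 153600 vs 15872 -> roughly 10x fewer weights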
import numpy as np
import pandas as pd
import os
Keras does not provide Inception v1 or v2 as pretrained models.
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Dense, Conv2D, Dropout, Flatten, Activation, MaxPooling2D, GlobalAveragePooling2D
from tensorflow.keras.optimizers import Adam, RMSprop
from tensorflow.keras.layers import BatchNormalization
from tensorflow.keras.callbacks import ReduceLROnPlateau, EarlyStopping, ModelCheckpoint, LearningRateScheduler
from tensorflow.keras.layers import Concatenate
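As the note above says, Keras does not ship Inception v1/v2 pretrained, but it does provide InceptionV3; for reference, it can be loaded from tensorflow.keras.applications like this (a sketch for comparison only, not used in this notebook):
from tensorflow.keras.applications import InceptionV3
# Pretrained InceptionV3 backbone without its classification head.
inception_v3_model = InceptionV3(weights='imagenet', include_top=False, input_shape=(224, 224, 3))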
def inception_module(x, filters_1x1, filters_3x3_reduce, filters_3x3, filters_5x5_reduce, filters_5x5,
                     filters_pool_proj, name=None):
    '''
    x: input tensor
    filters_1x1: number of filters for the standalone 1x1 conv
    filters_3x3_reduce: number of 1x1 conv filters applied before the 3x3 conv
    filters_3x3: number of 3x3 conv filters
    filters_5x5_reduce: number of 1x1 conv filters applied before the 5x5 conv
    filters_5x5: number of 5x5 conv filters
    filters_pool_proj: number of 1x1 conv filters applied after max pooling
    '''
    # Standalone 1x1 conv branch.
    conv_1x1 = Conv2D(filters_1x1, (1, 1), padding='same', activation='relu')(x)
    # 1x1 reduce conv -> 3x3 conv branch.
    conv_3x3 = Conv2D(filters_3x3_reduce, (1, 1), padding='same', activation='relu')(x)
    conv_3x3 = Conv2D(filters_3x3, (3, 3), padding='same', activation='relu')(conv_3x3)
    # 1x1 reduce conv -> 5x5 conv branch.
    conv_5x5 = Conv2D(filters_5x5_reduce, (1, 1), padding='same', activation='relu')(x)
    conv_5x5 = Conv2D(filters_5x5, (5, 5), padding='same', activation='relu')(conv_5x5)
    # Max pooling -> 1x1 projection conv branch.
    pool_proj = MaxPooling2D((3, 3), strides=(1, 1), padding='same')(x)
    pool_proj = Conv2D(filters_pool_proj, (1, 1), padding='same', activation='relu')(pool_proj)
    # Concatenate the standalone 1x1, 3x3, 5x5, and pooling-branch feature maps along the channel axis.
    output = Concatenate(axis=-1, name=name)([conv_1x1, conv_3x3, conv_5x5, pool_proj])
    return output
Checking the structure of the Inception Module
input_tensor = Input(shape=(224, 224, 3))
x = Conv2D(64, (7, 7), padding='same', strides=(2, 2), activation='relu', name='conv_1_7x7/2')(input_tensor)
x = MaxPooling2D((3, 3), padding='same', strides=(2, 2), name='max_pool_1_3x3/2')(x)
x = Conv2D(64, (1, 1), padding='same', strides=(1, 1), activation='relu', name='conv_2a_3x3/1')(x)
x = Conv2D(192, (3, 3), padding='same', strides=(1, 1), activation='relu', name='conv_2b_3x3/1')(x)
x = MaxPooling2D((3, 3), padding='same', strides=(2, 2), name='max_pool_2_3x3/2')(x)
x = inception_module(x, filters_1x1=64,
filters_3x3_reduce=96,
filters_3x3=128,
filters_5x5_reduce=16,
filters_5x5=32,
filters_pool_proj=32,
name='inception_3a')
model = Model(inputs=input_tensor, outputs=x)
model.summary()
Model: "model"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) [(None, 224, 224, 3 0 []
)]
conv_1_7x7/2 (Conv2D) (None, 112, 112, 64 9472 ['input_1[0][0]']
)
max_pool_1_3x3/2 (MaxPooling2D (None, 56, 56, 64) 0 ['conv_1_7x7/2[0][0]']
)
conv_2a_3x3/1 (Conv2D) (None, 56, 56, 64) 4160 ['max_pool_1_3x3/2[0][0]']
conv_2b_3x3/1 (Conv2D) (None, 56, 56, 192) 110784 ['conv_2a_3x3/1[0][0]']
max_pool_2_3x3/2 (MaxPooling2D (None, 28, 28, 192) 0 ['conv_2b_3x3/1[0][0]']
)
conv2d_1 (Conv2D) (None, 28, 28, 96) 18528 ['max_pool_2_3x3/2[0][0]']
conv2d_3 (Conv2D) (None, 28, 28, 16) 3088 ['max_pool_2_3x3/2[0][0]']
max_pooling2d (MaxPooling2D) (None, 28, 28, 192) 0 ['max_pool_2_3x3/2[0][0]']
conv2d (Conv2D) (None, 28, 28, 64) 12352 ['max_pool_2_3x3/2[0][0]']
conv2d_2 (Conv2D) (None, 28, 28, 128) 110720 ['conv2d_1[0][0]']
conv2d_4 (Conv2D) (None, 28, 28, 32) 12832 ['conv2d_3[0][0]']
conv2d_5 (Conv2D) (None, 28, 28, 32) 6176 ['max_pooling2d[0][0]']
inception_3a (Concatenate) (None, 28, 28, 256) 0 ['conv2d[0][0]',
'conv2d_2[0][0]',
'conv2d_4[0][0]',
'conv2d_5[0][0]']
==================================================================================================
Total params: 288,112
Trainable params: 288,112
Non-trainable params: 0
__________________________________________________________________________________________________
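Note that the inception_3a output depth of 256 is simply the sum of the four branch channel counts (64 + 128 + 32 + 32), and the 5x5 branch shows the bottleneck saving computed earlier: conv2d_3 (the 1x1 reduce) holds 192*16 + 16 = 3,088 parameters and conv2d_4 (the 5x5 conv) holds 5*5*16*32 + 32 = 12,832.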
Building GoogLeNet with the inception module
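This is a simplified version of the original network: the two auxiliary classifiers used in the GoogLeNet paper are omitted and only the final softmax head is kept.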
def create_googlenet(in_shape=(224, 224, 3), n_classes=10):
    input_tensor = Input(in_shape)

    x = Conv2D(64, (7, 7), padding='same', strides=(2, 2), activation='relu', name='conv_1_7x7/2')(input_tensor)
    x = MaxPooling2D((3, 3), padding='same', strides=(2, 2), name='max_pool_1_3x3/2')(x)
    x = Conv2D(64, (1, 1), padding='same', strides=(1, 1), activation='relu', name='conv_2a_3x3/1')(x)
    x = Conv2D(192, (3, 3), padding='same', strides=(1, 1), activation='relu', name='conv_2b_3x3/1')(x)
    x = MaxPooling2D((3, 3), padding='same', strides=(2, 2), name='max_pool_2_3x3/2')(x)

    # First inception module
    x = inception_module(x, filters_1x1=64,
                         filters_3x3_reduce=96,
                         filters_3x3=128,
                         filters_5x5_reduce=16,
                         filters_5x5=32,
                         filters_pool_proj=32,
                         name='inception_3a')
    # Second inception module
    x = inception_module(x,
                         filters_1x1=128,
                         filters_3x3_reduce=128,
                         filters_3x3=192,
                         filters_5x5_reduce=32,
                         filters_5x5=96,
                         filters_pool_proj=64,
                         name='inception_3b')

    x = MaxPooling2D((3, 3), padding='same', strides=(2, 2), name='max_pool_3_3x3/2')(x)

    # Third inception module
    x = inception_module(x,
                         filters_1x1=192,
                         filters_3x3_reduce=96,
                         filters_3x3=208,
                         filters_5x5_reduce=16,
                         filters_5x5=48,
                         filters_pool_proj=64,
                         name='inception_4a')
    # Fourth inception module
    x = inception_module(x,
                         filters_1x1=160,
                         filters_3x3_reduce=112,
                         filters_3x3=224,
                         filters_5x5_reduce=24,
                         filters_5x5=64,
                         filters_pool_proj=64,
                         name='inception_4b')
    # Fifth inception module
    x = inception_module(x,
                         filters_1x1=128,
                         filters_3x3_reduce=128,
                         filters_3x3=256,
                         filters_5x5_reduce=24,
                         filters_5x5=64,
                         filters_pool_proj=64,
                         name='inception_4c')
    # Sixth inception module
    x = inception_module(x,
                         filters_1x1=112,
                         filters_3x3_reduce=144,
                         filters_3x3=288,
                         filters_5x5_reduce=32,
                         filters_5x5=64,
                         filters_pool_proj=64,
                         name='inception_4d')
    # Seventh inception module
    x = inception_module(x,
                         filters_1x1=256,
                         filters_3x3_reduce=160,
                         filters_3x3=320,
                         filters_5x5_reduce=32,
                         filters_5x5=128,
                         filters_pool_proj=128,
                         name='inception_4e')

    x = MaxPooling2D((3, 3), padding='same', strides=(2, 2), name='max_pool_4_3x3/2')(x)

    # Eighth inception module
    x = inception_module(x,
                         filters_1x1=256,
                         filters_3x3_reduce=160,
                         filters_3x3=320,
                         filters_5x5_reduce=32,
                         filters_5x5=128,
                         filters_pool_proj=128,
                         name='inception_5a')
    # Ninth inception module
    x = inception_module(x,
                         filters_1x1=384,
                         filters_3x3_reduce=192,
                         filters_3x3=384,
                         filters_5x5_reduce=48,
                         filters_5x5=128,
                         filters_pool_proj=128,
                         name='inception_5b')

    x = GlobalAveragePooling2D(name='avg_pool_5_3x3/1')(x)
    x = Dropout(0.5)(x)
    output = Dense(n_classes, activation='softmax', name='output')(x)
    model = Model(inputs=input_tensor, outputs=output)
    model.summary()
    return model
model = create_googlenet((224, 224, 3), n_classes=10)
Model: "model_1"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_2 (InputLayer) [(None, 224, 224, 3 0 []
)]
conv_1_7x7/2 (Conv2D) (None, 112, 112, 64 9472 ['input_2[0][0]']
)
max_pool_1_3x3/2 (MaxPooling2D (None, 56, 56, 64) 0 ['conv_1_7x7/2[0][0]']
)
conv_2a_3x3/1 (Conv2D) (None, 56, 56, 64) 4160 ['max_pool_1_3x3/2[0][0]']
conv_2b_3x3/1 (Conv2D) (None, 56, 56, 192) 110784 ['conv_2a_3x3/1[0][0]']
max_pool_2_3x3/2 (MaxPooling2D (None, 28, 28, 192) 0 ['conv_2b_3x3/1[0][0]']
)
conv2d_7 (Conv2D) (None, 28, 28, 96) 18528 ['max_pool_2_3x3/2[0][0]']
conv2d_9 (Conv2D) (None, 28, 28, 16) 3088 ['max_pool_2_3x3/2[0][0]']
max_pooling2d_1 (MaxPooling2D) (None, 28, 28, 192) 0 ['max_pool_2_3x3/2[0][0]']
conv2d_6 (Conv2D) (None, 28, 28, 64) 12352 ['max_pool_2_3x3/2[0][0]']
conv2d_8 (Conv2D) (None, 28, 28, 128) 110720 ['conv2d_7[0][0]']
conv2d_10 (Conv2D) (None, 28, 28, 32) 12832 ['conv2d_9[0][0]']
conv2d_11 (Conv2D) (None, 28, 28, 32) 6176 ['max_pooling2d_1[0][0]']
inception_3a (Concatenate) (None, 28, 28, 256) 0 ['conv2d_6[0][0]',
'conv2d_8[0][0]',
'conv2d_10[0][0]',
'conv2d_11[0][0]']
conv2d_13 (Conv2D) (None, 28, 28, 128) 32896 ['inception_3a[0][0]']
conv2d_15 (Conv2D) (None, 28, 28, 32) 8224 ['inception_3a[0][0]']
max_pooling2d_2 (MaxPooling2D) (None, 28, 28, 256) 0 ['inception_3a[0][0]']
conv2d_12 (Conv2D) (None, 28, 28, 128) 32896 ['inception_3a[0][0]']
conv2d_14 (Conv2D) (None, 28, 28, 192) 221376 ['conv2d_13[0][0]']
conv2d_16 (Conv2D) (None, 28, 28, 96) 76896 ['conv2d_15[0][0]']
conv2d_17 (Conv2D) (None, 28, 28, 64) 16448 ['max_pooling2d_2[0][0]']
inception_3b (Concatenate) (None, 28, 28, 480) 0 ['conv2d_12[0][0]',
'conv2d_14[0][0]',
'conv2d_16[0][0]',
'conv2d_17[0][0]']
max_pool_3_3x3/2 (MaxPooling2D (None, 14, 14, 480) 0 ['inception_3b[0][0]']
)
conv2d_19 (Conv2D) (None, 14, 14, 96) 46176 ['max_pool_3_3x3/2[0][0]']
conv2d_21 (Conv2D) (None, 14, 14, 16) 7696 ['max_pool_3_3x3/2[0][0]']
max_pooling2d_3 (MaxPooling2D) (None, 14, 14, 480) 0 ['max_pool_3_3x3/2[0][0]']
conv2d_18 (Conv2D) (None, 14, 14, 192) 92352 ['max_pool_3_3x3/2[0][0]']
conv2d_20 (Conv2D) (None, 14, 14, 208) 179920 ['conv2d_19[0][0]']
conv2d_22 (Conv2D) (None, 14, 14, 48) 19248 ['conv2d_21[0][0]']
conv2d_23 (Conv2D) (None, 14, 14, 64) 30784 ['max_pooling2d_3[0][0]']
inception_4a (Concatenate) (None, 14, 14, 512) 0 ['conv2d_18[0][0]',
'conv2d_20[0][0]',
'conv2d_22[0][0]',
'conv2d_23[0][0]']
conv2d_25 (Conv2D) (None, 14, 14, 112) 57456 ['inception_4a[0][0]']
conv2d_27 (Conv2D) (None, 14, 14, 24) 12312 ['inception_4a[0][0]']
max_pooling2d_4 (MaxPooling2D) (None, 14, 14, 512) 0 ['inception_4a[0][0]']
conv2d_24 (Conv2D) (None, 14, 14, 160) 82080 ['inception_4a[0][0]']
conv2d_26 (Conv2D) (None, 14, 14, 224) 226016 ['conv2d_25[0][0]']
conv2d_28 (Conv2D) (None, 14, 14, 64) 38464 ['conv2d_27[0][0]']
conv2d_29 (Conv2D) (None, 14, 14, 64) 32832 ['max_pooling2d_4[0][0]']
inception_4b (Concatenate) (None, 14, 14, 512) 0 ['conv2d_24[0][0]',
'conv2d_26[0][0]',
'conv2d_28[0][0]',
'conv2d_29[0][0]']
conv2d_31 (Conv2D) (None, 14, 14, 128) 65664 ['inception_4b[0][0]']
conv2d_33 (Conv2D) (None, 14, 14, 24) 12312 ['inception_4b[0][0]']
max_pooling2d_5 (MaxPooling2D) (None, 14, 14, 512) 0 ['inception_4b[0][0]']
conv2d_30 (Conv2D) (None, 14, 14, 128) 65664 ['inception_4b[0][0]']
conv2d_32 (Conv2D) (None, 14, 14, 256) 295168 ['conv2d_31[0][0]']
conv2d_34 (Conv2D) (None, 14, 14, 64) 38464 ['conv2d_33[0][0]']
conv2d_35 (Conv2D) (None, 14, 14, 64) 32832 ['max_pooling2d_5[0][0]']
inception_4c (Concatenate) (None, 14, 14, 512) 0 ['conv2d_30[0][0]',
'conv2d_32[0][0]',
'conv2d_34[0][0]',
'conv2d_35[0][0]']
conv2d_37 (Conv2D) (None, 14, 14, 144) 73872 ['inception_4c[0][0]']
conv2d_39 (Conv2D) (None, 14, 14, 32) 16416 ['inception_4c[0][0]']
max_pooling2d_6 (MaxPooling2D) (None, 14, 14, 512) 0 ['inception_4c[0][0]']
conv2d_36 (Conv2D) (None, 14, 14, 112) 57456 ['inception_4c[0][0]']
conv2d_38 (Conv2D) (None, 14, 14, 288) 373536 ['conv2d_37[0][0]']
conv2d_40 (Conv2D) (None, 14, 14, 64) 51264 ['conv2d_39[0][0]']
conv2d_41 (Conv2D) (None, 14, 14, 64) 32832 ['max_pooling2d_6[0][0]']
inception_4d (Concatenate) (None, 14, 14, 528) 0 ['conv2d_36[0][0]',
'conv2d_38[0][0]',
'conv2d_40[0][0]',
'conv2d_41[0][0]']
conv2d_43 (Conv2D) (None, 14, 14, 160) 84640 ['inception_4d[0][0]']
conv2d_45 (Conv2D) (None, 14, 14, 32) 16928 ['inception_4d[0][0]']
max_pooling2d_7 (MaxPooling2D) (None, 14, 14, 528) 0 ['inception_4d[0][0]']
conv2d_42 (Conv2D) (None, 14, 14, 256) 135424 ['inception_4d[0][0]']
conv2d_44 (Conv2D) (None, 14, 14, 320) 461120 ['conv2d_43[0][0]']
conv2d_46 (Conv2D) (None, 14, 14, 128) 102528 ['conv2d_45[0][0]']
conv2d_47 (Conv2D) (None, 14, 14, 128) 67712 ['max_pooling2d_7[0][0]']
inception_4e (Concatenate) (None, 14, 14, 832) 0 ['conv2d_42[0][0]',
'conv2d_44[0][0]',
'conv2d_46[0][0]',
'conv2d_47[0][0]']
max_pool_4_3x3/2 (MaxPooling2D (None, 7, 7, 832) 0 ['inception_4e[0][0]']
)
conv2d_49 (Conv2D) (None, 7, 7, 160) 133280 ['max_pool_4_3x3/2[0][0]']
conv2d_51 (Conv2D) (None, 7, 7, 32) 26656 ['max_pool_4_3x3/2[0][0]']
max_pooling2d_8 (MaxPooling2D) (None, 7, 7, 832) 0 ['max_pool_4_3x3/2[0][0]']
conv2d_48 (Conv2D) (None, 7, 7, 256) 213248 ['max_pool_4_3x3/2[0][0]']
conv2d_50 (Conv2D) (None, 7, 7, 320) 461120 ['conv2d_49[0][0]']
conv2d_52 (Conv2D) (None, 7, 7, 128) 102528 ['conv2d_51[0][0]']
conv2d_53 (Conv2D) (None, 7, 7, 128) 106624 ['max_pooling2d_8[0][0]']
inception_5a (Concatenate) (None, 7, 7, 832) 0 ['conv2d_48[0][0]',
'conv2d_50[0][0]',
'conv2d_52[0][0]',
'conv2d_53[0][0]']
conv2d_55 (Conv2D) (None, 7, 7, 192) 159936 ['inception_5a[0][0]']
conv2d_57 (Conv2D) (None, 7, 7, 48) 39984 ['inception_5a[0][0]']
max_pooling2d_9 (MaxPooling2D) (None, 7, 7, 832) 0 ['inception_5a[0][0]']
conv2d_54 (Conv2D) (None, 7, 7, 384) 319872 ['inception_5a[0][0]']
conv2d_56 (Conv2D) (None, 7, 7, 384) 663936 ['conv2d_55[0][0]']
conv2d_58 (Conv2D) (None, 7, 7, 128) 153728 ['conv2d_57[0][0]']
conv2d_59 (Conv2D) (None, 7, 7, 128) 106624 ['max_pooling2d_9[0][0]']
inception_5b (Concatenate) (None, 7, 7, 1024) 0 ['conv2d_54[0][0]',
'conv2d_56[0][0]',
'conv2d_58[0][0]',
'conv2d_59[0][0]']
avg_pool_5_3x3/1 (GlobalAverag (None, 1024) 0 ['inception_5b[0][0]']
ePooling2D)
dropout (Dropout) (None, 1024) 0 ['avg_pool_5_3x3/1[0][0]']
output (Dense) (None, 10) 10250 ['dropout[0][0]']
==================================================================================================
Total params: 5,983,802
Trainable params: 5,983,802
Non-trainable params: 0
__________________________________________________________________________________________________
Training GoogLeNet on the CIFAR10 dataset and evaluating performance
# The 32x32 CIFAR10 images are resized to 128x128 before being fed to the network.
IMAGE_SIZE = 128
BATCH_SIZE = 64
import tensorflow as tf
import numpy as np
import pandas as pd
import random as python_random
from tensorflow.keras.utils import to_categorical
from sklearn.model_selection import train_test_split
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.utils import Sequence
import cv2
import sklearn
# Scale pixel values to the 0-1 range.
def zero_one_scaler(image):
    return image/255.0

def get_preprocessed_ohe(images, labels, pre_func=None):
    # If a preprocessing function is supplied, use it to scale the image array.
    if pre_func is not None:
        images = pre_func(images)
    # Apply one-hot encoding to the labels.
    oh_labels = to_categorical(labels)
    return images, oh_labels

# Apply preprocessing and OHE to the train/validation/test sets, then return them.
def get_train_valid_test_set(train_images, train_labels, test_images, test_labels, valid_size=0.15, random_state=2021):
    # One-hot encode the train and test labels (pixel scaling is deferred to the Dataset's pre_func below).
    train_images, train_oh_labels = get_preprocessed_ohe(train_images, train_labels)
    test_images, test_oh_labels = get_preprocessed_ohe(test_images, test_labels)
    # Split the training data again to carve out a validation set.
    tr_images, val_images, tr_oh_labels, val_oh_labels = train_test_split(train_images, train_oh_labels, test_size=valid_size, random_state=random_state)
    return (tr_images, tr_oh_labels), (val_images, val_oh_labels), (test_images, test_oh_labels)
# The images_array and labels arguments both come in as numpy arrays.
# images_array holds the full set of 32x32 images.
class CIFAR_Dataset(Sequence):
    def __init__(self, images_array, labels, batch_size=BATCH_SIZE, augmentor=None, shuffle=False, pre_func=None):
        '''
        Parameters
        images_array: array of the original 32x32 images.
        labels: the labels for those images
        batch_size: number of records to fetch per __getitem__(self, index) call
        augmentor: albumentations object
        shuffle: whether to reshuffle the training data at the end of each epoch
        pre_func: preprocessing/scaling function applied to each image
        '''
        # Assign the constructor arguments to instance attributes.
        # images_array holds the full set of 32x32 images.
        self.images_array = images_array
        self.labels = labels
        self.batch_size = batch_size
        self.augmentor = augmentor
        self.pre_func = pre_func
        # Shuffling applies to training data.
        self.shuffle = shuffle
        if self.shuffle:
            # Could shuffle the data once at creation time (disabled here).
            #self.on_epoch_end()
            pass

    # A Dataset that inherits from Sequence processes its input in batch_size units.
    # __len__() reports how many batches the dataset yields for the given number of records.
    def __len__(self):
        # Divide the total record count by batch_size, rounding up when it does not divide evenly.
        # Here ceil(40000 / 64) = 625 batches, which matches the 625 steps per epoch in the training log below.
        return int(np.ceil(len(self.labels) / self.batch_size))

    # Fetches a batch_size chunk of image and label data, transforms it, and returns it.
    # index identifies which batch to return; the corresponding batch_size records are processed.
    # Returns the transformed image_batch and label_batch arrays.
    def __getitem__(self, index):
        # index indicates which batch this is.
        # Fetching sequentially means slicing index*self.batch_size:(index+1)*self.batch_size from the arrays.
        # Grab batch_size 32x32 images.
        images_fetch = self.images_array[index*self.batch_size:(index+1)*self.batch_size]
        if self.labels is not None:
            label_batch = self.labels[index*self.batch_size:(index+1)*self.batch_size]
        else:
            label_batch = None
        # If an albumentations augmentor was supplied at construction time, use it to transform the images.
        # albumentations transforms one image at a time, so iterate over the fetched batch record by record.
        # image_batch holds the transformed image values and is declared as float32.
        image_batch = np.zeros((images_fetch.shape[0], IMAGE_SIZE, IMAGE_SIZE, 3), dtype='float32')
        # Iterate over the batch: resize -> augment (when augmentor is not None) -> store in image_batch.
        for image_index in range(images_fetch.shape[0]):
            # Resize the original image to IMAGE_SIZE x IMAGE_SIZE.
            image = cv2.resize(images_fetch[image_index], (IMAGE_SIZE, IMAGE_SIZE))
            # Apply the augmentor if one was supplied.
            if self.augmentor is not None:
                image = self.augmentor(image=image)['image']
            # Apply the scaling function if one was supplied.
            if self.pre_func is not None:
                image = self.pre_func(image)
            # Store the transformed image in image_batch.
            image_batch[image_index] = image
        return image_batch, label_batch

    # Called by the model's fit() each time an epoch completes.
    def on_epoch_end(self):
        if self.shuffle:
            # Shuffle the image array and labels together so pairs stay aligned;
            # scikit-learn's utils.shuffle provides this.
            self.images_array, self.labels = sklearn.utils.shuffle(self.images_array, self.labels)
        else:
            pass
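No augmentor is used in this run, but CIFAR_Dataset accepts one. Below is a minimal sketch of what could be passed, assuming the albumentations package is installed (the pipeline itself is a hypothetical example):
import albumentations as A

# Any albumentations Compose works, since CIFAR_Dataset only calls augmentor(image=image)['image'].
augmentor = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.ShiftScaleRotate(shift_limit=0.05, scale_limit=0.05, rotate_limit=15, p=0.5)
])
# tr_ds = CIFAR_Dataset(tr_images, tr_oh_labels, batch_size=BATCH_SIZE, augmentor=augmentor, shuffle=True, pre_func=inception_preprocess)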
# Reload the CIFAR10 data, apply preprocessing/OHE, and build the train/validation/test sets.
(train_images, train_labels), (test_images, test_labels) = cifar10.load_data()
print(train_images.shape, train_labels.shape, test_images.shape, test_labels.shape)
(tr_images, tr_oh_labels), (val_images, val_oh_labels), (test_images, test_oh_labels) = \
get_train_valid_test_set(train_images, train_labels, test_images, test_labels, valid_size=0.2, random_state=2021)
print(tr_images.shape, tr_oh_labels.shape, val_images.shape, val_oh_labels.shape, test_images.shape, test_oh_labels.shape)
Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
170500096/170498071 [==============================] - 6s 0us/step
170508288/170498071 [==============================] - 6s 0us/step
(50000, 32, 32, 3) (50000, 1) (10000, 32, 32, 3) (10000, 1)
(40000, 32, 32, 3) (40000, 10) (10000, 32, 32, 3) (10000, 10) (10000, 32, 32, 3) (10000, 10)
# inception_v3's preprocess_input scales pixel values to the -1 to 1 range.
from tensorflow.keras.applications.inception_v3 import preprocess_input as inception_preprocess
tr_ds = CIFAR_Dataset(tr_images, tr_oh_labels, batch_size=BATCH_SIZE, augmentor=None, shuffle=True, pre_func=inception_preprocess)
val_ds = CIFAR_Dataset(val_images, val_oh_labels, batch_size=BATCH_SIZE, augmentor=None, shuffle=False, pre_func=inception_preprocess)
print(next(iter(tr_ds))[0].shape, next(iter(val_ds))[0].shape)
print(next(iter(tr_ds))[1].shape, next(iter(val_ds))[1].shape)
print(next(iter(tr_ds))[0][0])
(64, 128, 128, 3) (64, 128, 128, 3)
(64, 10) (64, 10)
[[[ 0.28627455 0.36470592 0.38823533]
[ 0.28627455 0.36470592 0.38823533]
[ 0.27058828 0.3411765 0.36470592]
...
[-0.42745095 -0.41960782 -0.4588235 ]
[-0.45098037 -0.4352941 -0.46666664]
[-0.45098037 -0.4352941 -0.46666664]]
[[ 0.28627455 0.36470592 0.38823533]
[ 0.28627455 0.36470592 0.38823533]
[ 0.27058828 0.3411765 0.36470592]
...
[-0.42745095 -0.41960782 -0.4588235 ]
[-0.45098037 -0.4352941 -0.46666664]
[-0.45098037 -0.4352941 -0.46666664]]
[[ 0.30196083 0.3803922 0.4039216 ]
[ 0.30196083 0.3803922 0.4039216 ]
[ 0.28627455 0.35686278 0.3803922 ]
...
[-0.42745095 -0.41960782 -0.4588235 ]
[-0.44313723 -0.4352941 -0.46666664]
[-0.44313723 -0.4352941 -0.46666664]]
...
[[ 0.827451 0.7176471 0.75686276]
[ 0.827451 0.7176471 0.75686276]
[ 0.8117647 0.69411767 0.7254902 ]
...
[-0.46666664 -0.4352941 -0.46666664]
[-0.45098037 -0.42745095 -0.4588235 ]
[-0.45098037 -0.42745095 -0.4588235 ]]
[[ 0.8352941 0.7254902 0.7647059 ]
[ 0.8352941 0.7254902 0.7647059 ]
[ 0.8117647 0.7019608 0.73333335]
...
[-0.47450978 -0.44313723 -0.47450978]
[-0.4588235 -0.4352941 -0.46666664]
[-0.4588235 -0.4352941 -0.46666664]]
[[ 0.8352941 0.7254902 0.7647059 ]
[ 0.8352941 0.7254902 0.7647059 ]
[ 0.8117647 0.7019608 0.73333335]
...
[-0.46666664 -0.44313723 -0.47450978]
[-0.4588235 -0.4352941 -0.46666664]
[-0.4588235 -0.4352941 -0.46666664]]]
gnet_model = create_googlenet(in_shape=(128, 128, 3), n_classes=10)
gnet_model.compile(optimizer=Adam(learning_rate=0.001), loss='categorical_crossentropy', metrics=['accuracy'])
# If validation loss does not improve for 3 epochs, reduce the learning rate to learning rate * 0.2.
rlr_cb = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=3, mode='min', verbose=1)
ely_cb = EarlyStopping(monitor='val_loss', patience=10, mode='min', verbose=1)
history = gnet_model.fit(tr_ds, epochs=30,
#steps_per_epoch=int(np.ceil(tr_images.shape[0]/BATCH_SIZE)),
validation_data=val_ds,
#validation_steps=int(np.ceil(val_images.shape[0]/BATCH_SIZE)),
callbacks=[rlr_cb, ely_cb]
)
Model: "model_2"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_3 (InputLayer) [(None, 128, 128, 3 0 []
)]
conv_1_7x7/2 (Conv2D) (None, 64, 64, 64) 9472 ['input_3[0][0]']
max_pool_1_3x3/2 (MaxPooling2D (None, 32, 32, 64) 0 ['conv_1_7x7/2[0][0]']
)
conv_2a_3x3/1 (Conv2D) (None, 32, 32, 64) 4160 ['max_pool_1_3x3/2[0][0]']
conv_2b_3x3/1 (Conv2D) (None, 32, 32, 192) 110784 ['conv_2a_3x3/1[0][0]']
max_pool_2_3x3/2 (MaxPooling2D (None, 16, 16, 192) 0 ['conv_2b_3x3/1[0][0]']
)
conv2d_61 (Conv2D) (None, 16, 16, 96) 18528 ['max_pool_2_3x3/2[0][0]']
conv2d_63 (Conv2D) (None, 16, 16, 16) 3088 ['max_pool_2_3x3/2[0][0]']
max_pooling2d_10 (MaxPooling2D (None, 16, 16, 192) 0 ['max_pool_2_3x3/2[0][0]']
)
conv2d_60 (Conv2D) (None, 16, 16, 64) 12352 ['max_pool_2_3x3/2[0][0]']
conv2d_62 (Conv2D) (None, 16, 16, 128) 110720 ['conv2d_61[0][0]']
conv2d_64 (Conv2D) (None, 16, 16, 32) 12832 ['conv2d_63[0][0]']
conv2d_65 (Conv2D) (None, 16, 16, 32) 6176 ['max_pooling2d_10[0][0]']
inception_3a (Concatenate) (None, 16, 16, 256) 0 ['conv2d_60[0][0]',
'conv2d_62[0][0]',
'conv2d_64[0][0]',
'conv2d_65[0][0]']
conv2d_67 (Conv2D) (None, 16, 16, 128) 32896 ['inception_3a[0][0]']
conv2d_69 (Conv2D) (None, 16, 16, 32) 8224 ['inception_3a[0][0]']
max_pooling2d_11 (MaxPooling2D (None, 16, 16, 256) 0 ['inception_3a[0][0]']
)
conv2d_66 (Conv2D) (None, 16, 16, 128) 32896 ['inception_3a[0][0]']
conv2d_68 (Conv2D) (None, 16, 16, 192) 221376 ['conv2d_67[0][0]']
conv2d_70 (Conv2D) (None, 16, 16, 96) 76896 ['conv2d_69[0][0]']
conv2d_71 (Conv2D) (None, 16, 16, 64) 16448 ['max_pooling2d_11[0][0]']
inception_3b (Concatenate) (None, 16, 16, 480) 0 ['conv2d_66[0][0]',
'conv2d_68[0][0]',
'conv2d_70[0][0]',
'conv2d_71[0][0]']
max_pool_3_3x3/2 (MaxPooling2D (None, 8, 8, 480) 0 ['inception_3b[0][0]']
)
conv2d_73 (Conv2D) (None, 8, 8, 96) 46176 ['max_pool_3_3x3/2[0][0]']
conv2d_75 (Conv2D) (None, 8, 8, 16) 7696 ['max_pool_3_3x3/2[0][0]']
max_pooling2d_12 (MaxPooling2D (None, 8, 8, 480) 0 ['max_pool_3_3x3/2[0][0]']
)
conv2d_72 (Conv2D) (None, 8, 8, 192) 92352 ['max_pool_3_3x3/2[0][0]']
conv2d_74 (Conv2D) (None, 8, 8, 208) 179920 ['conv2d_73[0][0]']
conv2d_76 (Conv2D) (None, 8, 8, 48) 19248 ['conv2d_75[0][0]']
conv2d_77 (Conv2D) (None, 8, 8, 64) 30784 ['max_pooling2d_12[0][0]']
inception_4a (Concatenate) (None, 8, 8, 512) 0 ['conv2d_72[0][0]',
'conv2d_74[0][0]',
'conv2d_76[0][0]',
'conv2d_77[0][0]']
conv2d_79 (Conv2D) (None, 8, 8, 112) 57456 ['inception_4a[0][0]']
conv2d_81 (Conv2D) (None, 8, 8, 24) 12312 ['inception_4a[0][0]']
max_pooling2d_13 (MaxPooling2D (None, 8, 8, 512) 0 ['inception_4a[0][0]']
)
conv2d_78 (Conv2D) (None, 8, 8, 160) 82080 ['inception_4a[0][0]']
conv2d_80 (Conv2D) (None, 8, 8, 224) 226016 ['conv2d_79[0][0]']
conv2d_82 (Conv2D) (None, 8, 8, 64) 38464 ['conv2d_81[0][0]']
conv2d_83 (Conv2D) (None, 8, 8, 64) 32832 ['max_pooling2d_13[0][0]']
inception_4b (Concatenate) (None, 8, 8, 512) 0 ['conv2d_78[0][0]',
'conv2d_80[0][0]',
'conv2d_82[0][0]',
'conv2d_83[0][0]']
conv2d_85 (Conv2D) (None, 8, 8, 128) 65664 ['inception_4b[0][0]']
conv2d_87 (Conv2D) (None, 8, 8, 24) 12312 ['inception_4b[0][0]']
max_pooling2d_14 (MaxPooling2D (None, 8, 8, 512) 0 ['inception_4b[0][0]']
)
conv2d_84 (Conv2D) (None, 8, 8, 128) 65664 ['inception_4b[0][0]']
conv2d_86 (Conv2D) (None, 8, 8, 256) 295168 ['conv2d_85[0][0]']
conv2d_88 (Conv2D) (None, 8, 8, 64) 38464 ['conv2d_87[0][0]']
conv2d_89 (Conv2D) (None, 8, 8, 64) 32832 ['max_pooling2d_14[0][0]']
inception_4c (Concatenate) (None, 8, 8, 512) 0 ['conv2d_84[0][0]',
'conv2d_86[0][0]',
'conv2d_88[0][0]',
'conv2d_89[0][0]']
conv2d_91 (Conv2D) (None, 8, 8, 144) 73872 ['inception_4c[0][0]']
conv2d_93 (Conv2D) (None, 8, 8, 32) 16416 ['inception_4c[0][0]']
max_pooling2d_15 (MaxPooling2D (None, 8, 8, 512) 0 ['inception_4c[0][0]']
)
conv2d_90 (Conv2D) (None, 8, 8, 112) 57456 ['inception_4c[0][0]']
conv2d_92 (Conv2D) (None, 8, 8, 288) 373536 ['conv2d_91[0][0]']
conv2d_94 (Conv2D) (None, 8, 8, 64) 51264 ['conv2d_93[0][0]']
conv2d_95 (Conv2D) (None, 8, 8, 64) 32832 ['max_pooling2d_15[0][0]']
inception_4d (Concatenate) (None, 8, 8, 528) 0 ['conv2d_90[0][0]',
'conv2d_92[0][0]',
'conv2d_94[0][0]',
'conv2d_95[0][0]']
conv2d_97 (Conv2D) (None, 8, 8, 160) 84640 ['inception_4d[0][0]']
conv2d_99 (Conv2D) (None, 8, 8, 32) 16928 ['inception_4d[0][0]']
max_pooling2d_16 (MaxPooling2D (None, 8, 8, 528) 0 ['inception_4d[0][0]']
)
conv2d_96 (Conv2D) (None, 8, 8, 256) 135424 ['inception_4d[0][0]']
conv2d_98 (Conv2D) (None, 8, 8, 320) 461120 ['conv2d_97[0][0]']
conv2d_100 (Conv2D) (None, 8, 8, 128) 102528 ['conv2d_99[0][0]']
conv2d_101 (Conv2D) (None, 8, 8, 128) 67712 ['max_pooling2d_16[0][0]']
inception_4e (Concatenate) (None, 8, 8, 832) 0 ['conv2d_96[0][0]',
'conv2d_98[0][0]',
'conv2d_100[0][0]',
'conv2d_101[0][0]']
max_pool_4_3x3/2 (MaxPooling2D (None, 4, 4, 832) 0 ['inception_4e[0][0]']
)
conv2d_103 (Conv2D) (None, 4, 4, 160) 133280 ['max_pool_4_3x3/2[0][0]']
conv2d_105 (Conv2D) (None, 4, 4, 32) 26656 ['max_pool_4_3x3/2[0][0]']
max_pooling2d_17 (MaxPooling2D (None, 4, 4, 832) 0 ['max_pool_4_3x3/2[0][0]']
)
conv2d_102 (Conv2D) (None, 4, 4, 256) 213248 ['max_pool_4_3x3/2[0][0]']
conv2d_104 (Conv2D) (None, 4, 4, 320) 461120 ['conv2d_103[0][0]']
conv2d_106 (Conv2D) (None, 4, 4, 128) 102528 ['conv2d_105[0][0]']
conv2d_107 (Conv2D) (None, 4, 4, 128) 106624 ['max_pooling2d_17[0][0]']
inception_5a (Concatenate) (None, 4, 4, 832) 0 ['conv2d_102[0][0]',
'conv2d_104[0][0]',
'conv2d_106[0][0]',
'conv2d_107[0][0]']
conv2d_109 (Conv2D) (None, 4, 4, 192) 159936 ['inception_5a[0][0]']
conv2d_111 (Conv2D) (None, 4, 4, 48) 39984 ['inception_5a[0][0]']
max_pooling2d_18 (MaxPooling2D (None, 4, 4, 832) 0 ['inception_5a[0][0]']
)
conv2d_108 (Conv2D) (None, 4, 4, 384) 319872 ['inception_5a[0][0]']
conv2d_110 (Conv2D) (None, 4, 4, 384) 663936 ['conv2d_109[0][0]']
conv2d_112 (Conv2D) (None, 4, 4, 128) 153728 ['conv2d_111[0][0]']
conv2d_113 (Conv2D) (None, 4, 4, 128) 106624 ['max_pooling2d_18[0][0]']
inception_5b (Concatenate) (None, 4, 4, 1024) 0 ['conv2d_108[0][0]',
'conv2d_110[0][0]',
'conv2d_112[0][0]',
'conv2d_113[0][0]']
avg_pool_5_3x3/1 (GlobalAverag (None, 1024) 0 ['inception_5b[0][0]']
ePooling2D)
dropout_1 (Dropout) (None, 1024) 0 ['avg_pool_5_3x3/1[0][0]']
output (Dense) (None, 10) 10250 ['dropout_1[0][0]']
==================================================================================================
Total params: 5,983,802
Trainable params: 5,983,802
Non-trainable params: 0
__________________________________________________________________________________________________
Epoch 1/30
625/625 [==============================] - 54s 67ms/step - loss: 1.8845 - accuracy: 0.2733 - val_loss: 1.5726 - val_accuracy: 0.3993 - lr: 0.0010
Epoch 2/30
625/625 [==============================] - 41s 66ms/step - loss: 1.4460 - accuracy: 0.4575 - val_loss: 1.3108 - val_accuracy: 0.5207 - lr: 0.0010
Epoch 3/30
625/625 [==============================] - 41s 66ms/step - loss: 1.1848 - accuracy: 0.5724 - val_loss: 1.0936 - val_accuracy: 0.6095 - lr: 0.0010
Epoch 4/30
625/625 [==============================] - 41s 66ms/step - loss: 1.0123 - accuracy: 0.6389 - val_loss: 0.9275 - val_accuracy: 0.6649 - lr: 0.0010
Epoch 5/30
625/625 [==============================] - 41s 66ms/step - loss: 0.8781 - accuracy: 0.6899 - val_loss: 0.8437 - val_accuracy: 0.7045 - lr: 0.0010
Epoch 6/30
625/625 [==============================] - 42s 66ms/step - loss: 0.7705 - accuracy: 0.7308 - val_loss: 0.8167 - val_accuracy: 0.7164 - lr: 0.0010
Epoch 7/30
625/625 [==============================] - 41s 66ms/step - loss: 0.6828 - accuracy: 0.7606 - val_loss: 0.7678 - val_accuracy: 0.7380 - lr: 0.0010
Epoch 8/30
625/625 [==============================] - 41s 66ms/step - loss: 0.6149 - accuracy: 0.7875 - val_loss: 0.7480 - val_accuracy: 0.7422 - lr: 0.0010
Epoch 9/30
625/625 [==============================] - 41s 66ms/step - loss: 0.5677 - accuracy: 0.8025 - val_loss: 0.7294 - val_accuracy: 0.7577 - lr: 0.0010
Epoch 10/30
625/625 [==============================] - 41s 66ms/step - loss: 0.5093 - accuracy: 0.8246 - val_loss: 0.7009 - val_accuracy: 0.7657 - lr: 0.0010
Epoch 11/30
625/625 [==============================] - 41s 66ms/step - loss: 0.4678 - accuracy: 0.8368 - val_loss: 0.7487 - val_accuracy: 0.7549 - lr: 0.0010
Epoch 12/30
625/625 [==============================] - 41s 66ms/step - loss: 0.4379 - accuracy: 0.8486 - val_loss: 0.7212 - val_accuracy: 0.7624 - lr: 0.0010
Epoch 13/30
625/625 [==============================] - ETA: 0s - loss: 0.4167 - accuracy: 0.8556
Epoch 00013: ReduceLROnPlateau reducing learning rate to 0.00020000000949949026.
625/625 [==============================] - 42s 66ms/step - loss: 0.4167 - accuracy: 0.8556 - val_loss: 0.7879 - val_accuracy: 0.7553 - lr: 0.0010
Epoch 14/30
625/625 [==============================] - 41s 66ms/step - loss: 0.1992 - accuracy: 0.9318 - val_loss: 0.7882 - val_accuracy: 0.7984 - lr: 2.0000e-04
Epoch 15/30
625/625 [==============================] - 41s 66ms/step - loss: 0.1116 - accuracy: 0.9624 - val_loss: 0.9173 - val_accuracy: 0.8001 - lr: 2.0000e-04
Epoch 16/30
625/625 [==============================] - ETA: 0s - loss: 0.0678 - accuracy: 0.9774
Epoch 00016: ReduceLROnPlateau reducing learning rate to 4.0000001899898055e-05.
625/625 [==============================] - 42s 68ms/step - loss: 0.0678 - accuracy: 0.9774 - val_loss: 1.1415 - val_accuracy: 0.8005 - lr: 2.0000e-04
Epoch 17/30
625/625 [==============================] - 42s 67ms/step - loss: 0.0293 - accuracy: 0.9921 - val_loss: 1.2672 - val_accuracy: 0.8013 - lr: 4.0000e-05
Epoch 18/30
625/625 [==============================] - 42s 67ms/step - loss: 0.0179 - accuracy: 0.9954 - val_loss: 1.3837 - val_accuracy: 0.8011 - lr: 4.0000e-05
Epoch 19/30
625/625 [==============================] - ETA: 0s - loss: 0.0116 - accuracy: 0.9974
Epoch 00019: ReduceLROnPlateau reducing learning rate to 8.000000525498762e-06.
625/625 [==============================] - 42s 67ms/step - loss: 0.0116 - accuracy: 0.9974 - val_loss: 1.5095 - val_accuracy: 0.8019 - lr: 4.0000e-05
Epoch 20/30
625/625 [==============================] - 42s 67ms/step - loss: 0.0073 - accuracy: 0.9986 - val_loss: 1.5470 - val_accuracy: 0.8002 - lr: 8.0000e-06
Epoch 00020: early stopping
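Training stops early at epoch 20 with validation accuracy around 0.80. A minimal sketch for visualizing the training curves from the returned history object, assuming matplotlib is available:
import matplotlib.pyplot as plt

# Plot train/validation accuracy and loss per epoch from the History object.
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))
ax1.plot(history.history['accuracy'], label='train accuracy')
ax1.plot(history.history['val_accuracy'], label='val accuracy')
ax1.set_xlabel('epoch')
ax1.legend()
ax2.plot(history.history['loss'], label='train loss')
ax2.plot(history.history['val_loss'], label='val loss')
ax2.set_xlabel('epoch')
ax2.legend()
plt.show()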
test_ds = CIFAR_Dataset(test_images, test_oh_labels, batch_size=BATCH_SIZE, augmentor=None,
shuffle=False, pre_func=inception_preprocess)
evaluation_result = gnet_model.evaluate(test_ds)
157/157 [==============================] - 4s 25ms/step - loss: 1.6334 - accuracy: 0.7870
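The test accuracy of about 78.7% is consistent with the final validation accuracy of roughly 80%; the gap against the near-perfect training accuracy in the last epochs indicates clear overfitting, which augmentation (see the albumentations sketch above) could help address.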