1. Recurrent Neural Network (RNN)

  • A type of neural network that contains a loop
  • Iterates over the elements of a sequence, storing the information processed so far in a state

https://aditi-mittal.medium.com/understanding-rnn-and-lstm-f7cdf6dfc14e

 

  - RNN layer

  • Input: (timesteps, input_features)
  • Output: (timesteps, output_features)
# A simple RNN forward pass expressed with numpy
import numpy as np

timesteps = 100
input_features = 32
output_features = 64

inputs = np.random.random((timesteps, input_features))

state_t = np.zeros((output_features, ))

W = np.random.random((output_features, input_features))
U = np.random.random((output_features, output_features))
b = np.random.random((output_features, ))

successive_outputs = []

for input_t in inputs:
    output_t = np.tanh(np.dot(W, input_t) + np.dot(U, state_t) + b)
    successive_outputs.append(output_t)
    state_t = output_t

final_output_sequence = np.stack(successive_outputs, axis = 0)

 

  - Recurrent layers in Keras

  • SimpleRNN layer
  • Input: (batch_size, timesteps, input_features)
  • Output
    • Controlled by the return_sequences argument
    • 3D tensor
      • Returns the full sequence of outputs for every timestep
      • (batch_size, timesteps, output_features)
    • 2D tensor
      • Returns only the last output of the input sequence
      • (batch_size, output_features)
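A quick shape check makes the two output modes concrete. This is a minimal sketch (assuming TensorFlow 2.x; the random input tensor and the layer size of 64 are arbitrary choices for illustration):
# Minimal shape check for return_sequences (assumes TensorFlow 2.x)
import tensorflow as tf
from tensorflow.keras.layers import SimpleRNN

x = tf.random.normal((16, 100, 32))                     # (batch_size, timesteps, input_features)

print(SimpleRNN(64)(x).shape)                           # (16, 64): last output only
print(SimpleRNN(64, return_sequences = True)(x).shape)  # (16, 100, 64): full output sequence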
from tensorflow.keras.layers import SimpleRNN, Embedding
from tensorflow.keras.models import Sequential

model = Sequential()
model.add(Embedding(10000, 32))
model.add(SimpleRNN(32))  # Adding return_sequences = True to SimpleRNN makes it return the full sequence of outputs
model.summary()

# Output
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 embedding (Embedding)       (None, None, 32)          320000    
                                                                 
 simple_rnn (SimpleRNN)      (None, 32)                2080      
                                                                 
=================================================================
Total params: 322,080
Trainable params: 322,080
Non-trainable params: 0
_________________________________________________________________
  • Stacking several recurrent layers one after another can be useful to increase the representational power of the network
    • In this setup, the intermediate layers must return the full sequence of outputs
model = Sequential()
model.add(Embedding(10000, 32))
model.add(SimpleRNN(32, return_sequences = True))
model.add(SimpleRNN(32, return_sequences = True))
model.add(SimpleRNN(32, return_sequences = True))
model.add(SimpleRNN(32))
model.summary()

# Output
Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 embedding_2 (Embedding)     (None, None, 32)          320000    
                                                                 
 simple_rnn_2 (SimpleRNN)    (None, None, 32)          2080      
                                                                 
 simple_rnn_3 (SimpleRNN)    (None, None, 32)          2080      
                                                                 
 simple_rnn_4 (SimpleRNN)    (None, None, 32)          2080      
                                                                 
 simple_rnn_5 (SimpleRNN)    (None, 32)                2080      
                                                                 
=================================================================
Total params: 328,320
Trainable params: 328,320
Non-trainable params: 0
_________________________________________________________________

 

  - Applying this to the IMDB data

  - Loading the data

from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing import sequence

num_words = 10000
max_len = 500
batch_size = 32

(input_train, y_train), (input_test, y_test) = imdb.load_data(num_words = num_words)
print(len(input_train))  # 25000
print(len(input_test))   # 25000

input_train = sequence.pad_sequences(input_train, maxlen = max_len)
input_test = sequence.pad_sequences(input_test, maxlen = max_len)
print(input_train.shape) # (25000, 500)
print(input_test.shape)  # (25000, 500)

 

  - Model definition

from tensorflow.keras.layers import Dense

model = Sequential()

model.add(Embedding(num_words, 32))
model.add(SimpleRNN(32))
model.add(Dense(1, activation = 'sigmoid'))

model.compile(optimizer = 'rmsprop',
              loss = 'binary_crossentropy',
              metrics = ['acc'])

model.summary()

# Output
Model: "sequential_3"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 embedding_3 (Embedding)     (None, None, 32)          320000    
                                                                 
 simple_rnn_6 (SimpleRNN)    (None, 32)                2080      
                                                                 
 dense (Dense)               (None, 1)                 33        
                                                                 
=================================================================
Total params: 322,113
Trainable params: 322,113
Non-trainable params: 0
_________________________________________________________________

 

  - Model training

history = model.fit(input_train, y_train,
                    epochs = 10,
                    batch_size = 128,
                    validation_split = 0.2)

 

 

  - Visualization

import matplotlib.pyplot as plt

loss = history.history['loss']
val_loss = history.history['val_loss']
acc = history.history['acc']
val_acc = history.history['val_acc']

epochs = range(1, len(loss) + 1)

plt.plot(epochs, loss, 'b--', label = 'train loss')
plt.plot(epochs, val_loss, 'r:', label = 'validation loss')
plt.grid()
plt.legend()

plt.figure()
plt.plot(epochs, acc, 'b--', label = 'train accuracy')
plt.plot(epochs, val_acc, 'r:', label = 'validation accuracy')
plt.grid()
plt.legend()

model.evaluate(input_test, y_test)

# Output
loss: 0.6755 - acc: 0.7756
[0.6754735112190247, 0.7755600214004517]
  • Performance is low because only 500 words of each review were fed in, not the full sequences
  • SimpleRNN is not well suited to processing long sequences

 

 

2. LSTM and GRU layers

  • SimpleRNN is too simplistic for real-world use
  • In theory, SimpleRNN can retain information from all previous timesteps at time t, but in practice it cannot learn long-range dependencies
  • Vanishing gradient problem
    • Layers such as LSTM and GRU were introduced to mitigate this

 

  - LSTM (Long Short-Term Memory)

  • The long short-term memory algorithm
  • Saves information for later, preventing older signals from gradually fading away (a minimal numpy sketch of a single LSTM step follows below)

https://colah.github.io/posts/2015-08-Understanding-LSTMs/
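To make the idea concrete, here is a minimal numpy sketch of one LSTM step, in the same spirit as the numpy RNN loop above. The gate naming follows the standard LSTM formulation; the shapes and random initialization are assumptions for illustration only.
# One LSTM step in numpy (sketch; shapes chosen for illustration)
import numpy as np

input_features, output_features = 32, 64

x_t = np.random.random((input_features, ))   # current input
h_t = np.zeros((output_features, ))          # previous hidden state
c_t = np.zeros((output_features, ))          # previous carry (cell) state

def rand_w():
    return (np.random.random((output_features, input_features)),
            np.random.random((output_features, output_features)),
            np.random.random((output_features, )))

(Wf, Uf, bf), (Wi, Ui, bi), (Wo, Uo, bo), (Wc, Uc, bc) = rand_w(), rand_w(), rand_w(), rand_w()

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

f = sigmoid(np.dot(Wf, x_t) + np.dot(Uf, h_t) + bf)   # forget gate
i = sigmoid(np.dot(Wi, x_t) + np.dot(Ui, h_t) + bi)   # input gate
o = sigmoid(np.dot(Wo, x_t) + np.dot(Uo, h_t) + bo)   # output gate

c_t = f * c_t + i * np.tanh(np.dot(Wc, x_t) + np.dot(Uc, h_t) + bc)  # new carry state
h_t = o * np.tanh(c_t)                                               # new hidden state

The carry state c_t is what lets old information flow across many timesteps without being repeatedly squashed, which is how the LSTM mitigates the vanishing gradient problem.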

 

  - Example 1) Reuters

  • A text dataset similar to IMDB
  • Consists of 46 mutually exclusive topics
    • A multi-class classification problem

  - Loading the dataset

from tensorflow.keras.datasets import reuters

num_words = 10000
(x_train, y_train), (x_test, y_test) = reuters.load_data(num_words = num_words)

print(x_train.shape) # (8982,)
print(y_train.shape) # (8982,)
print(x_test.shape)  # (2246,)
print(y_test.shape)  # (2246,)

 

  - Data preprocessing and inspection

from tensorflow.keras.preprocessing.sequence import pad_sequences

max_len = 500

pad_x_train = pad_sequences(x_train, maxlen = max_len)
pad_x_test = pad_sequences(x_test, maxlen = max_len)

print(len(pad_x_train[0]))  # 500

pad_x_train[0]

 

  - Model definition

  • Like SimpleRNN, the LSTM layer also supports the return_sequences argument
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Embedding

model = Sequential()
model.add(Embedding(input_dim = num_words, output_dim = 64))
model.add(LSTM(64, return_sequences = True))
model.add(LSTM(32))
model.add(Dense(46, activation = 'softmax'))

model.compile(optimizer = 'adam',
              loss = 'sparse_categorical_crossentropy',
              metrics = ['acc'])
model.summary()

# Output
Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 embedding_1 (Embedding)     (None, None, 64)          640000    
                                                                 
 lstm (LSTM)                 (None, None, 64)          33024     
                                                                 
 lstm_1 (LSTM)               (None, 32)                12416     
                                                                 
 dense (Dense)               (None, 46)                1518      
                                                                 
=================================================================
Total params: 686,958
Trainable params: 686,958
Non-trainable params: 0
_________________________________________________________________

 

  - Model training

history = model.fit(pad_x_train, y_train,
                    epochs = 20,
                    batch_size = 32,
                    validation_split = 0.2)

 

  - Visualization

import matplotlib.pyplot as plt

loss = history.history['loss']
val_loss = history.history['val_loss']
acc = history.history['acc']
val_acc = history.history['val_acc']

epochs = range(1, len(loss) + 1)

plt.plot(epochs, loss, 'b--', label = 'train loss')
plt.plot(epochs, val_loss, 'r:', label = 'validation loss')
plt.grid()
plt.legend()

plt.figure()
plt.plot(epochs, acc, 'b--', label = 'train accuracy')
plt.plot(epochs, val_acc, 'r:', label = 'validation accuracy')
plt.grid()
plt.legend()

 

  - Model evaluation

model.evaluate(pad_x_test, y_test)

# Output
loss: 1.6927 - acc: 0.6336
[1.692732810974121, 0.6335707902908325]

 

  - Example 2) The IMDB dataset

  - Loading the data

from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing.sequence import pad_sequences

num_words = 10000
max_len = 500
batch_size = 32

(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words = num_words)

pad_x_train = pad_sequences(x_train, maxlen = max_len)
pad_x_test = pad_sequences(x_test, maxlen = max_len)

 

  - Model definition

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM, Embedding

model = Sequential()
model.add(Embedding(num_words, 32))
model.add(LSTM(32))
model.add(Dense(1, activation = 'sigmoid'))

model.compile(optimizer = 'rmsprop',
              loss = 'binary_crossentropy',
              metrics = ['acc'])
model.summary()

# Output
Model: "sequential_3"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 embedding_3 (Embedding)     (None, None, 32)          320000    
                                                                 
 lstm_3 (LSTM)               (None, 32)                8320      
                                                                 
 dense_2 (Dense)             (None, 1)                 33        
                                                                 
=================================================================
Total params: 328,353
Trainable params: 328,353
Non-trainable params: 0
_________________________________________________________________

 

  - Model training

history = model.fit(pad_x_train, y_train,
                    epochs = 10,
                    batch_size = 128,
                    validation_split = 0.2)

 

  - Visualization

import matplotlib.pyplot as plt

loss = history.history['loss']
val_loss = history.history['val_loss']
acc = history.history['acc']
val_acc = history.history['val_acc']

epochs = range(1, len(loss) + 1)

plt.plot(epochs, loss, 'b--', label = 'train loss')
plt.plot(epochs, val_loss, 'r:', label = 'validation loss')
plt.grid()
plt.legend()

plt.figure()
plt.plot(epochs, acc, 'b--', label = 'train accuracy')
plt.plot(epochs, val_acc, 'r:', label = 'validation accuracy')
plt.grid()
plt.legend()

 

  - Model evaluation

model.evaluate(pad_x_test, y_test)

# Output
loss: 0.9135 - acc: 0.7898
[0.9135046601295471, 0.7898399829864502]
  • A better result than with SimpleRNN, which gave a loss of 0.6755 and an accuracy of 0.7756

 

 

3. Recurrent neural network on a cosine time series

# Cosine time-series data
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(111)
time = np.arange(30 * 12 + 1)
month_time = (time % 30) / 30
time_series = 20 * np.where(month_time < 0.5,
                            np.cos(2 * np.pi * month_time),
                            np.cos(2 * np.pi * month_time) + np.random.random(361))
plt.figure(figsize = (15, 8))
plt.xlabel('Time')
plt.ylabel('Value')
plt.plot(np.arange(0, 30 * 11 + 1),
         time_series[:30 * 11 + 1],
         color = 'blue', alpha = 0.6, label = 'Train Data')
plt.plot(np.arange(30 * 11, 30 * 12 + 1),
         time_series[30 * 11:],
         color = 'orange', label = 'Test Data')
plt.show()

 

  - Data preprocessing

def make_data(time_series, n):
    x_train_full, y_train_full = list(), list()

    for i in range(len(time_series)):
        x = time_series[i:(i + n)]
        if (i + n) < len(time_series):
            x_train_full.append(x)
            y_train_full.append(time_series[i + n])
        else:
            break
    
    x_train_full, y_train_full = np.array(x_train_full), np.array(y_train_full)

    return x_train_full, y_train_full

n = 10
x_train_full, y_train_full = make_data(time_series, n)

print(x_train_full.shape) # (351, 10)
print(y_train_full.shape) # (351,)


# Add a trailing feature dimension of 1
x_train_full = x_train_full.reshape(-1, n, 1)
y_train_full = y_train_full.reshape(-1, 1)

print(x_train_full.shape) # (351, 10, 1)
print(y_train_full.shape) # (351, 1)

 

  - Creating the test dataset


# Split into train and test data
x_train = x_train_full[:30 * 11]
y_train = y_train_full[:30 * 11]

x_test = x_train_full[30 * 11:]
y_test = y_train_full[30 * 11:]

print(x_train.shape) # (330, 10, 1)
print(y_train.shape) # (330, 1)
print(x_test.shape)  # (21, 10, 1)
print(y_test.shape)  # (21, 1)

 

  - Checking the data

sample_series = np.arange(100)
a, b = make_data(sample_series, 10)

print(a[0])  # [0 1 2 3 4 5 6 7 8 9]
print(b[0])  # 10

 

  - Model definition

from tensorflow.keras.layers import SimpleRNN, Flatten, Dense
from tensorflow.keras.models import Sequential

def build_model(n):
    model = Sequential()

    model.add(SimpleRNN(units = 32, activation = 'tanh', input_shape = (n, 1)))
    model.add(Dense(1))

    model.compile(optimizer = 'adam',
                  loss = 'mse')
    return model

model = build_model(10)
model.summary()

# Output
Model: "sequential_4"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 simple_rnn (SimpleRNN)      (None, 32)                1088      
                                                                 
 dense_3 (Dense)             (None, 1)                 33        
                                                                 
=================================================================
Total params: 1,121
Trainable params: 1,121
Non-trainable params: 0
_________________________________________________________________

 

  - Model training

model.fit(x_train, y_train,
          epochs = 100, batch_size = 12)

 

  - Plotting the predictions

prediction = model.predict(x_test)

pred_range = np.arange(len(y_train), len(y_train) + len(prediction))

plt.figure(figsize = (12, 5))
plt.xlabel('Time')
plt.ylabel('Value')
plt.plot(pred_range, y_test.flatten(), color = 'orange', label = 'Ground Truth')
plt.plot(pred_range, prediction.flatten(), color = 'blue', label = 'Prediction')
plt.legend()
plt.show()

 

  - Rebuilding the model

  • Using LSTM
from tensorflow.keras.layers import LSTM

def build_model2(n):
    model = Sequential()

    model.add(LSTM(units = 64, return_sequences = True, input_shape = (n, 1)))
    model.add(LSTM(32))
    model.add(Dense(1))

    model.compile(optimizer = 'adam',
                  loss = 'mse')
    return model

model2 = build_model2(10)
model2.summary()

# Output
Model: "sequential_6"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 lstm_4 (LSTM)               (None, 10, 64)            16896     
                                                                 
 lstm_5 (LSTM)               (None, 32)                12416     
                                                                 
 dense_5 (Dense)             (None, 1)                 33        
                                                                 
=================================================================
Total params: 29,345
Trainable params: 29,345
Non-trainable params: 0
_________________________________________________________________

 

  - Retraining and plotting the predictions

model2.fit(x_train, y_train,
           epochs = 100, batch_size = 12)

prediction_2 = model2.predict(x_test)

pred_range = np.arange(len(y_train), len(y_train) + len(prediction_2))

plt.figure(figsize = (12, 5))
plt.xlabel('Time')
plt.ylabel('Value')
plt.plot(pred_range, y_test.flatten(), color = 'orange', label = 'Ground Truth')
plt.plot(pred_range, prediction.flatten(), 'r:', label = 'Model1 Prediction')
plt.plot(pred_range, prediction_2.flatten(), color = 'blue', label = 'Model2 Prediction')
plt.legend()
plt.show()

 

  - Rebuilding the model

  • Using GRU (a simpler structure than LSTM)
from tensorflow.keras.layers import GRU

def build_model3(n):
    model = Sequential()

    model.add(GRU(units = 30, return_sequences = True, input_shape = (n, 1)))
    model.add(GRU(30))
    model.add(Dense(1))

    model.compile(optimizer = 'adam',
                  loss = 'mse')
    return model

model_3 = build_model3(10)
model_3.summary()

# Output
Model: "sequential_7"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 gru (GRU)                   (None, 10, 30)            2970      
                                                                 
 gru_1 (GRU)                 (None, 30)                5580      
                                                                 
 dense_6 (Dense)             (None, 1)                 31        
                                                                 
=================================================================
Total params: 8,581
Trainable params: 8,581
Non-trainable params: 0
_________________________________________________________________

 

  - Retraining and plotting the predictions

model_3.fit(x_train, y_train,
           epochs = 100, batch_size = 12)

prediction_3 = model_3.predict(x_test)

pred_range = np.arange(len(y_train), len(y_train) + len(prediction_3))

plt.figure(figsize = (12, 5))
plt.xlabel('Time')
plt.ylabel('Value')
plt.plot(pred_range, y_test.flatten(), color = 'orange', label = 'Ground Truth')
plt.plot(pred_range, prediction.flatten(), 'r:', label = 'Model1 Prediction')
plt.plot(pred_range, prediction_2.flatten(), color = 'blue', label = 'Model2 Prediction')
plt.plot(pred_range, prediction_3.flatten(), color = 'green', label = 'Model3 Prediction')
plt.legend()
plt.show()

 

  - Conv1D

  • Performs well on simple problems such as text classification and time-series forecasting, as well as audio generation and machine translation
  • Not sensitive to the order of timesteps
  • 2D convolution
    • Recognizes local features
  • 1D convolution
    • Recognizes local context

 

  - Conv1D layer

  • Input: (batch_size, timesteps, channels)
  • Output: (batch_size, timesteps, filters)
  • Because the number of parameters does not grow sharply as the filter size increases, a variety of filter sizes can be used
  • If the data quality is good, it may not be necessary to try several different filter sizes

 

  - MaxPooling1D layer

  • Has a downsampling effect
  • Simply the 1D version of the usual max pooling

 

  - GlobalMaxPooling1D layer

  • Collapses the 2D (timesteps, features) output into a 1D vector, keeping only the batch dimension
  • A Flatten layer can be used instead
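The shape changes of these three layers can be checked quickly on a dummy tensor. This is a small sketch (assuming TensorFlow 2.x; the tensor size and filter settings are arbitrary and chosen to mirror the IMDB model below):
# Shape check for Conv1D / MaxPooling1D / GlobalMaxPooling1D (assumes TensorFlow 2.x)
import tensorflow as tf
from tensorflow.keras.layers import Conv1D, MaxPooling1D, GlobalMaxPooling1D

x = tf.random.normal((8, 500, 32))          # (batch_size, timesteps, channels)

c = Conv1D(32, 7, activation = 'relu')(x)   # (8, 494, 32): timesteps shrink by kernel_size - 1
p = MaxPooling1D(7)(c)                      # (8, 70, 32) : downsampled by a factor of 7
g = GlobalMaxPooling1D()(p)                 # (8, 32)     : one value per filter

print(c.shape, p.shape, g.shape)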

 

  - The IMDB dataset

  - Loading and preprocessing the data

from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras.layers import Dense, Embedding, Conv1D, MaxPooling1D, GlobalMaxPooling1D

num_words = 10000
max_len = 500
batch_size = 32

(input_train, y_train), (input_test, y_test) = imdb.load_data(num_words = num_words)

print(len(input_train))  # 25000
print(len(input_test))   # 25000

pad_x_train = pad_sequences(input_train, maxlen = max_len)
pad_x_test = pad_sequences(input_test, maxlen = max_len)

print(pad_x_train.shape) # (25000, 500)
print(pad_x_test.shape)  # (25000, 500)

 

  - Model definition

def build_model():
    model = Sequential()

    model.add(Embedding(input_dim = num_words, output_dim = 32,
                        input_length = max_len))
    model.add(Conv1D(32, 7, activation = 'relu'))
    model.add(MaxPooling1D(7))
    model.add(Conv1D(32, 5, activation = 'relu'))
    model.add(MaxPooling1D(5))
    model.add(GlobalMaxPooling1D())
    model.add(Dense(1, activation = 'sigmoid'))

    model.compile(optimizer = RMSprop(learning_rate = 1e-4),
                  loss ='binary_crossentropy',
                  metrics = ['accuracy'])
    
    return model

model = build_model()
model.summary()

# Output
Model: "sequential_13"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 embedding_5 (Embedding)     (None, 500, 32)           320000    
                                                                 
 conv1d_2 (Conv1D)           (None, 494, 32)           7200      
                                                                 
 max_pooling1d_2 (MaxPooling  (None, 70, 32)           0         
 1D)                                                             
                                                                 
 conv1d_3 (Conv1D)           (None, 66, 32)            5152      
                                                                 
 max_pooling1d_3 (MaxPooling  (None, 13, 32)           0         
 1D)                                                             
                                                                 
 global_max_pooling1d_1 (Glo  (None, 32)               0         
 balMaxPooling1D)                                                
                                                                 
 dense_12 (Dense)            (None, 1)                 33        
                                                                 
=================================================================
Total params: 332,385
Trainable params: 332,385
Non-trainable params: 0
_________________________________________________________________

 

  - Model training

history = model.fit(pad_x_train, y_train,
                    epochs = 30,
                    batch_size = 128,
                    validation_split = 0.2)

 

  - Visualization

import matplotlib.pyplot as plt

loss = history.history['loss']
val_loss = history.history['val_loss']
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']

epochs = range(1, len(loss) + 1)

plt.plot(epochs, loss, 'b--', label = 'train loss')
plt.plot(epochs, val_loss, 'r:', label = 'validation loss')
plt.grid()
plt.legend()

plt.figure()
plt.plot(epochs, acc, 'b--', label = 'train accuracy')
plt.plot(epochs, val_acc, 'r:', label = 'validation accuracy')
plt.grid()
plt.legend()

model.evaluate(pad_x_test, y_test)

# Output
loss: 0.3534 - accuracy: 0.8526
[0.35335206985473633, 0.8525999784469604]
  • Overfitting occurred, but various things could still be tried, such as using a different optimizer or adding regularization


9. Training Word2Vec directly in Keras

  - Preparing the data

from tensorflow.keras.datasets import imdb

(x_train, y_train), (x_test, y_test) = imdb.load_data()
  • Build a dictionary mapping word indices back to words
  • Index 1 marks the start of a sentence and index 2 marks out-of-vocabulary (OOV) words, so the word indices are shifted by 3
word_index = imdb.get_word_index()
index_word = {idx + 3 : word for word, idx in word_index.items()}

index_word[1] = '<START>'
index_word[2] = '<UNKNOWN>'

' '.join(index_word[i] for i in x_train[0])

# Output
"<START> this film was just brilliant casting location scenery story direction everyone's really
suited the part they played and you could just imagine being there robert redford's is an
amazing actor and now the same being director norman's father came from the same scottish
island as myself so i loved the fact there was a real connection with this film the witty
remarks throughout the film were great it was just brilliant so much that i bought the film
as soon as it was released for retail and would recommend it to everyone to watch and the fly
fishing was amazing really cried at the end it was so sad and you know what they say if you
cry at a film it must have been good and this definitely was also congratulations to the two
little boy's that played the part's of norman and paul they were just brilliant children are
often left out of the praising list i think because the stars that play them all grown up are
such a big profile for the whole film but these children are amazing and should be praised for
what they have done don't you think the whole story was so lovely because it was true and was
someone's life after all that was shared with us all"
num_words = max(index_word) + 1

 

  - Converting text back into word indices

texts = []
for data in x_train:
    text = ' '.join(index_word[i] for i in data)
    texts.append(text)

len(texts)  # 25000
  • Use a Tokenizer to turn the texts into word indices
from tensorflow.keras.preprocessing.text import Tokenizer

tok = Tokenizer()
tok.fit_on_texts(texts)

new_data = tok.texts_to_sequences(texts)
new_data[0][:10]

# Output
[28, 11, 19, 13, 41, 526, 968, 1618, 1381, 63]
# All sentences are tokenized; the sentence above is converted to those tokens and the first 10 are printed

 

  - Building word pairs

from tensorflow.keras.preprocessing.sequence import make_sampling_table, skipgrams

# Total number of tokens
VOCAB_SIZE = len(tok.word_index)
print(VOCAB_SIZE)  # 88581
  • If words are sampled uniformly at random, frequent words get drawn far more often
  • To compensate, generate a sampling table that balances the probability of drawing each word
table = make_sampling_table(VOCAB_SIZE)
  • Generate training data by drawing pairs of words and checking whether they occur within 2 words of each other (window_size = 2)
couples, labels = skipgrams(data, VOCAB_SIZE, window_size = 2, sampling_table = table)
couples[:5]

# Output
[[16876, 497], [9685, 21], [16876, 21917], [383, 5452], [2098, 13577]]
  • labels contains 1 if the pair occurs within the window, 0 otherwise
labels[:5]

# Output
[1, 1, 0, 0, 0]
  • Collect the target words into word_target and the context words into word_context
word_target, word_context = zip(*couples)
  • Convert them to arrays
word_target = np.asarray(word_target, dtype = 'int32')
word_context = np.asarray(word_context, dtype = 'int32')
labels = np.asarray(labels, dtype = 'int32')

word_target.shape    # (288,)
word_context.shape   # (288,)

 

  - The skip-gram model

  • The skip-gram model requires the functional API
from tensorflow.keras.layers import Activation, Dot, Embedding, Flatten, Input, Reshape
from tensorflow.keras.models import Model

def build_model():
    input_target = Input(shape = (1, ))
    input_context = Input(shape = (1, ))

    emb = Embedding(input_dim = VOCAB_SIZE, output_dim = 8)
    target = emb(input_target)
    context = emb(input_context)

    dot = Dot(axes = 2)([target, context])
    flatten = Reshape((1, ))(dot)
    output = Activation('sigmoid')(flatten)
    skipgram = Model(inputs = [input_target, input_context], outputs = output)

    return skipgram

model = build_model()
model.summary()

# Output
Model: "model"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
==================================================================================================
 input_3 (InputLayer)           [(None, 1)]          0           []                               
                                                                                                  
 input_4 (InputLayer)           [(None, 1)]          0           []                               
                                                                                                  
 embedding_5 (Embedding)        (None, 1, 8)         708648      ['input_3[0][0]',                
                                                                  'input_4[0][0]']                
                                                                                                  
 dot (Dot)                      (None, 1, 1)         0           ['embedding_5[0][0]',            
                                                                  'embedding_5[1][0]']            
                                                                                                  
 reshape (Reshape)              (None, 1)            0           ['dot[0][0]']                    
                                                                                                  
 activation (Activation)        (None, 1)            0           ['reshape[0][0]']                
                                                                                                  
==================================================================================================
Total params: 708,648
Trainable params: 708,648
Non-trainable params: 0
__________________________________________________________________________________________________

 

  - Compiling and training the model

from tensorflow.keras.optimizers import Adam

model.compile(optimizer = Adam(),
              loss = 'binary_crossentropy',
              metrics = ['accuracy'])

model.fit([word_target, word_context], labels, epochs = 30)

 

  - Saving and loading the embedding layer

emb = model.layers[2]
emb.get_weights()

# Output
[array([[ 0.01938832,  0.01921825, -0.0462908 , ...,  0.01147114,
         -0.04764376,  0.01121316],
        [-0.01068624, -0.04315212,  0.00839611, ..., -0.02030395,
         -0.02321514, -0.03680412],
        [ 0.00915837,  0.00973357,  0.00904005, ...,  0.01291057,
          0.04295233,  0.0488804 ],
        ...,
        [ 0.01314208,  0.02786795,  0.01130085, ...,  0.03705814,
          0.0427903 ,  0.0109529 ],
        [-0.03585767, -0.04641544, -0.02590518, ..., -0.00451361,
         -0.03019956,  0.01893195],
        [ 0.00769577, -0.02014879, -0.03623866, ..., -0.03457584,
         -0.02138668,  0.02141118]], dtype=float32)]
# Save the embedding weights
np.save('emb.npy', emb.get_weights()[0])
  • Load the embedding weights
w = np.load('emb.npy')
  • If trainable is set to False when adding the embedding layer, its weights are not trained any further
emb_ff = Embedding(input_dim = VOCAB_SIZE, output_dim = 8, input_length = 30,
                   weights = [w], trainable = False)

 

 

10. Using pretrained word embeddings: GloVe embeddings

  - Downloading the raw IMDB text

import wget
import os
import zipfile

wget.download("http://mng.bz/0tIo")

local_zip = '0tIo'
zip_ref = zipfile.ZipFile(local_zip, 'r')
zip_ref.extractall()
zip_ref.close()

imdb_dir = "aclImdb"
train_dir = os.path.join(imdb_dir, 'train')

labels = []
texts = []
for label_type in ['neg', 'pos']:
    dir_name = os.path.join(train_dir, label_type)

    for fname in os.listdir(dir_name):
        if fname[-4:] == '.txt':
            f = open(os.path.join(dir_name, fname), encoding = 'utf-8')
            texts.append(f.read())
            f.close()

            if label_type == 'neg':
                labels.append(0)
            else:
                labels.append(1)

texts[0]

# Output
"Story of a man who has unnatural feelings for a pig. Starts out with a opening scene that is
a terrific example of absurd comedy. A formal orchestra audience is turned into an insane,
violent mob by the crazy chantings of it's singers. Unfortunately it stays absurd the WHOLE
time with no general narrative eventually making it just too off putting. Even those from the
era should be turned off. The cryptic dialogue would make Shakespeare seem easy to a third grader.
On a technical level it's better than you might think with some good cinematography by future
great Vilmos Zsigmond. Future stars Sally Kirkland and Frederic Forrest can be seen briefly."

labels[0]  # 0 (a negative review)

 

  - Tokenizing the data

from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

max_len = 100
training_samples = 200
validation_samples = 10000
max_words = 10000

tokenizer = Tokenizer(num_words = max_words)
tokenizer.fit_on_texts(texts)
sequences = tokenizer.texts_to_sequences(texts)

word_index = tokenizer.word_index
print(len(word_index))  # 88582
data = pad_sequences(sequences, maxlen = max_len)
labels = np.asarray(labels)

print(data.shape)    # (25000, 100)
print(labels.shape)  # (25000,)
indices = np.arange(data.shape[0])
np.random.shuffle(indices)
data = data[indices]
labels = labels[indices]

x_train = data[:training_samples]
y_train = labels[:training_samples]
x_val = data[training_samples : training_samples + validation_samples]
y_val = labels[training_samples : training_samples + validation_samples]

print(x_train.shape)  # (200, 100)
print(y_train.shape)  # (200,)
print(x_val.shape)    # (10000, 100)
print(y_val.shape)    # (10000,)

 

  - Downloading the GloVe word embeddings

import wget

wget.download("http://nlp.stanford.edu/data/glove.6B.zip")

# Unzip the archive
local_zip = 'glove.6B.zip'
zip_ref = zipfile.ZipFile(local_zip, 'r')
zip_ref.extractall()
zip_ref.close()

 

  - Preprocessing the embeddings

  • Parsing the GloVe file
# Read the file line by line
glove_dir = "glove.6B"
embeddings_index = {}
f = open(os.path.join(glove_dir, 'glove.6B.100d.txt'), encoding = 'utf8')
for line in f:
    values = line.split()
    word = values[0]
    coefs = np.asarray(values[1:], dtype = 'float32')
    embeddings_index[word] = coefs

f.close()

print(len(embeddings_index))  # 400000
embedding_dim = 100
embedding_mat = np.zeros((max_words, embedding_dim))
for word, i in word_index.items():
    if i < max_words:
        embedding_vector = embeddings_index.get(word)
        if embedding_vector is not None:
            embedding_mat[i] = embedding_vector

embedding_mat

# Output
array([[ 0.        ,  0.        ,  0.        , ...,  0.        ,
         0.        ,  0.        ],
       [-0.038194  , -0.24487001,  0.72812003, ..., -0.1459    ,
         0.82779998,  0.27061999],
       [-0.071953  ,  0.23127   ,  0.023731  , ..., -0.71894997,
         0.86894   ,  0.19539   ],
       ...,
       [ 0.13787   , -0.17727   , -0.62436002, ...,  0.35506001,
         0.33443999,  0.14436001],
       [-0.88968998,  0.55208999, -0.50498998, ..., -0.54351002,
        -0.21874   ,  0.51186001],
       [-0.17381001, -0.037609  ,  0.068837  , ..., -0.097167  ,
         1.08840001,  0.22676   ]])

 

  - Defining the model

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Flatten, Dense

model = Sequential()

model.add(Embedding(max_words, embedding_dim, input_length = max_len))
model.add(Flatten())
model.add(Dense(32, activation = 'relu'))
model.add(Dense(1, activation = 'sigmoid'))
model.summary()

# Output
Model: "sequential_5"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 embedding_7 (Embedding)     (None, 100, 100)          1000000   
                                                                 
 flatten_2 (Flatten)         (None, 10000)             0         
                                                                 
 dense_2 (Dense)             (None, 32)                320032    
                                                                 
 dense_3 (Dense)             (None, 1)                 33        
                                                                 
=================================================================
Total params: 1,320,065
Trainable params: 1,320,065
Non-trainable params: 0
_________________________________________________________________
# Set the embedding weights
model.layers[0].set_weights([embedding_mat])

# Freeze the layer so the pretrained weights are used as-is
model.layers[0].trainable = False
model.compile(optimizer = 'rmsprop',
              loss = 'binary_crossentropy',
              metrics = ['accuracy'])

history = model.fit(x_train, y_train,
                    epochs = 10,
                    batch_size = 32,
                    validation_data = (x_val, y_val))

# Save the model weights
model.save_weights('pre_trained_glove_model.h5')

 

  - Visualization

loss = history.history['loss']
val_loss = history.history['val_loss']
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']

epochs = range(1, len(loss) + 1)

plt.plot(epochs, loss, 'b--', label = 'Training Loss')
plt.plot(epochs, val_loss, 'r:', label = 'Validation Loss')
plt.legend()
plt.grid()

plt.figure()
plt.plot(epochs, acc, 'b--', label = 'Training Accuracy')
plt.plot(epochs, val_acc, 'r:', label = 'Validation Accuracy')
plt.legend()
plt.grid()

 

11. Training the same model without pretrained word embeddings

model2 = Sequential()

model2.add(Embedding(max_words, embedding_dim, input_length = max_len))
model2.add(Flatten())
model2.add(Dense(32, activation = 'relu'))
model2.add(Dense(1, activation = 'sigmoid'))
model2.summary()

# Output
Model: "sequential_6"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 embedding_8 (Embedding)     (None, 100, 100)          1000000   
                                                                 
 flatten_3 (Flatten)         (None, 10000)             0         
                                                                 
 dense_4 (Dense)             (None, 32)                320032    
                                                                 
 dense_5 (Dense)             (None, 1)                 33        
                                                                 
=================================================================
Total params: 1,320,065
Trainable params: 1,320,065
Non-trainable params: 0
_________________________________________________________________
model2.compile(optimizer = 'rmsprop',
              loss = 'binary_crossentropy',
              metrics = ['accuracy'])
history2 = model2.fit(x_train, y_train,
                    epochs = 10,
                    batch_size = 32,
                    validation_data = (x_val, y_val))

loss = history2.history['loss']
val_loss = history2.history['val_loss']
acc = history2.history['accuracy']
val_acc = history2.history['val_accuracy']

epochs = range(1, len(loss) + 1)

plt.plot(epochs, loss, 'b--', label = 'Training Loss')
plt.plot(epochs, val_loss, 'r:', label = 'Validation Loss')
plt.legend()
plt.grid()

plt.figure()
plt.plot(epochs, acc, 'b--', label = 'Training Accuracy')
plt.plot(epochs, val_acc, 'r:', label = 'Validation Accuracy')
plt.legend()
plt.grid()

 

  - Tokenizing the test data

test_dir = os.path.join(imdb_dir, 'test')

labels = []
texts = []
for label_type in ['neg', 'pos']:
    dir_name = os.path.join(test_dir, label_type)

    for fname in os.listdir(dir_name):
        if fname[-4:] == '.txt':
            f = open(os.path.join(dir_name, fname), encoding = 'utf8')
            texts.append(f.read())
            f.close()

            if label_type == 'neg':
                labels.append(0)
            else:
                labels.append(1)

sequences = tokenizer.texts_to_sequences(texts)
x_test = pad_sequences(sequences, maxlen = max_len)
y_test = np.asarray(labels)

print(x_test.shape)  # (25000, 100)
print(y_test.shape)  # (25000,)
model.load_weights('pre_trained_glove_model.h5')
model.evaluate(x_test, y_test)

# Output
loss: 0.7546 - accuracy: 0.5566
[0.754594087600708, 0.5565599799156189]

1. Terminology

  • Token
    • The unit into which text is divided
    • Tokenization: the process of splitting text into tokens
  • n-gram
    • A group of N (or fewer) consecutive words extracted from a sentence
    • The same idea can also be applied to characters (see the short sketch below)

https://www.sqlservercentral.com/articles/nasty-fast-n-grams-part-1-character-level-unigrams
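As a quick illustration, word-level bigrams can be extracted with a few lines of Python (a minimal sketch; the sample sentence is arbitrary):
# Extract word-level n-grams from a sentence (minimal sketch)
def ngrams(text, n):
    words = text.split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

print(ngrams('The cat sat on the mat', 2))
# [('The', 'cat'), ('cat', 'sat'), ('sat', 'on'), ('on', 'the'), ('the', 'mat')]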

 

 

2. Word-level one-hot encoding (without Keras)

import numpy as np

samples = ['The cat sat on the mat.',
           'The dog ate my homeworks.']

token_index = {}

for sample in samples:
    for word in sample.split():
        if word not in token_index:
            token_index[word] = len(token_index) + 1

max_len = 10
results = np.zeros(shape = (len(samples), max_len,
                            max(token_index.values()) + 1))

# One-hot encode
for i, sample in enumerate(samples):
    for j, word in list(enumerate(sample.split()))[:max_len]:
        index = token_index.get(word)
        results[i, j, index] = 1.
results

# Output
array([[[0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],  # The
        [0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],  # cat
        [0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0.],  # sat
        [0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0.],  # on
        [0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],  # the
        [0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],  # mat
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]],

       [[0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],  # The
        [0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0.],  # dog
        [0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0.],  # ate
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0.],  # my
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1.],  # homeworks
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]]])

 

 

3. Word-level one-hot encoding with Keras

  • fit_on_texts()
  • texts_to_sequences()
  • texts_to_matrix()
from tensorflow.keras.preprocessing.text import Tokenizer

samples = ['The cat sat on the mat.',
           'The dog ate my homeworks.']

tokenizer = Tokenizer(num_words = 1000)
tokenizer.fit_on_texts(samples)

sequences = tokenizer.texts_to_sequences(samples)

ohe_results = tokenizer.texts_to_matrix(samples, mode = 'binary')

word_index = tokenizer.word_index
print(len(word_index))

# Output
9
# The tokenizer holds 9 tokens
# The sequences of word indices
sequences

# Output
[[1, 2, 3, 4, 1, 5], [1, 6, 7, 8, 9]]
# One-hot encoding result
print(ohe_results.shape)
print(ohe_results)

# Output
(2, 1000)
[[0. 1. 1. ... 0. 0. 0.]
 [0. 1. 0. ... 0. 0. 0.]]
word_index

# Output
{'the': 1,
 'cat': 2,
 'sat': 3,
 'on': 4,
 'mat': 5,
 'dog': 6,
 'ate': 7,
 'my': 8,
 'homeworks': 9}
 
 # The values in sequences are determined by the word index

 

  - Tokenization example

  • OOV: Out Of Vocabulary
    • Words in a new sentence that are not in the fitted vocabulary are replaced by the OOV token
from tensorflow.keras.preprocessing.text import Tokenizer

samples = ["I'm the smartest student.",
           "I'm the best student."]
tokenizer = Tokenizer(num_words = 10, oov_token = '<OOV>')
tokenizer.fit_on_texts(samples)

sequences = tokenizer.texts_to_sequences(samples)

binary_results = tokenizer.texts_to_matrix(samples, mode = 'binary')

print(tokenizer.word_index)

# Output
# word_index of the current tokenizer
{'<OOV>': 1, "i'm": 2, 'the': 3, 'student': 4, 'smartest': 5, 'best': 6}
binary_results

# Output
array([[0., 0., 1., 1., 1., 1., 0., 0., 0., 0.],
       [0., 0., 1., 1., 1., 0., 1., 0., 0., 0.]])
  • Test
test = ["I'm the fastest student."]
test_seq = tokenizer.texts_to_sequences(test)

print("word index:", tokenizer.word_index)
print("Test Text:", test)
print("Test Seq:", test_seq)

# Output
word index: {'<OOV>': 1, "i'm": 2, 'the': 3, 'student': 4, 'smartest': 5, 'best': 6}
Test Text: ["I'm the fastest student."]
Test Seq: [[2, 3, 1, 4]]

# 'fastest' is not in the vocabulary, so it is encoded as 1, the OOV index

 

 

4. One-hot word vectors vs. word embeddings

  • One-hot word vectors
    • Sparse
    • High-dimensional
  • Word embeddings
    • Dense
    • Low-dimensional

https://freecontent.manning.com/deep-learning-for-text/

 

 

5. Word embeddings

  • Similar words are embedded close together, i.e. the distance between their vectors is small
  • Besides distance, particular directions in the embedding space can also carry meaning

https://towardsdatascience.com/creating-word-embeddings-coding-the-word2vec-algorithm-in-python-using-deep-learning-b337d0ba17a8

 

  - Embedding Layer

  • A dictionary-like layer that maps integer indices representing words to dense vectors
  • Input: (samples, sequence_length)
  • Output: (samples, sequence_length, dim)
from tensorflow.keras.layers import Embedding

embedding_layer = Embedding(1000, 64)
embedding_layer

# Output
<keras.layers.core.embedding.Embedding at 0x265f5b12fa0>
# An Embedding layer object is returned
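Calling the layer on a batch of integer indices shows the input and output shapes described above. A small sketch (the indices are arbitrary and the layer is untrained, so the returned vectors are just random initial values):
# Feed a batch of integer word indices through the Embedding layer
import numpy as np

batch = np.array([[4, 17, 250, 3],
                  [9, 1, 0, 42]])      # (samples, sequence_length) = (2, 4)

vectors = embedding_layer(batch)       # uses the Embedding(1000, 64) layer created above
print(vectors.shape)                   # (2, 4, 64): (samples, sequence_length, dim)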

 

 

6. Example: the IMDB data

  • The Internet Movie Database
  • A dataset of 50,000 strongly polarized reviews
    • Training data: 25,000 reviews
    • Test data: 25,000 reviews

 

  - modules import

from tensorflow.keras.datasets import imdb
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Dense, Flatten

 

  - Loading the data

num_words = 1000
max_len = 20

(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words = num_words)

print(x_train.shape)
print(y_train.shape)
print(x_test.shape)
print(y_test.shape)

# Output
(25000,)
(25000,)
(25000,)
(25000,)

 

  - Inspecting the data

  • Positive: 1
  • Negative: 0
print(x_train[0])
print(y_train[0])

# Output
# The review's word-index sequence and its positive/negative label
[1, 14, 22, 16, 43, 530, 973, 2, 2, 65, 458, 2, 66, 2, 4, 173, 36, 256, 5, 25, 100, 43, 838, 112, 50, 670, 2, 9, 35, 480, 284, 5, 150, 4, 172, 112, 167, 2, 336, 385, 39, 4, 172, 2, 2, 17, 546, 38, 13, 447, 4, 192, 50, 16, 6, 147, 2, 19, 14, 22, 4, 2, 2, 469, 4, 22, 71, 87, 12, 16, 43, 530, 38, 76, 15, 13, 2, 4, 22, 17, 515, 17, 12, 16, 626, 18, 2, 5, 62, 386, 12, 8, 316, 8, 106, 5, 4, 2, 2, 16, 480, 66, 2, 33, 4, 130, 12, 16, 38, 619, 5, 25, 124, 51, 36, 135, 48, 25, 2, 33, 6, 22, 12, 215, 28, 77, 52, 5, 14, 407, 16, 82, 2, 8, 4, 107, 117, 2, 15, 256, 4, 2, 7, 2, 5, 723, 36, 71, 43, 530, 476, 26, 400, 317, 46, 7, 4, 2, 2, 13, 104, 88, 4, 381, 15, 297, 98, 32, 2, 56, 26, 141, 6, 194, 2, 18, 4, 226, 22, 21, 134, 476, 26, 480, 5, 144, 30, 2, 18, 51, 36, 28, 224, 92, 25, 104, 4, 226, 65, 16, 38, 2, 88, 12, 16, 283, 5, 16, 2, 113, 103, 32, 15, 16, 2, 19, 178, 32]
1

 

  - Note) The most frequently used words in the IMDB dataset

word_index = {}

for key, val in imdb.get_word_index().items():
    word_index[val] = key

for i in range(1, 6):
    print(word_index[i])

# Output
the
and
a
of
to

 

  - Data preprocessing

  • Make all the samples the same length
    • pad_sequences()
      • If a sample is longer than maxlen, it is truncated
      • If a sample is shorter, it is padded according to the padding option
        • pre: pad with zeros at the front
        • post: pad with zeros at the end
  • Every sample (each sentence) must be the same length before the Embedding layer can be used (a toy example of pre vs. post padding follows below)
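Before applying this to the IMDB reviews, a toy example may make the pre/post options concrete (a minimal sketch with made-up sequences):
# pre vs. post padding on toy sequences (minimal sketch)
from tensorflow.keras.preprocessing.sequence import pad_sequences

seqs = [[1, 2, 3], [4, 5], [6, 7, 8, 9, 10, 11]]

print(pad_sequences(seqs, maxlen = 5, padding = 'pre'))
# [[ 0  0  1  2  3]
#  [ 0  0  0  4  5]
#  [ 7  8  9 10 11]]   <- longer sequences are truncated (from the front by default)

print(pad_sequences(seqs, maxlen = 5, padding = 'post'))
# [[ 1  2  3  0  0]
#  [ 4  5  0  0  0]
#  [ 7  8  9 10 11]]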
from tensorflow.keras.preprocessing.sequence import pad_sequences

pad_x_train = pad_sequences(x_train, maxlen = max_len, padding = 'pre')
pad_x_test = pad_sequences(x_test, maxlen = max_len, padding = 'pre')

print(len(x_train[0]))
print(len(pad_x_train[0]))

# Output
218
20
# Reduced to the maximum length
print(x_train[0])
print(pad_x_train[0])

# Output
[1, 14, 22, 16, 43, 530, 973, 2, 2, 65, 458, 2, 66, 2, 4, 173, 36, 256, 5, 25, 100, 43, 838, 112, 50, 670, 2, 9, 35, 480, 284, 5, 150, 4, 172, 112, 167, 2, 336, 385, 39, 4, 172, 2, 2, 17, 546, 38, 13, 447, 4, 192, 50, 16, 6, 147, 2, 19, 14, 22, 4, 2, 2, 469, 4, 22, 71, 87, 12, 16, 43, 530, 38, 76, 15, 13, 2, 4, 22, 17, 515, 17, 12, 16, 626, 18, 2, 5, 62, 386, 12, 8, 316, 8, 106, 5, 4, 2, 2, 16, 480, 66, 2, 33, 4, 130, 12, 16, 38, 619, 5, 25, 124, 51, 36, 135, 48, 25, 2, 33, 6, 22, 12, 215, 28, 77, 52, 5, 14, 407, 16, 82, 2, 8, 4, 107, 117, 2, 15, 256, 4, 2, 7, 2, 5, 723, 36, 71, 43, 530, 476, 26, 400, 317, 46, 7, 4, 2, 2, 13, 104, 88, 4, 381, 15, 297, 98, 32, 2, 56, 26, 141, 6, 194, 2, 18, 4, 226, 22, 21, 134, 476, 26, 480, 5, 144, 30, 2, 18, 51, 36, 28, 224, 92, 25, 104, 4, 226, 65, 16, 38, 2, 88, 12, 16, 283, 5, 16, 2, 113, 103, 32, 15, 16, 2, 19, 178, 32]
[ 65  16  38   2  88  12  16 283   5  16   2 113 103  32  15  16   2  19  178  32]

 

  - Model definition

model = Sequential()

model.add(Embedding(input_dim = num_words, output_dim = 32, input_length = max_len))
model.add(Flatten())
model.add(Dense(1, activation = 'sigmoid'))

model.summary()

# Output
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 embedding_1 (Embedding)     (None, 20, 32)            32000     
                                                                 
 flatten (Flatten)           (None, 640)               0         
                                                                 
 dense (Dense)               (None, 1)                 641       
                                                                 
=================================================================
Total params: 32,641
Trainable params: 32,641
Non-trainable params: 0
_________________________________________________________________

 

  - Compiling and training the model

model.compile(optimizer = 'rmsprop',
              loss = 'binary_crossentropy',
              metrics = ['accuracy'])

history = model.fit(pad_x_train, y_train,
                    epochs = 10,
                    batch_size = 32,
                    validation_split = 0.2)

 

  - Visualization

import matplotlib.pyplot as plt

hist_dict = history.history

plt.plot(hist_dict['loss'], 'b--', label = 'Train Loss')
plt.plot(hist_dict['val_loss'], 'r:', label = 'Validation Loss')
plt.legend()
plt.grid()

plt.figure()
plt.plot(hist_dict['accuracy'], 'b--', label = 'Train Accuracy')
plt.plot(hist_dict['val_accuracy'], 'r:', label = 'Validation Accuracy')
plt.legend()
plt.grid()

plt.show()

 

  - Model evaluation

model.evaluate(pad_x_test, y_test)

# Output
loss: 0.5986 - accuracy: 0.7085
[0.5986294150352478, 0.7085199952125549]

 

  - Retraining after increasing the number of words per review (max_len)

num_words = 1000
max_len = 500

pad_x_train_2 = pad_sequences(x_train, maxlen = max_len, padding = 'pre')
pad_x_test_2 = pad_sequences(x_test, maxlen = max_len, padding = 'pre')

print(x_train[0])
print(pad_x_train_2[0])

# Output
[1, 14, 22, 16, 43, 530, 973, 2, 2, 65, 458, 2, 66, 2, 4, 173, 36, 256, 5, 25, 100, 43, 838, 112, 50, 670, 2, 9, 35, 480, 284, 5, 150, 4, 172, 112, 167, 2, 336, 385, 39, 4, 172, 2, 2, 17, 546, 38, 13, 447, 4, 192, 50, 16, 6, 147, 2, 19, 14, 22, 4, 2, 2, 469, 4, 22, 71, 87, 12, 16, 43, 530, 38, 76, 15, 13, 2, 4, 22, 17, 515, 17, 12, 16, 626, 18, 2, 5, 62, 386, 12, 8, 316, 8, 106, 5, 4, 2, 2, 16, 480, 66, 2, 33, 4, 130, 12, 16, 38, 619, 5, 25, 124, 51, 36, 135, 48, 25, 2, 33, 6, 22, 12, 215, 28, 77, 52, 5, 14, 407, 16, 82, 2, 8, 4, 107, 117, 2, 15, 256, 4, 2, 7, 2, 5, 723, 36, 71, 43, 530, 476, 26, 400, 317, 46, 7, 4, 2, 2, 13, 104, 88, 4, 381, 15, 297, 98, 32, 2, 56, 26, 141, 6, 194, 2, 18, 4, 226, 22, 21, 134, 476, 26, 480, 5, 144, 30, 2, 18, 51, 36, 28, 224, 92, 25, 104, 4, 226, 65, 16, 38, 2, 88, 12, 16, 283, 5, 16, 2, 113, 103, 32, 15, 16, 2, 19, 178, 32]
[  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   1  14  22  16  43 530
 973   2   2  65 458   2  66   2   4 173  36 256   5  25 100  43 838 112
  50 670   2   9  35 480 284   5 150   4 172 112 167   2 336 385  39   4
 172   2   2  17 546  38  13 447   4 192  50  16   6 147   2  19  14  22
   4   2   2 469   4  22  71  87  12  16  43 530  38  76  15  13   2   4
  22  17 515  17  12  16 626  18   2   5  62 386  12   8 316   8 106   5
   4   2   2  16 480  66   2  33   4 130  12  16  38 619   5  25 124  51
  36 135  48  25   2  33   6  22  12 215  28  77  52   5  14 407  16  82
   2   8   4 107 117   2  15 256   4   2   7   2   5 723  36  71  43 530
 476  26 400 317  46   7   4   2   2  13 104  88   4 381  15 297  98  32
   2  56  26 141   6 194   2  18   4 226  22  21 134 476  26 480   5 144
  30   2  18  51  36  28 224  92  25 104   4 226  65  16  38   2  88  12
  16 283   5  16   2 113 103  32  15  16   2  19 178  32]

# Padded with zeros up to the maximum length of 500; with 'pre', the zeros go at the front
model = Sequential()

model.add(Embedding(input_dim = num_words, output_dim = 32, input_length = max_len))
model.add(Flatten())
model.add(Dense(1, activation = 'sigmoid'))

model.compile(optimizer = 'rmsprop',
              loss = 'binary_crossentropy',
              metrics = ['accuracy'])

history2 = model.fit(pad_x_train_2, y_train,
                    epochs = 10,
                    batch_size = 32,
                    validation_split = 0.2)

hist_dict_2 = history2.history

plt.plot(hist_dict_2['loss'], 'b--', label = 'Train Loss')
plt.plot(hist_dict_2['val_loss'], 'r:', label = 'Validation Loss')
plt.legend()
plt.grid()

plt.figure()
plt.plot(hist_dict_2['accuracy'], 'b--', label = 'Train Accuracy')
plt.plot(hist_dict_2['val_accuracy'], 'r:', label = 'Validation Accuracy')
plt.legend()
plt.grid()

plt.show()

model.evaluate(pad_x_test_2, y_test)

# Output
loss: 0.5295 - accuracy: 0.8316
[0.5295160412788391, 0.8316400051116943]

  - Judged by accuracy alone this result is not bad, but the model overfits

  - The reasons are

  • Relationships between words and sentence structure are not taken into account
  • To learn features over the whole sequence, it is better to add an RNN layer or a 1D convolution on top of the Embedding layer (a small sketch follows below)
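For instance, the same classifier could be given an LSTM layer on top of the Embedding layer. This is only a sketch of the idea (the layer sizes are arbitrary and the model is not trained here):
# Sketch: add an LSTM on top of the Embedding layer so the whole sequence is taken into account
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

model3 = Sequential()
model3.add(Embedding(input_dim = num_words, output_dim = 32, input_length = max_len))
model3.add(LSTM(32))
model3.add(Dense(1, activation = 'sigmoid'))

model3.compile(optimizer = 'rmsprop',
               loss = 'binary_crossentropy',
               metrics = ['accuracy'])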

 

 

● Types of word embeddings

  • LSA
  • Word2Vec
  • GloVe
  • FastText
  • etc.

 

 

7. Word2Vec

  • Can be trained from raw text alone, without separate labels such as class annotations
  • Word2Vec approaches (both exploit the relationship between a word and its neighbors)
    • CBOW (Continuous Bag-Of-Words)
      • Sums the embeddings of the surrounding words to predict the target word
    • Skip-Gram
      • Uses the embedding of the target word to predict the surrounding words
      • Generally performs somewhat better than CBOW
      • Inefficient because several words have to be predicted at once
      • More recently, a technique called negative sampling is used instead

https://www.researchgate.net/figure/CBOW-and-Skip-Gram-neural-architectures_fig14_328160770

 

 

8. Project Gutenberg example

import requests
import re

 

  - Downloading the data

res = requests.get('https://www.gutenberg.org/files/2591/2591-0.txt')
res

# Output
<Response [200]>
# 200 means the request succeeded
# 404 would indicate an error

 

  - Data preprocessing

grimm = res.text[2801:530661]
grimm = re.sub(r'[^a-zA-Z\. ]', ' ', grimm)
sentences = grimm.split('. ')
data = [s.split() for s in sentences]

len(data)  # 3468


data[0]

# Output
['SECOND',
 'STORY',
 'THE',
 'SALAD',
 'THE',
 'STORY',
 'OF',
 'THE',
 'YOUTH',
 'WHO',
 'WENT',
 'FORTH',
 'TO',
 'LEARN',
 'WHAT',
 'FEAR',
 'WAS',
 'KING',
 'GRISLY',
 'BEARD',
 'IRON',
 'HANS',
 'CAT',
 'SKIN',
 'SNOW',
...
 'tree',
 'which',
 'bore',
 'golden',
 'apples']
# Import Word2Vec from the gensim package
from gensim.models.word2vec import Word2Vec
# Passing sg = 0 trains CBOW; sg = 1 trains Skip-gram
# Keep only words that appear at least 3 times; use 4 worker threads
model = Word2Vec(data, sg = 1, vector_size = 100, window = 3, min_count = 3, workers = 4)

 

  - Saving and loading the model

# Save
model.save('word2vec.model')

# Load
pretrained_model = Word2Vec.load('word2vec.model')

 

  - Converting a word to a vector

  • wv
pretrained_model.wv['princess']

# Output
array([-0.19268924,  0.17087255, -0.13460916,  0.20450976,  0.03542079,
       -0.31665406,  0.13296   ,  0.54076153, -0.18337499, -0.21417093,
        0.02725333, -0.31845513,  0.01819889,  0.10720193,  0.16601542,
       -0.19728081,  0.05753807, -0.12273175, -0.17903367, -0.22576232,
        0.2438455 ,  0.13664703,  0.18498562, -0.1679803 ,  0.07735273,
       -0.00432668, -0.00775897, -0.08363435, -0.12566872, -0.07055762,
        0.02887373, -0.08917326,  0.17351009, -0.18784055, -0.20769958,
        0.19657052,  0.01372425, -0.074237  , -0.10052767, -0.11275681,
        0.06725535, -0.09701315,  0.02844668,  0.05958825, -0.02586031,
       -0.01711333, -0.11226629, -0.08671231,  0.1945969 ,  0.01690222,
        0.07196116, -0.08172472, -0.05373074, -0.14637838,  0.16281295,
        0.06222549,  0.10643765,  0.07477342, -0.16238536,  0.03527208,
       -0.04292673,  0.04597842,  0.13826323, -0.19217554, -0.25257504,
        0.10983958,  0.03293723,  0.4319519 , -0.21335553,  0.24770555,
       -0.00888118,  0.02231867,  0.17330043, -0.10485211,  0.35415375,
       -0.08000654,  0.01478033, -0.03938808, -0.06453493,  0.02249427,
       -0.21435274, -0.01287377, -0.2137464 ,  0.21174915, -0.1006554 ,
        0.00902446,  0.05607878,  0.16368881,  0.13859129, -0.01395336,
        0.09382439,  0.08065708, -0.056269  ,  0.09765122,  0.188912  ,
        0.1668056 , -0.01361183, -0.14287405, -0.11452819, -0.20357099],
      dtype=float32)

# 'princess'라는 단어를 벡터로 변환한 값

 

  - 유추 또는 유비(analogy)

  • wv.similarity()에 두 단어를 넣어주면 코사인 유사도를 구할 수 있음
pretrained_model.wv.similarity('king', 'prince')

# 출력 결과
0.8212076
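  • wv.similarity()가 계산하는 값은 두 단어 벡터의 코사인 유사도로, numpy로 직접 계산해도 거의 같은 값을 얻을 수 있음(값은 학습 결과에 따라 달라짐)
import numpy as np

# 코사인 유사도 = 두 벡터의 내적 / (각 벡터 크기의 곱)
v1 = pretrained_model.wv['king']
v2 = pretrained_model.wv['prince']
print(np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2)))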
  • wv.most_similar()에 단어를 넘겨주면 가장 유사한 단어를 추출할 수 있음
pretrained_model.wv.most_similar('king')

# 출력 결과
[('daughter', 0.9241937398910522),
 ('son', 0.9213796257972717),
 ('woman', 0.9177201390266418),
 ('man', 0.897368848323822),
 ('queen', 0.8747967481613159),
 ('miller', 0.8610494136810303),
 ('old', 0.8595746755599976),
 ('young', 0.8504902124404907),
 ('wolf', 0.8450464010238647),
 ('But', 0.8406485319137573)]
  • wv.most_similar()에 positive와 negative라는 옵션을 넘길 수 있음
# 'man + princess - woman'을 벡터 계산을 한 값을 출력
# man이고 princess인데 woman이 아닌 단어
pretrained_model.wv.most_similar(positive = ['man', 'princess'], negative = ['woman'])

# 출력 결과
[('bird', 0.9595717787742615),
 ('prince', 0.9491060376167297),
 ('cook', 0.9410891532897949),
 ('bride', 0.9401964545249939),
 ('huntsman', 0.9375050067901611),
 ('mouse', 0.9356588125228882),
 ('cat', 0.9344455003738403),
 ('giant', 0.9341970682144165),
 ('gardener', 0.9327394366264343),
 ('maid', 0.9326624870300293)]
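  • most_similar(positive, negative)가 하는 일은 'man + princess - woman' 벡터를 만든 뒤 가장 가까운 단어를 찾는 것으로, 아래처럼 직접 흉내 낼 수 있음(내부적으로는 정규화된 벡터를 쓰므로 순위가 조금 다를 수 있음)
# 벡터 연산으로 직접 유사 단어 찾기(간단한 근사)
query = (pretrained_model.wv['man']
         + pretrained_model.wv['princess']
         - pretrained_model.wv['woman'])

print(pretrained_model.wv.similar_by_vector(query, topn = 5))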

 

  - gensim으로 학습된 단어 임베딩을 Keras에서 불러오기 

from keras.models import Sequential
from keras.layers import Embedding

num_words, emb_dim = pretrained_model.wv.vectors.shape

print(num_words)
print(emb_dim)

# 출력 결과
2446
100

 

  - gensim으로 학습된 단어 임베딩을 Keras의 임베딩 레이어의 가중치로 설정

emb = Embedding(input_dim = num_words, output_dim = emb_dim,
                trainable = False, weights = [pretrained_model.wv.vectors])

model = Sequential()
model.add(emb)

model.summary()

# 출력 결과
Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 embedding_3 (Embedding)     (None, None, 100)         244600    
                                                                 
=================================================================
Total params: 244,600
Trainable params: 0
Non-trainable params: 244,600
_________________________________________________________________
# princess에 대한 결과 벡터
i = pretrained_model.wv.index_to_key.index('princess')

model.predict([i])

# 출력 결과
array([[-0.19268924,  0.17087255, -0.13460916,  0.20450976,  0.03542079,
        -0.31665406,  0.13296   ,  0.54076153, -0.18337499, -0.21417093,
         0.02725333, -0.31845513,  0.01819889,  0.10720193,  0.16601542,
        -0.19728081,  0.05753807, -0.12273175, -0.17903367, -0.22576232,
         0.2438455 ,  0.13664703,  0.18498562, -0.1679803 ,  0.07735273,
        -0.00432668, -0.00775897, -0.08363435, -0.12566872, -0.07055762,
         0.02887373, -0.08917326,  0.17351009, -0.18784055, -0.20769958,
         0.19657052,  0.01372425, -0.074237  , -0.10052767, -0.11275681,
         0.06725535, -0.09701315,  0.02844668,  0.05958825, -0.02586031,
        -0.01711333, -0.11226629, -0.08671231,  0.1945969 ,  0.01690222,
         0.07196116, -0.08172472, -0.05373074, -0.14637838,  0.16281295,
         0.06222549,  0.10643765,  0.07477342, -0.16238536,  0.03527208,
        -0.04292673,  0.04597842,  0.13826323, -0.19217554, -0.25257504,
         0.10983958,  0.03293723,  0.4319519 , -0.21335553,  0.24770555,
        -0.00888118,  0.02231867,  0.17330043, -0.10485211,  0.35415375,
        -0.08000654,  0.01478033, -0.03938808, -0.06453493,  0.02249427,
        -0.21435274, -0.01287377, -0.2137464 ,  0.21174915, -0.1006554 ,
         0.00902446,  0.05607878,  0.16368881,  0.13859129, -0.01395336,
         0.09382439,  0.08065708, -0.056269  ,  0.09765122,  0.188912  ,
         0.1668056 , -0.01361183, -0.14287405, -0.11452819, -0.20357099]],
      dtype=float32)
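  • 임베딩 레이어가 gensim의 단어 벡터를 그대로 담고 있는지 간단히 확인해볼 수 있음
import numpy as np

# Keras 임베딩 층의 출력과 gensim의 단어 벡터 비교
keras_vec = model.predict(np.array([i]))[0]
gensim_vec = pretrained_model.wv['princess']
print(np.allclose(keras_vec, gensim_vec))  # True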

● 케라스 전이학습(transfer learning)

https://medium.com/the-official-integrate-ai-blog/transfer-learning-explained-7d275c1e34e2

  • 새로운 모델을 만들 때 기존에 학습된 모델을 사용
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPool2D, Dense, Flatten, BatchNormalization, Activation
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.applications import *


# 예시로 학습된 vgg 데이터 불러오기
vgg16 = VGG16(weights = 'imagenet',
              input_shape = (32, 32, 3), include_top = False)

model = Sequential()
model.add(vgg16)

model.add(Flatten())
model.add(Dense(256))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dense(10, activation = 'softmax'))

model.summary()

# 출력 결과
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 vgg16 (Functional)          (None, 1, 1, 512)         14714688  
                                                                 
 flatten (Flatten)           (None, 512)               0         
                                                                 
 dense (Dense)               (None, 256)               131328    
                                                                 
 batch_normalization (BatchN  (None, 256)              1024      
 ormalization)                                                   
                                                                 
 activation (Activation)     (None, 256)               0         
                                                                 
 dense_1 (Dense)             (None, 10)                2570      
                                                                 
=================================================================
Total params: 14,849,610
Trainable params: 14,849,098
Non-trainable params: 512
_________________________________________________________________
  • VGG16 이외에 MobileNet, ResNet50, Xception 등의 모델도 있어 전이 학습에 이용 가능(아래 스케치 참고)
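  • 예를 들어 ResNet50도 같은 방식으로 불러와 특징 추출기로 사용할 수 있음(간단한 스케치)
from tensorflow.keras.applications import ResNet50

# VGG16 대신 ResNet50을 불러와 뒤에 분류기를 붙이는 스케치
resnet50 = ResNet50(weights = 'imagenet', input_shape = (32, 32, 3), include_top = False)

resnet_model = Sequential()
resnet_model.add(resnet50)
resnet_model.add(Flatten())
resnet_model.add(Dense(10, activation = 'softmax'))
resnet_model.summary()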

 

1. 예제: Dogs vs Cats

 

  - modules import

import tensorflow as tf
from tensorflow.keras.preprocessing.image import array_to_img, img_to_array, load_img, ImageDataGenerator
from tensorflow.keras.layers import Conv2D, Flatten, MaxPool2D, Input, Dropout, Dense
from tensorflow.keras import Model
from tensorflow.keras.models import Sequential
from tensorflow.keras.applications import VGG16
from tensorflow.keras.optimizers import Adam

import os
import zipfile
import matplotlib.image as mpimg
import matplotlib.pyplot as plt

 

  - 데이터 로드

# 외부에서 데이터 가져오기
import wget

wget.download("https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip")


# 압축 해제
local_zip = 'cats_and_dogs_filtered.zip'
zip_ref = zipfile.ZipFile(local_zip, 'r')
# 현재 폴더에 압축해제
zip_ref.extractall()
zip_ref.close()


# 압축해제된 폴더를 기본 경로로 지정, 폴더 내의 train과 validation 폴더에 각각 접근
base_dir = 'cats_and_dogs_filtered'
train_dir = os.path.join(base_dir, 'train')
validation_dir = os.path.join(base_dir, 'validation')


# 압축해제된 폴더 내의 train cat, validation cat, train dog, validation dog 폴더에 각각 접근
train_cats_dir = os.path.join(train_dir, 'cats')
train_dogs_dir = os.path.join(train_dir, 'dogs')

validation_cats_dir = os.path.join(validation_dir, 'cats')
validation_dogs_dir = os.path.join(validation_dir, 'dogs')

train_cat_frames = os.listdir(train_cats_dir)
train_dog_frames = os.listdir(train_dogs_dir)

 

  - 이미지 보강된 데이터 확인

# ImageDataGenerator 정의
datagen = ImageDataGenerator(
    rotation_range = 40,
    width_shift_range = 0.2,
    height_shift_range = 0.2,
    shear_range = 0.2,
    zoom_range = 0.2,
    horizontal_flip = True,
    fill_mode = 'nearest'
)


# 이미지 로드
img_path = os.path.join(train_cats_dir, train_cat_frames[2])
img = load_img(img_path, target_size = (150, 150))
x = img_to_array(img)
x = x.reshape((1, ) + x.shape)

i = 0
for batch in datagen.flow(x, batch_size = 1):
    plt.figure(i)
    imgplot = plt.imshow(array_to_img(batch[0]))
    i += 1
    if i % 5 == 0:
        break

 

  - 학습, 검증 데이터셋의 Data Generator

train_datagen = ImageDataGenerator(
    rescale = 1. / 255,
    rotation_range = 40,
    width_shift_range = 0.2,
    height_shift_range = 0.2,
    shear_range = 0.2,
    zoom_range = 0.2,
    horizontal_flip = True
)

train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size = (150, 150),
    batch_size = 20,
    class_mode = 'binary'
)

val_datagen = ImageDataGenerator(rescale = 1. / 255)

validation_generator = val_datagen.flow_from_directory(
    validation_dir,
    target_size = (150, 150),
    batch_size = 20,
    class_mode = 'binary'
)


# 출력 결과
Found 2000 images belonging to 2 classes.
Found 1000 images belonging to 2 classes.

 

  - 모델 구성 및 컴파일

model = Sequential()
model.add(Conv2D(32, (3, 3), activation = 'relu', input_shape = (150, 150, 3)))
model.add(MaxPool2D(2, 2))
model.add(Conv2D(64, (3, 3), activation = 'relu'))
model.add(MaxPool2D(2, 2))
model.add(Conv2D(128, (3, 3), activation = 'relu'))
model.add(MaxPool2D(2, 2))
model.add(Conv2D(128, (3, 3), activation = 'relu'))
model.add(MaxPool2D(2, 2))
model.add(Flatten())
model.add(Dropout(0.5))
model.add(Dense(512, activation = 'relu'))
model.add(Dense(1, activation = 'sigmoid'))

model.compile(loss = 'binary_crossentropy',
              optimizer = Adam(learning_rate = 1e-4),
              metrics = ['acc'])

model.summary()

# 출력 결과
Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d (Conv2D)             (None, 148, 148, 32)      896       
                                                                 
 max_pooling2d (MaxPooling2D  (None, 74, 74, 32)       0         
 )                                                               
                                                                 
 conv2d_1 (Conv2D)           (None, 72, 72, 64)        18496     
                                                                 
 max_pooling2d_1 (MaxPooling  (None, 36, 36, 64)       0         
 2D)                                                             
                                                                 
 conv2d_2 (Conv2D)           (None, 34, 34, 128)       73856     
                                                                 
 max_pooling2d_2 (MaxPooling  (None, 17, 17, 128)      0         
 2D)                                                             
                                                                 
 conv2d_3 (Conv2D)           (None, 15, 15, 128)       147584    
                                                                 
 max_pooling2d_3 (MaxPooling  (None, 7, 7, 128)        0         
 2D)                                                             
                                                                 
 flatten_1 (Flatten)         (None, 6272)              0         
                                                                 
 dropout (Dropout)           (None, 6272)              0         
                                                                 
 dense_2 (Dense)             (None, 512)               3211776   
                                                                 
 dense_3 (Dense)             (None, 1)                 513       
                                                                 
=================================================================
Total params: 3,453,121
Trainable params: 3,453,121
Non-trainable params: 0
_________________________________________________________________

 

  - 모델 학습 및 학습 과정 시각화

# 제너레이터를 사용할 때는 batch_size를 따로 지정하지 않음(제너레이터의 batch_size를 따름)
history = model.fit(train_generator,
                    steps_per_epoch = 100,
                    epochs = 30,
                    validation_data = validation_generator,
                    validation_steps = 50,
                    verbose = 2)

# 시각화
acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs = range(len(acc))

plt.plot(epochs, loss, 'b--', label = 'Train Loss')
plt.plot(epochs, val_loss, 'r:', label = 'Validation Loss')
plt.grid()
plt.legend()

plt.figure()
plt.plot(epochs, acc, 'b--', label = 'Train Accuracy')
plt.plot(epochs, val_acc, 'r:', label = 'Validation Accuracy')
plt.grid()
plt.legend()

plt.show()

 

  - 모델 저장

model.save('cats_and_dogs_model.h5')

 

  - 사전 훈련된 모델 사용

from tensorflow.keras.optimizers import RMSprop

conv_base = VGG16(weights = 'imagenet',
                  input_shape = (150, 150, 3), include_top = False)

def build_model_with_pretrained(conv_base):
    model = Sequential()
    model.add(conv_base)
    model.add(Flatten())
    model.add(Dense(256, activation = 'relu'))
    model.add(Dense(1, activation = 'sigmoid'))

    model.compile(loss = 'binary_crossentropy',
                  optimizer = RMSprop(learning_rate = 2e-5),
                  metrics = ['accuracy'])
    return model
  • 파라미터 수 확인
model = build_model_with_pretrained(conv_base)
model.summary()

# 출력 결과
Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 vgg16 (Functional)          (None, 4, 4, 512)         14714688  
                                                                 
 flatten_2 (Flatten)         (None, 8192)              0         
                                                                 
 dense_4 (Dense)             (None, 256)               2097408   
                                                                 
 dense_5 (Dense)             (None, 1)                 257       
                                                                 
=================================================================
Total params: 16,812,353
Trainable params: 16,812,353
Non-trainable params: 0
_________________________________________________________________

 

  - 레이어 동결

  • 훈련하기 전, 합성곱 기반 레이어들의 가중치 학습을 막기 위해 이를 동결
# 동결 전
print(len(model.trainable_weights))

# 출력 결과
30


# 동결 후
conv_base.trainable = False
print(len(model.trainable_weights))

# 출력 결과
4
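  • 참고로 conv_base 전체를 계속 동결해 두는 대신, 마지막 합성곱 블록(block5)만 다시 학습하도록 풀어주는 미세 조정(fine-tuning)도 가능함(간단한 스케치이며, 이 예제에서는 동결 상태로 계속 진행하므로 미세 조정을 해볼 때만 적용)
# VGG16의 block5에 해당하는 층만 학습 가능하게 풀어주는 미세 조정 스케치
conv_base.trainable = True
for layer in conv_base.layers:
    layer.trainable = layer.name.startswith('block5')  # block5_conv1, block5_conv2, ... 만 학습

# trainable 속성을 바꿨으므로 이후에 모델을 다시 컴파일해야 함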

 

  - 모델 컴파일

  • trainable 속성을 변경했기 때문에 모델을 다시 컴파일해야 함
model.compile(loss = 'binary_crossentropy',
              optimizer = RMSprop(learning_rate = 2e-5),
              metrics = ['accuracy'])

 

  - 이미지 제너레이터

train_datagen = ImageDataGenerator(
    rescale = 1. / 255,
    rotation_range = 40,
    width_shift_range = 0.2,
    height_shift_range = 0.2,
    shear_range = 0.2,
    zoom_range = 0.2,
    horizontal_flip = True
)

train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size = (150, 150),
    batch_size = 20,
    class_mode = 'binary'
)

val_datagen = ImageDataGenerator(rescale = 1. / 255)

validation_generator = val_datagen.flow_from_directory(
    validation_dir,
    target_size = (150, 150),
    batch_size = 20,
    class_mode = 'binary'
)

# 출력 결과
Found 2000 images belonging to 2 classes.
Found 1000 images belonging to 2 classes.

 

  - 모델 재학습

# 제너레이터를 사용하므로 여기서도 batch_size는 지정하지 않음
history2 = model.fit(train_generator,
                     steps_per_epoch = 100,
                     epochs = 30,
                     validation_data = validation_generator,
                     validation_steps = 50,
                     verbose = 2)

acc = history2.history['accuracy']
val_acc = history2.history['val_accuracy']
loss = history2.history['loss']
val_loss = history2.history['val_loss']
epochs = range(len(acc))

plt.plot(epochs, loss, 'b--', label = 'Train Loss')
plt.plot(epochs, val_loss, 'r:', label = 'Validation Loss')
plt.grid()
plt.legend()

plt.figure()
plt.plot(epochs, acc, 'b--', label = 'Train Accuracy')
plt.plot(epochs, val_acc, 'r:', label = 'Validation Accuracy')
plt.grid()
plt.legend()

plt.show()

 

  - 모델 저장

model.save('cats_and_dogs_with_pretrained_model.h5')

 

 

2. Feature Map 시각화

  - 모델 구성

import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras import Model
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing import image


# 저장된 모델 로드
model = load_model('cats_and_dogs_model.h5')
model.summary()

# 출력 결과
Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d (Conv2D)             (None, 148, 148, 32)      896       
                                                                 
 max_pooling2d (MaxPooling2D  (None, 74, 74, 32)       0         
 )                                                               
                                                                 
 conv2d_1 (Conv2D)           (None, 72, 72, 64)        18496     
                                                                 
 max_pooling2d_1 (MaxPooling  (None, 36, 36, 64)       0         
 2D)                                                             
                                                                 
 conv2d_2 (Conv2D)           (None, 34, 34, 128)       73856     
                                                                 
 max_pooling2d_2 (MaxPooling  (None, 17, 17, 128)      0         
 2D)                                                             
                                                                 
 conv2d_3 (Conv2D)           (None, 15, 15, 128)       147584    
                                                                 
 max_pooling2d_3 (MaxPooling  (None, 7, 7, 128)        0         
 2D)                                                             
                                                                 
 flatten_1 (Flatten)         (None, 6272)              0         
                                                                 
 dropout (Dropout)           (None, 6272)              0         
                                                                 
 dense_2 (Dense)             (None, 512)               3211776   
                                                                 
 dense_3 (Dense)             (None, 1)                 513       
                                                                 
=================================================================
Total params: 3,453,121
Trainable params: 3,453,121
Non-trainable params: 0
_________________________________________________________________
img_path = 'cats_and_dogs_filtered/validation/dogs/dog.2000.jpg'

img = image.load_img(img_path, target_size = (150, 150))
img_tensor = image.img_to_array(img)
img_tensor = img_tensor[np.newaxis, ...]
img_tensor /= 255.
print(img_tensor.shape)

# 출력 결과
(1, 150, 150, 3)
plt.imshow(img_tensor[0])
plt.show()

# 레이어 중 일부만(8개) 출력
conv_output = [layer.output for layer in model.layers[:8]]
conv_output

# 출력 결과
[<KerasTensor: shape=(None, 148, 148, 32) dtype=float32 (created by layer 'conv2d')>,
 <KerasTensor: shape=(None, 74, 74, 32) dtype=float32 (created by layer 'max_pooling2d')>,
 <KerasTensor: shape=(None, 72, 72, 64) dtype=float32 (created by layer 'conv2d_1')>,
 <KerasTensor: shape=(None, 36, 36, 64) dtype=float32 (created by layer 'max_pooling2d_1')>,
 <KerasTensor: shape=(None, 34, 34, 128) dtype=float32 (created by layer 'conv2d_2')>,
 <KerasTensor: shape=(None, 17, 17, 128) dtype=float32 (created by layer 'max_pooling2d_2')>,
 <KerasTensor: shape=(None, 15, 15, 128) dtype=float32 (created by layer 'conv2d_3')>,
 <KerasTensor: shape=(None, 7, 7, 128) dtype=float32 (created by layer 'max_pooling2d_3')>]
activation_model = Model(inputs = [model.input], outputs = conv_output)
activations = activation_model.predict(img_tensor)
len(activations)

# 출력 결과
8

 

  - 시각화

print(activations[0].shape)
plt.matshow(activations[0][0, :, :, 7], cmap = 'viridis')
plt.show()

# 출력 결과
(1, 148, 148, 32)

print(activations[0].shape)
plt.matshow(activations[0][0, :, :, 10], cmap = 'viridis')
plt.show()

# 출력 결과
(1, 148, 148, 32)

 

  - 중간의 모든 활성화에 대해 시각화

# 각 layer에서 이미지의 변환과정을 시각화
layer_names = []
for layer in model.layers[:8]:
    layer_names.append(layer.name)

images_per_row = 16

for layer_name, layer_activation in zip(layer_names, activations):
    num_features = layer_activation.shape[-1]

    size = layer_activation.shape[1]

    num_cols = num_features // images_per_row
    display_grid = np.zeros((size * num_cols, size * images_per_row))

    for col in range(num_cols):
        for row in range(images_per_row):
            channel_image = layer_activation[0, :, :, col * images_per_row + row]
            channel_image -= channel_image.mean()
            channel_image /= (channel_image.std() + 1e-5)  # 표준편차가 0인 채널에서 0으로 나누는 것을 방지
            channel_image *= 64
            channel_image += 128
            channel_image = np.clip(channel_image, 0, 255).astype('uint8')
            display_grid[col * size : (col + 1) * size, row * size : (row + 1) * size] = channel_image
        
    scale = 1. / size

    plt.figure(figsize = (scale * display_grid.shape[1],
                          scale * display_grid.shape[0]))
    plt.title(layer_name)
    plt.grid(False)
    plt.imshow(display_grid, aspect = 'auto', cmap = 'viridis')

plt.show()

● CIFAR 10

  • 50,000개의 학습 데이터, 10,000개의 테스트 데이터로 구성
  • 데이터 복잡도가 MNIST보다 훨씬 높은 특징이 있음
    • 신경망이 특징을 검출하기 어려움

1. modules import

import tensorflow as tf
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.layers import Conv2D, MaxPool2D, Dense, Flatten, Input, Dropout, BatchNormalization
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.utils import to_categorical

from sklearn.model_selection import train_test_split

import matplotlib.pyplot as plt
import numpy as np

 

 

2. 데이터 로드 및 전처리

(x_train_full, y_train_full), (x_test, y_test) = cifar10.load_data()
print(x_train_full.shape, y_train_full.shape)
print(x_test.shape, y_test.shape)

# 출력 결과
(50000, 32, 32, 3) (50000, 1)
(10000, 32, 32, 3) (10000, 1)


# 정답 데이터의 값은 레이블로 되어있음
print(y_test[0])

# 출력 결과
[3]


# 예시 데이터
np.random.seed(777)

class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']

sample_size = 9
# 학습 데이터(50000개) 범위 안에서 무작위 인덱스 추출
random_idx = np.random.randint(x_train_full.shape[0], size = sample_size)

plt.figure(figsize = (5, 5))
for i, idx in enumerate(random_idx):
    plt.subplot(3, 3, i + 1)
    plt.xticks([])
    plt.yticks([])
    plt.imshow(x_train_full[idx])
    plt.xlabel(class_names[int(y_train_full[idx])])

plt.show()

  • 32 * 32 이미지라 화질이 낮음
# x 데이터 정규화
x_mean = np.mean(x_train_full, axis = (0, 1, 2))
x_std = np.std(x_train_full, axis = (0, 1, 2))
x_train_full = (x_train_full - x_mean) / x_std
x_test = (x_test - x_mean) / x_std


# 학습데이터와 검증데이터 분리
x_train, x_val, y_train, y_val = train_test_split(x_train_full, y_train_full, test_size = 0.3)
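  • 참고로 클래스 비율을 유지한 채 나누고 싶다면 train_test_split에 stratify 인자를 함께 줄 수 있음(선택 사항)
# 클래스 비율을 유지하며 분리하는 예(선택 사항)
# x_train, x_val, y_train, y_val = train_test_split(
#     x_train_full, y_train_full, test_size = 0.3, stratify = y_train_full.ravel())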


# 전처리한 데이터 형태 출력
print(x_train.shape)
print(y_train.shape)

print(x_val.shape)
print(y_val.shape)

print(x_test.shape)
print(y_test.shape)

# 출력 결과
(35000, 32, 32, 3)
(35000, 1)
(15000, 32, 32, 3)
(15000, 1)
(10000, 32, 32, 3)
(10000, 1)

 

 

3. 모델 구성 및 컴파일

def model_build():
    model = Sequential()

    input = Input(shape = (32, 32, 3))

    output = Conv2D(filters = 32, kernel_size = 3, padding = 'same', activation = 'relu')(input)
    output = MaxPool2D(pool_size = (2, 2), strides = 2, padding = 'same')(output)

    output = Conv2D(filters = 64, kernel_size = 3, padding = 'same', activation = 'relu')(output)
    output = MaxPool2D(pool_size = (2, 2), strides = 2, padding = 'same')(output)

    output = Conv2D(filters = 128, kernel_size = 3, padding = 'same', activation = 'relu')(output)
    output = MaxPool2D(pool_size = (2, 2), strides = 2, padding = 'same')(output)

    output = Flatten()(output)
    output = Dense(256, activation = 'relu')(output)
    output = Dense(128, activation = 'relu')(output)
    output = Dense(10, activation = 'softmax')(output)

    model = Model(inputs = [input], outputs = [output])

    model.compile(optimizer = Adam(learning_rate = 1e-4),
                  loss = 'sparse_categorical_crossentropy',
                  metrics = ['accuracy'])
    return model
model = model_build()
model.summary()

# 출력 결과
Model: "model_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_3 (InputLayer)        [(None, 32, 32, 3)]       0         
                                                                 
 conv2d_3 (Conv2D)           (None, 32, 32, 32)        896       
                                                                 
 max_pooling2d_3 (MaxPooling  (None, 16, 16, 32)       0         
 2D)                                                             
                                                                 
 conv2d_4 (Conv2D)           (None, 16, 16, 64)        18496     
                                                                 
 max_pooling2d_4 (MaxPooling  (None, 8, 8, 64)         0         
 2D)                                                             
                                                                 
 conv2d_5 (Conv2D)           (None, 8, 8, 128)         73856     
                                                                 
 max_pooling2d_5 (MaxPooling  (None, 4, 4, 128)        0         
 2D)                                                             
                                                                 
 flatten_1 (Flatten)         (None, 2048)              0         
                                                                 
 dense_3 (Dense)             (None, 256)               524544    
                                                                 
 dense_4 (Dense)             (None, 128)               32896     
                                                                 
 dense_5 (Dense)             (None, 10)                1290      
                                                                 
=================================================================
Total params: 651,978
Trainable params: 651,978
Non-trainable params: 0
_________________________________________________________________

 

 

4. 모델 학습 및 평가

history = model.fit(x_train, y_train,
                    epochs = 30,
                    batch_size = 256,
                    validation_data = (x_val, y_val))

 

 

5. 학습 과정 시각화

plt.figure(figsize = (12, 4))

plt.subplot(1, 2, 1)
plt.plot(history.history['loss'], 'b--', label = 'loss')
plt.plot(history.history['val_loss'], 'r:', label = 'val_loss')
plt.xlabel('Epochs')
plt.grid()
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(history.history['accuracy'], 'b--', label = 'accuracy')
plt.plot(history.history['val_accuracy'], 'r:', label = 'val_accuracy')
plt.xlabel('Epochs')
plt.grid()
plt.legend()

  - 해당 모델은 성능이 좋지 않음

  - 규제화, 드롭아웃 등 과대적합을 방지하는 기술 필요

def model_build2():
    model = Sequential()

    input = Input(shape = (32, 32, 3))

    output = Conv2D(filters = 32, kernel_size = 3, padding = 'same', activation = 'relu')(input)
    output = BatchNormalization()(output)
    output = MaxPool2D(pool_size = (2, 2), strides = 2, padding = 'same')(output)

    output = Conv2D(filters = 64, kernel_size = 3, padding = 'same', activation = 'relu')(output)
    output = BatchNormalization()(output)
    output = MaxPool2D(pool_size = (2, 2), strides = 2, padding = 'same')(output)

    output = Conv2D(filters = 128, kernel_size = 3, padding = 'same', activation = 'relu')(output)
    output = BatchNormalization()(output)
    output = MaxPool2D(pool_size = (2, 2), strides = 2, padding = 'same')(output)
    output = Dropout(0.5)(output)

    output = Flatten()(output)
    output = Dense(256, activation = 'relu')(output)
    output = Dropout(0.5)(output)
    output = Dense(128, activation = 'relu')(output)
    output = Dense(10, activation = 'softmax')(output)

    model = Model(inputs = [input], outputs = [output])

    model.compile(optimizer = Adam(learning_rate = 1e-4),
                  loss = 'sparse_categorical_crossentropy',
                  metrics = ['accuracy'])
    return model
model2 = model_build2()
model2.summary()

Model: "model_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_4 (InputLayer)        [(None, 32, 32, 3)]       0         
                                                                 
 conv2d_6 (Conv2D)           (None, 32, 32, 32)        896       
                                                                 
 batch_normalization (BatchN  (None, 32, 32, 32)       128       
 ormalization)                                                   
                                                                 
 max_pooling2d_6 (MaxPooling  (None, 16, 16, 32)       0         
 2D)                                                             
                                                                 
 conv2d_7 (Conv2D)           (None, 16, 16, 64)        18496     
                                                                 
 batch_normalization_1 (Batc  (None, 16, 16, 64)       256       
 hNormalization)                                                 
                                                                 
 max_pooling2d_7 (MaxPooling  (None, 8, 8, 64)         0         
 2D)                                                             
                                                                 
 conv2d_8 (Conv2D)           (None, 8, 8, 128)         73856     
                                                                 
 batch_normalization_2 (Batc  (None, 8, 8, 128)        512       
 hNormalization)                                                 
                                                                 
 max_pooling2d_8 (MaxPooling  (None, 4, 4, 128)        0         
 2D)                                                             
                                                                 
 dropout (Dropout)           (None, 4, 4, 128)         0         
                                                                 
 flatten_2 (Flatten)         (None, 2048)              0         
                                                                 
 dense_6 (Dense)             (None, 256)               524544    
                                                                 
 dropout_1 (Dropout)         (None, 256)               0         
                                                                 
 dense_7 (Dense)             (None, 128)               32896     
                                                                 
 dense_8 (Dense)             (None, 10)                1290      
                                                                 
=================================================================
Total params: 652,874
Trainable params: 652,426
Non-trainable params: 448
_________________________________________________________________

 

 

6. 모델 학습 및 평가

history2 = model2.fit(x_train, y_train,
                      epochs = 30,
                      batch_size = 256,
                      validation_data = (x_val, y_val))

 

 

7. 학습 과정 시각화

plt.figure(figsize = (12, 4))

plt.subplot(1, 2, 1)
plt.plot(history2.history['loss'], 'b--', label = 'loss')
plt.plot(history2.history['val_loss'], 'r:', label = 'val_loss')
plt.xlabel('Epochs')
plt.grid()
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(history2.history['accuracy'], 'b--', label = 'accuracy')
plt.plot(history2.history['val_accuracy'], 'r:', label = 'val_accuracy')
plt.xlabel('Epochs')
plt.grid()
plt.legend()

  • 검증데이터의 결과가 많이 개선됨
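  • 테스트 데이터로도 평가해 보면 개선 정도를 수치로 확인할 수 있음(결과 값은 학습 상태에 따라 달라짐)
# 두 모델을 테스트 데이터로 비교 평가
model.evaluate(x_test, y_test)
model2.evaluate(x_test, y_test)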

1. modules import 

%load_ext tensorboard
import datetime
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

from tensorflow.keras import Model
from tensorflow.keras.models import Sequential
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.datasets.fashion_mnist import load_data
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.layers import Dense, Conv2D, MaxPool2D, Dropout, Input, Flatten

 

 

2. 데이터 로드 및 전처리

(x_train, y_train), (x_test, y_test) = load_data()

x_train = x_train[..., np.newaxis]
x_test = x_test[..., np.newaxis]

x_train = x_train / 255.
x_test = x_test / 255.

print(x_train.shape)
print(y_train.shape)
print(x_test.shape)
print(y_test.shape)

# 출력 결과
(60000, 28, 28, 1)
(60000,)
(10000, 28, 28, 1)
(10000,)
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

 

3. 모델 구성 및 컴파일

def build_model():
    model = Sequential()

    input = Input(shape = (28, 28, 1))
    output = Conv2D(filters = 32, kernel_size = (3, 3))(input)
    output = Conv2D(filters = 64, kernel_size = (3, 3))(output)
    output = Conv2D(filters = 64, kernel_size = (3, 3))(output)
    output = Flatten()(output)
    output = Dense(units = 128, activation = 'relu')(output)
    output = Dense(units = 64, activation = 'relu')(output)
    output = Dense(units = 10, activation = 'softmax')(output)

    model = Model(inputs = [input], outputs = [output])

    model.compile(optimizer = 'adam',
                  loss = 'sparse_categorical_crossentropy',
                  metrics = ['acc'])
    return model

model_1 = build_model()
model_1.summary()

# 출력 결과
Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 28, 28, 1)]       0         
                                                                 
 conv2d (Conv2D)             (None, 26, 26, 32)        320       
                                                                 
 conv2d_1 (Conv2D)           (None, 24, 24, 64)        18496     
                                                                 
 conv2d_2 (Conv2D)           (None, 22, 22, 64)        36928     
                                                                 
 flatten (Flatten)           (None, 30976)             0         
                                                                 
 dense (Dense)               (None, 128)               3965056   
                                                                 
 dense_1 (Dense)             (None, 64)                8256      
                                                                 
 dense_2 (Dense)             (None, 10)                650       
                                                                 
=================================================================
Total params: 4,029,706
Trainable params: 4,029,706
Non-trainable params: 0
_________________________________________________________________

 

 

4. 모델 학습

hist_1 = model_1.fit(x_train, y_train,
                     epochs = 25,
                     validation_split = 0.3,
                     batch_size = 128)

 

 

5. 학습 결과 시각화

plt.figure(figsize = (12, 4))
plt.subplot(1, 2, 1)
plt.plot(hist_1.history['loss'], 'b--', label = 'loss')
plt.plot(hist_1.history['val_loss'], 'r:', label = 'val_loss')
plt.xlabel('Epochs')
plt.grid()
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(hist_1.history['acc'], 'b--', label = 'accuracy')
plt.plot(hist_1.history['val_acc'], 'r:', label = 'val_accuracy')
plt.xlabel('Epochs')
plt.grid()
plt.legend()

 

 

6. 모델 평가

model_1.evaluate(x_test, y_test)

# 출력 결과
loss: 1.1168 - acc: 0.8566
[1.116817831993103, 0.8565999865531921]

 

 

7. 모델 재구성(학습 파라미터 수 비교)

def build_model_2():
    model = Sequential()

    input = Input(shape = (28, 28, 1))
    output = Conv2D(filters = 32, kernel_size = (3, 3))(input)
    output = MaxPool2D(strides = (2, 2))(output)
    output = Conv2D(filters = 64, kernel_size = (3, 3))(output)
    output = MaxPool2D(strides = (2, 2))(output)
    output = Conv2D(filters = 64, kernel_size = (3, 3))(output)
    output = MaxPool2D(strides = (2, 2))(output)
    output = Flatten()(output)
    output = Dense(units = 128, activation = 'relu')(output)
    output = Dropout(0.3)(output)
    output = Dense(units = 64, activation = 'relu')(output)
    output = Dropout(0.3)(output)
    output = Dense(units = 10, activation = 'softmax')(output)

    model = Model(inputs = [input], outputs = [output])

    model.compile(optimizer = 'adam',
                  loss = 'sparse_categorical_crossentropy',
                  metrics = ['acc'])
    return model

model_2 = build_model_2()
model_2.summary()

# 출력 결과
Model: "model_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_3 (InputLayer)        [(None, 28, 28, 1)]       0         
                                                                 
 conv2d_6 (Conv2D)           (None, 26, 26, 32)        320       
                                                                 
 max_pooling2d (MaxPooling2D  (None, 13, 13, 32)       0         
 )                                                               
                                                                 
 conv2d_7 (Conv2D)           (None, 11, 11, 64)        18496     
                                                                 
 max_pooling2d_1 (MaxPooling  (None, 5, 5, 64)         0         
 2D)                                                             
                                                                 
 conv2d_8 (Conv2D)           (None, 3, 3, 64)          36928     
                                                                 
 max_pooling2d_2 (MaxPooling  (None, 1, 1, 64)         0         
 2D)                                                             
                                                                 
 flatten_2 (Flatten)         (None, 64)                0         
                                                                 
 dense_6 (Dense)             (None, 128)               8320      
                                                                 
 dropout (Dropout)           (None, 128)               0         
                                                                 
 dense_7 (Dense)             (None, 64)                8256      
                                                                 
 dropout_1 (Dropout)         (None, 64)                0         
                                                                 
 dense_8 (Dense)             (None, 10)                650       
                                                                 
=================================================================
Total params: 72,970
Trainable params: 72,970
Non-trainable params: 0
_________________________________________________________________
  • 학습 파라미터 수가 줄어듦

 

 

8. 모델 재학습

hist_2 = model_2.fit(x_train, y_train,
                     epochs = 25,
                     validation_split = 0.3,
                     batch_size = 128)

# 재학습 결과 시각화
plt.figure(figsize = (12, 4))
plt.subplot(1, 2, 1)
plt.plot(hist_2.history['loss'], 'b--', label = 'loss')
plt.plot(hist_2.history['val_loss'], 'r:', label = 'val_loss')
plt.xlabel('Epochs')
plt.grid()
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(hist_2.history['acc'], 'b--', label = 'accuracy')
plt.plot(hist_2.history['val_acc'], 'r:', label = 'val_accuracy')
plt.xlabel('Epochs')
plt.grid()
plt.legend()

  • 처음 모델보다 학습데이터에 오버피팅이 덜 된 모습

 

9. 모델 재평가

model_2.evaluate(x_test, y_test)

# 출력 결과
loss: 0.4026 - acc: 0.8830
[0.4026452302932739, 0.8830000162124634]

 

 

10. 모델 성능 높이기(많은 레이어 쌓기)

from tensorflow.keras.layers import BatchNormalization, ReLU

def build_model_3():
    model = Sequential()

    input = Input(shape = (28, 28, 1))
    output = Conv2D(filters = 32, kernel_size = 3, activation = 'relu', padding = 'same')(input)
    output = Conv2D(filters = 64, kernel_size = 3, activation = 'relu', padding = 'valid')(output)
    output = MaxPool2D(strides = (2, 2))(output)
    output = Dropout(0.5)(output)

    output = Conv2D(filters = 128, kernel_size = 3, activation = 'relu', padding = 'same')(output)
    output = Conv2D(filters = 256, kernel_size = 3, activation = 'relu', padding = 'valid')(output)
    output = MaxPool2D(strides = (2, 2))(output)
    output = Dropout(0.5)(output)

    output = Flatten()(output)
    output = Dense(units = 256, activation = 'relu')(output)
    output = Dropout(0.5)(output)
    output = Dense(units = 100, activation = 'relu')(output)
    output = Dropout(0.5)(output)
    output = Dense(units = 10, activation = 'softmax')(output)

    model = Model(inputs = [input], outputs = [output])

    model.compile(optimizer = 'adam',
                  loss = 'sparse_categorical_crossentropy',
                  metrics = ['acc'])
    return model

model_3 = build_model_3()
model_3.summary()

# 출력 결과
Model: "model_3"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_4 (InputLayer)        [(None, 28, 28, 1)]       0         
                                                                 
 conv2d_9 (Conv2D)           (None, 28, 28, 32)        320       
                                                                 
 conv2d_10 (Conv2D)          (None, 26, 26, 64)        18496     
                                                                 
 max_pooling2d_3 (MaxPooling  (None, 13, 13, 64)       0         
 2D)                                                             
                                                                 
 dropout_2 (Dropout)         (None, 13, 13, 64)        0         
                                                                 
 conv2d_11 (Conv2D)          (None, 13, 13, 128)       73856     
                                                                 
 conv2d_12 (Conv2D)          (None, 11, 11, 256)       295168    
                                                                 
 max_pooling2d_4 (MaxPooling  (None, 5, 5, 256)        0         
 2D)                                                             
                                                                 
 dropout_3 (Dropout)         (None, 5, 5, 256)         0         
                                                                 
 flatten_3 (Flatten)         (None, 6400)              0         
                                                                 
 dense_9 (Dense)             (None, 256)               1638656   
                                                                 
 dropout_4 (Dropout)         (None, 256)               0         
                                                                 
 dense_10 (Dense)            (None, 100)               25700     
                                                                 
 dropout_5 (Dropout)         (None, 100)               0         
                                                                 
 dense_11 (Dense)            (None, 10)                1010      
                                                                 
=================================================================
Total params: 2,053,206
Trainable params: 2,053,206
Non-trainable params: 0
_________________________________________________________________

 

  - 모델 학습 및 결과 시각화

hist_3 = model_3.fit(x_train, y_train,
                     epochs = 25,
                     validation_split = 0.3,
                     batch_size = 128)

  - 층을 더 쌓으면 과적합되지 않으면서도 더 좋은 성능을 얻을 수 있음

plt.figure(figsize = (12, 4))
plt.subplot(1, 2, 1)
plt.plot(hist_3.history['loss'], 'b--', label = 'loss')
plt.plot(hist_3.history['val_loss'], 'r:', label = 'val_loss')
plt.xlabel('Epochs')
plt.grid()
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(hist_3.history['acc'], 'b--', label = 'accuracy')
plt.plot(hist_3.history['val_acc'], 'r:', label = 'val_accuracy')
plt.xlabel('Epochs')
plt.grid()
plt.legend()

model_3.evaluate(x_test, y_test)

# 출력 결과
loss: 0.2157 - acc: 0.9261
[0.21573999524116516, 0.9261000156402588]

 

 

11. 모델 성능 높이기(이미지 보강, Image Augmentation)

from tensorflow.keras.preprocessing.image import ImageDataGenerator

image_generator = ImageDataGenerator(
    rotation_range = 10,
    zoom_range = 0.2,
    shear_range = 0.6,
    width_shift_range = 0.1,
    height_shift_range = 0.1,
    horizontal_flip = True,
    vertical_flip = False
)

augment_size = 200

print(x_train.shape)
print(x_train[0].shape)

# 출력 결과
(60000, 28, 28, 1)
(28, 28, 1)
# 같은 이미지를 augment_size개 복제한 뒤 제너레이터에 넣어 변형된 이미지 생성
x_augment = image_generator.flow(np.tile(x_train[0].reshape(28 * 28), augment_size).reshape(-1, 28, 28, 1),
                                 np.zeros(augment_size), batch_size = augment_size, shuffle = False).next()[0]

plt.figure(figsize = (10, 10))
for i in range(1, 101):
    plt.subplot(10, 10, i)
    plt.axis('off')
    plt.imshow(x_augment[i - 1].reshape(28, 28), cmap = 'gray')

  • 위의 코드를 사용해 학습에 사용할 데이터 추가
from tensorflow.keras.preprocessing.image import ImageDataGenerator

image_generator = ImageDataGenerator(
    rotation_range = 15,
    zoom_range = 0.1,
    shear_range = 0.6,
    width_shift_range = 0.15,
    height_shift_range = 0.1,
    horizontal_flip = True,
    vertical_flip = False
)

augment_size = 30000

random_mask = np.random.randint(x_train.shape[0], size = augment_size)
x_augmented = x_train[random_mask].copy()
y_augmented = y_train[random_mask].copy()

x_augmented = image_generator.flow(x_augmented, np.zeros(augment_size),
                                   batch_size = augment_size, shuffle = False).next()[0]
x_train = np.concatenate((x_train, x_augmented))
y_train = np.concatenate((y_train, y_augmented))

# 생성한 augment 30000개가 더 추가됨
print(x_train.shape)

# 출력 결과
(90000, 28, 28, 1)

 

  - 모델 학습 및 결과 시각화

model_4 = build_model_3()
model_4.summary()

# 출력 결과
Model: "model_4"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_5 (InputLayer)        [(None, 28, 28, 1)]       0         
                                                                 
 conv2d_13 (Conv2D)          (None, 28, 28, 32)        320       
                                                                 
 conv2d_14 (Conv2D)          (None, 26, 26, 64)        18496     
                                                                 
 max_pooling2d_5 (MaxPooling  (None, 13, 13, 64)       0         
 2D)                                                             
                                                                 
 dropout_6 (Dropout)         (None, 13, 13, 64)        0         
                                                                 
 conv2d_15 (Conv2D)          (None, 13, 13, 128)       73856     
                                                                 
 conv2d_16 (Conv2D)          (None, 11, 11, 256)       295168    
                                                                 
 max_pooling2d_6 (MaxPooling  (None, 5, 5, 256)        0         
 2D)                                                             
                                                                 
 dropout_7 (Dropout)         (None, 5, 5, 256)         0         
                                                                 
 flatten_4 (Flatten)         (None, 6400)              0         
                                                                 
 dense_12 (Dense)            (None, 256)               1638656   
                                                                 
 dropout_8 (Dropout)         (None, 256)               0         
                                                                 
 dense_13 (Dense)            (None, 100)               25700     
                                                                 
 dropout_9 (Dropout)         (None, 100)               0         
                                                                 
 dense_14 (Dense)            (None, 10)                1010      
                                                                 
=================================================================
Total params: 2,053,206
Trainable params: 2,053,206
Non-trainable params: 0
_________________________________________________________________
hist_4 = model_4.fit(x_train, y_train,
                     epochs = 25,
                     validation_split = 0.3,
                     batch_size = 128)

plt.figure(figsize = (12, 4))
plt.subplot(1, 2, 1)
plt.plot(hist_4.history['loss'], 'b--', label = 'loss')
plt.plot(hist_4.history['val_loss'], 'r:', label = 'val_loss')
plt.xlabel('Epochs')
plt.grid()
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(hist_4.history['acc'], 'b--', label = 'accuracy')
plt.plot(hist_4.history['val_acc'], 'r:', label = 'val_accuracy')
plt.xlabel('Epochs')
plt.grid()
plt.legend()

model_4.evaluate(x_test, y_test)

# 출력 결과
loss: 0.2023 - acc: 0.9313
[0.2023032009601593, 0.9312999844551086]

 

  - 학습률, 에포크 수, 콜백 등 학습 인자를 바꿔가며 학습하면 성능을 더 끌어올릴 수 있음(아래 콜백 예시 참고)
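  • 예를 들어 EarlyStopping이나 ReduceLROnPlateau 같은 콜백을 함께 써서 학습을 조절해볼 수 있음(설정 값은 하나의 예시일 뿐, 데이터에 맞게 조정 필요)
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau

# 검증 손실이 개선되지 않으면 학습을 멈추거나 학습률을 낮추는 콜백 예시
callbacks = [
    EarlyStopping(monitor = 'val_loss', patience = 5, restore_best_weights = True),
    ReduceLROnPlateau(monitor = 'val_loss', factor = 0.5, patience = 2)
]

# hist_5 = model_4.fit(x_train, y_train, epochs = 50, validation_split = 0.3,
#                      batch_size = 128, callbacks = callbacks)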

1. 주요 레이어

  - Conv2D

  • tensorflow.keras.layers.Conv2D
  • tf.nn.conv2d
import tensorflow as tf
from tensorflow.keras.layers import Conv2D

import matplotlib.pyplot as plt

import numpy as np
from sklearn.datasets import load_sample_image

china = load_sample_image('china.jpg') / 255.
print(china.dtype)
print(china.shape)

# 출력 결과
float64
(427, 640, 3)


plt.imshow(china)
plt.show()

flower = load_sample_image('flower.jpg') / 255.
print(flower.dtype)
print(flower.shape)

# 출력 결과
float64
(427, 640, 3)


plt.imshow(flower)
plt.show()

images = np.array([china, flower])
batch_size, height, width, channels = images.shape
print(images.shape)

# 출력 결과
(2, 427, 640, 3)
# 필터 적용
filters = np.zeros(shape = (7, 7, channels, 2), dtype = np.float32)
# 수직선 추가
filters[:, 3, :, 0] = 1
# 수평선 추가
filters[3, :, :, 1] = 1

print(filters.shape)

# 출력 결과
(7, 7, 3, 2)
# 텐서플로우로 conv2d 사용하는 방법
outputs = tf.nn.conv2d(images, filters, strides = 1, padding = 'SAME')
print(outputs.shape)
plt.imshow(outputs[0, :, :, 1], cmap = 'gray')
plt.show()

# 출력 결과
(2, 427, 640, 2)

plt.imshow(outputs[0, :, :, 0], cmap = 'gray')
plt.show()

# keras로 conv2d 사용하는 방법
conv = Conv2D(filters = 32, kernel_size = 3, strides = 1,
              padding = 'same', activation = 'relu')

 

  - MaxPool2D

  • 텐서플로우 저수준 딥러닝 API
    • tf.nn.max_pool
    • 사용자가 사이즈를 맞춰줘야함
    • keras 모델의 층으로 사용하고 싶으면 Lambda 층으로 감싸줘야 함
  • Keras 고수준 API
    • keras.layers.MaxPool2D
import tensorflow as tf
from tensorflow.keras.layers import MaxPool2D, Lambda

output = tf.nn.max_pool(images,
                        ksize = (1, 1, 1, 3),
                        strides = (1, 1, 1, 3),
                        padding = 'VALID')

# tf.nn.max_pool을 keras 층으로 사용하고 싶을 때는 Lambda로 감싸줌
output_keras = Lambda(
    lambda X: tf.nn.max_pool(X, ksize = (1, 1, 1, 3), strides = (1, 1, 1, 3), padding = 'VALID')
)


# 케라스에서 max pool 사용하는 방법
max_pool = MaxPool2D(pool_size = 2)
flower = load_sample_image('flower.jpg') / 255.
print(flower.dtype)
print(flower.shape)

# 출력 결과
float64
(427, 640, 3)


# 차원 추가
flower = np.expand_dims(flower, axis = 0)
flower.shape

# 출력 결과
(1, 427, 640, 3)


# pool size를 2로 maxpool 적용으로 데이터 수는 1/2
output = Conv2D(filters = 32, kernel_size = 3, strides = 1, padding = 'SAME', activation = 'relu')(flower)
output = MaxPool2D(pool_size = 2)(output)
output.shape

# 출력 결과
TensorShape([1, 213, 320, 32])
plt.imshow(output[0, :, :, 8], cmap = 'gray')
plt.show()

  • 사이즈가 줄어든 만큼 원본보다 해상도가 줄어듦

 

  - AvgPool2D

  • 텐서플로우 저수준 딥러닝 API
    • tf.nn.avg_pool
  • 케라스 고수준 API
    • keras.layers.AvgPool2D
from tensorflow.keras.layers import AvgPool2D

# 원본
flower.shape

# 출력 결과
(1, 427, 640, 3)


# AvgPool 적용(데이터 크기 1/2)
output = Conv2D(filters = 32, kernel_size = 3, strides = 1, padding = 'SAME', activation = 'relu')(flower)
output = AvgPool2D(pool_size = 2)(output)
output.shape

# 출력 결과
TensorShape([1, 213, 320, 32])
plt.imshow(output[0, :, : , 8], cmap = 'gray')
plt.show()

 

  - GlobalAvgPool2D(전역 평균 풀링 층)

  • keras.layers.GlobalAvgPool2D()
  • 특징 맵 각각의 평균값을 출력하는 것이므로, 특성맵에 있는 대부분의 정보를 잃음
  • 출력층에는 유용할 수 있음
from tensorflow.keras.layers import GlobalAvgPool2D

output = Conv2D(filters = 32, kernel_size = 3, strides = 1, padding = 'SAME', activation = 'relu')(flower)
output = GlobalAvgPool2D()(output)
output.shape

# 출력 결과
TensorShape([1, 32])
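  • 출력층 근처에서 쓰는 예로, Flatten 대신 GlobalAvgPool2D를 두고 바로 Dense 분류기를 붙일 수 있음(간단한 스케치)
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Flatten 대신 GlobalAvgPool2D를 출력층 직전에 사용하는 분류기 스케치
gap_model = Sequential()
gap_model.add(Conv2D(filters = 32, kernel_size = 3, activation = 'relu', input_shape = (150, 150, 3)))
gap_model.add(GlobalAvgPool2D())
gap_model.add(Dense(10, activation = 'softmax'))
gap_model.summary()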

 

 

2. 예제로 보는 CNN 구조와 학습

● 일반적인 구조

  - modules import

%load_ext tensorboard

import datetime
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

from tensorflow.keras import Model
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Conv2D, MaxPool2D, AvgPool2D, Dropout

from tensorflow.keras import datasets
from tensorflow.keras.utils import to_categorical, plot_model

 

  - 데이터 로드 및 전처리

(x_train, y_train), (x_test, y_test) = datasets.fashion_mnist.load_data()

# 원본 데이터 형태
print(x_train.shape)
print(y_train.shape)
print(x_test.shape)
print(y_test.shape)

# 출력 결과
(60000, 28, 28)
(60000,)
(10000, 28, 28)
(10000,)


# x 데이터에 축 하나씩 추가
x_train = x_train[:, :, :, np.newaxis]
x_test = x_test[:, :, :, np.newaxis]
print(x_train.shape)
print(x_test.shape)

# 출력 결과
(60000, 28, 28, 1)
(10000, 28, 28, 1)


# y 데이터 카테고리화
num_classes = 10

y_train = to_categorical(y_train, num_classes)
y_test = to_categorical(y_test, num_classes)
print(y_train.shape)
print(y_test.shape)

# 출력 결과
(60000, 10)
(10000, 10)


# x 데이터 표준화
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255.
x_test /= 255.

 

  -  CNN을 위한 간단한 모델

def build():
    model = Sequential([Conv2D(64, 7, activation = 'relu', padding = 'same', input_shape = [28, 28, 1]),
                        MaxPool2D(pool_size = 2),
                        Conv2D(128, 3, activation = 'relu', padding = 'same'),
                        MaxPool2D(pool_size = 2),
                        Conv2D(256, 3, activation = 'relu', padding = 'SAME'),
                        MaxPool2D(pool_size = 2),
                        Flatten(),
                        Dense(128, activation = 'relu'),
                        Dropout(0.5),
                        Dense(64, activation = 'relu'),
                        Dropout(0.5),
                        Dense(10, activation = 'softmax')])
    return model

 

  - 모델 컴파일

model = build()
model.compile(optimizer = 'adam',
              loss = 'categorical_crossentropy',
              metrics = ['accuracy'])
model.summary()

# 출력 결과
Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d_11 (Conv2D)          (None, 28, 28, 64)        3200      
                                                                 
 max_pooling2d_9 (MaxPooling  (None, 14, 14, 64)       0         
 2D)                                                             
                                                                 
 conv2d_12 (Conv2D)          (None, 14, 14, 128)       73856     
                                                                 
 max_pooling2d_10 (MaxPoolin  (None, 7, 7, 128)        0         
 g2D)                                                            
                                                                 
 conv2d_13 (Conv2D)          (None, 7, 7, 256)         295168    
                                                                 
 max_pooling2d_11 (MaxPoolin  (None, 3, 3, 256)        0         
 g2D)                                                            
                                                                 
 flatten_2 (Flatten)         (None, 2304)              0         
                                                                 
 dense_6 (Dense)             (None, 128)               295040    
                                                                 
 dropout_4 (Dropout)         (None, 128)               0         
                                                                 
 dense_7 (Dense)             (None, 64)                8256      
                                                                 
 dropout_5 (Dropout)         (None, 64)                0         
                                                                 
 dense_8 (Dense)             (None, 10)                650       
                                                                 
=================================================================
Total params: 676,170
Trainable params: 676,170
Non-trainable params: 0
_________________________________________________________________
plot_model(model)
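
  • The parameter counts in the summary can be verified by hand: a Conv2D layer has kernel_h * kernel_w * in_channels * filters weights plus one bias per filter, and a Dense layer has inputs * units weights plus one bias per unit

# Conv2D parameters = kernel_h * kernel_w * in_channels * filters + filters
print(7 * 7 * 1 * 64 + 64)       # 3200   (conv2d_11)
print(3 * 3 * 64 * 128 + 128)    # 73856  (conv2d_12)
print(3 * 3 * 128 * 256 + 256)   # 295168 (conv2d_13)

# Dense parameters = inputs * units + units
print(2304 * 128 + 128)          # 295040 (dense_6, 2304 = 3 * 3 * 256 after flattening)
print(128 * 64 + 64)             # 8256   (dense_7)
print(64 * 10 + 10)              # 650    (dense_8)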

 

  - Hyper Parameters

callbacks = [tf.keras.callbacks.TensorBoard(log_dir = './logs')]
EPOCHS = 20
BATCH_SIZE = 200
VERBOSE = 1

 

  - Training the model (GPU recommended)

  • validation_split sets aside part of the training data as a validation set (see the note after the fit call below)
hist = model.fit(x_train, y_train,
                 epochs = EPOCHS,
                 batch_size = BATCH_SIZE,
                 validation_split = 0.3,
                 callbacks = callbacks,
                 verbose = VERBOSE)
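
  • Note: validation_split does not shuffle before splitting; Keras holds out the last 30% of the samples, so the 60,000 training images become a 42,000 / 18,000 split, which the rough sketch below reproduces manually

split_at = int(len(x_train) * 0.7)                   # 42000
x_tr, x_val = x_train[:split_at], x_train[split_at:]
y_tr, y_val = y_train[:split_at], y_train[split_at:]
print(x_tr.shape, x_val.shape)                       # (42000, 28, 28, 1) (18000, 28, 28, 1)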

 

  • Inspect the training run with TensorBoard
log_dir = './logs/' + datetime.datetime.now().strftime('%Y%m%d-%H%M%S')
%tensorboard --logdir logs/
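
  • To keep separate runs from overwriting each other, a common pattern is to give each run its own timestamped directory and pass it to the TensorBoard callback; a minimal sketch (the run_log_dir name is just illustrative)

run_log_dir = './logs/' + datetime.datetime.now().strftime('%Y%m%d-%H%M%S')
callbacks = [tf.keras.callbacks.TensorBoard(log_dir = run_log_dir)]
# then point TensorBoard at the parent directory:
# %tensorboard --logdir logs/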

 

● LeNet-5 (code source: https://datahacker.rs/lenet-5-implementation-tensorflow-2-0/)

  • One of the earliest CNN models
  • Originally designed for handwritten character recognition

https://www.researchgate.net/figure/The-LeNet-5-Architecture-a-convolutional-neural-network_fig4_321586653

  - modules import

import datetime
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

from tensorflow.keras import Model
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Conv2D, MaxPool2D, AvgPool2D, Dropout

from tensorflow.keras import datasets
from tensorflow.keras.utils import to_categorical, plot_model

from sklearn.model_selection import train_test_split

 

  - Loading and preprocessing the data

(x_train_full, y_train_full), (x_test, y_test) = datasets.mnist.load_data()

x_train, x_val, y_train, y_val = train_test_split(x_train_full, y_train_full, test_size = 0.3, random_state = 777)

x_train = x_train[..., np.newaxis]
x_val = x_val[..., np.newaxis]
x_test = x_test[..., np.newaxis]

num_classes = 10
y_train = to_categorical(y_train, num_classes)
y_val = to_categorical(y_val, num_classes)
y_test = to_categorical(y_test, num_classes)

x_train = x_train.astype('float32')
x_val = x_val.astype('float32')
x_test = x_test.astype('float32')

x_train /= 255.
x_val /= 255.
x_test /= 255.

print(x_train.shape)
print(y_train.shape)
print(x_val.shape)
print(y_val.shape)
print(x_test.shape)
print(y_test.shape)

# Output
(42000, 28, 28, 1)
(42000, 10)
(18000, 28, 28, 1)
(18000, 10)
(10000, 28, 28, 1)
(10000, 10)

 

  - Building and compiling the model

class LeNet(Sequential):
    def __init__(self, input_shape, nb_classes):
        super().__init__()

        self.add(Conv2D(6, kernel_size = (5, 5), strides = (1, 1), activation = 'tanh', input_shape = input_shape, padding = 'SAME'))
        self.add(AvgPool2D(pool_size = (2, 2), strides = (2, 2), padding = 'valid'))
        self.add(Conv2D(16, kernel_size = (5, 5), strides = (1, 1), activation = 'tanh', padding = 'valid'))
        self.add(AvgPool2D(pool_size = (2, 2), strides = (2, 2), padding = 'valid'))
        self.add(Flatten())
        self.add(Dense(120, activation = 'tanh'))
        self.add(Dense(84, activation = 'tanh'))
        self.add(Dense(nb_classes, activation = 'softmax'))

        self.compile(optimizer = 'adam',
                     loss = 'categorical_crossentropy',
                     metrics = ['accuracy'])

model = LeNet(input_shape = (28, 28, 1), nb_classes = 10)
model.summary()

# Output
Model: "le_net_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d_17 (Conv2D)          (None, 28, 28, 6)         156       
                                                                 
 average_pooling2d_3 (Averag  (None, 14, 14, 6)        0         
 ePooling2D)                                                     
                                                                 
 conv2d_18 (Conv2D)          (None, 10, 10, 16)        2416      
                                                                 
 average_pooling2d_4 (Averag  (None, 5, 5, 16)         0         
 ePooling2D)                                                     
                                                                 
 flatten_3 (Flatten)         (None, 400)               0         
                                                                 
 dense_9 (Dense)             (None, 120)               48120     
                                                                 
 dense_10 (Dense)            (None, 84)                10164     
                                                                 
 dense_11 (Dense)            (None, 10)                850       
                                                                 
=================================================================
Total params: 61,706
Trainable params: 61,706
Non-trainable params: 0
_________________________________________________________________
plot_model(model, show_shapes = True)
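
  • The same network can also be written without subclassing Sequential; below is a minimal sketch of the identical architecture and compile settings using the Keras functional API

from tensorflow.keras import Input, Model
from tensorflow.keras.layers import Conv2D, AvgPool2D, Flatten, Dense

inputs = Input(shape = (28, 28, 1))
x = Conv2D(6, kernel_size = (5, 5), activation = 'tanh', padding = 'same')(inputs)
x = AvgPool2D(pool_size = (2, 2), strides = (2, 2))(x)
x = Conv2D(16, kernel_size = (5, 5), activation = 'tanh', padding = 'valid')(x)
x = AvgPool2D(pool_size = (2, 2), strides = (2, 2))(x)
x = Flatten()(x)
x = Dense(120, activation = 'tanh')(x)
x = Dense(84, activation = 'tanh')(x)
outputs = Dense(10, activation = 'softmax')(x)

lenet_fn = Model(inputs, outputs)
lenet_fn.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])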

 

  - Hyper Parameters

EPOCHS = 20
BATCH_SIZE = 128
VERBOSE = 1

 

  - Training the model

hist = model.fit(x_train, y_train,
                 epochs = EPOCHS,
                 batch_size = BATCH_SIZE,
                 validation_data = (x_val, y_val),
                 verbose = VERBOSE)

 

  - Visualizing the training results

plt.figure(figsize = (12, 6))

plt.subplot(1, 2, 1)
plt.plot(hist.history['loss'], 'b-', label = 'loss')
plt.plot(hist.history['val_loss'], 'm--', label = 'val_loss')
plt.xlabel('Epochs')
plt.grid()
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(hist.history['accuracy'], 'g-', label = 'accuracy')
plt.plot(hist.history['val_accuracy'], 'r-', label = 'val_accuracy')
plt.xlabel('Epochs')
plt.grid()
plt.legend()

plt.show()

 

  - Evaluating the model

model.evaluate(x_test, y_test)

# Output
313/313 [==============================] - 3s 7ms/step - loss: 0.0564 - accuracy: 0.9854
[0.0564129501581192, 0.9854000210762024]
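
  • Beyond the aggregate accuracy, individual predictions can be inspected with model.predict: the softmax output gives one probability per class, and argmax picks the predicted digit; a short sketch on the first few test images

probs = model.predict(x_test[:5])              # (5, 10) softmax probabilities
pred_labels = np.argmax(probs, axis = 1)       # predicted digits
true_labels = np.argmax(y_test[:5], axis = 1)  # ground-truth digits from the one-hot labels
print(pred_labels)
print(true_labels)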
