Overfitting?

The phenomenon where accuracy comes out high on the training data set

but low on the test data set.

As the number of layers grows, training error keeps falling.

But when test data is fed in, the error starts to rise.

Solutions for overfitting

  1. ํ›ˆ๋ จ๋ฐ์ดํ„ฐ๋ฅผ ๋” ๊ตฌํ•œ๋‹ค.
  2. Feature๋ฅผ ์ค„์ธ๋‹ค(Machine Learning algorithm)
  3. Regularization(Weight)

Regularization

Restraining the weights from growing wildly uneven, i.e. restraining the conditions under which overfitting occurs.

Shaping w so that its values stay relatively small and evenly distributed is a way to prevent overfitting.

Regularization (mathematics) - Wikipedia
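As a sketch of the idea above (plain Python, hypothetical names): the common L2 form adds a penalty λ·Σw² to the loss, so large, spiky weights are punished much more than small, even ones.

```python
# Minimal L2-regularization sketch (hypothetical helper names).
# The penalty lam * sum(w^2) grows quadratically with each weight,
# so minimizing the total loss pushes weights toward small, even values.

def l2_penalty(weights, lam):
    """lam * sum of squared weights."""
    return lam * sum(w * w for w in weights)

def total_loss(data_loss, weights, lam=0.01):
    """Data loss plus the L2 weight penalty."""
    return data_loss + l2_penalty(weights, lam)

# A spiky weight vector is penalized more than an even one:
spiky = [3.0, 0.0, 0.0]   # sum of squares = 9
even  = [1.0, 1.0, 1.0]   # sum of squares = 3
print(l2_penalty(spiky, 0.1) > l2_penalty(even, 0.1))  # True
```

In gradient-descent terms the penalty's derivative is 2·λ·w, which shrinks every weight a little at each step ("weight decay").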

Dropout

Randomly sever connections in the fully connected network!

Think of each neuron as an expert in charge of one particular part of the task;

rather than training all of them every step, randomly dropping some of them along the way can give better results.

Use dropout during training only!

At test time, call in all the experts for the evaluation!

dropout_rate = tf.placeholder("float")  # NOTE: tf.nn.dropout's 2nd argument is the KEEP probability
_L1 = tf.nn.relu(tf.add(tf.matmul(X, W1), B1))
L1 = tf.nn.dropout(_L1, dropout_rate)
...
# During training,
sess.run(Train, feed_dict={X: batch_xs, Y: batch_ys, dropout_rate: 0.7})

# During evaluation,
print("Accuracy:", accuracy.eval({X: X_testset, Y: Y_testset, dropout_rate: 1}))

ํ‰๊ฐ€ ์‹œ์—๋Š” dropout = 1๋กœ ํ•œ๋‹ค.

Ensemble

๋…๋ฆฝ์ ์ธ ๋ชจ๋ธ n๊ฐœ์˜ ๊ฒฐ๊ณผ๋ฌผ์„ ์ง‘ํ•ฉํ•˜์—ฌ ์ตœ์ข… ๊ฒฐ๊ณผ๋ฌผ์„ ์ œ์ถœํ•œ๋‹ค!