[Lecture 2] ⭐ Definition of Inductive Learning Algorithms

Definition of a learning algorithm

Learning algorithm: inferring a function from a training dataset

[Supplementary textbook] The learning task is to determine a hypothesis h identical to the target concept c over the entire set of instances X, but the only information available about c is its value over the training examples.

A new input is then fed to the inferred function to predict (estimate) its output.

Analysis of learning algorithms

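Aspect \ Perspective	Set	Function	Statistics
Goal	Find a hypothesis h such that h(x) = c(x) for all x in X (the instance space).	Find the unknown parameter θ of the target function.	Find the unknown parameter θ of the population distribution (probability function).
Input	Hypothesis space H (hypothesis representation); dataset D; searching strategy	Model g(x | θ); dataset D; loss function L(r, g(x | θ))	Probability distribution (random variable, probability function) X, p(x; θ); sample X_i; estimator?
Process	Select the h that best fits D according to the searching strategy.	The loss E(θ | …), which is the sum of the loss function L(r, g(x | θ)) …
Output	Hypothesis h	Model parameter θ	?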
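
The function perspective in the table can be sketched in a few lines. This is only an illustration under stated assumptions: a one-dimensional linear model g(x | θ) = slope · x + intercept and a squared-error loss, neither of which is fixed by these notes. The names `g`, `loss`, `empirical_error`, and `fit` are hypothetical.

```python
# Function perspective: find theta minimising the summed loss over D.
# Assumptions (not from the notes): linear model, squared-error loss.

def g(x, theta):
    """Model g(x | theta) with theta = (slope, intercept)."""
    slope, intercept = theta
    return slope * x + intercept

def loss(r, prediction):
    """Squared-error loss L(r, g(x | theta))."""
    return (r - prediction) ** 2

def empirical_error(theta, dataset):
    """Sum of the per-example losses over the dataset D."""
    return sum(loss(r, g(x, theta)) for x, r in dataset)

def fit(dataset):
    """Closed-form least-squares minimiser of empirical_error."""
    n = len(dataset)
    mean_x = sum(x for x, _ in dataset) / n
    mean_r = sum(r for _, r in dataset) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in dataset)
    sxr = sum((x - mean_x) * (r - mean_r) for x, r in dataset)
    slope = sxr / sxx
    return slope, mean_r - slope * mean_x

D = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]  # toy data, r = 2x + 1
theta = fit(D)                   # output of the learning algorithm
new_prediction = g(4.0, theta)   # predict the output for an unseen input
```

On this toy dataset the fitted θ recovers the generating rule, so the prediction for the unseen input 4.0 is 9.0. With noisy data the same procedure would return the θ with the smallest summed loss rather than a perfect fit.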

[Supplementary textbook] Hypothesis representation defines a continuously parameterized space of potential hypotheses (e.g., linear functions, logical descriptions, decision trees, artificial neural networks).

[Main textbook] Model selection is choosing between possible hypothesis spaces H.

[Main textbook] Generalization: how well a model trained on the training set predicts the right output for new instances.
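
A small sketch may help separate training error from generalization. It assumes toy data generated by r = 2x + 1 and compares two hypothetical hypotheses: a `memorizer` that stores the training pairs, and a `linear_rule` taken from the space of linear functions. Both names, the data, and the mean-squared-error measure are illustrative choices, not part of the textbook definition.

```python
# Generalization sketch: two hypotheses with identical training error
# can differ sharply on new instances. Toy data; all names are illustrative.

train = [(0, 1), (1, 3), (2, 5), (3, 7)]   # (input, correct output)
new_instances = [(4, 9), (5, 11)]          # never seen during training

lookup = dict(train)
fallback = sum(r for _, r in train) / len(train)  # mean training output

def memorizer(x):
    """Returns the stored output, or the training mean for unseen inputs."""
    return lookup.get(x, fallback)

def linear_rule(x):
    """A hypothesis from the space of linear functions."""
    return 2 * x + 1

def mean_squared_error(h, examples):
    """Average squared difference between h(x) and the correct output."""
    return sum((r - h(x)) ** 2 for x, r in examples) / len(examples)

train_errors = {h.__name__: mean_squared_error(h, train)
                for h in (memorizer, linear_rule)}
new_errors = {h.__name__: mean_squared_error(h, new_instances)
              for h in (memorizer, linear_rule)}
```

Both hypotheses have zero error on the training set, yet on the new instances the memorizer's mean squared error is 37.0 while the linear rule's stays at 0. Training error alone therefore does not tell the two apart; the choice of H does.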

[Supplementary textbook, Section 2.7] Formal definition of inductive bias

Everything other than the dataset D, namely the hypothesis space H and the searching strategy, constitutes the inductive bias.
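
The role of the searching strategy as part of the inductive bias can be illustrated with a minimal sketch. It assumes a hypothetical hypothesis space of threshold rules h_t(x) = 1 if x >= t else 0 over integer thresholds 0 to 10, and two made-up searching strategies (prefer the smallest or the largest consistent threshold). None of these choices come from the textbook.

```python
# Inductive bias sketch: the same dataset D yields different predictions
# when the searching strategy over H changes.
# Assumed for illustration: H = threshold rules h_t(x) = 1 if x >= t else 0.

D = [(1, 0), (2, 0), (8, 1), (9, 1)]   # (instance, label)
H = range(0, 11)                       # candidate thresholds t

def h(t, x):
    """Threshold hypothesis h_t applied to instance x."""
    return 1 if x >= t else 0

def consistent(t, dataset):
    """True when h_t agrees with every training label."""
    return all(h(t, x) == label for x, label in dataset)

# All hypotheses in H that fit D perfectly.
version_space = [t for t in H if consistent(t, D)]

# Two searching strategies over the same H and the same D.
t_smallest = min(version_space)   # prefer the lowest consistent threshold
t_largest = max(version_space)    # prefer the highest consistent threshold

unseen = 5
prediction_smallest = h(t_smallest, unseen)
prediction_largest = h(t_largest, unseen)
```

Thresholds 3 through 8 all fit D perfectly, so D alone cannot decide the label of the unseen instance 5. The first strategy predicts 1 and the second predicts 0, which is the sense in which H and the searching strategy, not the data, carry the bias.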

Principle that turns the learning algorithm's goal into its process

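Inductive learning hypothesis: assume that the hypothesis with the smallest loss on the training dataset is the one closest to the target function.

→ The hypothesis with the smallest loss on the training dataset is selected as the output of the learning algorithm.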

[Supplementary textbook] The best hypothesis regarding unseen instances is the hypothesis that best fits the observed training data.

In statistics, a population parameter is estimated with a statistic computed from a sample.
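
The statistical view can be sketched as follows. The example assumes a Gaussian population with mean 5 and standard deviation 2 (values chosen only for illustration) and uses the sample mean and sample standard deviation as the statistics. The seed and sample size are arbitrary.

```python
import random
import statistics

# Statistics perspective: estimate an unknown population parameter theta
# from a statistic computed on a sample.
# Assumed for illustration: a Gaussian population with mean 5 and std 2.

TRUE_MEAN, TRUE_STD = 5.0, 2.0
rng = random.Random(0)  # fixed seed so the sketch is repeatable

# The sample X_1, ..., X_n drawn from the population.
sample = [rng.gauss(TRUE_MEAN, TRUE_STD) for _ in range(10_000)]

estimated_mean = statistics.fmean(sample)   # sample mean estimates the mean
estimated_std = statistics.stdev(sample)    # sample std estimates the std
```

In a real problem `TRUE_MEAN` and `TRUE_STD` are unknown and only `sample` is observed. The estimates approach the population values as the sample grows, which parallels choosing the hypothesis that best fits the training data.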