2.7 Model Selection and Generalization

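The need for inductive bias

Suppose a learning algorithm is a procedure that selects a hypothesis consistent with the dataset. Unless a label is given for every instance in the instance space, there is in general more than one hypothesis consistent with the given dataset (version space size > 1). In other words, merely picking a consistent hypothesis from the hypothesis space does not yield a unique hypothesis as the result of learning. To predict the concept for new data, however, we must commit to a single hypothesis. We therefore conclude that the dataset alone is insufficient to single out one solution (hypothesis) in the hypothesis space.

ill-posed problem: a problem whose solution is not unique (here, many hypotheses remain consistent with the data)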

As we see more training examples, we know more about the underlying function, and we carve out more hypotheses that are inconsistent from the hypothesis class, but we still are left with many consistent hypotheses.
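A minimal sketch of this shrinking version space, under assumptions that are not in the notes: the instance space is the four pairs of binary attributes, the hypothesis class is every Boolean labeling of those instances, and the target concept (XOR) is hypothetical. It counts how many hypotheses stay consistent as labeled examples are added.

```python
from itertools import product

# Instance space (assumed for illustration): all pairs of binary attributes.
instances = list(product([0, 1], repeat=2))

# Hypothesis class (assumed): every Boolean labeling of the 4 instances,
# i.e. 2**4 = 16 hypotheses, each stored as a dict instance -> label.
hypotheses = [dict(zip(instances, labels))
              for labels in product([0, 1], repeat=len(instances))]


def version_space(training_set):
    """Return all hypotheses that agree with every labeled example."""
    return [h for h in hypotheses
            if all(h[x] == y for x, y in training_set)]


# Hypothetical target concept: XOR of the two attributes.
target = {x: x[0] ^ x[1] for x in instances}

# Reveal the labels one instance at a time and record the version space size.
sizes = []
for n in range(len(instances) + 1):
    training_set = [(x, target[x]) for x in instances[:n]]
    sizes.append(len(version_space(training_set)))

print(sizes)  # [16, 8, 4, 2, 1]
```

In this toy setting the version space collapses to a single hypothesis only once every instance has been labeled; with any unlabeled instance left, at least two consistent hypotheses disagree on it.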

So because learning is ill-posed, and data by itself is not sufficient to find the solution, we should make some extra assumptions to have a unique solution with the data we have.

The set of assumptions we make to have learning possible is called the inductive bias of the learning algorithm.

The concept of inductive bias and the definition of model selection

<aside> 📌

Inductive bias is the set of assumptions that makes it possible to select (= learn) a single hypothesis from the hypothesis space.

</aside>

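Examples of inductive bias

Restricting the hypothesis space (specifying the form of the model)
- The hypothesis for a family car is a rectangle in the input (attribute) space.
- Linear regression assumes a linear function.
- Body weight follows a normal distribution.
Hypothesis selection criterion (choosing a loss function)
- Choose the rectangle with the largest margin.
- Among all lines, choose the one that minimizes the squared error.
- MLE, MAP, Bayesian Estimation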
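As a rough illustration of how the two kinds of assumption combine, the sketch below restricts the hypothesis space to lines and then picks the line with the smallest squared error using the standard closed-form least-squares solution. The data values and the names `fit_line` and `squared_error` are made up for this example.

```python
def fit_line(xs, ys):
    """Closed-form least-squares fit of y = w1 * x + w0."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w1 = sxy / sxx
    w0 = mean_y - w1 * mean_x
    return w1, w0


def squared_error(w1, w0, xs, ys):
    """Sum of squared residuals of the line y = w1 * x + w0."""
    return sum((y - (w1 * x + w0)) ** 2 for x, y in zip(xs, ys))


# Made-up data scattered around y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]

w1, w0 = fit_line(xs, ys)
best = squared_error(w1, w0, xs, ys)

# A few other candidate lines from the same hypothesis space, for comparison.
others = [(2.0, 1.0), (1.5, 2.0), (2.5, 0.0)]
for a, b in others:
    print(a, b, squared_error(a, b, xs, ys) >= best)
```

Many lines fit the data reasonably well; it is the squared-error criterion that singles out one of them.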

Definition of model selection

We know that each hypothesis class has a certain capacity and can learn only certain functions. The class of functions that can be learned (= hypothesis class) can be extended by using a hypothesis class with larger capacity, containing more complex hypotheses.

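Restricting the hypothesis space limits the form a hypothesis can take. The larger the hypothesis space, the more complex the hypotheses it admits.

Regression example: allowing higher-degree polynomials in the hypothesis space → the polynomial hypotheses become more complex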
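The polynomial case can be sketched as follows. This is an illustration under assumptions, not a method taken from the notes: the data are made up, and the fit solves the normal equations with a small hand-written Gaussian elimination (`solve`, `fit_poly`, and `train_error` are hypothetical helper names). Because the degree-1 polynomials are contained in the degree-2 class, and those in the degree-3 class, the training error cannot increase as the degree grows.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_poly(xs, ys, degree):
    """Least-squares polynomial coefficients [w0, w1, ..., w_degree]."""
    k = degree + 1
    # Normal equations (X^T X) w = X^T y, where X[i][j] = xs[i] ** j.
    xtx = [[sum(x ** (i + j) for x in xs) for j in range(k)] for i in range(k)]
    xty = [sum((x ** i) * y for x, y in zip(xs, ys)) for i in range(k)]
    return solve(xtx, xty)


def train_error(ws, xs, ys):
    """Sum of squared residuals of the polynomial with coefficients ws."""
    return sum((y - sum(w * x ** j for j, w in enumerate(ws))) ** 2
               for x, y in zip(xs, ys))


# Made-up training data.
xs = [-1.0, -0.6, -0.2, 0.2, 0.6, 1.0]
ys = [-0.9, -0.1, 0.1, -0.1, 0.2, 1.1]

# One fitted hypothesis per hypothesis space (degree 1, 2, 3).
errors = {d: train_error(fit_poly(xs, ys, d), xs, ys) for d in (1, 2, 3)}
for d, e in errors.items():
    print(d, e)
```

A lower training error for a higher degree says only that the larger class can fit these points more closely; it does not by itself indicate which hypothesis space to choose.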

Thus learning is not possible without inductive bias, and now the question is how to choose the right bias. This is called model selection, which is choosing between possible $H$ (hypothesis spaces).

<aside> 📌

model selection := choosing one among the possible hypothesis spaces (hypothesis space selection)

</aside>

[Supplementary textbook, Section 2.7] A rigorous definition of inductive bias

The importance of model selection for improving generalization

Definition of generalization