File type: ppt / pptx (PowerPoint)
Total slides: 24
File size: 432.00 kB
Slides and their text:
Slide 1 ![INTRODUCTION TO Machine](/documents_6/65d7e1c484af2424f10981cf45b3e963/img0.jpg)
Slide content: INTRODUCTION TO
Machine Learning
ETHEM ALPAYDIN
© The MIT Press, 2004
alpaydin@boun.edu.tr
http://www.cmpe.boun.edu.tr/~ethem/i2ml
Slide 2 ![](/documents_6/65d7e1c484af2424f10981cf45b3e963/img1.jpg)
Slide 3 ![Introduction Questions](/documents_6/65d7e1c484af2424f10981cf45b3e963/img2.jpg)
Slide content: Introduction
Questions:
Assessment of the expected error of a learning algorithm: Is the error rate of 1-NN less than 2%?
Comparing the expected errors of two algorithms: Is k-NN more accurate than MLP?
Training/validation/test sets
Resampling methods: K-fold cross-validation
Slide 4 ![Algorithm Preference Criteria](/documents_6/65d7e1c484af2424f10981cf45b3e963/img3.jpg)
Slide content: Algorithm Preference
Criteria (Application-dependent):
Misclassification error, or risk (loss functions)
Training time/space complexity
Testing time/space complexity
Interpretability
Easy programmability
Cost-sensitive learning
Slide 5 ![Resampling and K-Fold](/documents_6/65d7e1c484af2424f10981cf45b3e963/img4.jpg)
Slide content: Resampling and
K-Fold Cross-Validation
The need for multiple training/validation sets
{Xi,Vi}i: Training/validation sets of fold i
K-fold cross-validation: divide X into K equal parts Xi, i = 1, ..., K
Fold i uses Xi as the validation set Vi; the remaining K−1 parts form the training set Ti
Any two training sets Ti share K−2 of the K parts
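A minimal sketch of the K-fold split described above (index-based, so it applies to any dataset; the shuffling seed is just for reproducibility):

```python
import random

def k_fold_splits(n_instances, k, seed=0):
    """Split indices 0..n-1 into K parts; part i is the validation set V_i,
    and the remaining K-1 parts together form the training set T_i."""
    idx = list(range(n_instances))
    random.Random(seed).shuffle(idx)
    parts = [idx[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        val = parts[i]
        train = [j for p in parts[:i] + parts[i + 1:] for j in p]
        splits.append((train, val))
    return splits

splits = k_fold_splits(10, 5)
# Any two training sets share K-2 = 3 of the 5 parts.
```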
Slide 6 ![Cross-Validation times fold](/documents_6/65d7e1c484af2424f10981cf45b3e963/img5.jpg)
Slide content: 5×2 Cross-Validation
Five replications of two-fold cross-validation (Dietterich, 1998)
Slide 7 ![Bootstrapping Draw instances](/documents_6/65d7e1c484af2424f10981cf45b3e963/img6.jpg)
Slide content: Bootstrapping
Draw instances from a dataset with replacement
The probability that we do not pick a given instance in N draws is (1 − 1/N)^N ≈ e⁻¹ ≈ 0.368;
that is, only about 36.8% of the instances are left out of the bootstrap sample as new (unseen) validation data!
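A quick empirical check of the 36.8% figure (sketch; the dataset size 10 000 is arbitrary):

```python
import random

def bootstrap_sample(n, seed=0):
    """Draw n indices with replacement; return the sample and the
    out-of-bag indices (instances never picked)."""
    rng = random.Random(seed)
    sample = [rng.randrange(n) for _ in range(n)]
    oob = set(range(n)) - set(sample)
    return sample, oob

n = 10_000
_, oob = bootstrap_sample(n)
frac_left_out = len(oob) / n   # should be close to e^-1 ~ 0.368
```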
Slide 8 ![Measuring Error Error rate of](/documents_6/65d7e1c484af2424f10981cf45b3e963/img7.jpg)
Slide content: Measuring Error
Error rate = # of errors / # of instances = (FN+FP) / N
Recall = # of found positives / # of positives
= TP / (TP+FN) = sensitivity = hit rate
Precision = # of correctly found positives / # of instances declared positive
= TP / (TP+FP)
Specificity = TN / (TN+FP)
False alarm rate = FP / (FP+TN) = 1 - Specificity
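The definitions above can be collected into one small helper (the confusion-matrix counts in the example are made-up):

```python
def confusion_metrics(tp, fp, tn, fn):
    """Compute the error measures of the slide from confusion-matrix counts."""
    n = tp + fp + tn + fn
    return {
        "error_rate": (fn + fp) / n,
        "recall": tp / (tp + fn),          # sensitivity, hit rate
        "precision": tp / (tp + fp),
        "specificity": tn / (tn + fp),
        "false_alarm": fp / (fp + tn),     # = 1 - specificity
    }

m = confusion_metrics(tp=40, fp=10, tn=45, fn=5)
```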
Slide 9 ![ROC Curve](/documents_6/65d7e1c484af2424f10981cf45b3e963/img8.jpg)
Slide content: ROC Curve
Slide 10 ![Interval Estimation X xt t](/documents_6/65d7e1c484af2424f10981cf45b3e963/img9.jpg)
Slide content: Interval Estimation
X = { xt }t where xt ~ N(μ, σ²)
Sample average m ~ N(μ, σ²/N), so √N (m − μ)/σ ~ Z
100(1 − α)% confidence interval: m ± z_(α/2) σ/√N
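A sketch of the confidence interval for μ with known σ, using the standard-normal quantile from the Python standard library (the sample values and σ = 0.15 are made-up):

```python
from statistics import NormalDist, mean

def z_interval(x, sigma, alpha=0.05):
    """100(1-alpha)% confidence interval for mu when sigma is known:
    m -+ z_{alpha/2} * sigma / sqrt(N)."""
    n = len(x)
    m = mean(x)
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}, e.g. 1.96 for alpha=0.05
    half = z * sigma / n ** 0.5
    return m - half, m + half

lo, hi = z_interval([2.1, 1.9, 2.0, 2.2, 1.8], sigma=0.15)
```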
Slide 11 ![](/documents_6/65d7e1c484af2424f10981cf45b3e963/img10.jpg)
Slide 12 ![Hypothesis Testing Reject a](/documents_6/65d7e1c484af2424f10981cf45b3e963/img11.jpg)
Slide content: Hypothesis Testing
Reject a null hypothesis if not supported by the sample with enough confidence
X = { xt }t where xt ~ N(μ, σ²)
H0: μ = μ0 vs. H1: μ ≠ μ0
Accept H0 with level of significance α if μ0 lies in the 100(1 − α)% confidence interval
Two-sided test
Slide 13 ![One-sided test H vs. H gt](/documents_6/65d7e1c484af2424f10981cf45b3e963/img12.jpg)
Slide content: One-sided test: H0: μ ≤ μ0 vs. H1: μ > μ0
Accept H0 if √N (m − μ0)/σ < z_α
Variance unknown: use t instead of z, with S the sample standard deviation
Accept H0: μ = μ0 if √N (m − μ0)/S ∈ (−t_(α/2,N−1), t_(α/2,N−1))
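A sketch of the one-sided test with known σ (the sample and σ are made-up numbers; with unknown variance the same statistic is formed with the sample standard deviation and compared against a t_(α,N−1) table value instead of z_α):

```python
from statistics import NormalDist, mean

def one_sided_z_test(x, mu0, sigma, alpha=0.05):
    """Test H0: mu <= mu0 vs H1: mu > mu0 with known sigma.
    Accept H0 if sqrt(N)(m - mu0)/sigma < z_alpha."""
    n = len(x)
    z = (n ** 0.5) * (mean(x) - mu0) / sigma
    z_alpha = NormalDist().inv_cdf(1 - alpha)   # 1.645 for alpha=0.05
    return z < z_alpha, z

accept, z = one_sided_z_test([2.1, 1.9, 2.0, 2.2, 1.8, 2.05, 1.95, 2.1],
                             mu0=2.0, sigma=0.15)
```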
Slide 14 ![Assessing Error H p p vs. H p](/documents_6/65d7e1c484af2424f10981cf45b3e963/img13.jpg)
Slide content: Assessing Error:
H0: p ≤ p0 vs. H1: p > p0
Single training/validation set: binomial test
If the error probability is p0, the probability that there are e errors or fewer in N validation trials is
P{X ≤ e} = Σ j=0..e C(N, j) p0^j (1 − p0)^(N−j)
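A sketch of the one-sided binomial test built from that CDF (the counts 17 and 25 errors out of 100 are made-up illustrations; H0 is rejected when the observed error count falls in the upper tail):

```python
from math import comb

def binom_cdf(e, n, p):
    """P{X <= e} for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(e + 1))

def binomial_test(e, n, p0, alpha=0.05):
    """One-sided test of H0: p <= p0 vs H1: p > p0.
    Reject H0 when the upper-tail probability P{X >= e} falls below alpha."""
    p_upper = 1.0 - binom_cdf(e - 1, n, p0)
    return p_upper >= alpha   # True -> accept H0

# 17 errors in 100 trials is consistent with p0 = 0.15; 25 errors is not.
```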
Slide 15 ![Normal Approximation to the](/documents_6/65d7e1c484af2424f10981cf45b3e963/img14.jpg)
Slide content: Normal Approximation to the Binomial
The number of errors X is approximately normal with mean Np0 and variance Np0(1 − p0), so (X − Np0)/√(Np0(1 − p0)) ~ Z approximately
Slide 16 ![Paired t Test Multiple](/documents_6/65d7e1c484af2424f10981cf45b3e963/img15.jpg)
Slide content: Paired t Test
Multiple training/validation sets
xti = 1 if instance t is misclassified on fold i, and 0 otherwise
Error rate of fold i: pi = Σt xti / N
With m and s² the average and variance of the pi,
we accept that the error is p0 or less if √K (m − p0)/s
is less than t_(α,K−1)
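A sketch of that acceptance rule over per-fold error rates (the error values are made-up, and the critical value t_(0.05,9) = 1.833 for K = 10 folds is a standard table value, passed in rather than computed since the standard library has no t quantile):

```python
from statistics import mean, stdev

def t_test_error_rate(fold_errors, p0, t_crit):
    """Accept H0: p <= p0 if sqrt(K)(m - p0)/s < t_crit (= t_{alpha,K-1})."""
    k = len(fold_errors)
    m = mean(fold_errors)
    s = stdev(fold_errors)
    t = (k ** 0.5) * (m - p0) / s
    return t < t_crit, t

errors = [0.12, 0.10, 0.14, 0.11, 0.13, 0.12, 0.10, 0.13, 0.12, 0.11]
accept, t = t_test_error_rate(errors, p0=0.15, t_crit=1.833)
```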
Slide 17 ![Comparing Classifiers H vs. H](/documents_6/65d7e1c484af2424f10981cf45b3e963/img16.jpg)
Slide content: Comparing Classifiers:
H0: μ0 = μ1 vs. H1: μ0 ≠ μ1
Single training/validation set: McNemar's Test
e01: # of instances misclassified by 1 but not by 2; e10: the reverse
Under H0, we expect e01 = e10 = (e01 + e10)/2
Accept H0 if (|e01 − e10| − 1)² / (e01 + e10) < χ²_(α,1)
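A sketch of McNemar's test (the disagreement counts 15 and 10 are made-up; the critical value χ²_(0.05,1) = 3.84 is a standard table value):

```python
def mcnemar(e01, e10, chi2_crit=3.84):
    """McNemar's test: e01 = # misclassified by 1 but not by 2, e10 = the reverse.
    The statistic (|e01 - e10| - 1)^2 / (e01 + e10) is ~ chi^2 with 1 df under H0;
    chi2_crit = 3.84 is the chi^2_{0.05,1} table value."""
    stat = (abs(e01 - e10) - 1) ** 2 / (e01 + e10)
    return stat < chi2_crit, stat

same, stat = mcnemar(15, 10)   # (|15-10|-1)^2 / 25 = 0.64 -> accept H0
```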
Slide 18 ![K-Fold CV Paired t Test Use](/documents_6/65d7e1c484af2424f10981cf45b3e963/img17.jpg)
Slide content: K-Fold CV Paired t Test
Use K-fold cv to get K training/validation folds
pi1, pi2: Errors of classifiers 1 and 2 on fold i
pi = pi1 – pi2 : Paired difference on fold i
The null hypothesis is that the pi have mean 0; with m and s² their average and variance, accept it if √K·m/s ∈ (−t_(α/2,K−1), t_(α/2,K−1))
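A sketch of the K-fold cv paired t test (the per-fold error rates are made-up; t_(0.025,9) = 2.262 for K = 10 is a standard table value):

```python
from statistics import mean, stdev

def paired_t(p1, p2, t_crit=2.262):
    """K-fold cv paired t test over per-fold errors of two classifiers.
    Accept H0 (equal expected error) if sqrt(K)*m/s lies in (-t_crit, t_crit),
    where m, s are the mean and std dev of the paired differences."""
    diffs = [a - b for a, b in zip(p1, p2)]
    k = len(diffs)
    t = (k ** 0.5) * mean(diffs) / stdev(diffs)
    return abs(t) < t_crit, t

p1 = [0.12, 0.10, 0.14, 0.11, 0.13, 0.12, 0.10, 0.13, 0.12, 0.11]
p2 = [0.11, 0.12, 0.13, 0.12, 0.12, 0.13, 0.11, 0.12, 0.13, 0.12]
accept, t = paired_t(p1, p2)
```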
Slide 19 ![cv Paired t Test Use cv to](/documents_6/65d7e1c484af2424f10981cf45b3e963/img18.jpg)
Slide content: 5×2 cv Paired t Test
Use 5×2 cv to get 2 folds × 5 training/validation replications (Dietterich, 1998)
pi(j): difference between the errors of classifiers 1 and 2 on fold j = 1, 2 of replication i = 1, ..., 5
With p̄i = (pi(1) + pi(2))/2 and si² = (pi(1) − p̄i)² + (pi(2) − p̄i)²,
t = p1(1) / √(Σi si²/5) ~ t₅ under H0
Slide 20 ![cv Paired F Test](/documents_6/65d7e1c484af2424f10981cf45b3e963/img19.jpg)
Slide content: 5×2 cv Paired F Test
f = (Σi Σj (pi(j))²) / (2 Σi si²) ~ F(10,5) under H0
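A sketch computing both 5×2 cv statistics from the paired differences (the differences in the example are made-up; the critical values t_(0.025,5) = 2.571 and F_(0.05;10,5) = 4.74 are standard table values):

```python
def five_by_two_stats(p):
    """p[i] = (p_i(1), p_i(2)): differences between the two classifiers' errors
    on the two folds of replication i (i = 0..4), as in Dietterich (1998).
    Returns (t, f): t ~ t_5 and f ~ F_{10,5} under H0."""
    s2 = []
    for p1, p2 in p:
        pbar = (p1 + p2) / 2
        s2.append((p1 - pbar) ** 2 + (p2 - pbar) ** 2)
    t = p[0][0] / (sum(s2) / 5) ** 0.5
    f = sum(pij ** 2 for pair in p for pij in pair) / (2 * sum(s2))
    return t, f

p = [(0.01, -0.02), (0.02, 0.00), (-0.01, 0.01), (0.00, 0.02), (0.01, -0.01)]
t, f = five_by_two_stats(p)
# accept H0 at alpha=0.05 if |t| < 2.571 and f < 4.74
```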
Slide 21 ![Comparing L gt Algorithms](/documents_6/65d7e1c484af2424f10981cf45b3e963/img20.jpg)
Slide content: Comparing L > 2 Algorithms: Analysis of Variance (ANOVA)
Errors of L algorithms on K folds
We construct two estimators of σ².
One is valid only if H0 is true; the other is always valid.
We reject H0 if the two estimators disagree.
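A sketch of the one-way ANOVA F statistic over the L×K table of fold errors (the error values are made-up; the between-group estimator is the one valid only under H0, the within-group estimator is always valid):

```python
from statistics import mean

def anova_f(errors):
    """One-way ANOVA over errors[l][k]: error of algorithm l on fold k.
    Returns the ratio of the between-group estimate of sigma^2 (valid only
    under H0) to the within-group estimate (always valid); the ratio is
    ~ F_{L-1, L(K-1)} under H0."""
    L, K = len(errors), len(errors[0])
    means = [mean(row) for row in errors]
    grand = mean(means)
    between = K * sum((m - grand) ** 2 for m in means) / (L - 1)
    within = sum((x - means[l]) ** 2
                 for l, row in enumerate(errors) for x in row) / (L * (K - 1))
    return between / within

errors = [
    [0.12, 0.10, 0.14, 0.11, 0.13],   # algorithm 1
    [0.11, 0.12, 0.13, 0.12, 0.12],   # algorithm 2
    [0.20, 0.19, 0.21, 0.20, 0.20],   # algorithm 3 (clearly worse)
]
f = anova_f(errors)   # large F -> the estimators disagree -> reject H0
```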
Slide 22 ![](/documents_6/65d7e1c484af2424f10981cf45b3e963/img21.jpg)
Slide 23 ![](/documents_6/65d7e1c484af2424f10981cf45b3e963/img22.jpg)
Slide 24 ![Other Tests Range test](/documents_6/65d7e1c484af2424f10981cf45b3e963/img23.jpg)
Slide content: Other Tests
Range test (Newman-Keuls):
Nonparametric tests (Sign test, Kruskal-Wallis)
Contrasts: Check if 1 and 2 differ from 3,4, and 5
Multiple comparisons require the Bonferroni correction: if there are m tests, then to keep an overall significance level of α, each individual test should use significance level α/m.
Regression: CLT states that the sum of iid variables from any distribution is approximately normal and the preceding methods can be used.
Other loss functions?
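The Bonferroni rule above is a one-liner, shown here as a sketch:

```python
def bonferroni_alpha(alpha, m):
    """Per-test significance level for m tests so that the overall
    (family-wise) significance level stays at about alpha."""
    return alpha / m

per_test = bonferroni_alpha(0.05, 10)   # 10 pairwise tests at overall alpha = 0.05
```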