When 0:00:00 - 0:00:02
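I see these scenes: "A sign that reads 'Setting up application'"
I found these contents: "A black circle with white inside"
I detected these tags: "black | trademark | circle | spiral | logo | swirl | symbol | text."
I recognized this text: "Setting up your ML application Bias/Variance deeplearning.ai I've noticed that almost all the truly excellent machine learning practitioners I've noticed that almost all the really good machine learning"
I heard someone say: "I've noticed that almost all the really good machine learning practitioners"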


When 0:00:02 - 0:00:07
I see these scenes: "A sign that reads 'Set up an AI application'"
I found these contents: "A black circle with white inside"
I detected these tags: "black | trademark | circle | spiral | logo | swirl | symbol | text."
I recognized this text: "Setting up your ML application Bias/Variance deeplearning.ai are all very experienced in handling bias and variance practitioners tend to be very sophisticated in understanding of Bias and Variance."
I heard someone say: "I've noticed that almost all the really good machine learning practitioners tend to have a very sophisticated understanding of bias and variance."


When 0:00:07 - 0:00:10
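I see these scenes: "A man sitting in front of a computer screen with a picture of a man on it"
I found these contents: "Black hair, black picture frame"
I detected these tags: "appear | computer | computer screen | dress shirt | image | logo | man | news | sit | video."
I recognized this text: "Setting up your Bias/Variance deeplearning.ai Handling bias and variance is often very easy to get started with but very hard to master Bias and Variance is one of those concepts that's easily learned but difficult to master."
I heard someone say: "tend to have a very sophisticated understanding of bias and variance. Bias and variance is one of those concepts that's"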


When 0:00:10 - 0:00:41
I see these scenes: "A person standing in front of a computer screen with a circular diagram"
I found these contents: "Wearing a black jacket, a man with a black frame"
I detected these tags: "appear | computer | computer screen | table | image | logo | man | news | sit | speaker | video."
I recognized this text: "Setting up your ML application Bias/Variance deeplearning.ai There is little discussion of the bias-variance dilemma (or bias-variance trade-off) less discussion of what's called the bias-variance trade-off."
I heard someone say: "Bias and variance is one of those concepts that's easy to learn but difficult to master. Even if you think you've seen the basic concepts of bias and variance, there's often more nuance to it than you'd expect. In the deep learning era, another trend is that there's been less discussion of what's called the bias-variance trade-off. You might have heard of this thing called the bias-variance trade-off. But in the deep learning era, there's less of a trade-off. So we still talk about bias, we still talk about variance, but we just talk less about the bias-variance trade-off. Let's see what this means. Let's see the dataset that looks like this."


When 0:00:41 - 0:01:12
I see these scenes: "A chart showing different types of data charts"
I found these contents: "The second hand on a clock"
I detected these tags: "chart | number | line | plot | point | rectangle | slope | symmetry."
I recognized this text: "Bias and Variance X X high bias "just right" high variance unduf Or, to put it another way, this is underfitting what we say that this is underfitting the data. Andrew Ng"
I heard someone say: "Let's see the dataset that looks like this. If you fit a straight line to the data, maybe you get a logistic regression fit to that. This is not a very good fit to the data, and so this is a case of high bias, or we say that this is underfitting the data. On the opposite end, if you fit an incredibly complex classifier, maybe a deep neural network or a neural network with a lot of hidden units, maybe you can fit the data perfectly, but that doesn't look like a great fit either."


When 0:01:12 - 0:01:57
I see these scenes: "A chart showing different types of bass and difference"
I found these contents: "Tune"
I detected these tags: "chart | number | line | plot | point."
I recognized this text: "Bias and Variance high bias "just right" high variance Like this one in the figure, which has two features So in a 2D example like this, Andrew Ng"
I heard someone say: "but that doesn't look like a great fit either. So this is a classifier with high variance, and this is overfitting the data. And there might be some classifier in between with a medium level of complexity that maybe fits a curve like that, that looks like a much more reasonable fit to the data. And so that's the, you know, you call that, you know, just right, right? Somewhere in between. So in a 2D example like this with just two features, x1 and x2, you can plot the data and visualize bias and variance. In high dimensional problems, you can't plot the data and visualize the decision boundary. Instead, there are a couple of different metrics that we'll look at to try to understand bias and variance. So continuing our example of cat picture classification, where that's a positive example and that's a negative example."
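
A minimal sketch of the underfitting and overfitting extremes described above, measured by error rates instead of a plot. Everything in it is an assumption made for illustration: the synthetic two-feature dataset, the 20% label noise, and the two stand-in classifiers (a constant majority-class guess as the "too simple" model and 1-nearest-neighbor as the "too flexible" model). They are not the logistic regression or neural network mentioned in the lecture.

```python
import random

def make_data(n, rng, noise=0.2):
    """Hypothetical 2-feature dataset: label 1 inside a circle, some labels flipped."""
    data = []
    for _ in range(n):
        x1, x2 = rng.random(), rng.random()
        label = int((x1 - 0.5) ** 2 + (x2 - 0.5) ** 2 < 0.1)
        if rng.random() < noise:
            label = 1 - label  # simulate mislabeled examples
        data.append(((x1, x2), label))
    return data

def majority_classifier(train):
    """Too simple: always predicts the most common training label."""
    ones = sum(label for _, label in train)
    guess = int(ones * 2 >= len(train))
    return lambda point: guess

def one_nn_classifier(train):
    """Too flexible: memorizes the training set, noise included."""
    def predict(point):
        nearest = min(
            train,
            key=lambda ex: (ex[0][0] - point[0]) ** 2 + (ex[0][1] - point[1]) ** 2,
        )
        return nearest[1]
    return predict

def error_rate(predict, data):
    """Fraction of examples the classifier gets wrong."""
    return sum(predict(point) != label for point, label in data) / len(data)

rng = random.Random(0)
train, dev = make_data(200, rng), make_data(200, rng)
for name, build in [("majority", majority_classifier), ("1-NN", one_nn_classifier)]:
    model = build(train)
    print(f"{name}: train error {error_rate(model, train):.3f}, "
          f"dev error {error_rate(model, dev):.3f}")
```

The 1-nearest-neighbor model always reaches zero training error because every training point is its own nearest neighbor, which is the "fits the data perfectly" situation; whether its dev error is noticeably worse depends on the noise level chosen here.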


When 0:01:57 - 0:02:07
I see these scenes: "A captioned photo of a cat and a dog in a field"
I found these contents: "Brown and white dog, gray cat looking out the window"
I detected these tags: "animal."
I recognized this text: "Bias and Variance Cat classification Train set error: Dev set error: Training set error and dev set error the train set error and the dev set or the development set error. Andrew Ng"
I heard someone say: "where that's a positive example and that's a negative example. The two key, the two key numbers to look at to understand bias and variance will be the training set error and the dev set, or the development set error. So for the sake of argument,"


When 0:02:07 - 0:02:33
I see these scenes: "A captioned photo of a cat and a dog in a field"
I found these contents: "A brown and white dog, a gray cat looking out the window"
I detected these tags: "animal | cat."
I recognized this text: "Bias and Variance Cat classification Train set error: Dev set error: So we can take it that your training set error is 1%, and as for the dev set error So let's say, your training set error is 1% and your dev set error is, Andrew Ng"
I heard someone say: "So for the sake of argument, let's say that, you know, recognizing cats in pictures is something that people can do nearly perfectly, right? And so let's say your training set error is 1%, and your dev set error is, for the sake of argument, let's say it's 11%. So in this example, you're doing very well on the training set but you're doing relatively poorly on the development set."


When 0:02:33 - 0:03:52
I see these scenes: "A photo of a cat and a dog in a field"
I found these contents: "A brown and white dog, a gray cat looking out the window"
I detected these tags: "animal | cat."
I recognized this text: "Bias and Variance Cat classification Train set error: 1/ 1 5 % Dev set error: 1 1/0 I'm writing this training set error in the first row I'm writing your training set error in the top row, Andrew Ng"
I heard someone say: "the training set but you're doing relatively poorly on the development set. So this looks like you might have overfit the training set, that somehow you're not generalizing well to this hold-out cross-validation set or the development set. And so if you have an example like this, we will say this has high variance. So by looking at the training set error and the development set error, you know, you would be able to render a diagnosis of your algorithm having high variance. Now, let's say that you measure your training set and your dev set error, and you get a different result. Let's say that your training set error is 15%. I'm writing your training set error in the top row, and your dev set error is 16%. In this case, assuming that humans achieve, you know, roughly 0% error, that humans can look at these pictures and just tell if it's a cat or not, then it looks like the algorithm is not even doing very well on the training set. So if it's not even fitting the training data that well, then this is underfitting the data, and so this algorithm has high bias. But in contrast, this is actually generalizing at a reasonable level to the dev set. Whereas performance on the dev set is only 1% worse than performance on the training set. So this algorithm has a problem of high bias because,"


When 0:03:52 - 0:04:37
I see these scenes: "A photo of a cat and a dog"
I found these contents: "A brown and white dog, a gray cat looking out the window"
I detected these tags: "animal | dog | image | pet."
I recognized this text: "Bias and Variance Cat classification Train set error: 1/ 5% Is·1. Dev set error: 1 1/ 6 /。 30.1. Hn:O。 In this case, I can tell that this algorithm has high bias In this case, I would diagnose this algorithm as having high bias, Andrew Ng"
I heard someone say: "So this algorithm has a problem of high bias because, well, it's not even fitting the training set well. This is similar to the leftmost plot we had on the previous slide. Now, here's another example. Let's say that you have 15% training set error, so that's pretty high bias. But when you evaluate on the dev set, it does even worse. Maybe it does, you know, 30%. In this case, I would diagnose this algorithm as having high bias, because it's not doing that well on the training set, and high variance. So this is, you know, really the worst of both worlds. And one last example, if you have, you know, 0.5% training set error and 1% dev set error, then, well, maybe your users are quite happy that you have a cat classifier with only 1% error,"
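
The four train/dev error pairs discussed so far can be run through a small rule of thumb. This is only a sketch: the function name `diagnose` and the 5-percentage-point `threshold` are hypothetical choices, not values given in the lecture, and the rule assumes that optimal (human-level) error is close to 0% and that the training and dev sets come from the same distribution.

```python
def diagnose(train_error, dev_error, threshold=0.05):
    """Label a classifier from its training and dev set error rates.

    Assumes optimal error is near 0. `threshold` is a hypothetical cutoff
    for what counts as a large error or a large train-to-dev gap.
    """
    high_bias = train_error > threshold                 # not fitting the training set well
    high_variance = (dev_error - train_error) > threshold  # not generalizing to the dev set
    if high_bias and high_variance:
        return "high bias and high variance"
    if high_bias:
        return "high bias"
    if high_variance:
        return "high variance"
    return "low bias and low variance"

# (training set error, dev set error) pairs from the cat classification example
examples = [(0.01, 0.11), (0.15, 0.16), (0.15, 0.30), (0.005, 0.01)]
for train_error, dev_error in examples:
    print(f"train {train_error:.1%}, dev {dev_error:.1%}: "
          f"{diagnose(train_error, dev_error)}")
```

With the default cutoff the four pairs come out as high variance, high bias, both, and neither, matching the diagnoses given in the speech; a different cutoff could change the borderline cases.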


When 0:04:37 - 0:08:46
I see these scenes: "A photo of a cat and a dog"
I found these contents: "A brown and white dog, a gray cat looking out the window"
I detected these tags: "animal | dog | image | pet."
I recognized this text: "Bias and Variance Cat classification Is·1. G.S.1. Train set error S Dev set error? 1 1/0 30.1. 5836 O/。 and what high variance looks like My high variance looks like, Andrew Ng"
I heard someone say: "then, well, maybe your users are quite happy that you have a cat classifier with only 1% error, then this would have, you know, low bias and low variance. One subtlety that I'll just briefly mention, but we'll leave to a later video to discuss in detail, is that this analysis is predicated on the assumption that human level performance gets, you know, gets nearly 0% error, or more generally that the optimal error, sometimes called Bayes error, for the, so the Bayesian optimal error is nearly 0%. I don't want to go into detail on this in this particular video, but it turns out that if the optimal error or the Bayes error were much higher, say it were 15%, then if you look at this classifier, 15% is actually perfectly reasonable for a training set, and you wouldn't say that's high bias. And it also has pretty low variance. So the case of how to analyze bias and variance when no classifier can do very well, for example, if you have really blurry images so that, you know, even a human or just no system could possibly do very well, then maybe Bayes error is much higher, and then there's some details of how this analysis would change. But leaving aside this subtlety for now, the takeaway is that, you know, by looking at your training set error, you can get a sense of how well you're fitting at least the training data, and so that tells you if you have a bias problem. And then looking at how much higher your error goes, when you go from the training set to the dev set, that should give you a sense of how bad is the variance problem. So how good a job you're doing generalizing from the training set to the dev set, that gives you a sense of your variance. All this is under the assumption that the Bayes error is quite small, and that your training and your dev sets are drawn from the same distribution. If those assumptions are violated, there's a more sophisticated analysis you could do, which we'll talk about in a later video.
Now, on the previous slide, you saw what high bias, high variance looks like, and I guess you had a sense of what a good classifier looks like. What does high bias and high variance look like? It's kind of the worst of both worlds. So you remember we said that a classifier like this, the linear classifier has high bias because it underfits the data. So this would be a classifier that is mostly linear, and therefore underfits the data. We're drawing this in purple. But if somehow your classifier does some weird things, then it's actually overfitting parts of the data as well. So the classifier that I drew in purple has both high bias and high variance. It has high bias because by being a mostly linear classifier, it's just not fitting, you know, this quadratic line shape that well. But by having too much flexibility in the middle, it somehow gets this example and this example, overfits those two examples as well. So this classifier kind of has high bias because it was mostly linear, but you needed maybe a curved function, a quadratic function, and it has high variance because it had too much flexibility to fit, you know, those two mislabeled or outlier examples in the middle as well. In case this seems contrived, well, it is, this example is a little bit more contrived in two dimensions, but with very high dimensional inputs, you actually do get things with high bias in some regions and high variance in some regions. And so it is possible to get classifiers like this in high dimensional inputs that seem less contrived. So to summarize, you've seen how by looking at your algorithm's error on the training set and your algorithm's error on the dev set, you can try to diagnose whether it has a problem of high bias or high variance or maybe both or maybe neither. And depending on whether your algorithm suffers from bias or variance, it turns out that there are different things you could try.
So in the next video, I want to present to you a, what I call a basic recipe for machine learning that lets you more systematically try to improve your algorithm depending on whether it has high bias or high variance issues. So let's go on to the next video."
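
The Bayes error subtlety mentioned in this segment can be made concrete with a little arithmetic. The sketch below is an assumption about how one might write it down, not the analysis the lecture defers to a later video: the helper name `error_gaps` is hypothetical, and it simply measures the training error against the optimal error and the dev error against the training error.

```python
def error_gaps(train_error, dev_error, bayes_error=0.0):
    """Return (bias_gap, variance_gap) as fractions.

    bias_gap: how far training error sits above the optimal (Bayes) error.
    variance_gap: how much error rises from the training set to the dev set.
    `bayes_error` defaults to 0, the assumption used in the cat example.
    """
    bias_gap = train_error - bayes_error
    variance_gap = dev_error - train_error
    return bias_gap, variance_gap

# Same classifier (15% train, 16% dev) under two assumed optimal error rates
for bayes_error in (0.0, 0.15):
    bias_gap, variance_gap = error_gaps(0.15, 0.16, bayes_error)
    print(f"Bayes error {bayes_error:.0%}: "
          f"bias gap {bias_gap:.1%}, variance gap {variance_gap:.1%}")
```

When the optimal error is assumed to be 0%, the 15% training error is all bias gap; when it is assumed to be 15%, the bias gap disappears and the same classifier no longer looks like a high-bias case, which is the point made in the speech.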