2018 AP Statistics Pre-Exam Review

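The AP Statistics course outline is divided into four parts.
Part 1: Data analysis, 20-30% of the exam. Students should master the basic methods for organizing and displaying data, and use graphs to explore preliminary patterns in data distributions.
Part 2: Sampling and experimental design, 10-15% of the exam. Students learn how to design experiments and how to organize observational studies, the two different ways of obtaining data.
Part 3: Probability, 20-30% of the exam. Students need to know how to calculate the probability of events; the distributions of random variables are the focus of this part.
Part 4: Statistical inference, 30-40% of the exam. This part is both the main focus and the hardest part of the exam, and it carries the most weight. It covers interval estimation and hypothesis testing. Pay particular attention to the conditions under which each method applies and to writing up answers in the expected form.
Key topics include distributions of data, correlation and regression, sample selection, experimental design, the binomial distribution, the normal distribution, the Central Limit Theorem, confidence intervals, and significance tests.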
Part 1: Exploring data
(1)
Dotplot, stemplot, histogram, cumulative frequency distribution, boxplot
Center and spread, clusters and gaps, outliers, shape
(2)
Mean and median
Range, interquartile range, variance and standard deviation
Quartiles, percentiles, z-scores
Effect of changing units
(3)
Comparing center, spread, clusters, gaps, outliers, and shape
(4)
Scatterplot, correlation, and linearity
Least squares regression line
Residual plots, outliers, and influential points
Transformations to achieve linearity
(5)
Marginal and joint frequencies for two-way tables
Conditional relative frequencies and association
Part 2: Sampling and experimentation
(6)
Census, sample survey, observational study
Experiment
(7)
Characteristics of a well-designed and well-conducted survey
Populations, samples, random selection
Sources of bias in sampling and surveys
Sampling methods
(8)
Characteristics of a well-designed and well-conducted experiment
Treatments, control groups, experimental units, randomization, replication
Placebo effect, blinding
Completely randomized design
Randomized block design, including matched pairs design
Generalizability of results
Part 3: Anticipating patterns
(9)
Law of Large Numbers
Addition rule, multiplication rule, conditional probability, independence
Discrete random variables and their probability distributions, including binomial and geometric
Simulation of random behavior and probability distributions
Mean and standard deviation of a random variable
Independence and dependence
Mean and standard deviation for sums and differences of independent random variables
Properties of the normal distribution
Using tables of the normal distribution
The normal distribution as a model for measurements
Central Limit Theorem
Sampling distribution of a sample proportion
Sampling distribution of a sample mean
Sampling distribution of a difference between two independent sample proportions
Sampling distribution of a difference between two independent sample means
Simulation of sampling distributions
t-distribution, chi-square distribution
Part 4: Statistical inference
Meaning of a confidence interval
Confidence interval for a proportion
Confidence interval for a difference of two proportions
Confidence interval for a mean
Confidence interval for a difference of two means, paired and unpaired
Confidence interval for the slope of a least squares regression line
Logic of significance testing
Null and alternative hypotheses, P-value
One- and two-sided tests
Type I and Type II errors, power of the test
Large sample test for a proportion
Large sample test for a difference between two proportions
Test for a mean
Test for a difference between two means, paired and unpaired
Test for the slope of a least squares regression line
Chi-square test for goodness of fit, independence, and homogeneity of proportions
2. AP Statistics exam question types and weights
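Part 1: Multiple choice | 40 questions in total | Raw-score rule
Multiple choice | 40 questions (90 minutes) | 40 questions × 1 point each × 1.25 = 50 points
Part 2: Free response | 6 questions in total |
Part A | | 5 questions × 4 points each × 1.875 = 37.5 points
Part B | |
Total | 180 minutes for the whole exam | 100 points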
Conversion table from AP Statistics raw scores to the 5-point scale, in use since the 2011 exam:
AP Statistics raw score | AP Statistics score (5-point scale)
68-100 | 5
53-67 | 4
40-52 | 3
29-39 | 2
0-28 | 1
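The conversion above can be sketched as a small lookup. This is only an illustration: the function name `ap_score` and the `CUTOFFS` list are hypothetical, the cutoffs are copied from the table, and treating a fractional raw score as belonging to the band whose lower bound it reaches is an assumption.

```python
# Lower bound of each raw-score band and the AP score it maps to,
# copied from the conversion table above.
CUTOFFS = [(68, 5), (53, 4), (40, 3), (29, 2), (0, 1)]


def ap_score(raw):
    """Map a raw score (0-100) to the 5-point AP scale."""
    if not 0 <= raw <= 100:
        raise ValueError("raw score must be between 0 and 100")
    # Bands are listed from highest to lowest, so the first match wins.
    for lower_bound, score in CUTOFFS:
        if raw >= lower_bound:
            return score


print(ap_score(70))
```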
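II. Sample questions and how to approach them
The AP Statistics exam has two question types: multiple-choice questions and free-response questions.
The multiple-choice section has 40 questions with 5 options each, to be answered in 90 minutes.
The free-response section has 6 questions in a 5+1 format. The first 5 are of moderate difficulty and should take 10 to 15 minutes each; the last one is somewhat harder, has more sub-questions, and should take 20 to 30 minutes.
1. Multiple-choice questions: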
(1) Each person in a simple random sample of 2,000 received a survey, and 317 people returned their survey. How could nonresponse cause the results of the survey to be biased?
(A)
(B)
(C)
(D)
(E)
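Explanation: This is a random sample, but most of the people in it did not respond to the survey. If the researchers analyze only the opinions of those who responded, the results may differ greatly from the truth: if nonrespondents hold different opinions from respondents, the views of the large share of people who did not respond will not be reflected in the survey results. The answer is (E).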
(2) Exercise psychologists are investigating the relationship between lean body mass (in kilograms) and the resting metabolic rate (in calories per day) in sedentary males.
Predictor
Constant
Mass
Based on the computer output above, which of the following is the best interpretation of the value of the slope of the regression line?
(A) For each additional kilogram of lean body mass, the resting metabolic rate increases on average by 22.563 calories per day.
(B) For each additional kilogram of lean body mass, the resting metabolic rate increases on average by 264 calories per day.
(C) For each additional kilogram of lean body mass, the resting metabolic rate increases on average by 144.9 calories per day.
(D) For each additional calorie per day for the resting metabolic rate, the lean body mass increases on average by 22.563 kilograms.
(E) For each additional calorie per day for the resting metabolic rate, the lean body mass increases on average by 264.0 kilograms.
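Explanation: This question tests the interpretation of the slope. According to the statistical software output, the intercept is 264.0 and the slope is 22.563. When interpreting a slope, we say that for each one-unit change in x, the predicted y changes on average by so many units. Here lean body mass is x, so the answer is (A).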
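A quick way to see the slope interpretation is to compare the predictions for two men whose lean body mass differs by exactly one kilogram. The sketch below uses the intercept and slope quoted in the explanation; the masses of 50 kg and 51 kg and the name `predicted_rate` are made up for illustration.

```python
INTERCEPT = 264.0  # calories per day, from the explanation above
SLOPE = 22.563     # calories per day per kilogram, from the explanation above


def predicted_rate(mass_kg):
    """Predicted resting metabolic rate for a given lean body mass."""
    return INTERCEPT + SLOPE * mass_kg


# Hypothetical masses one kilogram apart.
difference = predicted_rate(51.0) - predicted_rate(50.0)
print(round(difference, 3))
```

Whatever pair of masses is chosen, the difference equals the slope, which is the statement in option (A).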
(3) A large company is considering opening a franchise in St. Louis and wants to estimate the mean household income for the area using a simple random sample of households. Based on information from a pilot study, the company assumes that the standard deviation of household incomes is σ = $7,200. Of the following, which is the least number of households that should be surveyed to obtain an estimate that is within $200 of the true mean household income with 95 percent confidence?
(A)75
(B)1,300
(C)5,200
(D)5,500
(E)7,700
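Explanation: This question tests the sample size calculation. At the 95% confidence level, the margin of error must be kept within $200. The formula is Margin of Error = zσ/√n. The z value for 95% confidence is 1.96, σ = $7,200 is given, and the margin of error is $200, so the only unknown is n (the sample size). Solving gives n ≥ (1.96 × 7,200 / 200)² ≈ 4,979, and the smallest option that meets this is 5,200, so the answer is (C).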
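The sample size calculation can be checked with a few lines of Python. This is a minimal sketch using the numbers from the question; the variable names are illustrative.

```python
import math

Z_95 = 1.96      # critical value for 95 percent confidence
SIGMA = 7200.0   # assumed standard deviation of household income, in dollars
MARGIN = 200.0   # required margin of error, in dollars

# Solve MARGIN >= Z_95 * SIGMA / sqrt(n) for n.
n_exact = (Z_95 * SIGMA / MARGIN) ** 2
n_min = math.ceil(n_exact)  # a sample size must be a whole number

# Pick the smallest answer option that is at least n_min.
options = [75, 1300, 5200, 5500, 7700]
answer = min(n for n in options if n >= n_min)
print(n_min, answer)
```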
(4) A candy company claims that 10 percent of its candies are blue. A random sample of 200 of these candies is taken, and 16 are found to be blue. Which of the following tests would be most appropriate for establishing whether the candy company needs to change its claim?
(A)
(B)
(C)
(D)
(E)
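Explanation: This question tests the choice of significance test. The company claims that 10% of its candies are blue; testing this claim is a test of a single population proportion, so the answer is (B).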
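The question only asks which test to use, but carrying the test through shows what a one-proportion z-test looks like. The sketch below assumes a two-sided alternative (the question does not state one) and computes the normal CDF from `math.erf`; the helper name `normal_cdf` is illustrative.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


p0 = 0.10   # claimed proportion of blue candies
n = 200     # sample size
blue = 16   # blue candies observed

# Large-sample condition: expected counts under the claim are both at least 10.
assert n * p0 >= 10 and n * (1 - p0) >= 10

p_hat = blue / n
standard_error = math.sqrt(p0 * (1 - p0) / n)  # uses p0 because H0 is assumed true
z = (p_hat - p0) / standard_error
p_value = 2 * normal_cdf(-abs(z))  # two-sided alternative is an assumption

print(round(z, 3), round(p_value, 3))
```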
(5) In a certain game, a fair die is rolled and a player gains 20 points if the die shows a “6”. If the die does not show a “6”, the player loses 3 points. If the die were to be rolled 100 times, what would be the expected total gain or loss for the player?
(A) A gain of about 1,700 points
(B) A gain of about 583 points
(C) A gain of about 83 points
(D) A loss of about 250 points
(E) A loss of about 300 points
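Explanation: This question tests the expected value of a random variable. When a die is rolled, the probability of showing a 6 is 1/6, so the probability of any other result is 5/6. Applying the expected value formula directly, the expected value for one roll is 20 × 1/6 + (-3) × 5/6 ≈ 0.83. Over 100 rolls the expected gain is about 83 points, so the answer is (C).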
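The same expected value can be computed exactly with fractions, which avoids any rounding until the final step. A minimal sketch with illustrative variable names:

```python
from fractions import Fraction

p_six = Fraction(1, 6)  # probability that a fair die shows a 6
gain_if_six = 20
loss_otherwise = -3

# Expected value of one roll: sum of (outcome value x probability).
expected_per_roll = gain_if_six * p_six + loss_otherwise * (1 - p_six)

# Expectation is linear, so 100 rolls give 100 times the single-roll value.
expected_total = 100 * expected_per_roll

print(expected_per_roll, float(expected_total))
```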
(6) Consider n pairs of numbers (x1, y1), (x2, y2), …, and (xn, yn). The mean and standard deviation of the x-values are
(A) y = -5 + 3x
(B) y = 3x
(C) y = 5 + 2.5x
(D) y = 8.5 + 0.3x
(E) y = 10 + 0.4x
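Explanation: In deriving the least squares regression line, we obtained the result that the line always passes through the point (x̄, ȳ). So we substitute x = 5 into each of the five options and check whether y equals 10, which eliminates (B), (C), and (E). At this point we have not yet used the other two given conditions, the standard deviations of x and y. The formula that involves them is b = r(s_y/s_x). Although we do not know r, the fact that r lies between -1 and 1 is an implicit condition, from which we can deduce that b lies between -2.5 and 2.5. This rules out (A), whose slope is 3, so the answer is (D).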
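The two-step elimination can be sketched in code. The means (5 and 10) and the slope bound of 2.5 are taken from the explanation; the dictionary of candidate lines and the helper `is_possible` are illustrative names.

```python
import math

X_BAR, Y_BAR = 5.0, 10.0  # means of x and y, from the explanation
MAX_ABS_SLOPE = 2.5       # bound on |b| that follows from -1 <= r <= 1

# Each option as (intercept, slope).
candidates = {
    "A": (-5.0, 3.0),
    "B": (0.0, 3.0),
    "C": (5.0, 2.5),
    "D": (8.5, 0.3),
    "E": (10.0, 0.4),
}


def is_possible(intercept, slope):
    """A least squares line must pass through (x̄, ȳ) and have |b| <= 2.5."""
    passes_through_means = math.isclose(intercept + slope * X_BAR, Y_BAR)
    return passes_through_means and abs(slope) <= MAX_ABS_SLOPE


possible = [label for label, (a, b) in candidates.items() if is_possible(a, b)]
print(possible)
```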
(7) Descriptive Statistics
Variable
Score
Some descriptive statistics for a set of test scores are shown above. For this test, a certain student has a standardized score of z = -1.2. What score did this student receive on the test?
(A)266.28
(B)779.42
(C)1008.42
(D)1083.38
(E)1311.98
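Explanation: The question gives summary statistics for the students' scores, and the z-score measures a value's relative position within the distribution. According to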
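2. Free-response questions:
AP Statistics free-response questions are comprehensive: each usually consists of several sub-questions and draws on material from several chapters.
The allocation of the six free-response questions is fairly stable. Typically there is one question on descriptive statistics from Part 1, one on linear regression, one on experimental design and sampling from Part 2, one or two on interval estimation from Part 4, and one on hypothesis testing; the last question usually combines all of the above.
For free-response questions, always read the question first to identify which topic is being tested. Also remember that the information needed for a given sub-question is found only in the text before it; do not use information that appears after that sub-question.
Taking the free-response section of the 2014 AP Statistics exam as an example, the topics tested were structured as follows: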
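Question | Part (a) | Part (b) | Part (c)
1 | Descriptive statistics | Descriptive statistics | Hypothesis testing
2 | Probability calculation | Hypothesis testing | Sampling method design
3 | Probability calculation | Sampling distribution | Probability calculation
4 | Descriptive statistics | Sampling method design |
5 | Hypothesis testing | |
6 | Linear regression | |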
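Although the share of each topic on the AP Statistics exam is relatively stable and the depth of the exam does not change, students are expected to command a considerable breadth of material.
Take hypothesis testing in statistical inference as an example; the tests fall into the following categories.
Among them, the test shown in the image at https://mmbiz.qpic.cn/mmbiz/GqC8DE4aRicrIRqWiaG5af15hcSqFjnwJfKrWKOyXrwdkJHFkbJx1TqDuPNTkReuWlib6KAemyk1plT7AQlqUvT0A/640?wx_fmt=png&tp=webp&wxfrom=5&wx_lazy=1 and the paired t-test (pair-test) are slightly less common topics than the other tests. Many students assume they are unlikely to be tested and strategically skip them when reviewing; this is not advisable.
A look at past exams shows that the 2013 AP Statistics exam tested the test shown in the image at https://mmbiz.qpic.cn/mmbiz/GqC8DE4aRicrIRqWiaG5af15hcSqFjnwJfKrWKOyXrwdkJHFkbJx1TqDuPNTkReuWlib6KAemyk1plT7AQlqUvT0A/640?wx_fmt=png&tp=webp&wxfrom=5&wx_lazy=1, and the 2014 exam tested the paired t-test. Neither question was hard; both were standard problems that can be solved easily with an answer template.
This shows that AP Statistics is an applied subject: it does not test academic theory for its own sake, but requires students to solve practical problems in a variety of real-world settings (the contexts described in the questions). Breadth of knowledge therefore becomes especially important. When reviewing, be especially careful with topics that can appear as free-response questions but are not tested often.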
(1) People with acrophobia (fear of heights) sometimes enroll in therapy sessions to help them overcome this fear. Typically, seven or eight therapy sessions are needed before improvement is noticed. A study was conducted to determine whether the drug D-cycloserine, used in combination with fewer therapy sessions, would help people with acrophobia overcome this fear.
Each of 27 people who participated in the study received a pill before each of two therapy sessions. Seventeen of the 27 people were randomly assigned to receive a D-cycloserine pill, and the remaining 10 people received a placebo. After the two therapy sessions, none of the 27 people received additional pills or therapy. Three months after the administration of the pills and the two therapy sessions, each of the 27 people was evaluated to see if he or she had improved.
(a) Was this study an experiment or an observational study? Provide an explanation to support your answer.
(b) When the data were analyzed, the D-cycloserine group showed statistically significantly more improvement than the placebo group did. Based on this result, would the researchers be justified in concluding that the D-cycloserine and two therapy sessions are as beneficial as eight therapy sessions without the pill?
(c) A newspaper article that summarized the results of this study did not explain how it was determined which people received D-cycloserine and which received the placebo. Suppose that no randomization was used. Explain why such a method of assignment might lead to an incorrect conclusion.
Explanation:
(a)
The study was an experiment because treatments (D-cycloserine or placebo) were imposed by the researchers on the people with acrophobia.
In short: this is an experiment, because the researchers required the people with acrophobia to take either the drug or the placebo.
(b)
No, the experiment was designed to compare the D-cycloserine group with a control group that received the placebo. The researchers can conclude that the D-cycloserine pill and two therapy sessions show significantly more improvement than a placebo and two therapy sessions. However, there is no basis for comparison with another group of people with acrophobia who received eight therapy sessions and no pill.
In short: the experiment had no group that received eight therapy sessions, so the drug plus two sessions cannot be compared with eight sessions and no drug.
(c)
One example is that if the therapists were allowed to choose who received placebo and who received D-cycloserine, they might assign the people with more severe acrophobia to one of the groups and the people with less severe acrophobia to the other group. Thus, the improvement after only two therapy sessions could be related to the initial severity of the acrophobia rather than to the effects of D-cycloserine.
In short: without random assignment, the people with severe acrophobia might all end up in one group and those with mild acrophobia in the other. When the treatments are compared, the difference between the two groups could then be due to the initial severity of the acrophobia rather than to the treatment.
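For contrast, the random assignment described in the study (17 people to D-cycloserine, 10 to placebo) could be carried out along the following lines. This is only a sketch: the participant labels and the fixed seed are hypothetical, chosen so the example is reproducible.

```python
import random

rng = random.Random(2018)  # fixed seed is an assumption, for reproducibility only

# Hypothetical labels for the 27 participants.
participants = [f"P{i:02d}" for i in range(1, 28)]

# Shuffle a copy, then split: chance alone decides who gets which pill.
shuffled = participants[:]
rng.shuffle(shuffled)
drug_group = sorted(shuffled[:17])
placebo_group = sorted(shuffled[17:])

print(len(drug_group), len(placebo_group))
```

Because the split depends only on the shuffle, severity of acrophobia cannot systematically differ between the groups, which is exactly the protection that part (c) says is lost without randomization.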
(2) An airline claims that there is a 0.10 probability that a coach-class ticket holder who flies frequently will be upgraded to first class on any flight. This outcome is independent from flight to flight. Sam is a frequent flier who always purchases coach-class tickets.
(a) What is the probability that Sam’s first upgrade will occur after the third flight?
(b) What is the probability that Sam will be upgraded exactly 2 times in his next 20 flights?
(c) Sam will take 104 flights next year. Would you be surprised if Sam receives more than 20 upgrades to first class during the year? Justify your answer.
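Explanation:
(a)
Let Y denote the number of flights Sam must make until he receives his first upgrade. The random variable Y follows a geometric distribution with p = 0.1. The probability that Sam’s first upgrade will occur after his third flight is calculated below.
P(Y ≥ 4) = 1 - P(Y ≤ 3) = 1 - [0.1 + (0.9)(0.1) + (0.9)^2(0.1)] = (0.9)^3 = 0.729
This part tests the geometric distribution. First compute the probability that the first upgrade occurs on the first flight, then the probabilities that it occurs on the second flight and on the third flight. Finally, subtract these three probabilities from 1 to get the probability that the first upgrade occurs after the third flight.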
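The steps described for part (a) can be sketched directly, alongside the shortcut that "first upgrade after the third flight" simply means no upgrade on any of the first three flights. Variable names are illustrative.

```python
p = 0.1  # probability of an upgrade on any one flight

# Geometric distribution: P(Y = k) = (1 - p)^(k - 1) * p for k = 1, 2, 3.
p_within_first_three = sum((1 - p) ** (k - 1) * p for k in (1, 2, 3))
p_after_third = 1 - p_within_first_three

# Shortcut: three consecutive flights without an upgrade.
p_shortcut = (1 - p) ** 3

print(round(p_after_third, 3), round(p_shortcut, 3))
```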
(b)
Let p denote the probability that Sam will be upgraded to first class on a particular flight. Let X denote the number of upgrades Sam will receive in 20 flights. The random variable X follows a binomial distribution with n = 20 independent trials and p = 0.1. The probability that Sam will be upgraded exactly 2 times in his next 20 flights is calculated as follows.
P(X = 2) = C(20, 2)(0.1)^2(0.9)^18 ≈ 0.2852
This part tests the binomial distribution; apply the binomial formula directly.
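The binomial formula for part (b) can be evaluated with the standard library. A minimal sketch; `binom_pmf` is an illustrative helper name, not a library function.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


p_exactly_two = binom_pmf(2, 20, 0.1)
print(round(p_exactly_two, 4))
```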
(c)
Let X denote the number of upgrades Sam will receive in 104 flights. The random variable X follows a binomial distribution with n = 104 independent trials and p = 0.1. Thus,
P(X > 20) = 1 - P(X ≤ 20) ≈ 0.0014
Because this probability is so small, it is very unlikely that Sam would receive more than 20 upgrades in 104 flights if the airline’s claim is correct. This would be expected to happen less than 1 percent of the time, indicating that one should be surprised if Sam receives more than 20 upgrades during the next year.
Under the binomial distribution, the probability of more than 20 upgrades in 104 flights is 0.0014. Because this probability is so low, we would be very surprised if it happened.
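The tail probability in part (c) is tedious by hand but short in code. The sketch below sums the upper tail directly rather than subtracting from 1, which is a choice made here to limit rounding error; `binom_pmf` is an illustrative helper name.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


n_flights = 104
p_upgrade = 0.1

# P(X > 20) = P(X = 21) + P(X = 22) + ... + P(X = 104)
p_more_than_20 = sum(binom_pmf(k, n_flights, p_upgrade)
                     for k in range(21, n_flights + 1))

print(round(p_more_than_20, 4))
```

The result is well under 1 percent, in line with the 0.0014 quoted in the explanation.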
Reposted from teacher Fu Ying's WeChat account.