For the most part, this chapter covers specification of models
by example, but it is worth discussing some general points about
the specification of smooths in model formulae up front. There are
four types of smooth that can be used and mixed.
s() is used for univariate smooths (section 5.3, p. 201),
isotropic smooths of several variables (section 5.5, p. 214) and
random effects (section 3.5.2, p. 154).
te() is used to specify tensor product smooths constructed
from any singly penalized marginal smooths usable with
s(), as described in section 5.6 (p. 227). Examples
are provided in sections 7.2.3, 7.4 and 7.7.1.
ti() is used to specify tensor product interactions with the
marginal smooths (and their lower order interactions) excluded,
facilitating smooth ANOVA models as discussed in section 5.6.3
(p. 232), and exemplified in section 7.3.
t2() is used to specify the alternative tensor product smooth
construction discussed in section 5.6.5 (p. 235), which is
especially useful for generalized additive mixed modelling with the
gamm4 package described in section 7.7.
The first arguments to all these functions are the covariates of the
smooth. Some further arguments control the details of the smoother.
The most important are as follows.
bs is a short character string specifying the type of basis,
e.g. "cr" for a cubic regression spline or "ds" for a Duchon
spline. It may be a vector in the tensor product case, if
different types of basis are required for different marginals.
k is the basis dimension, or marginal basis dimension (tensor
case). It can also be a vector in the tensor case, specifying a
dimension for each marginal.
m specifies the order of basis and penalty, in a basis specific
manner.
id labels the smooth. Smooths sharing a label all have the
same smoothing parameter (assuming that they are of the same
smoother type).
by is the name of a variable by which the smooth should be
multiplied (metric case), or each level of which should have a
separate copy of the smooth (factor case).
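As a rough illustration of the bs and k arguments just listed, the following minimal sketch fits a model mixing an s() term and a te() term. The simulated data and the variable names x, z, v and y are assumptions made up for this example, not taken from the text.

```r
library(mgcv)

set.seed(1)
n <- 200
# Simulated data; the names x, z, v and y are illustrative only.
dat <- data.frame(x = runif(n), z = runif(n), v = runif(n))
dat$y <- sin(2 * pi * dat$x) + dat$z * dat$v + rnorm(n, sd = 0.3)

# s():  univariate smooth, cubic regression spline basis of dimension 10.
# te(): tensor product smooth, with bs and k supplied as vectors so that
#       each marginal gets its own basis type and basis dimension.
b <- gam(y ~ s(x, bs = "cr", k = 10) +
           te(z, v, bs = c("cr", "ps"), k = c(5, 6)),
         data = dat, method = "REML")

summary(b)
```

The choice of REML for smoothing parameter estimation is also an assumption of the sketch; the default method would work equally well here.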
The last two items on the list require further explanation. An
example formula with a smooth id is
y ~ s(x) + s(z1,id="foo") + s(z2,id="foo"). This
forces the smooths of z1 and z2
to have the same smoothing parameter: for this to really make sense
they are also forced to have the same basis and penalty, provided
this is possible.
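The effect of a shared id can be checked by comparing the number of estimated smoothing parameters with and without the label. This is a sketch on simulated data; the data-generating functions are arbitrary assumptions, and only the formula with id="foo" comes from the text.

```r
library(mgcv)

set.seed(2)
n <- 300
# Simulated data; the true functions are arbitrary choices.
dat <- data.frame(x = runif(n), z1 = runif(n), z2 = runif(n))
dat$y <- sin(2 * pi * dat$x) + dat$z1^2 + exp(dat$z2) + rnorm(n, sd = 0.3)

# Each smooth has its own smoothing parameter.
b0 <- gam(y ~ s(x) + s(z1) + s(z2), data = dat, method = "REML")

# The smooths of z1 and z2 share the label "foo", and hence a
# smoothing parameter.
b1 <- gam(y ~ s(x) + s(z1, id = "foo") + s(z2, id = "foo"),
          data = dat, method = "REML")

# b$sp holds the smoothing parameters that were actually estimated.
length(b0$sp)
length(b1$sp)
```

The linked fit should report one fewer estimated smoothing parameter than the unlinked one.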
by variables are the means for implementing ‘varying
coefficient models’, such as that used in section 7.5.3. Suppose,
for example, that we have metric variables x and z and want to
specify a linear predictor term $f(x_i) z_i$, where f is a smooth
function. The model formula entry for this would be
s(x,by=z).
Only one by variable is allowed per smooth, but any smooth
with multiple covariates (specified by s, te, ti
or t2) can also have a by
variable. Note that, provided the by variable
takes more than one value, such terms are identifiable without a
sum-to-zero constraint, and so they are left unconstrained.
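A varying coefficient term of the form s(x,by=z) can be tried out as follows. This is a minimal sketch on simulated data in which the coefficient of z is assumed to vary sinusoidally with x; that choice is purely illustrative.

```r
library(mgcv)

set.seed(3)
n <- 300
# Simulated data: the coefficient of z varies smoothly with x.
dat <- data.frame(x = runif(n), z = rnorm(n))
dat$y <- sin(2 * pi * dat$x) * dat$z + rnorm(n, sd = 0.3)

# Linear predictor term f(x_i) * z_i, with f a smooth function of x.
b <- gam(y ~ s(x, by = z), data = dat, method = "REML")

# The term is left unconstrained, so it keeps all k = 10 (the default)
# basis coefficients, in addition to the model intercept.
length(coef(b))
```

Because z takes more than one value, no sum-to-zero constraint is applied, which is why the smooth retains its full basis dimension.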
Metric by variables combined with a summation convention are
the means by which linear functionals of smooths can be
incorporated into the linear predictor. Examples are provided in
sections 7.4.2 and 7.11.1. The idea is that if the covariates of
the smooth and the by variable are all matrices,
then a summation across rows is implied. For example, if
X, Z and L are all matrices
then s(X,Z,by=L) specifies the term
$\sum_k f(X_{ik}, Z_{ik}) L_{ik}$ in the linear predictor.
Tensor terms also support the convention.
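The summation convention can be sketched with matrix arguments as below. The matrix dimensions, the weights in L and the true function f are all assumptions invented for the illustration; only the term s(X,Z,by=L) comes from the text.

```r
library(mgcv)

set.seed(4)
n <- 200   # number of observations
K <- 5     # number of columns summed over for each observation

# Matrix covariates and a matrix by variable, all n by K.
X <- matrix(runif(n * K), n, K)
Z <- matrix(runif(n * K), n, K)
L <- matrix(runif(n * K), n, K)   # arbitrary illustrative weights

# Hypothetical true function, applied element-wise to the matrices.
f <- function(x, z) sin(2 * pi * x) * z

# Response: sum over k of f(X[i, k], Z[i, k]) * L[i, k], plus noise.
y <- rowSums(f(X, Z) * L) + rnorm(n, sd = 0.1)

# Matrix arguments imply the summation across each row.
b <- gam(y ~ s(X, Z, by = L),
         data = list(y = y, X = X, Z = Z, L = L),
         method = "REML")
```

The data are passed as a list rather than a data frame, simply because matrix-valued variables are more convenient to hold that way.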
by variables also facilitate ‘smooth-factor’ interactions, in
which we have a separate smooth of one or more covariates for each
level of a factor by variable. For example, suppose we have metric
variables x and z and a factor variable g with three levels. Let g(i)
denote the level of g corresponding to observation i. Then
te(z,x,by=g) would contribute the terms
$f_{g(i)}(x_i, z_i)$ to the model linear predictor. That is, which of
three separate smooth functions of x and z contributes to the
linear predictor depends on which of the three levels of g applies
for observation i. Again, s, te, ti and
t2 terms all work in the same way, regardless of
the number of their covariates. To avoid confounding problems the
smooths are all subject to sum-to-zero constraints, which usually
means that the main effect of g should also be included in the
model specification, for example g + te(z,x,by=g).
Factor by variables cannot be mixed with the summation
convention. When there are several factor by variables,
identifiability can get tricky, and it can then be useful to employ
ordered factor by variables. If a factor
by is an ordered factor then no smooth is
generated for its first level.
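The difference between a plain factor by variable and an ordered one can be seen by counting the smooths each formula generates. This is a sketch on simulated data: the level effects and amplitudes are arbitrary assumptions. Pairing the ordered factor with a reference smooth te(z,x) is a common way of using it, not something stated in the text.

```r
library(mgcv)

set.seed(5)
n <- 600
# Simulated data: g has three levels, each with its own surface.
dat <- data.frame(x = runif(n), z = runif(n),
                  g = factor(sample(c("a", "b", "c"), n, replace = TRUE)))
lev <- as.integer(dat$g)
dat$y <- c(0, 1, -1)[lev] +
  c(1, 2, 0.5)[lev] * sin(2 * pi * dat$x) * dat$z +
  rnorm(n, sd = 0.3)

# Unordered factor: one smooth per level, plus the main effect of g.
b <- gam(y ~ g + te(z, x, by = g), data = dat, method = "REML")
length(b$smooth)

# Ordered factor: no smooth is generated for the first level, so a
# reference smooth te(z, x) is added alongside (an assumption of
# this sketch).
dat$go <- as.ordered(dat$g)
bo <- gam(y ~ go + te(z, x) + te(z, x, by = go),
          data = dat, method = "REML")
length(bo$smooth)
```

In the ordered version the by smooths act as differences from the reference surface, which is one way to sidestep the identifiability problems mentioned above.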
Often we would like all the smooths generated by a factor
by to have the same smoothing parameter. The id
mechanism allows this. For example, te(z,x,by=g,id="a") causes the
smooths for each level of g to share the same
id, and hence all to have the same smoothing
parameters.
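Sharing smoothing parameters across factor levels can be checked in the same way as for any other id. The sketch below uses simulated data with arbitrary, assumed level effects. A te() smooth of two covariates carries one smoothing parameter per marginal, so the counts compared are per-marginal totals.

```r
library(mgcv)

set.seed(6)
n <- 600
# Simulated data with a three-level factor g.
dat <- data.frame(x = runif(n), z = runif(n),
                  g = factor(sample(c("a", "b", "c"), n, replace = TRUE)))
lev <- as.integer(dat$g)
dat$y <- c(0, 1, -1)[lev] +
  c(1, 2, 0.5)[lev] * sin(2 * pi * dat$x) * dat$z +
  rnorm(n, sd = 0.3)

# Separate smoothing parameters for each level of g.
b_sep <- gam(y ~ g + te(z, x, by = g), data = dat, method = "REML")

# All three smooths share the id "a", and hence their smoothing
# parameters.
b_id <- gam(y ~ g + te(z, x, by = g, id = "a"),
            data = dat, method = "REML")

length(b_sep$sp)   # two per level
length(b_id$sp)    # two in total, shared across levels
```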