我有一个 ggplot,其中使用了 geom_smooth(method='gam') 函数。我想知道是否有一种方法可以提取模型参数,例如解释的系数和偏差。
类似于 mgcv::gam() 输出的summary():
> summary(gam)
Family: gaussian
Link function: identity
Formula:
mAODscale ~ s(numDate, bs = "cr")
Parametric coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.041461 0.002198 18.86 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Approximate significance of smooth terms:
edf Ref.df F p-value
s(numDate) 8.731 8.979 74.54 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
R-sq.(adj) = 0.379 Deviance explained = 38.4%
GCV = 0.0053239 Scale est. = 0.0052765 n = 1092
数据
test <- structure(list(numDate = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4,
5, 5, 5, 6, 6, 6, 7, 7, 7, 8, 8, 8, 9, 9, 9, 10, 10, 10, 11,
11, 11, 12, 12, 12, 13, 13, 13, 14, 14, 14, 15, 15, 15, 16, 16,
16, 17, 17, 17, 18, 18, 18, 19, 19, 19, 20, 20, 20, 21, 21, 21,
22, 22, 22, 23, 23, 23, 24, 24, 24, 25, 25, 25, 26, 26, 26, 27,
27, 27, 28, 28, 28, 29, 29, 29, 30, 30, 30, 31, 31, 31, 32, 32,
32, 33, 33, 33, 34), mAODscale = c(0.0388813764134284, 0.0461877433707656,
0.096886995092774, 0.0371438764134382, 0.0394419100374392, 0.0893994950927635,
0.0413855430800965, 0.0418085767040992, 0.101328661759439, 0.0428313764134316,
0.0403835767041016, 0.0978328284261067, 0.0444813764134295, 0.0566460767040979,
0.136811995092771, 0.0404647097467716, 0.0461127433707702, 0.104611995092768,
0.0391105430800991, 0.0388294100374367, 0.0883828284261057, 0.0377355430801032,
0.0334335767040983, 0.0744786617594428, 0.0346647097467638, 0.0316752433707705,
0.0691911617594343, 0.0365730430800966, 0.0329127433707725, 0.0721203284261094,
0.0337897097467703, 0.0271960767041008, 0.0568703284261005, 0.0321188764134348,
0.0226835767040967, 0.0467286617594311, 0.0389522097467676, 0.0317169100374315,
0.0724911617594444, 0.0374147097467699, 0.0301335767041024, 0.066378661759444,
0.0359688764134347, 0.0245710767041061, 0.0492828284261009, 0.0340355430801083,
0.0208835767040938, 0.0401828284261114, 0.033193876413435, 0.0170627433707722,
0.0363119950927739, 0.0320272097467722, 0.0147294100374324, 0.0298161617594417,
0.0305563764134433, 0.0123752433707693, 0.0200744950927714, 0.0294522097467649,
0.00966691003743847, 0.0144453284261061, 0.029439709746768, 0.00845441003743019,
0.0127494950927769, 0.029218876413438, 0.00767941003742578, 0.0118203284260971,
0.0283438764134303, 0.00608774337077023, 0.00807032842610056,
0.027393235387791, 0.00498582029383954, 0.00612513611841337,
0.0261313764134314, 0.004612743370771, 0.00541616175944171, 0.0260813764134298,
0.00472524337077118, 0.00372449509276862, 0.0256522097467666,
0.00535024337077061, 0.00356616175943714, 0.0250313764134376,
0.0161044100374284, 0.00060366175944182, 0.0241147097467689,
0.025246076704093, -0.00540050490722876, 0.0233022097467739,
0.022454410037426, -0.00630883824055672, 0.0208772097467715,
0.0152252433707645, -0.00819217157389573, 0.0194897097467646,
-0.00307475662923196, -0.00896300490722979, 0.0178938764134386,
0.00323774337076088, -0.00836717157389444, 0.0153022097467641
)), row.names = c(NA, -100L), class = c("data.table", "data.frame"
), .internal.selfref = <pointer: 0x000002cacef91f60>)
使用 geom_smooth() 进行编码和绘图
注意:这仅使用我的数据集的子集。
p <- ggplot(test50, aes(x=numDate, y=mAODscale)) +
geom_point() +
stat_smooth(method='gam',
formula=y~s(x,bs="cs",fx=TRUE,k=10))
geom_smooth
在 中指定模型时,您可以在公式中ggplot
使用符号x
和来表示映射到 x 和 y 美学的变量(如 中指定的)。在你的情况下,是并且是。y
aes
x
numDate
y
mAODscale
由于在运行模型之前数据中没有进行任何转换(除了变量名称的更改),因此您所需要做的就是将公式中的符号 和 替换为原始数据框中适当的
x
变量y
名称。然后使用该公式运行gam
. 该模型的摘要就是您正在寻找的结果:模型本身不存储在 ggplot 对象中。它是即时构建的,并在此
ggplot_build
过程中被丢弃。我们可以看到,如果我们设置on并让它在调用时将其输出对象的摘要打印到控制台,则直接调用会gam
给出与所使用的模型相同的结果:ggplot
trace
gam
gam
现在,当我们打印 ggplot 时,我们应该得到用于ggplot
创建平滑的摘要:可以看到这和我们直接调用是一样的,只是变量名变成了
x
andy
。untrace(gam)
完成后不要忘记。创建于 2023-10-03,使用reprex v2.0.2