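I am using the code below to generate the mean and standard deviation of the numeric columns of the class data frame. I expect the new columns to be named Age_mean and Age_std, not Age.Age_mean and Age.Age_std. I don't know why "Age." is additionally prepended to the column names.
Here are the data and code I used: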
class <- structure(list(Name = c("Alfred", "Alice", "Barbara", "Carol",
"Henry", "James", "Jane", "Janet", "Jeffrey", "John", "Joyce",
"Judy", "Louise", "Mary", "Philip", "Robert", "Ronald", "Thomas",
"William"), Sex = c("M", "F", "F", "F", "M", "M", "F", "F", "M",
"M", "F", "F", "F", "F", "M", "M", "M", "M", "M"), Age = c(14,
13, 13, 14, 14, 12, 12, 15, 13, 12, 11, 14, 12, 15, 16, 12, 15,
11, 15), Height = c(69, 56.5, 65.3, 62.8, 63.5, 57.3, 59.8, 62.5,
62.5, 59, 51.3, 64.3, 56.3, 66.5, 72, 64.8, 67, 57.5, 66.5),
Weight = c(112.5, 84, 98, 102.5, 102.5, 83, 84.5, 112.5,
84, 99.5, 50.5, 90, 77, 112, 150, 128, 133, 85, 112)), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -19L))
library(dplyr)  # %>% and summarise()
library(rlang)  # syms()

out1 <- list()
calc_mean <- function(data, vars, out) {
  vars <- syms(vars)
  print(vars)
  for (v in vars) {
    print(v)
    out1[[v]] <- data %>% summarise('{{v}}_mean' := mean(!! v), '{{v}}_std' := sd(!! v))
    out2 <- data.frame(out1)
    assign(out, out2, envir = .GlobalEnv)
  }
  return(out2)
}
calc_mean(data = class, vars = c('Age', 'Height'), out = 'want')
Output:
  Age.Age_mean Age.Age_std Height.Height_mean Height.Height_std
1     13.31579    1.492672           62.33684          5.127075
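The prefixes are probably added because of the variable values ("Age" and "Height"). One approach that always works is to rename the output after the fact. Another option is to avoid the for loop altogether and use only tidyverse verbs, selecting the columns with all_of().
Note that I would not name a data frame class, since it can be confused with the base R class() function.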
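A minimal sketch of the loop-free approach described above, assuming dplyr 1.0 or later so that across() is available. The function name calc_stats and the small stand-in data frame df are hypothetical and only there to keep the example self-contained; the column names follow the question.

```r
library(dplyr)

# Small stand-in data frame (hypothetical values, not the full data from the question)
df <- tibble(
  Age    = c(14, 13, 13, 14),
  Height = c(69, 56.5, 65.3, 62.8)
)

# calc_stats is a hypothetical name; vars is a character vector of column names
calc_stats <- function(data, vars) {
  data %>%
    summarise(across(
      all_of(vars),                 # select columns by their string names
      list(mean = mean, std = sd),  # list names become the column suffixes
      .names = "{.col}_{.fn}"       # e.g. Age_mean, Age_std
    ))
}

want <- calc_stats(df, c("Age", "Height"))
names(want)
# "Age_mean" "Age_std" "Height_mean" "Height_std"
```

Returning the result and assigning it at the call site, as in want <- calc_stats(...), also avoids the assign() into the global environment used in the question.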
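The names are built in two places. The first is where you assign the column name:
'{{v}}_mean' := mean(!! v)
The second is data.frame(out1), where the list of tibbles is converted to a data frame. At that point out1 is a list whose elements are named Age and Height, and data.frame() prefixes each tibble's columns with the name of its list element, which gives e.g.
Age.Age_mean
You can either use
summarise('mean' := mean(!! v), 'std' := sd(!! v))
to avoid repeating the variable name (the columns then come out as Age.mean and Age.std), or add names(out1) <- NULL after the summarise() call to get the desired result with your code.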
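The prefixing can be reproduced in isolation. The sketch below uses base R only and a hypothetical one-element list standing in for out1 after the first loop iteration; the numeric values are made up, and a plain data.frame is used in place of a tibble, which behaves the same way here.

```r
# Hypothetical stand-in for out1 after one loop iteration (values are made up)
out1 <- list(Age = data.frame(Age_mean = 13.5, Age_std = 0.58))

# data.frame() prefixes a multi-column element's columns with the list name
prefixed <- names(data.frame(out1))
prefixed
# "Age.Age_mean" "Age.Age_std"

# Dropping the list names first leaves the element's own column names intact
names(out1) <- NULL
plain <- names(data.frame(out1))
plain
# "Age_mean" "Age_std"
```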