
Question

Katherine Chau

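Asked: 2024-03-20 03:34:49 +0800 CST

How do I keep the same colors for conditional values across plots?

I am making several plots in which 10 bacterial species are plotted for each plant species, with the bars colored by the family each bacterium belongs to. I can do that just fine, but I would like each family to keep the same color across the different plots, because I later want to combine them into one faceted figure, and it makes sense to show bacterial genera from the same family in the same color.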

Here is a sample of my data and code:

dput(top10_cintybus)
  structure(list(Order = c("Enterobacterales", "Enterobacterales", 
"Enterobacterales", "Sphingomonadales", "Enterobacterales", "Bacillales", 
"Hyphomicrobiales", "Bacillales", "Xanthomonadales", "Hyphomicrobiales"
), Family = c("Enterobacteriaceae", "Erwiniaceae", "Yersiniaceae", 
"Sphingomonadaceae", "Morganellaceae", "Bacillaceae", "Methylobacteriaceae", 
"Bacillaceae", "Xanthomonadaceae", "Rhizobiaceae"), MKC132 = c(0L, 
0L, 27L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), MKC146 = c(33L, 8L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L), MKC227 = c(11L, 0L, 0L, 0L, 6L, 
2L, 0L, 3L, 0L, 0L), MKC231 = c(37L, 0L, 0L, 20L, 0L, 0L, 3L, 
0L, 2L, 2L), MKC242 = c(0L, 7L, 0L, 0L, 0L, 2L, 0L, 0L, 0L, 0L
), MKC276 = c(9L, 0L, 7L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), MKC351 = c(6L, 
19L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), Total = c(96, 34, 34, 20, 
6, 4, 3, 3, 2, 2), Genus = structure(c(10L, 8L, 9L, 7L, 6L, 5L, 
4L, 3L, 2L, 1L), levels = c("Allorhizobium", "Xanthomonas", "Bacillus", 
"Methylobacterium", "Heyndrickxia", "Arsenophonus", "Sphingomonas", 
"Pantoea", "Serratia", "???.1"), class = "factor")), row.names = c(NA, 
10L), class = "data.frame")

dput(top10_alappa)
 structure(list(Order = c("Enterobacterales", "Xanthomonadales", 
"Enterobacterales", "Enterobacterales", "Enterobacterales", "Enterobacterales", 
"Enterobacterales", "Hyphomicrobiales", "Enterobacterales", "Burkholderiales"
), Family = c("Erwiniaceae", "Xanthomonadaceae", "Enterobacteriaceae", 
"Erwiniaceae", "Enterobacteriaceae", "Enterobacteriaceae", "Enterobacteriaceae", 
"Rhizobiaceae", "Morganellaceae", "Oxalobacteraceae"), MKC154 = c(11L, 
0L, 36L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), MKC167A = c(14105L, 0L, 
5810L, 13055L, 1223L, 2316L, 1276L, 0L, 550L, 13L), MKC167B = c(18L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), MKC214 = c(83842L, 5L, 22936L, 
175L, 6828L, 94L, 0L, 7L, 0L, 0L), MKC226 = c(0L, 0L, 11L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L), MKC233 = c(55L, 0L, 13L, 14L, 0L, 0L, 
0L, 0L, 0L, 0L), MKC314 = c(0L, 0L, 0L, 3L, 0L, 0L, 0L, 0L, 0L, 
0L), MKC364 = c(32L, 8L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), MKC366 = c(38L, 
46599L, 0L, 10L, 62L, 0L, 0L, 549L, 0L, 512L), Total = c(98101, 
46612, 28806, 13257, 8113, 2410, 1276, 556, 550, 525), Genus = structure(10:1, levels = c("Janthinobacterium", 
"Morganella", "Allorhizobium", "Citrobacter", "Siccibacter", 
"Enterobacter", "Erwinia", "???.1", "Stenotrophomonas", "Pantoea"
), class = "factor")), row.names = c(NA, 10L), class = "data.frame")

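My code plots one of these data frames as an example. I can color the bars by family, but I would like a way to keep each family's color the same across datasets, if possible. Note that "Family" is a factor column.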

species="Cintybus"
ggplot(data=top10_cintybus,aes(x=Genus, y=Total, fill=Family)) + 
  geom_bar(stat = "identity", colour="black") +
  coord_flip() +
  theme_bw() +
  scale_y_continuous(trans="log10", limits=c(1,1000000), labels=scales::comma,
                     expand=expansion(mult=c(0,.05))) +
  #theme(text=element_text(size=18)) +
  theme(axis.title.x = element_text(size=18)) +
  theme(axis.text.x = element_text(colour="black", face="bold", size=15)) +
  theme(axis.title.y = element_text(size=18, vjust=2.5)) +
  theme(axis.text.y = element_text(face="bold.italic", colour="black", size=15)) +
  geom_text(aes(label=format(Total, big.mark=",")), hjust=1.5, 
            color="white", fontface="plain", size=5) +
  xlab("Genus") + ylab("Abundance") +
  ggtitle(species) + theme(plot.title = element_text(size=18))

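Here is the image I get if I run the code above for the other data frame.

For example, "Pantoea" belongs to Erwiniaceae, but that family has a different color in the legends of the two plots. I would like to avoid writing a long list in which I manually assign a color to every family, because some of my other datasets have 100 families, and the top 10 will be a different mix in each dataset.

2 Answers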

shirewoman2 · Answer 1 · 2024-03-20T04:11:26+08:00


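Create a named character vector of colors, where the names are the values of the column you want colored consistently, and make sure the vector covers every possible value. For example: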

 # Names must be unique: each family appears exactly once
 MyColors <- c(Enterobacteriaceae = "#333333",
          Erwiniaceae = "#8B8378",
          Yersiniaceae = "#CD2626", 
          Sphingomonadaceae = "#FF8C00",
          Morganellaceae = "#00CD00", 
          Bacillaceae = "#43CD80", 
          Methylobacteriaceae = "#5F9EA0", 
          Xanthomonadaceae = "#27408B", 
          Rhizobiaceae = "#68228B")

Then, when you build the ggplot2 plot, supply those colors with scale_fill_manual().

 scale_fill_manual(values = MyColors)

You can use the same approach for nearly any ggplot2 aesthetic, so it works for linetypes or point shapes too.
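As a rough illustration of carrying the named-vector idea over to another aesthetic, the sketch below maps Family to both colour and shape in a point plot. The toy data frame `df` and the `MyShapes` vector are hypothetical and exist only for this example. The plot contains just two of the three named families, and values are looked up by name.

```r
library(ggplot2)

# Hypothetical toy data containing only two of the three families
df <- data.frame(Genus = c("Pantoea", "Serratia"),
                 Family = c("Erwiniaceae", "Yersiniaceae"),
                 Total = c(34, 34))

# Named vectors: names are Family values, elements are aesthetic values
MyColors <- c(Enterobacteriaceae = "#333333",
              Erwiniaceae = "#8B8378",
              Yersiniaceae = "#CD2626")
MyShapes <- c(Enterobacteriaceae = 16, Erwiniaceae = 17, Yersiniaceae = 15)

p <- ggplot(df, aes(x = Genus, y = Total, colour = Family, shape = Family)) +
  geom_point(size = 4) +
  scale_colour_manual(values = MyColors) +   # colours looked up by name
  scale_shape_manual(values = MyShapes)      # shapes looked up by name
```

Because the lookup is by name, a family that is absent from one plot should not shift the colour or shape given to the others.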


zephryl · Answer 2 · 2024-03-20T04:16:51+08:00

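A caveat first: since there are hundreds of families in total, the union of all the "top 10" families could also be quite large. Each one needs a unique color, and they will be hard to tell apart. (Even 10 colors for a categorical variable, as in your example data, is pushing it.)

That said...

Since you mention that you will end up making a faceted plot anyway, the simplest solution may be to use facet_wrap(), which produces one consistent legend for the combined plot. tidytext::reorder_within() and scale_x_reordered() help keep the y-axis nicely ordered: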

library(dplyr)
library(ggplot2)
library(tidytext)

bind_rows(Alappa = top10_alappa, Cintybus = top10_cintybus, .id = "Species") %>% 
  mutate(Genus = reorder_within(Genus, by = Total, within = Species)) %>% 
  ggplot(aes(x=Genus, y=Total, fill=Family)) + 
  geom_col(colour="black") +   # note geom_col() is equivalent to geom_bar(stat = "identity")
  geom_text(aes(label=format(Total, big.mark=",")), hjust=1.5, 
            color="white", fontface="plain", size=5) +
  coord_flip() +
  scale_x_reordered() +
  scale_y_continuous(trans="log10", limits=c(1,1000000), labels=scales::comma,
                     expand=expansion(mult=c(0,.05))) +
  facet_wrap(vars(Species), scales = "free", ncol = 1) +
  theme_bw() +
  theme(   # note you can put all your arguments in a single theme() call
    axis.title.x = element_text(size=18),
    axis.text.x = element_text(colour="black", face="bold", size=15),
    axis.title.y = element_text(size=18, vjust=2.5), 
    axis.text.y = element_text(face="bold.italic", colour="black", size=15)
  ) +
  xlab("Genus") + ylab("Abundance")

An alternative is to use scale_fill_manual() to assign a color to every possible value. Since you may have many values, you can automate this:

library(ggplot2)
library(scales)

families <- list(top10_alappa, top10_cintybus) |>
  lapply(\(dat) dat$Family) |> 
  unlist() |>
  unique() |>
  sort()

family_pal <- viridis_pal(option = "H")(length(families)) 
names(family_pal) <- families

Then add scale_fill_manual(values = family_pal) to your original code for each plot.
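To make the "add it to each plot" step concrete, here is a minimal sketch that builds one palette over the union of families and passes it to every plot. The `plant_data` list and the `plot_top10()` helper are hypothetical stand-ins for the real data frames and plotting code, trimmed to two rows each.

```r
library(ggplot2)
library(scales)

# Hypothetical stand-ins for top10_alappa and top10_cintybus
plant_data <- list(
  Alappa = data.frame(Genus = c("Pantoea", "Stenotrophomonas"),
                      Family = c("Erwiniaceae", "Xanthomonadaceae"),
                      Total = c(98101, 46612)),
  Cintybus = data.frame(Genus = c("Pantoea", "Serratia"),
                        Family = c("Erwiniaceae", "Yersiniaceae"),
                        Total = c(34, 34))
)

# One palette over the union of families in every data frame
families <- sort(unique(unlist(lapply(plant_data, \(dat) dat$Family))))
family_pal <- setNames(viridis_pal(option = "H")(length(families)), families)

# Hypothetical helper: one plot per species, all sharing the same palette
plot_top10 <- function(dat, species, pal) {
  ggplot(dat, aes(x = Genus, y = Total, fill = Family)) +
    geom_col(colour = "black") +
    coord_flip() +
    scale_fill_manual(values = pal) +
    ggtitle(species)
}

# Named list of plots: plots$Alappa, plots$Cintybus
plots <- Map(plot_top10, plant_data, names(plant_data),
             MoreArgs = list(pal = family_pal))
```

Building the palette once, outside the plotting function, is what keeps it consistent. If it were recomputed per data frame, each plot would again get its own color assignment.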
