Question

farrow90

Asked: 2024-11-21 05:33:13 +0800 CST

Removing white space in histograms?

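I saw this question here https://math.stackexchange.com/questions/2648895/why-does-fair-random-process-lead-to-unfair-result/5001301#5001301, where someone was interested in simulating a game in which two players randomly give each other money.

I tried to simulate many trajectories of this game in R to look at the distributions of different metrics: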

    library(ggplot2)
    library(tidyverse)
    library(gridExtra)
    library(future)
    library(furrr)
    library(parallel)
    
    n_cores <- detectCores() - 1  
    plan(multisession, workers = n_cores)


    run_money_simulation <- function(n_sims, n_exchanges, player_a_start, player_b_start) {
        set.seed(123)

        sims_per_core <- ceiling(n_sims / n_cores)

        simulate_exchange <- function(n_exchanges, player_a_start, player_b_start) {
            person_a <- numeric(n_exchanges + 1)
            person_b <- numeric(n_exchanges + 1)

            person_a[1] <- player_a_start
            person_b[1] <- player_b_start

            for(i in 2:(n_exchanges + 1)) {
                change <- sample(c(-1, 1), 1)
                person_a[i] <- person_a[i-1] + change
                person_b[i] <- person_b[i-1] - change
            }

            return(list(
                final_diff = person_a[n_exchanges + 1] - person_b[n_exchanges + 1],
                max_diff = max(abs(person_a - person_b)),
                max_amount = max(c(max(person_a), max(person_b))),
                min_amount = min(c(min(person_a), min(person_b)))
            ))
        }

        start_time <- Sys.time()

        results <- future_map(1:n_sims, function(x) {
            simulate_exchange(n_exchanges, player_a_start, player_b_start)
        }, .options = furrr_options(seed = TRUE))

        end_time <- Sys.time()
        time_taken <- difftime(end_time, start_time, units = "secs")

        final_diffs <- sapply(results, `[[`, "final_diff")
        max_diffs <- sapply(results, `[[`, "max_diff")
        max_amounts <- sapply(results, `[[`, "max_amount")
        min_amounts <- sapply(results, `[[`, "min_amount")

        plot_data <- tibble(
            final_diff = final_diffs,
            max_diff = max_diffs,
            max_amount = max_amounts,
            min_amount = min_amounts
        ) %>%
            pivot_longer(everything(),
                         names_to = "metric",
                         values_to = "value")

        main_plot <- ggplot(plot_data, aes(x = value)) +
            geom_histogram(bins = 50, aes(fill = metric), color = "white", alpha = 0.7) +
            facet_wrap(~metric, scales = "free", ncol = 2) +
            scale_fill_manual(values = c("black", "red", "green4", "purple")) +
            labs(title = paste("Money Exchange Simulation Results\n",
                               "Starting amounts: A =", player_a_start, ", B =", player_b_start),
                 subtitle = paste("Number of simulations:", n_sims,
                                  "| Exchanges per simulation:", n_exchanges,
                                  "\nProcessed using", n_cores, "CPU cores in",
                                  round(time_taken, 2), "seconds"),
                 x = "Value",
                 y = "Count") +
            theme_bw() +
            theme(legend.position = "none")

        print(main_plot)

        invisible(list(
            final_diffs = final_diffs,
            max_diffs = max_diffs,
            max_amounts = max_amounts,
            min_amounts = min_amounts,
            parameters = list(
                n_sims = n_sims,
                n_exchanges = n_exchanges,
                player_a_start = player_a_start,
                player_b_start = player_b_start,
                n_cores = n_cores,
                processing_time = time_taken
            )
        ))
    }

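When I call the function:

    run_money_simulation(100000, 100, 100, 100)

I get the following result:

[Figure: faceted histograms of final_diff, max_amount, max_diff and min_amount]

I was just wondering: is there anything I can do in ggplot to detect the spacing of the values on the scale, so that the white space in these plots is removed?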

2 Answers

jack4799 · Answer 1 · 2024-11-21T11:54:54+08:00

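Try adding xlim(c(<lower_limit>, <upper_limit>)). From there, you can write another function that finds where the 90th percentile (or some other threshold) of your data lies, and feed that into your upper or lower limit.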
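As a rough sketch of that suggestion, the limits can be computed with quantile() and passed to xlim(). The helper name central_limits, the 90% coverage, and the toy data below are assumptions made for illustration; none of them come from the question.

```r
library(ggplot2)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Toy stand-in for one metric: values that are always multiples of 4.
toy <- data.frame(value = 4 * (rbinom(5000, size = 100, prob = 0.5) - 50))

# Hypothetical helper: limits that enclose the central `coverage` share of x.
central_limits <- function(x, coverage = 0.90) {
  tail_prob <- (1 - coverage) / 2
  unname(quantile(x, probs = c(tail_prob, 1 - tail_prob)))
}

lims <- central_limits(toy$value)

# Build (but do not print) a histogram restricted to those limits.
p <- ggplot(toy, aes(x = value)) +
  geom_histogram(bins = 50) +
  xlim(lims)
```

Note that xlim() discards observations outside the limits (ggplot2 warns about the removed rows when the plot is drawn), whereas coord_cartesian(xlim = lims) zooms in without dropping data. Either way, this narrows the axis range and leaves the bin width as it was.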

user2554330 · Answer 2 · 2024-11-21T19:27:57+08:00

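The problem is that you have discrete data. A histogram is a kind of density estimate, designed for continuous data that has a density.

Look at the top-left plot, final_diff: the values shown are always multiples of 4. Because your bin width is slightly less than 4, several bins contain no data at all. You could avoid the gaps by setting the bin width to 4, but that is not the best solution. The best solution is to use a display designed for discrete data.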
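The multiple-of-4 pattern can be checked directly, and the same check gives a way to detect the spacing of the values. This is only a sketch in base R: it re-simulates the game in vectorised form rather than reusing the question's function, and value_spacing is a hypothetical helper name.

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible

n_sims <- 2000
n_exchanges <- 100
start <- 100

# Each column is one game: +1/-1 steps for player A, accumulated over time.
steps <- matrix(sample(c(-1, 1), n_sims * n_exchanges, replace = TRUE),
                nrow = n_exchanges)
walk <- apply(steps, 2, cumsum)  # n_exchanges x n_sims matrix

person_a <- start + walk
person_b <- start - walk

final_diff <- person_a[n_exchanges, ] - person_b[n_exchanges, ]
max_diff <- apply(abs(person_a - person_b), 2, max)

# Hypothetical helper: smallest gap between distinct observed values.
value_spacing <- function(x) min(diff(sort(unique(x))))

all(final_diff %% 4 == 0)   # TRUE: an even number of +/-1 steps, doubled
value_spacing(final_diff)   # detected spacing for final_diff
value_spacing(max_diff)     # spacing can differ between metrics
```

If a histogram is still wanted for a single metric, the detected spacing could be passed to geom_histogram(binwidth = ...). With facet_wrap(), one bin width is shared by all panels, so it will not match every metric.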

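Discrete data like this is usually displayed as a bar chart showing the count of each observed value. You can get that plot in ggplot2 by using geom_bar() instead of geom_histogram(). For example: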


    #  ... unchanged code deleted ...

    main_plot <- ggplot(plot_data, aes(x = value)) +
        geom_bar(aes(fill = metric)) +
        facet_wrap(~metric, scales = "free", ncol = 2) +
        scale_fill_manual(values = c("black", "red", "green4", "purple")) +
        labs(title = paste("Money Exchange Simulation Results\n",
                           "Starting amounts: A =", player_a_start, ", B =", player_b_start),
             subtitle = paste("Number of simulations:", n_sims,
                              "| Exchanges per simulation:", n_exchanges,
                              "\nProcessed using", n_cores, "CPU cores in",
                              round(time_taken, 2), "seconds"),
             x = "Value",
             y = "Count") +
        theme_bw() +
        theme(legend.position = "none")

    #  ... more unchanged code ...

Created on 2024-11-21 with reprex v2.1.1

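This looks uglier than the histogram, but it is more faithful to the data. On a high-resolution display it is less ugly, and you can see that all the bars and the gaps between them have the same width.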
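For anyone who wants to see the difference without running the full simulation, here is a small self-contained comparison. The toy data is an assumption: it uses the fact that final_diff after 100 exchanges equals 4 * (K - 50) with K binomial(100, 0.5), and layer_data() is used to read back the counts that each geom would draw.

```r
library(ggplot2)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Toy stand-in for final_diff: every value is a multiple of 4.
toy <- data.frame(value = 4 * (rbinom(5000, size = 100, prob = 0.5) - 50))

# Same data, two displays.
p_hist <- ggplot(toy, aes(x = value)) + geom_histogram(bins = 50)
p_bar <- ggplot(toy, aes(x = value)) + geom_bar()

# Counts behind each layer, as computed by ggplot2.
hist_counts <- layer_data(p_hist)$count
bar_counts <- layer_data(p_bar)$count

sum(hist_counts == 0)  # empty histogram bins, drawn as gaps
sum(bar_counts == 0)   # 0: geom_bar() only has bars at observed values
```

The empty histogram bins appear because 50 bins are spread over fewer than 50 distinct values, while geom_bar() counts each distinct value once.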
