Questions asked by Homer Jay Simpson - coding

Homer Jay Simpson

Asked: 2025-04-24 15:08:51 +0800 CST

Manually change the color of circles using the sf package in R

5

I would like to create a Venn diagram with the sf and ggplot2 packages.

How can I manually change the colors of the circles and of the intersection regions of the two inner circles? Changing the {ggplot2} fill scale does not affect the colors of the circles.

Setup

library(ggplot2)
library(dplyr)
library(sf)

create_circle_sf <- function(x, y, radius, label, fill_id) {
  st_point(c(x, y)) %>%
    st_buffer(radius) %>%
    st_sfc(crs = 4326) %>%
    st_sf(label = label, fill_id = fill_id, geometry = .)
}



# radii from n values
r1 <- sqrt(6000 / pi)
r2 <- sqrt(2000 / pi)
r3 <- sqrt(2000 / pi)


# Add a new `fill_id` for color mapping
circle1 <- create_circle_sf(0, 0, r1, "Party Representatives", "A")
circle2 <- create_circle_sf(14, 0, r2, "Unique parties", "B")
circle3 <- create_circle_sf(-13, 0, r3, "Client Companies", "C")

# calculate intersection (e.g., between 1 & 2)
intersect12 <- st_intersection(circle1, circle2)
intersect13 <- st_intersection(circle1, circle3)
intersect23 <- st_intersection(circle2, circle3)
# combine all
circles_all <- bind_rows(circle1, circle2, circle3)

Plot attempt


# Define color manually for each fill_id
custom_colors <- c("A" = "#77bca2", "B" = "#FFEB3B", "C" = "#FF9800")
# Circle centers and labels
label_df <- data.frame(
  x = c(0, 14, -13),
  y = c(0, 0, 0),
  label = c("Party Representatives: 41",
            paste0("Unique\n Parties: ", 20),
            paste0("Unique\nClient\nCompanies: ", 30))
)

ggplot() +
  geom_sf(data = circles_all, aes(fill = fill_id), alpha = 0.8) +  
  # use fill_id here
  geom_sf(data = intersect12, fill = "white", alpha = 0.8) +
  geom_sf(data = intersect13, fill = "white", alpha = 0.8) +
  geom_sf(data = intersect23, fill = "white", alpha = 0.8) +
  geom_text(data = label_df, aes(x = x, y = y, label = label),
            size = 4.5, fontface = "bold", lineheight = 0.9) +
  scale_fill_manual(values = custom_colors) +
  theme_void() +
  coord_sf() +
  theme(legend.position = "none")
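
A minimal sketch of one way to get full manual control over the fills, under some assumptions. In the code above both small circles appear to lie entirely inside the large one, so the white `intersect12` and `intersect13` overlays cover them completely, which may be why the fill scale seems to have no effect. The sketch instead gives every region, including the overlap of the two inner circles, its own `fill_id` and lets `scale_fill_manual()` color all of them. The helpers `make_circle()` and `make_overlap()`, the `"BC"` key, and its color are hypothetical, and the circles are built without a CRS to keep the geometry planar.

```r
library(sf)
library(ggplot2)

# Hypothetical helper: one planar circle (no CRS) tagged with a fill key
make_circle <- function(x, y, radius, fill_id) {
  st_sf(fill_id = fill_id,
        geometry = st_sfc(st_buffer(st_point(c(x, y)), radius)))
}

# Hypothetical helper: the overlap of two circles as its own feature
make_overlap <- function(a, b, fill_id) {
  st_sf(fill_id = fill_id,
        geometry = st_intersection(st_geometry(a), st_geometry(b)))
}

big   <- make_circle(0, 0, sqrt(6000 / pi), "A")
right <- make_circle(14, 0, sqrt(2000 / pi), "B")
left  <- make_circle(-13, 0, sqrt(2000 / pi), "C")

# The overlap comes last so that it is drawn on top of the circles
regions <- rbind(big, right, left, make_overlap(right, left, "BC"))

# One color per region key; the "BC" color is an arbitrary choice
region_colors <- c(A = "#77bca2", B = "#FFEB3B", C = "#FF9800", BC = "#9C27B0")

venn_plot <- ggplot(regions) +
  geom_sf(aes(fill = fill_id)) +
  scale_fill_manual(values = region_colors) +
  theme_void() +
  theme(legend.position = "none")

venn_plot
```

Because the regions are drawn in row order, keeping the fills opaque shows exactly the colors listed in `region_colors`; with `alpha` below 1 the overlap would blend with the circles underneath.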

Homer Jay Simpson

Asked: 2025-03-08 19:36:15 +0800 CST

Wrap text for collapsed rows in kableExtra for a long table in R

6

I have an R Markdown file with this YAML section:

---
geometry: top=2cm , bottom= 2.5cm , left=0.5cm, right=0.5cm
output:
  pdf_document:
    latex_engine: xelatex
---

The setup R chunk:

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
suppressMessages(library(kableExtra))
suppressMessages(library(tidyverse))
```

I have a chunk that formats the long table with kableExtra, collapsing rows in column 1. The problem is that I want the text in column 1 to wrap and fit within that column's width.

DF%>%
  kbl(align = "llccc")%>%
  collapse_rows(columns = 1, valign = "top")%>%
  kable_styling(bootstrap_options = c("bordered"))%>%
  column_spec(1, border_left = TRUE, width = "2cm") %>%
  column_spec(2, width = "12cm")%>%
  column_spec(3, width = "1.5cm", extra_css = "vertical-align: middle;")%>%
  column_spec(4, width = "1.5cm", extra_css = "vertical-align: middle;")%>%
  column_spec(5,border_right = TRUE, width = "1.5cm", extra_css = "vertical-align: middle;")%>%
  row_spec(0, background = "#D3D3D3", bold = TRUE) %>%  
  kable_styling(font_size = 9)

How can I do this in R using kableExtra?

Data

DF = structure(list(category = c("Capability", "Capability", "Capability", 
                                 "Capability", "Capability", "Capability", "Capability", "Capability", 
                                 "Capability", "Capability", "Capability", "Capability", "Capability", 
                                 "Capability", "Challenge", "Challenge", "Challenge", "Challenge", 
                                 "Contracting Management Topics", "Contracting Management Topics", 
                                 "Contracting Management Topics", "Contracting Management Topics", 
                                 "Contracting Management Topics", "Contracting Management Topics", 
                                 "Contracting Management Topics", "Cooperation Site Supervision", 
                                 "Cooperation Site Supervision", "Cooperation Site Supervision", 
                                 "Cooperation Site Supervision", "Cooperation Site Supervision", 
                                 "Cooperation Site Supervision", "Cooperation Site Supervision", 
                                 "Core Beliefs", "Core Beliefs", "Expense", "Safety", "Safety", 
                                 "Safety", "Safety", "Safety", "Safety", "Excellence", "Excellence", 
                                 "Excellence"), questions = c("Do you believe that the following attributes define XYZ? Dedicated", 
                                                              "Do you believe that the following attributes define XYZ? Credible", 
                                                              "Do you believe that the following attributes define XYZ? Creative", 
                                                              "Do you believe that the following attributes define XYZ? Sincere", 
                                                              "Do you believe that the following attributes define XYZ? Ambitious", 
                                                              "Do you believe that the following attributes define XYZ? Qualified", 
                                                              "Do you believe that the following attributes define XYZ? Honest", 
                                                              "Extensive Building knowledge", "The capacity to rapidly share data on your projects", 
                                                              "The skill to coordinate your projects", "Skilled and talented personnel", 
                                                              "A variety of building services that can be adjusted to fit your demands", 
                                                              "Do you believe that the following attributes define XYZ? Dedicated", 
                                                              "Do you believe that the following attributes define XYZ? Credible", 
                                                              "Please evaluate your contentment regarding the following: Compared to competitors, how pleased are you with XYZ?", 
                                                              "Kindly respond to the following inquiries: How inclined are you to use our solutions again?", 
                                                              "Kindly respond to the following inquiries: Would you refer XYZ to associates?", 
                                                              "Please evaluate your contentment regarding the following: Compared to competitors, how pleased are you with XYZ?", 
                                                              "How pleased are you with the pace at which XYZ handled your concerns and grievances?", 
                                                              "The final resolution of your concerns and grievances", "Please evaluate your contentment regarding the following: Billing", 
                                                              "Please evaluate your contentment regarding the following: Requests", 
                                                              "Please evaluate your contentment regarding the following: Adherence to your needs", 
                                                              "Please evaluate your contentment regarding the following: Prompt reaction to your needs", 
                                                              "Please evaluate your contentment regarding the following: Compliance with the plan", 
                                                              "How pleased are you with XYZ executives’ accessibility?", "How pleased are you with XYZ’s efficiency in field monitoring?", 
                                                              "How pleased are you with XYZ’s capacity to accomplish tasks as requested?", 
                                                              "How pleased are you with team expertise and comprehension of building techniques?", 
                                                              "How pleased are you with issue resolution (Provide corresponding examples)?", 
                                                              "How pleased are you with cooperation with client teams?", 
                                                              "How pleased are you with XYZ executives’ accessibility?", "XYZ is devoted to offering exceptional quality service; how pleased are you with our outputs?", 
                                                              "How pleased are you with XYZ policies concerning Security, Protection, and Well-being?", 
                                                              "Please evaluate your contentment regarding the following: The entire expenditure of the undertaking (Original proposal & requests)", 
                                                              "How pleased are you with XYZ policies concerning Security, Protection, and Well-being?", 
                                                              "XYZ Security Framework on the Project", "XYZ’s accountability and dedication to Security on the Project", 
                                                              "Proficiency of XYZ Project Security Team", "Clarity, efficiency, and openness of reports linked to incident assessments", 
                                                              "Dialogue between XYZ project executives and yours concerning Security", 
                                                              "XYZ’s accountability and dedication to Excellence on the project", 
                                                              "XYZ’s Excellence Administration Framework execution on the project", 
                                                              "Productivity, proficiency, and openness concerning remedial steps on quality aspects"
                                 ), `House ball Score (/5)` = c(3, 3, 3, 2, 3, 4, 3, 3, 3, 
                                                                     2, 2, 3, 2, 2, 3, 3, 3, 2, 3, 3, 3, 3, 2, 2, 1, 3, 2, 2, 3, 2, 
                                                                     2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3), `House toy Score (/5)` = c(2L, 
                                                                                                                                                3L, 3L, 1L, 5L, 2L, 3L, 3L, 2L, 5L, 4L, 1L, 4L, 5L, 1L, 2L, 3L, 
                                                                                                                                                5L, 2L, 2L, 2L, 2L, 4L, 4L, 4L, 1L, 3L, 2L, 2L, 1L, 1L, 3L, 4L, 
                                                                                                                                                5L, 4L, 5L, 3L, 4L, 4L, 4L, 2L, 3L, 5L, 5L), `House Car Score (/5)` = c(4L, 
                                                                                                                                                                                                                         1L, 1L, 2L, 5L, 2L, 1L, 3L, 3L, 4L, 3L, 4L, 4L, 1L, 1L, 5L, 4L, 
                                                                                                                                                                                                                         1L, 1L, 4L, 5L, 4L, 4L, 2L, 5L, 2L, 1L, 3L, 5L, 4L, 2L, 3L, 4L, 
                                                                                                                                                                                                                         2L, 2L, 1L, 3L, 1L, 4L, 2L, 5L, 3L, 3L, 3L)), row.names = c(NA, 
                                                                                                                                                                                                                                                                                     -44L), class = c("tbl_df", "tbl", "data.frame"))

Homer Jay Simpson

Asked: 2025-02-27 19:28:03 +0800 CST

Conditional coloring and outer borders in a kableExtra PDF table in R

5

I have a simulated data frame in R and I want to conditionally format the cell backgrounds (that is, red if the value is less than 30, blue if it is between 31 and 75, and green if it is 76 or above). I tried to add outer borders to the table, but as you can see in the picture, there are gaps next to the header in the top right corner and the left vertical border does not work.

The setup chunk is:

{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
options(knitr.table.format = "latex")

and the YAML section:

output:
  pdf_document:
    latex_engine: xelatex

Data


library(tibble)
library(dplyr)   # arrange() and %>%
library(tidyr)   # pivot_wider()
library(kableExtra)

set.seed(123)  # For reproducibility
YEARS = c(1990, 2000, 2010, 2020)
# Generate columns
Year <- rep(YEARS, each = 48)
Category <- rep(sapply(1:16, function(x) paste0(sample(letters, 10, replace = TRUE), collapse = "")), each = 3)
Category <- rep(Category, times = 4)
Favor <- rep(c("Good", "Mediocre", "Bad"), each = 1,times =64)
Percentage <- sample(50:99, 192, replace = TRUE)

# Create tibble
df <- tibble(Year, Category, Favor, Percentage)

df = df%>%
  arrange(
    Category,
    factor(Favor, levels = c("Bad", "Mediocre", "Good")),
    Year
  ) %>%
  pivot_wider(
    names_from = c(Favor, Year),
    values_from = Percentage,
    names_sep = "-"
  )


header_values = c("Category", rep(c(
  paste(YEARS[1]), paste(YEARS[2]), paste(YEARS[3]), paste(YEARS[4])
), 3))
colnames(df)=header_values

df %>%
  kableExtra::kbl(align = "lcccccccccccccc") %>%
  kableExtra::column_spec(1, border_left = TRUE) %>%
  kableExtra::column_spec(ncol(df), border_right = TRUE) %>%
  kableExtra::add_header_above(header = c(" " = 1,
                                          "Good" = 4,
                                          "Mediocre" = 4,
                                          "Bad" = 4)) %>%
  kableExtra::column_spec(1, width = "6cm")
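
The coloring rule stated in the question can be sketched as a small base R helper. This is only an illustration: the function name `percentage_color()` is hypothetical, and since the question leaves a value of exactly 30 unassigned, the sketch assumes it counts as red.

```r
# Map percentages to the three background colors described in the question.
# Assumption: exactly 30 is treated as red (the question does not say).
percentage_color <- function(value) {
  as.character(
    cut(value,
        breaks = c(-Inf, 30, 75, Inf),   # (-Inf, 30], (30, 75], (75, Inf)
        labels = c("red", "blue", "green"))
  )
}

percentage_color(c(12, 30, 31, 75, 76, 99))
```

A vector like this could then be handed to a cell-level styling call such as the `background` argument of `kableExtra::cell_spec()`; how that interacts with the borders in the PDF output is not covered by this sketch.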

Homer Jay Simpson

Asked: 2025-02-25 15:36:09 +0800 CST

Fit text in slices and color the slices inside a circle using ggplot2 in R

3

I have some survey data and I created circles within circles to show how a survey is progressing. I am using ggforce to draw the circles, but my problem is the last circle: I want to split it into 3 slices and fit the text inside the slices. I also want to color the slices differently (group A: red, group B: green, group C: yellow). How can I do this? Any help?

total_participants = 7500  
undelivered = 2500
total_invited = 5000  
total_responses = 2000  
total_unanswered = 1100  
group_c = 900 
group_a = 650  
group_b = 450 
partially_completed = 300  
library(ggplot2)
library(ggforce)

# Define total responses in the pie chart
total_count = group_a + group_b + group_c

# Define proportional angles (in radians)
angle_a = 1.4 * pi * (group_a / total_count)  
angle_b = 8 * pi * (group_b / total_count)  
angle_c = 2 * pi - (angle_a + angle_b)  

# Cumulative angles for the pie slices
angles = c(0, angle_a, angle_a + angle_b, 2 * pi)

# Midpoint angles for text placement
mid_angles = c((angles[1] + angles[2]) / 2,  
               (angles[2] + angles[3]) / 2,  
               (angles[3] + angles[4]) / 2) 

# Pie chart position and radius
pie_center_x = 2.7
pie_center_y = -0.6
pie_radius = 3

# Data frame for the pie chart
pie_data = data.frame(
  category = c(paste0("Group A:\n ", group_a), 
               paste0("Group B: \n", group_b + total_unanswered),
               paste0("Group C: \n", group_c)),
  x_start = pie_center_x,
  y_start = pie_center_y,
  x_end = pie_center_x - pie_radius * cos(angles[2:4]),
  y_end = pie_center_y + pie_radius * sin(angles[2:4]),
  text_x = pie_center_x - (pie_radius * 0.6) * cos(mid_angles),  
  text_y = pie_center_y + (pie_radius * 0.4) * sin(mid_angles)
)

# Bubble chart data
bubble_data = data.frame(
  group = c(paste0("Total Participants: ", total_participants),
            paste0("Invited: ", total_invited),
            paste0("Responded: ", total_unanswered + group_a + group_b + group_c),
            paste0("Not Delivered: ", undelivered),
            paste0("Partial Responses: ", partially_completed)),
  radius = c(7, 5, 3, 0.8, 0.8),
  x0 = c(0, 1, 2.7, -5.4, -1),
  y0 = c(0, 0, -0.6, 1, 1.4)
)

# Final plot
plot_bubble_pie = ggplot() +
  
  # Bubble chart
  ggforce::geom_circle(data = bubble_data, aes(x0 = x0, y0 = y0, r = radius, 
                                               fill = factor(group, group)), 
                       alpha = 1) +
  geom_text(data = bubble_data, aes(x = x0, y = y0 + radius + 0.2, label = group), 
            size = 3.2, fontface = "bold") +
  
  # Pie chart slice lines
  geom_segment(data = pie_data, 
               aes(x = x_start, y = y_start, xend = x_end, yend = y_end, color = category),
               color = "black",
               linewidth = 1) +
  
  # Labels inside pie slices
  geom_label(data = pie_data, 
             aes(x = text_x, y = text_y, label = category), 
             size = 3.2, fontface = "bold", label.size = 0, fill = NA) +
  
  # Custom color mappings
  scale_color_manual(values = c("Group A" = "black", "Group B" = "black", "Group C" = "black"), guide = "none") +
  scale_fill_manual(values = c('#77bca2', '#e1926b', '#a09cc8', "grey", "orange", "indianred", "orange"),
                    guide = 'none') +
  
  coord_equal() +
  theme_void()

# Display the plot
plot_bubble_pie
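
A minimal sketch of filled, labeled slices, assuming `ggforce::geom_arc_bar()` is an acceptable replacement for the segment lines. It uses only the three group counts from the question and the same pie center and radius; the slice angles are taken as strictly proportional to the counts, which is an assumption and differs from the angle factors in the code above. Angles in `geom_arc_bar()` start at 12 o'clock and run clockwise, so the label positions use `sin()` for x and `cos()` for y.

```r
library(ggplot2)
library(ggforce)

counts <- c("Group A" = 650, "Group B" = 450, "Group C" = 900)

# Cumulative end angle of each slice, proportional to its count
slice_end <- as.numeric(2 * pi * cumsum(counts) / sum(counts))

slices <- data.frame(
  group = names(counts),
  n     = as.numeric(counts),
  start = c(0, head(slice_end, -1)),
  end   = slice_end,
  x0 = 2.7, y0 = -0.6, r = 3          # pie center and radius from the question
)

# Label position: 60% of the radius along each slice's mid angle
slices$mid     <- (slices$start + slices$end) / 2
slices$label_x <- slices$x0 + 0.6 * slices$r * sin(slices$mid)
slices$label_y <- slices$y0 + 0.6 * slices$r * cos(slices$mid)

pie_plot <- ggplot(slices) +
  geom_arc_bar(aes(x0 = x0, y0 = y0, r0 = 0, r = r,
                   start = start, end = end, fill = group)) +
  geom_text(aes(x = label_x, y = label_y, label = paste0(group, ":\n", n)),
            size = 3.2, fontface = "bold") +
  scale_fill_manual(values = c("Group A" = "red", "Group B" = "green",
                               "Group C" = "yellow"), guide = "none") +
  coord_equal() +
  theme_void()

pie_plot
```

The same two layers could be added to the bubble plot in place of `geom_segment()` and `geom_label()`, although the existing fill scale would then also need entries for the three groups.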

Homer Jay Simpson

Asked: 2025-02-19 04:30:50 +0800 CST

Sort a Likert plot based on the sorting of another Likert plot in R

5

I have a data frame in R that I use to calculate percentages and present them in a Likert plot plus a bar chart. In the middle I have a bar chart showing the percentage of NAs in this data frame for each question within each grouping level. I want to match the question order of the middle plot to that of the left one (that is, the Likert plot on the left is my base, and the order of q1:q6 in the middle plot should follow it). How can I achieve this in R?

Any help?


library(ggstats)
library(dplyr)
library(tidyr)      # pivot_longer()
library(ggplot2)
library(patchwork)  # plot_layout() and combining plots with +


likert_levels <- c(
  "Strongly disagree",
  "Disagree",
  "Neither agree nor disagree",
  "Agree",
  "Strongly agree"
)
set.seed(42)
df <-
  tibble(
    grouping = sample(c(LETTERS[1:9]), 150, replace = TRUE),
    q1 = sample(c(likert_levels, NA), 150, replace = TRUE),
    q2 = sample(c(likert_levels, NA), 150, replace = TRUE),
    q3 = sample(c(likert_levels, NA), 150, replace = TRUE),
    q4 = sample(c(likert_levels, NA), 150, replace = TRUE),
    q5 = sample(c(likert_levels, NA), 150, replace = TRUE),
    q6 = sample(c(likert_levels, NA), 150, replace = TRUE)
  ) |>
  mutate(across(-grouping, ~ factor(.x, levels = likert_levels)))

filter_df = df %>%
  dplyr::select(grouping) %>%
  dplyr::group_by(grouping) %>%
  dplyr::summarise(n = n()) %>%
  dplyr::filter(n >= 18)%>%
  dplyr::arrange(desc(n))
parameters = as.vector(filter_df[[1]])

# Seed used to create the data
set.seed(42)

data_fun <- function(.data) {
  .data |>
    mutate(
      .question = interaction(grouping, .question),
      .question = reorder(
        .question,
        ave(as.numeric(.answer), .question, FUN = \(x) {
          sum(x %in% 4:5) / length(x[!is.na(x)])
        }),
        decreasing = TRUE
      )
    )
}

df = df%>%
  filter(grouping %in% parameters)

v1 <- gglikert(df, q1:q6,
               facet_rows = vars(grouping),
               add_totals = TRUE,
               data_fun = data_fun
) +
  scale_y_discrete(
    labels = ~ gsub("^.*\\.", "", .x)
  ) +
  labs(y = NULL) +
  theme(
    panel.border = element_rect(color = "gray", fill = NA),
    axis.text.x = element_blank(),
    legend.position = "bottom",
    strip.text = element_text(color = "black", face = "bold"),
    strip.placement = "outside"
  ) +
  theme(strip.text.y = element_text(angle = 0)) +
  facet_wrap(
    facets = vars(grouping),
    labeller = labeller(grouping = label_wrap_gen(width = 5)),
    ncol = 1, scales = "free_y",
    strip.position = "right"
  )

v2 <- filter_df %>%
  ggplot2::ggplot(aes(y = grouping, x = n)) +
  geom_bar(stat = "identity", fill = "lightgrey") +
  geom_text(aes(label = n), position = position_stack(vjust = 0.5)) +
  scale_y_discrete(
    limits = rev, expand = c(0, 0)
  ) +
  facet_wrap(
    facets = vars(grouping),
    labeller = labeller(grouping = label_wrap_gen(width = 10)),
    ncol = 1, scales = "free_y",
    strip.position = "left"
  ) +
  theme_light() +
  theme(
    panel.border = element_rect(color = "gray", fill = NA),
    axis.text.x = element_blank(),
    legend.position = "none",
    strip.text.y = element_blank()
  ) +
  labs(x = NULL, y = NULL)

availability_levels <- c(
  "available",
  "not_available"
)

df_ava = df%>%
  
  pivot_longer(!grouping, names_to = "question", values_to = "response")%>%
  mutate(count2 = case_when(is.na(response) ~ "not_available",
                            TRUE ~"available"))%>%
  select(-response)%>%
  group_by(grouping,question)%>%
  summarise(
    total = n(),
    available_percent = sum(count2 == "available") / total * 100,
    not_available_percent = round(sum(count2 == "not_available") / total * 100,0),
    .groups = 'drop'
  )%>%
  select(grouping,question,not_available_percent)

df_ava


v3 <- df_ava%>%
  ggplot2::ggplot(aes(y = question, x = not_available_percent)) +
  geom_bar(stat = "identity", fill = "lightgrey") +
  geom_text(aes(label = paste0(not_available_percent, "%")), 
            size = 2.5,
            position = position_stack(vjust = 0.5)) +
  scale_y_discrete(
    limits = rev, expand = c(0, 0)
  ) +
  facet_wrap(
    facets = vars(grouping),
    labeller = labeller(grouping = label_wrap_gen(width = 10)),
    ncol = 1, scales = "free_y",
    strip.position = "left"
  ) +
  theme_light() +
  theme(
    panel.border = element_rect(color = "gray", fill = NA),
    axis.text.x = element_blank(),
    legend.position = "bottom"#,
  #  strip.text.y = element_blank()
  ) +
  labs(x = NULL, y = NULL)

v1+v3+v2+ plot_layout(widths = c(3,1,.5)
) &
  theme(legend.position = "bottom")

Homer Jay Simpson

Asked: 2025-02-14 15:56:22 +0800 CST

imap with multiple ifelse for 3 by 4 subplots and facet rows and columns in R

5

Based on this post about gglikert. There, the column group1 had 2 levels and group2 had 3. Now suppose I implement it with 3 levels in group1 and 4 levels in group2:

library(ggstats)
library(dplyr)
library(ggplot2)
library(tidyverse)
library(patchwork)

# Define Likert scale levels
likert_levels <- c(
  "Strongly disagree",
  "Disagree",
  "Neither agree nor disagree",
  "Agree",
  "Strongly agree"
)

# Generate sample data
set.seed(42)
df <- tibble(
  q4 = sample(likert_levels, 150, replace = TRUE, prob = 1:5),
  q5 = sample(c(likert_levels, NA), 150, replace = TRUE),
  q6 = sample(likert_levels, 150, replace = TRUE, prob = c(1, 0, 1, 1, 0))
) |>
  mutate(across(everything(), ~ factor(.x, levels = likert_levels)))

# Add grouping variables
df_group <- df
df_group$group1 <- sample(c("Friday", "Saturday", "Sunday"), 150, replace = TRUE)
df_group$group2 <- sample(c("Abu Dhabi - UAE", "Buenos Aires - Argentina", "San Sebastian - Spain", "New York City - USA"), 150, replace = TRUE)

# Generate Likert plots with conditional faceting for group1 and group2
plots <- df_group |>
  split(~ group1 + group2) |>
  imap(\(x, y) {
    # Create the plot with the corresponding facet layer
    gglikert(x,
             q4:q6,
             labels_size = 3,
             sort = "descending"
    ) +
      facet_grid(group2 ~ group1, scales = "free_y")
  })

# Combine plots with patchwork
combined_plot <- wrap_plots(
  plots,
  ncol = length(unique(df_group$group1)),
  nrow = length(unique(df_group$group2)),
  guides = "collect"
) &
  theme(
    legend.position = "bottom",
    # Remove y-axis text for columns 2 and 3
    strip.text.y.right = element_blank(), # Remove facet strip text on the right
    axis.text.y.right = element_blank(),  # Remove y-axis text on the right
    axis.ticks.y.right = element_blank() # Remove y-axis ticks on the right
  )

# Display the combined plot
combined_plot

I get this.

So the facet columns "Friday", "Saturday", "Sunday" appear in every row and in every column, and the rows "Abu Dhabi - UAE", "Buenos Aires - Argentina", "San Sebastian - Spain", "New York City - USA" are not displayed. They should be displayed in the last column. Also, the y-axis text should be present only in the first column.

How can I modify the imap() call with ifelse so that it is displayed correctly, as here?

Any help?
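
The per-panel decision described above (column strips only in the top row, row strips only in the last column, y-axis text only in the first column) can be sketched in base R. The helper `panel_flags()` and the explicit level vectors are hypothetical; the sketch only works out which decorations a panel keeps and does not touch gglikert itself.

```r
# Hypothetical helper: decide which decorations one panel of the grid keeps,
# given the panel's own group1/group2 values and the ordered level sets.
panel_flags <- function(g1, g2, g1_levels, g2_levels) {
  list(
    show_col_strip = g2 == g2_levels[1],                  # top row only
    show_row_strip = g1 == g1_levels[length(g1_levels)],  # last column only
    show_y_text    = g1 == g1_levels[1]                   # first column only
  )
}

days   <- c("Friday", "Saturday", "Sunday")
cities <- c("Abu Dhabi - UAE", "Buenos Aires - Argentina",
            "New York City - USA", "San Sebastian - Spain")

# Top-right panel: keeps both strips but not the y-axis text
flags <- panel_flags("Sunday", "Abu Dhabi - UAE", days, cities)
str(flags)
```

Inside the `imap()` callback, flags like these could be computed from `x$group1[1]` and `x$group2[1]` and then used to pick which variables go into `facet_grid()` and whether `axis.text.y` is blanked; that wiring is left as an assumption here.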

Homer Jay Simpson

Asked: 2025-02-13 23:40:53 +0800 CST

Sort gglikert within the facet rows of the subplots

7

I have the same simulated data shown on the library's GitHub page, with facet rows and columns (included in the reproducible example below).

But I want to sort each subplot by the sum of "Strongly agree" and "Agree". How can I achieve this in R using gglikert?

library(ggstats)
library(dplyr)
library(ggplot2)

likert_levels <- c(
  "Strongly disagree",
  "Disagree",
  "Neither agree nor disagree",
  "Agree",
  "Strongly agree"
)
set.seed(42)
df <-
  tibble(
    q1 = sample(likert_levels, 150, replace = TRUE),
    q2 = sample(likert_levels, 150, replace = TRUE, prob = 5:1),
    q3 = sample(likert_levels, 150, replace = TRUE, prob = 1:5),
    q4 = sample(likert_levels, 150, replace = TRUE, prob = 1:5),
    q5 = sample(c(likert_levels, NA), 150, replace = TRUE),
    q6 = sample(likert_levels, 150, replace = TRUE, prob = c(1, 0, 1, 1, 0))
  ) |>
  mutate(across(everything(), ~ factor(.x, levels = likert_levels)))

df_group <- df
df_group$group1 <- sample(c("A", "B"), 150, replace = TRUE)
df_group$group2 <- sample(c("a", "b", "c"), 150, replace = TRUE)

gglikert(df_group,
  q1:q6,
  facet_cols = vars(group1),
  labels_size = 3
)

gglikert(df_group,
  q3:q6,
  facet_cols = vars(group1),
  facet_rows = vars(group2),
  labels_size = 3
) +
  scale_x_continuous(
    labels = label_percent_abs(),
    expand = expansion(0, .2)
  )
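
The sorting criterion itself, the share of "Agree" plus "Strongly agree" among the non-missing answers, computed separately for each group, can be sketched in base R. The function `top2_share()` and the toy data are made up for illustration; how the resulting order is fed back into gglikert (for example through its `data_fun` argument) is not shown and would need checking against the package documentation.

```r
# Share of "Agree" + "Strongly agree" among the non-missing answers
top2_share <- function(answers, top = c("Agree", "Strongly agree")) {
  answers <- answers[!is.na(answers)]
  mean(answers %in% top)
}

# Made-up toy data: two questions, two groups
toy <- data.frame(
  group1 = c("A", "A", "A", "B", "B", "B"),
  q1 = c("Agree", "Disagree", "Disagree", "Agree", "Strongly agree", "Agree"),
  q2 = c("Agree", "Strongly agree", NA, "Disagree", "Disagree", "Agree")
)

# For every group, question names ordered from highest to lowest share
question_order <- lapply(
  split(toy[c("q1", "q2")], toy$group1),
  function(block) {
    shares <- vapply(block, top2_share, numeric(1))
    names(sort(shares, decreasing = TRUE))
  }
)

question_order
```

Each group gets its own ordering, which is the key point: a single global sort would not reorder the questions independently inside every facet.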

Homer Jay Simpson

Asked: 2025-02-12 00:35:30 +0800 CST

Flip the text in facet rows in gglikert in R

5

I have the same simulated data described here, with the difference that group2 has long text elements. I want to flip the text in the facet rows to horizontal, but without losing the facet column strip at the top.

How can I do this in R?

library(ggstats)
library(dplyr)
library(ggplot2)
likert_levels <- c(
  "Strongly disagree",
  "Disagree",
  "Neither agree nor disagree",
  "Agree",
  "Strongly agree"
)

df <-
  tibble(
    grouping = sample(c("A", "B", "C", "D"), 150, replace = TRUE),
    q1 = sample(likert_levels, 150, replace = TRUE),
    q2 = sample(likert_levels, 150, replace = TRUE, prob = 5:1),
    q3 = sample(likert_levels, 150, replace = TRUE, prob = 1:5),
    q4 = sample(likert_levels, 150, replace = TRUE, prob = 1:5),
    q5 = sample(c(likert_levels, NA), 150, replace = TRUE),
    q6 = sample(likert_levels, 150, replace = TRUE, prob = c(1, 0, 1, 1, 0))
  ) %>%
  mutate(across(-grouping, ~ factor(.x, levels = likert_levels)))
df
gglikert(df, q1:q6, facet_cols = vars(grouping))
df_group <- df
df_group$group1 <- sample(c("category A", "category B","category C"), 150, replace = TRUE)
df_group$group2 <- sample(c("usa-new york", "usa-san fransisco", "usa-new orleans",
                            "united kingdom-stratford upon avon",
                            "south africa-port elizabeth",
                            "new zealand-upper hutt city"), 150, replace = TRUE)
gglikert(df_group,
         q3:q6,
         facet_cols = vars(group1),
         facet_rows = vars(group2),
         labels_size = 3
) +
  scale_x_continuous(
    labels = label_percent_abs(),
    expand = expansion(0, .2)
  )
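
A minimal sketch of horizontal row strips with plain ggplot2, assuming that the plot returned by gglikert accepts the same `theme()` call. The toy data frame and its column names are hypothetical. Setting `strip.text.y` only affects the row strips, so the column strip at the top stays in place, and `label_wrap_gen()` wraps the long labels.

```r
library(ggplot2)

# Made-up toy data with long row labels, similar in spirit to group2
demo <- data.frame(
  x = c(1, 2, 3, 4),
  y = c(2, 1, 4, 3),
  col_group = c("category A", "category B", "category A", "category B"),
  row_group = c("united kingdom-stratford upon avon",
                "united kingdom-stratford upon avon",
                "new zealand-upper hutt city",
                "new zealand-upper hutt city")
)

strip_plot <- ggplot(demo, aes(x, y)) +
  geom_point() +
  facet_grid(rows = vars(row_group), cols = vars(col_group),
             labeller = labeller(row_group = label_wrap_gen(width = 15))) +
  # angle = 0 turns the row strip text horizontal; column strips are untouched
  theme(strip.text.y = element_text(angle = 0))

strip_plot
```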

Homer Jay Simpson

Asked: 2025-02-08 19:03:07 +0800 CST

flextable width in PDF does not show the column names in full

5

I have the following R Markdown document in R that generates a flextable object.

My problem is that in columns 4 and 5 the header names do not appear in full: the last letters are hidden.


---
title: "flex_width_issue"
output: pdf_document
date: "2025-02-08"
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE)
```


library(tidyverse)
library(officer)
library(flextable)

ft3 = structure(list("Project Number" = c(4107L, 1770L, 1979L, 9252L, 
2581L, 8360L, 6290L, 1002L, 7300L, 2925L), "Client Company" = c("Dynamic Build Concept Agency", 
"Nova", "Alpha Corp", "Global Innovations", "Core Metrics", "Vision Group for Property Holdings", 
"United Firm for Urban Growth Projects", "Eastern Gate Real Estate Investment Group (EGRIG)", 
"Eastern Gate Real Estate Investment Group (EGRIG)", "Eastern Gate Real Estate Investment Group (EGRIG)"
), `organizational growth planning` = c(5, 5, 4.83, 4.67, 4.17, 
4, 3.83, 3.67, 3.5, 2.83), competency = c(5, 5, 4.83, 4.67, 4.27, 
4.08, 4.25, 4, 3.5, 3.25), compression = c(5, 5, 5, 4.67, 4.38, 
4.67, 4.67, 4, 3.67, 3), `International development project` = c(5, 
4.57, 4.43, 4.43, 3.83, 4.17, 3.57, 3.14, 2.71, 2.71), `Team spirit` = c(5, 
5, 5, 4.5, 4.21, 4.5, 4.5, 3.5, 3.5, 3), Plan = c(5, 5, 4, 4, 
3.6, 2, 3, 4, 3, 3), PIR = c(5, 5, 4.17, 4.67, 4.07, 4.17, 4.33, 
3.67, 3.67, 3.33), Success = c(5, 5, 4, 4, 4.08, 5, 3, 4, 2.67, 
3), plant = c(100, 98.92, 90.65, 89.03, 81.6, 81.48, 77.88, 74.95, 
65.55, 60.3)), row.names = c(NA, -10L), class = c("tbl_df", "tbl", 
"data.frame"))


map_color2 = function(value) {
  case_when(
    value >= 1   & value <  1.5  ~ "#ed2e1c",    # [1–1.5[
    value >= 1.5 & value <= 2.5  ~ "#e09c95", # [1.5–2.5]
    value > 2.5  & value <  3.5  ~ "#85c1e9",   # ]2.5–3.5[
    value >= 3.5 & value <= 4.5 ~ "#7FF98B", # [3.5–4.5]
    value > 4.5  & value <=   5  ~ "#04B431",    # ]4.5–5]
    TRUE ~ "white"  # Default color for values outside the range
  )
}

colors2 = ft3 %>%
  select(3:(ncol(.) - 1)) %>%
  mutate(across(everything(), ~ map_color2(.)))  # Apply map_color column-wise


ft4 = ft3%>%
  flextable::flextable()

# Apply map_color to columns from 3 to the second-to-last column
ft4%>%
  border_outer(part = "all") %>%
  border_inner(part = "all") %>%
  bg(part = "header", bg = "grey") %>%
  bold(part = "header") %>%
  align(align = "center", part = "header") %>%         
  theme_zebra()%>%
  bg(bg = "grey", part = "header")%>% 
  align(align="center", part="all")%>%
  fit_to_width(max_width = 7)%>% 
  fontsize(size=7, part="all")%>%
  flextable::bg(i = rep(1:nrow(ft3)), 
                j = rep(3:(ncol(ft3) - 1)), 
                bg = unlist(colors2))%>%
  align(j = 2, align = "center", part = "all")



Can anyone help me with this?

How can this fit the page width and show all the header column names in full?

Homer Jay Simpson

Asked: 2025-01-10 04:39:52 +0800 CST

Pivot wider in R with 2 variables for names_from

7

I have a data frame in R called data:

data
# A tibble: 192 × 4
    Year Category Favor    Percentage
   <dbl> <chr>    <chr>         <dbl>
 1  2002 A        Good           35.8
 2  2002 A        Mediocre       31.9
 3  2002 A        Bad            45.3
 4  2002 B        Good           51.3
 5  2002 B        Mediocre       42.3
 6  2002 B        Bad            26.4
 7  2002 C        Good           64.4
 8  2002 C        Mediocre       33.4
 9  2002 C        Bad            24.2
10  2002 D        Good           56.2

I want to pivot it wider so that it ideally looks like the following:

Category	Bad - 1998	Bad - 1999	...	Bad - 2002	Mediocre - 1998	...	Mediocre - 2002	Good - 1998	...	Good - 2002
A
B
C
D
E
F
G
H
I
J
K
L
M
N
O
P

That is, the Category column should be the first column, followed by Bad - 1998 as the second column, Bad - 1999 as the third, Bad - 2001 as the fourth, Bad - 2002 as the fifth, then Mediocre - 1998, Mediocre - 1999, Mediocre - 2001, Mediocre - 2002, Good - 1998, Good - 1999, Good - 2001 and finally Good - 2002.

How can I do this in R using tidyverse functions?

Data

dput(data)
structure(list(Year = c(2002, 2002, 2002, 2002, 2002, 2002, 2002, 
2002, 2002, 2002, 2002, 2002, 2002, 2002, 2002, 2002, 2002, 2002, 
2002, 2002, 2002, 2002, 2002, 2002, 2002, 2002, 2002, 2002, 2002, 
2002, 2002, 2002, 2002, 2002, 2002, 2002, 2002, 2002, 2002, 2002, 
2002, 2002, 2002, 2002, 2002, 2002, 2002, 2002, 1998, 1998, 1998, 
1998, 1998, 1998, 1998, 1998, 1998, 1998, 1998, 1998, 1998, 1998, 
1998, 1998, 1998, 1998, 1998, 1998, 1998, 1998, 1998, 1998, 1998, 
1998, 1998, 1998, 1998, 1998, 1998, 1998, 1998, 1998, 1998, 1998, 
1998, 1998, 1998, 1998, 1998, 1998, 1998, 1998, 1998, 1998, 1998, 
1998, 1999, 1999, 1999, 1999, 1999, 1999, 1999, 1999, 1999, 1999, 
1999, 1999, 1999, 1999, 1999, 1999, 1999, 1999, 1999, 1999, 1999, 
1999, 1999, 1999, 1999, 1999, 1999, 1999, 1999, 1999, 1999, 1999, 
1999, 1999, 1999, 1999, 1999, 1999, 1999, 1999, 1999, 1999, 1999, 
1999, 1999, 1999, 1999, 1999, 2001, 2001, 2001, 2001, 2001, 2001, 
2001, 2001, 2001, 2001, 2001, 2001, 2001, 2001, 2001, 2001, 2001, 
2001, 2001, 2001, 2001, 2001, 2001, 2001, 2001, 2001, 2001, 2001, 
2001, 2001, 2001, 2001, 2001, 2001, 2001, 2001, 2001, 2001, 2001, 
2001, 2001, 2001, 2001, 2001, 2001, 2001, 2001, 2001), Category = c("A", 
"A", "A", "B", "B", "B", "C", "C", "C", "D", "D", "D", "E", "E", 
"E", "F", "F", "F", "G", "G", "G", "H", "H", "H", "I", "I", "I", 
"J", "J", "J", "K", "K", "K", "L", "L", "L", "M", "M", "M", "N", 
"N", "N", "O", "O", "O", "P", "P", "P", "A", "A", "A", "B", "B", 
"B", "C", "C", "C", "D", "D", "D", "E", "E", "E", "F", "F", "F", 
"G", "G", "G", "H", "H", "H", "I", "I", "I", "J", "J", "J", "K", 
"K", "K", "L", "L", "L", "M", "M", "M", "N", "N", "N", "O", "O", 
"O", "P", "P", "P", "A", "A", "A", "B", "B", "B", "C", "C", "C", 
"D", "D", "D", "E", "E", "E", "F", "F", "F", "G", "G", "G", "H", 
"H", "H", "I", "I", "I", "J", "J", "J", "K", "K", "K", "L", "L", 
"L", "M", "M", "M", "N", "N", "N", "O", "O", "O", "P", "P", "P", 
"A", "A", "A", "B", "B", "B", "C", "C", "C", "D", "D", "D", "E", 
"E", "E", "F", "F", "F", "G", "G", "G", "H", "H", "H", "I", "I", 
"I", "J", "J", "J", "K", "K", "K", "L", "L", "L", "M", "M", "M", 
"N", "N", "N", "O", "O", "O", "P", "P", "P"), Favor = c("Good", 
"Mediocre", "Bad", "Good", "Mediocre", "Bad", "Good", "Mediocre", 
"Bad", "Good", "Mediocre", "Bad", "Good", "Mediocre", "Bad", 
"Good", "Mediocre", "Bad", "Good", "Mediocre", "Bad", "Good", 
"Mediocre", "Bad", "Good", "Mediocre", "Bad", "Good", "Mediocre", 
"Bad", "Good", "Mediocre", "Bad", "Good", "Mediocre", "Bad", 
"Good", "Mediocre", "Bad", "Good", "Mediocre", "Bad", "Good", 
"Mediocre", "Bad", "Good", "Mediocre", "Bad", "Good", "Mediocre", 
"Bad", "Good", "Mediocre", "Bad", "Good", "Mediocre", "Bad", 
"Good", "Mediocre", "Bad", "Good", "Mediocre", "Bad", "Good", 
"Mediocre", "Bad", "Good", "Mediocre", "Bad", "Good", "Mediocre", 
"Bad", "Good", "Mediocre", "Bad", "Good", "Mediocre", "Bad", 
"Good", "Mediocre", "Bad", "Good", "Mediocre", "Bad", "Good", 
"Mediocre", "Bad", "Good", "Mediocre", "Bad", "Good", "Mediocre", 
"Bad", "Good", "Mediocre", "Bad", "Good", "Mediocre", "Bad", 
"Good", "Mediocre", "Bad", "Good", "Mediocre", "Bad", "Good", 
"Mediocre", "Bad", "Good", "Mediocre", "Bad", "Good", "Mediocre", 
"Bad", "Good", "Mediocre", "Bad", "Good", "Mediocre", "Bad", 
"Good", "Mediocre", "Bad", "Good", "Mediocre", "Bad", "Good", 
"Mediocre", "Bad", "Good", "Mediocre", "Bad", "Good", "Mediocre", 
"Bad", "Good", "Mediocre", "Bad", "Good", "Mediocre", "Bad", 
"Good", "Mediocre", "Bad", "Good", "Mediocre", "Bad", "Good", 
"Mediocre", "Bad", "Good", "Mediocre", "Bad", "Good", "Mediocre", 
"Bad", "Good", "Mediocre", "Bad", "Good", "Mediocre", "Bad", 
"Good", "Mediocre", "Bad", "Good", "Mediocre", "Bad", "Good", 
"Mediocre", "Bad", "Good", "Mediocre", "Bad", "Good", "Mediocre", 
"Bad", "Good", "Mediocre", "Bad", "Good", "Mediocre", "Bad", 
"Good", "Mediocre", "Bad", "Good", "Mediocre", "Bad", "Good", 
"Mediocre", "Bad"), Percentage = c(36.85, 36.88, 46.28, 60.28, 
45.3, 36.42, 70.44, 37.39, 34.17, 58.23, 48.77, 34.99, 67.53, 
33.46, 32.01, 35.96, 35.33, 62.71, 46.84, 32.92, 42.24, 83.21, 
26.67, 16.11, 65.91, 23.94, 46.15, 81.83, 23.86, 27.31, 74.32, 
35.09, 33.59, 71.91, 30.92, 28.17, 62.84, 33.06, 27.1, 63.05, 
44.81, 26.15, 76.68, 35.99, 23.33, 49.19, 34.58, 42.23, 55.21, 
35.37, 34.42, 64.48, 36.53, 33.99, 70.81, 27.1, 31.09, 51.36, 
43, 37.65, 64.57, 34.37, 29.06, 35.55, 28.44, 56.01, 56.84, 33.36, 
38.8, 79.74, 35.74, 22.52, 66.86, 29.99, 44.15, 89.57, 27.79, 
14.64, 82.49, 27.37, 27.14, 75.92, 33.39, 18.69, 69.8, 34.69, 
29.51, 75.2, 42.63, 29.17, 90.33, 36.72, 13.95, 52.73, 30, 38.27, 
68.14, 40.61, 33.25, 66.2, 33.99, 31.81, 80.38, 26.48, 21.13, 
50.5, 40.36, 37.14, 74.17, 31.78, 29.04, 44.91, 35.24, 43.85, 
69.23, 35.44, 32.33, 86.44, 24.11, 17.46, 69.69, 33.06, 40.25, 
86.37, 21.21, 21.42, 80.11, 35.57, 32.32, 77.2, 32.03, 19.77, 
72.98, 28.08, 20.94, 70.81, 29.12, 24.07, 88.14, 22.31, 16.55, 
67.49, 44.16, 32.35, 69.03, 39.45, 28.52, 71.97, 37.6, 25.43, 
79.06, 38.4, 19.55, 68.94, 37.03, 30.03, 80.74, 30.59, 30.67, 
49.07, 45.79, 47.14, 60.1, 27.55, 34.36, 88.54, 30.2, 20.26, 
59.42, 22.98, 43.61, 86.84, 16.73, 14.43, 77.42, 22.07, 22.52, 
78.85, 23.88, 17.28, 78.22, 39.57, 27.22, 80.17, 26.21, 20.63, 
94.63, 28.66, 13.71, 65.86, 31.97, 32.16)), row.names = c(NA, 
-192L), class = c("tbl_df", "tbl", "data.frame"))
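A minimal sketch of one way to get that column layout, assuming tidyr's pivot_wider() with both Favor and Year in names_from. The `toy` tibble is a hypothetical stand-in for `data` with the same four columns and made-up values. Since pivot_wider() orders new columns by first appearance by default, sorting the rows first (Bad, Mediocre, Good, each by year) gives the requested order.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for `data`: same columns, made-up percentages
toy <- tibble(
  Year = rep(c(1998, 2002), each = 6),
  Category = rep(rep(c("A", "B"), each = 3), times = 2),
  Favor = rep(c("Good", "Mediocre", "Bad"), times = 4),
  Percentage = seq(10, 120, by = 10)
)

wide <- toy %>%
  # Make Favor a factor so that arrange() sorts Bad < Mediocre < Good
  mutate(Favor = factor(Favor, levels = c("Bad", "Mediocre", "Good"))) %>%
  arrange(Favor, Year) %>%
  pivot_wider(
    id_cols = Category,
    names_from = c(Favor, Year),  # two variables form each column name
    values_from = Percentage,
    names_sep = " - "
  )

print(names(wide))
```

With the real data the same pipeline should apply unchanged, with `toy` replaced by `data`.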

Homer Jay Simpson

Asked: 2025-01-03 19:02:27 +0800 CST

Y axis in R shows NA's in a dot plot using ggplot2

4

I have a data frame in R called df:

> df
# A tibble: 25 × 3
   delta cat    Year
   <dbl> <chr> <dbl>
 1  2.5  A      2019
 2  2    A      2024
 3  2.6  A      2020
 4  4    A      2022
 5  4.5  A      2023
 6  3    B      2019
 7  2.8  B      2024
 8  2.95 B      2023
 9  2.98 B      2022
10  3.07 B      2020
# ℹ 15 more rows

I created a dot plot in R using ggplot2, but on the y axis the years:

Are all NA
Are not sorted by the progression of the years (i.e., 2019, ..., 2024)

YEARS <- c(2019,2020, 2022, 2023, 2024)
x=df%>%
  mutate_if(is.factor,as.character)%>%
  group_by(cat) %>%
  arrange(delta,.by_group = TRUE) %>%
  mutate(labels = paste0(letters[1:n()], "/",Year))%>%
  print(n=30)

x%>%
  mutate(Year = factor(Year, levels = rev(YEARS)))%>%
  group_by(Year)%>%
  arrange(Year)%>%
  ggplot(aes(x = delta, y =  Year
  )) +
  geom_point(size = 3) +
  geom_label_repel(aes(label = delta),                # Add text labels showing `delta`
                   size = 3,                         # Adjust text size
                   box.padding = 0.3,                # Padding around text box
                   point.padding = 0.2,              # Padding around points
                   segment.color = "gray",           # Line color
                   segment.size = 0.5) +   
  facet_grid(cat ~., scales="free_y") +
  scale_y_discrete(labels = \(x) str_extract(x, "(?<=/).*")) +
  labs(y =NULL) +
  geom_vline(xintercept=0) +
  theme_bw() +
  theme(legend.position = "none",                           # Remove the legend
        axis.text.x = element_text(angle = 0, hjust = 1),   # Rotate x-axis labels
        strip.text.y = element_text(size = 8, angle = 0, vjust = 0.5),
        axis.text.y  = element_text(size  = 7),
        strip.text = element_text(size = 14),               # Increase facet label size
        axis.title = element_text(size = 14),               # Increase axis title size
        axis.text = element_text(size = 10))+               # Increase axis text size
  theme(strip.background = element_rect(color="black", fill="gray", size=1.5, linetype="solid"))+
  labs(title = "",x = "")

How can I fix this?

Data


df=structure(list(delta = c(2.5, 2, 2.6, 4, 4.5, 3, 
                        2.8, 2.95, 2.98, 3.07, 2.2, 3.3, 3.4, 3.9,5, 3.7, 2.9, 2.9, 
                        2.7, 2.9, 3.6, 2.3, 2.4, 2.3, 2.7), cat = c("A", "A", 
                                                                              "A", "A", "A", "B", "B", "B", "B", "B", "C", "C", "C", "C", "C", 
                                                                              "E","E","E","E","E", "D", "D", "D", "D", "D"), Year = c(2019, 
                                                                                                                                     2024, 2020, 2022, 2023, 2019, 2024, 2023, 2022, 2020, 2019, 2024, 
                                                                                                                                     2020, 2022, 2023, 2019, 2020, 2022, 2023, 2024, 2019, 2022, 2020, 
                                                                                                                                     2024, 2023)), row.names = c(NA, -25L), class = c("tbl_df", "tbl", 
                                                                                                                                                                                      "data.frame"))
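A likely cause of the NA labels, shown as a small sketch: the labels function passed to scale_y_discrete() extracts the text after a "/", but the y values in this plot are plain years with no "/" in them, so str_extract() has nothing to match and returns NA for each one. The vectors below are illustrative, not taken from the plot object.

```r
library(stringr)

# Year values as they reach the y scale once Year is a factor (no "/" in them)
year_levels <- c("2024", "2023", "2022", "2020", "2019")

# The labels function used in the plot: keep the text after a "/"
after_slash <- function(x) str_extract(x, "(?<=/).*")

plain <- after_slash(year_levels)              # no "/" to match, so every result is NA
tagged <- after_slash(c("a/2019", "b/2024"))   # works on "letter/year" style labels

print(plain)
print(tagged)
```

If that is the cause, dropping the scale_y_discrete(labels = ...) line, or mapping the `labels` column to y instead of `Year`, should bring the labels back. On the ordering: a discrete y axis draws the first factor level at the bottom, so levels = rev(YEARS) puts 2024 at the bottom and 2019 at the top.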

Homer Jay Simpson

Asked: 2024-12-30 23:39:12 +0800 CST

Table function in SQL Server with multiple parameters as arguments

6

I have a table in SQL Server 2016 called df:

-- Create a new table with department and gender columns
CREATE TABLE df 
(
    country VARCHAR(50),
    year INT,
    val1 INT,
    val2 INT,
    val3 INT,
    department VARCHAR(50),
    gender VARCHAR(10)
);

-- Insert data into the new table, including department and gender
INSERT INTO df (country, year, val1, val2, val3, department, gender) 
VALUES ('USA', 2020, 4, 4, 5, 'Sales', 'Male'),
('USA', 2020, 4, 4, 5, 'Sales', 'Male'),
('USA', 2020, 5, 5, 5, 'Sales', 'Female'),
('USA', 2020, 5, 5, 5, 'Sales', 'Female'),
('USA', 2020, 1, 1, 5, 'Sales', 'Male'),
('USA', 2020, 3, 3, 5, 'Sales', 'Female'),
('USA', 2020, 4, 2, 5, 'Sales', 'Male'),
('USA', 2020, 1, 1, 5, 'Sales', 'Female'),
('USA', 2020, 2, 2, 5, 'Sales', 'Male'),
('Canada', 2020, 2, 2, 3, 'HR', 'Female'),
('Canada', 2020, 2, 2, 3, 'HR', 'Female'),
('Canada', 2020, 2, 2, 3, 'HR', 'Male'),
('Canada', 2020, 2, 2, 3, 'HR', 'Male'),
('Canada', 2020, 5, 5, 3, 'HR', 'Female'),
('Canada', 2020, 5, 5, 3, 'HR', 'Male'),
('Canada', 2020, 1, 1, 3, 'HR', 'Female'),
('Canada', 2020, 1, 1, 3, 'HR', 'Male'),
('Canada', 2020, 3, 4, 3, 'HR', 'Female'),
('Canada', 2020, 3, 4, 3, 'HR', 'Male'),
('Canada', 2020, 5, 4, 3, 'HR', 'Female'),
('Canada', 2020, 5, 4, 5, 'HR', 'Male'),
('Canada', 2020, 5, 4, 5, 'HR', 'Female'),
('Germany', 2022, 5, 5, 4, 'IT', 'Male'),
('France', 2020, 1, 1, 2, 'Finance', 'Female'),
('France', 2020, 1, 1, 2, 'Finance', 'Female'),
('France', 2020, 3, 2, 2, 'Finance', 'Male'),
('France', 2020, 3, 4, 2, 'Finance', 'Female'),
('France', 2020, 3, 5, 5, 'Finance', 'Male'),
('France', 2020, 3, 4, 4, 'Finance', 'Female'),
('France', 2020, 3, 4, 4, 'Finance', 'Male'),
('France', 2020, 3, 4, 3, 'Finance', 'Female'),
('UK', 2021, 4, 2, 3, 'Marketing', 'Male'),
('Australia', 2022, 3, 3, 4, 'Support', 'Female'),
('Italy', 2020, 5, 5, 5, 'Operations', 'Male'),
('Italy', 2020, 5, 5, 5, 'Operations', 'Female'),
('Italy', 2020, 5, 1, 1, 'Operations', 'Male'),
('Italy', 2020, 4, 4, 1, 'Operations', 'Female'),
('Italy', 2020, 2, 1, 2, 'Operations', 'Male'),
('Italy', 2020, 3, 5, 3, 'Operations', 'Female'),
('Spain', 2021, 1, 2, 3, 'Customer Service', 'Male'),
('Mexico', 2022, 4, 4, 4, 'Logistics', 'Female'),
('Brazil', 2020, 4, 1, 1, 'R&D', 'Male'),
('Brazil', 2020, 4, 1, 1, 'R&D', 'Female'),
('Brazil', 2020, 4, 3, 4, 'R&D', 'Male'),
('Brazil', 2020, 5, 3, 5, 'R&D', 'Female'),
('Brazil', 2020, 5, 3, 5, 'R&D', 'Male'),
('Brazil', 2020, 3, 3, 1, 'R&D', 'Female'),
('Brazil', 2020, 2, 3, 1, 'R&D', 'Male');

-- Select all rows from the new table to check the data
SELECT * FROM df;

With this table, I compute some percentages and a count column based on some filters.

-- Parameters
DECLARE @Year INT = 2020;
DECLARE @Metric VARCHAR(50) = 'count'; 
DECLARE @Gender VARCHAR(20) = NULL; -- Set to specific gender (e.g., 'Male', 'Female') or NULL to include all
DECLARE @Department VARCHAR(50) = NULL; -- Set to specific department (e.g., 'HR', 'Engineering') or NULL to include all
-- Set @Metric to 'dissatisfaction', 'satisfaction', or 'count'

WITH UnpivotedData AS 
(
    SELECT country, gender, department, year, Vals
    FROM 
        (SELECT country, gender, department, year, val1, val2, val3
         FROM df) AS SourceTable
    UNPIVOT 
        (Vals FOR ValueColumn IN (val1, val2, val3)) AS Unpivoted
    WHERE year = @Year
),
Proportions AS 
(
    SELECT 
        country,
        gender,
        department,
        CASE 
            WHEN Vals = 1 THEN 'Very Dissatisfied'
            WHEN Vals = 2 THEN 'Dissatisfied'
            WHEN Vals = 3 THEN 'Neutral'
            WHEN Vals = 4 THEN 'Satisfied'
            WHEN Vals = 5 THEN 'Very Satisfied'
        END AS SatisfactionLevel,
        COUNT(*) * 1.0 / SUM(COUNT(*)) OVER (PARTITION BY country, gender, department) AS Proportion
    FROM 
        UnpivotedData
    GROUP BY 
        country, gender, department, Vals
),
Pivoted AS 
(
    SELECT country, gender, department, 
           [Very Dissatisfied], 
           [Dissatisfied], 
           [Neutral], 
           [Satisfied], 
           [Very Satisfied]
    FROM Proportions
    PIVOT 
        (MAX(Proportion)
         FOR SatisfactionLevel IN ([Very Dissatisfied], [Dissatisfied], [Neutral], [Satisfied], [Very Satisfied])) AS p
),
CountryCounts AS 
(
    SELECT 
        CASE WHEN country IS NULL THEN 'Unknown' ELSE country END AS country,
        gender, 
        department,
        COUNT(*) AS Total
    FROM df
    WHERE year = @Year
    -- Apply filters for gender and department if provided
    AND (@Gender IS NULL OR gender = @Gender)
    AND (@Department IS NULL OR department = @Department)
    GROUP BY CASE WHEN country IS NULL THEN 'Unknown' ELSE country END, gender, department
),
OrderedData AS 
(
    SELECT 
        p.country,
        p.gender,
        p.department,
        [Very Dissatisfied],
        [Dissatisfied],
        [Neutral],
        [Satisfied],
        [Very Satisfied],
        c.Total,
        CASE 
            WHEN @Metric = 'satisfaction' THEN ISNULL([Satisfied], 0) + ISNULL([Very Satisfied], 0)
            WHEN @Metric = 'dissatisfaction' THEN ISNULL([Very Dissatisfied], 0) + ISNULL([Dissatisfied], 0)
            WHEN @Metric = 'count' THEN c.Total
        END AS SortValue
    FROM Pivoted AS p
    INNER JOIN CountryCounts AS c ON p.country = c.country AND p.gender = c.gender AND p.department = c.department
)
SELECT 
    country,
    gender,
    department,
    [Very Dissatisfied],
    [Dissatisfied],
    [Neutral],
    [Satisfied],
    [Very Satisfied],
    Total
FROM 
    OrderedData
ORDER BY 
    SortValue DESC;

I want to create a table function that takes 3 arguments:

Metric
Year
Factor

Factor can be Gender, Department, or both. If, for example, Factor is Gender, the table should be grouped by Gender, and if it is Department, it should be grouped by Department.

If it is both, the table should be grouped by both. If Factor is NULL or left at its default, there should be no grouping at all.

Regarding Year: if Year is passed, the table should be grouped by year. If Year is NULL, all years should be shown without grouping.

Is there a way to do this in SQL Server?

I have a fiddle here

Homer Jay Simpson

Asked: 2024-12-25 17:43:49 +0800 CST

Warning message in ggplot2 `geom_label()`

5

I have a data frame in R that goes through some data transformations, calculations, and plotting:

library(tidyverse)  # attaches ggplot2, tibble, dplyr, tidyr
library(ggstats)
library(patchwork)
likert_levels = c(
  "Very \n Dissatisfied",
  "Dissatisfied",
  "Neutral",
  "Satisfied",
  "Very \n Satisfied"
)


custom_colors = c(
  "Very \n Dissatisfied" = "#ed2e1c",
  "Dissatisfied" = "#e09c95",
  "Neutral" = "#85c1e9",
  "Satisfied" = "#7FF98B",
  "Very \n Satisfied" = "#04B431"
)

var_levels <- c(LETTERS[1:20])
n = 500
likert_levels = c(
  "Very \n Dissatisfied",
  "Dissatisfied",
  "Neutral",
  "Satisfied",
  "Very \n Satisfied"
)

df <- tibble(
  var = sample(var_levels, n, replace = TRUE),  
  val1 = sample(likert_levels, n, replace = TRUE),
  val2 = sample(c(likert_levels, NA),n, replace = TRUE),
  val3 = sample(likert_levels, n, replace = TRUE)
)

df2 = df%>%
  pivot_longer(!var, names_to = "Categories", values_to = "likert_values")%>%
  select(-Categories)%>%
  tidyr::drop_na()


df_bar = df%>%
  select(var)%>%
  group_by(var)%>%
  summarise(n=n())

df_likert = df2 %>%
  group_by(var, likert_values) %>%             # Group by `var` and `likert_values`
  summarise(count = n(), .groups = "drop") %>% # Count the occurrences
  group_by(var) %>%                            # Group by `var`
  mutate(percentage = (count / sum(count)) * 100) %>% # Calculate percentages
  ungroup()                                    # Ungroup for a clean output


df = df_likert%>%
  left_join(.,df_bar,by = "var")%>%
  select(-count)%>%
  pivot_wider(names_from = likert_values, values_from = percentage)%>%
  dplyr::relocate(var,.before=n)%>%
  dplyr::relocate(n,.before=`Very \n Dissatisfied`)%>%
  dplyr::relocate(`Very \n Dissatisfied` ,.after = n)%>%
  dplyr::relocate( Dissatisfied,.after = `Very \n Dissatisfied`)%>%
  dplyr::relocate(Neutral,.after =Dissatisfied )%>%
  dplyr::relocate(Satisfied,.after=Neutral)%>%
  dplyr::relocate(`Very \n Satisfied`,.after = Satisfied)


levels <- names(df)[-c(1:2)]
df_long <- df %>%
  select(-n) %>%
  pivot_longer(!var, names_to = "Likert", values_to = "Percentage") |>
  mutate(Likert = factor(Likert, levels))





df_tot <- df_long |>
  summarise(
    prop_lower = sum(Percentage[Likert %in% levels[1:2]]),
    prop_higher = sum(Percentage[Likert %in% levels[4:5]]),
    .by = var
  ) |>
  pivot_longer(-var,
               names_prefix = "prop_",
               values_to = "Percentage",
               names_to = "where"
  )

var_ordered <- levels(with(df_tot, reorder(var,
                                          ifelse(where == "higher", Percentage, NA),
                                          na.rm = TRUE   )) )
var_ordered = var_ordered[1:10]

df_long=df_long%>%
  filter(var %in% var_ordered)

# Likert plot
likert_plot <- ggplot(df_long, aes(x = Percentage, y = var, fill = Likert)) +
  geom_col(position = position_likert(reverse = FALSE)) +
  geom_text(
    aes(
      label = label_percent_abs(hide_below = .01, accuracy = 1, scale = 1)(Percentage)
    ),
    position = position_likert(vjust = 0.5, reverse = FALSE),
    size = 3.5,
    fontface = "bold"
  ) +
  geom_label(
    data = df_tot,
    aes(
      label = label_percent_abs(hide_below = .01, accuracy = 1, scale = 1)(Percentage),
      x = ifelse(where == "lower", -.8 , .8),
      fill = NULL
    ),
    size = 3.5,
    fontface = "bold",
    label.size = 0.2,
    show.legend = FALSE
  ) +
  scale_x_continuous(
    labels = label_percent_abs()
  ) +
  labs(
    title = "Likert Responses by Category",
    x = "Category",
    y = "Percentage",
    fill = "Likert Scale"
  ) +
  theme_bw() +
  theme( panel.border = element_rect(color = "black"))+
  scale_fill_manual(values = custom_colors) +
  labs(x = NULL, y = NULL, fill = NULL) +
  coord_cartesian(clip = "off")+
  scale_y_discrete(limits = var_ordered)



df = df%>%
  filter(var %in% var_ordered)
# Horizontal bar plot
bar_plot <- ggplot(df, aes(x = n, y = var)) +
  geom_bar(stat = "identity", fill = "lightgrey") +
  geom_label(
    aes(
      label = label_number_abs(hide_below = .05, accuracy = 2)(n)
    ),
    size = 3.5,
    position = position_stack(vjust = 0.5),
    hjust = 1,
    fill = NA,
    label.size = 0,
    color = "black"
  ) +
  scale_y_discrete(limits = var_ordered)+
  scale_x_continuous(
    labels = label_percent_abs(),
    expand = c(0, .15)
  ) +
  theme_light() +
  theme(
    legend.position = "bottom",
    panel.grid.major.y = element_blank(),
    panel.border = element_rect(color = "black") ,
    axis.text.x = element_blank() # Hides x-axis numbers
  ) +
  labs(x = NULL, y = NULL, fill = NULL)

# Print plots

(likert_plot) + (bar_plot) +
  plot_layout(
    width = c(4, 1)
  ) &
  theme(legend.position = "bottom")

I get the plot, but in the console I get a warning message:

Warning message:
Removed 20 rows containing missing values or values outside the scale range (`geom_label()`).

Why do I get this warning? Is it something to do with the NA's? How can I stop it?
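One likely explanation, consistent with the count of 20 in the warning: df_tot is built before df_long is filtered, so it still holds rows for all 20 values of var, while scale_y_discrete(limits = var_ordered) keeps only 10 of them. ggplot2 drops the rows whose y value falls outside the scale limits and reports them. The sketch below reproduces the arithmetic with stand-in data (it assumes all 20 letters occur in the sample) and shows filtering as one possible fix.

```r
# Stand-in for df_tot: one "lower" and one "higher" row per value of var
vars <- LETTERS[1:20]
df_tot <- expand.grid(
  var = vars,
  where = c("lower", "higher"),
  stringsAsFactors = FALSE
)

# Stand-in for var_ordered: only 10 of the 20 categories stay on the y axis
var_ordered <- vars[1:10]

# Rows whose var is not among the scale limits are the ones that get dropped
n_dropped <- sum(!df_tot$var %in% var_ordered)
print(n_dropped)

# Possible fix: filter the label data to the same categories before plotting
df_tot_kept <- df_tot[df_tot$var %in% var_ordered, ]
print(nrow(df_tot_kept))
```

Passing the filtered data to geom_label(data = ...) should leave no rows outside the scale limits, so the warning should no longer appear.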

Homer Jay Simpson

Asked: 2024-12-24 18:45:45 +0800 CST

Sort the y-axis in each facet grid in ggplot2 in R

6

I have a data frame in R called X.

X
# A tibble: 27 × 6
    Year   delta count color     labels categ
   <int>   <dbl> <int> <chr>     <chr>  <chr>
 1  2024 -0.246     26 red       a/2024 A    
 2  2023 -0.243     37 red       b/2023 A    
 3  2022  0.0490    51 red       c/2022 A    
 4  2020  0.0603   125 red       d/2020 A    
 5  2023 -0.219     24 darkgreen a/2023 B    
 6  2022 -0.185     36 darkgreen b/2022 B    
 7  2024 -0.118     19 darkgreen c/2024 B    
 8  2020 -0.0550    89 darkgreen d/2020 B    
 9  2024 -0.592      9 blue      a/2024 C    
10  2022  0.0336    14 blue      b/2022 C    
# ℹ 17 more rows

I want to plot them using facet_grid in ggplot2, but with the years sorted in descending order on the y-axis for each category, starting from 2024 and going down to 2020.

How can I do this sorting in R?

YEARS <- c(2020, 2022, 2023, 2024)
FACETBACKGROUND <- "gray"  # assumed value; this variable is not defined in the original post
X%>%
  mutate(Year = factor(Year, levels = YEARS))%>%
  group_by(Year)%>%
  arrange(Year)%>%
  ggplot(aes(x = delta, y = labels#, color = Year
  )) +
  geom_point(size = 3) +
  facet_grid(categ ~., scales="free_y") +
  scale_y_discrete(labels = \(x) str_extract(x, "(?<=/).*")) +
  labs(y="") +
  geom_vline(xintercept=0) +
  theme_bw() +
  theme(legend.position = "none",                           # Remove the legend
        axis.text.x = element_text(angle = 0, hjust = 1),   # Rotate x-axis labels
        strip.text.y = element_text(size = 8, angle = 0, vjust = 0.5),
        axis.text.y  = element_text(size  = 7),
        strip.text = element_text(size = 14),               # Increase facet label size
        axis.title = element_text(size = 14),               # Increase axis title size
        axis.text = element_text(size = 10))+               # Increase axis text size
  theme(strip.background = element_rect(color="black", fill=FACETBACKGROUND, size=1.5, linetype="solid"))+
  labs(title = "",x = "")

resulting in:

Data

structure(list(Year = c(2024L, 2023L, 2022L, 2020L, 2023L, 2022L, 
2024L, 2020L, 2024L, 2022L, 2023L, 2020L, 2024L, 2020L, 2022L, 
2020L, 2022L, 2024L, 2023L, 2020L, 2022L, 2023L, 2024L, 2023L, 
2024L, 2020L, 2022L), delta = c(-0.245846153846154, -0.242934362934363, 
0.0490196078431371, 0.0603110504774897, -0.219285714285714, -0.184640522875817, 
-0.118315789473684, -0.0550147922191395, -0.592, 0.0336134453781511, 
0.139047619047619, 0.181280747447187, -0.0920000000000001, 0.561234127400567, 
1.07647058823529, -0.159860728663615, -0.0308464849354378, 0.0329999999999999, 
0.364047619047619, -0.112974017395813, -0.0897631779984723, 0.114805194805195, 
0.237268292682927, -0.38, 0.208, 0.393401959568399, 0.46218487394958
), count = c(26L, 37L, 51L, 125L, 24L, 36L, 19L, 89L, 9L, 14L, 
15L, 33L, 6L, 13L, 5L, 163L, 41L, 16L, 24L, 221L, 77L, 55L, 41L, 
14L, 5L, 88L, 14L), color = c("red", "red", "red", "red", "darkgreen", 
"darkgreen", "darkgreen", "darkgreen", "blue", "blue", "blue", 
"blue", "black", "black", "black", "orange", "orange", "orange", 
"orange", "purple", "purple", "purple", "purple", "#4778BB", 
"#4778BB", "#4778BB", "#4778BB"), labels = c("a/2024", "b/2023", 
"c/2022", "d/2020", "a/2023", "b/2022", "c/2024", "d/2020", "a/2024", 
"b/2022", "c/2023", "d/2020", "a/2024", "b/2020", "c/2022", "a/2020", 
"b/2022", "c/2024", "d/2023", "a/2020", "b/2022", "c/2023", "d/2024", 
"a/2023", "b/2024", "c/2020", "d/2022"), categ = c("A", "A", 
"A", "A", "B", "B", "B", "B", "C", "C", "C", "C", "D", "D", "D", 
"E", "E", "E", "E", "F", "F", "F", "F", "G", "G", "G", "G")), row.names = c(NA, 
-27L), class = c("tbl_df", "tbl", "data.frame"))

Homer Jay Simpson

Asked: 2024-12-24 00:57:13 +0800 CST

Add totals to a Likert plot in ggplot2 in R

5

I have this data frame in R called df:

df
# A tibble: 5 × 7
  var       n `Very \n Dissatisfied` Dissatisfied Neutral Satisfied `Very \n Satisfied`
  <chr> <int>                  <dbl>        <dbl>   <dbl>     <dbl>               <dbl>
1 A       106                   18.9         14.5    23.0      22.0                21.7
2 B       106                   19.2         16.0    25.5      18.9                20.4
3 C        87                   22.2         25.3    15.7      17.2                19.5
4 D       102                   19.0         19.0    21.2      22.9                18.0
5 E        99                   22.2         20.5    20.9      17.5                18.9

From this data frame I created two plots, a Likert plot and a bar plot:


df_long <- df %>%
  select(-n)%>%
  pivot_longer(!var, names_to = "Likert", values_to = "Percentage")
# Likert plot
likert_plot <- ggplot(df_long, aes(x = var, y = Percentage, fill = Likert)) +
  geom_col(position = position_likert(reverse = FALSE)) +
  geom_text(
    aes(
      label = label_percent_abs(hide_below = .01, accuracy = 1)(Percentage),
      color = after_scale(hex_bw(.data$fill))
    ),
    position = position_likert(vjust = 0.5, reverse = FALSE),
    size = 3.5
  ) +
  scale_y_continuous(labels = scales::percent) +
  labs(
    title = "Likert Responses by Category",
    x = "Category",
    y = "Percentage",
    fill = "Likert Scale"
  ) +
  coord_flip()+
  theme_minimal()+scale_fill_manual(values = custom_colors) +
  labs(x = NULL, y = NULL, fill = NULL)

# Horizontal bar plot
bar_plot <- ggplot(df, aes(x = n, y = var)) +
  geom_bar(stat = "identity", fill = "lightgrey") +
  geom_label(
    aes(
      label = label_number_abs(hide_below = .05, accuracy = 2)(n)
    ),
    size = 3.5,
    position = position_stack(vjust = 0.5), 
    hjust = 1,
    fill = NA,
    label.size = 0,
    color = "black"
  ) +
  scale_y_discrete(labels = \(x) gsub("\\..*$", "", x)) +
  scale_x_continuous(
    labels = label_percent_abs(),
    expand = c(0, .15)
  ) +
  theme_light() +
  theme(
    legend.position = "bottom",
    panel.grid.major.y = element_blank(),
    axis.text.x = element_blank()  # Hides x-axis numbers
  ) +
  labs(x = NULL, y = NULL, fill = NULL)

# Print plots

(likert_plot) +(bar_plot)+
  plot_layout(
   width = c(4,1)
  ) &
  theme(legend.position = "bottom")

which look like this:

I want to:

round the percentages inside each bar segment to 0 decimal places
make them bold
add totals on the right and on the left (i.e., the totals on the left are the sum of very dissatisfied and dissatisfied, and the totals on the right are the sum of satisfied and very satisfied)

How can I achieve this in R?

Data

dput(df)
structure(list(var = c("A", "B", "C", "D", "E"), n = c(106L, 
106L, 87L, 102L, 99L), `Very \n Dissatisfied` = c(18.8679245283019, 
19.1823899371069, 22.2222222222222, 18.9542483660131, 22.2222222222222
), Dissatisfied = c(14.4654088050314, 16.0377358490566, 25.2873563218391, 
18.9542483660131, 20.5387205387205), Neutral = c(22.9559748427673, 
25.4716981132075, 15.7088122605364, 21.2418300653595, 20.8754208754209
), Satisfied = c(22.0125786163522, 18.8679245283019, 17.2413793103448, 
22.8758169934641, 17.5084175084175), `Very \n Satisfied` = c(21.6981132075472, 
20.440251572327, 19.5402298850575, 17.9738562091503, 18.8552188552189
)), row.names = c(NA, -5L), class = c("tbl_df", "tbl", "data.frame"
))
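A minimal sketch of the totals step only, in base R, assuming the left total is the sum of the two dissatisfied columns and the right total the sum of the two satisfied columns. The two rows use the rounded values printed for A and B above; `totals` and the `*_label` columns are illustrative names, not part of the original code.

```r
# Two rows of the wide table, using the rounded values shown for A and B
df <- data.frame(
  var = c("A", "B"),
  "Very \n Dissatisfied" = c(18.9, 19.2),
  Dissatisfied = c(14.5, 16.0),
  Neutral = c(23.0, 25.5),
  Satisfied = c(22.0, 18.9),
  "Very \n Satisfied" = c(21.7, 20.4),
  check.names = FALSE  # keep the column names with spaces and newlines as they are
)

lower_cols <- c("Very \n Dissatisfied", "Dissatisfied")
higher_cols <- c("Satisfied", "Very \n Satisfied")

# One total per category and side
totals <- data.frame(
  var = df$var,
  lower = rowSums(df[, lower_cols]),
  higher = rowSums(df[, higher_cols])
)

# Labels rounded to 0 decimal places, ready to pass to a text or label layer
totals$lower_label <- paste0(round(totals$lower), "%")
totals$higher_label <- paste0(round(totals$higher), "%")
print(totals)
```

The bold part of the request is separate: geom_text() and geom_label() both accept fontface = "bold".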

Homer Jay Simpson

Asked: 2024-12-20 15:50:29 +0800 CST

Sort two plots in ggplot2 based on factor levels

6

I have a data frame called df that contains Likert data and a var column with 5 levels.

I want to sort both the Likert plot and the bar plot in a specific order. For example, I want them sorted from top to bottom as "A, C, D, E, B", not by the sum of the proportions:

library(tibble)
library(tidyverse)
library(ggplot2)
library(ggstats)

var_levels <- c(LETTERS[1:5])
n = 500
likert_levels = c(
  "Very \n Dissatisfied",
  "Dissatisfied",
  "Neutral",
  "Satisfied",
  "Very \n Satisfied"
)

df <- tibble(
  var = sample(var_levels, n, replace = TRUE),  
  val1 = sample(likert_levels, n, replace = TRUE),
  val2 = sample(likert_levels, n, replace = TRUE)
)

df
df2 = df%>%
  pivot_longer(!var, names_to = "Categories", values_to = "likert_values")%>%
  select(-Categories)
df2

library(tidyverse)
library(ggstats)
library(patchwork)

# Define the order of 'var' levels
desired_order <- c("A", "C", "D", "E", "B")

# Ensure 'var' is a factor with the specified order
dat <- df |>
  mutate(
    var = factor(var, levels = desired_order),
    across(-var, ~ factor(.x, likert_levels))
  ) |>
  pivot_longer(-var, names_to = "group") |>
  count(var, value, group) |>
  complete(var, value, group, fill = list(n = 0)) |>
  mutate(
    prop = n / sum(n),
    prop_lower = sum(prop[value %in% likert_levels[1:2]]),
    prop_higher = sum(prop[value %in% likert_levels[4:5]]),
    .by = c(var, group)
  ) |>
  arrange(group, prop_lower) |>
  mutate(
    y_sort = paste(var, group, sep = "."),
    y_sort = fct_inorder(y_sort)
  )

top10 <- dat |>
  distinct(group, var, prop_lower) |>
  slice_max(prop_lower, n = 10, by = group)

dat <- dat |>
  semi_join(top10)

dat_tot <- dat |>
  distinct(group, var, y_sort, prop_lower, prop_higher) |>
  pivot_longer(-c(group, var, y_sort),
               names_to = c(".value", "name"),
               names_sep = "_"
  ) |>
  mutate(
    hjust_tot = ifelse(name == "lower", 1, 0),
    x_tot = ifelse(name == "lower", -1, 1)
  )

dat_bar <- dat |> 
  summarise(
    n = sum(n), .by = c(y_sort, group)
  )

p1 <- ggplot(dat, aes(y = y_sort, x = prop, fill = value)) +
  geom_col(position = position_likert(reverse = FALSE)) +
  geom_text(
    aes(
      label = label_percent_abs(hide_below = .05, accuracy = 1)(prop),
      color = after_scale(hex_bw(.data$fill))
    ),
    position = position_likert(vjust = 0.5, reverse = FALSE),
    size = 3.5
  ) +
  geom_label(
    aes(
      x = x_tot,
      label = label_percent_abs(accuracy = 1)(prop),
      hjust = hjust_tot,
      fill = NULL
    ),
    data = dat_tot,
    size = 3.5,
    color = "black",
    fontface = "bold",
    label.size = 0,
    show.legend = FALSE
  ) +
  scale_y_discrete(labels = \(x) gsub("\\..*$", "", x)) +
  scale_x_continuous(
    labels = label_percent_abs(),
    expand = c(0, .15)
  ) +
  scale_fill_brewer(palette = "BrBG") +
  facet_wrap(~group,
             scales = "free_y", ncol = 1,
             strip.position = "right"
  ) +
  theme_light() +
  theme(
    legend.position = "bottom",
    panel.grid.major.y = element_blank(),
    strip.text = element_blank()
  ) +
  labs(x = NULL, y = NULL, fill = NULL)

p2 <- ggplot(dat_bar, aes(y = y_sort, x = n)) +
  geom_col() +
  geom_label(
    aes(
      label = label_number_abs(hide_below = .05, accuracy = 1)(n)
    ),
    size = 3.5,
    hjust = 1,
    fill = NA,
    label.size = 0,
    color = "white"
  ) +
  scale_y_discrete(labels = \(x) gsub("\\..*$", "", x)) +
  scale_x_continuous(
    labels = label_number_abs(),
    expand = c(0, 0, 0, .05)
  ) +
  facet_wrap(~group,
             scales = "free_y", ncol = 1,
             strip.position = "right"
  ) +
  theme_light() +
  theme(
    legend.position = "bottom",
    panel.grid.major.y = element_blank()
  ) +
  labs(x = NULL, y = NULL, fill = NULL)

# Combine the plots
p1 + p2 +
  plot_layout(
    guides = "collect") & 
  theme(legend.position = "bottom")

The problem is that df is the original data frame and df2 is the appended one. I combine the two so that the bar chart is drawn from the original data and the Likert plot from the appended data. Both plots should be sorted from top to bottom in the order "A, C, D, E, B". How can I do this in R?
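One common way to get such an ordering, sketched here under assumptions, is to convert the shared category column to a factor with the same explicit levels in both data frames before plotting. The column name var and the toy values below are hypothetical stand-ins for df and df2; only the target order "A, C, D, E, B" comes from the question.

```r
# Hypothetical stand-ins for the original (df) and appended (df2) data frames.
df  <- data.frame(var = c("A", "B", "C", "D", "E"), n = c(10, 4, 7, 3, 9))
df2 <- data.frame(var = c("E", "D", "C", "B", "A"), prop = c(0.2, 0.5, 0.1, 0.4, 0.3))

# Desired order, read from top to bottom.
wanted <- c("A", "C", "D", "E", "B")

# ggplot2 places the first level of a discrete y axis at the bottom,
# so the levels are reversed to make "A" appear at the top.
df$var  <- factor(df$var,  levels = rev(wanted))
df2$var <- factor(df2$var, levels = rev(wanted))

levels(df$var)
```

With y = var in both plots, the two panels would then share one row order. Any later re-sorting, for example arrange() followed by fct_inorder() on a derived column, would replace this order, so the explicit levels have to be applied to whichever column ends up on the y axis.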

Homer Jay Simpson

Asked: 2024-12-12 03:39:08 +0800 CST

Add a count column of occurrences

6

I have a table in SQL Server called df, found here:

-- Parameters
DECLARE @Year INT = 2020; --, @Country varchar(50)= 'Brazil';

WITH ModeData AS (
    SELECT country, 
           a.Mode
    FROM df
    CROSS APPLY (
        SELECT TOP 1 Mode, COUNT(*) AS cnt
        FROM (VALUES (val1), (val2), (val3)) AS t(Mode)
        GROUP BY Mode
        ORDER BY COUNT(*) DESC
    ) a
  where year=@year --and  country=@country 
)

-- Calculate proportions and map modes to labels
, Proportions AS (
    SELECT country, 
           CASE 
               WHEN Mode = 1 THEN 'Very Dissatisfied'
               WHEN Mode = 2 THEN 'Dissatisfied'
               WHEN Mode = 3 THEN 'Neutral'
               WHEN Mode = 4 THEN 'Satisfied'
               WHEN Mode = 5 THEN 'Very Satisfied'
           END AS SatisfactionLevel,
           COUNT(*) * 1.0 / SUM(COUNT(*)) OVER (PARTITION BY country) AS Proportion
    FROM ModeData
    GROUP BY country, Mode
)

-- Pivot the results to get each satisfaction level as a column
SELECT country, 
       [Very Dissatisfied], 
       [Dissatisfied], 
       [Neutral], 
       [Satisfied], 
       [Very Satisfied]
FROM Proportions
PIVOT (
    MAX(Proportion)
    FOR SatisfactionLevel IN ([Very Dissatisfied], [Dissatisfied], [Neutral], [Satisfied], [Very Satisfied])
) AS p
ORDER BY country;

The resulting table is:

Country	Very Dissatisfied	Dissatisfied	Neutral	Satisfied	Very Satisfied
Brazil	0.285714285714	0.142857142857	0.142857142857	0.142857142857	0.285714285714
Canada	0.1111111111111	0.1111111111111	0.333333333333	0.222222222222	0.222222222222
France	0.250000000000	0.125000000000	0.250000000000	0.250000000000	0.125000000000
Italy	0.166666666666	0.166666666666	0.166666666666	0.166666666666	0.333333333333
USA	0.222222222222	0.1111111111111	0.1111111111111	0.333333333333	0.222222222222

I want to calculate the count for each country, that is, how many rows each country has in the table df, and add that count as an extra column to the resulting table. Ideally, based on the toy example data, I want the result to look like this:

Country	Very Dissatisfied	Dissatisfied	Neutral	Satisfied	Very Satisfied	Count
Brazil	0.285714285714	0.142857142857	0.142857142857	0.142857142857	0.285714285714	7
Canada	0.1111111111111	0.1111111111111	0.333333333333	0.222222222222	0.222222222222	9
France	0.250000000000	0.125000000000	0.250000000000	0.250000000000	0.125000000000	8
Italy	0.166666666666	0.166666666666	0.166666666666	0.166666666666	0.333333333333	6
USA	0.222222222222	0.1111111111111	0.1111111111111	0.333333333333	0.222222222222	9

Homer Jay Simpson

Asked: 2024-12-12 02:29:50 +0800 CST

Row-wise statistical mode in a table function [duplicate]

5

I have a table in SQL Server called SurveyData, found here.

It looks like this:

-- Create the table
CREATE TABLE SurveyData 
(
    country VARCHAR(50),
    year INT,
    val1 INT,
    val2 INT,
    val3 INT
);

-- Insert 10 rows of data
INSERT INTO SurveyData (country, year, val1, val2, val3) 
VALUES ('USA', 2020, 4, 4, 5),
       ('Canada', 2021, 2, 4, 3),
       ('Germany', 2022, 5, 5, 4),
       ('France', 2020, 3, 4, 2),
       ('UK', 2021, 4, 2, 3),
       ('Australia', 2022, 3, 3, 4),
       ('Italy', 2020, 5, 5, 5),
       ('Spain', 2021, 1, 2, 3),
       ('Mexico', 2022, 4, 4, 4),
       ('Brazil', 2020, 2, 3, 1);

-- Show the table (the row-wise mode column is what I want to add)
SELECT * FROM SurveyData;

I want to create a table-valued function that takes two declared parameters, country = USA and year = 2021, and returns the table df filtered by country USA and year 2021, with a third column holding the statistical mode (the most frequent value) of the columns val1, val2 and val3, computed per row. How can I do this in SQL Server?
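To pin down what "mode per row" should return, here is a minimal sketch of the logic in R rather than T-SQL, so it illustrates the expected output and is not the requested function. The helper name row_mode, the three-row sample and the tie rule (the smallest value wins when all three values differ) are assumptions; the question does not say how ties should be resolved. The sample rows are copied from the INSERT statement above, where USA appears with year 2020.

```r
# Hypothetical helper: most frequent value among its arguments.
# table() sorts the distinct values in ascending order and which.max()
# returns the first maximum, so ties resolve to the smallest value.
row_mode <- function(...) {
  vals <- c(...)
  counts <- table(vals)
  as.integer(names(counts)[which.max(counts)])
}

# Three rows taken from the SurveyData sample.
survey <- data.frame(
  country = c("USA", "Canada", "France"),
  year    = c(2020L, 2021L, 2020L),
  val1    = c(4L, 2L, 3L),
  val2    = c(4L, 4L, 4L),
  val3    = c(5L, 3L, 2L),
  stringsAsFactors = FALSE
)

# Apply the helper row by row across the three value columns.
survey$mode <- mapply(row_mode, survey$val1, survey$val2, survey$val3)

# Filter by the two parameters, as the table function would.
filtered <- survey[survey$country == "USA" & survey$year == 2020L, ]
filtered
```

Only the USA row has a real majority (4, 4, 5 gives 4). The other two rows have three different values, so their result depends entirely on the assumed tie rule.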

Homer Jay Simpson

Asked: 2024-12-02 20:26:17 +0800 CST

Show percentages in specific categories in a Likert plot using ggplot2

8

I have a data frame in R called df:

# Load the packages first: tibble() comes from the tidyverse
library(tidyverse)
library(ggstats)

# Define categories and Likert levels
var_levels <- c("A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q")

likert_levels <- c(
  "Strongly disagree",
  "Disagree",
  "Neither agree nor disagree",
  "Agree",
  "Strongly agree"
)

# Set seed for reproducibility
set.seed(42)

# Create the data frame with one Likert response column
df <- tibble(
  var = sample(var_levels, 50, replace = TRUE),  # Random values from A to Q
  val1 = sample(likert_levels, 50, replace = TRUE) # Random values from Likert levels
)

dat <- df |>
  mutate(
    across(-var, ~ factor(.x, likert_levels))
  ) |>
  pivot_longer(-var, names_to = "group") |>
  count(var, value, group) |>
  complete(var, value, group, fill = list(n = 0)) |>
  mutate(
    prop = n / sum(n),
    prop_lower = sum(prop[value %in% c("Strongly disagree", "Disagree")]),
    prop_higher = sum(prop[value %in% c("Strongly agree", "Agree")]),
    .by = c(var, group)
  ) |>
  arrange(group, prop_lower) |>
  mutate(
    y_sort = paste(var, group, sep = "."),
    y_sort = fct_inorder(y_sort)
  )

top10 <- dat |>
  distinct(group, var, prop_lower) |>
  slice_max(prop_lower, n = 10, by = group)

dat <- dat |>
  semi_join(top10)
#> Joining with `by = join_by(var, group, prop_lower)`

dat_tot <- dat |>
  distinct(group, var, y_sort, prop_lower, prop_higher) |>
  pivot_longer(-c(group, var, y_sort),
               names_to = c(".value", "name"),
               names_sep = "_"
  ) |>
  mutate(
    hjust_tot = ifelse(name == "lower", 1, 0),
    x_tot = ifelse(name == "lower", -1, 1)
  )

I want to draw the Likert plot:

p1 <- ggplot(dat, aes(y = y_sort, x = prop, fill = value)) +
  geom_col(position = position_likert(reverse = FALSE)) +
  geom_text(
    aes(
      label = label_percent_abs(hide_below = .05, accuracy = 1)(prop),
      color = after_scale(hex_bw(.data$fill))
    ),
    position = position_likert(vjust = 0.5, reverse = FALSE),
    size = 3.5
  ) +
  geom_label(
    aes(
      x = x_tot,
      label = label_percent_abs(accuracy = 1)(prop),
      hjust = hjust_tot,
      fill = NULL
    ),
    data = dat_tot,
    size = 3.5,
    color = "black",
    fontface = "bold",
    label.size = 0,
    show.legend = FALSE
  ) +
  scale_y_discrete(labels = \(x) gsub("\\..*$", "", x)) +
  scale_x_continuous(
    labels = label_percent_abs(),
    expand = c(0, .15)
  ) +
  scale_fill_brewer(palette = "BrBG") +
  facet_wrap(~group,
             scales = "free_y", ncol = 1,
             strip.position = "right"
  ) +
  theme_light() +
  theme(
    legend.position = "bottom",
    panel.grid.major.y = element_blank()
  ) +
  labs(x = NULL, y = NULL, fill = NULL)

which produces the image. However, I want to see only the totals on the left and right plus the middle category, without the percentages for "Disagree" or "Agree". For example, in the last row of the image I want to show 33% on the left, 33% on the right and 33% on the white bar for the Likert category "Neither agree nor disagree".

How can I do this in R?
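One possible direction, sketched under assumptions: keep the geom_label() totals as they are and make the in-bar labels blank for every category except the middle one. The helper name middle_only_label is hypothetical, and it formats percentages with base R instead of label_percent_abs(), so the example runs without ggstats.

```r
# Hypothetical helper: percentage label for the kept category, empty otherwise.
middle_only_label <- function(value, prop, keep = "Neither agree nor disagree") {
  ifelse(value == keep, paste0(round(100 * abs(prop)), "%"), "")
}

# Toy input: one row per Likert level of a single bar.
value <- c("Strongly disagree", "Disagree", "Neither agree nor disagree",
           "Agree", "Strongly agree")
prop  <- c(0.20, 0.13, 0.34, 0.13, 0.20)

middle_only_label(value, prop)
```

In the plot above, this would replace the label mapping inside geom_text(), for example aes(label = middle_only_label(value, prop)). The comparison also works when value is a factor.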

Homer Jay Simpson

Asked: 2024-11-27 16:31:36 +0800 CST

Display text at the end and to the right of each line in ggplot2 in R [duplicate]

3

I have a data frame in R called df_ that looks like this:

df_
# A tibble: 40 × 4
# Groups:   Year [5]
    Year Country    mu Color 
   <int> <fct>   <dbl> <fct> 
 1  2019 ALPHA    68.9 red   
 2  2019 BETA     64.8 black 
 3  2019 GAMMA    70.0 yellow
 4  2019 RHO      65.2 gray  
 5  2019 DELTA    70.1 green 
 6  2019 EPSILON  69.6 pink  
 7  2019 THETA    69.8 purple
 8  2019 OMEGA    67.9 orange
 9  2020 ALPHA    69.3 red   
10  2020 BETA     65.2 black 
# ℹ 30 more rows

I want two things:

a) color each line with the color of the corresponding country
b) at the end of each line, display the country name and the last value (i.e. the value for 2024). For example, at the right end of the line display ("ALPHA ,76.4").

How can I do this in R using ggplot2?

ggplot(df_, aes(x = Year, y = mu, color = Color, group = Country)) +
  geom_line(linewidth = 1.5) +
  geom_point() +
  labs(x = "Years", y = "") +
  theme_minimal() +
  theme(legend.position = "none")
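The two goals can be prepared in the data before plotting. The sketch below, in base R, builds an end-of-line label for the last year only and a named color vector per country. The names df_demo, label and country_colors are hypothetical, and the four rows are rounded values for ALPHA and BETA taken from the data in this question. One thing to watch in the real data: df_ is grouped by Year (see the tibble header), so a grouped max(Year) would be evaluated within each year and every row would get a label. The sketch uses a plain, ungrouped data frame.

```r
# Small ungrouped excerpt (values rounded from the question's data).
df_demo <- data.frame(
  Year    = c(2023L, 2024L, 2023L, 2024L),
  Country = c("ALPHA", "ALPHA", "BETA", "BETA"),
  mu      = c(75.56, 76.40, 70.46, 67.91),
  Color   = c("red", "red", "black", "black"),
  stringsAsFactors = FALSE
)

# Label only the rows of the overall last year: "COUNTRY, value".
last_year <- max(df_demo$Year)
df_demo$label <- ifelse(
  df_demo$Year == last_year,
  sprintf("%s, %.1f", df_demo$Country, df_demo$mu),
  NA_character_
)

# Named vector mapping each country to its own color.
pairs <- unique(df_demo[, c("Country", "Color")])
country_colors <- setNames(pairs$Color, pairs$Country)

df_demo$label
country_colors
```

With these in place, mapping color = Country and adding scale_color_manual(values = country_colors) should draw each line in its own color, and a text layer such as geom_text(aes(label = label), hjust = 0, na.rm = TRUE) would place the labels at the right end of the lines. With dplyr, calling ungroup() before computing the label avoids the per-year maximum.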

Data

structure(list(Year = c(2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 
2019L, 2019L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 
2020L, 2022L, 2022L, 2022L, 2022L, 2022L, 2022L, 2022L, 2022L, 
2023L, 2023L, 2023L, 2023L, 2023L, 2023L, 2023L, 2023L, 2024L, 
2024L, 2024L, 2024L, 2024L, 2024L, 2024L, 2024L), Country = structure(c(1L, 
2L, 5L, 7L, 3L, 4L, 8L, 6L, 1L, 2L, 5L, 7L, 3L, 4L, 8L, 6L, 1L, 
2L, 5L, 7L, 3L, 4L, 8L, 6L, 1L, 2L, 5L, 7L, 3L, 4L, 8L, 6L, 1L, 
2L, 5L, 7L, 3L, 4L, 8L, 6L), levels = c("ALPHA", "BETA", "DELTA", 
"EPSILON", "GAMMA", "OMEGA", "RHO", "THETA"), class = "factor"), 
    mu = c(68.855, 64.77, 69.9875, 65.22, 70.1266666666667, 69.6166666666667, 
    69.8085714285714, 67.9093333333333, 69.2675, 65.2, 72.4075, 
    69.49, 72.28, 69.262, 70.07125, 65.3864285714286, 74.6584615384615, 
    67.77, 75.3533333333333, 73, 64.09, 73.1715384615385, 66.058, 
    72.12, 75.5645833333333, 70.46, 78.2933333333333, 79.07, 
    59.82, 79.6361538461538, 74.225, 69.5871428571429, 76.4007407407407, 
    67.91, 76.805, 77.31, 74.0966666666667, 81.2811764705882, 
    74.6671428571428, 78.0316666666667), Color = structure(c(7L, 
    1L, 8L, 2L, 3L, 5L, 6L, 4L, 7L, 1L, 8L, 2L, 3L, 5L, 6L, 4L, 
    7L, 1L, 8L, 2L, 3L, 5L, 6L, 4L, 7L, 1L, 8L, 2L, 3L, 5L, 6L, 
    4L, 7L, 1L, 8L, 2L, 3L, 5L, 6L, 4L), levels = c("black", 
    "gray", "green", "orange", "pink", "purple", "red", "yellow"
    ), class = "factor")), class = c("grouped_df", "tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -40L), groups = structure(list(
    Year = c(2019L, 2020L, 2022L, 2023L, 2024L), .rows = structure(list(
        1:8, 9:16, 17:24, 25:32, 33:40), ptype = integer(0), class = c("vctrs_list_of", 
    "vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -5L), .drop = TRUE))

Edit

Adding:

  mutate(label = if_else(Year == max(Year), as.character(Country), NA_character_))

to the data frame, and in the plot:

  theme(legend.position = "none")  +
  geom_label_repel(aes(label = label),
                   nudge_x = 1,
                   na.rm = TRUE)

produces:
