
Question

Homer Jay Simpson

Asked: 2024-12-30 23:39:12 +0800 CST

Table function in SQL Server with multiple parameters as arguments

I have a table in SQL Server 2016 called df:

-- Create a new table with department and gender columns
CREATE TABLE df 
(
    country VARCHAR(50),
    year INT,
    val1 INT,
    val2 INT,
    val3 INT,
    department VARCHAR(50),
    gender VARCHAR(10)
);

-- Insert data into the new table, including department and gender
INSERT INTO df (country, year, val1, val2, val3, department, gender) 
VALUES ('USA', 2020, 4, 4, 5, 'Sales', 'Male'),
('USA', 2020, 4, 4, 5, 'Sales', 'Male'),
('USA', 2020, 5, 5, 5, 'Sales', 'Female'),
('USA', 2020, 5, 5, 5, 'Sales', 'Female'),
('USA', 2020, 1, 1, 5, 'Sales', 'Male'),
('USA', 2020, 3, 3, 5, 'Sales', 'Female'),
('USA', 2020, 4, 2, 5, 'Sales', 'Male'),
('USA', 2020, 1, 1, 5, 'Sales', 'Female'),
('USA', 2020, 2, 2, 5, 'Sales', 'Male'),
('Canada', 2020, 2, 2, 3, 'HR', 'Female'),
('Canada', 2020, 2, 2, 3, 'HR', 'Female'),
('Canada', 2020, 2, 2, 3, 'HR', 'Male'),
('Canada', 2020, 2, 2, 3, 'HR', 'Male'),
('Canada', 2020, 5, 5, 3, 'HR', 'Female'),
('Canada', 2020, 5, 5, 3, 'HR', 'Male'),
('Canada', 2020, 1, 1, 3, 'HR', 'Female'),
('Canada', 2020, 1, 1, 3, 'HR', 'Male'),
('Canada', 2020, 3, 4, 3, 'HR', 'Female'),
('Canada', 2020, 3, 4, 3, 'HR', 'Male'),
('Canada', 2020, 5, 4, 3, 'HR', 'Female'),
('Canada', 2020, 5, 4, 5, 'HR', 'Male'),
('Canada', 2020, 5, 4, 5, 'HR', 'Female'),
('Germany', 2022, 5, 5, 4, 'IT', 'Male'),
('France', 2020, 1, 1, 2, 'Finance', 'Female'),
('France', 2020, 1, 1, 2, 'Finance', 'Female'),
('France', 2020, 3, 2, 2, 'Finance', 'Male'),
('France', 2020, 3, 4, 2, 'Finance', 'Female'),
('France', 2020, 3, 5, 5, 'Finance', 'Male'),
('France', 2020, 3, 4, 4, 'Finance', 'Female'),
('France', 2020, 3, 4, 4, 'Finance', 'Male'),
('France', 2020, 3, 4, 3, 'Finance', 'Female'),
('UK', 2021, 4, 2, 3, 'Marketing', 'Male'),
('Australia', 2022, 3, 3, 4, 'Support', 'Female'),
('Italy', 2020, 5, 5, 5, 'Operations', 'Male'),
('Italy', 2020, 5, 5, 5, 'Operations', 'Female'),
('Italy', 2020, 5, 1, 1, 'Operations', 'Male'),
('Italy', 2020, 4, 4, 1, 'Operations', 'Female'),
('Italy', 2020, 2, 1, 2, 'Operations', 'Male'),
('Italy', 2020, 3, 5, 3, 'Operations', 'Female'),
('Spain', 2021, 1, 2, 3, 'Customer Service', 'Male'),
('Mexico', 2022, 4, 4, 4, 'Logistics', 'Female'),
('Brazil', 2020, 4, 1, 1, 'R&D', 'Male'),
('Brazil', 2020, 4, 1, 1, 'R&D', 'Female'),
('Brazil', 2020, 4, 3, 4, 'R&D', 'Male'),
('Brazil', 2020, 5, 3, 5, 'R&D', 'Female'),
('Brazil', 2020, 5, 3, 5, 'R&D', 'Male'),
('Brazil', 2020, 3, 3, 1, 'R&D', 'Female'),
('Brazil', 2020, 2, 3, 1, 'R&D', 'Male');

-- Select all rows from the new table to check the data
SELECT * FROM df;

With this table, I compute some percentages and a count column based on a few filters.

-- Parameters
DECLARE @Year INT = 2020;
DECLARE @Metric VARCHAR(50) = 'count'; 
DECLARE @Gender VARCHAR(20) = NULL; -- Set to specific gender (e.g., 'Male', 'Female') or NULL to include all
DECLARE @Department VARCHAR(50) = NULL; -- Set to specific department (e.g., 'HR', 'Engineering') or NULL to include all
-- Set @Metric to 'dissatisfaction', 'satisfaction', or 'count'

WITH UnpivotedData AS 
(
    SELECT country, gender, department, year, Vals
    FROM 
        (SELECT country, gender, department, year, val1, val2, val3
         FROM df) AS SourceTable
    UNPIVOT 
        (Vals FOR ValueColumn IN (val1, val2, val3)) AS Unpivoted
    WHERE year = @Year
),
Proportions AS 
(
    SELECT 
        country,
        gender,
        department,
        CASE 
            WHEN Vals = 1 THEN 'Very Dissatisfied'
            WHEN Vals = 2 THEN 'Dissatisfied'
            WHEN Vals = 3 THEN 'Neutral'
            WHEN Vals = 4 THEN 'Satisfied'
            WHEN Vals = 5 THEN 'Very Satisfied'
        END AS SatisfactionLevel,
        COUNT(*) * 1.0 / SUM(COUNT(*)) OVER (PARTITION BY country, gender, department) AS Proportion
    FROM 
        UnpivotedData
    GROUP BY 
        country, gender, department, Vals
),
Pivoted AS 
(
    SELECT country, gender, department, 
           [Very Dissatisfied], 
           [Dissatisfied], 
           [Neutral], 
           [Satisfied], 
           [Very Satisfied]
    FROM Proportions
    PIVOT 
        (MAX(Proportion)
         FOR SatisfactionLevel IN ([Very Dissatisfied], [Dissatisfied], [Neutral], [Satisfied], [Very Satisfied])) AS p
),
CountryCounts AS 
(
    SELECT 
        CASE WHEN country IS NULL THEN 'Unknown' ELSE country END AS country,
        gender, 
        department,
        COUNT(*) AS Total
    FROM df
    WHERE year = @Year
    -- Apply filters for gender and department if provided
    AND (@Gender IS NULL OR gender = @Gender)
    AND (@Department IS NULL OR department = @Department)
    GROUP BY CASE WHEN country IS NULL THEN 'Unknown' ELSE country END, gender, department
),
OrderedData AS 
(
    SELECT 
        p.country,
        p.gender,
        p.department,
        [Very Dissatisfied],
        [Dissatisfied],
        [Neutral],
        [Satisfied],
        [Very Satisfied],
        c.Total,
        CASE 
            WHEN @Metric = 'satisfaction' THEN ISNULL([Satisfied], 0) + ISNULL([Very Satisfied], 0)
            WHEN @Metric = 'dissatisfaction' THEN ISNULL([Very Dissatisfied], 0) + ISNULL([Dissatisfied], 0)
            WHEN @Metric = 'count' THEN c.Total
        END AS SortValue
    FROM Pivoted AS p
    INNER JOIN CountryCounts AS c ON p.country = c.country AND p.gender = c.gender AND p.department = c.department
)
SELECT 
    country,
    gender,
    department,
    [Very Dissatisfied],
    [Dissatisfied],
    [Neutral],
    [Satisfied],
    [Very Satisfied],
    Total
FROM 
    OrderedData
ORDER BY 
    SortValue DESC;

I want to create a table function that takes 3 arguments:

Metric
Year
Factor

Factor can be Gender, Department, or both. If, for example, Factor is Gender, the table should be grouped by Gender; if it is Department, it should be grouped by Department.

If it is both, the table should be grouped by both. If Factor is NULL or left at its default, the table should not be grouped by either.

Regarding Year: if Year is passed, group by year. If Year is NULL, show all years without grouping.

Is there a way to do this in SQL Server?

I have a fiddle here

2 Answers

Voted

Charlieface · Answer 1 · 2024-12-31T07:46:41+08:00

As I said on your previous SQL question, you are overcomplicating this a lot.

You can do the filtering, unpivoting, and pivoting in a single CTE level, and you only need one level to add the Total. Even that would not be necessary if there were an ID column, because then you could just use COUNT(DISTINCT ID).
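As a rough sketch of that last point, assuming df had a unique key column (called id here; it is hypothetical and does not exist in the table from the question), the number of source rows per group could be read directly after unpivoting:

```sql
-- Assumption: df has a hypothetical unique key column named id.
SELECT
    country, gender, department, year,
    COUNT(DISTINCT id) AS Total,   -- source rows per group, unaffected by the 3x fan-out
    COUNT(*)           AS Answers  -- unpivoted values per group (3 per source row)
FROM df
CROSS APPLY (VALUES (val1), (val2), (val3)) AS v(Vals)
GROUP BY country, gender, department, year;
```

Without such a key, the row count has to be captured before the unpivot, which is what the windowed COUNT(*) in the AllRows CTE does.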

To create a function, just add the normal CREATE FUNCTION syntax. You cannot add ORDER BY to a table function; it is basically just a view. You need to add the ordering to the outer query.

CREATE OR ALTER FUNCTION dbo.MyAggregation (
    @Year INT,
    @Gender VARCHAR(20), -- Set to specific gender (e.g., 'Male', 'Female') or NULL to include all
    @Department VARCHAR(50), -- Set to specific department (e.g., 'HR', 'Engineering') or NULL to include all
    @Metric VARCHAR(50) -- Set @Metric to 'dissatisfaction', 'satisfaction', or 'count'
)
RETURNS TABLE
AS RETURN

WITH AllRows AS (
      SELECT *,
          COUNT(*) OVER (PARTITION BY country, gender, department, year) AS Total
      FROM df
      WHERE (year = @Year OR @Year IS NULL)
        AND (department = @Department OR @Department IS NULL)
        AND (Gender = @Gender OR @Gender IS NULL)
)
SELECT
      country,
      gender,
      department,
      year,
      COUNT(CASE WHEN Vals = 1 THEN 1 END) * 1.0
         / SUM(COUNT(*)) OVER (PARTITION BY country, gender, department, year) AS [Very Dissatisfied],
      COUNT(CASE WHEN Vals = 2 THEN 1 END) * 1.0
         / SUM(COUNT(*)) OVER (PARTITION BY country, gender, department, year) AS [Dissatisfied],
      COUNT(CASE WHEN Vals = 3 THEN 1 END) * 1.0
         / SUM(COUNT(*)) OVER (PARTITION BY country, gender, department, year) AS [Neutral],
      COUNT(CASE WHEN Vals = 4 THEN 1 END) * 1.0
         / SUM(COUNT(*)) OVER (PARTITION BY country, gender, department, year) AS [Satisfied],
      COUNT(CASE WHEN Vals = 5 THEN 1 END) * 1.0
         / SUM(COUNT(*)) OVER (PARTITION BY country, gender, department, year) AS [Very Satisfied],
      MIN(Total) AS Total,
      CASE @Metric
        WHEN 'satisfaction' THEN
          COUNT(CASE WHEN Vals IN (4, 5) THEN 1 END) * 1.0
            / SUM(COUNT(*)) OVER (PARTITION BY country, gender, department, year)
        WHEN 'dissatisfaction' THEN
          COUNT(CASE WHEN Vals IN (1, 2) THEN 1 END) * 1.0
           / SUM(COUNT(*)) OVER (PARTITION BY country, gender, department, year)
        WHEN 'count' THEN MIN(Total)
      END AS SortValue
FROM AllRows
CROSS APPLY (VALUES
        ('val1', val1),
        ('val2', val2),
        ('val3', val3)
) v(ValueColumn, Vals)
GROUP BY
    country,
    gender,
    department,
    year;

Then you just do

SELECT *
FROM dbo.MyAggregation(2020, NULL, NULL, 'count')
ORDER BY
    SortValue;

db<>fiddle

Note that the sort-value parameter (@Metric) should not be passed from a variable or a lateral join, as that will make your query very slow. If it is a constant string, the optimizer can factor it out.

Adding dynamic grouping complicates this substantially, because you now need to null out the values before grouping by them (as shown in the other answer). It will also be very slow on large tables, because you cannot use indexes. I strongly recommend creating separate functions with different grouping/partitioning constructs, or alternatively doing this in dynamic SQL.
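A minimal sketch of the dynamic SQL route, under stated assumptions: the flags @GroupByGender and @GroupByDepartment are hypothetical names chosen for illustration, and the batch would have to live in a stored procedure or ad hoc script, since a function cannot execute dynamic SQL. It avoids STRING_AGG so that it stays within SQL Server 2016.

```sql
-- Sketch only: @GroupByGender and @GroupByDepartment are hypothetical flags.
DECLARE @Year INT = 2020,
        @GroupByGender BIT = 1,
        @GroupByDepartment BIT = 0;

-- Column names come from a fixed whitelist, never from user input,
-- so nothing untrusted is concatenated into the statement.
DECLARE @cols NVARCHAR(200) = N'country'
    + IIF(@GroupByGender = 1, N', gender', N'')
    + IIF(@GroupByDepartment = 1, N', department', N'');

DECLARE @sql NVARCHAR(MAX) = N'
SELECT ' + @cols + N',
       COUNT(CASE WHEN Vals IN (4, 5) THEN 1 END) * 1.0 / COUNT(*) AS Satisfaction,
       COUNT(CASE WHEN Vals IN (1, 2) THEN 1 END) * 1.0 / COUNT(*) AS Dissatisfaction,
       COUNT(*) / 3 AS Total  -- each source row contributes exactly 3 unpivoted rows
FROM df
CROSS APPLY (VALUES (val1), (val2), (val3)) AS v(Vals)
WHERE (year = @Year OR @Year IS NULL)
GROUP BY ' + @cols + N';';

-- The year stays a real parameter; only the column list is spliced in.
EXEC sys.sp_executesql @sql, N'@Year INT', @Year = @Year;
```

Because the GROUP BY list is literal text in the generated statement, each combination of flags gets its own plan, which is the main reason to prefer this over a single query with conditional grouping.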

ValNik · Answer 2 · 2024-12-31T05:32:56+08:00

We apply value substitution to the column being grouped by. For example, if the parameter @factorGender is NULL, we group by the value of gender; otherwise we group by the constant value 'all', which in effect means no grouping by gender.
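In isolation, the substitution can be sketched as follows. This is a simplified illustration against the df table from the question, not the full function:

```sql
-- NULL: keep grouping by gender. Any non-NULL value: collapse gender into that constant.
DECLARE @factorGender VARCHAR(50) = 'all';

SELECT d.grGender, COUNT(*) AS total
FROM (
    SELECT COALESCE(@factorGender, gender) AS grGender
    FROM df
    WHERE year = 2020
) AS d
GROUP BY d.grGender;
```

With @factorGender set to 'all', every row gets the same grouping key, so the query returns a single row. With @factorGender set to NULL, COALESCE falls through to the gender column and the query returns one row per gender.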

To simplify things, the subquery is the equivalent of your UNPIVOT-PIVOT operation. We can directly count the distribution of values across val1, val2, val3. The expression

iif(val1=1,1.0,0)+iif(val2=1,1.0,0)+iif(val3=1,1.0,0) cv1

or, written as

case  when val1=1 then 1.0 else 0 end
   +case when val2=1 then 1.0 else 0 end
   +case when val3=1 then 1.0 else 0 end as cv1

counts, for every row, how many of val1, val2, val3 are equal to 1.
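A small worked example of that expression, using three literal rows rather than the df table (the first two triples also appear in the sample data; the third is made up for illustration):

```sql
SELECT val1, val2, val3,
       IIF(val1 = 1, 1.0, 0) + IIF(val2 = 1, 1.0, 0) + IIF(val3 = 1, 1.0, 0) AS cv1
FROM (VALUES (5, 1, 1),
             (4, 4, 5),
             (1, 1, 1)) AS t(val1, val2, val3);
-- cv1 is 2.0, 0.0 and 3.0 for the three rows respectively
```

Summing cv1 over a group and dividing by three times the row count then gives the share of 'Very Dissatisfied' answers in that group.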

See the example

Update 1. After @DaleK's comment, I decided that I really needed to answer the question more precisely and provide an example of a function along with an example of its use.


CREATE FUNCTION SatisfiedByGroups (
 -- -- Parameters for row filters
    @Year INT = 2020
   ,@Metric VARCHAR(50) ='dissatisfaction' -- Set @Metric to 'satisfaction','dissatisfaction', 'count' ; 
   ,@Gender VARCHAR(20) = NULL  -- Set to specific gender (e.g., 'Male', 'Female') or NULL to include all
   ,@Department VARCHAR(50) = NULL  -- Set to specific department (e.g., 'HR', 'Engineering') or NULL to include all

-- parameters for grouping
   ,@factorYear int = null -- 0 ; -- Set to any value if do not group by, else null
   ,@factorgender VARCHAR(50) = null  --'all'  -- Set to any value if do not group by, else null
   ,@factorDepartment VARCHAR(50) = null --'all'  -- Set to value if do not group by, else null
   ,@factorCountry VARCHAR(50) = null -- 'all' -- Set to value if do not group by, else null
  )
RETURNS TABLE
AS
RETURN (
select grCountry country,grGender gender,grDepartment department
  ,sum(cv1)/count(*)/3.0 [Very Dissatisfied]
  ,sum(cv2)/count(*)/3.0 [Dissatisfied]
  ,sum(cv3)/count(*)/3.0 [Neutral]
  ,sum(cv4)/count(*)/3.0 [Satisfied]
  ,sum(cv5)/count(*)/3.0 [Very Satisfied]
  ,count(*) total
  ,CASE 
      WHEN @Metric = 'satisfaction' THEN (sum(cv4)+sum(cv5))/3.0/count(*)
      WHEN @Metric = 'dissatisfaction' THEN (sum(cv1)+sum(cv2))/3.0/count(*)
      WHEN @Metric = 'count' THEN count(*)
   END AS SortValue
  ,grCountry,grYear,grDepartment,grGender,@metric as metric
from(
select coalesce(country,'unknown') country,year ,gender,department
  ,iif(val1=1,1.0,0)+iif(val2=1,1.0,0)+iif(val3=1,1.0,0) cv1 
  ,iif(val1=2,1.0,0)+iif(val2=2,1.0,0)+iif(val3=2,1.0,0) cv2
  ,iif(val1=3,1.0,0)+iif(val2=3,1.0,0)+iif(val3=3,1.0,0) cv3
  ,iif(val1=4,1.0,0)+iif(val2=4,1.0,0)+iif(val3=4,1.0,0) cv4
  ,iif(val1=5,1.0,0)+iif(val2=5,1.0,0)+iif(val3=5,1.0,0) cv5
  ,coalesce(@factorYear,year) grYear
  ,coalesce(@FactorDepartment,department)grDepartment
  ,coalesce(@factorGender,gender) grGender
  ,coalesce(@factorCountry,coalesce(country,'UNKNOWN')) grCountry
from df
WHERE (year = @Year or @Year is null)
      -- Apply filters for gender and department if provided
    AND (@Gender IS NULL OR gender = @Gender)
    AND (@Department IS NULL OR department = @Department)
)d 
group by grCountry,grYear,grDepartment,grGender
);

And call this function

select * from SatisfiedByGroups(
        2020 -- @year
       ,'count' -- @metric
       ,NULL    -- @gender
       ,NULL    -- @department
       ,null -- @factorYear
       ,null -- @factorgender
       ,null -- @factorDepartment
       ,null -- @factorCountry
)

The output is

country	gender	department	Very Dissatisfied	Dissatisfied	Neutral	Satisfied	Very Satisfied	total	SortValue	grCountry	grYear	grDepartment	grGender	metric
Canada	Female	HR	0.095238	0.190476	0.333333	0.142857	0.238095	7	7.000000	Canada	2020	HR	Female	count
Canada	Male	HR	0.111111	0.222222	0.333333	0.111111	0.222222	6	6.000000	Canada	2020	HR	Male	count
France	Female	Finance	0.266666	0.200000	0.266666	0.266666	0.000000	5	5.000000	France	2020	Finance	Female	count
USA	Male	Sales	0.133333	0.200000	0.000000	0.333333	0.333333	5	5.000000	USA	2020	Sales	Male	count
USA	Female	Sales	0.166666	0.000000	0.166666	0.000000	0.666666	4	4.000000	USA	2020	Sales	Female	count
Brazil	Male	R&D	0.250000	0.083333	0.250000	0.250000	0.166666	4	4.000000	Brazil	2020	R&D	Male	count
Brazil	Female	R&D	0.333333	0.000000	0.333333	0.111111	0.222222	3	3.000000	Brazil	2020	R&D	Female	count
France	Male	Finance	0.000000	0.222222	0.333333	0.222222	0.222222	3	3.000000	France	2020	Finance	Male	count
Italy	Female	Operations	0.111111	0.000000	0.222222	0.222222	0.444444	3	3.000000	Italy	2020	Operations	Female	count
Italy	Male	Operations	0.333333	0.222222	0.000000	0.000000	0.444444	3	3.000000	Italy	2020	Operations	Male	count

Then try running the query with a different grouping condition.

select * from SatisfiedByGroups(
        2020 -- @year
       ,'dissatisfaction' -- @metric
       ,NULL    -- @gender
       ,NULL    -- @department
       ,null  -- @factorYear
       ,'all' -- @factorgender
       ,'all' -- @factorDepartment
       ,null -- @factorCountry
)

country	gender	department	Very Dissatisfied	Dissatisfied	Neutral	Satisfied	Very Satisfied	total	SortValue	grCountry	grYear	grDepartment	grGender	metric
France	all	all	0.166666	0.208333	0.291666	0.250000	0.083333	8	0.375000	France	2020	all	all	dissatisfaction
Italy	all	all	0.222222	0.111111	0.111111	0.111111	0.444444	6	0.333333	Italy	2020	all	all	dissatisfaction
Brazil	all	all	0.285714	0.047619	0.285714	0.190476	0.190476	7	0.333333	Brazil	2020	all	all	dissatisfaction
Canada	all	all	0.102564	0.205128	0.333333	0.128205	0.230769	13	0.307692	Canada	2020	all	all	dissatisfaction
USA	all	all	0.148148	0.111111	0.074074	0.185185	0.481481	9	0.259259	USA	2020	all	all	dissatisfaction

fiddle

If the IIF(...) function is not available on your server, rewrite the expression using case when ... end

  ,case  when val1=1 then 1.0 else 0 end
   +case when val2=1 then 1.0 else 0 end
   +case when val3=1 then 1.0 else 0 end as cv1
  ,case  when val1=2 then 1.0 else 0 end
   +case when val2=2 then 1.0 else 0 end
   +case when val3=2 then 1.0 else 0 end as cv2
  ,case  when val1=3 then 1.0 else 0 end
   +case when val2=3 then 1.0 else 0 end
   +case when val3=3 then 1.0 else 0 end as cv3
  ,case  when val1=4 then 1.0 else 0 end
   +case when val2=4 then 1.0 else 0 end
   +case when val3=4 then 1.0 else 0 end as cv4
  ,case  when val1=5 then 1.0 else 0 end
   +case when val2=5 then 1.0 else 0 end
   +case when val3=5 then 1.0 else 0 end as cv5
--   ,iif(val1=1,1.0,0)+iif(val2=1,1.0,0)+iif(val3=1,1.0,0) cv1 
--   ,iif(val1=2,1.0,0)+iif(val2=2,1.0,0)+iif(val3=2,1.0,0) cv2
--  ,iif(val1=3,1.0,0)+iif(val2=3,1.0,0)+iif(val3=3,1.0,0) cv3
--  ,iif(val1=4,1.0,0)+iif(val2=4,1.0,0)+iif(val3=4,1.0,0) cv4
--  ,iif(val1=5,1.0,0)+iif(val2=5,1.0,0)+iif(val3=5,1.0,0) cv5

fiddle
