Dados dois números n
e m
, quero gerar uma série da forma
1, 2, ..., (n-1), n, n, (n-1), ... 2, 1
e repita várias m
vezes.
Por exemplo, para n = 3
e m = 4
, quero uma sequência dos seguintes 24 números:
1, 2, 3, 3, 2, 1, 1, 2, 3, 3, 2, 1, 1, 2, 3, 3, 2, 1, 1, 2, 3, 3, 2, 1
---------------- ---------------- ---------------- ----------------
Eu sei como obter esse resultado no PostgreSQL por um dos dois métodos:
Usando a seguinte consulta, que usa a generate_series
função, e alguns macetes para garantir que a ordem seja a correta:
WITH parameters (n, m) AS
(
VALUES (3, 5)
)
SELECT
xi
FROM
(
SELECT
i, i AS xi
FROM
parameters, generate_series(1, parameters.n) AS x(i)
UNION ALL
SELECT
i + parameters.n, parameters.n + 1 - i AS xi
FROM
parameters, generate_series(1, parameters.n) AS x(i)
) AS s0
CROSS JOIN
generate_series (1, (SELECT m FROM parameters)) AS x(j)
ORDER BY
j, i ;
... ou use uma função para o mesmo propósito, com loops adjuntos e aninhados:
CREATE FUNCTION generate_up_down_series(
_elements /* n */ integer,
_repetitions /* m */ integer)
RETURNS SETOF integer AS
$BODY$
declare
j INTEGER ;
i INTEGER ;
begin
for j in 1 .. _repetitions loop
for i in 1 .. _elements loop
return next i ;
end loop ;
for i in reverse _elements .. 1 loop
return next i ;
end loop ;
end loop ;
end ;
$BODY$
LANGUAGE plpgsql IMMUTABLE STRICT ;
Como eu poderia fazer o equivalente no SQL padrão ou no Transact-SQL/SQL Server?
Postgres
Você pode fazê-lo funcionar com uma matemática simples
generate_series()
e básica (ver funções matemáticas ).Embrulhado em uma função SQL simples:
Ligar:
Gera o resultado desejado. n e m podem ser qualquer inteiro onde n*2*m não estoura
int4
.Como?
Na subconsulta:
Gere o número total desejado de linhas ( n*2*m ), com um número crescente simples. Eu o nomeio
n2m
. 0 a N-1 (não 1 a N ) para simplificar a seguinte operação de módulo .Considere % n*2 (
%
é o operador de módulo) para obter uma série de n números crescentes, m vezes. Eu o nomeion2
.Na consulta externa:
Adicione 1 à metade inferior ( n2 < n ).
Para a metade superior ( n2 >= n ) espelho da metade inferior com n*2 - n2 .
Acrescentei
ORDER BY
para garantir o pedido solicitado. Com as versões atuais do Postgres também funciona semORDER BY
a consulta simples - mas não necessariamente em consultas mais complexas! Isso é um detalhe de implementação (e não vai mudar), mas não obrigatório pelo padrão SQL.Infelizmente,
generate_series()
é específico do Postgres e não SQL padrão, como foi comentado. Mas podemos reutilizar a mesma lógica:SQL padrão
Você pode gerar os números de série com um CTE recursivo em vez de
generate_series()
, ou, de forma mais eficiente para uso repetido, criar uma tabela com números inteiros de série uma vez. Qualquer um pode ler, ninguém pode escrever nele!Então, o acima
SELECT
se torna ainda mais simples:TL;DR
This is a long one, so I'm putting the best (i.e. fastest of my) methods here first. It makes use of the INTARRAY extension - for parameters of 340 and 570, it takes 21ms*. The second best (22 ms - same parameters) only uses standard PostgreSQL constructs. If anyone else comes up with (a) faster method(s), I will put them here and "retire" mine!
* 8 GB RAM, Intel i5 11th gen CPU, Quad core - 8 threads, NVMe 256 GB drive
</TL;DR>
Introduction:
This question intrigued me (+1) and I pondered
a) how to answer it and
b) how could the answer be generalised?
All of the code below can be found in the various fiddles - per method/contributor.
Interestingly, nobody who's answered so far (10 answers - and some very good quality SQL/programming to boot!) has made use of
ARRAY
s (tutorial), which I believe are very helpful in this case.In a general sense, my approach has been to generate the first series as an ARRAY (
{1, 2,.. n, n,..2, 1}
) and then generatem
of these for the complete task.I took
threefive approaches:the first is PostgreSQL specific, making use of
GENERATE_SERIES()
(tutorial) andARRAY
s. There's also a function call that can be replaced with aRECURSIVE CTE
("RCTE"
- see tutorial).the second (also PostgreSQL specific) combines an
RCTE
with a call toGENERATE_SERIES()
andARRAY
s. TheGENERATE_SERIES()
can be replaced with anRCTE
- see solution 3.the third solution is
"Pure SQL"
and should work on any RDBMS that supportsRCTE
s (SQL Server for example) - can also use anumbers
table (i.e. a sequence) Following discussions on the dba chatroom, I have removed the requirement forRCTE
s to be used to construct sequences.then there are two "speedy" solutions which rely on the fact that
UNNEST()
preserves order.Preparation step:
I used a table called
param
to store the values (3
&5
) as per the question - these can obviously be changed. Also, for the benchmarking steps (see below), I used this param table for all queries tested so that there would be an even playing field. As mentioned above, anumbers
sequence table is also permitted.The first method is as follows:
Method 1 - GENERATE_SERIES (fiddle):
Step 1:
Result:
Step 2:
Results (same):
Step 3:
Result:
Finally:
Result:
Method 2 - Recursive CTE (fiddle):
Here, I managed to "kill two birds with one stone" by constructing both the desired sequence and its numbering scheme in the same RCTE as follows:
Result:
We only want the last record, so we select this by using the
CARDINALITY()
function - where that is equal to (n * 2
) is the last record (step not shown individually).Final step - see fiddle for more detail
Result (same as others):
A simpler, (and IMHO) more elegant solution exists - using the
GENERATE_SUBSCRIPTS()
function (explanation) as follows:Result (same):
Method 3 - Pure SQL (fiddle):
Final step:
Since all of the code below has been seen above in one form or another, I'm just including the final step. No PostgreSQL specific functionality has been used and it also works under SQL Server 2019 (Linux fiddle) - also works back to 2016 - all versions.
Result (same):
4th solution (and
clubhouse leader!
) (fiddle):Same result as for all the others.
5th solution (Honourable mention) (fiddle):
Same result as for all the others.
6th Solution (another scorcher! - uses the INTARRAY extension fiddle):
Same result!
Benchmarking:
I benchmarked all of the PostgreSQL solutions.
I have done my utmost to be fair in these benchmarks - I used a param table
(3, 5)
(also(34, 57)
and(340, 570)
at(home)
) in my SQL. For those queries requiring anumber
table (i.e. a sequence), after discussion, I have included it for those queries requiring it. I'm not entirely sure about this, since consultants are frequently forbidden from creating separate tables, no matter how trivial, but this appears to have been the consensus!Please let me know if you are unhappy with any of the tests and I'll gladly rerun them!
I used
db<>fiddle
for the tests and the usual caveats apply - I don't know what else is running on that server at any given moment in time - I took an average of several runs for each solution (the vast bulk of the results were within ~ 10% of each other - discarded obvious outliers (longer, not shorter times).It was pointed out to me (knew anyway) that 3 & 5 aren't very large numbers - I did try with low 100's for each parameter, but the runs kept failing on db<>fiddle.uk, but I would just say that the all of the runs were remarkably consistent with each other, only varying by ~ +- 10%.
The second reading runs with a are for values of 34 & 57 - feel free to try yourselves.
With the
(home)
tests, I used params of(340, 570)
on an 8GB machine (i5 - 10th gen, NVMe 256GB) - nothing else running - variance v. low ~ 1/2%!Vérace's (Another scorcher using INTARRAY!)
6th solution (fiddle) (0.110ms)/0.630 ms/21.5 ms (home) -new leader
!Vérace's (ex-Clubhouse leader) 4th solution (fiddle) 0.120 ms/0.625 ms/22.5 ms (home)
Vérace's (Honourable mention) 5th solution (fiddle) (0.95ms/0.615 (goes down!)/26 ms (home)
Vérace GENERATE_SERIES SQL method (fiddle): 0.195 ms/3.1 ms/140ms (home)
Vérace RECURSIVE CTE SQL method (fiddle): 0.200 ms/2.9 ms/145m (home)
Vérace GENERATE_SUBSCRIPTS() SQL method (fiddle): 0.110 ms/2.75 ms/130ms (home)
Vérace "Pure SQL" method (fiddle): 0.134 ms/2.85ms/190ms (home)
OP's SQL method (fiddle): 12.50 ms/18.5ms/190ms (home)
OP's PL/pgSQL function method (fiddle): 0.60 ms/0.075ms/86ms (home)
ypercube's SQL method (fiddle): 0.175 ms, /4.3 ms /240 ms (home)
ypercube's alternative method (fiddle): 0.090 ms/0.95 ms/36ms (home)
Erwin Brandtstetter's SQL method (fiddle): 2.15 ms /3.65 ms/160ms (home) (160 drops to ~ 100 without the ORDER BY - home)
Erwin Brandtstetter's function method (fiddle): 0.169 ms/2.3 ms/180 ms (home)
Abelisto's SQL method (fiddle) 0.145/fails if params changed?
Evan Carroll's SQL method (fiddle) 0.125 ms/1.1ms/45ms (home)
McNet's PL/pgSQL method (fiddle) 0.075 ms/ 0.075 ms/125ms (home)
Again, I reiterate (at the risk of repeating myself (multiple times! :-) )), if you are unhappy with your benchmark, please let me know and I will include any amendments here - I would just stress that I am genuinely interested in fairness and actually learning from this process - unicorn points are all very well, but my priority is on increasing my (our) knowledge-base!
The source code of PostgreSQL is a bit above my pay grade, but I believe that operations with
GENERATE_SERIES
andARRAY
s preserve order - theWITH ORDINALITY
implies this (again, I think) - the correct answers come up even without paying attention to ordering (although this not a guarantee)! @ErwinBrandstetter says that:My understanding is that
ARRAY
s are fast in PostgreSQL because much of the backend code is implemented through them - but as I said, I'm not really aC
expert.Current standings (as of 2021-10-27 13:50 UTC) are:
I found these posts on ARRAYs from Erwin Brandstetter (1) & a_horse_with_no_name (2) very helpful! Other's that I found helpful were as follows (1, 2).
No Postgres, é fácil usar a
generate_series()
função:No SQL padrão - e supondo que haja um limite razoável para o tamanho dos parâmetros n, m, ou seja, menos de um milhão - você pode usar uma
Numbers
tabela:preencha-o com o método preferido do seu DBMS:
e, em seguida, usá-lo, em vez de
generate_series()
:Se você precisar de SQL simples. Teoricamente deveria funcionar na maioria dos DBMSs (testados em PostgreSQL e SQLite):
Explicação
Gerar série 1..n
Assumindo que
n=3
É bastante simples e pode ser encontrado em quase todos os documentos sobre CTEs recursivos. No entanto, precisamos de duas instâncias de cada valor, então
Gerar série 1,1,..,n,n
Aqui estamos apenas dobrando o valor inicial, que tem duas linhas, mas o segundo grupo precisamos na ordem inversa, então vamos introduzir a ordem daqui a pouco.
Antes de introduzirmos a ordem, observe que isso também é uma coisa. Podemos ter duas linhas na condição inicial com três colunas cada, nossa
n<3
ainda é uma condição de coluna única. E ainda estamos apenas aumentando o valor den
.Da mesma forma, podemos misturá-los um pouco, observar nossa mudança de condição inicial aqui : aqui temos um
(6,2)
,(1,1)
Gerar série 1..n,n..1
O truque aqui é gerar a série, (1..n) duas vezes, e simplesmente mudar a ordem no segundo conjunto.
Aqui
i
está a ordem ez
o número da sequência (ou metade da sequência, se você quiser). Assim, para a sequência 1 estamos aumentando a ordem de 1 para 3 e para a sequência 2 estamos diminuindo a ordem de 6 para 4. E finalmenteMultiplique a série para
m
(veja a primeira consulta na resposta)
No PostgreSQL, isso é fácil,
If you want a portable solution you need to realize that this is basically a mathematical problem.
Given @n as the highest number of the sequence and @x as the position of the number in that sequence (starting with zero), the following function would work in SQL Server:
You can check it with this CTE:
(Quick explanation: the function uses MODULO() to create a sequence of repeating numbers and ABS() to turn it into a zig-zag wave. The other operations transform that wave to match the desired result.)
Uma função básica usando iteradores.
T-SQL
Postgres
Isso funciona no MS-SQL e acho que pode ser modificado para qualquer tipo de SQL.
Uma maneira de fazer isso no SQL Server usando um cte recursivo.
Gere o número necessário de membros na série (para n = 3 e m = 4, seria 24, que é 2 n m)
Depois disso, usando a lógica em uma
case
expressão, você pode gerar a série necessária.Sample Demo
Conforme sugerido por @AndriyM .. a
case
expressão pode ser simplificada paraDemo
Using only basic Math
+ - * /
and Modulo:This doesn't require a specific RDBMS.
With
numbers
being a number table:This generate a number table (1-1000) without using a recursive CTE. See Sample. 2nm must be smaller than the number of row in numbers.
Output with n=3 and m=4:
This version requires a smaller number table (v >= n and v >= m):
See Sample.