kylejw2

Asked: 2023-08-12 05:28:23 +0800 CST

Why is the PostgreSQL query planner choosing such an inefficient plan?

I am working with Postgres 13.9 on Amazon Aurora. In our production environment we run a query that takes more than 15 seconds when it uses a small LIMIT. For example, when the query runs with LIMIT 1, we see the following plan:

Limit  (cost=1.54..2608.50 rows=1 width=1100) (actual time=17945.422..17945.424 rows=1 loops=1)
  Output: tasks.id, tasks.name, tasks.description, tasks.priority, tasks.estimated_hours, tasks.sort_order, tasks.estimated_points, tasks.responsibility, tasks.sign_off_required, tasks.created_at, tasks.updated_at, tasks.milestone_id, tasks.status, tasks.sign_off_user_id, tasks.assignee_id, tasks.creator_id, tasks.start_on, tasks.due_on, tasks.project_id, tasks.template_id, tasks.actual_hours, tasks.deleted_at, tasks.assignment_email_sent_at, tasks.stuck_message, tasks.overdue_pm_reminder_sent_at, tasks.duration, tasks.dependency_type, tasks.dependency_id, tasks.last_activity_at, tasks.completed_at, tasks.overdue_watched_tasks_email_sent_at, tasks.task_type, tasks.must_start_on, tasks.must_start_on_required, tasks.must_start_on_email_sent_at, tasks.visibility, tasks.type, tasks.related_task_id, tasks.action_items_count, tasks.open_action_items_count, tasks.billable_hours, tasks.non_billable_hours, tasks.jira_sync, tasks.public_id, tasks.event_details, tasks.blueprint_task_id, tasks.task_group_id
  ->  Merge Semi Join  (cost=1.54..60650907.42 rows=23265 width=1100) (actual time=17945.420..17945.422 rows=1 loops=1)
        Output: tasks.id, tasks.name, tasks.description, tasks.priority, tasks.estimated_hours, tasks.sort_order, tasks.estimated_points, tasks.responsibility, tasks.sign_off_required, tasks.created_at, tasks.updated_at, tasks.milestone_id, tasks.status, tasks.sign_off_user_id, tasks.assignee_id, tasks.creator_id, tasks.start_on, tasks.due_on, tasks.project_id, tasks.template_id, tasks.actual_hours, tasks.deleted_at, tasks.assignment_email_sent_at, tasks.stuck_message, tasks.overdue_pm_reminder_sent_at, tasks.duration, tasks.dependency_type, tasks.dependency_id, tasks.last_activity_at, tasks.completed_at, tasks.overdue_watched_tasks_email_sent_at, tasks.task_type, tasks.must_start_on, tasks.must_start_on_required, tasks.must_start_on_email_sent_at, tasks.visibility, tasks.type, tasks.related_task_id, tasks.action_items_count, tasks.open_action_items_count, tasks.billable_hours, tasks.non_billable_hours, tasks.jira_sync, tasks.public_id, tasks.event_details, tasks.blueprint_task_id, tasks.task_group_id
        Merge Cond: (tasks.id = t0.id)
        ->  Index Scan using tasks_pkey on public.tasks  (cost=0.56..14315808.88 rows=11401481 width=1100) (actual time=0.054..4908.126 rows=2722000 loops=1)
              Output: tasks.id, tasks.name, tasks.description, tasks.priority, tasks.estimated_hours, tasks.sort_order, tasks.estimated_points, tasks.responsibility, tasks.sign_off_required, tasks.created_at, tasks.updated_at, tasks.milestone_id, tasks.status, tasks.sign_off_user_id, tasks.assignee_id, tasks.creator_id, tasks.start_on, tasks.due_on, tasks.project_id, tasks.template_id, tasks.actual_hours, tasks.deleted_at, tasks.assignment_email_sent_at, tasks.stuck_message, tasks.overdue_pm_reminder_sent_at, tasks.duration, tasks.dependency_type, tasks.dependency_id, tasks.last_activity_at, tasks.completed_at, tasks.overdue_watched_tasks_email_sent_at, tasks.task_type, tasks.must_start_on, tasks.must_start_on_required, tasks.must_start_on_email_sent_at, tasks.visibility, tasks.type, tasks.related_task_id, tasks.action_items_count, tasks.open_action_items_count, tasks.billable_hours, tasks.non_billable_hours, tasks.jira_sync, tasks.public_id, tasks.event_details, tasks.blueprint_task_id, tasks.task_group_id
              Filter: ((tasks.deleted_at IS NULL) AND (tasks.milestone_id IS NOT NULL))
              Rows Removed by Filter: 650237
        ->  Nested Loop  (cost=0.98..46306291.52 rows=28266 width=8) (actual time=12863.972..12863.973 rows=1 loops=1)
              Output: t0.id
              Inner Unique: true
              ->  Index Scan using tasks_pkey on public.tasks t0  (cost=0.56..14350439.73 rows=13852340 width=16) (actual time=0.010..4179.561 rows=3372237 loops=1)
                    Output: t0.project_id, t0.id
                    Index Cond: (t0.id IS NOT NULL)
              ->  Index Scan using projects_pkey on public.projects j0  (cost=0.42..2.31 rows=1 width=8) (actual time=0.002..0.002 rows=0 loops=3372237)
                    Output: j0.id
                    Index Cond: (j0.id = t0.project_id)
                    Filter: (j0.organization_id = 79403)
                    Rows Removed by Filter: 1
Planning Time: 0.914 ms
Execution Time: 17945.475 ms
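
One way to read the plan above is to compare how many rows the planner probably expected to read from the outer index scan before the first match with how many were actually visited. The sketch below assumes the planner treats matching rows as evenly spread through tasks_pkey order, which is a simplification of its LIMIT costing. All numbers are copied from the plan and the variable names are only illustrative.

```python
# Numbers copied from the LIMIT 1 plan above; names are illustrative only.
outer_rows_estimated = 11_401_481  # Index Scan on tasks: estimated rows
join_rows_estimated = 23_265       # Merge Semi Join: estimated matching rows
outer_rows_actual = 2_722_000      # Index Scan on tasks: rows actually returned
outer_rows_filtered = 650_237      # "Rows Removed by Filter" on that scan

# Assumption: matches are uniformly spread through the index order, so the
# first match is expected after roughly (total rows / matching rows) rows.
expected_rows_before_first_match = outer_rows_estimated / join_rows_estimated

# Rows the scan actually visited before the first match was produced.
actual_rows_visited = outer_rows_actual + outer_rows_filtered

print(f"expected rows read:  {expected_rows_before_first_match:,.0f}")
print(f"actual rows visited: {actual_rows_visited:,}")
print(f"ratio: {actual_rows_visited / expected_rows_before_first_match:,.0f}x")
```

If that assumption is roughly what the planner does, the gap suggests the rows for organization 79403 sit far from the start of the primary key order, so walking tasks_pkey from the beginning reads millions of rows before the first match.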

The same query, run with LIMIT 500, produces the following plan:

Limit  (cost=322268.59..322269.84 rows=500 width=1100) (actual time=1329.805..1330.032 rows=500 loops=1)
  Output: tasks.id, tasks.name, tasks.description, tasks.priority, tasks.estimated_hours, tasks.sort_order, tasks.estimated_points, tasks.responsibility, tasks.sign_off_required, tasks.created_at, tasks.updated_at, tasks.milestone_id, tasks.status, tasks.sign_off_user_id, tasks.assignee_id, tasks.creator_id, tasks.start_on, tasks.due_on, tasks.project_id, tasks.template_id, tasks.actual_hours, tasks.deleted_at, tasks.assignment_email_sent_at, tasks.stuck_message, tasks.overdue_pm_reminder_sent_at, tasks.duration, tasks.dependency_type, tasks.dependency_id, tasks.last_activity_at, tasks.completed_at, tasks.overdue_watched_tasks_email_sent_at, tasks.task_type, tasks.must_start_on, tasks.must_start_on_required, tasks.must_start_on_email_sent_at, tasks.visibility, tasks.type, tasks.related_task_id, tasks.action_items_count, tasks.open_action_items_count, tasks.billable_hours, tasks.non_billable_hours, tasks.jira_sync, tasks.public_id, tasks.event_details, tasks.blueprint_task_id, tasks.task_group_id
  ->  Sort  (cost=322268.59..322326.76 rows=23266 width=1100) (actual time=1329.803..1329.989 rows=500 loops=1)
        Output: tasks.id, tasks.name, tasks.description, tasks.priority, tasks.estimated_hours, tasks.sort_order, tasks.estimated_points, tasks.responsibility, tasks.sign_off_required, tasks.created_at, tasks.updated_at, tasks.milestone_id, tasks.status, tasks.sign_off_user_id, tasks.assignee_id, tasks.creator_id, tasks.start_on, tasks.due_on, tasks.project_id, tasks.template_id, tasks.actual_hours, tasks.deleted_at, tasks.assignment_email_sent_at, tasks.stuck_message, tasks.overdue_pm_reminder_sent_at, tasks.duration, tasks.dependency_type, tasks.dependency_id, tasks.last_activity_at, tasks.completed_at, tasks.overdue_watched_tasks_email_sent_at, tasks.task_type, tasks.must_start_on, tasks.must_start_on_required, tasks.must_start_on_email_sent_at, tasks.visibility, tasks.type, tasks.related_task_id, tasks.action_items_count, tasks.open_action_items_count, tasks.billable_hours, tasks.non_billable_hours, tasks.jira_sync, tasks.public_id, tasks.event_details, tasks.blueprint_task_id, tasks.task_group_id
        Sort Key: tasks.id
        Sort Method: top-N heapsort  Memory: 444kB
        ->  Nested Loop  (cost=218419.30..321109.27 rows=23266 width=1100) (actual time=563.649..1313.910 rows=20876 loops=1)
              Output: tasks.id, tasks.name, tasks.description, tasks.priority, tasks.estimated_hours, tasks.sort_order, tasks.estimated_points, tasks.responsibility, tasks.sign_off_required, tasks.created_at, tasks.updated_at, tasks.milestone_id, tasks.status, tasks.sign_off_user_id, tasks.assignee_id, tasks.creator_id, tasks.start_on, tasks.due_on, tasks.project_id, tasks.template_id, tasks.actual_hours, tasks.deleted_at, tasks.assignment_email_sent_at, tasks.stuck_message, tasks.overdue_pm_reminder_sent_at, tasks.duration, tasks.dependency_type, tasks.dependency_id, tasks.last_activity_at, tasks.completed_at, tasks.overdue_watched_tasks_email_sent_at, tasks.task_type, tasks.must_start_on, tasks.must_start_on_required, tasks.must_start_on_email_sent_at, tasks.visibility, tasks.type, tasks.related_task_id, tasks.action_items_count, tasks.open_action_items_count, tasks.billable_hours, tasks.non_billable_hours, tasks.jira_sync, tasks.public_id, tasks.event_details, tasks.blueprint_task_id, tasks.task_group_id
              Inner Unique: true
              ->  HashAggregate  (cost=218418.74..218701.41 rows=28267 width=8) (actual time=563.618..570.523 rows=21926 loops=1)
                    Output: t0.id
                    Group Key: t0.id
                    Batches: 1  Memory Usage: 2065kB
                    ->  Gather  (cost=1000.56..218348.08 rows=28267 width=8) (actual time=1.032..553.590 rows=21926 loops=1)
                          Output: t0.id
                          Workers Planned: 2
                          Workers Launched: 2
                          ->  Nested Loop  (cost=0.56..214521.38 rows=11778 width=8) (actual time=1.020..522.937 rows=7309 loops=3)
                                Output: t0.id
                                Worker 0:  actual time=2.356..510.679 rows=7819 loops=1
                                Worker 1:  actual time=0.063..537.921 rows=7849 loops=1
                                ->  Parallel Seq Scan on public.projects j0  (cost=0.00..53613.72 rows=417 width=8) (actual time=0.515..68.601 rows=220 loops=3)
                                      Output: j0.id
                                      Filter: (j0.organization_id = 79403)
                                      Rows Removed by Filter: 90977
                                      Worker 0:  actual time=0.885..109.155 rows=225 loops=1
                                      Worker 1:  actual time=0.034..37.727 rows=212 loops=1
                                ->  Index Scan using index_tasks_on_project_id on public.tasks t0  (cost=0.56..384.30 rows=157 width=16) (actual time=0.886..2.059 rows=33 loops=660)
                                      Output: t0.project_id, t0.id
                                      Index Cond: (t0.project_id = j0.id)
                                      Filter: (t0.id IS NOT NULL)
                                      Worker 0:  actual time=0.698..1.778 rows=35 loops=225
                                      Worker 1:  actual time=0.960..2.353 rows=37 loops=212
              ->  Index Scan using tasks_pkey on public.tasks  (cost=0.56..3.63 rows=1 width=1100) (actual time=0.033..0.033 rows=1 loops=21926)
                    Output: tasks.id, tasks.name, tasks.description, tasks.priority, tasks.estimated_hours, tasks.sort_order, tasks.estimated_points, tasks.responsibility, tasks.sign_off_required, tasks.created_at, tasks.updated_at, tasks.milestone_id, tasks.status, tasks.sign_off_user_id, tasks.assignee_id, tasks.creator_id, tasks.start_on, tasks.due_on, tasks.project_id, tasks.template_id, tasks.actual_hours, tasks.deleted_at, tasks.assignment_email_sent_at, tasks.stuck_message, tasks.overdue_pm_reminder_sent_at, tasks.duration, tasks.dependency_type, tasks.dependency_id, tasks.last_activity_at, tasks.completed_at, tasks.overdue_watched_tasks_email_sent_at, tasks.task_type, tasks.must_start_on, tasks.must_start_on_required, tasks.must_start_on_email_sent_at, tasks.visibility, tasks.type, tasks.related_task_id, tasks.action_items_count, tasks.open_action_items_count, tasks.billable_hours, tasks.non_billable_hours, tasks.jira_sync, tasks.public_id, tasks.event_details, tasks.blueprint_task_id, tasks.task_group_id
                    Index Cond: (tasks.id = t0.id)
                    Filter: ((tasks.deleted_at IS NULL) AND (tasks.milestone_id IS NOT NULL))
                    Rows Removed by Filter: 0
Planning Time: 0.872 ms
Execution Time: 1330.691 ms

Correct me if I'm wrong, but the culprit appears to be the query planner, which chooses such an inefficient plan when the LIMIT is small. Because the query planner relies on pg_statistic to build its execution plans, we suspected that our statistics were invalid. After investigating, we decided that running VACUUM (FULL, ANALYZE, VERBOSE) would be the best solution for us. If you use this command, be careful: it can take a while and will temporarily lock the database tables. This updated the statistics, but the query planner is still choosing a bad execution plan.

I would love to understand why the query planner is choosing such an inefficient plan and how this can be resolved.
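
As a rough check on where the LIMIT 1 estimate comes from, the sketch below assumes the Limit node's cost is a linear interpolation between the child node's startup cost and total cost, scaled by the fraction of estimated rows requested. The helper `limit_cost` is hypothetical and written only for this illustration. The inputs are copied from the two plans above.

```python
def limit_cost(startup, total, est_rows, limit):
    """Estimated cost of fetching `limit` rows from a child node, assuming
    linear interpolation between its startup and total cost."""
    fraction = min(limit / est_rows, 1.0)
    return startup + (total - startup) * fraction

# Merge Semi Join node from the LIMIT 1 plan.
MERGE_STARTUP, MERGE_TOTAL, MERGE_ROWS = 1.54, 60_650_907.42, 23_265
# Total cost reported on the Limit node of the LIMIT 500 (sort-based) plan.
SORT_PLAN_LIMIT_500_COST = 322_269.84

cost_limit_1 = limit_cost(MERGE_STARTUP, MERGE_TOTAL, MERGE_ROWS, 1)
cost_limit_500 = limit_cost(MERGE_STARTUP, MERGE_TOTAL, MERGE_ROWS, 500)

print(f"merge plan, LIMIT 1:   {cost_limit_1:,.2f}")
print(f"merge plan, LIMIT 500: {cost_limit_500:,.2f}")
print("merge plan cheaper at LIMIT 500?",
      cost_limit_500 < SORT_PLAN_LIMIT_500_COST)
```

Under this assumption the LIMIT 1 figure matches the 2608.50 shown on the Limit node, which is far below the sort-based plan's startup cost of about 322268, so the merge plan looks cheaper on paper. At LIMIT 500 the interpolated merge cost exceeds the sort-based plan's cost, which is consistent with the planner switching plans.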
