AskOverflow.Dev


kylejw2's questions

kylejw2
Asked: 2023-08-12 05:28:23 +0800 CST

Why is the PostgreSQL query planner choosing such an inefficient plan?

  • 5

I'm working with Postgres 13.9 on Amazon Aurora. In our production environment, we run a query that takes more than 15 seconds to execute when it uses a small LIMIT. For example, when the query is run with LIMIT 1, we see the following result:

Limit  (cost=1.54..2608.50 rows=1 width=1100) (actual time=17945.422..17945.424 rows=1 loops=1)
  Output: tasks.id, tasks.name, tasks.description, tasks.priority, tasks.estimated_hours, tasks.sort_order, tasks.estimated_points, tasks.responsibility, tasks.sign_off_required, tasks.created_at, tasks.updated_at, tasks.milestone_id, tasks.status, tasks.sign_off_user_id, tasks.assignee_id, tasks.creator_id, tasks.start_on, tasks.due_on, tasks.project_id, tasks.template_id, tasks.actual_hours, tasks.deleted_at, tasks.assignment_email_sent_at, tasks.stuck_message, tasks.overdue_pm_reminder_sent_at, tasks.duration, tasks.dependency_type, tasks.dependency_id, tasks.last_activity_at, tasks.completed_at, tasks.overdue_watched_tasks_email_sent_at, tasks.task_type, tasks.must_start_on, tasks.must_start_on_required, tasks.must_start_on_email_sent_at, tasks.visibility, tasks.type, tasks.related_task_id, tasks.action_items_count, tasks.open_action_items_count, tasks.billable_hours, tasks.non_billable_hours, tasks.jira_sync, tasks.public_id, tasks.event_details, tasks.blueprint_task_id, tasks.task_group_id
  ->  Merge Semi Join  (cost=1.54..60650907.42 rows=23265 width=1100) (actual time=17945.420..17945.422 rows=1 loops=1)
        Output: tasks.id, tasks.name, tasks.description, tasks.priority, tasks.estimated_hours, tasks.sort_order, tasks.estimated_points, tasks.responsibility, tasks.sign_off_required, tasks.created_at, tasks.updated_at, tasks.milestone_id, tasks.status, tasks.sign_off_user_id, tasks.assignee_id, tasks.creator_id, tasks.start_on, tasks.due_on, tasks.project_id, tasks.template_id, tasks.actual_hours, tasks.deleted_at, tasks.assignment_email_sent_at, tasks.stuck_message, tasks.overdue_pm_reminder_sent_at, tasks.duration, tasks.dependency_type, tasks.dependency_id, tasks.last_activity_at, tasks.completed_at, tasks.overdue_watched_tasks_email_sent_at, tasks.task_type, tasks.must_start_on, tasks.must_start_on_required, tasks.must_start_on_email_sent_at, tasks.visibility, tasks.type, tasks.related_task_id, tasks.action_items_count, tasks.open_action_items_count, tasks.billable_hours, tasks.non_billable_hours, tasks.jira_sync, tasks.public_id, tasks.event_details, tasks.blueprint_task_id, tasks.task_group_id
        Merge Cond: (tasks.id = t0.id)
        ->  Index Scan using tasks_pkey on public.tasks  (cost=0.56..14315808.88 rows=11401481 width=1100) (actual time=0.054..4908.126 rows=2722000 loops=1)
              Output: tasks.id, tasks.name, tasks.description, tasks.priority, tasks.estimated_hours, tasks.sort_order, tasks.estimated_points, tasks.responsibility, tasks.sign_off_required, tasks.created_at, tasks.updated_at, tasks.milestone_id, tasks.status, tasks.sign_off_user_id, tasks.assignee_id, tasks.creator_id, tasks.start_on, tasks.due_on, tasks.project_id, tasks.template_id, tasks.actual_hours, tasks.deleted_at, tasks.assignment_email_sent_at, tasks.stuck_message, tasks.overdue_pm_reminder_sent_at, tasks.duration, tasks.dependency_type, tasks.dependency_id, tasks.last_activity_at, tasks.completed_at, tasks.overdue_watched_tasks_email_sent_at, tasks.task_type, tasks.must_start_on, tasks.must_start_on_required, tasks.must_start_on_email_sent_at, tasks.visibility, tasks.type, tasks.related_task_id, tasks.action_items_count, tasks.open_action_items_count, tasks.billable_hours, tasks.non_billable_hours, tasks.jira_sync, tasks.public_id, tasks.event_details, tasks.blueprint_task_id, tasks.task_group_id
              Filter: ((tasks.deleted_at IS NULL) AND (tasks.milestone_id IS NOT NULL))
              Rows Removed by Filter: 650237
        ->  Nested Loop  (cost=0.98..46306291.52 rows=28266 width=8) (actual time=12863.972..12863.973 rows=1 loops=1)
              Output: t0.id
              Inner Unique: true
              ->  Index Scan using tasks_pkey on public.tasks t0  (cost=0.56..14350439.73 rows=13852340 width=16) (actual time=0.010..4179.561 rows=3372237 loops=1)
                    Output: t0.project_id, t0.id
                    Index Cond: (t0.id IS NOT NULL)
              ->  Index Scan using projects_pkey on public.projects j0  (cost=0.42..2.31 rows=1 width=8) (actual time=0.002..0.002 rows=0 loops=3372237)
                    Output: j0.id
                    Index Cond: (j0.id = t0.project_id)
                    Filter: (j0.organization_id = 79403)
                    Rows Removed by Filter: 1
Planning Time: 0.914 ms
Execution Time: 17945.475 ms

The same query, run with LIMIT 500, produces the following EXPLAIN output:

Limit  (cost=322268.59..322269.84 rows=500 width=1100) (actual time=1329.805..1330.032 rows=500 loops=1)
  Output: tasks.id, tasks.name, tasks.description, tasks.priority, tasks.estimated_hours, tasks.sort_order, tasks.estimated_points, tasks.responsibility, tasks.sign_off_required, tasks.created_at, tasks.updated_at, tasks.milestone_id, tasks.status, tasks.sign_off_user_id, tasks.assignee_id, tasks.creator_id, tasks.start_on, tasks.due_on, tasks.project_id, tasks.template_id, tasks.actual_hours, tasks.deleted_at, tasks.assignment_email_sent_at, tasks.stuck_message, tasks.overdue_pm_reminder_sent_at, tasks.duration, tasks.dependency_type, tasks.dependency_id, tasks.last_activity_at, tasks.completed_at, tasks.overdue_watched_tasks_email_sent_at, tasks.task_type, tasks.must_start_on, tasks.must_start_on_required, tasks.must_start_on_email_sent_at, tasks.visibility, tasks.type, tasks.related_task_id, tasks.action_items_count, tasks.open_action_items_count, tasks.billable_hours, tasks.non_billable_hours, tasks.jira_sync, tasks.public_id, tasks.event_details, tasks.blueprint_task_id, tasks.task_group_id
  ->  Sort  (cost=322268.59..322326.76 rows=23266 width=1100) (actual time=1329.803..1329.989 rows=500 loops=1)
        Output: tasks.id, tasks.name, tasks.description, tasks.priority, tasks.estimated_hours, tasks.sort_order, tasks.estimated_points, tasks.responsibility, tasks.sign_off_required, tasks.created_at, tasks.updated_at, tasks.milestone_id, tasks.status, tasks.sign_off_user_id, tasks.assignee_id, tasks.creator_id, tasks.start_on, tasks.due_on, tasks.project_id, tasks.template_id, tasks.actual_hours, tasks.deleted_at, tasks.assignment_email_sent_at, tasks.stuck_message, tasks.overdue_pm_reminder_sent_at, tasks.duration, tasks.dependency_type, tasks.dependency_id, tasks.last_activity_at, tasks.completed_at, tasks.overdue_watched_tasks_email_sent_at, tasks.task_type, tasks.must_start_on, tasks.must_start_on_required, tasks.must_start_on_email_sent_at, tasks.visibility, tasks.type, tasks.related_task_id, tasks.action_items_count, tasks.open_action_items_count, tasks.billable_hours, tasks.non_billable_hours, tasks.jira_sync, tasks.public_id, tasks.event_details, tasks.blueprint_task_id, tasks.task_group_id
        Sort Key: tasks.id
        Sort Method: top-N heapsort  Memory: 444kB
        ->  Nested Loop  (cost=218419.30..321109.27 rows=23266 width=1100) (actual time=563.649..1313.910 rows=20876 loops=1)
              Output: tasks.id, tasks.name, tasks.description, tasks.priority, tasks.estimated_hours, tasks.sort_order, tasks.estimated_points, tasks.responsibility, tasks.sign_off_required, tasks.created_at, tasks.updated_at, tasks.milestone_id, tasks.status, tasks.sign_off_user_id, tasks.assignee_id, tasks.creator_id, tasks.start_on, tasks.due_on, tasks.project_id, tasks.template_id, tasks.actual_hours, tasks.deleted_at, tasks.assignment_email_sent_at, tasks.stuck_message, tasks.overdue_pm_reminder_sent_at, tasks.duration, tasks.dependency_type, tasks.dependency_id, tasks.last_activity_at, tasks.completed_at, tasks.overdue_watched_tasks_email_sent_at, tasks.task_type, tasks.must_start_on, tasks.must_start_on_required, tasks.must_start_on_email_sent_at, tasks.visibility, tasks.type, tasks.related_task_id, tasks.action_items_count, tasks.open_action_items_count, tasks.billable_hours, tasks.non_billable_hours, tasks.jira_sync, tasks.public_id, tasks.event_details, tasks.blueprint_task_id, tasks.task_group_id
              Inner Unique: true
              ->  HashAggregate  (cost=218418.74..218701.41 rows=28267 width=8) (actual time=563.618..570.523 rows=21926 loops=1)
                    Output: t0.id
                    Group Key: t0.id
                    Batches: 1  Memory Usage: 2065kB
                    ->  Gather  (cost=1000.56..218348.08 rows=28267 width=8) (actual time=1.032..553.590 rows=21926 loops=1)
                          Output: t0.id
                          Workers Planned: 2
                          Workers Launched: 2
                          ->  Nested Loop  (cost=0.56..214521.38 rows=11778 width=8) (actual time=1.020..522.937 rows=7309 loops=3)
                                Output: t0.id
                                Worker 0:  actual time=2.356..510.679 rows=7819 loops=1
                                Worker 1:  actual time=0.063..537.921 rows=7849 loops=1
                                ->  Parallel Seq Scan on public.projects j0  (cost=0.00..53613.72 rows=417 width=8) (actual time=0.515..68.601 rows=220 loops=3)
                                      Output: j0.id
                                      Filter: (j0.organization_id = 79403)
                                      Rows Removed by Filter: 90977
                                      Worker 0:  actual time=0.885..109.155 rows=225 loops=1
                                      Worker 1:  actual time=0.034..37.727 rows=212 loops=1
                                ->  Index Scan using index_tasks_on_project_id on public.tasks t0  (cost=0.56..384.30 rows=157 width=16) (actual time=0.886..2.059 rows=33 loops=660)
                                      Output: t0.project_id, t0.id
                                      Index Cond: (t0.project_id = j0.id)
                                      Filter: (t0.id IS NOT NULL)
                                      Worker 0:  actual time=0.698..1.778 rows=35 loops=225
                                      Worker 1:  actual time=0.960..2.353 rows=37 loops=212
              ->  Index Scan using tasks_pkey on public.tasks  (cost=0.56..3.63 rows=1 width=1100) (actual time=0.033..0.033 rows=1 loops=21926)
                    Output: tasks.id, tasks.name, tasks.description, tasks.priority, tasks.estimated_hours, tasks.sort_order, tasks.estimated_points, tasks.responsibility, tasks.sign_off_required, tasks.created_at, tasks.updated_at, tasks.milestone_id, tasks.status, tasks.sign_off_user_id, tasks.assignee_id, tasks.creator_id, tasks.start_on, tasks.due_on, tasks.project_id, tasks.template_id, tasks.actual_hours, tasks.deleted_at, tasks.assignment_email_sent_at, tasks.stuck_message, tasks.overdue_pm_reminder_sent_at, tasks.duration, tasks.dependency_type, tasks.dependency_id, tasks.last_activity_at, tasks.completed_at, tasks.overdue_watched_tasks_email_sent_at, tasks.task_type, tasks.must_start_on, tasks.must_start_on_required, tasks.must_start_on_email_sent_at, tasks.visibility, tasks.type, tasks.related_task_id, tasks.action_items_count, tasks.open_action_items_count, tasks.billable_hours, tasks.non_billable_hours, tasks.jira_sync, tasks.public_id, tasks.event_details, tasks.blueprint_task_id, tasks.task_group_id
                    Index Cond: (tasks.id = t0.id)
                    Filter: ((tasks.deleted_at IS NULL) AND (tasks.milestone_id IS NOT NULL))
                    Rows Removed by Filter: 0
Planning Time: 0.872 ms
Execution Time: 1330.691 ms
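
As a side calculation, the Limit costs in the two plans above look consistent with the way PostgreSQL is documented to cost a Limit node: it interpolates linearly between the child's startup and total cost, on the assumption that matching rows are spread evenly through the child's output. The sketch below only reproduces that arithmetic with numbers copied from the plans; the function name `limit_node_cost` and the constants are hypothetical names chosen for illustration, not anything PostgreSQL exposes.

```python
def limit_node_cost(startup_cost, total_cost, est_rows, limit):
    """Linearly interpolate the cost of fetching `limit` of `est_rows` rows.

    Assumes (as an illustration) that the rows the child node produces are
    spread evenly over its estimated total cost.
    """
    fraction = min(limit / est_rows, 1.0)
    return startup_cost + (total_cost - startup_cost) * fraction


# Merge Semi Join estimates copied from the LIMIT 1 plan.
MERGE_STARTUP, MERGE_TOTAL, MERGE_ROWS = 1.54, 60650907.42, 23265
# Total cost of the Limit node in the LIMIT 500 (sort-based) plan.
SORT_PLAN_LIMIT_500 = 322269.84
# Estimated rows of the outer index scan on tasks in the LIMIT 1 plan.
OUTER_SCAN_ROWS = 11401481

merge_limit_1 = limit_node_cost(MERGE_STARTUP, MERGE_TOTAL, MERGE_ROWS, 1)
merge_limit_500 = limit_node_cost(MERGE_STARTUP, MERGE_TOTAL, MERGE_ROWS, 500)
# Under the same even-spread assumption, rows read from the outer scan
# before the first match would be roughly this many.
expected_outer_rows = OUTER_SCAN_ROWS / MERGE_ROWS

print(f"merge plan, LIMIT 1:   {merge_limit_1:.2f}")
print(f"merge plan, LIMIT 500: {merge_limit_500:.2f}")
print(f"sort plan,  LIMIT 500: {SORT_PLAN_LIMIT_500:.2f}")
print(f"outer rows expected before first match: {expected_outer_rows:.0f}")
```

With LIMIT 1 the interpolated figure matches the 2608.50 shown on the Limit node, which is far below the 322269.84 of the sort-based plan. With LIMIT 500 the same merge plan would be costed at roughly 1.3 million, which would explain why the planner switches plans as the LIMIT grows. The estimate rests on the even-spread assumption: it implies only about 490 rows read from tasks before the first match, whereas the plan reports 2722000 rows actually returned by that index scan.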

Correct me if I'm wrong, but the culprit appears to be the query planner, which chooses such an inefficient plan when the LIMIT is small. Because the query planner relies on the statistics in pg_statistic to build its execution plans, we suspected that our statistics were stale. After looking into it, we decided that running VACUUM (FULL, ANALYZE, VERBOSE) would be the best fix for us. If you use this command, be careful: it can take a while, and it locks the tables while it runs. This updated the statistics, but the query planner is still choosing a bad execution plan.

I would love to understand why the query planner is choosing such an inefficient plan and how this can be resolved.

postgresql
  • 1 answer
  • 33 Views
