Estou tendo problemas com falhas inesperadas no sistema provenientes de uma implantação do postgresql em um servidor de produção moderadamente ocupado (aplicativo Django 1.8 implantado no droplet DO com um gunicorn
servidor de aplicativos e nginx
proxy reverso).
Sou um DBA acidental e preciso da opinião de um especialista para entender o que está acontecendo. Eu não tenho ideia agora; Vou fornecer vários dados de log abaixo:
Todo o aplicativo trava - a nova relíquia acaba mostrando o seguinte erro:
django.db.utils:OperationalError: ERROR: no more connections allowed (max_client_conn)
Rastreamento de pilha completo sendo:
File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/gevent/baseserver.py", line 26, in _handle_and_close_when_done
File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/gunicorn/workers/ggevent.py", line 155, in handle
File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/gunicorn/workers/base_async.py", line 56, in handle
File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/gunicorn/workers/ggevent.py", line 160, in handle_request
File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/gunicorn/workers/base_async.py", line 114, in handle_request
File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/newrelic-2.56.0.42/newrelic/api/web_transaction.py", line 704, in __iter__
File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/newrelic-2.56.0.42/newrelic/api/web_transaction.py", line 1080, in __call__
File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/django/core/handlers/wsgi.py", line 189, in __call__
File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/django/core/handlers/base.py", line 108, in get_response
File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/newrelic-2.56.0.42/newrelic/hooks/framework_django.py", line 228, in wrapper
File "/home/ubuntu/app/mproject/middleware/mycustommiddleware.py", line 6, in process_request
File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/django/utils/functional.py", line 225, in inner
File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/django/utils/functional.py", line 376, in _setup
File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/django/contrib/auth/middleware.py", line 22, in <lambda>
File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/django/contrib/auth/middleware.py", line 10, in get_user
File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/django/contrib/auth/__init__.py", line 174, in get_user
File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/django/contrib/auth/backends.py", line 93, in get_user
File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/django/db/models/manager.py", line 127, in manager_method
File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/django/db/models/query.py", line 328, in get
File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/django/db/models/query.py", line 144, in __len__
File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/django/db/models/query.py", line 965, in _fetch_all
File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/django/db/models/query.py", line 238, in iterator
File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 838, in execute_sql
File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/django/db/backends/base/base.py", line 164, in cursor
File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/django/db/backends/base/base.py", line 135, in _cursor
File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/django/db/backends/base/base.py", line 130, in ensure_connection
File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/django/db/utils.py", line 98, in __exit__
File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/django/db/backends/base/base.py", line 130, in ensure_connection
File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/django/db/backends/base/base.py", line 119, in connect
File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/django/db/backends/postgresql_psycopg2/base.py", line 176, in get_new_connection
File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/newrelic-2.56.0.42/newrelic/hooks/database_dbapi2.py", line 102, in __call__
File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/psycopg2/__init__.py", line 126, in connect
File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/psycogreen/gevent.py", line 32, in gevent_wait_callback
A nova relíquia também mostra um grande aumento em Postgres set
:
O log lento do Postgresql mostra durações de COMMIT exorbitantes assim:
2020-05-15 01:22:47.474 UTC [10722] ubuntu@myapp LOG: duration: 280668.902 ms statement: COMMIT
2020-05-15 01:22:47.474 UTC [11536] ubuntu@myapp LOG: duration: 49251.277 ms statement: COMMIT
2020-05-15 01:22:47.474 UTC [10719] ubuntu@myapp LOG: duration: 281609.394 ms statement: COMMIT
2020-05-15 01:22:47.474 UTC [10719] ubuntu@myapp LOG: could not send data to client: Broken pipe
2020-05-15 01:22:47.474 UTC [10719] ubuntu@myapp FATAL: connection to client lost
2020-05-15 01:22:47.475 UTC [10672] ubuntu@myapp LOG: duration: 285371.872 ms statement: COMMIT
O painel do oceano digital mostra um pico em Load Average
e Disk IO
(mas CPU
, memory usage
e disk usage
permanece baixo).
As estatísticas do instantâneo no momento desse Load Average
pico são:
syslog
tem inúmeras instâncias do seguinte log de erros:
May 15 01:22:46 main-app gunicorn[10779]: error: [Errno 111] Connection refused
May 15 01:22:46 main-app gunicorn[10779]: [2020-05-15 01:22:46 +0000] [10874] [ERROR] Socket error processing request.
May 15 01:22:46 main-app gunicorn[10779]: Traceback (most recent call last):
May 15 01:22:46 main-app gunicorn[10779]: File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/gunicorn/workers/base_async.py", line 66, in handle
May 15 01:22:46 main-app gunicorn[10779]: six.reraise(*sys.exc_info())
May 15 01:22:46 main-app gunicorn[10779]: File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/gunicorn/workers/base_async.py", line 56, in handle
May 15 01:22:46 main-app gunicorn[10779]: self.handle_request(listener_name, req, client, addr)
May 15 01:22:46 main-app gunicorn[10779]: File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/gunicorn/workers/ggevent.py", line 160, in handle_request
May 15 01:22:46 main-app gunicorn[10779]: addr)
May 15 01:22:46 main-app gunicorn[10779]: File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/gunicorn/workers/base_async.py", line 129, in handle_request
May 15 01:22:46 main-app gunicorn[10779]: six.reraise(*sys.exc_info())
May 15 01:22:46 main-app gunicorn[10779]: File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/gunicorn/workers/base_async.py", line 114, in handle_request
May 15 01:22:46 main-app gunicorn[10779]: for item in respiter:
May 15 01:22:46 main-app gunicorn[10779]: File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/newrelic-2.56.0.42/newrelic/api/web_transaction.py", line 704, in __iter__
May 15 01:22:46 main-app gunicorn[10779]: for item in self.generator:
May 15 01:22:46 main-app gunicorn[10779]: File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/newrelic-2.56.0.42/newrelic/api/web_transaction.py", line 1080, in __call__
May 15 01:22:46 main-app gunicorn[10779]: self.start_response)
May 15 01:22:46 main-app gunicorn[10779]: File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/django/core/handlers/wsgi.py", line 189, in __call__
May 15 01:22:46 main-app gunicorn[10779]: response = self.get_response(request)
May 15 01:22:46 main-app gunicorn[10779]: File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/django/core/handlers/base.py", line 218, in get_response
May 15 01:22:46 main-app gunicorn[10779]: response = self.handle_uncaught_exception(request, resolver, sys.exc_info())
May 15 01:22:46 main-app gunicorn[10779]: File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/newrelic-2.56.0.42/newrelic/hooks/framework_django.py", line 448, in wrapper
May 15 01:22:46 main-app gunicorn[10779]: return _wrapped(*args, **kwargs)
May 15 01:22:46 main-app gunicorn[10779]: File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/newrelic-2.56.0.42/newrelic/hooks/framework_django.py", line 441, in _wrapped
May 15 01:22:46 main-app gunicorn[10779]: return wrapped(request, resolver, exc_info)
May 15 01:22:46 main-app gunicorn[10779]: File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/django/core/handlers/base.py", line 256, in handle_uncaught_exception
May 15 01:22:46 main-app gunicorn[10779]: 'request': request
May 15 01:22:46 main-app gunicorn[10779]: File "/usr/lib/python2.7/logging/__init__.py", line 1193, in error
May 15 01:22:46 main-app gunicorn[10779]: self._log(ERROR, msg, args, **kwargs)
May 15 01:22:46 main-app gunicorn[10779]: File "/usr/lib/python2.7/logging/__init__.py", line 1286, in _log
May 15 01:22:46 main-app gunicorn[10779]: self.handle(record)
May 15 01:22:46 main-app gunicorn[10779]: File "/usr/lib/python2.7/logging/__init__.py", line 1296, in handle
May 15 01:22:46 main-app gunicorn[10779]: self.callHandlers(record)
May 15 01:22:46 main-app gunicorn[10779]: File "/usr/lib/python2.7/logging/__init__.py", line 1336, in callHandlers
May 15 01:22:46 main-app gunicorn[10779]: hdlr.handle(record)
May 15 01:22:46 main-app gunicorn[10779]: File "/usr/lib/python2.7/logging/__init__.py", line 759, in handle
May 15 01:22:46 main-app gunicorn[10779]: self.emit(record)
May 15 01:22:46 main-app gunicorn[10779]: File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/django/utils/log.py", line 129, in emit
May 15 01:22:46 main-app gunicorn[10779]: self.send_mail(subject, message, fail_silently=True, html_message=html_message)
May 15 01:22:46 main-app gunicorn[10779]: File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/django/utils/log.py", line 132, in send_mail
May 15 01:22:46 main-app gunicorn[10779]: mail.mail_admins(subject, message, *args, connection=self.connection(), **kwargs)
May 15 01:22:46 main-app gunicorn[10779]: File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/newrelic-2.56.0.42/newrelic/api/function_trace.py", line 110, in literal_wrapper
May 15 01:22:46 main-app gunicorn[10779]: return wrapped(*args, **kwargs)
May 15 01:22:46 main-app gunicorn[10779]: File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/django/core/mail/__init__.py", line 98, in mail_admins
May 15 01:22:46 main-app gunicorn[10779]: mail.send(fail_silently=fail_silently)
May 15 01:22:46 main-app gunicorn[10779]: File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/newrelic-2.56.0.42/newrelic/api/function_trace.py", line 110, in literal_wrapper
May 15 01:22:46 main-app gunicorn[10779]: return wrapped(*args, **kwargs)
May 15 01:22:46 main-app gunicorn[10779]: File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/django/core/mail/message.py", line 303, in send
May 15 01:22:46 main-app gunicorn[10779]: return self.get_connection(fail_silently).send_messages([self])
May 15 01:22:46 main-app gunicorn[10779]: File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/django/core/mail/backends/smtp.py", line 100, in send_messages
May 15 01:22:46 main-app gunicorn[10779]: new_conn_created = self.open()
May 15 01:22:46 main-app gunicorn[10779]: File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/django/core/mail/backends/smtp.py", line 58, in open
May 15 01:22:46 main-app gunicorn[10779]: self.connection = connection_class(self.host, self.port, **connection_params)
May 15 01:22:46 main-app gunicorn[10779]: File "/usr/lib/python2.7/smtplib.py", line 256, in __init__
May 15 01:22:46 main-app gunicorn[10779]: (code, msg) = self.connect(host, port)
May 15 01:22:46 main-app gunicorn[10779]: File "/usr/lib/python2.7/smtplib.py", line 316, in connect
May 15 01:22:46 main-app gunicorn[10779]: self.sock = self._get_socket(host, port, self.timeout)
May 15 01:22:46 main-app gunicorn[10779]: File "/usr/lib/python2.7/smtplib.py", line 291, in _get_socket
May 15 01:22:46 main-app gunicorn[10779]: return socket.create_connection((host, port), timeout)
May 15 01:22:46 main-app gunicorn[10779]: File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/gevent/socket.py", line 96, in create_connection
May 15 01:22:46 main-app gunicorn[10779]: sock.connect(sa)
May 15 01:22:46 main-app gunicorn[10779]: File "/home/ubuntu/.virtualenvs/app/local/lib/python2.7/site-packages/gevent/_socket2.py", line 244, in connect
May 15 01:22:46 main-app gunicorn[10779]: raise error(err, strerror(err))
Algumas linhas de vmtstat 1
:
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 0 4698028 280600 16692992 0 0 0 14 0 0 13 1 87 0 0
3 0 0 4697204 280600 16693016 0 0 0 24 8294 8111 12 0 88 0 0
3 0 0 4698148 280600 16693044 0 0 0 204 8533 9047 15 0 85 0 0
4 0 0 4697628 280600 16693092 0 0 0 76 8695 8775 17 1 83 0 0
3 0 0 4697288 280600 16693140 0 0 8 16 9578 9720 19 1 80 0 0
3 0 0 4695768 280600 16693200 0 0 0 152 8594 8073 16 1 83 0 0
6 0 0 4686044 280600 16693272 0 0 0 32 7998 7706 12 1 87 0 0
5 0 0 4681876 280600 16693308 0 0 0 224 9130 8810 14 1 85 0 0
Isso é praticamente tudo o que tenho. Eu não posso juntar o que poderia estar acontecendo em tudo. Está acontecendo quase uma vez por dia agora. Algum especialista pode dar uma luz sobre isso? Desde já, obrigado!
Nota: em caso de garantia, SELECT version();
produz:
PostgreSQL 9.6.16 on x86_64-pc-linux-gnu (Ubuntu 9.6.16-1.pgdg16.04+1), compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609, 64-bit