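When I try to install the agent manually through yum, the latest version of the agent gets installed instead of the version I am currently using. How can I install the agent version I want? Do I need to specify anything in the datastax repository?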
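A minimal sketch of how a specific package version is usually requested from yum, in case it helps frame the question. It only builds the command strings; the package name `datastax-agent` and the version `5.0.1` are placeholders (assumptions), and whether the repository still carries the wanted version has to be checked with the first command.

```python
import shlex
from typing import List

def yum_commands(package: str, version: str) -> List[str]:
    """Build two yum invocations: list every available version, then install one."""
    # --showduplicates makes yum list all versions in the repo, not only the newest.
    list_cmd = ["yum", "--showduplicates", "list", package]
    # yum accepts a "name-version" package spec to pin the installed version.
    install_cmd = ["sudo", "yum", "install", "{}-{}".format(package, version)]
    return [shlex.join(list_cmd), shlex.join(install_cmd)]

# Placeholder values: substitute the package name and version actually in use.
for command in yum_commands("datastax-agent", "5.0.1"):
    print(command)
```

The version string passed to the install command should be copied from the output of the list command rather than typed from memory.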
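I also tried installing through OpsCenter, but it could not connect. To let OpsCenter log in to the nodes, I provided a username and password and pasted the entire private key file (.ppk) into the OpsCenter login credentials. Am I doing something wrong?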
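One thing that can be checked locally is the format of the pasted key. A .ppk file is PuTTY's own key format, while OpenSSH-based tools generally expect a PEM/OpenSSH private key. The sketch below only classifies the key text by its first line; that OpsCenter expects an OpenSSH-style key is an assumption here, not something confirmed in this thread.

```python
def key_format(text: str) -> str:
    """Classify private-key text by its first line (format check only)."""
    stripped = text.strip()
    first = stripped.splitlines()[0] if stripped else ""
    if first.startswith("PuTTY-User-Key-File-"):
        return "putty-ppk"
    if first.startswith("-----BEGIN ") and first.endswith("PRIVATE KEY-----"):
        return "pem-private-key"
    return "unknown"

# Illustrative header lines only, not real keys.
print(key_format("PuTTY-User-Key-File-2: ssh-rsa\nEncryption: none\n"))
print(key_format("-----BEGIN RSA PRIVATE KEY-----\n"))
```

If the result is "putty-ppk", PuTTYgen can export the same key in OpenSSH format, which may be worth trying before anything else.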
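We recently upgraded our company servers from DataStax Enterprise 4.5.3 to DSE 4.6.0. The only problem we are facing is with the new Backup Service: we cannot create a backup for "All Keyspaces". Backing up the keyspaces one by one, however, works like a charm. The error seems to come from the datastax-agent(s) installed on the nodes; I have attached as much detail as possible below.
OpsCenter Event Log:
Backup of all keyspaces failed: backup of all keyspaces failed for the following destinations: Snapshot
Snapshot of all keyspaces on node <node-IP> failed: clojure.lang.Compiler$CompilerException: java.lang.ClassFormatError: Invalid method Code length 96939 in class file clojure/core$eval87, compiling:(NO_SOURCE_PATH:0:0) (<node-IP>)
Snapshot of all keyspaces on node <node-IP> failed: clojure.lang.Compiler$CompilerException: java.lang.ClassFormatError: Invalid method Code length 96939 in class file clojure/core$eval87, compiling:(NO_SOURCE_PATH:0:0) (<node-IP>)
The error above (Snapshot of all keyspaces ...) is actually somewhat longer, because it appears once for every available node in the cluster, and it is followed at the end by the "Backup of all keyspaces failed: ..." error.
At the same time, all datastax-agents show the following error message: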
ERROR [qtp1549990111-47] 2015-02-13 18:35:50,887 Unhandled route Exception: clojure.lang.Compiler$CompilerException: java.lang.ClassFormatError: Invalid method Code length 96939 in class file clojure/core$eval87, compiling:(NO_SOURCE_PATH:0:0)
Compiler.java:6567 clojure.lang.Compiler.analyzeSeq
Compiler.java:6361 clojure.lang.Compiler.analyze
Compiler.java:6616 clojure.lang.Compiler.eval
Compiler.java:6608 clojure.lang.Compiler.eval
Compiler.java:6582 clojure.lang.Compiler.eval
core.clj:2852 clojure.core/eval
routes.clj:58 opsagent.http.routes/fn
core.clj:94 compojure.core/make-route[fn]
core.clj:40 compojure.core/if-route[fn]
core.clj:25 compojure.core/if-method[fn]
core.clj:107 compojure.core/routing[fn]
core.clj:2443 clojure.core/some
core.clj:107 compojure.core/routing
RestFn.java:139 clojure.lang.RestFn.applyTo
core.clj:619 clojure.core/apply
core.clj:112 compojure.core/routes[fn]
Var.java:415 clojure.lang.Var.invoke
middleware.clj:93 opsagent.http.middleware/wrap-application-error[fn]
middleware.clj:75 opsagent.http.middleware/wrap-content-type[fn]
middleware.clj:112 opsagent.http.middleware/wrap-content-error[fn]
middleware.clj:31 opsagent.http.middleware/wrap-request-logging[fn]
middleware.clj:17 opsagent.http.middleware/wrap-opscenter-id-check[fn]
middleware.clj:123 opsagent.http.middleware/wrap-version-header[fn]
keyword_params.clj:32 ring.middleware.keyword-params/wrap-keyword-params[fn]
params.clj:58 ring.middleware.params/wrap-params[fn]
jetty.clj:19 opsagent.http.jetty/proxy-handler[fn]
(Unknown Source) opsagent.http.jetty.proxy$org.eclipse.jetty.server.handler.AbstractHandler$0.handle
HandlerWrapper.java:111 org.eclipse.jetty.server.handler.HandlerWrapper.handle
Server.java:349 org.eclipse.jetty.server.Server.handle
AbstractHttpConnection.java:452 org.eclipse.jetty.server.AbstractHttpConnection.handleRequest
AbstractHttpConnection.java:894 org.eclipse.jetty.server.AbstractHttpConnection.content
AbstractHttpConnection.java:948 org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content
HttpParser.java:857 org.eclipse.jetty.http.HttpParser.parseNext
HttpParser.java:235 org.eclipse.jetty.http.HttpParser.parseAvailable
AsyncHttpConnection.java:76 org.eclipse.jetty.server.AsyncHttpConnection.handle
SelectChannelEndPoint.java:609 org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle
SelectChannelEndPoint.java:45 org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run
QueuedThreadPool.java:599 org.eclipse.jetty.util.thread.QueuedThreadPool.runJob
QueuedThreadPool.java:534 org.eclipse.jetty.util.thread.QueuedThreadPool$3.run
(Unknown Source) java.lang.Thread.run
Caused by: java.lang.ClassFormatError: Invalid method Code length 96939 in class file clojure/core$eval87
(Unknown Source) java.lang.ClassLoader.defineClass1
(Unknown Source) java.lang.ClassLoader.defineClass
(Unknown Source) java.lang.ClassLoader.defineClass
DynamicClassLoader.java:46 clojure.lang.DynamicClassLoader.defineClass
Compiler.java:4663 clojure.lang.Compiler$ObjExpr.getCompiledClass
Compiler.java:3819 clojure.lang.Compiler$FnExpr.parse
Compiler.java:6558 clojure.lang.Compiler.analyzeSeq
INFO [qtp1549990111-47] 2015-02-13 18:35:50,888 HTTP::post /ops/take-snapshot {:req-id "c13bb101-2f9e-4880-8b1f-efc178f49b3e"} - 500
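The above is from a production cluster of 5 nodes in 2 datacenters (DataStax defaults: Cassandra and Analytics DCs with DseSimpleSnitch). The Analytics DC is used with Spark and CFS. I tried the same procedure (upgrade path 4.5.3 -> 4.6.0 -> back up all keyspaces) on my local 2-machine cluster (one Cassandra node, one Analytics node) with a much smaller dataset, and it worked like a charm.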
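For context on the number in the error: the JVM class-file format limits the bytecode of a single method to 65535 bytes, and the reported code length of 96939 is well above that. The sketch below just extracts the length from the message and compares it with the limit. That the oversized method comes from a larger snapshot request on the production cluster (more keyspaces or tables than on the small local one) is only a guess consistent with the symptoms, not something the logs confirm.

```python
import re

# Class-file limit on the bytecode size of one method, in bytes.
JVM_MAX_METHOD_CODE_BYTES = 65535

def code_length_overflow(message: str):
    """Return (reported length, bytes over the limit), or None if no length is found."""
    match = re.search(r"Invalid method Code length (\d+)", message)
    if match is None:
        return None
    length = int(match.group(1))
    return length, length - JVM_MAX_METHOD_CODE_BYTES

msg = ("java.lang.ClassFormatError: Invalid method Code length 96939 "
       "in class file clojure/core$eval87")
print(code_length_overflow(msg))  # (96939, 31404)
```

The stack trace shows the failing class being produced by clojure.core/eval inside the agent's route handler, so the form being evaluated for the "all keyspaces" request is what exceeds the limit.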
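I cannot create a new cluster with DataStax OpsCenter 5.0.1.
I can add the DataStax agent to nodes of an existing cluster, but I cannot create a new cluster (the ports are open, the SSH connection works, and sudo works for the install user).
Here is what I did:
SSH-ing into the machine, I can see that nothing has been installed or transferred; apparently "agent_files.tar" has not even been scp-ed. There are no details about what might have failed during the transfer.
Excerpt from opscenterd.log at loglevel DEBUG (I deliberately used only 1 server here to avoid multiple log entries):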
2014-11-04 15:48:11+0000 [] INFO: Testing SSH connectivity to 10.133.243.24
2014-11-04 15:48:11+0000 [] INFO: Testing SSH login to 10.133.243.24
2014-11-04 15:48:11+0000 [] DEBUG: performing ssh: ['/usr/bin/ssh', '-l', u'deploy', '-p', '22', '-o', 'LogLevel=Error', u'10.133.243.24', '/usr/bin/test', '0']
2014-11-04 15:48:11+0000 [] INFO: SSH connectivity/login test succeeded
2014-11-04 15:48:11+0000 [] INFO: agent_config items: {'cassandra_log_location': '/var/log/cassandra/system.log', 'thrift_port': 9160, 'jmx_pass': '*****', 'thrift_ssl_truststore': None, 'rollups86400_ttl': -1, 'api_port': '61621', 'use_ssl': 0, 'rollups7200_ttl': 31536000, 'kerberos_debug': False, 'storage_keyspace': 'OpsCenter', 'thrift_user': '', 'provisioning': 0, 'metrics_ignored_column_families': '', 'metrics_ignored_keyspaces': 'system, system_traces, system_auth, dse_auth, OpsCenter', 'jmx_user': '', 'cassandra_install_location': '', 'kerberos_use_keytab': True, 'rollups300_ttl': 2419200, 'thrift_pass': '*****', 'metrics_ignored_solr_cores': '', 'metrics_enabled': 1, 'kerberos_use_ticket_cache': True, 'thrift_ssl_truststore_type': 'JKS', 'rollups60_ttl': 604800, 'ec2_metadata_api_host': '169.254.169.254', 'kerberos_renew_tgt': True, 'thrift_ssl_truststore_password': '*****'}
2014-11-04 15:48:12+0000 [] INFO: Starting provisioning process
2014-11-04 15:48:12+0000 [] DEBUG: Persisting config file /etc/opscenter/clusters/ChallengerDeep.conf
2014-11-04 15:48:12+0000 [] INFO: Starting installation phase of cluster provisioning
2014-11-04 15:48:12+0000 [] DEBUG: performing ssh: ['/usr/bin/ssh', '-l', u'deploy', '-p', '22', '-o', 'LogLevel=Error', u'10.133.243.24', 'echo', '-n', '.$(which apt-get 2> /dev/null) .$(which yum 2> /dev/null)']
2014-11-04 15:48:12+0000 [] DEBUG: Seeing if ip/hostname 10.133.243.24 is an ipv4 address
2014-11-04 15:48:12+0000 [] DEBUG: 10.133.243.24 is an ipv4 address
2014-11-04 15:48:12+0000 [] DEBUG: Performing HTTP request (GET): http://10.133.243.24:61621/alive?, body: None
2014-11-04 15:48:12+0000 [] WARN: HTTP request http://10.133.243.24:61621/alive? failed: Connection was refused by other side: 111: Connection refused.
2014-11-04 15:48:12+0000 [] INFO: Beginning install of OpsCenter agent to 10.133.243.24
2014-11-04 15:48:12+0000 [] DEBUG: Prepping ssh connections
2014-11-04 15:48:12+0000 [] DEBUG: performing scp: ['/usr/bin/scp', '-q', '-P', '22', '/tmp/tmpdJdGZ3', u'deploy@10.133.243.24:/tmp/tmpdJdGZ3']
2014-11-04 15:48:13+0000 [] DEBUG: performing scp: ['/usr/bin/scp', '-q', '-P', '22', './agent_files.tar', u'deploy@10.133.243.24:agent_files.tar']
2014-11-04 15:48:13+0000 [] DEBUG: performing ssh: ['/usr/bin/ssh', '-l', u'deploy', '-p', '22', '-o', 'LogLevel=Error', u'10.133.243.24', 'rm', '-rf', 'datastax-agent-installer', '&&', 'mkdir', 'datastax-agent-installer', '&&', 'cp', 'agent_files.tar', 'datastax-agent-installer/agent_files.tar', '&&', 'cd', 'datastax-agent-installer', '&&', 'tar', 'xvf', 'agent_files.tar', '&&', 'cd', '../', '&&', 'mv', '/tmp/tmpdJdGZ3', 'datastax-agent-installer/pfile', '&&', './datastax-agent-installer/bin/install_agent.sh', '', '10.133.249.88', ';', 'rm', '-rf', 'datastax-agent-installer', 'agent_files.tar']
2014-11-04 15:48:14+0000 [] DEBUG: Performing HTTP request (GET): http://10.133.243.24:61621/alive?, body: None
2014-11-04 15:48:14+0000 [] WARN: HTTP request http://10.133.243.24:61621/alive? failed: Connection was refused by other side: 111: Connection refused.
2014-11-04 15:48:14+0000 [] DEBUG: Agent is still not alive, sleeping 5 seconds...
2014-11-04 15:48:19+0000 [] DEBUG: Average opscenterd CPU usage: 2.24%, memory usage: 43 MB
2014-11-04 15:48:19+0000 [] DEBUG: Performing HTTP request (GET): http://10.133.243.24:61621/alive?, body: None
2014-11-04 15:48:19+0000 [] WARN: HTTP request http://10.133.243.24:61621/alive? failed: Connection was refused by other side: 111: Connection refused.
2014-11-04 15:48:19+0000 [] DEBUG: Agent is still not alive, sleeping 5 seconds...
2014-11-04 15:48:24+0000 [] DEBUG: Performing HTTP request (GET): http://10.133.243.24:61621/alive?, body: None
2014-11-04 15:48:24+0000 [] WARN: HTTP request http://10.133.243.24:61621/alive? failed: Connection was refused by other side: 111: Connection refused.
2014-11-04 15:48:24+0000 [] DEBUG: Agent is still not alive, sleeping 5 seconds...
2014-11-04 15:48:29+0000 [] DEBUG: Performing HTTP request (GET): http://10.133.243.24:61621/alive?, body: None
2014-11-04 15:48:29+0000 [] WARN: HTTP request http://10.133.243.24:61621/alive? failed: Connection was refused by other side: 111: Connection refused.
2014-11-04 15:48:29+0000 [] DEBUG: Agent is still not alive, sleeping 5 seconds...
2014-11-04 15:48:34+0000 [] DEBUG: Performing HTTP request (GET): http://10.133.243.24:61621/alive?, body: None
2014-11-04 15:48:34+0000 [] WARN: HTTP request http://10.133.243.24:61621/alive? failed: Connection was refused by other side: 111: Connection refused.
2014-11-04 15:48:34+0000 [] DEBUG: Agent is still not alive, sleeping 5 seconds...
2014-11-04 15:48:39+0000 [] DEBUG: Performing HTTP request (GET): http://10.133.243.24:61621/alive?, body: None
2014-11-04 15:48:39+0000 [] WARN: HTTP request http://10.133.243.24:61621/alive? failed: Connection was refused by other side: 111: Connection refused.
2014-11-04 15:48:39+0000 [] DEBUG: Agent is still not alive, sleeping 5 seconds...
2014-11-04 15:48:44+0000 [] DEBUG: Performing HTTP request (GET): http://10.133.243.24:61621/alive?, body: None
2014-11-04 15:48:44+0000 [] WARN: HTTP request http://10.133.243.24:61621/alive? failed: Connection was refused by other side: 111: Connection refused.
2014-11-04 15:48:44+0000 [] DEBUG: Agent is still not alive, sleeping 5 seconds...
2014-11-04 15:48:49+0000 [] DEBUG: Performing HTTP request (GET): http://10.133.243.24:61621/alive?, body: None
2014-11-04 15:48:49+0000 [] WARN: HTTP request http://10.133.243.24:61621/alive? failed: Connection was refused by other side: 111: Connection refused.
2014-11-04 15:48:49+0000 [] DEBUG: Agent is still not alive, sleeping 5 seconds...
2014-11-04 15:48:54+0000 [] DEBUG: Performing HTTP request (GET): http://10.133.243.24:61621/alive?, body: None
2014-11-04 15:48:54+0000 [] WARN: HTTP request http://10.133.243.24:61621/alive? failed: Connection was refused by other side: 111: Connection refused.
2014-11-04 15:48:54+0000 [] DEBUG: Agent is still not alive, sleeping 5 seconds...
2014-11-04 15:48:59+0000 [] DEBUG: Performing HTTP request (GET): http://10.133.243.24:61621/alive?, body: None
2014-11-04 15:48:59+0000 [] WARN: HTTP request http://10.133.243.24:61621/alive? failed: Connection was refused by other side: 111: Connection refused.
2014-11-04 15:48:59+0000 [] DEBUG: Agent is still not alive, sleeping 5 seconds...
2014-11-04 15:49:04+0000 [] WARN: Marking request 58fdf092-fc83-4b82-a2be-22b3e63ff795 as failed: The installed agent doesn't seem to be responding.
2014-11-04 15:49:04+0000 [] INFO: Successfully installed agent and dsc on node 10.133.243.24
2014-11-04 15:49:04+0000 [] DEBUG: Subrequests complete for 'install stage' (31cc12a1-5552-443b-8cd5-ec1a91a9191d)
2014-11-04 15:49:04+0000 [] WARN: Marking request 'install stage' (31cc12a1-5552-443b-8cd5-ec1a91a9191d) as failed: The installed agent doesn't seem to be responding.
2014-11-04 15:49:04+0000 [] ERROR: Installation stage failed: The installed agent doesn't seem to be responding.
2014-11-04 15:49:04+0000 [] DEBUG: Subrequest failed (key=install request=RequestCollection[31cc12a1-5552-443b-8cd5-ec1a91a9191d](error, The installed agent doesn't seem to be responding.)): Installation stage failed: The installed agent doesn't seem to be responding.
2014-11-04 15:49:04+0000 [] WARN: Marking request 'provision' (c5243946-3bb6-4eb5-b669-04355c319339) as failed: Installation stage failed: The installed agent doesn't seem to be responding.
2014-11-04 15:49:04+0000 [] ERROR:
2014-11-04 15:49:04+0000 [] ERROR: Cluster provisioning failed: Exception: Installation stage failed: The installed agent doesn't seem to be responding.
2014-11-04 15:49:04+0000 [] DEBUG: Seeing if ip/hostname 10.133.243.24 is an ipv4 address
2014-11-04 15:49:04+0000 [] DEBUG: 10.133.243.24 is an ipv4 address
2014-11-04 15:49:04+0000 [] ERROR: Failed to provision cluster: Cluster provisioning failed: Exception: Installation stage failed: The installed agent doesn't seem to be responding.
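To make an excerpt like this easier to read, here is a rough sketch that tallies log levels, counts the refused /alive probes, and collects the ERROR messages. The regular expression is an assumption based only on the line layout visible above. One detail worth noting from the excerpt itself: the remote ssh command ends with `; rm -rf datastax-agent-installer agent_files.tar`, so the transferred files are removed after the installer runs regardless of its result, which may be why nothing is visible on the node afterwards.

```python
import re
from collections import Counter

# Timestamp, bracketed context, level, message: the layout seen in the excerpt.
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}[+-]\d{4}) \[(.*?)\] (\w+):\s?(.*)$"
)

def summarize(lines):
    """Return (count per level, refused /alive probes, non-empty ERROR messages)."""
    levels = Counter()
    refused_probes = 0
    errors = []
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match is None:
            continue  # continuation line or unrelated text
        _timestamp, _context, level, message = match.groups()
        levels[level] += 1
        if level == "WARN" and "/alive" in message and "Connection refused" in message:
            refused_probes += 1
        if level == "ERROR" and message:
            errors.append(message)
    return levels, refused_probes, errors

# Four lines copied from the excerpt above.
sample = [
    "2014-11-04 15:48:14+0000 [] WARN: HTTP request http://10.133.243.24:61621/alive? failed: Connection was refused by other side: 111: Connection refused.",
    "2014-11-04 15:49:04+0000 [] INFO: Successfully installed agent and dsc on node 10.133.243.24",
    "2014-11-04 15:49:04+0000 [] ERROR:",
    "2014-11-04 15:49:04+0000 [] ERROR: Installation stage failed: The installed agent doesn't seem to be responding.",
]
levels, refused, errors = summarize(sample)
print(dict(levels), refused, errors)
```

Run over the full excerpt, the same function makes the pattern explicit: every probe of port 61621 is refused, which means the agent process never started listening, even though the log also reports the install as successful.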
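I have just installed OpsCenter 5.0 on Linux using the "DataStax All-in-One Installer".
I tried to add authentication by putting the following settings in /etc/opscenter/opscenterd.conf:
[authentication]
enabled = True
After restarting the server with service opscenterd restart, authentication is still not in effect.
I also tried to enable SSL by following the documentation, but that had no effect either. The server does not even listen on port 8443, the default.
After going through the log file, /var/log/opscenter/opscenterd.log, I found no relevant errors.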
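It occurred to me that OpsCenter might not be reading the configuration file at all. To check this, I put some random strings into the configuration file to provoke an error during OpsCenter daemon startup. After the restart, the log file still contained no relevant errors.
My next thought was that OpsCenter might simply ignore invalid strings. So I removed the invalid strings and changed the port value in the [webserver] section to 8887. After the restart, OpsCenter was still listening on 8888 and had not bound to 8887.
My last attempt was to reboot the whole server. That did not help either.
It looks like OpsCenter does not read the configuration file at all. What could cause this? How can I fix it?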
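As a sanity check that is independent of OpsCenter, the edited file can be parsed with a generic INI parser to confirm that the sections and values are at least well-formed. This is only a sketch: Python's configparser is not necessarily the parser opscenterd uses, and the fallback values (authentication off, port 8888) are assumptions taken from the behaviour described above.

```python
import configparser

def read_settings(conf_text: str):
    """Return (authentication enabled, webserver port) from INI-style text."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # Fallbacks mirror the observed behaviour: no authentication, port 8888.
    auth_enabled = parser.getboolean("authentication", "enabled", fallback=False)
    port = parser.getint("webserver", "port", fallback=8888)
    return auth_enabled, port

sample = """
[webserver]
port = 8887

[authentication]
enabled = True
"""
print(read_settings(sample))  # (True, 8887)
```

If the file parses as expected here but the daemon still behaves as if nothing changed, the more likely explanation is that the daemon is reading a different file than the one being edited.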
Thanks for your time, Adam
PS: Here is the full content of /var/log/opscenter/opscenterd.log. I have also provided a PasteBin link for better readability.
2014-10-20 19:40:12+0200 [] INFO: Log opened.
2014-10-20 19:40:12+0200 [] INFO: twistd 10.2.0 (/usr/bin/python2.7 2.7.6) starting up.
2014-10-20 19:40:12+0200 [] INFO: reactor class: twisted.internet.epollreactor.EPollReactor.
2014-10-20 19:40:12+0200 [] INFO: set uid/gid 0/0
2014-10-20 19:40:12+0200 [] INFO: Logging level set to 'info'
2014-10-20 19:40:12+0200 [] INFO: OpsCenter version: 5.0.1
2014-10-20 19:40:12+0200 [] INFO: Compatible agent version: 5.0.1
2014-10-20 19:40:12+0200 [] INFO: Loading per-cluster config file ./conf/clusters/local.conf
2014-10-20 19:40:12+0200 [] INFO: HTTP BASIC authentication disabled
2014-10-20 19:40:12+0200 [] INFO: Starting webserver with ssl disabled.
2014-10-20 19:40:12+0200 [] INFO: Stats Reporter is connected via HTTP
2014-10-20 19:40:12+0200 [] INFO: SSL disabled
2014-10-20 19:40:12+0200 [] ERROR: Unable to import SSL, further definition actions will fail.
2014-10-20 19:40:12+0200 [] INFO: Starting Definition Update Service
2014-10-20 19:40:12+0200 [] INFO: opscenterd.WebServer.OpsCenterdWebServer starting on 8888
2014-10-20 19:40:12+0200 [] INFO: Starting factory <opscenterd.WebServer.OpsCenterdWebServer instance at 0x7fe23a2ac128>
2014-10-20 19:40:12+0200 [] INFO: morbid.morbid.StompFactory starting on 61619
2014-10-20 19:40:12+0200 [] INFO: Starting factory <morbid.morbid.StompFactory instance at 0x7fe237a41368>
2014-10-20 19:40:12+0200 [] INFO: Configuring agent communication with ssl support disabled.
2014-10-20 19:40:12+0200 [] INFO: morbid.morbid.StompFactory starting on 61620
2014-10-20 19:40:12+0200 [] ERROR: No http agent exists, likely due to SSL import failure.
2014-10-20 19:40:12+0200 [local] INFO: Starting services for cluster local
2014-10-20 19:40:12+0200 [local] INFO: Loading event plugins
2014-10-20 19:40:12+0200 [local] INFO: Loading event plugin conf ./conf/event-plugins/posturl.conf
2014-10-20 19:40:12+0200 [local] INFO: Successfully loaded event plugin conf ./conf/event-plugins/posturl.conf
2014-10-20 19:40:12+0200 [local] INFO: Loading event plugin conf ./conf/event-plugins/email.conf
2014-10-20 19:40:12+0200 [local] INFO: Successfully loaded event plugin conf ./conf/event-plugins/email.conf
2014-10-20 19:40:12+0200 [local] INFO: Done loading event plugins
2014-10-20 19:40:12+0200 [] INFO: Metric caching enabled with 50 points and 1000 metrics cached
2014-10-20 19:40:12+0200 [] INFO: Starting PushService
2014-10-20 19:40:12+0200 [local] INFO: Starting CassandraCluster service
2014-10-20 19:40:12+0200 [local] INFO: agent_config items: {'cassandra_log_location': '/var/log/cassandra/system.log', 'thrift_port': 9160, 'jmx_pass': '*****', 'thrift_ssl_truststore': None, 'rollups86400_ttl': -1, 'api_port': '61621', 'use_ssl': 0, 'rollups7200_ttl': 31536000, 'kerberos_debug': False, 'storage_keyspace': 'OpsCenter', 'thrift_user': '', 'provisioning': 0, 'metrics_ignored_column_families': '', 'metrics_ignored_keyspaces': 'system, system_traces, system_auth, dse_auth, OpsCenter', 'jmx_user': '', 'cassandra_install_location': '', 'kerberos_use_keytab': True, 'rollups300_ttl': 2419200, 'thrift_pass': '*****', 'metrics_ignored_solr_cores': '', 'metrics_enabled': 1, 'kerberos_use_ticket_cache': True, 'thrift_ssl_truststore_type': 'JKS', 'rollups60_ttl': 604800, 'ec2_metadata_api_host': '169.254.169.254', 'kerberos_renew_tgt': True, 'thrift_ssl_truststore_password': '*****'}
2014-10-20 19:40:13+0200 [] INFO: OS Version: Linux version 3.13.0-32-generic (buildd@kissel) (gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1) ) #57-Ubuntu SMP Tue Jul 15 03:51:08 UTC 2014
2014-10-20 19:40:13+0200 [] INFO: CPU Info: ['2666.774', '2666.774']
2014-10-20 19:40:13+0200 [] INFO: Mem Info: 2989MB
2014-10-20 19:40:13+0200 [local] INFO: Enterprise functionality: True
2014-10-20 19:40:13+0200 [local] INFO: Cluster Name: Test Cluster
2014-10-20 19:40:13+0200 [local] INFO: Snitch: com.datastax.bdp.snitch.DseDelegateSnitch
2014-10-20 19:40:13+0200 [] INFO: Package Manager: aptitude
2014-10-20 19:40:13+0200 [local] INFO: Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
2014-10-20 19:40:13+0200 [local] INFO: Recognizing new node 127.0.0.1 ('-8867774524416669848')
2014-10-20 19:40:13+0200 [local] INFO: Node 127.0.0.1 has multiple tokens (vnodes). Only one picked for display.
2014-10-20 19:40:13+0200 [local] INFO: Keyspaces: {'system_traces': CassandraKeyspace(name=system_traces, column_families=[], tables=[u'events', u'sessions'], attributes={'strategy_options': {'replication_factor': '2'}, 'replica_placement_strategy': 'org.apache.cassandra.locator.SimpleStrategy'}), 'OpsCenter': CassandraKeyspace(name=OpsCenter, column_families=['events_timeline', 'settings', 'rollups60', 'rollups86400', 'bestpractice_results', 'pdps', 'rollups7200', 'events', 'rollups300'], tables=[u'events_timeline', u'settings', u'rollups60', u'rollups86400', u'bestpractice_results', u'pdps', u'rollups7200', u'events', u'rollups300'], attributes={'strategy_options': {'replication_factor': '1'}, 'replica_placement_strategy': 'org.apache.cassandra.locator.SimpleStrategy'}), 'system': CassandraKeyspace(name=system, column_families=['IndexInfo', 'NodeIdInfo', 'schema_keyspaces', 'hints'], tables=[u'peers', u'range_xfers', u'schema_keyspaces', u'schema_columns', u'IndexInfo', u'schema_triggers', u'sstable_activity', u'peer_events', u'paxos', u'batchlog', u'NodeIdInfo', u'compaction_history', u'compactions_in_progress', u'schema_columnfamilies', u'local', u'hints'], attributes={'strategy_options': {}, 'replica_placement_strategy': 'org.apache.cassandra.locator.LocalStrategy'}), 'dse_system': CassandraKeyspace(name=dse_system, column_families=[], tables=[u'encrypted_keys', u'leases'], attributes={'strategy_options': {}, 'replica_placement_strategy': 'org.apache.cassandra.locator.EverywhereStrategy'})}
2014-10-20 19:40:13+0200 [local] INFO: Persisting agent configuration to Cassandra
2014-10-20 19:40:13+0200 [local] INFO: Initializing event storage.
2014-10-20 19:40:13+0200 [local] INFO: Attempting to load all persisted alert rules
2014-10-20 19:40:13+0200 [] INFO: Starting to update agents' configuration
2014-10-20 19:40:13+0200 [local] INFO: Done loading persisted scheduled job descriptions
2014-10-20 19:40:13+0200 [local] INFO: Done loading persisted alert rules
2014-10-20 19:40:13+0200 [local] INFO: Done initializing event storage.
2014-10-20 19:40:13+0200 [local] INFO: OpsCenter starting up.
2014-10-20 19:40:13+0200 [local] INFO: Version: {'search': None, 'jobtracker': None, 'tasktracker': None, 'spark': {u'master': None, u'version': None, u'worker': None}, 'dse': u'4.5.2', 'cassandra': u'2.0.10.71'}
2014-10-20 19:40:13+0200 [local] INFO: Node 127.0.0.1 changed its mode to normal
2014-10-20 19:40:15+0200 [local] INFO: Using 127.0.0.1 as the RPC address for node 127.0.0.1
2014-10-20 19:40:44+0200 [] INFO: Received SIGTERM, shutting down.
2014-10-20 19:40:44+0200 [local] INFO: OpsCenter shutting down.
2014-10-20 19:40:44+0200 [local] INFO: Stopping repair service
2014-10-20 19:40:44+0200 [] INFO: (TCP Port 61620 Closed)
2014-10-20 19:40:44+0200 [] INFO: (TCP Port 61619 Closed)
2014-10-20 19:40:44+0200 [] INFO: Stopping factory <morbid.morbid.StompFactory instance at 0x7fe237a41368>
2014-10-20 19:40:44+0200 [] INFO: (TCP Port 8888 Closed)
2014-10-20 19:40:44+0200 [] INFO: Stopping factory <opscenterd.WebServer.OpsCenterdWebServer instance at 0x7fe23a2ac128>
2014-10-20 19:40:44+0200 [local] INFO: Stopping CassandraCluster service
2014-10-20 19:40:44+0200 [local] ERROR: Error publishing event plugin "CassandraStore": Connection closed ([Failure instance: Traceback (failure with no frames): <class 'twisted.internet.error.ConnectionLost'>: Connection to the other side was lost in a non-clean fashion: Connection lost.
])
2014-10-20 19:40:44+0200 [] INFO: Main loop terminated.
2014-10-20 19:40:44+0200 [] INFO: Server Shut Down.
2014-10-20 19:40:45+0200 [] INFO: Log opened.
2014-10-20 19:40:45+0200 [] INFO: twistd 10.2.0 (/usr/bin/python2.7 2.7.6) starting up.
2014-10-20 19:40:45+0200 [] INFO: reactor class: twisted.internet.epollreactor.EPollReactor.
2014-10-20 19:40:45+0200 [] INFO: set uid/gid 0/0
2014-10-20 19:40:45+0200 [] INFO: Logging level set to 'info'
2014-10-20 19:40:45+0200 [] INFO: OpsCenter version: 5.0.1
2014-10-20 19:40:45+0200 [] INFO: Compatible agent version: 5.0.1
2014-10-20 19:40:45+0200 [] INFO: Loading per-cluster config file ./conf/clusters/local.conf
2014-10-20 19:40:45+0200 [] INFO: HTTP BASIC authentication disabled
2014-10-20 19:40:45+0200 [] INFO: Starting webserver with ssl disabled.
2014-10-20 19:40:45+0200 [] INFO: Stats Reporter is connected via HTTP
2014-10-20 19:40:45+0200 [] INFO: SSL disabled
2014-10-20 19:40:45+0200 [] ERROR: Unable to import SSL, further definition actions will fail.
2014-10-20 19:40:45+0200 [] INFO: Starting Definition Update Service
2014-10-20 19:40:45+0200 [] INFO: opscenterd.WebServer.OpsCenterdWebServer starting on 8888
2014-10-20 19:40:45+0200 [] INFO: Starting factory <opscenterd.WebServer.OpsCenterdWebServer instance at 0x7f681b6bd128>
2014-10-20 19:40:45+0200 [] INFO: morbid.morbid.StompFactory starting on 61619
2014-10-20 19:40:45+0200 [] INFO: Starting factory <morbid.morbid.StompFactory instance at 0x7f6818e52368>
2014-10-20 19:40:45+0200 [] INFO: Configuring agent communication with ssl support disabled.
2014-10-20 19:40:45+0200 [] INFO: morbid.morbid.StompFactory starting on 61620
2014-10-20 19:40:45+0200 [] ERROR: No http agent exists, likely due to SSL import failure.
2014-10-20 19:40:45+0200 [local] INFO: Starting services for cluster local
2014-10-20 19:40:45+0200 [local] INFO: Loading event plugins
2014-10-20 19:40:45+0200 [local] INFO: Loading event plugin conf ./conf/event-plugins/posturl.conf
2014-10-20 19:40:45+0200 [local] INFO: Successfully loaded event plugin conf ./conf/event-plugins/posturl.conf
2014-10-20 19:40:45+0200 [local] INFO: Loading event plugin conf ./conf/event-plugins/email.conf
2014-10-20 19:40:45+0200 [local] INFO: Successfully loaded event plugin conf ./conf/event-plugins/email.conf
2014-10-20 19:40:45+0200 [local] INFO: Done loading event plugins
2014-10-20 19:40:45+0200 [] INFO: Metric caching enabled with 50 points and 1000 metrics cached
2014-10-20 19:40:45+0200 [] INFO: Starting PushService
2014-10-20 19:40:45+0200 [local] INFO: Starting CassandraCluster service
2014-10-20 19:40:45+0200 [local] INFO: agent_config items: {'cassandra_log_location': '/var/log/cassandra/system.log', 'thrift_port': 9160, 'jmx_pass': '*****', 'thrift_ssl_truststore': None, 'rollups86400_ttl': -1, 'api_port': '61621', 'use_ssl': 0, 'rollups7200_ttl': 31536000, 'kerberos_debug': False, 'storage_keyspace': 'OpsCenter', 'thrift_user': '', 'provisioning': 0, 'metrics_ignored_column_families': '', 'metrics_ignored_keyspaces': 'system, system_traces, system_auth, dse_auth, OpsCenter', 'jmx_user': '', 'cassandra_install_location': '', 'kerberos_use_keytab': True, 'rollups300_ttl': 2419200, 'thrift_pass': '*****', 'metrics_ignored_solr_cores': '', 'metrics_enabled': 1, 'kerberos_use_ticket_cache': True, 'thrift_ssl_truststore_type': 'JKS', 'rollups60_ttl': 604800, 'ec2_metadata_api_host': '169.254.169.254', 'kerberos_renew_tgt': True, 'thrift_ssl_truststore_password': '*****'}
2014-10-20 19:40:46+0200 [] INFO: OS Version: Linux version 3.13.0-32-generic (buildd@kissel) (gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1) ) #57-Ubuntu SMP Tue Jul 15 03:51:08 UTC 2014
2014-10-20 19:40:46+0200 [] INFO: CPU Info: ['2666.774', '2666.774']
2014-10-20 19:40:46+0200 [] INFO: Mem Info: 2989MB
2014-10-20 19:40:46+0200 [local] INFO: Enterprise functionality: True
2014-10-20 19:40:46+0200 [local] INFO: Cluster Name: Test Cluster
2014-10-20 19:40:46+0200 [] INFO: Package Manager: aptitude
2014-10-20 19:40:46+0200 [local] INFO: Snitch: com.datastax.bdp.snitch.DseDelegateSnitch
2014-10-20 19:40:46+0200 [local] INFO: Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
2014-10-20 19:40:46+0200 [local] INFO: Recognizing new node 127.0.0.1 ('-8867774524416669848')
2014-10-20 19:40:46+0200 [local] INFO: Node 127.0.0.1 has multiple tokens (vnodes). Only one picked for display.
2014-10-20 19:40:46+0200 [local] INFO: Keyspaces: {'system_traces': CassandraKeyspace(name=system_traces, column_families=[], tables=[u'events', u'sessions'], attributes={'strategy_options': {'replication_factor': '2'}, 'replica_placement_strategy': 'org.apache.cassandra.locator.SimpleStrategy'}), 'OpsCenter': CassandraKeyspace(name=OpsCenter, column_families=['events_timeline', 'settings', 'rollups60', 'rollups86400', 'bestpractice_results', 'pdps', 'rollups7200', 'events', 'rollups300'], tables=[u'events_timeline', u'settings', u'rollups60', u'rollups86400', u'bestpractice_results', u'pdps', u'rollups7200', u'events', u'rollups300'], attributes={'strategy_options': {'replication_factor': '1'}, 'replica_placement_strategy': 'org.apache.cassandra.locator.SimpleStrategy'}), 'system': CassandraKeyspace(name=system, column_families=['IndexInfo', 'NodeIdInfo', 'schema_keyspaces', 'hints'], tables=[u'peers', u'range_xfers', u'schema_keyspaces', u'schema_columns', u'IndexInfo', u'schema_triggers', u'sstable_activity', u'peer_events', u'paxos', u'batchlog', u'NodeIdInfo', u'compaction_history', u'compactions_in_progress', u'schema_columnfamilies', u'local', u'hints'], attributes={'strategy_options': {}, 'replica_placement_strategy': 'org.apache.cassandra.locator.LocalStrategy'}), 'dse_system': CassandraKeyspace(name=dse_system, column_families=[], tables=[u'encrypted_keys', u'leases'], attributes={'strategy_options': {}, 'replica_placement_strategy': 'org.apache.cassandra.locator.EverywhereStrategy'})}
2014-10-20 19:40:46+0200 [local] INFO: Persisting agent configuration to Cassandra
2014-10-20 19:40:46+0200 [local] INFO: Initializing event storage.
2014-10-20 19:40:46+0200 [local] INFO: Attempting to load all persisted alert rules
2014-10-20 19:40:46+0200 [local] INFO: Done loading persisted alert rules
2014-10-20 19:40:46+0200 [local] INFO: Done initializing event storage.
2014-10-20 19:40:46+0200 [local] INFO: Done loading persisted scheduled job descriptions
2014-10-20 19:40:46+0200 [] INFO: Starting to update agents' configuration
2014-10-20 19:40:46+0200 [local] INFO: OpsCenter starting up.
2014-10-20 19:40:46+0200 [local] INFO: Version: {'search': None, 'jobtracker': None, 'tasktracker': None, 'spark': {u'master': None, u'version': None, u'worker': None}, 'dse': u'4.5.2', 'cassandra': u'2.0.10.71'}
2014-10-20 19:40:46+0200 [local] INFO: Node 127.0.0.1 changed its mode to normal
2014-10-20 19:40:48+0200 [local] INFO: Using 127.0.0.1 as the RPC address for node 127.0.0.1
2014-10-20 19:40:59+0200 [local] INFO: Agent for ip 127.0.0.1 is version u'5.0.1'
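One detail in the log above may be relevant: the daemon reports loading ./conf/clusters/local.conf and ./conf/event-plugins/*.conf, which are paths relative to its working directory rather than under /etc/opscenter. The sketch below pulls those paths out of the log lines. That this installation therefore reads its main configuration from a conf directory under its install location is an inference, not something the log states outright.

```python
import re

# Matches the two "Loading ..." messages that name a file in the log above.
LOAD_RE = re.compile(r"Loading (?:per-cluster config file|event plugin conf) (\S+)")

def loaded_paths(log_lines):
    """Collect every config path the daemon reports loading."""
    paths = []
    for line in log_lines:
        match = LOAD_RE.search(line)
        if match:
            paths.append(match.group(1))
    return paths

def relative_paths(paths):
    """Paths not starting with '/', i.e. resolved against the daemon's working directory."""
    return [path for path in paths if not path.startswith("/")]

# Two lines copied from the log above.
sample = [
    "2014-10-20 19:40:45+0200 [] INFO: Loading per-cluster config file ./conf/clusters/local.conf",
    "2014-10-20 19:40:45+0200 [local] INFO: Loading event plugin conf ./conf/event-plugins/email.conf",
]
print(relative_paths(loaded_paths(sample)))
```

If all reported paths are relative, looking for an opscenterd.conf next to that conf directory would be a reasonable next step.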
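Adding a cluster in OpsCenter 5.0 works without problems. As far as I can tell, there are no errors in opscenterd.log. However, the web interface does not show any nodes. All the XHR calls I can see in the browser look fine.
If I try to click "Cluster Actions" -> "Configure", I get the following: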
Error
Unable to get definition for cassandra-yaml for cluster settings.
I tried reinstalling OpsCenter (as well as dropping the OpsCenter keyspace), but it made no difference.
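Simple newbie question. I launched 3 instances in EC2: one Cassandra node, one Solr node, and one Spark node. I expected them to show up in OpsCenter as a single cluster ring, but they actually show up as 3 separate rings with only one node in each. Is this the expected behavior, or did I do something silly?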
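A possible way to picture this, assuming the nodes were started with DSE's default snitch: that snitch places nodes in separate datacenters according to workload type, and a ring view drawn per datacenter would then show one ring per workload. The sketch below only groups nodes by datacenter; the addresses and datacenter names are made up for illustration, and whether this matches the actual setup is not confirmed by anything in the question.

```python
from collections import defaultdict

def rings_by_datacenter(nodes):
    """Group (address, datacenter) pairs into one list of addresses per datacenter."""
    rings = defaultdict(list)
    for address, datacenter in nodes:
        rings[datacenter].append(address)
    return dict(rings)

# Hypothetical nodes: one per workload type, each landing in its own datacenter.
nodes = [
    ("10.0.0.1", "Cassandra"),
    ("10.0.0.2", "Solr"),
    ("10.0.0.3", "Analytics"),
]
print(rings_by_datacenter(nodes))
```

With one node per workload type, each datacenter holds a single node, which would look like three one-node rings while still being a single cluster.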