I'm curious about this, considering millions of elements: if I have a paginated query of 100 elements, and exactly 100 elements match my query, will Elasticsearch always return all of them exactly once, or is it possible that it sometimes returns fewer than 100 elements when paging through the index?
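For concreteness, this is the kind of paged query I mean (a sketch; my_index, the status field, and the created_at sort field are placeholders, with _doc added as a tie-breaker so page boundaries stay deterministic):

GET /my_index/_search
{
  "from": 0,
  "size": 100,
  "query": {
    "match": { "status": "active" }
  },
  "sort": [
    { "created_at": "asc" },
    "_doc"
  ]
}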
In OpenSearch, I implemented a custom script_score using the Painless scripting language. When I only use query.bool.should, the script is called once per document and the returned _score is correct.
However, when I combine query.bool.should and query.bool.must in the query, the script is called two or three times per document, and the final score is the sum of all the calls. This leads to scores higher than expected.
Why does this happen? How can I make sure the script is called only once per document when the query uses both should and must? Or, at the very least, how can I stop OpenSearch from summing the results of all the calls for a document and have it return the result of just one of them?
For example, see the query below (simplified here so the example is easy to follow). The script source is return Integer.parseInt(doc['_id'].value); but because I use both should and must in my query, the _score computed for document 6148 is 18444 (i.e. 6148 * 3) instead of 6148.
{
"from": 0,
"size": 10,
"stored_fields": "_none_",
"docvalue_fields": [
"_id",
"_score"
],
"sort": [
{
"_score": {
"order": "asc"
}
}
],
"query": {
"function_score": {
"query": {
"bool": {
"must": [
{
"term": { "category_ids": "2" }
},
{
"terms": { "visibility": ["3", "4"] }
}
],
"should": [
{
"ids": {
"values": [
"6148"
]
}
}
],
"minimum_should_match": 1
}
},
"script_score": {
"script": {
"lang": "painless",
"source": "return Integer.parseInt(doc['_id'].value);"
}
}
}
}
}
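One detail I noticed while digging (a sketch, not a confirmed fix): function_score has a boost_mode parameter that controls how the script result is combined with the inner query's own score, and its default is multiply. If the matching clauses contributed a combined query score of exactly 3.0, that would explain 18444 = 6148 * 3. With boost_mode set to replace, only the script result would be kept:

GET /my_index/_search
{
  "query": {
    "function_score": {
      "query": {
        "ids": { "values": ["6148"] }
      },
      "script_score": {
        "script": {
          "lang": "painless",
          "source": "return Integer.parseInt(doc['_id'].value);"
        }
      },
      "boost_mode": "replace"
    }
  }
}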
Below is the query I use to fetch data for the past three months (from the start of the month). How can I change it to the start of the current week?
{
"size": 10,
"timeout": "1000s",
"query": {
"bool": {
"must": [
{
"match": {
"filter_grouper": "ABC"
}
},
{
"match": {
"state": "High"
}
},
{
"range": {
"end_date": {
"gte": "now-3M/M",
"lte": "now"
}
}
}
]
}
},
"_source": [
"number",
"barrel"
]
}
I tried using:
{
"range": {
"end_date": {
"gte": "now/yyyy-W/W",
"lte": "now"
}
}
}
but it throws an error:
{
"type": "parse_exception",
"reason": "operator not supported for date math [/yyyy-W/W]"
}
The same happens with now-1W/W.
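From the date-math rounding units I've seen documented (y, M, w, d, h, m, s), week rounding is written as a bare /w with no format pattern, so I would expect something like this to give the start of the current week (a sketch of just the range clause):

{
  "range": {
    "end_date": {
      "gte": "now/w",
      "lte": "now"
    }
  }
}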
I'm trying to add a synonym filter to my Elasticsearch analyzer. To be precise, I need the synonyms in the index analyzer, because I use fuzzy search and I want it to recognize typos in the synonyms as well.
But when I try to create the index analyzer:
"filter": {
"synonyms": {
"type": "synonym_graph",
"synonyms": [
...
]
}
}
"index_analyzer": {
"tokenizer": "standard",
"filter": ["lowercase", "synonyms", "edge_ngram"]
}
and then assign it as the field's index analyzer:
"text" : {
"type" : "text",
"analyzer": "index_analyzer",
"search_analyzer": "search_analyzer"
}
I get an error:
{
"error" : {
"root_cause" : [
{
"type" : "mapper_exception",
"reason" : "analyzer [index_analyzer] contains filters [synonyms] that are not allowed to run in index time mode."
}
],
"type" : "mapper_parsing_exception",
"reason" : "Failed to parse mapping [_doc]: analyzer [index_analyzer] contains filters [synonyms] that are not allowed to run in index time mode.",
"caused_by" : {
"type" : "mapper_exception",
"reason" : "analyzer [index_analyzer] contains filters [synonyms] that are not allowed to run in index time mode."
}
},
"status" : 400
}
even though the Elasticsearch documentation says:
"You can specify the analyzer that contains your synonym set as a search-time analyzer or as an index-time analyzer."
What am I doing wrong? The Elasticsearch version is 7.17.5.
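For comparison, my understanding is that the plain synonym filter (unlike synonym_graph) is accepted at index time, so below is a sketch of what I could try instead; the synonym pair is a placeholder, I've left out the edge_ngram filter for brevity, and I'm not certain the two filter types behave identically for my fuzzy-search case:

PUT /my_index
{
  "settings": {
    "analysis": {
      "filter": {
        "synonyms": {
          "type": "synonym",
          "synonyms": ["laptop, notebook"]
        }
      },
      "analyzer": {
        "index_analyzer": {
          "tokenizer": "standard",
          "filter": ["lowercase", "synonyms"]
        }
      }
    }
  }
}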
I developed an operator that watches a CR I created; let's say its kind is MyCustomResource1. It does what any operator does: in its reconcile loop, it brings the current cluster state closer to the desired state.
Now, in my CR spec there is a field like this:
elastic_configuration:
cpu: 2
memory: 4Gi
As part of its reconcile loop, the operator makes sure an Elasticsearch cluster with N pods is running, with memory and CPU determined by the fields elastic_configuration.cpu and elastic_configuration.memory respectively.
Now, the actual Elasticsearch cluster in K8s is started by the ECK operator (which watches its own CR of kind Elasticsearch), but I want the memory and CPU values to be part of the MyCustomResource1 spec.
Whenever users want to increase the CPU and memory of the Elasticsearch pods, they should edit the values of the elastic_configuration.cpu and elastic_configuration.memory fields in my CR. My operator, in its reconcile loop, will detect that these values have changed and update the corresponding fields of the Elasticsearch resource spec, which the ECK operator watches.
The question is: what is the best way for my operator to update that CR?
The most popular solutions online suggest using unstructured objects, but I find it somewhat tedious to repeatedly convert everything to map[string]interface{} and then cast it to the desired types.
Another solution I thought of: since I'm using ECK version 2.13.0, I could clone the whole repo, or copy the relevant structs that represent the Elasticsearch resource (as of 2.13.0, of course), update the struct's fields in my operator's reconcile loop, and then somehow use this struct to update the CR through some K8s client. I know I'm short on details here, but that's the high-level idea I had, and I'm not sure whether it's feasible.
TL;DR: what is the correct way to programmatically update a custom resource (say, Elasticsearch in K8s) in Go?
I'm currently having trouble building the right Elasticsearch query. The goal is to implement a filtering mechanism with the following requirements:
- My own posts (userId matches the current user's userId) should always be shown, regardless of the hiddenUntil condition.
- Posts from other users (userId does not match the current user's userId) should only be shown when the hiddenUntil field is less than or equal to the current time.
In SQL, this is straightforward with an OR condition, like this:
SELECT *
FROM posts
WHERE userId = 'my_user_id'
OR (userId != 'my_user_id' AND hiddenUntil <= CURRENT_TIMESTAMP);
How can I write an Elasticsearch query that behaves the same as this SQL? This is what I have so far:
{
"query": {
"bool": {
"filter": [
{
"term": { "userId": "my_user_id" }
},
{
"bool": {
"must": [
{ "range": { "hiddenUntil": { "lte": 1728291700750 } } }
],
"must_not": [
{ "term": { "userId": "my_user_id" } }
]
}
}
]
}
}
}
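For reference, my reading of how the SQL OR would map onto bool/should (a sketch, assuming an index named posts; the timestamp is a placeholder for the current time, and the userId != ... guard becomes redundant because the first should clause already covers my own posts):

GET /posts/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "bool": {
            "should": [
              { "term": { "userId": "my_user_id" } },
              { "range": { "hiddenUntil": { "lte": 1728291700750 } } }
            ],
            "minimum_should_match": 1
          }
        }
      ]
    }
  }
}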
- Create the index with a search-time (query-time) analyzer only:
PUT /local_persons
{
"settings": {
"analysis": {
"analyzer": {
"person_search_analyzer": {
"type": "custom",
"char_filter": ["remove_special_chars"],
"filter": ["lowercase"],
"tokenizer": "whitespace"
}
},
"char_filter": {
"remove_special_chars": {
"type": "pattern_replace",
"pattern": "[^a-zA-Z0-9]",
"replacement": ""
}
}
}
}
}
- Index data containing special characters:
PUT /local_persons/_doc/1
{
"id": 1,
"firstName": "Re'mo",
"lastName": "D'souza",
"email": "[email protected],
"dateOfBirth": "1973-01-01",
"isActive": 1
}
Now, searching for the person at query time:
Approach 1: search with query_string (analyzer applied at query time)
GET /local_persons/_search
{
"query": {
"bool": {
"must": [
{
"query_string": {
"query": "remo",
"fields": ["firstName"],
"analyzer": "person_search_analyzer"
}
},
{
"query_string": {
"query": "dsouza",
"fields": ["lastName"],
"analyzer": "person_search_analyzer"
}
}
]
}
}
}
Approach 2: using a match query
GET /local_persons/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"firstName": {
"query": "remo",
"analyzer": "person_search_analyzer"
}
}
},
{
"match": {
"lastName": {
"query": "dsouza",
"analyzer": "person_search_analyzer"
}
}
}
]
}
}
}
But neither of the above approaches returns the expected results.
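To compare what is actually stored against what the search analyzer produces, the _analyze API can be pointed both at the custom analyzer and at the field itself (a sketch):

GET /local_persons/_analyze
{
  "analyzer": "person_search_analyzer",
  "text": "Re'mo"
}

GET /local_persons/_analyze
{
  "field": "firstName",
  "text": "Re'mo"
}

My expectation is that the first call yields remo (special characters stripped) while the second, which uses the field's index-time analysis, keeps the apostrophe, which would explain why the searches above find nothing.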
I tried to start Elasticsearch this morning, but I get the following logs in my terminal:
[2024-06-19T11:44:33,701][INFO ][o.e.r.s.FileSettingsService] [LAPTOP-32SOQ4EE] setting file [C:\Users\amazi\Downloads\elasticsearch-8.14.0\config\operator\settings.json] not found, initializing [file_settings] as empty
[2024-06-19T11:44:33,711][INFO ][o.e.c.c.NodeJoinExecutor ] [LAPTOP-32SOQ4EE] node-join: [{LAPTOP-32SOQ4EE}{uX3EK5tVQOieVnq7d6JRVQ}{wDv5vWGCS3a2CgZukxOiVw}{LAPTOP-32SOQ4EE}{127.0.0.1}{127.0.0.1:9300}{cdfhilmrstw}{8.14.0}{7000099-8505000}] with reason [completing election]
[2024-06-19T11:44:33,727][INFO ][o.e.h.AbstractHttpServerTransport] [LAPTOP-32SOQ4EE] publish_address {192.168.10.249:9200}, bound_addresses {[::]:9200}
[2024-06-19T11:44:33,789][INFO ][o.e.n.Node ] [LAPTOP-32SOQ4EE] started {LAPTOP-32SOQ4EE}{uX3EK5tVQOieVnq7d6JRVQ}{wDv5vWGCS3a2CgZukxOiVw}{LAPTOP-32SOQ4EE}{127.0.0.1}{127.0.0.1:9300}{cdfhilmrstw}{8.14.0}{7000099-8505000}{ml.allocated_processors=4, ml.machine_memory=3967725568, transform.config_version=10.0.0, xpack.installed=true, ml.config_version=12.0.0, ml.max_jvm_size=1983905792, ml.allocated_processors_double=4.0}
[2024-06-19T11:44:35,133][INFO ][o.e.x.s.a.Realms ] [LAPTOP-32SOQ4EE] license mode is [basic], currently licensed security realms are [reserved/reserved,file/default_file,native/default_native]
[2024-06-19T11:44:35,235][INFO ][o.e.l.ClusterStateLicenseService] [LAPTOP-32SOQ4EE] license [2756b818-3984-42d0-938c-4843baed9365] mode [basic] - valid
[2024-06-19T11:44:35,301][INFO ][o.e.g.GatewayService ] [LAPTOP-32SOQ4EE] recovered [34] indices into cluster_state
[2024-06-19T11:44:39,608][INFO ][o.e.h.n.s.HealthNodeTaskExecutor] [LAPTOP-32SOQ4EE] Node [{LAPTOP-32SOQ4EE}{uX3EK5tVQOieVnq7d6JRVQ}] is selected as the current health node.
[2024-06-19T11:44:45,176][INFO ][o.e.c.r.a.AllocationService] [LAPTOP-32SOQ4EE] current.health="YELLOW" message="Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[.kibana-observability-ai-assistant-conversations-000001][0]]])." previous.health="RED" reason="shards started [[.kibana-observability-ai-assistant-conversations-000001][0]]"
I can access Elasticsearch just fine, but when I try to start Kibana it doesn't work, and I get the following logs:
[2024-06-19T11:37:08.345+02:00][INFO ][plugins.alerting] using indexes and aliases for persisting alerts
[2024-06-19T11:37:37.820+02:00][WARN ][plugins.reporting.config] Generating a random key for xpack.reporting.encryptionKey. To prevent sessions from being invalidated on restart, please set xpack.reporting.encryptionKey in the kibana.yml or use the bin/kibana-encryption-keys command.
[2024-06-19T11:37:49.806+02:00][INFO ][plugins.cloudSecurityPosture] Registered task successfully [Task: cloud_security_posture-stats_task]
[2024-06-19T11:38:44.959+02:00][INFO ][plugins.securitySolution.endpoint:user-artifact-packager:1.0.0] Registering endpoint:user-artifact-packager task with timeout
of [20m], interval of [60s] and policy update batch size of [25]
[2024-06-19T11:38:44.961+02:00][INFO ][plugins.securitySolution.endpoint:complete-external-response-actions] Registering task [endpoint:complete-external-response-actions] with timeout of [5m] and run interval of [60s]
[2024-06-19T11:38:53.068+02:00][INFO ][plugins.assetManager] Server is NOT enabled
[2024-06-19T11:38:54.134+02:00][INFO ][plugins.screenshotting.chromium] Browser executable: C:\Users\amazi\Downloads\kibana-8.14.0\node_modules\@kbn\screenshotting-plugin\chromium\chrome-win\chrome.exe
[2024-06-19T11:39:56.825+02:00][ERROR][elasticsearch-service] Unable to retrieve version information from Elasticsearch nodes. connect ETIMEDOUT 192.168.10.28:9200
I suspect this is caused by the yellow health status, because everything worked fine yesterday and I haven't changed anything in the .yml files; they still have the default settings. Any ideas? By the way, I'm on Windows.
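To rule the health status in or out, I'm planning to query it directly (a sketch; both calls are standard cluster APIs):

GET _cluster/health

GET _cat/indices?v&health=yellow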
Just in case, here are my .yml files. elasticsearch.yml:
# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
# Before you set out to tweak and tune the configuration, make sure you
# understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
#cluster.name: my-application
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
#node.name: node-1
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
#path.data: /path/to/data
#
# Path to log files:
#
#path.logs: /path/to/logs
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# By default Elasticsearch is only accessible on localhost. Set a different
# address here to expose this node on the network:
#
#network.host: 192.168.0.1
#
# By default Elasticsearch listens for HTTP traffic on the first free port it
# finds starting at 9200. Set a specific HTTP port here:
#
#http.port: 9200
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when this node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
#discovery.seed_hosts: ["host1", "host2"]
#
# Bootstrap the cluster using an initial set of master-eligible nodes:
#
#cluster.initial_master_nodes: ["node-1", "node-2"]
#
# For more information, consult the discovery and cluster formation module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Allow wildcard deletion of indices:
#
#action.destructive_requires_name: false
#----------------------- BEGIN SECURITY AUTO CONFIGURATION -----------------------
#
# The following settings, TLS certificates, and keys have been automatically
# generated to configure Elasticsearch security features on 18-06-2024 10:09:19
#
# --------------------------------------------------------------------------------
# Enable security features
xpack.security.enabled: true
xpack.security.enrollment.enabled: true
# Enable encryption for HTTP API client connections, such as Kibana, Logstash, and Agents
xpack.security.http.ssl:
enabled: true
keystore.path: certs/http.p12
# Enable encryption and mutual authentication between cluster nodes
xpack.security.transport.ssl:
enabled: true
verification_mode: certificate
keystore.path: certs/transport.p12
truststore.path: certs/transport.p12
# Create a new cluster with the current node only
# Additional nodes can still join the cluster later
cluster.initial_master_nodes: ["LAPTOP-32SOQ4EE"]
# Allow HTTP API connections from anywhere
# Connections are encrypted and require user authentication
http.host: 0.0.0.0
# Allow other nodes to join the cluster from anywhere
# Connections are encrypted and mutually authenticated
#transport.host: 0.0.0.0
#----------------------- END SECURITY AUTO CONFIGURATION -------------------------
kibana.yml:
# For more configuration options see the configuration guide for Kibana in
# https://www.elastic.co/guide/index.html
# =================== System: Kibana Server ===================
# Kibana is served by a back end server. This setting specifies the port to use.
#server.port: 5601
# Specifies the address to which the Kibana server will bind. IP addresses and host names are both valid values.
# The default is 'localhost', which usually means remote machines will not be able to connect.
# To allow connections from remote users, set this parameter to a non-loopback address.
#server.host: "localhost"
# Enables you to specify a path to mount Kibana at if you are running behind a proxy.
# Use the `server.rewriteBasePath` setting to tell Kibana if it should remove the basePath
# from requests it receives, and to prevent a deprecation warning at startup.
# This setting cannot end in a slash.
#server.basePath: ""
# Specifies whether Kibana should rewrite requests that are prefixed with
# `server.basePath` or require that they are rewritten by your reverse proxy.
# Defaults to `false`.
#server.rewriteBasePath: false
# Specifies the public URL at which Kibana is available for end users. If
# `server.basePath` is configured this URL should end with the same basePath.
#server.publicBaseUrl: ""
# The maximum payload size in bytes for incoming server requests.
#server.maxPayload: 1048576
# The Kibana server's name. This is used for display purposes.
#server.name: "your-hostname"
# =================== System: Kibana Server (Optional) ===================
# Enables SSL and paths to the PEM-format SSL certificate and SSL key files, respectively.
# These settings enable SSL for outgoing requests from the Kibana server to the browser.
#server.ssl.enabled: false
#server.ssl.certificate: /path/to/your/server.crt
#server.ssl.key: /path/to/your/server.key
# =================== System: Elasticsearch ===================
# The URLs of the Elasticsearch instances to use for all your queries.
#elasticsearch.hosts: ["http://localhost:9200"]
# If your Elasticsearch is protected with basic authentication, these settings provide
# the username and password that the Kibana server uses to perform maintenance on the Kibana
# index at startup. Your Kibana users still need to authenticate with Elasticsearch, which
# is proxied through the Kibana server.
#elasticsearch.username: "kibana_system"
#elasticsearch.password: "pass"
# Kibana can also authenticate to Elasticsearch via "service account tokens".
# Service account tokens are Bearer style tokens that replace the traditional username/password based configuration.
# Use this token instead of a username/password.
# elasticsearch.serviceAccountToken: "my_token"
# Time in milliseconds to wait for Elasticsearch to respond to pings. Defaults to the value of
# the elasticsearch.requestTimeout setting.
#elasticsearch.pingTimeout: 1500
# Time in milliseconds to wait for responses from the back end or Elasticsearch. This value
# must be a positive integer.
#elasticsearch.requestTimeout: 30000
# The maximum number of sockets that can be used for communications with elasticsearch.
# Defaults to `Infinity`.
#elasticsearch.maxSockets: 1024
# Specifies whether Kibana should use compression for communications with elasticsearch
# Defaults to `false`.
#elasticsearch.compression: false
# List of Kibana client-side headers to send to Elasticsearch. To send *no* client-side
# headers, set this value to [] (an empty list).
#elasticsearch.requestHeadersWhitelist: [ authorization ]
# Header names and values that are sent to Elasticsearch. Any custom headers cannot be overwritten
# by client-side headers, regardless of the elasticsearch.requestHeadersWhitelist configuration.
#elasticsearch.customHeaders: {}
# Time in milliseconds for Elasticsearch to wait for responses from shards. Set to 0 to disable.
#elasticsearch.shardTimeout: 30000
# =================== System: Elasticsearch (Optional) ===================
# These files are used to verify the identity of Kibana to Elasticsearch and are required when
# xpack.security.http.ssl.client_authentication in Elasticsearch is set to required.
#elasticsearch.ssl.certificate: /path/to/your/client.crt
#elasticsearch.ssl.key: /path/to/your/client.key
# Enables you to specify a path to the PEM file for the certificate
# authority for your Elasticsearch instance.
#elasticsearch.ssl.certificateAuthorities: [ "/path/to/your/CA.pem" ]
# To disregard the validity of SSL certificates, change this setting's value to 'none'.
#elasticsearch.ssl.verificationMode: full
# =================== System: Logging ===================
# Set the value of this setting to off to suppress all logging output, or to debug to log everything. Defaults to 'info'
#logging.root.level: debug
# Enables you to specify a file where Kibana stores log output.
#logging.appenders.default:
# type: file
# fileName: /var/logs/kibana.log
# layout:
# type: json
# Example with size based log rotation
#logging.appenders.default:
# type: rolling-file
# fileName: /var/logs/kibana.log
# policy:
# type: size-limit
# size: 256mb
# strategy:
# type: numeric
# max: 10
# layout:
# type: json
# Logs queries sent to Elasticsearch.
#logging.loggers:
# - name: elasticsearch.query
# level: debug
# Logs http responses.
#logging.loggers:
# - name: http.server.response
# level: debug
# Logs system usage information.
#logging.loggers:
# - name: metrics.ops
# level: debug
# Enables debug logging on the browser (dev console)
#logging.browser.root:
# level: debug
# =================== System: Other ===================
# The path where Kibana stores persistent data not saved in Elasticsearch. Defaults to data
#path.data: data
# Specifies the path where Kibana creates the process ID file.
#pid.file: /run/kibana/kibana.pid
# Set the interval in milliseconds to sample system and process performance
# metrics. Minimum is 100ms. Defaults to 5000ms.
#ops.interval: 5000
# Specifies locale to be used for all localizable strings, dates and number formats.
# Supported languages are the following: English (default) "en", Chinese "zh-CN", Japanese "ja-JP", French "fr-FR".
#i18n.locale: "en"
# =================== Frequently used (Optional)===================
# =================== Saved Objects: Migrations ===================
# Saved object migrations run at startup. If you run into migration-related issues, you might need to adjust these settings.
# The number of documents migrated at a time.
# If Kibana can't start up or upgrade due to an Elasticsearch `circuit_breaking_exception`,
# use a smaller batchSize value to reduce the memory pressure. Defaults to 1000 objects per batch.
#migrations.batchSize: 1000
# The maximum payload size for indexing batches of upgraded saved objects.
# To avoid migrations failing due to a 413 Request Entity Too Large response from Elasticsearch.
# This value should be lower than or equal to your Elasticsearch cluster’s `http.max_content_length`
# configuration option. Default: 100mb
#migrations.maxBatchSizeBytes: 100mb
# The number of times to retry temporary migration failures. Increase the setting
# if migrations fail frequently with a message such as `Unable to complete the [...] step after
# 15 attempts, terminating`. Defaults to 15
#migrations.retryAttempts: 15
# =================== Search Autocomplete ===================
# Time in milliseconds to wait for autocomplete suggestions from Elasticsearch.
# This value must be a whole number greater than zero. Defaults to 1000ms
#unifiedSearch.autocomplete.valueSuggestions.timeout: 1000
# Maximum number of documents loaded by each shard to generate autocomplete suggestions.
# This value must be a whole number greater than zero. Defaults to 100_000
#unifiedSearch.autocomplete.valueSuggestions.terminateAfter: 100000
# This section was automatically generated during setup.
elasticsearch.hosts: ['https://192.168.10.28:9200']
elasticsearch.serviceAccountToken: ***************
elasticsearch.ssl.certificateAuthorities: ['C:\Users\amazi\Downloads\kibana-8.14.0\data\ca_1718706315734.crt']
xpack.fleet.outputs: [{id: fleet-default-output, name: default, is_default: true, is_default_monitoring: true, type: elasticsearch, hosts: ['https://192.168.10.28:9200'], ca_trusted_fingerprint: ***************}]
Thanks in advance.
I have an ES instance that I push logs into, and then I use ES to search those logs. It's not ideal, and there are plans to change it, but that's the current state. Sorry for the long description, but bear with me; the question itself is simple.
The current search flow is as follows:
- I have an index containing N log lines
- The user enters a phrase to search for
- I build the ES query with:
  - the phrase as the query
  - size=1 (so I only fetch one line)
  - track_total_hits=true
  - from=0
  - sort=<something>
This gives me the first occurrence of a line containing the given query (since the results are sorted, e.g. by timestamp). I also get the total hit count, so I can show the user:
- the line found
- the occurrence number (always 1 on the initial search)
- the total hit count
So the user knows this is occurrence 1/300, and the UI can offer to jump to the next one. The search is the same, except that when the user wants the next occurrence I simply pass from=1, from=2, and so on. And the performance of this is quite good, because I only ever download a single line from ES.
Great. But all of this happens on a website that shows the user the logs. What I'd like to do is: when the user performs the initial search (before stepping to the next/previous occurrence), show them the first matching line after the "cursor position".
For example, the user sees:
58 foo
59 bar
60 baz
[...]
So I want to scroll the view down to the first matching line after line 58, not before it.
The thing is, I still want to display 1/<something> matches found. In this case the initial search might return, for example, the fifth occurrence, i.e. 5/300. The user can still go to the previous/next one.
So one solution would be to download all matching lines (with no from= and size= in the query), then loop over them, find the first line whose number is greater than the line the user sees (i.e. 58), and return it. While doing that, I could also work out which occurrence it is, so I'd know to show 5/300 in the UI.
The problem is: I'd have to download every matching line from ES to do this. If the index has millions of lines, that could be a huge performance hit. So what I'd like to know is: is there a way to tell Elastic to:
- fetch all lines matching the phrase
- apply an extra filter on top (line number > something)
- return that one line, along with "which occurrence it is" among all matching lines (without the line-number filter)
So for lines like:
54 content
55 content
56 content
57 content
58 foo
59 bar
60 baz
61 content
[...]
with the phrase content and a search "starting from line 58", I would get a response like:
{
"line": {"line_number": 61, "content": "content"},
"total_hits": 300,
"occurrence": 5
}
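The closest I've come up with so far is two requests instead of one (a sketch, assuming the line number is indexed as a numeric line_number field and the text as content): the first fetches the match right after the cursor, the second counts the matches at or before it, so occurrence = count + 1 and the overall total is that count plus the first request's total hits. I'd still prefer a single round trip:

GET /logs/_search
{
  "size": 1,
  "track_total_hits": true,
  "sort": [{ "line_number": "asc" }],
  "query": {
    "bool": {
      "must": [{ "match_phrase": { "content": "content" } }],
      "filter": [{ "range": { "line_number": { "gt": 58 } } }]
    }
  }
}

GET /logs/_count
{
  "query": {
    "bool": {
      "must": [{ "match_phrase": { "content": "content" } }],
      "filter": [{ "range": { "line_number": { "lte": 58 } } }]
    }
  }
}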
I'm using ELK (elasticsearch-8.12.0-1.x86_64) to store Kong API gateway logs. I use an ILM (index lifecycle management) policy to manage index retention, and I reference it in my Logstash pipeline configuration file.
I noticed that newly created indices follow the naming convention below, even though they were created on different dates:
kong-2022-11-17-000001
kong-2022-11-17-000002
kong-2022-11-17-000003
kong-2022-11-17-000004
kong-2022-11-17-000005
kong-2022-11-17-000006
How can I change the naming convention to include the creation date, like this:
kong-2022-11-17-000001
kong-2022-11-17-000002
kong-2022-11-17-000003
kong-2022-12-25-000001
kong-2023-01-01-000001
/etc/logstash/kong.conf:
elasticsearch {
hosts => ["https://elastic01:elastic_port" , "https://elastic02:elastic_port" , "https://elastic03:elastic_port"]
user => "elastic_user"
password => elastic_user_password
ssl => true
ssl_certificate_verification => false
cacert => "/etc/logstash/http_ca.crt"
ilm_rollover_alias => "kong"
ilm_pattern => "{now/d}-000001"
ilm_policy => "kong-index-policy"
}
kong-index-template:
{
"index": {
"lifecycle": {
"name": "kong-index-policy",
"rollover_alias": "kong"
},
"mapping": {
"total_fields": {
"limit": "10000"
}
},
"refresh_interval": "5s"
}
}
kong-index-policy:
{
"policy": "kong-index-policy",
"phase_definition": {
"min_age": "0ms",
"actions": {
"rollover": {
"max_age": "180d",
"max_primary_shard_size": "10gb"
},
"set_priority": {
"priority": 100
}
}
},
I tried configuring the ILM policy to manage index rollover and create new indices with the creation date, but it doesn't work as expected.
Update01: I tried the following command:
PUT %3Ckong-%7Bnow%2Fd%7D-000001%3E
{
  "aliases": {
    "kong": {
      "is_write_index": true
    }
  }
}
but I got the following error:
```
{
"error": {
"root_cause": [
{
"type": "illegal_state_exception",
"reason": "alias [kong] has more than one write index [kong-2024.06.05-000001,kong-2022-11-24-000009]"
}
],
"type": "illegal_state_exception",
"reason": "alias [kong] has more than one write index [kong-2024.06.05-000001,kong-2022-11-24-000009]"
},
"status": 500
}
```
To resolve that error, I switched the kong-2022-11-24-000009 index using the following command, and then continued with the provided solution:
```
POST /_aliases
{
"actions": [
{
"add": {
"index": "kong-2022-11-24-000009",
"alias": "kong",
"is_write_index": false
}
}]
}
```