Alert on Kibana Saved Searches using Elasticsearch Watcher

July 29, 2019
An Elasticsearch Watcher is typically paired with a Kibana Saved Search to identify the events which caused it to fire. This post demonstrates how a Watcher can use the Apache Lucene style query from a Saved Search, ensuring the Watcher remains aligned as the query evolves.

Introduction

Elasticsearch Watcher provides a flexible framework to monitor for events. A Watcher can be configured to perform periodic searches of Elasticsearch indexed data and take action when events are found, for example sending an email or indexing new documents.

Typically it is the output of a Saved Search that drives the development of a new Watcher aimed at automated event monitoring. For example, a monitoring dashboard may be created that embeds a Saved Search which searches for errors and failures in syslog events. A Watcher might then be created to perform the same search periodically to alert when events are returned from the search.

As monitoring requirements evolve over time, the search criteria in both the Saved Search and the Watcher must be kept aligned so that both report and alert on the same events. Ideally the Watcher should use the Saved Search as its input, meaning only the Saved Search need be tuned and maintained over time.

Luckily, there is a way to do this!

The Watcher and Elasticsearch documentation, whilst comprehensive, does not make it clear how this can be achieved. Once a suitable approach had been identified, it also became apparent that the Saved Search structure cannot be used as is.

This post first details the main problem when attempting to use Saved Searches in this way. Second, it walks through how the problem may be worked around. Finally, it presents a concrete example of the method in action.

NOTE: This post concentrates on searches built upon the Apache Lucene style query that can be entered in the search bar of the Kibana user interface, the section highlighted in the following image:

[Image: the Apache Lucene style query input highlighted in the Kibana search bar]
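
For example, a Saved Search which looks for errors and failures in syslog events might use a query similar to the following in the search bar (an illustrative query only; any Lucene style query works the same way):

message:error OR message:failure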

The Painless JSON Problem!

Watchers can be configured with multiple inputs. Two queries can be performed against Elasticsearch, for example, or a query and two HTTP requests can be performed. The inputs are executed in order, meaning that a chained input can be used to fetch a Saved Search and then perform a query against Elasticsearch using its configuration, as sketched below. While this sounds easy, there is one problem that prevents it from being so.
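
As a rough sketch of the structure (the input names and the host field referenced by the second input are illustrative, not the ones used later in this post), a chained input lists its inputs in an array; each is executed in order, and later inputs can reference the results of earlier ones via ctx.payload.<input_name>:

"input": {
    "chain": {
        "inputs": [
            {
                "findErrors": {
                    "search": {
                        "request": {
                            "indices": [
                                "cc-syslog-*"
                            ],
                            "body": {
                                "query": {
                                    "term": {
                                        "message": "error"
                                    }
                                }
                            }
                        }
                    }
                }
            },
            {
                "findRelated": {
                    "search": {
                        "request": {
                            "indices": [
                                "cc-syslog-*"
                            ],
                            "body": {
                                "query": {
                                    "term": {
                                        "host.keyword": "{{ctx.payload.findErrors.hits.hits.0._source.host}}"
                                    }
                                }
                            }
                        }
                    }
                }
            }
        ]
    }
}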

To demonstrate the problem, a Saved Search is loaded in the browser. In the browser address bar the ID of the Saved Search forms part of the page URL:

[Image: the Saved Search ID highlighted in the browser address bar]

Using the highlighted ID and the Kibana Dev Tools, the Saved Search may be obtained:

[Image: the Saved Search document retrieved in Kibana Dev Tools, with the searchSourceJSON query highlighted]
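
Equivalently, a request along the following lines can be entered in Dev Tools, querying the .kibana index in which Saved Searches are stored (the ID shown is the one used in the concrete example later in this post, prefixed with search:; substitute your own):

GET .kibana/_search
{
    "query": {
        "term": {
            "_id": "search:0dff3240-af9f-11e9-b459-b1376c731de5"
        }
    }
}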

The attribute which contains the Saved Search query is as follows - this is highlighted in the above image:

search.kibanaSavedObjectMeta.searchSourceJSON.query.query

Looking closer, the value of searchSourceJSON is itself a JSON string, meaning the final query.query attribute is buried within it. This cannot be used as is; the JSON string must be parsed before the query can be extracted.
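
To illustrate the problem, the relevant part of the Saved Search _source looks something like the following (an illustrative sketch only; the exact contents depend on the Kibana version and the query that was saved):

"search": {
    "kibanaSavedObjectMeta": {
        "searchSourceJSON": "{\"query\":{\"query\":\"message:error OR message:failure\",\"language\":\"lucene\"},\"filter\":[]}"
    }
}

The query.query attribute needed by the Watcher is buried inside that single escaped string.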

Typically, the Painless scripting language would be used to parse this, just as JSON would be parsed in any other programming language. Unfortunately, Painless has no support for working with JSON objects, so another way must be found.

Use Mustache to Parse JSON :{)

If Elasticsearch was queried directly a request might look like the following:

GET /cc-syslog-*/_search
{
    "query": {
        "term": {
            "message": "error"
        }
    }
}

Elasticsearch supports Mustache based search templates. A Mustache template can be given in a query which, when rendered, produces the actual query that is then executed. The given Mustache template is rendered and then parsed as a JSON string, as if it had been received in an HTTP request like the one above.

The following example executes the same query as above, this time using a search template:

GET /cc-syslog-*/_search/template
{
    "source": """{"query":{"term":{"message":"{{value_to_match}}"}}}""",
    "params": {
        "value_to_match": "error"
    }
}

Both queries are effectively the same, and both produce the same search results.

Elasticsearch has enhanced its Mustache implementation with the toJson tag. This converts the object identified between its start and end tags into a JSON string. While the item needing to be parsed in the Saved Search is already a string, a string is itself valid JSON, and when encoded using this tag it will simply be written as is.

Therefore, using a search template which employs the toJson tag, a query can be rendered using the JSON string from the Saved Search.
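
Outside of a Watcher, the _render/template API offers a quick way to experiment with this. In the following sketch the saved_query parameter is a hypothetical JSON string standing in for searchSourceJSON; assuming the write-through behaviour described above, the template_output in the response should contain the fully parsed query:

POST _render/template
{
    "source": """{"query":{{#toJson}}saved_query{{/toJson}}}""",
    "params": {
        "saved_query": """{"term":{"message":"error"}}"""
    }
}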

Typically, a rendered query will be executed and not echoed back to the requestor, i.e. access to the rendered query, and thus the parsed JSON string from the Saved Search, is not available. A clever little trick must be used to force Elasticsearch to echo the original JSON string back to the Watcher so that it can be used.

Elasticsearch supports a meta attribute which can be passed to aggregations in queries. This can contain arbitrary keys and values to provide with a query, and in the response to the query the meta attribute is echoed back to the requestor. A simple aggregation can therefore be added so that the meta attribute is populated with the JSON string from the Saved Search and accessed in the response. This query would normally be condensed into a single line surrounded by double-quotes as a single string; it has been reformatted here for clarity:

{
    "query": {
        "term": {
            "_id": "{{ctx.payload.findSavedSearch.hits.hits.0._id}}"
        }
    },
    "aggs": {
        "id": {
            "terms": {
                "field": "_id"
            },
            "meta": {
                "searchSourceJSON": {{#toJson}}ctx.payload.findSavedSearch.hits.hits.0._source.search.kibanaSavedObjectMeta.searchSourceJSON{{/toJson}}
            }
        }
    }
}

The response to this query will contain the parsed query string under the attribute aggregations.id.meta.searchSourceJSON.query.query.
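
Trimmed down to the relevant section, the response looks something like the following (an illustrative sketch; the hits and the remaining fields of the id aggregation are omitted):

{
    "aggregations": {
        "id": {
            "meta": {
                "searchSourceJSON": {
                    "query": {
                        "query": "message:error OR message:failure",
                        "language": "lucene"
                    },
                    "filter": []
                }
            }
        }
    }
}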

Concrete Example - Syslog Errors

Using a search template, the toJson Mustache template tag, the search meta attribute, and the original Saved Search, a Watcher can be implemented which will use the Apache Lucene style query from a Saved Search.

A chained input is used with the first input pulling in the Saved Search as follows - here the .kibana index is queried, as this is where Saved Searches are stored:

{
    "findSavedSearch": {
        "search": {
            "request": {
                "indices": [
                    ".kibana"
                ],
                "body": {
                    "query": {
                        "term": {
                            "_id": "search:0dff3240-af9f-11e9-b459-b1376c731de5"
                        }
                    }
                }
            }
        }
    }
}

Next, a search template is used to force the JSON string in the Saved Search results to be parsed. The template's toJson tag and the meta attribute are used in the query to parse the JSON and echo it back. Note that the .kibana index is queried since the rendered query must be a real, well-formed query; here it simply searches for the Saved Search again, i.e. it is the same query as the previous input:

{
    "parseSavedSearch": {
        "search": {
            "request": {
                "indices": [
                    ".kibana"
                ],
                "template": {
                    "source": """{"query":{"term":{"_id":"{{ctx.payload.findSavedSearch.hits.hits.0._id}}"}},"aggs":{"id":{"terms":{"field": "_id"},"meta":{"searchSourceJSON": {{#toJson}}ctx.payload.findSavedSearch.hits.hits.0._source.search.kibanaSavedObjectMeta.searchSourceJSON{{/toJson}} }}}}""",
                    "lang": "mustache"
                }
            }
        }
    }
}

Finally, using the parsed JSON string echoed back by the previous input, the real query can be performed. Note that the index containing the syslog events is used here, and that the additional (NOT watcherState.keyword:processed) term is passed in alongside the Saved Search query; in short, this prevents alerting on events which have already been alerted on. A size limit of 10,000 events and a filter restricting the search to the past 1 hour are also used; these are covered a little further on:

{
    "savedSearchQuery": {
        "search": {
            "request": {
                "indices": [
                    "cc-syslog-*"
                ],
                "body": {
                    "size": 10000,
                    "query": {
                        "bool": {
                            "must": {
                                "query_string": {
                                    "query": "(NOT watcherState.keyword:processed) AND ({{ctx.payload.parseSavedSearch.aggregations.id.meta.searchSourceJSON.query.query}})"
                                }
                            },
                            "filter": {
                                "range": {
                                    "@timestamp": {
                                        "gte": "now-1h"
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}

A simple script condition is used to test for at least 1 matched event:

"script": {
    "source": "return ctx.payload.savedSearchQuery.hits.hits.size() > 0",
    "lang": "painless"
}

Two actions are then employed. The first is to provide email notification that new events have been found and, as an example, simply report the count:

"send_email": {
    "email": {
        "profile": "standard",
        "to": [
            "stephen.vickers@nospaceships.com"
        ],
        "subject": "[saved-search-watcher] system-log-errors",
        "body": {
            "text": "{{ctx.payload.savedSearchQuery.hits.hits.size}} events have been discovered."
            }
        }
    }
}

The second action is slightly more complicated. It re-indexes the events from the search, adding a watcherState attribute to effectively mark the events as processed. This new attribute is then used to prevent the events from being matched in the next run of the Watcher, by including the search term (NOT watcherState.keyword:processed) along with the Saved Search query. Events could otherwise be matched again because the query window is 1 hour while the Watcher runs every 10 minutes; this overlap ensures coverage of all events, and also ensures all events are eventually processed when more than 10,000 would otherwise be returned in a single run:

"save_results": {
    "transform": {
        "script": {
            "source": """
def docs = [];

for(hit in ctx.payload.savedSearchQuery.hits.hits) {
    def update = hit['_source'];

    update['_id'] = hit['_id'];
    update['_index'] = hit['_index'];
    update['watcherState'] = 'processed';

    docs.add(update)
}

return ['_doc': docs] """,
            "lang": "painless"
        }
    },
    "index": {
        "doc_type": "doc"
    }
}

Note that in the above examples Kibana Dev Tools was used to add the Watcher, and that the triple double-quote sequence """ is used to author Painless scripts across multiple lines so they are easier to read.
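
For reference, when adding the Watcher via Dev Tools the pieces described above fit into the standard Watcher structure along the following lines. This is a sketch only: the Watcher name is illustrative, the elided sections are the chained inputs, condition and actions shown earlier, and the 10 minute trigger interval matches the schedule described above:

PUT _watcher/watch/saved-search-watcher
{
    "trigger": {
        "schedule": {
            "interval": "10m"
        }
    },
    "input": {
        "chain": {
            "inputs": [
                { "findSavedSearch": { ... } },
                { "parseSavedSearch": { ... } },
                { "savedSearchQuery": { ... } }
            ]
        }
    },
    "condition": { ... },
    "actions": {
        "send_email": { ... },
        "save_results": { ... }
    }
}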

The complete and finished watcher can be found in the NoSpaceships saved-search-watcher repository on GitHub. Follow the instructions given there on how to use the example Watcher.

Summary

This post demonstrates how a Kibana Saved Search can be used by an Elasticsearch Watcher to ensure ongoing consistency between alerts and dashboards, and to ease administration. Using a search template, the toJson Mustache template tag, the search meta attribute, and a Saved Search, a Watcher can be implemented that does not require updating when its corresponding Saved Search is enhanced.

The NoSpaceships saved-search-watcher GitHub repository provides a concrete example of this method.

If you have any questions, queries or feedback regarding this post, please contact us.