Describing the Policy Conditions
Now that we know how to retrieve the data, the next step is to write the conditions that the data must satisfy. This is done in the policy definition. There can be only one such definition in a given policy template.
A policy definition consists of a list of validations. Each validation may in turn describe multiple checks. A validation also defines one or more Escalations that trigger when a check fails and Resolutions that trigger when an underlying issue is fixed. For further details, see Triggering Actions. Finally, a validation provides default summary and detail text templates (see Incident Message and Email Templates) used to render the incident message.
Each validate or validate_each is run independently and will generate either 0 or 1 incidents.
The syntax for a policy definition is:
policy <string literal> do
  validate[_each] $<datasource>|@<resources> do
    summary_template <string literal>
    detail_template <string literal>
    hash_include <field_name>, <field_name>, ...
    hash_exclude <field_name>, <field_name>, ...
    escalate $<escalation>
    escalate $<escalation>
    resolve $<resolution>
    resolve $<resolution>
    ...
    check <term>
    check <term>
    ...
    export <path expression> do
      resource_level <boolean>
      field <name> do
        label <string literal>
        format <string literal>
        path <path expression>
      end
      field ...
    end
  end
  validate...
end
Where:
- Each validation starts with validate or validate_each.
  - validate applies the checks on the given datasource or resources as a whole.
  - validate_each iterates over the given datasource or resources (which must be an array; any other value is wrapped into a single-element array) and applies the checks on each element.
- summary_template provides a text template that gets applied to the escalation data to render the incident message summary.
- detail_template provides a text template that gets applied to the escalation data to render the incident message details. It is displayed above the export table, if one is specified.
- hash_include is an array of fields in the escalation data to check when determining whether the data has changed and thus whether actions should be re-run. By default, all fields are checked, so if any value changes at all, all actions are run again. This includes emails and cloud workflows. In general, this field does not have to be specified. The main exception is when you have a value, such as a timestamp, that changes constantly.
- hash_exclude is an array of fields in the escalation data to exclude when determining whether the data has changed. This field is mutually exclusive with hash_include.
- escalate indicates Escalations to trigger when a check fails.
- resolve indicates Resolutions to trigger when all existing violations are resolved.
- check identifies a term that must return anything but false, 0, an empty string, an empty array or an empty object. If the term returns one of these values then the check fails, an incident is created and any associated escalation triggers.
- export controls whether or not a table of resources is exported for the incident.
  - path expression is a string literal corresponding to a jmes_path expression acting upon the violation data. The JMESPath expression can be used to extract a table of resources if the resources exist as a subpath in data. This field is optional.
  - resource_level is a boolean stating whether the data being exported is resource-level data. If the data is resource level, available actions can be run on a selected group of resources or on all of them.
  - field specifies a field in the data, such as id. Each field corresponds to a column in the data table. Field values should be simple types such as integers, strings, booleans, or arrays of simple values.
    - name is the object field key/name in the violation data row.
    - label is a human-readable label associated with the name and shows up as the header for the column. If omitted, name is used.
    - format controls formatting for the column. Currently the left, center, and right keywords are supported. By default, columns are left formatted.
    - path is a string literal corresponding to a jmes_path expression acting upon each resource. The JMESPath expression can be used to extract a field from an embedded data structure or to rename a field. By default, name is used.
The policy engine runs each check in order and stops when a check fails. A check fails if the corresponding term returns false, 0, an empty string, an empty array or an empty object. In the case of validate_each the policy engine applies that algorithm for each element of the datasource or resources.
Each time a check fails the corresponding data is added to the violation data. In the case of validate this can only happen once and thus the escalation data ends up being the validated datasource or resources. In the case of validate_each this means that only the elements that fail a check are added to the escalation data.
The violation data is exported as a table in the incident view page.
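The evaluation rules above can be modelled in a few lines of Python. This is only an illustrative sketch of the described behavior, not the policy engine's implementation: the names check_passes, first_failure, validate and validate_each (as Python functions) are hypothetical, and each check is represented as a plain Python callable.

```python
# Illustrative model only; these helpers are hypothetical and are not
# part of the policy engine or the policy template language.

def check_passes(value):
    """A check fails on false, 0, an empty string, an empty array or an empty object."""
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        return value != 0
    if isinstance(value, (str, list, dict)):
        return len(value) > 0
    return True


def first_failure(element, checks):
    """Run the checks in order and stop at the first one that fails."""
    for index, check in enumerate(checks):
        if not check_passes(check(element)):
            return index
    return None


def validate(data, checks):
    """Check the data as a whole; on failure the violation data is the whole data."""
    return data if first_failure(data, checks) is not None else None


def validate_each(data, checks):
    """Check each element; only the failing elements become violation data."""
    items = data if isinstance(data, list) else [data]
    violations = [item for item in items if first_failure(item, checks) is not None]
    return violations or None  # None stands for "no incident"


reservations = [
    {"region": "us-west-1", "time_left": 10200},
    {"region": "us-east-1", "time_left": 900000},
]
three_days = 3 * 24 * 3600
checks = [lambda r: r["time_left"] > three_days]
print(validate_each(reservations, checks))
```

In this model only the first reservation fails the check, so it is the only element in the violation data, and the validation yields a single incident.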
Example:
Assume that the $reservations datasource has data like:
[
  {
    "account": {
      "id": 1,
      "name": "my account" 
    },
    "region": "us-west-1",
    "instance_type": "m1.small",
    "instance_count": 10,
    "end_time": "2020-01-01 01:02:03",
    "time_left": 10200 
  },
  ... 
]
policy "ri_expiration" do 
  validate_each $reservations do 
    summary_template "Reserved instances are nearing expiration." 
    detail_template <<-EOS 
Found {{ len data }} expired reservations in account id {{ rs_project_id }} 
EOS 
    export do 
      field "account_name" do 
        label "Account Name" 
        path "account.name" 
      end 
      field "account_id" do 
        label "Account ID" 
        path "account.id" 
      end 
      field "region" do 
        label "Region" 
      end 
      field "instance_type" do 
        label "Instance Type" 
      end 
      field "instance_count" do 
        label "Instance Count" 
      end 
      field "end_time" do 
        label "End Time" 
      end 
      field "time_left" do 
        label "Time Left In Seconds" 
        format "right" 
      end 
    end 
    hash_include "id", "end_time" 
    escalate $alert 
    check gt(dec(to_d(val(item, "end_time")), now), 3*24*3600)
  end 
end 
In the example above the policy defines a single validation with a single check. The check returns a boolean value which is false when the duration between a reserved instance's expiration date and now is less than 3 days. In this case the alert escalation triggers. The violation data consists of an array that contains all the reservations that are expiring in less than 3 days.
The export block defines a table that is displayed in the email as well as on the incident show page in the dashboard.
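To make the role of name, label and path concrete, the following Python sketch builds the header row and data rows of such a table from violation data. It is a simplified model under stated assumptions: export_table and lookup are hypothetical helpers, and lookup only understands dotted keys such as account.name, which is a small subset of what a real JMESPath expression can do.

```python
# Illustrative model only; export_table and lookup are hypothetical helpers.

def lookup(row, path):
    """Resolve a dotted path such as "account.name" (a stand-in for JMESPath)."""
    value = row
    for key in path.split("."):
        value = value[key]
    return value


def export_table(violations, fields):
    """Return (headers, rows); label and path both default to the field name."""
    headers = [field.get("label", field["name"]) for field in fields]
    rows = [
        [lookup(violation, field.get("path", field["name"])) for field in fields]
        for violation in violations
    ]
    return headers, rows


violations = [
    {
        "account": {"id": 1, "name": "my account"},
        "region": "us-west-1",
        "time_left": 10200,
    }
]
fields = [
    {"name": "account_name", "label": "Account Name", "path": "account.name"},
    {"name": "region", "label": "Region"},
    {"name": "time_left"},
]
headers, rows = export_table(violations, fields)
print(headers)
print(rows)
```

The first field shows path being used to pull a value out of a nested structure under a new column name, while the last field falls back to its name for both the header and the lookup.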
Triggering Actions
Actions are run any time the underlying violation data changes. By default, all fields are used in determining whether the data has changed. In the case above, the time_left field changes continually and would cause actions such as email to retrigger. hash_include and hash_exclude can be used to modify this behavior by excluding certain fields from this calculation. By supplying id and end_time to the hash_include method, we ensure that we only get new alerts when one of these two values changes. We could have achieved the same result with hash_exclude "time_left", since all other fields in the datasource should be relatively stable.
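The effect of the two options can be illustrated with a small Python model of change detection. The source does not describe how the engine actually computes its hash, so the SHA-256 digest over sorted JSON used here is purely an assumption, and data_hash is a hypothetical helper.

```python
import hashlib
import json

# Illustrative model only; the real hashing scheme is not documented here,
# and SHA-256 over JSON is an assumption made for this sketch.

def data_hash(rows, include=None, exclude=None):
    """Hash the violation data using only the selected top-level fields."""
    if include is not None and exclude is not None:
        raise ValueError("hash_include and hash_exclude are mutually exclusive")

    def keep(key):
        if include is not None:
            return key in include
        if exclude is not None:
            return key not in exclude
        return True  # default: every field takes part in the hash

    selected = [{k: v for k, v in row.items() if keep(k)} for row in rows]
    payload = json.dumps(selected, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


before = [{"region": "us-west-1", "end_time": "2020-01-01 01:02:03", "time_left": 10200}]
after = [{"region": "us-west-1", "end_time": "2020-01-01 01:02:03", "time_left": 10100}]

# Default: the changing time_left alters the hash, so actions would re-run.
print(data_hash(before) != data_hash(after))
# With either option the hash is stable while only time_left changes.
print(data_hash(before, include=["id", "end_time"]) == data_hash(after, include=["id", "end_time"]))
print(data_hash(before, exclude=["time_left"]) == data_hash(after, exclude=["time_left"]))
```

In this model, a change to end_time would still alter the hash under both options, which matches the intent of only re-running actions when a meaningful value changes.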
Only top-level fields will be considered for hash_include and hash_exclude. If you have a nested structure such as:
[
  {
    "account": {
      "id": 1,
      "name": "my account" 
    },
    "region": "us-west-1",
    "instance_type": "m1.small",
    "instance_count": 10,
    "end_time": "2020-01-01 01:02:03",
    "time_left": 10200 
  },
  ... 
]
Then you may specify hash_include "account" to detect any changes in account, but not changes to individual fields within account itself. If you wish to hash only a specific field within account, such as account.name, then use a JavaScript-based script block to transform the nested fields into top-level fields first.
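The shape of that transformation is sketched below using the account structure shown above. In a real policy the flattening would be written in a JavaScript script block as described; Python is used here only to model the idea. The helpers top_level_hash and flatten_account, the new field names account_id and account_name, and the SHA-256 hashing are all assumptions made for illustration.

```python
import hashlib
import json

# Illustrative model only; helper names, new field names and the hashing
# scheme are assumptions, not part of the policy engine.

def top_level_hash(rows, include):
    """Hash only the named top-level fields of each row."""
    selected = [{key: row[key] for key in include if key in row} for row in rows]
    payload = json.dumps(selected, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def flatten_account(rows):
    """Copy the nested account fields up to top-level fields."""
    return [
        {**row, "account_id": row["account"]["id"], "account_name": row["account"]["name"]}
        for row in rows
    ]


before = [{"account": {"id": 1, "name": "my account"}, "region": "us-west-1"}]
after = [{"account": {"id": 2, "name": "my account"}, "region": "us-west-1"}]

# Hashing the nested object as a whole reacts to any change inside it.
print(top_level_hash(before, ["account"]) != top_level_hash(after, ["account"]))
# After flattening, a single nested value can be selected on its own.
print(
    top_level_hash(flatten_account(before), ["account_name"])
    == top_level_hash(flatten_account(after), ["account_name"])
)
```

Here the account id changes while the name stays the same, so the hash over the whole account object changes but the hash over the flattened account_name field does not.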