Use percentiles_arrayif to calculate approximate percentile values for a numeric expression when a certain condition evaluates to true. This function is useful when you want an array of percentiles instead of a single percentile. You can use it to understand data distributions in scenarios such as request durations, event processing times, or security alert severities, while filtering on specific criteria.

For users of other query languages

If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.
In Splunk SPL, you often use statistical functions such as perc<percent> or percN() to compute percentile estimates. In APL, you use percentiles_arrayif and provide a predicate to define which records to include in the computation.

index=main sourcetype=access_combined
| stats perc90(req_duration_ms) AS p90, perc99(req_duration_ms) AS p99
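A possible APL counterpart to the SPL query is sketched below. It assumes the sample-http-logs dataset used elsewhere on this page, and it adds a status == '200' predicate purely to illustrate the condition argument. The SPL query has no such filter, so the two are not strictly equivalent.

```kusto
// Sketch only: dataset name and predicate are assumptions for illustration.
// Returns one array holding the 90th and 99th percentiles, in that order.
['sample-http-logs']
| summarize percentiles_arrayif(req_duration_ms, dynamic([90, 99]), status == '200')
```

Where SPL produces one named column per percentile, this form returns a single array column.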
In ANSI SQL, you often use ordered-set aggregate functions like PERCENTILE_DISC or PERCENTILE_CONT, or write multiple CASE expressions for conditional aggregation. In APL, you can achieve similar functionality with percentiles_arrayif by passing the numeric field, the list of percentiles, and the condition to the function.

SELECT
  PERCENTILE_DISC(0.90) WITHIN GROUP (ORDER BY req_duration_ms) AS p90,
  PERCENTILE_DISC(0.99) WITHIN GROUP (ORDER BY req_duration_ms) AS p99
FROM sample_http_logs
WHERE status = '200';
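A rough APL counterpart to the SQL query might look like the following. The WHERE clause moves into the third argument as the predicate, and the percentiles are given on a 0 to 100 scale rather than 0 to 1. The dataset name and the output column name pcts are assumptions made for this sketch.

```kusto
// Sketch only: 'pcts' is a hypothetical column name chosen for illustration.
// pcts holds the 90th and 99th percentiles for records where status is '200'.
['sample-http-logs']
| summarize pcts = percentiles_arrayif(req_duration_ms, dynamic([90, 99]), status == '200')
```

Because percentiles_arrayif returns approximate values, the results may differ slightly from the exact values that PERCENTILE_DISC returns.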

Usage

Syntax

percentiles_arrayif(Field, Array, Condition)

Parameters

  • Field is the name of the field for which you want to compute percentile values.
  • Array is a dynamic array of one or more numeric percentile values (between 0 and 100).
  • Condition is a Boolean expression that indicates which records to include in the calculation.

Returns

The function returns an array of percentile values for the records that satisfy the condition. The position of each returned percentile in the array matches the order in which it appears in the function call.
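To illustrate the positional ordering, the sketch below names the returned array and then reads individual percentiles from it by index. It assumes the sample-http-logs dataset, assumes that dynamic arrays support zero-based indexing, and uses the hypothetical column names pcts, p50, and p99.

```kusto
// Sketch only: column names and zero-based indexing are assumptions.
// pcts[0] is the 50th percentile and pcts[1] the 99th, matching the
// order of the values in the dynamic array passed to the function.
['sample-http-logs']
| summarize pcts = percentiles_arrayif(req_duration_ms, dynamic([50, 99]), status == '200')
| extend p50 = pcts[0], p99 = pcts[1]
```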

Use case examples

You can use percentiles_arrayif to analyze request durations in HTTP logs while filtering for specific criteria, such as certain HTTP statuses or geographic locations.

Query

['sample-http-logs']
| summarize percentiles_arrayif(req_duration_ms, dynamic([50, 90, 95, 99]), status == '200') by bin_auto(_time)

Output

percentiles_req_duration_ms
0.7352 ms
1.691 ms
1.981 ms
2.612 ms

This query keeps only records with a status of 200 and returns the 50th, 90th, 95th, and 99th percentiles of request duration for each time bin.

List of related functions

  • avg: Returns the average of a numeric column.
  • percentile: Returns a single percentile value.
  • percentile_if: Returns a single percentile value for the records that satisfy a condition.
  • percentiles_array: Returns an array of percentile values for all rows.
  • sum: Returns the sum of a numeric column.