This page explains how to use the percentile aggregation function in APL.
The `percentile` aggregation function in Axiom Processing Language (APL) calculates the value below which a given percentage of data points fall. It is particularly useful when you need to analyze distributions and summarize the data using specific thresholds, such as the 90th or 95th percentile. This makes it valuable for performance analysis, trend detection, and identifying outliers across large datasets.
You can apply the `percentile` function to various use cases, such as analyzing log data for request durations, OpenTelemetry traces for service latencies, or security logs to assess risk patterns.
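To make the definition concrete, here is a minimal Python sketch of an exact percentile using the nearest-rank method. This is an illustration only: the nearest-rank rule is one common definition among several, the helper name `nearest_rank_percentile` and the sample durations are hypothetical, and the sketch doesn't describe how APL computes its results.

```python
import math


def nearest_rank_percentile(values, p):
    """Return the smallest value with at least p percent of values at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(values)
    # 1-based rank of the element that covers p percent of the data.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical request durations in milliseconds, with one slow outlier.
durations_ms = [120, 95, 300, 150, 110, 2400, 130, 105, 140, 125]

p50 = nearest_rank_percentile(durations_ms, 50)  # 125
p90 = nearest_rank_percentile(durations_ms, 90)  # 300
p95 = nearest_rank_percentile(durations_ms, 95)  # 2400
```

On this sample, the 50th percentile stays close to the typical request, while the 95th percentile is dominated by the single slow request. That sensitivity is why high percentiles are commonly used to describe tail latency.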
The `percentile` aggregation in APL is a statistical aggregation that returns estimated results. The estimation trades accuracy for speed: `percentile` is fast and light on resources even on a large or high-cardinality dataset, but it doesn't provide precise results.

Splunk SPL users
In Splunk SPL, the percentile function is referred to as `perc` or `percentile`. APL's `percentile` function works similarly, but the syntax is different. The main difference is that APL requires you to explicitly specify both the column to which you apply the percentile and the target percentile value.

ANSI SQL users
In ANSI SQL, you use the `PERCENTILE_CONT` or `PERCENTILE_DISC` functions to compute percentiles. In APL, the `percentile` function provides a simpler syntax while offering similar functionality.

Use the `percentile` function to identify the 95th percentile of request durations, which gives you an idea of the tail-end latencies of requests in your system.

| percentile_req_duration_ms |
| --- |
| 1200 |
- `avg`: Use `avg` to calculate the average of a column, which gives you the central tendency of your data. In contrast, `percentile` provides more insight into the distribution and tail values.
- `min`: The `min` function returns the smallest value in a column. Use it when you need the absolute lowest value instead of a specific percentile.
- `max`: The `max` function returns the highest value in a column. It's useful for finding the upper bound, while `percentile` lets you focus on a specific point in the data distribution.
- `stdev`: The `stdev` function calculates the standard deviation of a column, which measures data variability. While `stdev` describes the overall spread of the data, `percentile` focuses on specific distribution points.
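
The contrast between these aggregations and a percentile is easiest to see on a small skewed sample. The following Python sketch uses the standard `statistics` module on hypothetical request durations. It computes exact values for illustration and isn't a model of APL's estimated results.

```python
import math
import statistics

# Hypothetical request durations in milliseconds, with one slow outlier.
durations_ms = [120, 95, 300, 150, 110, 2400, 130, 105, 140, 125]
ordered = sorted(durations_ms)

summary = {
    "avg": statistics.mean(durations_ms),      # 367.5, pulled up by the outlier
    "min": min(durations_ms),                  # 95
    "max": max(durations_ms),                  # 2400
    "stdev": statistics.stdev(durations_ms),   # sample standard deviation
    "p50": statistics.median(durations_ms),    # 127.5, the typical request
    # Nearest-rank 90th percentile: smallest value covering 90% of the data.
    "p90": ordered[math.ceil(90 * len(ordered) / 100) - 1],  # 300
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Here the average is almost three times the median because a single slow request skews it, and the minimum and maximum describe only the extremes. The percentiles show where most requests actually fall and how far the tail extends.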