Curve-Based Discrimination
Uses ROC, AUC, KS, Gini, and related curves to measure how well the model separates positives from negatives across the full score distribution.
Overview
The Curve-Based Discrimination bucket measures how well your modelβs scores separate positive from negative outcomes across the full score range, using classic evaluation curves.
This bucket is implemented for binary classification. Multiclass models can be evaluated via one-vs-rest or macro-averaged strategies by defining separate metrics per class.
It uses:
- Receiver Operating Characteristic (ROC) and AUC-ROC
- Gini coefficient
- KS (KolmogorovβSmirnov) statistic and score
Metrics
All metrics are derived from (fpr, tpr) pairs computed across a grid of thresholds on the positive-class score.
auc_roc
Area under the ROC curve, approximated via the trapezoidal rule:
For adjacent ROC points (fpr_i, tpr_i) and (fpr_{i+1}, tpr_{i+1}):
ΞAUC_i = (fpr_{i+1} β fpr_i) Γ (tpr_{i+1} + tpr_i) / 2
auc_roc = Ξ£_i ΞAUC_igini_coefficient (optional) Linear transform of AUC commonly used in risk/scoring domains:
gini_coefficient = 2 Γ auc_roc β 1ks_statistic
Maximum separation between the cumulative distributions of scores for positives vs negatives:
ks_statistic = max_threshold | CDF_positive(score β₯ t) β CDF_negative(score β₯ t) |ks_score
The threshold value t* at which ks_statistic is attained. This is often used as a candidate operating point.
Data Requirements
{{label_col}}β binary ground truth (0 for negative, 1 for positive){{score_col}}β probability or score for the positive class (continuous){{timestamp_col}}β event timestamp
Base Metric SQL β ROC Components
WITH base AS (
SELECT
{{timestamp_col}} AS event_ts,
{{label_col}} AS label,
{{score_col}} AS score
FROM {{dataset}}
),
grid AS (
SELECT generate_series(0.0, 1.0, 0.01) AS threshold
),
scored AS (
SELECT
time_bucket(INTERVAL '5 minutes', event_ts) AS ts,
g.threshold,
label,
score,
CASE WHEN score >= g.threshold THEN 1 ELSE 0 END AS pred_pos
FROM base
CROSS JOIN grid g
)
SELECT
ts AS ts,
threshold AS threshold,
SUM(CASE WHEN label = 1 THEN 1 ELSE 0 END) AS actual_pos,
SUM(CASE WHEN label = 0 THEN 1 ELSE 0 END) AS actual_neg,
SUM(CASE WHEN pred_pos = 1 AND label = 1 THEN 1 ELSE 0 END) AS tp,
SUM(CASE WHEN pred_pos = 1 AND label = 0 THEN 1 ELSE 0 END) AS fp
FROM scored
GROUP BY ts, threshold
ORDER BY ts, threshold;You can materialize this as {{bucket_5_curve_metrics_daily}}.
Computing AUC-ROC, Gini, and KS
AUC-ROC (Binary Only)
WITH roc_points AS (
SELECT
day,
threshold,
tpr,
fpr,
LAG(fpr) OVER (PARTITION BY day ORDER BY fpr) AS prev_fpr,
LAG(tpr) OVER (PARTITION BY day ORDER BY fpr) AS prev_tpr
FROM {{bucket_5_curve_metrics_daily}}
)
SELECT
day,
SUM(
COALESCE((fpr - prev_fpr) * (tpr + prev_tpr) / 2.0, 0.0)
) AS auc_roc
FROM roc_points
GROUP BY day;Gini Coefficient
SELECT
day,
auc_roc,
2 * auc_roc - 1 AS gini_coefficient
FROM {{bucket_5_auc_metrics}};KS Statistic and Score
WITH roc_points AS (
SELECT
day,
threshold,
tpr,
fpr
FROM {{bucket_5_curve_metrics_daily}}
),
cum AS (
SELECT
day,
threshold,
SUM(tpr) OVER (PARTITION BY day ORDER BY threshold DESC) AS cum_tpr,
SUM(fpr) OVER (PARTITION BY day ORDER BY threshold DESC) AS cum_fpr
FROM roc_points
)
SELECT
day,
MAX(ABS(cum_tpr - cum_fpr)) AS ks_statistic,
FIRST_VALUE(threshold) OVER (
PARTITION BY day
ORDER BY ABS(cum_tpr - cum_fpr) DESC
) AS ks_score
FROM cum
GROUP BY day;Plots
Plot 1β AUC + Gini Over Time
Uses:
auc_rocgini_coefficient
SELECT
time_bucket_gapfill(
'1 day',
timestamp,
'{{dateStart}}'::timestamptz,
'{{dateEnd}}'::timestamptz
) AS time_bucket_1d,
metric_name,
CASE
WHEN metric_name = 'auc_roc' THEN 'AUC ROC'
WHEN metric_name = 'gini_coefficient' THEN 'Gini Coefficient'
ELSE metric_name
END AS friendly_name,
COALESCE(AVG(value), 0) AS metric_value
FROM metrics_numeric_latest_version
WHERE metric_name IN (
'auc_roc',
'gini_coefficient'
)
[[AND timestamp BETWEEN '{{dateStart}}' AND '{{dateEnd}}']]
GROUP BY time_bucket_1d, metric_name
ORDER BY time_bucket_1d, metric_name;What this shows
This plot tracks both AUC and its corresponding Gini coefficient over time, providing two familiar views of discrimination strength.
How to interpret it
- Sustained drops in auc_roc or gini_coefficient usually indicate that the modelβs ability to separate positives from negatives has degraded.
- If AUC remains stable while business KPIs worsen, the issue may be threshold selection or data mix, not raw score discrimination.
- Gini is often the metric business stakeholders and regulators expect in credit/risk settings, so having both on one chart helps bridge ML and business views.
Plot 2 β KS Statistic Over Time
Uses:
ks_statisticks_score
SELECT
time_bucket_gapfill(
'1 day',
timestamp,
'{{dateStart}}'::timestamptz,
'{{dateEnd}}'::timestamptz
) AS time_bucket_1d,
metric_name,
CASE
WHEN metric_name = 'ks_statistic' THEN 'KS Statistic'
WHEN metric_name = 'ks_score' THEN 'KS Score'
ELSE metric_name
END AS friendly_name,
COALESCE(AVG(value), 0) AS metric_value
FROM metrics_numeric_latest_version
WHERE metric_name IN (
'ks_statistic',
'ks_score'
)
[[AND timestamp BETWEEN '{{dateStart}}' AND '{{dateEnd}}']]
GROUP BY time_bucket_1d, metric_name
ORDER BY time_bucket_1d, metric_name;What this shows
This plot tracks the maximum separation between positive and negative score distributions (ks_statistic) and where that separation occurs in score space (ks_score).
How to interpret it
- Higher
ks_statisticvalues indicate stronger separation; many credit models target KS in a specific band (e.g., 0.3β0.5). - Changes in
ks_scoretell you where in the score range the separation is happeningβif it drifts, your best cut-off point may be moving. - If KS drops while AUC is flat, the problem may be in specific parts of the distribution (e.g., tail behavior) rather than global ranking.
Plot 3β Combined AUC + KS
Uses:
auc_rocks_statisticks_score
SELECT
time_bucket_gapfill(
'1 day',
timestamp,
'{{dateStart}}'::timestamptz,
'{{dateEnd}}'::timestamptz
) AS time_bucket_1d,
metric_name,
CASE
WHEN metric_name = 'auc_roc' THEN 'ROC AUC'
WHEN metric_name = 'ks_statistic' THEN 'KS Statistic'
WHEN metric_name = 'ks_score' THEN 'KS Score'
ELSE metric_name
END AS friendly_name,
COALESCE(AVG(value), 0) AS metric_value
FROM metrics_numeric_latest_version
WHERE metric_name IN (
'auc_roc',
'ks_statistic',
'ks_score'
)
[[AND timestamp BETWEEN '{{dateStart}}' AND '{{dateEnd}}']]
GROUP BY time_bucket_1d, metric_name
ORDER BY time_bucket_1d, metric_name;
What this shows
This plot overlays global ranking quality (AUC) with maximum local separation (KS) on the same time axis.
How to interpret it
- When AUC and KS move together, the modelβs ranking power is consistently changing.
- Divergence (e.g., AUC flat, KS moving) suggests that certain score regions become more/less separative even though global ranking is unchanged.
- This is an excellent high-level monitoring view for risk committees and model governance reviews.
Plot 4 β KS-Only Variant (optional)
Uses:
ks_statistic
SELECT
time_bucket_gapfill(
'1 day',
timestamp,
'{{dateStart}}'::timestamptz,
'{{dateEnd}}'::timestamptz
) AS time_bucket_1d,
metric_name,
CASE
WHEN metric_name = 'ks_statistic' THEN 'KS Statistic'
ELSE metric_name
END AS friendly_name,
COALESCE(AVG(value), 0) AS metric_value
FROM metrics_numeric_latest_version
WHERE metric_name IN (
'ks_statistic',
)
[[AND timestamp BETWEEN '{{dateStart}}' AND '{{dateEnd}}']]
GROUP BY time_bucket_1d, metric_name
ORDER BY time_bucket_1d, metric_name;
What this shows
A lightweight KS-only time series focusing on separation strength.
How to interpret it
- Useful when you want a single scalar to monitor drift in discrimination.
- Can be wired into simple guardrails (e.g., βalert if KS falls below 0.25β).
Alternative SQL
WITH scored AS (
SELECT
time_bucket(INTERVAL '1 day', {{timestamp_col}}) AS bucket,
{{ground_truth}} AS label,
{{score_col}} AS score
FROM
{{dataset}}
),
-- Total positives / negatives per bucket
bucket_totals AS (
SELECT
bucket,
SUM(CASE WHEN label = 1 THEN 1 ELSE 0 END)::float AS num_pos,
SUM(CASE WHEN label = 0 THEN 1 ELSE 0 END)::float AS num_neg
FROM
scored
GROUP BY
bucket
),
-- Cumulative TP / FP as we sweep thresholds from high score to low score
roc_raw AS (
SELECT
s.bucket,
s.score,
SUM(CASE WHEN s.label = 1 THEN 1 ELSE 0 END)
OVER (
PARTITION BY s.bucket
ORDER BY s.score DESC
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
)::float AS cum_pos,
SUM(CASE WHEN s.label = 0 THEN 1 ELSE 0 END)
OVER (
PARTITION BY s.bucket
ORDER BY s.score DESC
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
)::float AS cum_neg
FROM
scored s
),
-- Turn cumulative counts into TPR/FPR points
roc_points AS (
SELECT DISTINCT
r.bucket,
r.score AS threshold,
CASE WHEN bt.num_pos > 0 THEN r.cum_pos / bt.num_pos ELSE 0 END AS tpr,
CASE WHEN bt.num_neg > 0 THEN r.cum_neg / bt.num_neg ELSE 0 END AS fpr
FROM
roc_raw r
JOIN bucket_totals bt USING (bucket)
),
-- Add (0,0) and (1,1) endpoints for a closed ROC curve
roc_with_endpoints AS (
-- (0,0) per bucket
SELECT
bt.bucket,
0.0::float AS fpr,
0.0::float AS tpr
FROM
bucket_totals bt
UNION ALL
-- All ROC points
SELECT
bucket,
fpr,
tpr
FROM
roc_points
UNION ALL
-- (1,1) per bucket
SELECT
bt.bucket,
1.0::float AS fpr,
1.0::float AS tpr
FROM
bucket_totals bt
),
-- Order ROC points and keep previous point for trapezoids
roc_ordered AS (
SELECT
bucket,
fpr,
tpr,
LAG(fpr) OVER (PARTITION BY bucket ORDER BY fpr, tpr) AS prev_fpr,
LAG(tpr) OVER (PARTITION BY bucket ORDER BY fpr, tpr) AS prev_tpr
FROM
roc_with_endpoints
),
-- Area Under the Curve (AUC) via trapezoidal rule
auc_per_bucket AS (
SELECT
bucket,
COALESCE(
SUM(
CASE
WHEN prev_fpr IS NULL THEN 0
ELSE (fpr - prev_fpr) * (tpr + prev_tpr) / 2.0
END
),
0
) AS area_under_curve -- AUC: ability to distinguish between classes
FROM
roc_ordered
GROUP BY
bucket
),
-- Kolmogorov-Smirnov Statistic: max |TPR - FPR|
ks_per_bucket AS (
SELECT
bucket,
MAX(ABS(tpr - fpr)) AS kolmogorov_smirnov_statistic
FROM
roc_with_endpoints
GROUP BY
bucket
)
SELECT
a.bucket AS bucket,
a.area_under_curve AS auc_roc,
k.kolmogorov_smirnov_statistic AS ks_statistic,
k.kolmogorov_smirnov_statistic AS ks_score
FROM
auc_per_bucket a
JOIN ks_per_bucket k USING (bucket)
ORDER BY
a.bucket;
Updated about 1 month ago