Custom Scoring Rubrics

Gradient’s default rubric evaluates candidates across five categories (Deliverable Quality, Delegation, Description, Discernment, Diligence) totaling 100 points. You can customize this rubric to match your organization’s priorities.

Default rubric

Category             Points  Scoring method
Deliverable Quality  40      LLM Judge
Delegation           10      Hybrid
Description          20      LLM Judge
Discernment          20      Hybrid
Diligence            10      Event Analysis

Each category contains multiple sub-criteria with individual point values and descriptions.
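
As a quick sanity check, the default weights from the table can be written down as data and summed. This is a minimal Python sketch, not part of the Gradient API: the name `DEFAULT_RUBRIC` is made up for illustration, and the method strings `hybrid` and `event_analysis` are assumed spellings modeled on the `llm_judge` value used in the API example on this page.

```python
# Default category weights as listed in the table above.
# "hybrid" and "event_analysis" are assumed spellings, modeled on "llm_judge".
DEFAULT_RUBRIC = {
    "Deliverable Quality": (40, "llm_judge"),
    "Delegation": (10, "hybrid"),
    "Description": (20, "llm_judge"),
    "Discernment": (20, "hybrid"),
    "Diligence": (10, "event_analysis"),
}

# Add up the points column to confirm the default rubric totals 100.
total_points = sum(points for points, _method in DEFAULT_RUBRIC.values())
print(total_points)  # 100
```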

Customizing the rubric

Update an assessment’s scoring rubric via the API:
curl -X PATCH "https://app.trygradient.ai/api/assessments/ASSESSMENT_ID" \
  -H "Authorization: Bearer gai_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "scoring_rubric": {
      "version": 1,
      "totalPoints": 100,
      "categories": [
        {
          "id": "deliverable_quality",
          "name": "Deliverable Quality",
          "points": 50,
          "scoringMethod": "llm_judge",
          "subCriteria": [
            {
              "id": "content_accuracy",
              "name": "Content Accuracy",
              "points": 20,
              "description": "Are claims supported by data sources?",
              "enabled": true
            },
            {
              "id": "analytical_depth",
              "name": "Analytical Depth",
              "points": 15,
              "description": "Does the work go beyond surface-level summary?",
              "enabled": true
            },
            {
              "id": "structure",
              "name": "Structure & Organization",
              "points": 15,
              "description": "Is it logically organized?",
              "enabled": true
            }
          ]
        }
      ],
      "modifiers": {
        "bonusMax": 5,
        "penaltyMax": -10
      }
    }
  }'
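
The same request could be prepared from Python with only the standard library. The sketch below builds the PATCH request without sending it and first checks that each category's sub-criteria add up to the category's points. That check is an assumption inferred from the example above (20 + 15 + 15 = 50), not a documented rule, and the helper names `check_category` and `build_rubric_request` are hypothetical.

```python
import json
import urllib.request

API_BASE = "https://app.trygradient.ai/api/assessments"


def check_category(category: dict) -> None:
    """Raise ValueError if sub-criteria points do not add up to the category's points.

    Assumption: the curl example's sub-criteria sum to the category total,
    so this sketch treats that as a rule worth checking before sending.
    """
    sub_total = sum(sub["points"] for sub in category["subCriteria"])
    if sub_total != category["points"]:
        raise ValueError(
            f"{category['id']}: sub-criteria total {sub_total}, "
            f"category allots {category['points']}"
        )


def build_rubric_request(assessment_id: str, api_key: str, rubric: dict) -> urllib.request.Request:
    """Validate the rubric and return an unsent PATCH request for it."""
    for category in rubric["categories"]:
        check_category(category)
    body = json.dumps({"scoring_rubric": rubric}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/{assessment_id}",
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# Same single-category rubric as the curl example, with descriptions omitted.
rubric = {
    "version": 1,
    "totalPoints": 100,
    "categories": [
        {
            "id": "deliverable_quality",
            "name": "Deliverable Quality",
            "points": 50,
            "scoringMethod": "llm_judge",
            "subCriteria": [
                {"id": "content_accuracy", "name": "Content Accuracy", "points": 20, "enabled": True},
                {"id": "analytical_depth", "name": "Analytical Depth", "points": 15, "enabled": True},
                {"id": "structure", "name": "Structure & Organization", "points": 15, "enabled": True},
            ],
        }
    ],
    "modifiers": {"bonusMax": 5, "penaltyMax": -10},
}

request = build_rubric_request("ASSESSMENT_ID", "gai_your_key", rubric)
# Passing `request` to urllib.request.urlopen would send it; that step is
# left out here so the sketch has no side effects.
```

Validating locally before the call keeps a mistyped point value from reaching the scoring engine.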

Scoring methods

Each category can use one of three scoring methods:

LLM Judge

An AI evaluator assesses quality against your rubric criteria. Best for subjective qualities like content depth and writing quality.

Event Analysis

Automated analysis of candidate behavior patterns. Best for measurable actions like data source usage and iteration count.

Hybrid

Combines both approaches. The final score blends AI evaluation with behavioral signals.
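
This page does not say how the Hybrid blend is weighted. Purely to illustrate the idea, the sketch below assumes a simple weighted average; the function `blend_hybrid_score` and its `judge_weight` parameter are hypothetical and do not describe Gradient's actual scoring engine.

```python
def blend_hybrid_score(judge_score: float, event_score: float, judge_weight: float = 0.5) -> float:
    """Blend an LLM-judge score with an event-analysis score.

    Assumption: a linear weighted average. The real blend is not documented here.
    """
    if not 0.0 <= judge_weight <= 1.0:
        raise ValueError("judge_weight must be between 0 and 1")
    return judge_weight * judge_score + (1.0 - judge_weight) * event_score


# Equal weighting of a judge score of 8 and a behavioral score of 6.
print(blend_hybrid_score(8.0, 6.0))  # 7.0
```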

Tips for effective rubrics

  • Keep total points at 100 - The scoring engine normalizes to 100 points for percentile calculations, so a different total will produce unexpected percentile rankings.
  • Weight what matters most - If deliverable quality matters more than AI usage patterns for your role, allocate more points to it.
  • Write specific sub-criteria descriptions - The LLM judge evaluates against these descriptions, so vague descriptions produce vague scores.
  • Give the judge role-specific context - Use customInstructions on sub-criteria (e.g., “For a data analyst role, prioritize accuracy of calculations over visual design”).
  • Disable sub-criteria you don’t need - Set enabled: false rather than removing them, so you can re-enable them later.
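
Several of these tips can be applied to a rubric before it is sent. The Python sketch below switches a sub-criterion off without deleting it and confirms the category points still total 100. The helper names are hypothetical, the ids and weights of the last four categories are invented for illustration, and placing `customInstructions` next to `description` is an assumption about the payload shape.

```python
import copy


def disable_sub_criterion(rubric: dict, category_id: str, sub_id: str) -> dict:
    """Return a copy of the rubric with one sub-criterion switched off, not deleted."""
    updated = copy.deepcopy(rubric)
    for category in updated["categories"]:
        if category["id"] != category_id:
            continue
        for sub in category.get("subCriteria", []):
            if sub["id"] == sub_id:
                sub["enabled"] = False
                return updated
    raise KeyError(f"{category_id}/{sub_id} not found")


def category_points_total(rubric: dict) -> int:
    """Sum the points of every category so the total can be compared with 100."""
    return sum(category["points"] for category in rubric["categories"])


rubric = {
    "version": 1,
    "totalPoints": 100,
    "categories": [
        {
            "id": "deliverable_quality",
            "points": 50,
            "subCriteria": [
                {
                    "id": "content_accuracy",
                    "points": 20,
                    "description": "Are claims supported by data sources?",
                    # Assumed placement of customInstructions on the sub-criterion.
                    "customInstructions": "For a data analyst role, prioritize "
                    "accuracy of calculations over visual design",
                    "enabled": True,
                },
                {"id": "analytical_depth", "points": 15, "enabled": True},
                {"id": "structure", "points": 15, "enabled": True},
            ],
        },
        # Hypothetical ids and weights for the remaining categories.
        {"id": "delegation", "points": 10},
        {"id": "description", "points": 15},
        {"id": "discernment", "points": 15},
        {"id": "diligence", "points": 10},
    ],
}

updated = disable_sub_criterion(rubric, "deliverable_quality", "structure")
print(category_points_total(updated) == rubric["totalPoints"])  # True
```

How a disabled sub-criterion's points are redistributed is not described on this page, so it is worth checking the resulting scores on a test assessment.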