Evaluation Sessions

Learn how to group evaluations into sessions for context and audit.

Overview

Evaluation sessions group related evaluations into a single logical workflow. Instead of treating each evaluation as an isolated event, sessions let the policy engine see the full history of what has happened so far -- which tools were called, what data was accessed, how many actions have occurred -- and make decisions based on that context.

When to Use Sessions

Sessions are useful whenever you have multi-step workflows where individual actions should be evaluated in context:

  • AI agent conversations -- Each tool call or message in a conversation is a separate evaluation, but the policy engine needs to know what the agent has already done.
  • Document processing pipelines -- A pipeline that reads, transforms, and exports data can be tracked as a single session.
  • Batch operations -- Group related API calls (e.g., bulk email sends) to enforce aggregate limits.
  • Multi-turn chat -- Track what topics have been discussed and what data has been shared across turns.

For simple, one-off evaluations (a single API request that does not relate to other requests), sessions are optional.

Agent integrations automatically create a session when the evaluation request does not include a sessionId, and return the auto-created session ID in the response. You only need to manage sessions manually if you want to control their lifecycle.
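
One way to work with auto-created sessions is to capture the ID from the first response and attach it to later requests. The sketch below is illustrative only: it assumes the ID is exposed as `data.sessionId` in the evaluation response, which this page does not confirm, and the `SessionTracker` class is a hypothetical helper, not part of Aguardic.

```typescript
// Hypothetical response shape: the `sessionId` field under `data` is an
// assumption. Check the actual evaluation response for where the ID appears.
interface EvaluationResponse {
  data: { outcome?: string; sessionId?: string };
}

// Remembers the first session ID it sees and attaches it to later requests.
class SessionTracker {
  private sessionId: string | undefined;

  // Record the session ID from a response if none has been recorded yet.
  observe(response: EvaluationResponse): void {
    if (this.sessionId === undefined && response.data.sessionId) {
      this.sessionId = response.data.sessionId;
    }
  }

  // Return a copy of the request body with the known session ID added, if any.
  attach(body: Record<string, unknown>): Record<string, unknown> {
    return this.sessionId === undefined
      ? { ...body }
      : { ...body, sessionId: this.sessionId };
  }
}
```

Call `observe` on each parsed response and `attach` on each outgoing body; the first request goes out without a sessionId and every later one reuses the auto-created session.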

Session Lifecycle

Sessions follow a simple state machine:

ACTIVE  ──>  COMPLETED   (ended normally)
   |
   └──────>  TERMINATED  (ended early due to violation, timeout, or manual stop)

  • ACTIVE -- The session is open and accepting new evaluations. As evaluations run within the session, Aguardic tracks actions, data tags, tools used, and action counts automatically.
  • COMPLETED -- The session ended normally. No further evaluations can be added.
  • TERMINATED -- The session was ended early, typically due to a critical policy violation, a timeout, or a manual stop. No further evaluations can be added.
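
If you mirror session status in your own code, the state machine above can be sketched as a small transition table. The type and function names here are illustrative and not part of the Aguardic API.

```typescript
type SessionStatus = "ACTIVE" | "COMPLETED" | "TERMINATED";

// Allowed transitions, taken from the diagram above: only ACTIVE sessions
// can change state, and both end states are final.
const TRANSITIONS: Record<SessionStatus, readonly SessionStatus[]> = {
  ACTIVE: ["COMPLETED", "TERMINATED"],
  COMPLETED: [],
  TERMINATED: [],
};

// True if a session in state `from` may move to state `to`.
function canTransition(from: SessionStatus, to: SessionStatus): boolean {
  return TRANSITIONS[from].some((next) => next === to);
}

// Only ACTIVE sessions accept new evaluations.
function acceptsEvaluations(status: SessionStatus): boolean {
  return status === "ACTIVE";
}
```

A client-side check like this can skip a request that is known to fail, for example an evaluation against a session that has already been ended.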

Complete Workflow Example

Here is a full TypeScript example that creates a session, runs multiple evaluations within it, inspects the session state, and ends it.

1. Create the session

const API_BASE = "https://api.aguardic.com/v1";
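const API_KEY = process.env.AGUARDIC_API_KEY;
if (!API_KEY) {
  // Fail fast instead of sending "Bearer undefined"
  throw new Error("AGUARDIC_API_KEY environment variable is not set");
}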
 
const headers = {
  Authorization: `Bearer ${API_KEY}`,
  "Content-Type": "application/json",
};
 
// Create a new session
const sessionRes = await fetch(`${API_BASE}/evaluation-sessions`, {
  method: "POST",
  headers,
  body: JSON.stringify({
    externalSessionId: "agent-conv-2025-03-10-abc",
    metadata: {
      userId: "user-456",
      channel: "web-chat",
    },
  }),
});
 
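// fetch does not reject on HTTP error statuses, so check explicitly
if (!sessionRes.ok) {
  throw new Error(`Failed to create session: HTTP ${sessionRes.status}`);
}
const session = await sessionRes.json();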
const sessionId = session.data.id;
console.log("Session created:", sessionId);

2. Evaluate actions within the session

Pass the sessionId to each evaluation request. Aguardic automatically tracks each evaluation as an action in the session.

// First action: agent queries a database
const eval1 = await fetch(`${API_BASE}/evaluate`, {
  method: "POST",
  headers,
  body: JSON.stringify({
    sessionId,
    input: {
      tool: "query_database",
      arguments: { query: "SELECT name, email FROM customers WHERE id = 123" },
    },
    targetKey: "tool-call",
  }),
});
const result1 = await eval1.json();
console.log("Action 1 outcome:", result1.data.outcome);
 
// Second action: agent sends an email containing the query results
const eval2 = await fetch(`${API_BASE}/evaluate`, {
  method: "POST",
  headers,
  body: JSON.stringify({
    sessionId,
    input: {
      tool: "send_email",
      arguments: {
        to: "external@example.com",
        subject: "Customer Details",
        body: "Here are the customer details: John Doe, john@example.com",
      },
    },
    targetKey: "tool-call",
  }),
});
const result2 = await eval2.json();
console.log("Action 2 outcome:", result2.data.outcome);
 
// Third action: agent tries to access a restricted resource
const eval3 = await fetch(`${API_BASE}/evaluate`, {
  method: "POST",
  headers,
  body: JSON.stringify({
    sessionId,
    input: {
      tool: "query_database",
      arguments: { query: "SELECT ssn, credit_card FROM customers" },
    },
    targetKey: "tool-call",
  }),
});
const result3 = await eval3.json();
console.log("Action 3 outcome:", result3.data.outcome);
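
The three requests above differ only in the tool name and its arguments, so the shared parts can be factored into a small helper. This is a minimal sketch; `buildToolCallEvaluation` and `ToolCallRequest` are hypothetical names, and the body simply mirrors the fields used in the requests above.

```typescript
interface ToolCallRequest {
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build the fetch options for one tool-call evaluation in a session.
// The body mirrors the requests above: sessionId, input.tool,
// input.arguments, and targetKey "tool-call".
function buildToolCallEvaluation(
  sessionId: string,
  headers: Record<string, string>,
  tool: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return {
    method: "POST",
    headers,
    body: JSON.stringify({
      sessionId,
      input: { tool, arguments: args },
      targetKey: "tool-call",
    }),
  };
}

// Usage, with API_BASE, headers, and sessionId defined as in step 1:
//   const res = await fetch(
//     `${API_BASE}/evaluate`,
//     buildToolCallEvaluation(sessionId, headers, "query_database", {
//       query: "SELECT name, email FROM customers WHERE id = 123",
//     })
//   );
```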

3. Inspect the session

Retrieve the session to see the accumulated context and full action chain.

const inspectRes = await fetch(
  `${API_BASE}/evaluation-sessions/${sessionId}`,
  { headers }
);
const inspected = await inspectRes.json();
 
console.log("Action count:", inspected.data.actionCount);
console.log("Data tags:", inspected.data.dataTags);
console.log("Tools used:", inspected.data.toolsUsed);
console.log("Status:", inspected.data.status);
 
// Iterate over the action chain
for (const action of inspected.data.actions) {
  console.log(
    `  #${action.sequence}: ${action.toolName} -> ${action.outcome}`
  );
}
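
For longer sessions, a per-tool summary of outcomes can be easier to read than the raw action chain. The sketch below uses only the action fields shown in the loop above (`sequence`, `toolName`, `outcome`); their types and the `summarizeActions` helper are assumptions made for illustration.

```typescript
// Fields taken from the inspection example above; the types are assumed.
interface SessionAction {
  sequence: number;
  toolName: string;
  outcome: string;
}

// Count how many actions ended with each outcome, grouped by tool.
function summarizeActions(
  actions: SessionAction[]
): Record<string, Record<string, number>> {
  const summary: Record<string, Record<string, number>> = {};
  for (const action of actions) {
    const perTool =
      summary[action.toolName] ?? (summary[action.toolName] = {});
    perTool[action.outcome] = (perTool[action.outcome] ?? 0) + 1;
  }
  return summary;
}
```

Passing `inspected.data.actions` to `summarizeActions` yields one entry per tool, each mapping an outcome to the number of times it occurred.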

4. End the session

const endRes = await fetch(
  `${API_BASE}/evaluation-sessions/${sessionId}/end`,
  {
    method: "POST",
    headers,
    body: JSON.stringify({ status: "COMPLETED" }),
  }
);
const ended = await endRes.json();
console.log("Session ended:", ended.data.status);
console.log("Final action count:", ended.data.actionCount);

Context-Aware Policy Evaluation

The real power of sessions is context-aware policy evaluation. When you include a sessionId in an evaluation request, the policy engine automatically receives the session's accumulated context:

| Context Field | Description |
|---------------|-------------|
| actionCount | Total number of evaluations run so far in the session. |
| dataTags | Aggregated data categories detected across all previous actions (e.g., ["pii", "financial"]). |
| toolsUsed | All tools or operations invoked so far in the session. |
| recentActions | The most recent actions in the session for pattern detection. |

This context is available to both deterministic and semantic rules. Here are examples of policies you can build with session context:

  • Rate limiting: Block if actionCount exceeds 20 in a single session.
  • Escalation: Warn if the same tool is called more than 5 times (repetitive behavior pattern).
  • Cross-action PII tracking: Block an email send if PII was detected earlier in the session (the dataTags will contain "pii" from a previous action).
  • Sensitive data exfiltration: Block if the agent queries a database containing PII and then tries to send an email to an external address within the same session.

Without sessions, each evaluation is independent -- the policy engine has no way to know that the agent queried PII data two steps ago before trying to email it externally.
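
To make the logic of these example policies concrete, here is a plain TypeScript sketch of three of them. It is not Aguardic policy syntax, which this page does not show. The shape of `recentActions` entries, the use of `send_email` as the email tool, and the decision labels are all assumptions made for illustration.

```typescript
interface SessionContext {
  actionCount: number;
  dataTags: string[];
  toolsUsed: string[];
  recentActions: { toolName: string }[]; // entry shape is an assumption
}

type Decision = "ALLOW" | "WARN" | "BLOCK"; // illustrative labels only

// Decide what to do with the next tool call, given the session so far.
function decide(context: SessionContext, nextTool: string): Decision {
  // Rate limiting: block once the session has exceeded 20 actions.
  if (context.actionCount > 20) return "BLOCK";

  // Cross-action PII tracking: block an email send if PII was seen earlier.
  const sawPii = context.dataTags.some((tag) => tag === "pii");
  if (nextTool === "send_email" && sawPii) return "BLOCK";

  // Escalation: warn if this call would be the sixth or later for one tool.
  const priorCalls = context.recentActions.filter(
    (action) => action.toolName === nextTool
  ).length;
  if (priorCalls + 1 > 5) return "WARN";

  return "ALLOW";
}
```

None of these checks can be written against a single evaluation in isolation: each one reads a field that only exists because earlier actions were recorded in the same session.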

Linking Sessions to Entities

Sessions can be linked to governed entities via the entityId field when creating a session. When a session is linked to an entity:

  • The entity's knowledge base provides additional context for semantic rule evaluation.
  • Entity-specific policies are included in the evaluation.
  • The session appears in the entity's activity history for audit purposes.

const sessionRes = await fetch(`${API_BASE}/evaluation-sessions`, {
  method: "POST",
  headers,
  body: JSON.stringify({
    entityId: "patient-uuid-here",
    externalSessionId: "consultation-2025-03-10",
    metadata: { clinicId: "clinic-001" },
  }),
});

All evaluations within this session will now include the patient's context -- medication lists, treatment history, and any other documents in their knowledge base -- when evaluating semantic rules.

Best Practices

  • One session per logical workflow. Do not reuse sessions across unrelated workflows. Create a new session for each agent conversation, pipeline run, or batch operation.
  • Use externalSessionId for cross-referencing. Map Aguardic sessions to your own system identifiers (conversation IDs, job IDs, etc.) for easier debugging.
  • End sessions explicitly. Sessions created with expiresAt expire on their own, but explicitly ending them with COMPLETED or TERMINATED provides a cleaner audit trail.
  • Use metadata for filtering. Add structured metadata (user ID, channel, environment) when creating sessions to make them easier to find and filter in the dashboard.
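
One way to make sure sessions are always ended explicitly is to wrap the workflow in a helper that ends the session in a `finally` block. This is a sketch under stated assumptions: `SessionClient` and `withSession` are hypothetical names, with `create` and `end` standing in for the POST calls shown in the workflow example. Treating a thrown error as an early termination is a design choice made here, not documented Aguardic behavior.

```typescript
type EndStatus = "COMPLETED" | "TERMINATED";

// Hypothetical wrapper around the create and end requests shown earlier.
interface SessionClient {
  create(): Promise<string>; // resolves to the new session ID
  end(sessionId: string, status: EndStatus): Promise<void>;
}

// Run `work` inside a fresh session and always end the session afterwards:
// COMPLETED if `work` resolves, TERMINATED if it throws.
async function withSession<T>(
  client: SessionClient,
  work: (sessionId: string) => Promise<T>
): Promise<T> {
  const sessionId = await client.create();
  let status: EndStatus = "TERMINATED";
  try {
    const result = await work(sessionId);
    status = "COMPLETED";
    return result;
  } finally {
    await client.end(sessionId, status);
  }
}
```

Because each call creates its own session, this pattern also keeps to one session per logical workflow.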

Next Steps