Evaluation Sessions

Learn how to group evaluations into sessions for context and audit.

Overview

Evaluation sessions group related evaluations into a single logical workflow. Instead of treating each evaluation as an isolated event, sessions let the policy engine see the full history of what has happened so far -- which tools were called, what data was accessed, how many actions have occurred -- and make decisions based on that context.

When to Use Sessions

Sessions are useful whenever you have multi-step workflows where individual actions should be evaluated in context:

AI agent conversations: Each tool call or message in a conversation is a separate evaluation, but the policy engine needs to know what the agent has already done.

Document processing pipelines: A pipeline that reads, transforms, and exports data can be tracked as a single session.

Batch operations: Group related API calls (e.g., bulk email sends) to enforce aggregate limits.

Multi-turn chat: Track what topics have been discussed and what data has been shared across turns.

For simple, one-off evaluations (a single API request that does not relate to other requests), sessions are optional.

Agent integrations create a session automatically when the evaluation request carries no sessionId, and the auto-created session ID is returned in the response. You only need to manage sessions yourself if you want to control their lifecycle.
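As a rough illustration of reusing an auto-created session, the sketch below wraps an evaluate function so that the session ID from the first response is attached to later requests. The response field name `sessionId`, the `EvaluateFn` type, and the `withAutoSession` helper are assumptions made for this example; check the Evaluate API reference for the actual response shape.

```typescript
// Assumption: the evaluation response exposes the session ID as `sessionId`.
interface EvaluateRequest {
  sessionId?: string;
  input: unknown;
  targetKey: string;
}

interface EvaluateResponse {
  outcome: string;
  sessionId?: string;
}

type EvaluateFn = (req: EvaluateRequest) => Promise<EvaluateResponse>;

// Hypothetical helper: remembers the first session ID it sees and sends it
// with every later request, so all evaluations land in the same session.
function withAutoSession(
  evaluate: EvaluateFn,
): (req: Omit<EvaluateRequest, "sessionId">) => Promise<EvaluateResponse> {
  let sessionId: string | undefined;
  return async (req) => {
    const res = await evaluate({ ...req, sessionId });
    sessionId = sessionId ?? res.sessionId;
    return res;
  };
}
```

The first call goes out without a sessionId, and every call after that carries the ID that came back.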

Session Lifecycle

Sessions follow a simple state machine:

ACTIVE  ──>  COMPLETED   (ended normally)
   |
   ├──────>  EXPIRED     (past expiresAt timestamp)
   |
   └──────>  TERMINATED  (ended early due to violation or manual stop)

ACTIVE: The session is open and accepting new evaluations. As evaluations run within the session, Aguardic tracks actions, data tags, tools used, and action counts automatically.

COMPLETED: The session ended normally. No further evaluations can be added.

EXPIRED: The session's expiresAt timestamp has passed. No further evaluations can be added.

TERMINATED: The session was ended early, typically due to a critical policy violation or manual stop.
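The state machine above can be sketched as a small transition table. This is an illustrative client-side model, not the service's implementation: the `SessionRecord` shape and both helper functions are hypothetical, and it assumes the three end states are terminal, as the diagram suggests.

```typescript
// Session states as named in this guide.
type SessionStatus = "ACTIVE" | "COMPLETED" | "EXPIRED" | "TERMINATED";

// Only ACTIVE has outgoing transitions; the other states are treated as terminal.
const TRANSITIONS: Record<SessionStatus, readonly SessionStatus[]> = {
  ACTIVE: ["COMPLETED", "EXPIRED", "TERMINATED"],
  COMPLETED: [],
  EXPIRED: [],
  TERMINATED: [],
};

function canTransition(from: SessionStatus, to: SessionStatus): boolean {
  return TRANSITIONS[from].includes(to);
}

// Hypothetical local record; the real session object may carry other fields.
interface SessionRecord {
  status: SessionStatus;
  expiresAt?: Date;
}

// True when the session can still accept new evaluations at time `now`.
function acceptsEvaluations(session: SessionRecord, now: Date = new Date()): boolean {
  if (session.status !== "ACTIVE") return false;
  return session.expiresAt === undefined || now.getTime() < session.expiresAt.getTime();
}
```

A check like `acceptsEvaluations` can help a client avoid sending evaluations to a session that has already closed or passed its expiry.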

Using the SDK

The simplest way to use sessions is with the official SDK. Create a session, pass the sessionId to each evaluation, and end the session when the workflow is complete.

import Aguardic from "@aguardic/sdk";
 
const aguardic = new Aguardic(process.env.AGUARDIC_API_KEY);
 
// 1. Create a session
const session = await aguardic.sessions.create({
  externalSessionId: "agent-conv-2025-03-10-abc",
  metadata: { userId: "user-456", channel: "web-chat" },
});
 
// 2. Evaluate actions within the session
const result1 = await aguardic.evaluate({
  sessionId: session.id,
  input: { tool: "query_database", arguments: { query: "SELECT name, email FROM customers WHERE id = 123" } },
  targetKey: "tool-call",
});
console.log("Action 1:", result1.outcome);
 
const result2 = await aguardic.evaluate({
  sessionId: session.id,
  input: { tool: "send_email", arguments: { to: "external@example.com", body: "Customer details: John Doe" } },
  targetKey: "tool-call",
});
console.log("Action 2:", result2.outcome);
 
// 3. End the session
const summary = await aguardic.sessions.end(session.id, { status: "COMPLETED" });
console.log("Actions:", summary.actionCount, "Tools:", summary.toolsUsed);

Don't have the SDK? See the JavaScript / TypeScript SDK for installation and setup. The next section shows the equivalent raw fetch calls for reference.

Raw API Example

The same workflow, written with raw fetch calls instead of the SDK:

1. Create the session

const API_BASE = "https://api.aguardic.com/v1";
const API_KEY = process.env.AGUARDIC_API_KEY;
 
const headers = {
  Authorization: `Bearer ${API_KEY}`,
  "Content-Type": "application/json",
};
 
const sessionRes = await fetch(`${API_BASE}/evaluation-sessions`, {
  method: "POST",
  headers,
  body: JSON.stringify({
    externalSessionId: "agent-conv-2025-03-10-abc",
    metadata: { userId: "user-456", channel: "web-chat" },
  }),
});
 
const session = await sessionRes.json();
const sessionId = session.data.id;

2. Evaluate actions within the session

Pass the sessionId to each evaluation request. Aguardic automatically tracks each evaluation as an action in the session.

const eval1 = await fetch(`${API_BASE}/evaluate`, {
  method: "POST",
  headers,
  body: JSON.stringify({
    sessionId,
    input: {
      tool: "query_database",
      arguments: { query: "SELECT name, email FROM customers WHERE id = 123" },
    },
    targetKey: "tool-call",
  }),
});
const result1 = await eval1.json();
console.log("Action 1 outcome:", result1.data.outcome);
 
const eval2 = await fetch(`${API_BASE}/evaluate`, {
  method: "POST",
  headers,
  body: JSON.stringify({
    sessionId,
    input: {
      tool: "send_email",
      arguments: {
        to: "external@example.com",
        subject: "Customer Details",
        body: "Here are the customer details: John Doe, john@example.com",
      },
    },
    targetKey: "tool-call",
  }),
});
const result2 = await eval2.json();
console.log("Action 2 outcome:", result2.data.outcome);

3. End the session

const endRes = await fetch(
  `${API_BASE}/evaluation-sessions/${sessionId}/end`,
  {
    method: "POST",
    headers,
    body: JSON.stringify({ status: "COMPLETED" }),
  }
);
const ended = await endRes.json();
console.log("Session ended:", ended.data.status);
console.log("Final action count:", ended.data.actionCount);
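The raw calls above read the JSON body without checking the HTTP status. A minimal sketch of a helper that does both is shown below. The `postJson` name and the `FetchLike` type are hypothetical, and the `data` envelope is inferred from the examples in this guide rather than from the API reference.

```typescript
// Minimal shape of the fetch function this helper needs.
// The global fetch in Node 18+ and browsers is expected to satisfy it.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// POSTs a JSON body, throws on a non-2xx status, and unwraps the `data` envelope.
async function postJson<T>(
  fetchImpl: FetchLike,
  url: string,
  headers: Record<string, string>,
  body: unknown,
): Promise<T> {
  const res = await fetchImpl(url, {
    method: "POST",
    headers,
    body: JSON.stringify(body),
  });
  if (!res.ok) {
    throw new Error(`POST ${url} failed with HTTP ${res.status}`);
  }
  const payload = (await res.json()) as { data: T };
  return payload.data;
}
```

With this in place, creating a session might read `const session = await postJson<{ id: string }>(fetch, url, headers, body)`, where `url` is the session endpoint used above. A failed request then surfaces as an exception instead of an undefined `data` field.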

Context-Aware Policy Evaluation

The real power of sessions is context-aware policy evaluation. When you include a sessionId in an evaluation request, the policy engine automatically receives the session's accumulated context:

| Field | Type | Description |
| --- | --- | --- |
| `actionCount` | `number` | Total number of evaluations run so far in the session. |
| `dataTags` | `string[]` | Aggregated data categories detected across all previous actions (e.g., `["pii", "financial"]`). |
| `toolsUsed` | `string[]` | All tools or operations invoked so far in the session. |
| `recentActions` | `array` | The most recent actions in the session for pattern detection. |
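To make the table concrete, here is a hedged sketch of how such a context could accumulate as actions are recorded. The field names come from the table. The `RecordedAction` shape, the window size of 10, and the choice to store each tool and tag only once are assumptions made for illustration, not documented behavior.

```typescript
// Hypothetical shape of one recorded action.
interface RecordedAction {
  tool: string;
  dataTags: string[];
}

// Field names follow the table above.
interface SessionContext {
  actionCount: number;
  dataTags: string[];
  toolsUsed: string[];
  recentActions: RecordedAction[];
}

const RECENT_LIMIT = 10; // assumed window size for recentActions

function emptyContext(): SessionContext {
  return { actionCount: 0, dataTags: [], toolsUsed: [], recentActions: [] };
}

// Returns a new context with one more action folded in.
function recordAction(ctx: SessionContext, action: RecordedAction): SessionContext {
  return {
    actionCount: ctx.actionCount + 1,
    // Union of tags and tools seen so far, in first-seen order.
    dataTags: Array.from(new Set(ctx.dataTags.concat(action.dataTags))),
    toolsUsed: Array.from(new Set(ctx.toolsUsed.concat([action.tool]))),
    // Keep only the most recent RECENT_LIMIT actions.
    recentActions: ctx.recentActions.concat([action]).slice(-RECENT_LIMIT),
  };
}
```

Each evaluation in a session would then see the context produced by all earlier calls to something like `recordAction`.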

This context is available to both deterministic and semantic rules. Here are examples of policies you can build with session context:

Rate limiting: Block if actionCount exceeds 20 in a single session.

Escalation: Warn if the same tool is called more than 5 times (repetitive behavior pattern).

Cross-action PII tracking: Block an email send if PII was detected earlier in the session (the dataTags will contain "pii" from a previous action).

Sensitive data exfiltration: Block if the agent queries a database containing PII and then tries to send an email to an external address within the same session.

Without sessions, each evaluation is independent -- the policy engine has no way to know that the agent queried PII data two steps ago before trying to email it externally.
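The logic behind these example policies can be sketched as plain functions over the session context. This is not Aguardic's policy syntax, only an illustration of the decisions involved. The `Outcome` values, the `PendingAction` shape, the `corp.example` internal domain, and the use of `recentActions` to count repeated calls are all assumptions.

```typescript
type Outcome = "ALLOW" | "WARN" | "BLOCK"; // hypothetical outcome names

interface SessionContext {
  actionCount: number;
  dataTags: string[];
  toolsUsed: string[];
  recentActions: { tool: string }[];
}

// Hypothetical description of the action being evaluated.
interface PendingAction {
  tool: string;
  recipient?: string;
}

type Rule = (ctx: SessionContext, action: PendingAction) => Outcome;

const INTERNAL_DOMAIN = "corp.example"; // assumed internal mail domain

// Rate limiting: block once the session holds more than 20 actions.
const rateLimit: Rule = (ctx) => (ctx.actionCount > 20 ? "BLOCK" : "ALLOW");

// Escalation: warn when this call would be the 6th or later use of the same tool.
const escalation: Rule = (ctx, action) => {
  const earlierCalls = ctx.recentActions.filter((a) => a.tool === action.tool).length;
  return earlierCalls + 1 > 5 ? "WARN" : "ALLOW";
};

// Cross-action PII tracking: block any email once PII has been seen.
const piiEmail: Rule = (ctx, action) =>
  action.tool === "send_email" && ctx.dataTags.includes("pii") ? "BLOCK" : "ALLOW";

// Exfiltration: PII came from a database query and the email leaves the organization.
const exfiltration: Rule = (ctx, action) => {
  const queriedPii = ctx.toolsUsed.includes("query_database") && ctx.dataTags.includes("pii");
  const external =
    action.tool === "send_email" &&
    action.recipient !== undefined &&
    !action.recipient.toLowerCase().endsWith("@" + INTERNAL_DOMAIN);
  return queriedPii && external ? "BLOCK" : "ALLOW";
};

const RULES: Rule[] = [rateLimit, escalation, piiEmail, exfiltration];
const SEVERITY: Record<Outcome, number> = { ALLOW: 0, WARN: 1, BLOCK: 2 };

// The most severe outcome across all rules wins.
function decide(ctx: SessionContext, action: PendingAction): Outcome {
  return RULES.map((rule) => rule(ctx, action)).reduce<Outcome>(
    (worst, next) => (SEVERITY[next] > SEVERITY[worst] ? next : worst),
    "ALLOW",
  );
}
```

Note that every rule except the recipient check depends on `ctx`, which is exactly the information a sessionless evaluation lacks.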

Best Practices

One session per logical workflow: Do not reuse sessions across unrelated workflows. Create a new session for each agent conversation, pipeline run, or batch operation.

Use externalSessionId for cross-referencing: Map Aguardic sessions to your own system identifiers (conversation IDs, job IDs, etc.) for easier debugging.

End sessions explicitly: While sessions will eventually expire if you set expiresAt, explicitly ending them with COMPLETED or TERMINATED provides a cleaner audit trail.

Use metadata for filtering: Add structured metadata (user ID, channel, environment) when creating sessions to make them easier to find and filter in the dashboard.
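One way to follow the "end sessions explicitly" practice is a small wrapper that always closes the session it opens. The sketch below is hypothetical: `SessionClient` models only the `sessions.create` and `sessions.end` calls shown in the SDK example, and it assumes the SDK accepts TERMINATED as an end status in the same way as COMPLETED.

```typescript
// Minimal slice of a client, mirroring the calls in "Using the SDK".
interface SessionClient {
  sessions: {
    create(params: {
      externalSessionId?: string;
      metadata?: Record<string, string>;
    }): Promise<{ id: string }>;
    end(id: string, params: { status: "COMPLETED" | "TERMINATED" }): Promise<unknown>;
  };
}

// Hypothetical helper: runs `work` inside a fresh session and always ends it.
async function withSession<T>(
  client: SessionClient,
  params: { externalSessionId?: string; metadata?: Record<string, string> },
  work: (sessionId: string) => Promise<T>,
): Promise<T> {
  const session = await client.sessions.create(params);
  let result: T;
  try {
    result = await work(session.id);
  } catch (err) {
    // The workflow failed: record the early stop, then rethrow the original error.
    await client.sessions.end(session.id, { status: "TERMINATED" });
    throw err;
  }
  await client.sessions.end(session.id, { status: "COMPLETED" });
  return result;
}
```

Passing your own conversation or job ID as `externalSessionId` and a few metadata fields in `params` covers the cross-referencing and filtering practices in the same call.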

Next Steps

Sessions API Reference -- Full API documentation for session endpoints
Evaluate API -- How to send evaluations with session context
Audit Trail -- Investigate violations across sessions
AI Systems -- Register AI systems and link integrations
