11. Agent cache

11.1. Overview

Each ArmoniK agent maintains a local, file-based cache that avoids redundant fetches from the object storage. When a task depends on a result that was recently consumed by another task on the same node, the agent reads the data directly from disk instead of going back to the remote object storage.

The cache is entirely internal to the agent process. Workers do not have direct access to it.

11.2. Two-level folder design

The agent uses two distinct filesystem paths for data management.

Path (option)          Purpose                                                     Accessible by worker
InternalCacheFolder    Persistent cache shared across tasks and agent processes    No
SharedCacheFolder      Per-task temporary staging folder exposed to the worker     Yes

11.2.1. Internal cache (InternalCacheFolder)

InternalCacheFolder is the actual cache. It is a flat directory where each file is named after the ResultId of the result it contains. Files accumulate across task executions and are only removed when eviction is triggered (see section 11.5, Cache eviction).

Because this folder is just a regular directory on the node, multiple agent processes running on the same physical or virtual node can point to the same path. This makes the cache effectively shared at the node level: if agent A already fetched a result, agent B on the same node will find it in the cache without a round trip to the object storage.

11.2.2. Per-task folder (SharedCacheFolder)

For each task it processes, the agent creates a unique temporary sub-directory under SharedCacheFolder (named after a randomly generated token). This folder holds:

  • The task payload.

  • All data dependencies that the task needs.

  • Any output results produced by the worker.

The worker is given the path to this folder; it reads its inputs from there and writes its outputs there. The agent then reads the outputs back, uploads them to the object storage, and optionally copies them into the internal cache.

The folder is deleted during agent cleanup after the task completes.
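The per-task folder lifecycle described above can be sketched as follows. This is a minimal illustration, not the agent's actual code: the function names and the token format (a 32-character hexadecimal string) are assumptions made for the example.

```python
import secrets
import shutil
from pathlib import Path


def create_task_folder(shared_cache_folder: Path) -> Path:
    """Create a unique per-task sub-directory under the shared folder."""
    # Assumption: the random token is a 32-character hex string.
    token = secrets.token_hex(16)
    task_folder = shared_cache_folder / token
    # exist_ok=False: a collision with an existing folder is an error.
    task_folder.mkdir(parents=True, exist_ok=False)
    return task_folder


def cleanup_task_folder(task_folder: Path) -> None:
    """Delete the per-task folder and its content once the task is done."""
    shutil.rmtree(task_folder, ignore_errors=True)
```

In this sketch the agent would call create_task_folder before handing the path to the worker, and cleanup_task_folder once the outputs have been read back.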

11.3. Cache lifecycle

11.3.1. Pre-processing

Before executing a task, the agent resolves the ResultId of each dependency (payload and data dependencies) and tries to serve them from the internal cache:

  1. For each dependency, the agent looks for a file named <ResultId> in InternalCacheFolder.

  2. If found, the file is copied into the per-task folder so the worker can access it.

  3. If not found, the data is fetched from the object storage into the per-task folder.

  4. Newly fetched files are then copied into the internal cache (via a temporary file to ensure atomicity) so that future tasks on the same node can reuse them.
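The four steps above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the agent's implementation: prepare_dependency, store_in_cache and the fetch callback (which stands in for the object storage client) are hypothetical names, and the sketch assumes the file in the per-task folder is also named after the ResultId.

```python
import os
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def store_in_cache(source: Path, cache_folder: Path, result_id: str) -> None:
    """Copy a file into the cache through a temporary file and a rename."""
    fd, tmp_name = tempfile.mkstemp(dir=cache_folder, prefix=".tmp-")
    os.close(fd)
    try:
        shutil.copyfile(source, tmp_name)
        # os.replace is atomic when both paths are on the same filesystem,
        # so readers never observe a partially written cache entry.
        os.replace(tmp_name, cache_folder / result_id)
    except BaseException:
        Path(tmp_name).unlink(missing_ok=True)
        raise


def prepare_dependency(
    result_id: str,
    cache_folder: Path,
    task_folder: Path,
    fetch: Callable[[str], bytes],
) -> bool:
    """Make one dependency available to the worker; return True on a cache hit."""
    target = task_folder / result_id
    try:
        # Steps 1 and 2: look for <ResultId> in the cache and copy it if present.
        shutil.copyfile(cache_folder / result_id, target)
        return True
    except FileNotFoundError:
        pass
    # Step 3: cache miss, fetch from the (hypothetical) object storage client.
    target.write_bytes(fetch(result_id))
    # Step 4: populate the cache for future tasks on the same node.
    store_in_cache(target, cache_folder, result_id)
    return False
```

Copying first and handling FileNotFoundError, rather than testing for existence beforehand, avoids a race with an eviction that removes the file between the check and the copy.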

11.3.2. Post-processing

After the worker reports a successful output, the agent copies every result file produced by the worker from the per-task folder into InternalCacheFolder. Downstream tasks that depend on those results will therefore find them in the cache without hitting the object storage.
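A possible shape for this post-processing step is sketched below. The function name cache_outputs and the idea that the agent knows the expected output ResultIds up front are assumptions for the example; writing through a temporary file mirrors the pre-processing behaviour and is also an assumption here.

```python
import os
import shutil
import tempfile
from pathlib import Path
from typing import Iterable, List


def cache_outputs(
    task_folder: Path, cache_folder: Path, output_ids: Iterable[str]
) -> List[str]:
    """Copy the results produced by the worker into the internal cache."""
    cached = []
    for result_id in output_ids:
        produced = task_folder / result_id
        if not produced.is_file():
            continue  # the worker did not produce this result
        # Write to a temporary file first, then rename it into place.
        fd, tmp_name = tempfile.mkstemp(dir=cache_folder, prefix=".tmp-")
        os.close(fd)
        shutil.copyfile(produced, tmp_name)
        os.replace(tmp_name, cache_folder / result_id)
        cached.append(result_id)
    return cached
```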

11.4. Why workers cannot access the cache

Workers only see the per-task folder. This is intentional:

  • Isolation: a worker should only access the data it is authorised to see for its current task.

  • Simplicity: the cache files are keyed by ResultId, which is an internal ArmoniK identifier. Exposing the raw cache to workers would require them to understand ArmoniK internals.

  • Safety: the cache is shared across tasks and potentially across agents. Allowing a worker to write to it directly could corrupt entries used by other tasks.

11.5. Cache eviction

Eviction is driven by disk usage and happens at the end of every task (during agent disposal).

  1. The agent reads the disk usage of the filesystem that hosts InternalCacheFolder.

  2. If (usedSpace / totalSize) > CacheEvictionThreshold, eviction is triggered.

  3. Files are sorted by the more recent of their last-access and last-write times.

  4. Files are deleted in that order, least recently used first, until the usage falls below the threshold.

Caching is disabled when CacheEvictionThreshold is set to 0 (the default).
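The eviction procedure can be sketched as follows. This is a minimal sketch, not the agent's code: the names evict and usage_ratio are hypothetical, the usage probe is injectable so that the logic can be exercised without filling a disk, and treating a threshold of 0 as "nothing to evict" is an assumption derived from the sentence above.

```python
import shutil
from pathlib import Path
from typing import Callable, List


def usage_ratio(folder: Path) -> float:
    """Return usedSpace / totalSize for the filesystem hosting the folder."""
    usage = shutil.disk_usage(folder)
    return usage.used / usage.total


def evict(
    cache_folder: Path,
    threshold: float,
    get_usage: Callable[[Path], float] = usage_ratio,
) -> List[str]:
    """Delete least recently used files until usage is at or below threshold."""
    if threshold <= 0:
        return []  # assumption: caching is disabled, so there is nothing to evict
    if get_usage(cache_folder) <= threshold:
        return []

    def last_used(path: Path) -> float:
        st = path.stat()
        # Whichever of last access and last write is more recent.
        return max(st.st_atime, st.st_mtime)

    files = sorted(
        (p for p in cache_folder.iterdir() if p.is_file()), key=last_used
    )
    removed = []
    for path in files:
        if get_usage(cache_folder) <= threshold:
            break
        # missing_ok: another agent sharing the folder may have removed it first.
        path.unlink(missing_ok=True)
        removed.append(path.name)
    return removed
```

Note that the ratio is measured on the whole filesystem, so data outside the cache folder also counts towards the threshold.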

11.6. Configuration

The cache is configured through the following environment variables:

Pollster__SharedCacheFolder=/cache/shared
Pollster__InternalCacheFolder=/cache/internal
Pollster__CacheEvictionThreshold=0.8

For a full description of each variable, see the Pollster variables documentation.
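To illustrate how the three variables relate, the sketch below reads them into a small settings object. It is not the agent's configuration code: the CacheSettings class, the load_cache_settings function, the requirement that both folders be set, and the check that the threshold lies between 0 and 1 are all assumptions made for the example.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping


@dataclass(frozen=True)
class CacheSettings:
    shared_cache_folder: Path
    internal_cache_folder: Path
    eviction_threshold: float

    @property
    def caching_enabled(self) -> bool:
        # A threshold of 0 (the default) disables caching.
        return self.eviction_threshold > 0


def load_cache_settings(env: Mapping[str, str] = os.environ) -> CacheSettings:
    """Build the settings from Pollster__* environment variables."""
    threshold = float(env.get("Pollster__CacheEvictionThreshold", "0"))
    # Assumption: the threshold is a disk usage ratio between 0 and 1.
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("CacheEvictionThreshold must be between 0 and 1")
    return CacheSettings(
        # Assumption: both folders must be set explicitly in this sketch.
        shared_cache_folder=Path(env["Pollster__SharedCacheFolder"]),
        internal_cache_folder=Path(env["Pollster__InternalCacheFolder"]),
        eviction_threshold=threshold,
    )
```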

11.6.1. Sharing the cache across agents on the same node

To enable node-level cache sharing, mount the same directory into all agent containers and point InternalCacheFolder at it. Because the cache is purely additive, no coordination protocol is needed: if two agents write the same ResultId simultaneously, the first write wins, and either copy is equally valid.

# Agent A and Agent B on the same node both use:
Pollster__InternalCacheFolder=/mnt/node-cache/armonik

NOTE: Despite its name, SharedCacheFolder must not be shared between agents; it is shared only between an agent and its worker. Each agent must have its own SharedCacheFolder because per-task sub-directories are created and deleted there independently.