Engineering organisations like Thales rely on large quantities of technical knowledge. When resolving a technical
problem, for example, users have to follow a multi-step procedure whose steps are described at varying
levels of detail, may not be up to date, or may not target the exact problem they are facing. Recent progress in
Large Language Models (LLMs) has shown that these models can reason over procedural knowledge, but it
is still very difficult to evaluate whether they will be able to support users in executing complex procedural tasks
in various scenarios.