Achieving Cyber Resilience through standards-based, machine-processable & executable Incident Response & Business Continuity Playbooks

Fysarakis Konstantinos, PhD; Chief Technology Officer; Sphynx Analytics Ltd, Cyprus

Introduction

Cybersecurity incidents are growing in frequency, complexity & impact, in tandem with the increasing pervasiveness of digital technologies, applications, and services in our societies and economies. Sophisticated and targeted attacks against critical infrastructures, public entities, and other key industries have surged, and hybrid threats that combine cyber and physical elements are increasingly prevalent, as routinely highlighted by recent threat landscape reports (e.g., ENISA’s Threat Landscape Reports). Current geopolitical tensions exacerbate this trend, making these complex and systemic cyber risks particularly challenging to manage, especially in light of the growing sophistication and motivation of threat actors.

In this landscape, collaborative defence becomes a necessity. Interoperable cyber defender ecosystems that facilitate the free exchange of information, insights, analytics, and response across tools and teams can be an effective approach to identifying, assessing, and managing these risks and countering the associated threats. Ideally, defenders must be able to prepare, coordinate, automate, document, and share their response methodologies, to the extent possible. 

The above have been some of the key motivators behind relevant EU initiatives, such as the European Commission Recommendation on Coordinated Response to Large Scale Cybersecurity Incidents and Crises [1], which aims to provide a structured approach towards shared situational awareness, preparedness and coordinated incident response, involving all key EU actors that respond to cybersecurity crises (e.g., ENISA, CSIRTs network, CERT-EU, Europol, Information Sharing and Analysis Centres – ISACs). Still, the shifting regulatory landscape, although it moves the European cybersecurity posture in a positive direction, adds a learning curve and further complications to the day-to-day activities of organisations that, for example, fall under the provisions of NIS2.

These have been core considerations when designing the relevant components of the PHOENI2X framework, as detailed in our first PHOENI2X blog post and the PHOENI2X position paper [2], in line with our recently-proposed Blueprint for Collaborative Cybersecurity Operations Centres with Capacity for Shared Situational Awareness, Coordinated Response, and Joint Preparedness [3]. 

PHOENI2X Resilience Orchestration, Automation & Response

A key objective of PHOENI2X has been the design & development of Resilience Orchestration, Automation and Response (ROAR) mechanisms, encompassing proactive and reactive business continuity, recovery and incident handling tasks, to facilitate the cyber resilience of critical sectors.

To enable these capabilities and enhance the efficiency and effectiveness of the cyber operations of Operators of Essential Services (OES), machine-executable PHOENI2X Resilience Playbooks (RPs) are being defined, leveraging a structured format to encode anything from daily, repetitive, time-consuming tasks (e.g., alert triage, security assessments) to advanced business continuity and cyber & physical incident handling strategies (e.g., self-healing business processes, infrastructure auto-recovery), embedding Alerting, Reporting & Information Exchange actions.

In more detail, the ROAR component of PHOENI2X is built upon the SPHYNX Incident Response (IR) tool, a system that enables the manual or automated execution of Collaborative Automated Course of Action Operations (CACAO [4]) security playbooks. The CACAO standard was developed to satisfy the above-defined requirements for cyber-defender collaboration, providing a common machine-readable framework and schema for documenting cybersecurity operations processes, including defensive tradecraft and tactics, techniques, and procedures.

The ROAR system offers a graphical drag-n-drop interface for creating and editing CACAO security playbooks, which can later be executed or exported as JSON files following the CACAO specification. In essence, this component acts as a security orchestration, automation, and response (SOAR) solution supporting the prevention, detection, and investigation of, and response to, cybersecurity attacks, offering: user-friendly specification and testing of incident response CACAO playbooks via a graphical editor that requires no coding; automated importing and execution of CACAO playbooks specified by third parties; continuous run-time monitoring and analysis of CACAO playbook execution; orchestration of external tools as part of CACAO playbook executions; end-user notifications and probing; and extensible integrations with other essential tools (e.g., ticketing & notification systems, security appliances).

Figure 1 shows how a playbook (in this case a preventative playbook related to the FuzzyPanda malware) can be defined within the ROAR GUI. The use of ROAR & CACAO-based playbooks ensures that a defined playbook can be shared between organisations and, once customised, executed as needed.

Figure 1. Preventative playbook concerning the FuzzyPanda malware workflow (top) & ROAR representation (bottom).

The ROAR-defined playbooks can be as complex as needed, interacting with security appliances, messaging & ticketing systems, or even integrating other playbooks within their workflows, as supported by the CACAO specification. More details on the playbooks are provided below.

PHOENI2X Resilience Playbooks

Fundamental to the PHOENI2X concept, and to the framework’s ROAR capabilities described above, are the Resilience Playbooks (RPs) themselves.

RPs provide a structured, machine-processable encoding of a sequence of actions comprising the organisation’s business continuity, recovery, and incident response operations. Each action represents a fundamental activity (e.g., add a rule to a firewall). Thus, through RPs, organisations are able to specify, automate the execution (via the purpose-built execution and orchestration engine), monitor the progress, and assess the effectiveness of all their business continuity, recovery, and incident response-related processes.
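To make the idea of a playbook as an ordered sequence of fundamental actions more concrete, the following is a minimal Python sketch. It is not the ROAR execution engine: the action names, the handlers, and the `run_playbook` function are hypothetical and introduced purely for illustration, and the IP address comes from a documentation-only range.

```python
from typing import Callable, Dict, List

# Hypothetical action handlers; a real deployment would call out to
# security appliances or messaging systems instead of returning strings.
def add_firewall_rule(context: Dict[str, str]) -> str:
    return f"firewall: block {context['attacker_ip']}"

def notify_soc(context: Dict[str, str]) -> str:
    return f"notify: SOC informed about {context['attacker_ip']}"

# Registry mapping action names (assumed, not from any standard) to handlers.
ACTIONS: Dict[str, Callable[[Dict[str, str]], str]] = {
    "add_firewall_rule": add_firewall_rule,
    "notify_soc": notify_soc,
}

def run_playbook(steps: List[str], context: Dict[str, str]) -> List[str]:
    """Execute each named action in order and return a progress log."""
    log: List[str] = []
    for name in steps:
        handler = ACTIONS.get(name)
        if handler is None:
            # Unknown actions are recorded rather than silently ignored.
            log.append(f"skipped unknown action: {name}")
            continue
        log.append(handler(context))
    return log

log = run_playbook(
    ["add_firewall_rule", "notify_soc"],
    {"attacker_ip": "203.0.113.7"},
)
for entry in log:
    print(entry)
```

The returned log stands in for the monitoring of execution progress mentioned above; an actual engine would also need to handle branching, failures, and human approval steps.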

Furthermore, RPs are: adaptable and contextualizable using inputs from the AI-assisted Situational Awareness, Prediction & Response capabilities of PHOENI2X and post-incident analyses; customisable to the intrinsic regulatory requirements, policies, and IT and OT environment of each OES; shareable across organisation boundaries at machine-speed, to augment the preparedness and coordinated response capabilities of all actors; supporting what-if analyses, with the specification of orchestrations involving components and capabilities that are not (yet) present in the organisation, to drive decision making; and translatable to support assessment and training in a realistic, simulated/emulated cyber range environment.

To achieve the above, RPs in PHOENI2X adopt the CACAO standard, as mentioned. CACAO is a cybersecurity-specific schema and taxonomy for creating, documenting, and sharing playbooks in a structured and standardized format across organizational boundaries and technological solutions. Figure 2 illustrates the composition of a CACAO playbook. Briefly, a CACAO playbook comprises metadata, workflow steps with control logic, a set of commands to be performed, targets and agents that perform the commands, data markings that specify the playbook’s handling and sharing requirements, and extensions that introduce functionality ad-hoc. For integrity and authenticity, CACAO playbooks can be digitally signed and countersigned. The signature design supports incorporating the signature in the playbook or releasing it separately as a detached signature.
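As a rough illustration of this composition, the Python sketch below assembles a very small playbook with metadata, a three-step workflow (start, one action, end), a single command, and one agent definition, and serialises it to JSON. The property names are modelled on the CACAO v2.0 specification [4] and should be checked against it; the helper functions, the playbook name, and the command text are hypothetical, and data markings, extensions, and signatures are omitted for brevity.

```python
import json
import uuid
from datetime import datetime, timezone

def new_id(object_type: str) -> str:
    """Build an identifier of the form '<type>--<UUID>'."""
    return f"{object_type}--{uuid.uuid4()}"

def build_playbook(name: str, command: str) -> dict:
    """Assemble a minimal playbook-like dictionary (illustrative only)."""
    now = datetime.now(timezone.utc).isoformat(timespec="milliseconds")
    now = now.replace("+00:00", "Z")
    start_id, action_id, end_id = new_id("start"), new_id("action"), new_id("end")
    agent_id = new_id("organization")
    return {
        # Metadata
        "type": "playbook",
        "spec_version": "cacao-2.0",
        "id": new_id("playbook"),
        "name": name,
        "playbook_types": ["prevention"],
        "created_by": new_id("identity"),
        "created": now,
        "modified": now,
        # Workflow steps with control logic (each step names its successor)
        "workflow_start": start_id,
        "workflow": {
            start_id: {"type": "start", "on_completion": action_id},
            action_id: {
                "type": "action",
                "name": "Block malicious address",
                "commands": [{"type": "manual", "command": command}],
                "agent": agent_id,
                "on_completion": end_id,
            },
            end_id: {"type": "end"},
        },
        # The agent that performs the command
        "agent_definitions": {
            agent_id: {"type": "organization", "name": "Example SOC"},
        },
    }

playbook = build_playbook(
    "Example prevention playbook",
    "Add a deny rule for the reported address on the perimeter firewall",
)
print(json.dumps(playbook, indent=2))
```

Because the result is plain JSON, it can be exchanged between tools without any dependency on the system that produced it.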

Figure 2. CACAO playbook structure (source: OASIS)

CACAO defines various playbook types to support different cybersecurity-related operational roles and functions, including notification (proposed and developed in the context of PHOENI2X), detection, investigation, prevention, mitigation, remediation, attack, and engagement. 

In PHOENI2X, specifically, to cover all business continuity, recovery, and incident response aspects and the corresponding PHOENI2X prevention, preparedness, response and recovery capabilities, the following types of playbooks are supported and specified:

  • Business Continuity Playbooks, focusing on business processes and maintaining required service levels, including Proactive Business Continuity Plan (pBCP) Playbooks, Reactive Business Continuity Plan (rBCP) Playbooks, and Disaster Recovery Plan (DRP) Playbooks.
  • Incident Response Playbooks, focusing on the technical side of cyber resilience, including Prevention Playbooks, Security Assessment Playbooks, Detection Playbooks, Mitigation Playbooks, Remediation Playbooks, and Investigation Playbooks.
  • Scenario (What-if analysis) Playbooks, focusing on the encoding of orchestrations involving capabilities and assets that are not yet part of the organisation, to be assessed via the Cyber Range features of the platform, providing insights about these scenarios and their (cost)effectiveness, to inform decision making.
  • Alerting, Reporting & Information Exchange Playbooks, focusing on notification and reporting (e.g., alerting another PHOENI2X instance about a security event, generating a report to be sent to the National Authority) as well as the sharing of information (e.g., anything from IOCs, to RPs, and relevant AI models).
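For readers who prefer to see the taxonomy as data, the list above could be captured in a simple lookup structure, sketched below in Python. The dictionary and the `category_of` helper are hypothetical and are not part of PHOENI2X or CACAO; the category and sub-type names are taken from the list above, and the last two categories are left without sub-types because none are enumerated.

```python
from typing import Dict, List, Optional

# Resilience Playbook categories and their sub-types, as listed in the text.
RP_TAXONOMY: Dict[str, List[str]] = {
    "Business Continuity": [
        "Proactive Business Continuity Plan (pBCP)",
        "Reactive Business Continuity Plan (rBCP)",
        "Disaster Recovery Plan (DRP)",
    ],
    "Incident Response": [
        "Prevention",
        "Security Assessment",
        "Detection",
        "Mitigation",
        "Remediation",
        "Investigation",
    ],
    "Scenario (What-if analysis)": [],
    "Alerting, Reporting & Information Exchange": [],
}

def category_of(subtype: str) -> Optional[str]:
    """Return the category a sub-type belongs to, or None if unknown."""
    for category, subtypes in RP_TAXONOMY.items():
        if subtype in subtypes:
            return category
    return None

print(category_of("Detection"))
print(category_of("Disaster Recovery Plan (DRP)"))
```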

The above types of playbooks, in relation to the incident occurrence timeline, are shown in Figure 3.

Figure 3. Different types of Resilience Playbooks envisioned in PHOENI2X.

It is worth mentioning that the consortium, with the University of Oslo leading relevant efforts, has developed and open-sourced JSON validation schemas for CACAO Version 2.0. The validation schemas are hosted at the official GitHub of the OASIS CACAO technical committee and are considered the de facto validation mechanism for all implementations.
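To give a flavour of what such validation checks, the sketch below performs a few structural checks by hand using only the Python standard library. It is a stand-in for illustration and not the official JSON schemas: the list of required properties is an assumed subset, the identifier pattern is a simplification, and the `check_playbook` function and sample identifiers are hypothetical. Real implementations should validate against the schemas published by the OASIS CACAO technical committee.

```python
import re
from typing import List

# Simplified pattern for identifiers of the form '<type>--<UUID>'.
ID_PATTERN = re.compile(
    r"^[a-z][a-z0-9-]*--"
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
)

# Assumed subset of required properties; the official schemas are authoritative.
REQUIRED = (
    "type", "spec_version", "id", "name", "created_by",
    "created", "modified", "workflow_start", "workflow",
)

def check_playbook(playbook: dict) -> List[str]:
    """Return a list of structural problems (empty if none were found)."""
    errors = [f"missing property: {key}" for key in REQUIRED if key not in playbook]
    if errors:
        return errors
    if not ID_PATTERN.match(playbook["id"]):
        errors.append("id is not of the form <type>--<uuid>")
    workflow = playbook["workflow"]
    if playbook["workflow_start"] not in workflow:
        errors.append("workflow_start does not reference a workflow step")
    # Every successor named by a step must itself be a step in the workflow.
    for step_id, step in workflow.items():
        target = step.get("on_completion")
        if target is not None and target not in workflow:
            errors.append(f"{step_id}: on_completion references unknown step {target}")
    return errors

START = "start--aaaaaaaa-0000-4000-8000-000000000001"
END = "end--aaaaaaaa-0000-4000-8000-000000000002"
sample = {
    "type": "playbook",
    "spec_version": "cacao-2.0",
    "id": "playbook--11111111-2222-4333-8444-555555555555",
    "name": "Sample",
    "created_by": "identity--11111111-2222-4333-8444-666666666666",
    "created": "2024-01-01T00:00:00.000Z",
    "modified": "2024-01-01T00:00:00.000Z",
    "workflow_start": START,
    "workflow": {
        START: {"type": "start", "on_completion": END},
        END: {"type": "end"},
    },
}

errors_ok = check_playbook(sample)
errors_broken = check_playbook(dict(sample, workflow_start="start--missing"))
print(errors_ok)
print(errors_broken)
```

Checks of this kind catch dangling references early, before a shared playbook is imported into another tool.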

Similarly, through our implementation we found that the graphical design and visualization of playbooks inherently supports their understandability and readability. However, the CACAO specification does not provide a standard approach for graphically representing playbooks. As a result, sharing and importing CACAO playbooks across solutions leads to inconsistent visualization. In that regard, the consortium led the development of a new technical specification within the OASIS CACAO technical committee, namely the CACAO Layout Extension, which allows CACAO playbooks to be represented accurately and consistently across implementations. The technical specification has undergone public review and is currently in the process of being approved as a committee specification. Design and implementation details about the CACAO Layout Extension can be found in the technical specification.

Current Status & Next Steps

The efforts in the first cycle of the project focused on adapting & extending the underlying SPHYNX IR tool to support the full set of capabilities envisioned for the ROAR component of PHOENI2X and the relevant interactions & integrations (e.g., specification & execution of a baseline set of RPs, and basic integration with other components, such as SPA, CTI, SIEM & other baseline toolset components, including the exact interactions & associated interaction mechanisms with other PHOENI2X components). Additional interfacing and reporting capabilities were built into ROAR to facilitate the integration with the rest of the PHOENI2X components, not just for integrations that support the execution of Playbook actions, but also at the front-end level. An initial integration of ROAR with the PHOENI2X dashboard (based on the Forensics Visualisation Toolkit of partner AEGIS) is presented in Figure 4 below.

Figure 4. Integration of ROAR actions onto the Forensics Visualisation Toolkit-based dashboard of PHOENI2X.

The execution capabilities focused on supporting the execution of the RPs (IR-focused RPs, mainly) for the first demonstrators of the project across all 3 use cases, as well as providing the foundations for the extended capabilities that will be pursued during the second cycle of the project.

Considering the latter, the focus will now shift to supporting additional types of RPs, mainly the novel types of playbooks introduced in PHOENI2X, i.e., the Business Continuity playbooks. Additional integrations with external systems will also be pursued, to be showcased in the (even more complex) final demonstrators of PHOENI2X.

In terms of RPs in particular, the emphasis in the first cycle of the project has been on the delivery of Incident Response playbooks to support the use cases and the relevant demonstrators, with some preliminary work on the more novel application of playbooks for Business Continuity. In the second cycle, additional IR playbooks will be designed and tested, alongside the delivery of the required BC-focused & scenario playbooks.

The definition of the initial set of incident response playbooks, covering each of the core categories, i.e., (i) Prevention; (ii) Security Assessment; (iii) Detection; (iv) Mitigation; (v) Remediation; and (vi) Investigation, took place during the first cycle of the project. Further, these were demonstrated & validated, tailored to the demonstration scenarios of each of the project’s use cases (e.g., see Figure 5), as part of the assessment of the first integrated version of the PHOENI2X framework.

Figure 5. IR playbook used in Energy Use Case, to isolate attacker via SDN Controller.

Future work in terms of IR Playbooks includes the definition of the final set of incident response playbooks, covering all core IR playbook categories shown in Figure 3, as well as the demonstration & validation of IR-focused playbooks covering all final demonstrator scenarios, as part of the final demonstration & validation activities of the integrated PHOENI2X framework.

Considering the BC-focused playbooks, work in the first cycle of the project included the identification of business continuity intricacies and requirements for each of the use case environments, through dedicated business continuity workshops (led by partner APIROPLUS Solutions). These activities supported the development of the initial high-level BC plan activation playbook (see Figure 6), based on established standards (namely ISO 22301 for the overall process, ISO/IEC 27035 for incident management & ISO/TS 22317 for the Business Impact Assessment), forming the baseline for the use case-specific playbooks.

Figure 6. High-level BC process playbook

Future work in terms of BC Playbooks will focus on defining the final set of business continuity playbooks, covering the intricacies and requirements of each of the individual use case environments of PHOENI2X, and the four different types of playbooks supported (as shown in Figure 3). These will be demonstrated & validated (leveraging the Resilience Orchestration, Automation & Response – ROAR – enabler) in the context of the final integrated PHOENI2X framework deployments and associated use case demonstrators.

References

  1. “Commission Recommendation (EU) 2017/1584 of 13 September 2017 on coordinated response to large-scale cybersecurity incidents and crises”, https://op.europa.eu/en/publication-detail/-/publication/e7f7a728-9cff-11e7-b92d-01aa75ed71a1
  2. Fysarakis, K., Lekidis, A., Mavroeidis, V., Lampropoulos, K., Lyberopoulos, G., Vidal, I. G. M., … & Koufopavlou, O. (2023, July). PHOENI2X–A European Cyber Resilience Framework With Artificial-Intelligence-Assisted Orchestration, Automation & Response Capabilities for Business Continuity and Recovery, Incident Response, and Information Exchange. In 2023 IEEE International Conference on Cyber Security and Resilience (CSR) (pp. 538-545). IEEE.
  3. Fysarakis, K., Mavroeidis, V., Athanatos, M., Spanoudakis, G., & Ioannidis, S. (2022, December). A Blueprint for Collaborative Cybersecurity Operations Centres with Capacity for Shared Situational Awareness, Coordinated Response, and Joint Preparedness. In 2022 IEEE International Conference on Big Data (Big Data) (pp. 2601-2609). IEEE.
  4. OASIS, Collaborative Automated Course of Action Operations (CACAO) Security Playbooks Version 2.0, Nov. 2023. https://docs.oasis-open.org/cacao/security-playbooks/v2.0/cs01/security-playbooks-v2.0-cs01.html