War Room Incident Management: Playbook Overview
Create Incident War Room Playbook: An Overview
A customer-impacting incident can mean all hands on deck until the issue is mitigated. Typically, you’d follow a runbook or a wiki to assess the impact, triage, analyze, and so on. Somewhere in that process are steps to set up team communications for the incident at hand. These are basic, administrative tasks that someone has to complete before you can really get the ball rolling on resolving the issue. For example, you might open a war room in Slack, create a Zoom meeting, page the on-call person, and work through other non-technical steps related to declaring an incident. Most of the time, the person doing this work is the same highly skilled person who will do the technical work associated with the incident.
Automate War Room Setup
StackPulse automates those manual, administrative tasks for you with what we call a Playbook. A Playbook can be fired off automatically when a Trigger is met. In this scenario, the Trigger is declaring an incident using our Incident Management functionality. This enables your team to respond quickly to the incident and start working towards remediation sooner.
The Create Incident War Room playbook can be taken off the shelf from our GitHub Playbook repository and brought into your StackPulse tenant. Be sure to check out all the Playbooks we have available for anyone to use. These range from Incident Management and Orchestration to Alert Enrichment and System Diagnostics. The example we’re going to look at today falls into the Incident Management category.
How it Works
Let’s look at this Playbook in more detail and see how it can save you valuable time when it matters most. For this Playbook, we will interact with a few different platforms: Slack, Zoom, and PagerDuty, plus StackPulse itself for declaring the incident and running the Playbook.
A Playbook can be run on demand or automatically based on a Trigger. The Trigger could be something like a monitoring event from Datadog received by StackPulse. In this case, the Trigger is the creation of an incident within StackPulse.
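To make the two ways of running a Playbook concrete, here is a minimal Python sketch. It is not StackPulse's actual API or Playbook format: the `TriggerRegistry` class, the event names such as `incident.created`, and the `create_war_room` function are all hypothetical names chosen for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# A "playbook" here is just a function that takes an event payload.
# All names below are hypothetical and only illustrate the idea.
Playbook = Callable[[dict], str]


class TriggerRegistry:
    """Maps event types to the playbooks that should run when they occur."""

    def __init__(self) -> None:
        self._by_event: Dict[str, List[Playbook]] = defaultdict(list)

    def on(self, event_type: str, playbook: Playbook) -> None:
        """Register a playbook to run whenever event_type is received."""
        self._by_event[event_type].append(playbook)

    def dispatch(self, event_type: str, payload: dict) -> List[str]:
        """Run every playbook registered for event_type; return their results."""
        return [playbook(payload) for playbook in self._by_event.get(event_type, [])]


def create_war_room(payload: dict) -> str:
    """Stand-in for the real playbook body."""
    return f"war room requested for incident {payload['incident_id']}"


registry = TriggerRegistry()
registry.on("incident.created", create_war_room)

# Automatic run: declaring an incident emits the event that fires the playbook.
automatic = registry.dispatch("incident.created", {"incident_id": "INC-1"})

# On-demand run: the same playbook invoked directly, with no trigger involved.
manual = create_war_room({"incident_id": "INC-2"})

# An event with no registered playbook runs nothing.
ignored = registry.dispatch("monitoring.alert", {"source": "datadog"})
```

The point of the sketch is only that the Playbook body is the same in both cases; the Trigger decides when it runs, not what it does.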
Once an incident has been created, the Create Incident War Room Playbook will run automatically.
First, a Zoom meeting and a Slack channel are created, then the reporter of the incident is invited to the Slack channel.
The Slack channel is then attached to the Incident that was created in StackPulse.
Next, StackPulse will post two links in the Slack channel: one to the incident in StackPulse and one to the Zoom meeting.
Lastly, the StackPulse Slack Bot will reach out to the incident reporter over Slack, asking whether they’d like to page the on-call person. If they choose Yes, an incident is created within PagerDuty.
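The steps above can be sketched in a few lines of Python. This is an illustration of the order of operations only, not the Playbook's real implementation: `FakeServices` is a hypothetical in-memory stand-in for Zoom, Slack, and PagerDuty, the URLs are placeholders, and `ask_to_page` stands in for the Slack Bot's Yes/No prompt.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Incident:
    incident_id: str
    reporter: str
    url: str
    slack_channel: Optional[str] = None


@dataclass
class FakeServices:
    """Hypothetical in-memory stand-in for Zoom, Slack, and PagerDuty."""
    log: List[str] = field(default_factory=list)

    def create_zoom_meeting(self, topic: str) -> str:
        self.log.append(f"zoom:create:{topic}")
        return f"https://zoom.example/{topic}"  # placeholder URL

    def create_slack_channel(self, name: str) -> str:
        self.log.append(f"slack:create:{name}")
        return f"#{name}"

    def invite(self, channel: str, user: str) -> None:
        self.log.append(f"slack:invite:{channel}:{user}")

    def post(self, channel: str, text: str) -> None:
        self.log.append(f"slack:post:{channel}:{text}")

    def page_on_call(self, incident_id: str) -> None:
        self.log.append(f"pagerduty:page:{incident_id}")


def create_incident_war_room(
    incident: Incident,
    services: FakeServices,
    ask_to_page: Callable[[str], bool],
) -> bool:
    """Run the war room steps in order; return True if on-call was paged."""
    name = f"incident-{incident.incident_id.lower()}"
    # 1. Create the Zoom meeting and the Slack channel, then invite the reporter.
    meeting_url = services.create_zoom_meeting(name)
    channel = services.create_slack_channel(name)
    services.invite(channel, incident.reporter)
    # 2. Attach the Slack channel to the incident.
    incident.slack_channel = channel
    # 3. Post the incident link and the Zoom link in the channel.
    services.post(channel, f"Incident: {incident.url} | Zoom: {meeting_url}")
    # 4. Ask the reporter whether to page on-call; page only on Yes.
    if ask_to_page(incident.reporter):
        services.page_on_call(incident.incident_id)
        return True
    return False


services = FakeServices()
incident = Incident("INC-42", "alice", "https://stackpulse.example/incidents/INC-42")
paged = create_incident_war_room(incident, services, ask_to_page=lambda user: True)
```

Passing `ask_to_page=lambda user: False` instead runs the first three steps and skips the page, which mirrors the reporter answering No in Slack.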
As you can see, this is a relatively simple Playbook. However, these ordinary tasks are part of a very important process when working an incident. Getting them out of the way quickly and automatically paves the way for faster resolution, allowing your team to focus on the incident itself instead of building out communication channels.
The Create Incident War Room playbook is only one of many out-of-the-box Playbooks you can find in our Playbook repository. Don’t see one for your use case? Get started building your own today, or get in touch with our team to see how we can help.
- Check out more pre-built Playbooks to help you save time and deliver more reliable services
- Learn more about the benefits of code-based, executable Playbooks for incident response
- Get started with the free edition of the StackPulse Reliability Platform