
So, you’re having an incident

Taken from the GDS Way.

Incident priority table

Classification | Type | Example | Response time | Update frequency
P1 | Critical | Complete outage of the entire Form Builder platform, all forms inaccessible, or ongoing unauthorised access. | 20 minutes | 1 hour
P2 | Major | Substantial degradation of service. A single public-facing form is inaccessible. | 60 minutes | 2 hours
P3 | Significant | Users experiencing intermittent or degraded service due to a platform issue. | 2 hours | Once after 2 working days
P4 | Minor | Component failure that does not immediately impact a service, or an unsuccessful DoS attempt. | 1 working day | Once after 5 working days

MOJ Forms currently does not have any SLAs in place, nor any requirement for out-of-hours support. The above response times and update frequencies relate to standard working days.
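For teams that want to check response targets in tooling, the priority table can be written down as data. The following is a minimal sketch, not part of any existing Form Builder tooling: the names Priority, PRIORITIES and first_response_due are hypothetical, and the deadline calculation assumes the incident is detected during a working day and does not roll over evenings or weekends.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class Priority:
    classification: str
    type: str
    response_time: str                   # wording from the priority table
    update_frequency: str                # wording from the priority table
    response_delta: Optional[timedelta]  # None when the target is in working days


# Values copied from the incident priority table above.
PRIORITIES = {
    "P1": Priority("P1", "Critical", "20 minutes", "1 hour", timedelta(minutes=20)),
    "P2": Priority("P2", "Major", "60 minutes", "2 hours", timedelta(minutes=60)),
    "P3": Priority("P3", "Significant", "2 hours", "Once after 2 working days", timedelta(hours=2)),
    "P4": Priority("P4", "Minor", "1 working day", "Once after 5 working days", None),
}


def first_response_due(priority: str, detected_at: datetime) -> Optional[datetime]:
    """Return the latest time for a first response.

    Returns None for targets measured in working days (P4), because this
    sketch does not model a working-day calendar (assumption).
    """
    delta = PRIORITIES[priority].response_delta
    return None if delta is None else detected_at + delta
```

For example, first_response_due("P1", datetime(2021, 10, 14, 9, 0)) gives 09:20 on the same day.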

1. Establish an incident lead

Establish who your incident lead is. Find out who noticed the problem and if anyone else is investigating and fixing it. If that person is you, assume the role of incident lead.

2. Inform your team and stakeholders

Inform your team using your chosen tool, like Slack. If the incident involves a data or security breach, notify the Cyber Security team who’ll help you manage the incident. Contact them using the (find out the correct Slack channel to use).

3. Prioritise the incident

Prioritise the incident and start tracking actions, updates and communications. Create a new incident report, which can be found in the Form Builder’s incident folder, and use it to track updates and progress. Name the report Incident Report YY-MM-DD <short title>. In the event of a P1, consider notifying the Product Owner at this point.
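If you want report names to stay consistent, the naming convention can be generated rather than typed by hand. This is a small illustrative sketch: the function name incident_report_name is hypothetical, and it assumes YY-MM-DD means two-digit year, month and day.

```python
from datetime import date


def incident_report_name(short_title: str, on: date) -> str:
    """Build a report name in the form: Incident Report YY-MM-DD <short title>."""
    # %y is the two-digit year, so 14 October 2021 becomes 21-10-14.
    return f"Incident Report {on.strftime('%y-%m-%d')} {short_title.strip()}"
```

For example, incident_report_name("Forms unavailable", date(2021, 10, 14)) returns "Incident Report 21-10-14 Forms unavailable".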

4. Form an Incident Response team

Form a team with both an incident lead and a communications lead. The communications lead will make sure relevant parties are updated according to the incident priority table.

5. Investigate

Make sure you keep your incident report up to date. If the incident involves a data breach, follow your team’s GDPR documentation. If the incident is a data or security breach, follow steps 6, 7 and 8. If the incident is not cyber security-related, skip to step 9. Remember to post regular updates via your chosen communication channels.
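The branching described in this step can be sketched as a small function. This is an illustration only: the name remaining_steps is hypothetical, and it assumes that after step 8 a security incident carries on to steps 9 and 10 like any other incident.

```python
from typing import List

# Step titles as used in this document.
STEP_TITLES = {
    6: "Contain",
    7: "Eradicate",
    8: "Recover",
    9: "Communicate to a wider audience",
    10: "Resolve the incident",
}


def remaining_steps(is_data_or_security_breach: bool) -> List[int]:
    """Return the step numbers to follow once the investigation is under way."""
    if is_data_or_security_breach:
        # Data or security breaches go through contain, eradicate and recover first.
        return [6, 7, 8, 9, 10]
    # Anything else skips straight to step 9.
    return [9, 10]
```

Calling remaining_steps(False) returns [9, 10], and STEP_TITLES can be used to print the names of the steps that remain.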

6. Contain

You should determine the right containment procedures. In some cases, you may require a forensic clone. Things to consider:

  • Minimise impact to users
  • Maintain availability if possible
  • Take offline if required. If you do this then go to step 7
  • If the incident is nefarious then consider forensic requirements where possible
  • Document all commands used during the investigation and keep the documentation up to date - include how the evidence has been preserved
  • Store any forensic images taken during the investigation in a secure location, to prevent accidental damage or tampering

7. Eradicate

Eradication may be necessary to remove components of the incident that remain on your systems, such as traces of malware. To help with eradication you should:

  • identify all affected hosts
  • remove all malware and other artifacts left behind by the attackers
  • reimage and patch the affected system
  • check backups, code, images and the affected systems are protected against further attacks

8. Recover

Recovery is necessary to reduce the impact on user confidence and to reduce the likelihood of further successful attacks.

You should:

  • confirm the affected systems are patched and hardened against the recent attack, and possible future attacks
  • decide what day and time to restore the affected systems back into production (if they were taken offline)
  • check the systems you’re restoring to production are not compromised in the same way as the original incident
  • consider how long to monitor the restored systems for, and what to look out for

9. Communicate to a wider audience

If the incident is serious (P1 or P2) you’ll need to contact a wider MoJ audience and potentially your service users.

Your communications lead must manage:

  • external and internal communications
  • incident escalations

External and internal communications

Make sure internal and external parties, like Cyber Security or your service users, are fully informed at every stage of your incident management process. For example, if your team uses the StatusPage service to trigger notifications to subscribed users, make sure this is up to date. Post regular updates on the status of an incident in the #form-builder-incidents Slack channel. This lets people across MoJ stay informed without having to find and follow multiple notification mechanisms for the different programmes.

Incident escalations

Notify escalation contacts of all high priority incidents (P1/P2). For Form Builder notify the Product Owner.

Report cyber security incidents

The incident lead must inform the Cyber Security team.
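The escalation and reporting rules in this step can be summarised in code. This is a hedged sketch rather than an existing tool: the function name contacts_to_notify is hypothetical, and the contact labels are plain strings taken from the text above, not real addresses or channels.

```python
from typing import List


def contacts_to_notify(priority: str, is_cyber_security_incident: bool) -> List[str]:
    """List who must be told about an incident, based on the rules in step 9."""
    contacts: List[str] = []
    if priority in ("P1", "P2"):
        # High priority incidents are escalated to the Product Owner.
        contacts.append("Product Owner")
    if is_cyber_security_incident:
        # The incident lead informs the Cyber Security team.
        contacts.append("Cyber Security team")
    return contacts
```

For example, a P1 incident that is also a security breach gives both contacts, while a P3 platform issue gives an empty list.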

10. Resolve the incident

Hold an incident and lessons learned review, following a blameless post-mortem culture, so your service can improve. The incident report itself has a section dedicated to this. Fill out the Form Builder incident log sheet.

This page was last reviewed on 14 October 2021. It needs to be reviewed again on 14 January 2022.