Security Incident Management
Security incident management focuses on resolving incidents quickly so that employees and users experience as little downtime as possible. By identifying, managing, recording, and analyzing security threats or incidents in real time, security incident management provides a comprehensive view of security issues within an IT infrastructure.
NIST divides incident response into the following four phases:
- Preparation – Prepare for handling incidents.
- Detection and Analysis – Identify potential security incidents through monitoring and report all incidents.
- Containment, Eradication and Recovery – Respond to the incident by containing, investigating, and resolving it.
- Post-incident Activity – Learn and document key takeaways from every incident.
In practice, these phases are often broken down into eight steps to give a more detailed view of incident management:
- Preparation – everything done in advance to train the team and users to help detect and handle an incident. The incident-handling checklist is also part of preparation.
- Detection – also called the identification phase, this is the most important part of incident management. The detection phase should include an automated system that checks logs. Users’ security awareness is also paramount. Time is of the essence.
- Response – also called containment, this is the phase where the team interacts with the potential incident. The first step is to contain the incident by preventing it from affecting other systems.
- Depending on the situation, the response can be to disconnect the network, shut down the system, or isolate the system. This phase typically starts with forensically backing up the system involved in the incident. Volatile memory is also captured and dumped in this step, before the system is powered off.
- Depending on the criticality of the affected systems, production can be heavily affected or even stopped, so it is important to have management’s approval. The response team will have to keep management updated on the severity of the incident and the estimated time to resolution.
- Mitigation – during this phase, the incident should be analyzed to find the root cause. If the root cause is not known, the restoration of the systems may allow the incident to occur again. Once the root cause is known, a way to prevent the incident from occurring again can be applied.
- The systems can then be restored, or rebuilt from scratch, to a state where the incident can’t occur again. It is especially important to prevent the same incident from happening to other systems. Changing the firewall rule set or patching the systems is often a way to do this.
- Reporting – this phase starts at detection and finishes with the addition of the incident response report to the knowledge base. The reporting can take multiple forms depending on how public the communication is.
- For the non-technical people of the organization, the communication is typically a formatted email that explains the problem without technical terms and gives the estimated time to recover. If users are required to take action, the steps should be clearly explained, with supporting screenshots, so that everyone can follow them.
- For the technical team, the communication should include details, the estimated time to recover, and perhaps the details of the incident response team’s resolution. A bridge call may also be needed.
- Recovery – during this phase, the system is restored or rebuilt. Only the business unit responsible for the system can decide when the system goes back into production. Depending on the actions taken during mitigation, it’s possible that a problem remains. Therefore, close monitoring is required after the system returns to production.
- Remediation – this phase begins during the mitigation phase: once the root-cause analysis is over, the vulnerabilities in the affected systems should be mitigated. Remediation then continues after mitigation ends and becomes broader. If the vulnerabilities exist in the system’s recovery image, the recovery image needs to be regenerated with the fix applied. All systems that were not affected by the incident but are still vulnerable should be patched as soon as possible. It’s important to neutralize the threat in this phase.
- Lessons Learned – this phase is often the most neglected one, but it can prevent a lot of future incidents. The incident should be added to a knowledge base, along with the steps taken and a note on whether users or members of the response team need additional training. The Lessons Learned phase can improve the preparation phase dramatically.
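The automated log check mentioned under Detection can be illustrated with a minimal sketch. The log format, the `find_suspicious_sources` helper, and the threshold of three failures are all assumptions made for this example, not part of any particular product; a real deployment would parse whatever format its systems actually emit.

```python
import re
from collections import Counter

# Hypothetical log format, assumed for illustration only:
#   2024-05-01T10:00:00 auth FAILED user=alice src=10.0.0.5
FAILED_LOGIN = re.compile(r"auth FAILED user=(?P<user>\S+) src=(?P<src>\S+)")


def find_suspicious_sources(log_lines, threshold=3):
    """Return the source addresses with at least `threshold` failed logins."""
    failures = Counter()
    for line in log_lines:
        match = FAILED_LOGIN.search(line)
        if match:
            failures[match.group("src")] += 1
    return sorted(src for src, count in failures.items() if count >= threshold)


sample_log = [
    "2024-05-01T10:00:00 auth FAILED user=alice src=10.0.0.5",
    "2024-05-01T10:00:02 auth FAILED user=alice src=10.0.0.5",
    "2024-05-01T10:00:03 auth OK user=bob src=10.0.0.7",
    "2024-05-01T10:00:04 auth FAILED user=root src=10.0.0.5",
    "2024-05-01T10:00:09 auth FAILED user=carol src=10.0.0.9",
]

print(find_suspicious_sources(sample_log))  # ['10.0.0.5']
```

A check like this only flags candidates; deciding whether a flagged source is a real incident still belongs to the response team.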
Security Information and Event Management (SIEM)
SIEM, or Security Information and Event Management, is a comprehensive cybersecurity approach that combines the functionalities of Security Information Management (SIM) and Security Event Management (SEM). SIEM technology collects event log data from a range of sources, such as firewalls, intrusion detection systems, and antivirus software. This information is analyzed in real time to identify activity that deviates from the norm, and some SIEMs can even take appropriate action.
The key components of SIEM include:
- Log Management – SIEM solutions collect and store logs from multiple security devices and applications, providing a centralized platform for log management, analysis, and reporting.
- Event Correlation – SIEM solutions analyze security events to identify patterns or relationships that indicate potential threats. They use correlation algorithms to detect suspicious activities and generate real-time alerts.
- Threat Detection – SIEM solutions can identify potential security threats, such as malware infections, unauthorized access, and data breaches by collecting and analyzing data from various sources.
- Incident Response – SIEM solutions provide real-time alerts and reporting to help security teams respond to incidents more effectively, enabling them to contain, investigate, and remediate security threats.
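The event correlation idea above can be sketched with a single, deliberately simple rule: raise an alert when two or more different sources report the same IP address within a short time window. The `Event` type, the `correlate` function, the five-minute window, and the sample events are all hypothetical and chosen for illustration; commercial SIEM products use far richer rule sets.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    timestamp: datetime
    source: str  # reporting device, e.g. "firewall" or "ids"
    src_ip: str
    kind: str


def correlate(events, window=timedelta(minutes=5)):
    """Alert on any IP reported by two or more different sources within `window`."""
    by_ip = {}
    for event in sorted(events, key=lambda e: e.timestamp):
        by_ip.setdefault(event.src_ip, []).append(event)

    alerts = []
    for ip, ip_events in by_ip.items():
        for i, first in enumerate(ip_events):
            # Events for this IP that fall inside the window starting at `first`.
            related = [e for e in ip_events[i:] if e.timestamp - first.timestamp <= window]
            if len({e.source for e in related}) >= 2:
                alerts.append((ip, [e.kind for e in related]))
                break  # one alert per IP is enough for this sketch
    return alerts


t0 = datetime(2024, 5, 1, 10, 0)
events = [
    Event(t0, "firewall", "203.0.113.7", "port_scan_blocked"),
    Event(t0 + timedelta(minutes=2), "ids", "203.0.113.7", "exploit_signature"),
    Event(t0 + timedelta(minutes=1), "antivirus", "198.51.100.4", "malware_quarantined"),
    Event(t0 + timedelta(minutes=30), "firewall", "198.51.100.4", "connection_blocked"),
]

print(correlate(events))
# [('203.0.113.7', ['port_scan_blocked', 'exploit_signature'])]
```

The second IP address is seen by two sources as well, but 29 minutes apart, so it falls outside the window and no alert is raised. Tuning that window is a trade-off between missed slow attacks and noisy alerts.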