Routine checks, such as testing the smoke alarms in your home and having your car serviced at regular intervals, are extremely important. Equally important are routine maintenance and checks on your information technology systems. Most organisations already have monitoring in place, but periodic maintenance must also take place. So what's the difference between monitoring and performing maintenance? Even if you are constantly monitoring your equipment, you might ask yourself: why should you also do maintenance?
Monitoring is something you usually do in real time and never stop doing. Maintenance typically has a checklist attached to a process and is done at regular intervals. Working through a maintenance checklist every time ensures that everything is checked off and nothing is missed. In this blog we will go through setting up a solid server and network maintenance schedule and what I would include in my maintenance checklist on a daily, weekly, monthly and yearly basis. I would also recommend that you create a calendar and make sure that these maintenance tasks are completed. The suggestions outlined below are intended to be as general as possible, but they can form the basis of your own maintenance checklist.
- Physical check of your equipment. Every morning, go into your server room and make sure that there are no red or amber lights and no unusual sounds or smells coming from your equipment.
- Check backups and replications.
- Check Windows services on all your Windows servers.
- Test backups. Make sure you can successfully restore a VM from scratch.
- Check application and system event logs.
- Check and delete temp files.
- If you use Remote Desktop roaming profiles, delete any profiles that have not unloaded correctly.
- Go to your endpoint protection console and purge inactive machines.
- Review your IIS logs and, if nothing needs attention, consider safely purging them.
- Look for critical Windows updates and install them.
- Check your SAN volumes and make sure you haven’t over-provisioned and that all your volumes are healthy.
- Check your UPS. Log in to the management console and make sure you are not getting close to your maximum utilisation and that all the batteries are healthy. Enter your battery renewal date into your maintenance calendar. Batteries usually need replacing every 3 years.
- Check backup retention.
- Restart all servers that have not been restarted in the past 6 months.
- Check the warranties and support contracts for your business-critical hardware and software. Review the support agreements and make sure you understand them.
- Test DR procedures and documentation.
- Update network diagrams.
- Examine performance metrics against baselines.
- Audit and reset service account passwords.
- Perform dcdiag tests on all DCs in all domains.
- Check the utilisation and latency on your remote sites or VPN connections and also on your main WAN connections.
- Record performance baselines – WAN connection speed and average utilisation, ping latency to remote offices, and NGFW throughput.
- Check the firmware on your switches, firewalls, RAID controllers, SANs and servers, including the BIOS, and check that the date and time are still set correctly.
- Endpoint protection – check the dashboard, logs and licences, and follow up on any infected machines.
- Check security logs on servers and firewalls. Look for unauthorised access attempts.
- Review the firewall policies – access or NAT policies. Some may be redundant and need cleaning up. There may be some ports open that are no longer needed.
- Review security groups in AD and check membership of the main groups.
- Check the health of your domain controllers and make sure they are replicating successfully. Also perform a manual replication.
- Check account lockouts for suspicious activity.
- Check disk space availability of File servers.
- Check administrative and sensitive group membership.
- Verify time configuration.
- Run Group Policy Infrastructure Status reports.
- Check the default Computers container and move computer accounts to the relevant OU.
- Check for inactive user or computer accounts.
- Review released security updates and install them if needed (approve in WSUS).
- Review disabled, locked-out, expiring and expired user accounts.
- Review certificates that are due to expire soon.
- Check the group policies. Remove any policies that are no longer needed. Run the GPRESULT command on a machine and inspect the group policies being applied.
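
The maintenance calendar recommended earlier could start out as something very small. The Python sketch below is only an illustration, not a finished tool: the `Task` class, the `overdue` and `battery_renewal` helpers, the interval lengths and the frequency assigned to each sample task are all assumptions made for the example, so adjust them to suit your own environment.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Approximate interval lengths in days (an assumption for this sketch).
INTERVAL_DAYS = {"daily": 1, "weekly": 7, "monthly": 30, "yearly": 365}


@dataclass
class Task:
    name: str
    frequency: str   # one of the keys in INTERVAL_DAYS
    last_done: date  # date the task was last completed

    def next_due(self) -> date:
        """Date on which the task should next be carried out."""
        return self.last_done + timedelta(days=INTERVAL_DAYS[self.frequency])


def overdue(tasks, today):
    """Return the tasks due on or before `today`, most overdue first."""
    due = [t for t in tasks if t.next_due() <= today]
    return sorted(due, key=lambda t: t.next_due())


def battery_renewal(installed: date, years: int = 3) -> date:
    """UPS battery renewal date, using the 3-year cycle from the checklist."""
    try:
        return installed.replace(year=installed.year + years)
    except ValueError:
        # Installed on 29 February and the target year is not a leap year.
        return installed.replace(year=installed.year + years, day=28)


if __name__ == "__main__":
    today = date(2024, 6, 10)
    # Sample tasks; the frequencies chosen here are illustrative only.
    tasks = [
        Task("Physical check of equipment", "daily", date(2024, 6, 9)),
        Task("Check application and system event logs", "weekly", date(2024, 6, 5)),
        Task("Review firewall policies", "monthly", date(2024, 4, 1)),
        Task("Test DR procedures and documentation", "yearly", date(2023, 9, 1)),
    ]
    for task in overdue(tasks, today):
        print(f"{task.next_due()}  {task.name}")
    print("UPS battery renewal:", battery_renewal(date(2022, 3, 15)))
```

Keeping the last completion date against each task means the same list doubles as a record that the work was actually done, which is the point of having a calendar in the first place.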