One of the most important roles of a systems administrator or IT technician is maintaining the server room (or comms room) and all the equipment stored in it. Proper server room maintenance is crucial to protect the equipment and the sensitive data stored on and transmitted between systems. It’s not just the room that requires maintenance: servers and network equipment also need maintenance and must be assessed on a regular basis. Whether you’re responsible for an array of remote virtual private servers, a cloud server farm, or the servers for a localised intranet, a maintenance checklist will save you time and effort in keeping your room and the systems in it running efficiently and at optimal performance.
The specific maintenance needs of your server room will vary with the size and location of the room, how many hours each day it is in operation, the density of the equipment in each rack, the airflow between devices, and how much heat the equipment generates. Heat levels depend on several factors, including how much and what types of equipment operate in the space. Keep equipment well spaced to provide proper ventilation and reduce the risk of overheating.
Some server room maintenance only needs to be carried out once a month, while other tasks related to the health of your server and network gear should be completed on a daily or weekly basis. When servers are first set up, it’s important to identify and correct any issues with the cabling, placement, and configuration of the equipment. Proper setup can prevent many common issues and extend the life of the equipment.
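The daily, weekly, and monthly cadence described above can be tracked with a small script. The following is a minimal Python sketch rather than a definitive tool: the task names, the `INTERVALS` table, the 30-day approximation of a month, and the `due_tasks` helper are all illustrative assumptions.

```python
from datetime import date, timedelta

# Assumed intervals; a "month" is approximated as 30 days for simplicity.
INTERVALS = {
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
    "monthly": timedelta(days=30),
}


def due_tasks(tasks, today):
    """Return the names of tasks whose interval has elapsed since last done.

    `tasks` is a list of (name, frequency, last_done) tuples.
    """
    return [
        name
        for name, frequency, last_done in tasks
        if today - last_done >= INTERVALS[frequency]
    ]


# Hypothetical example data, not taken from a real checklist.
tasks = [
    ("Review server health alerts", "daily", date(2024, 3, 10)),
    ("Inspect network gear", "weekly", date(2024, 3, 8)),
    ("Clean air filters", "monthly", date(2024, 2, 20)),
]

print(due_tasks(tasks, today=date(2024, 3, 11)))
```

Running this daily, for example from a scheduled job, gives a short list of what is due without anyone having to remember each interval.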
Server Room Maintenance Tasks
When creating your list of maintenance tasks, focus on goals such as reducing unplanned downtime and the risk of downtime, improving safety and security, and improving your capability to provide services. If you know what you are trying to accomplish, a few simple metrics can tell you whether you are accomplishing it, which is a critical point if you are trying to secure funding for an initiative of this sort.
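As one example of such a metric, availability can be derived from a record of outages. The sketch below is only an illustration: the `Outage` record, the function names, and the sample figures are hypothetical, and it assumes that planned maintenance outages are excluded from the downtime figure.

```python
from dataclasses import dataclass


@dataclass
class Outage:
    minutes: float
    planned: bool  # True for scheduled maintenance windows


def unplanned_downtime_minutes(outages):
    """Total minutes lost to outages that were not scheduled."""
    return sum(o.minutes for o in outages if not o.planned)


def availability_percent(outages, period_minutes):
    """Percentage of the period not lost to unplanned outages."""
    if period_minutes <= 0:
        raise ValueError("period_minutes must be positive")
    down = unplanned_downtime_minutes(outages)
    return 100.0 * (period_minutes - down) / period_minutes


# Hypothetical figures for a 30-day month.
month_minutes = 30 * 24 * 60
outages = [
    Outage(minutes=90, planned=True),
    Outage(minutes=36, planned=False),
    Outage(minutes=180, planned=False),
]

print(unplanned_downtime_minutes(outages))
print(availability_percent(outages, month_minutes))
```

Tracking the same figure month over month shows whether the maintenance programme is actually reducing unplanned downtime.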
When planning your routine, be sure to include physical checks of switches and routers, circuit breakers, power supplies, cabling, HVAC systems, and fire detection and prevention systems. Plans should include certain standard features:
Create your own routine
It’s important to have a routine and procedure to follow so that nothing is missed. A checklist to guide you is provided later in this article, but your data centre should develop its own checklist, tuned to its specific needs. Check your equipment manuals to see what intervals and routines the manufacturers recommend; some OEMs even provide standardised checklists detailing the preventive maintenance that is optimal for your equipment.
Schedule regular checks
It’s important to schedule your routine checks at regular intervals. Like most teams, you will probably struggle to find a window in which to schedule a small outage for maintenance. Not every task requires an outage, but some cannot be performed properly without one. If you cannot find time, look at when the fewest people are logged in and using the system; those windows are the best times to perform maintenance activities.
Don’t wait for an actual failure! Plan your routine maintenance in advance. If you have older machines that function in a small, airless server room, you probably want to perform inspections and cleaning routines more frequently than if you have new equipment in a well-ventilated room.
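Choosing the window in which the fewest people are logged in can be done from simple usage counts. The sketch below assumes you can export the average number of active sessions for each hour of the day; the `quietest_window` function and the sample numbers are hypothetical, and the search wraps around midnight.

```python
def quietest_window(hourly_sessions, window_hours):
    """Find the run of consecutive hours with the fewest total sessions.

    `hourly_sessions` holds one count per hour, starting at hour 0.
    Returns (start_hour, total_sessions); windows may wrap past midnight.
    """
    n = len(hourly_sessions)
    if not 0 < window_hours <= n:
        raise ValueError("window_hours must be between 1 and len(hourly_sessions)")
    best_start, best_total = 0, None
    for start in range(n):
        total = sum(hourly_sessions[(start + i) % n] for i in range(window_hours))
        if best_total is None or total < best_total:
            best_start, best_total = start, total
    return best_start, best_total


# Hypothetical average active sessions for hours 00:00 to 23:00.
sessions = [4, 2, 1, 1, 3, 8, 20, 45, 80, 95, 100, 98,
            90, 92, 96, 94, 85, 60, 40, 30, 22, 15, 10, 6]

start, total = quietest_window(sessions, window_hours=2)
print(f"Quietest 2-hour window starts at {start:02d}:00 ({total} sessions)")
```

With these sample counts the quietest two-hour window starts at 02:00. Ties are resolved in favour of the earliest start hour.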
Address physical and mechanical maintenance
Most of your server room maintenance tasks will address physical and mechanical maintenance. There will often be a separate checklist for server and network maintenance. For instance, UPS and batteries benefit greatly from regular maintenance to ensure they are healthy and ready to go in case of power outages. Other systems that should be maintained include HVAC, generators, and physical plant items like doors, emergency exits, cabling, etc. Good maintenance in these areas can also reduce power usage, enabling your data centre to run more cleanly and efficiently.
Keep it clean
Dust can be a serious issue in a server room: it creates a hazard by blocking airflow and by restricting the movement of moving parts. Cleaning shouldn’t wait for your preventive maintenance routine, but it should be a regular part of it.
Make safety part of the routine
Safety is important, not only for you and your co-workers but also for your equipment. Server rooms and data centres have many hazards, especially electrical hazards. Ensure that anyone who performs maintenance knows how to do so safely, and likewise ensure a safe environment for the people who work there.
Document your maintenance plan
If you document your procedures, your maintenance history, and the outcome of each procedure, you can confirm that maintenance is being carried out according to plan and assess the effectiveness of the plan overall. This information can also be invaluable in the event of an actual system failure: it can help you identify problems, or at least rule out certain causes. Review your maintenance history as well to identify chronic equipment problems and trouble spots. Finally, ensure you have an up-to-date inventory. Know what you have, how old it is, and where it is. This will help you execute a preventive maintenance plan efficiently.
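A structured maintenance log makes the review for chronic problems easy to automate. The following Python sketch shows one possible shape for it: the `LogEntry` fields, the "ok"/"issue" outcome values, the asset IDs, and the `chronic_assets` threshold are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass
class LogEntry:
    performed_on: date
    asset_id: str
    task: str
    outcome: str  # assumed values: "ok" or "issue"


def chronic_assets(log, min_issues=2):
    """Return asset IDs with at least `min_issues` logged issues, sorted by ID."""
    issues = Counter(e.asset_id for e in log if e.outcome == "issue")
    return sorted(asset for asset, count in issues.items() if count >= min_issues)


# Hypothetical maintenance history.
log = [
    LogEntry(date(2024, 1, 5), "ups-01", "Battery check", "ok"),
    LogEntry(date(2024, 1, 5), "hvac-02", "Filter inspection", "issue"),
    LogEntry(date(2024, 2, 5), "hvac-02", "Filter inspection", "issue"),
    LogEntry(date(2024, 2, 5), "sw-core-1", "Cabling check", "issue"),
    LogEntry(date(2024, 3, 5), "hvac-02", "Filter inspection", "ok"),
]

print(chronic_assets(log))
```

The same records can be joined to an inventory list (asset, age, location) so that recurring trouble spots are tied to specific equipment.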
Educate your team
Provide ongoing education so that your technicians know what issues can occur, which scenarios to watch for and rectify if they appear, and which hazards to look for and how to avoid them.
IR Scans
If your business has the funds, or if you can hire a vendor, consider having an IR (infrared) scan carried out, which can help identify physical problems. The IR scan specifically looks for unusually high temperatures, which can signal equipment that is deteriorating because of vibration, blocked vents, and other problems.
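Once scan readings are available, flagging the unusually hot spots is a simple comparison. This sketch assumes the scan results have been exported as one temperature per component; the component names, the use of the median as a baseline, and the 10 °C margin are illustrative choices, not values from any vendor or standard.

```python
from statistics import median


def hot_spots(readings_c, margin_c=10.0):
    """Return components reading more than `margin_c` above the median.

    `readings_c` maps a component name to its temperature in degrees Celsius.
    """
    if not readings_c:
        return []
    baseline = median(readings_c.values())
    return sorted(
        name for name, temp in readings_c.items() if temp > baseline + margin_c
    )


# Hypothetical readings from a single scan.
readings = {
    "rack1-psu": 41.0,
    "rack1-switch": 39.5,
    "rack2-ups": 58.0,
    "rack2-psu": 40.5,
}

print(hot_spots(readings))
```

Here only `rack2-ups` is flagged, because it sits well above the other components. A flagged component is a prompt for physical inspection, not a diagnosis.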