Fault Management is the process by which application or site coding problems and data problems are detected, diagnosed and fixed so that normal operations can resume. The stages are:

1st Stage - Detect the Problem

This is when the problem first comes to light. It can be highlighted by monitoring equipment, application logs or error messages. The problem prevents a certain course of action. It can be severe, causing loss of service, or minor, such as being unable to action a single request.
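
As an illustration of detection from application logs, the following is a minimal Python sketch that scans log lines and sorts what it finds into severe and minor faults. The keyword patterns, the DetectedFault record and the detect_faults function are hypothetical names chosen for this example; a real site would match whatever its own logs and monitoring equipment actually emit.

```python
import re
from dataclasses import dataclass

# Hypothetical keywords: assumed for illustration, not taken from any real log format.
SEVERE_PATTERN = re.compile(r"\b(CRITICAL|FATAL|service unavailable)\b", re.IGNORECASE)
MINOR_PATTERN = re.compile(r"\b(ERROR|request failed)\b", re.IGNORECASE)


@dataclass
class DetectedFault:
    line_number: int  # 1-based position of the line in the log
    severity: str     # "severe" (loss of service) or "minor" (a request could not be actioned)
    message: str


def detect_faults(log_lines):
    """Return a DetectedFault for every log line that looks like a problem."""
    faults = []
    for number, line in enumerate(log_lines, start=1):
        # Check the severe pattern first so a line matching both counts as severe.
        if SEVERE_PATTERN.search(line):
            faults.append(DetectedFault(number, "severe", line.strip()))
        elif MINOR_PATTERN.search(line):
            faults.append(DetectedFault(number, "minor", line.strip()))
    return faults
```

Called with a list of log lines, detect_faults returns only the lines that matched, each tagged with its severity, which gives the next stage a starting point for diagnosis.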

2nd Stage - Isolate the Problem

The next step is to diagnose the problem. Some problems are isolated easily; others take more time. When the cause is an application bug, we use debugging tools to trace the path through the code and the database to find it.

3rd Stage - Inform Client

The third stage is to inform the client that the problem has been detected and diagnosed, and to outline a solution. Sometimes this results in a change to the application or site, in which case the Change Management Process is adopted. The cause may also lie in hardware (comms, server, memory and hosting) or in third-party software (operating system live patches, security issues or viruses, or data software) that needs to be resolved.

Final Stage - Resolve the Problem

The final stage is to implement the solution by fixing the fault or bug, changing the application, or adding a temporary solution that allows everyone to continue as normal until a permanent solution is available. The application or site then returns to normal operation.
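
The four stages can be pictured as a fault record that moves through them in a fixed order. The Python sketch below is one possible way to model that, not a description of any actual tracking system; the Stage and FaultTicket names, and the temporary flag for a stop-gap solution, are assumptions made for the example.

```python
from enum import Enum


class Stage(Enum):
    DETECTED = 1         # 1st Stage - Detect the Problem
    ISOLATED = 2         # 2nd Stage - Isolate the Problem
    CLIENT_INFORMED = 3  # 3rd Stage - Inform Client
    RESOLVED = 4         # Final Stage - Resolve the Problem


# Each stage leads to exactly one next stage; RESOLVED has no successor.
_NEXT_STAGE = {
    Stage.DETECTED: Stage.ISOLATED,
    Stage.ISOLATED: Stage.CLIENT_INFORMED,
    Stage.CLIENT_INFORMED: Stage.RESOLVED,
}


class FaultTicket:
    """Hypothetical record of one fault as it moves through the stages."""

    def __init__(self, description):
        self.description = description
        self.stage = Stage.DETECTED
        self.history = [Stage.DETECTED]
        self.temporary = False  # True if resolved with a temporary solution

    def advance(self):
        """Move the ticket to the next stage and return that stage."""
        if self.stage is Stage.RESOLVED:
            raise ValueError("fault is already resolved")
        self.stage = _NEXT_STAGE[self.stage]
        self.history.append(self.stage)
        return self.stage

    def resolve(self, temporary=False):
        """Complete the final stage; only allowed once the client has been informed."""
        if self.stage is not Stage.CLIENT_INFORMED:
            raise ValueError("client must be informed before the fault is resolved")
        self.temporary = temporary
        return self.advance()
```

A ticket starts at DETECTED, advance() is called after diagnosis and again after the client has been told, and resolve(temporary=True) records a stop-gap solution that still needs a permanent fix later.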