Friday, February 18, 2011

High Availability - Terminology (II)

Planned outage/downtime


Planned outages include maintenance, offline backups and upgrades. These can often be scheduled outside periods when high availability is required.

 

Unplanned outage/downtime


While planned outages are a necessary evil, an unplanned outage can be a nightmare for a business. Depending on the business in question and the duration of the downtime, an unplanned outage can result in such overwhelming losses that the business is forced to close. Regardless of the nature, outages are something that businesses usually do not tolerate. There is always pressure on IT to eliminate unplanned downtime totally and drastically reduce, if not eliminate, planned downtime.

Note that an application or computer system does not have to be totally down for an outage to occur. It is possible that the performance of an application degrades to such a degree that it is unusable. As far as the business or end user is concerned, this application is down, although it is available.

Unplanned Shutdowns have various causes. The main reasons can be categorized into:
  • Hardware FailuresFailure of main systems components such as CPUs and memory; or peripherals such as disks, disk controllers, network cards; or auxiliary equipment such as power modules and fans; or network equipment such as switches, hubs, cables, etc., can be the causes of hardware failures.
  • Software Failures – The possibilities of failure of software mostly depends upon the type of software used. One of the main causes for software failure is applying a patch. Sometimes, if a patch does not match the type of implementation, then the application software may start to behave in a strange way, bringing down the application and reversing the changes, if possible. Sometimes, an upgrade may also cause a problem. The main problem with upgrades will be performance related or the misbehaving of any third party products, which depend upon those upgrades.
  • Human Errors – An accidental action of any user can cause a major failure on the system. Deleting a necessary file, dropping the data or a table, updating the database with a wrong value, etc., are a few examples
Read more »

No comments:

Post a Comment