Availability
{{Short description|Term in reliability engineering}} {{distinguish|Availability (thermodynamics)|Availability heuristic}} {{Redirect|Available}}
In [[reliability engineering]], the term '''availability''' has the following meanings:
- The degree to which a [[system]], [[subsystem]] or equipment is in a specified operable and committable state at the start of a mission, when the mission is called for at an unknown, ''i.e.'' a random, time.
- The probability that an item will operate satisfactorily at a given point in time when used under stated conditions in an ideal support environment.
[[High availability]] systems are typically specified at levels such as 99.98%, 99.999% or 99.9996%. The complement, '''unavailability''', is 1 minus the availability.
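To give a feel for what such percentages mean in practice, the following minimal sketch converts an availability figure into expected downtime per year. It assumes a 365-day year and treats the figure as a long-run average; the function name <code>downtime_hours_per_year</code> is illustrative only.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years (assumption)


def downtime_hours_per_year(availability: float) -> float:
    """Expected downtime per year for an availability given as a fraction in [0, 1]."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    unavailability = 1.0 - availability  # unavailability is 1 minus availability
    return unavailability * HOURS_PER_YEAR


for pct in (99.98, 99.999, 99.9996):
    minutes = downtime_hours_per_year(pct / 100.0) * 60
    print(f"{pct}% available -> about {minutes:.1f} minutes of downtime per year")
```

Under these assumptions, 99.98% corresponds to roughly 105 minutes of downtime per year and 99.999% to a little over 5 minutes.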
==Representation==
The simplest representation of '''availability''' (''A'') is the ratio of the expected [[uptime]] of a system to the sum of its expected uptime and downtime, which together make up the total length ''C'' of the observation window:
: A = \frac{E[\mathrm{uptime}]}{E[\mathrm{uptime}]+E[\mathrm{downtime}]} = \frac{E[\mathrm{uptime}]}{C}
Another equation for '''availability''' (''A'') is the ratio of the mean time to failure (MTTF) to the mean time between failures (MTBF), where MTBF is the sum of MTTF and the mean time to repair (MTTR):
: A = \frac{MTTF}{MTTF + MTTR} = \frac{MTTF}{MTBF}
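The two ratios above can be compared with a small simulation. The sketch below alternates random up and down periods and checks that the observed uptime fraction approaches MTTF / (MTTF + MTTR). The exponential distributions, the parameter values and the function name <code>simulate_availability</code> are assumptions made for illustration; the ratio itself does not depend on the choice of distribution.

```python
import random


def simulate_availability(mean_up: float, mean_down: float,
                          cycles: int = 100_000, seed: int = 0) -> float:
    """Estimate availability as total uptime / (total uptime + total downtime)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    # expovariate takes a rate, i.e. 1 / mean duration
    uptime = sum(rng.expovariate(1.0 / mean_up) for _ in range(cycles))
    downtime = sum(rng.expovariate(1.0 / mean_down) for _ in range(cycles))
    return uptime / (uptime + downtime)


mttf, mttr = 100.0, 5.0             # hypothetical values, in hours
estimate = simulate_availability(mttf, mttr)
expected = mttf / (mttf + mttr)     # MTTF / MTBF
print(f"simulated: {estimate:.4f}, formula: {expected:.4f}")
```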
If we define the status function ''X''(''t'') as
: X(t)= \begin{cases} 1, & \text{system functions at time } t \\ 0, & \text{maintenance} \end{cases}
then the availability ''A''(''t'') at time ''t'' > 0 is represented by
: A(t)=\Pr[X(t)=1]=E[X(t)].
Average availability must be defined on an interval of the real line. If we consider an arbitrary constant c>0, then average availability is represented as
: A_c = \frac{1}{c} \int_0^c A(t)\,dt.
Limiting (or steady-state) availability is represented by the following limit (Elsayed, E., ''Reliability Engineering'', Addison Wesley, Reading, MA, 1996):
: A = \lim_{t \rightarrow \infty} A(t).
Limiting average availability is the limit of the average availability over the interval [0, ''c''] as ''c'' grows without bound:
: A_\infty =\lim_{c \rightarrow \infty} A_c = \lim_{c \rightarrow \infty}\frac{1}{c} \int_0^c A(t)\,dt,\quad c > 0.
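As an illustration of these quantities, consider a system that starts in the working state and has a constant failure rate λ and a constant repair rate μ. For this two-state model, a standard textbook result (an assumption here, not derived in this article) is ''A''(''t'') = μ/(λ+μ) + λ/(λ+μ)·e<sup>−(λ+μ)''t''</sup>. The sketch below evaluates it, approximates the average availability with the trapezoidal rule, and shows both approaching μ/(λ+μ). The rates and function names are hypothetical.

```python
import math


def instantaneous(t: float, lam: float, mu: float) -> float:
    """A(t) for a two-state model with failure rate lam and repair rate mu, up at t = 0."""
    s = lam + mu
    return mu / s + (lam / s) * math.exp(-s * t)


def average(c: float, lam: float, mu: float, steps: int = 10_000) -> float:
    """Trapezoidal approximation of (1/c) * integral of A(t) over [0, c]."""
    h = c / steps
    total = 0.5 * (instantaneous(0.0, lam, mu) + instantaneous(c, lam, mu))
    total += sum(instantaneous(i * h, lam, mu) for i in range(1, steps))
    return total * h / c


def limiting(lam: float, mu: float) -> float:
    """Value approached by both A(t) and the average availability in this model."""
    return mu / (lam + mu)


lam, mu = 0.01, 0.5  # hypothetical rates per hour
for c in (1.0, 10.0, 100.0, 1000.0):
    print(f"c={c:7.1f}  A(c)={instantaneous(c, lam, mu):.6f}  A_c={average(c, lam, mu):.6f}")
print(f"limit      {limiting(lam, mu):.6f}")
```

Because ''A''(''t'') decreases from 1 towards its limit in this model, the average over [0, ''c''] stays above ''A''(''c'') for every finite ''c''.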
Availability is the probability that an item will be in an operable and committable state at the start of a mission when the mission is called for at a random time, and is generally defined as uptime divided by total time (uptime plus downtime).
=== Series vs Parallel components ===
[[File:Series vs parallel components.png|alt=series vs parallel components|thumb|397x397px|series vs parallel components]]
Suppose a series system is composed of components A, B and C. Then the following formula applies:
Availability of series system = (availability of component A) × (availability of component B) × (availability of component C){{Cite book |title=System Sustainment: Acquisition And Engineering Processes For The Sustainment Of Critical And Legacy Systems |year=2022 |isbn=9789811256868 |last1=Sandborn |first1=Peter |last2=Lucyshyn |first2=William |publisher=World Scientific }}{{Cite book |title=Reliability and Availability Engineering: Modeling, Analysis, and Applications |year=2017 |isbn=978-1107099500 |last1=Trivedi |first1=Kishor S. |last2=Bobbio |first2=Andrea |publisher=Cambridge University Press }}
Therefore, the combined availability of multiple components in series can never exceed the availability of the least available component, and it is strictly lower whenever the other components are less than 100% available.
On the other hand, the following formula applies to parallel components:
Availability of parallel components = 1 - (1 - availability of component A) × (1 - availability of component B) × (1 - availability of component C)
[[File:System availability chart.png|alt=10 hosts, each having 50% availability. But if they are used in parallel and fail independently, they can provide high availability.|thumb|10 hosts, each having 50% availability. But if they are used in parallel and fail independently, they can provide high availability.]]
As a corollary, if there are N parallel components, each with availability X, then:
Availability of parallel components = 1 - (1 - X)^ N
Adding parallel components reduces the unavailability of the overall system exponentially, provided the components fail independently. For example, if each host has only 50% availability, using 10 hosts in parallel achieves 99.9023% availability.
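The series and parallel formulas can be written as two short functions. This is a minimal sketch that assumes components fail independently; the function names are illustrative.

```python
from math import prod


def series_availability(availabilities):
    """All components must be up: multiply the individual availabilities."""
    return prod(availabilities)


def parallel_availability(availabilities):
    """At least one component must be up: 1 minus the product of unavailabilities."""
    return 1.0 - prod(1.0 - a for a in availabilities)


# Three components at 99% each, in series
print(f"series:   {series_availability([0.99, 0.99, 0.99]):.6f}")

# Ten hosts at 50% each, in parallel
print(f"parallel: {parallel_availability([0.5] * 10):.6f}")
```

With ten 50% hosts the parallel result is 1 − 0.5<sup>10</sup> ≈ 0.999023, matching the 99.9023% figure. Nested arrangements can be evaluated by feeding the result of one function into the other.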
Note that redundancy does not always lead to higher availability, because it also increases complexity, which can in turn reduce availability. According to Marc Brooker, to take advantage of redundancy, ensure that:{{Cite book |title=Understanding Distributed Systems, Second Edition: What every developer should know about large distributed applications |isbn=978-1838430214 |last1=Vitillo |first1=Roberto |date=23 February 2022 |publisher=Roberto Vitillo }}
- You achieve a net-positive improvement in the overall availability of your system
- Your redundant components fail independently
- Your system can reliably detect healthy redundant components
- Your system can reliably scale out and scale in redundant components
=== Methods and techniques to model availability ===
[[Reliability block diagram|Reliability Block Diagrams]] or [[Fault Tree Analysis]] can be used to calculate the availability of a system, or of a functional failure condition within a system, taking into account many factors such as:
- Reliability models
- Maintainability models
- Maintenance concepts
- Redundancy
- Common cause failure
- Diagnostics
- Level of repair
- Repair status
- Dormant failures
- Test coverage
- Active operational times / missions / sub system states
- Logistical aspects such as spare part (stocking) levels at different depots, transport times, repair times at different repair lines, manpower availability and more
- Uncertainty in parameters
Furthermore, these methods are capable of identifying the most critical items and the failure modes or events that impact availability.
=== Definitions within systems engineering ===
'''Availability, inherent (Ai)''' {{cite web|title=Inherent Availability (AI) |url=https://dap.dau.mil/glossary/Pages/2045.aspx |work=Glossary of Defense Acquisition Acronyms and Terms |publisher=Department of Defense |access-date=10 April 2014 |url-status=dead |archive-url=https://web.archive.org/web/20140413164657/https://dap.dau.mil/glossary/Pages/2045.aspx |archive-date=13 April 2014 }} The probability that an item will operate satisfactorily at a given point in time when used under stated conditions in an ideal support environment. It excludes logistics time, waiting or administrative downtime, and preventive maintenance downtime. It includes [[corrective maintenance]] downtime. Inherent availability is generally derived from analysis of an engineering design:
The impact of a repairable-element (refurbishing/remanufacture isn't repair, but rather replacement) on the availability of the system, in which it operates, equals [[mean time between failures]] MTBF/(MTBF+ [[mean time to repair]] MTTR).
The impact of a one-off/non-repairable element (could be refurbished/remanufactured) on the availability of the system, in which it operates, equals the [[mean time to failure]] (MTTF)/(MTTF + the [[mean time to repair]] MTTR).
It is based on quantities under control of the designer.
'''Availability, achieved (Aa)''' {{cite web|title=Achieved Availability (AI) |url=https://dap.dau.mil/glossary/Pages/1380.aspx |work=Glossary of Defense Acquisition Acronyms and Terms |publisher=Department of Defense |access-date=10 April 2014 |url-status=dead |archive-url=https://web.archive.org/web/20140413164705/https://dap.dau.mil/glossary/Pages/1380.aspx |archive-date=13 April 2014 }} The probability that an item will operate satisfactorily at a given point in time when used under stated conditions in an ideal support environment (i.e., that personnel, tools, spares, etc. are instantaneously available). It excludes logistics time and waiting or administrative downtime. It includes active preventive and corrective maintenance downtime.
'''Availability, operational (Ao)''' {{cite web|title=Operational Availability (AI) |url=https://dap.dau.mil/glossary/Pages/Archived/1476.aspx |work=Glossary of Defense Acquisition Acronyms and Terms |publisher=Department of Defense |access-date=10 April 2014 |url-status=dead |archive-url=https://web.archive.org/web/20130312154509/https://dap.dau.mil/glossary/Pages/Archived/1476.aspx |archive-date=12 March 2013 }} The probability that an item will operate satisfactorily at a given point in time when used in an actual or realistic operating and support environment. It includes logistics time, ready time, and waiting or administrative downtime, and both preventive and corrective maintenance downtime. This value is equal to the mean time between failure ([[MTBF]]) divided by the mean time between failure plus the mean downtime (MDT). This measure extends the definition of availability to elements controlled by the logisticians and mission planners such as quantity and proximity of spares, tools and manpower to the hardware item.
Refer to [[Systems engineering]] for more details.
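The difference between inherent and operational availability can be illustrated numerically. The sketch below uses the two ratios given above, MTBF/(MTBF + MTTR) and MTBF/(MTBF + MDT). All numbers are hypothetical, and treating the mean downtime as repair time plus a single logistics delay is a simplification made for this example.

```python
def inherent_availability(mtbf: float, mttr: float) -> float:
    """Ai: only corrective repair time counts as downtime."""
    return mtbf / (mtbf + mttr)


def operational_availability(mtbf: float, mdt: float) -> float:
    """Ao: all downtime counts, including logistics and administrative delays."""
    return mtbf / (mtbf + mdt)


mtbf = 2000.0             # hypothetical mean time between failures, hours
mttr = 4.0                # hypothetical active repair time, hours
logistics_delay = 20.0    # hypothetical wait for spares and personnel, hours
mdt = mttr + logistics_delay  # simplified mean downtime (assumption)

ai = inherent_availability(mtbf, mttr)
ao = operational_availability(mtbf, mdt)
print(f"Ai = {ai:.4%}, Ao = {ao:.4%}")
```

Since the mean downtime can never be shorter than the repair time it contains, operational availability computed this way is never higher than inherent availability.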
=== Basic example ===
If we are using equipment which has a [[mean time to failure]] (MTTF) of 81.5 years and [[mean time to repair]] (MTTR) of 1 hour:
: MTTF in hours = {{math|1=81.5 × 365 × 24 = 713940}} (This is a reliability parameter and often has a high level of uncertainty!)
: Inherent availability (Ai) {{math|1= = 713940 / (713940+1) = 713940 / 713941 = 99.999860% }}
: Inherent unavailability {{math|1= = 1 / 713941 = 0.000140%}}
Outage due to equipment in hours per year = failure rate × MTTR = (1/81.5 per year) × 1 hour ≈ 0.01227 hours per year.
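The arithmetic of this example can be reproduced with a few lines of code. This is a sketch assuming a 365-day year, as in the calculation above; the variable names are illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, same convention as the example

mttf_hours = 81.5 * HOURS_PER_YEAR   # 713940 hours
mttr_hours = 1.0

availability = mttf_hours / (mttf_hours + mttr_hours)
unavailability = 1.0 - availability
# Expected outage: the fraction of time spent down, scaled to one year
outage_hours_per_year = unavailability * HOURS_PER_YEAR

print(f"MTTF in hours:          {mttf_hours:.0f}")
print(f"Inherent availability:  {availability:.6%}")
print(f"Inherent unavailability:{unavailability:.6%}")
print(f"Outage per year:        {outage_hours_per_year:.5f} hours")
```

The expected outage comes to roughly 0.012 hours, well under one minute per year. As noted above, the MTTF input usually carries far more uncertainty than this precision suggests.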
==Literature==
'''Availability''' is well established in the literature of [[stochastic modeling]] and [[optimal maintenance]]. Barlow and Proschan [1975] define availability of a repairable system as "the probability that the system is operating at a specified time t." Blanchard [1998] gives a qualitative definition of availability as "a measure of the degree of a system which is in the operable and committable state at the start of mission when the mission is called for at an unknown random point in time." This definition comes from the MIL-STD-721. Lie, Hwang, and Tillman [1977] developed a complete survey along with a systematic classification of availability.
Availability measures are classified by either the time interval of interest or the mechanisms for the system [[downtime]]. If the time interval of interest is the primary concern, we consider instantaneous, limiting, average, and limiting average availability. The aforementioned definitions are developed in Barlow and Proschan [1975], Lie, Hwang, and Tillman [1977], and Nachlas [1998]. The second primary classification for availability is contingent on the various mechanisms for downtime such as the inherent availability, achieved availability, and operational availability. (Blanchard [1998], Lie, Hwang, and Tillman [1977]). Mi [1998] gives some comparison results of availability considering inherent availability.
Availability considered in maintenance modeling can be found in Barlow and Proschan [1975] for replacement models, Fawzi and Hawkes [1991] for an R-out-of-N system with [[spare part|spare]]s and repairs, Fawzi and Hawkes [1990] for a series system with replacement and repair, Iyer [1992] for imperfect repair models, Murdock [1995] for age replacement preventive maintenance models, Nachlas [1998, 1989] for preventive maintenance models, and Wang and Pham [1996] for imperfect maintenance models. A very comprehensive recent book is by Trivedi and Bobbio [2017].
==Applications==
[[Availability factor]] is used extensively in [[Power station|power plant engineering]]. For example, the [[North American Electric Reliability Corporation]] implemented the [[Generating Availability Data System]] in 1982.{{cite web | url=http://www.nerc.com/pa/RAPA/gads/Publications/GADS---Mandatory%20Reporting%20of%20Conventional%20Generation%20Performance%20Data%20Final.pdf |archive-url= https://ghostarchive.org/archive/20221009/http://www.nerc.com/pa/RAPA/gads/Publications/GADS---Mandatory%20Reporting%20of%20Conventional%20Generation%20Performance%20Data%20Final.pdf |archive-date=2022-10-09 |url-status=live | title=Mandatory Reporting of Conventional Generation Performance Data | publisher=North American Electric Reliability Corporation | work=Generating Availability Data System | date=July 2011 | access-date=13 March 2014 | pages=7, 17}}
==See also==
- [[Dependability]]
- [[Reliability engineering]]
- [[Safety engineering]]
- [[List of system quality attributes]]
- [[Spurious trip level]]
- [[Condition-based maintenance]]
- [[Fault reporting]]
- [[High availability]]
- [[RAMS]]
==References==
{{Reflist}}
==Sources==
- {{FS1037C MS188}}
- K. Trivedi and A. Bobbio, ''Reliability and Availability Engineering: Modeling, Analysis and Applications'', Cambridge University Press, 2017.
==External links==
- [http://www.eventhelix.com/RealtimeMantra/FaultHandling/reliability_availability_basics.htm Reliability and Availability Basics]
- [http://www.eventhelix.com/RealtimeMantra/FaultHandling/system_reliability_availability.htm System Reliability and Availability]
- [http://www.weibull.com/hotwire/issue79/relbasics79.htm Availability and the Different Ways to Calculate It]
- [https://evocon.com/kb/how-to-track-technical-availability/ How to track and improve Technical Availability?]
{{Authority control}}
[[Category:Telecommunication theory]]