## Abstract

The fatigue life of a safety-critical pressure component is generally estimated using best-fit fatigue life curves (S-N curves), which are derived from in-air fatigue test data. The best-fit approach requires a large safety factor to accommodate the uncertainty associated with the large scatter in fatigue test data. In addition to this safety factor, fatigue life prognostics of reactor components requires a further correction factor that is, in general, also estimated deterministically. This factor, known as the environmental correction factor *F*_{en}, accounts for the harsh coolant environment, which severely reduces the life of these components. The deterministic *F*_{en} factor may add further conservatism to the fatigue life estimate, leading to unnecessarily early retirement of costly reactor components. To address these issues, we propose a data-analytics framework that uses Weibull and Bootstrap probabilistic modeling techniques to explicitly quantify the uncertainty/scatter associated with fatigue life, rather than estimating lives with a best-fit deterministic approach. We expect the proposed probabilistic approach to provide first-hand information for assessing the maximum and minimum effects of pressurized water reactor water on a reactor component. In addition to the probabilistic fatigue curves, we suggest using a probabilistic environmental correction factor *F*_{en}. We assume that the probabilistic fatigue curve and *F*_{en} capture the S-N data scatter associated with the bulk effect of material grade, surface finish, strain rate, etc. on the material/component fatigue life.

## 1 Introduction

The fatigue lives of various reactor components are estimated based on the ASME design curve. The design curve in the ASME Boiler & Pressure Vessel Code Section III is based on the best-fit curve of in-air data [1]. The ASME design curve was developed conservatively to account for some sources of uncertainty (e.g., data scatter associated with test specimen surface finish, material grade and heat, test temperature, etc.). When pressurized water reactors (PWRs) and boiling water reactors (BWRs) were originally designed, the fatigue lives of their components had to be estimated using the ASME in-air design curves. However, new regulatory requirements mandate consideration of the harsh effect of the reactor coolant on material fatigue performance, particularly when extending the lives of PWRs and BWRs beyond their original design lives of 40 years. The effect of the reactor coolant environment must therefore be considered when predicting the fatigue life of PWR or BWR components that are exposed to the coolant [2,3]. To do so, an *environmental correction factor* *F*_{en} is applied to the fatigue lives predicted from the in-air fatigue design curve. However, the best-fit fatigue design curve and the *F*_{en} factor are used deterministically for fatigue life prediction of reactor components, without explicitly modeling the scatter/uncertainty associated with the fatigue test data set.

The scatter in fatigue data often results from the use of different grades, surface finishes, and heats of material for the test specimens, as well as different test temperatures, strain rates, etc. A deterministic or best-fit fatigue design curve combined with a deterministic environmental correction factor may not accurately estimate the life of reactor components, because the material grade, heat, surface finish, operating temperature, and loading rate (and the associated stress/strain rates) of the actual component may differ from the material grade, heat, surface finish, test temperature, and stress/strain control rates of the laboratory test specimens. Note that the deterministic design curves and the environmental correction factor are estimated from fatigue test data of small, polished, laboratory-scale specimens. Hence, probabilistic estimation of the fatigue life curve and environmental correction factor might capture the bulk effect of the uncertainty associated with material grade, heat, surface finish, test temperature, etc. of components, without explicitly factoring out the individual effects, which is sometimes not only highly complex but also impractical (for example, factoring out the effect of surface finish in the test specimen versus the actual component).

It is practically impossible to conduct hundreds of tests for the different conditions and parameters linked to the many varying conditions of the actual structural system. For example, a reactor component can be subjected to hundreds of different strain rates associated with different thermal–mechanical transients. Modeling the S-N curve statistically in a meaningful way may require hundreds of fatigue tests for each transient and its associated strain rate. The same is true for other parameters (e.g., metal grade, such as 316ss versus 304ss, surface finish, etc.). For example, it may not be practical to model the grade effect of 316ss and 304ss separately and use those models for reliability modeling of reactor components (with varied stainless-steel grades manufactured by different vendors), although it is in principle possible to incorporate the effect of various parameters and their interactions if the relevant test data are available in sufficient numbers (considering that the focus of this paper is statistical/probabilistic modeling). Hence, given the limitations of the available data sets, it is not practical to develop a meaningful statistical model with respect to individual parameters (such as strain rate, surface finish, material grade, etc.). In this paper, we instead present a probabilistic modeling framework that captures the bulk effect of most of those fatigue-affecting parameters while considering strain amplitude as the only independent variable. This type of assumption is not uncommon: the basic ASME code best-fit or design curve does not segregate the effects of surface finish, metal grade, etc., but rather considers their bulk effect with the strain/stress amplitude as the only independent variable. In this paper, we suggest only the use of a probabilistic S-N modeling framework in fatigue life modeling, rather than relying solely on best-fit modeling approaches. We anticipate that this preliminary framework can have practical significance for probabilistic modeling of component reliability, although there is considerable potential to improve it further.

In the present paper, we propose a data-analytics-based probabilistic framework that directly uses the fatigue data under in-air and PWR water conditions rather than using the best-fit in-air curve and the deterministic environmental correction factor. The probabilistic approach has the following two major advantages:

(1) The probabilistic model can be used to quantify the safety margin as a level of failure probability.

(2) The reliability of the resulting model/approach can be quantified and visualized.

The proposed data-analytics-based probabilistic framework discusses two distinct steps/concepts:

(1) The Weibull probabilistic model [4–14], to explicitly quantify the scatter and the associated confidence band of the underlying S-N data. As an illustrative example, we estimated the probabilistic fatigue life based on the Weibull distribution using the stainless-steel fatigue life data (both in-air and PWR water conditions) available in the literature [3]. The underlying parameters of the Weibull probabilistic model were estimated using the maximum likelihood estimation (MLE) technique.

(2) The Bootstrap probabilistic model [14–16], to explicitly quantify the modeling error (if any). After the Weibull-based model has been estimated, it is also important to quantify the uncertainty (or modeling error) in the resulting model or estimates.

## 2 Probabilistic Fatigue Life Modeling

### 2.1 Underlying Fatigue Life Data and Associated Best-Fit S-N Curve.

In the discussed example cases, we used the fatigue test data of austenitic stainless steel available in the literature [3]. The test temperature ranged from 100 to 325 °C for high temperature (HT) air and PWR water conditions, and the strain rate data ranged from 10^{−5} to 0.3%/s for the PWR water condition [3]. These fatigue life data consist of several stainless-steel grades, such as 316 and 304. Figure 1 shows the extracted fatigue life data for the in-air and PWR water conditions.

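The ANL mean curve [2] gives the best-fit relationship between the strain amplitude (*ɛ*_{a}) and fatigue life in RT air (*N*_{Air−RT}) for austenitic stainless steel. This curve can be expressed as follows:

$$\ln(N_{Air-RT}) = 6.891 - 1.920\,\ln(\epsilon_a - 0.112) \quad (1)$$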

In Eq. (1), the strain amplitude of 0.112 (having a unit of %) implies the *endurance limit* below which the fatigue failure does not occur. The ASME design curve (i.e., dashed black lines in Fig. 1) is calculated using an adjustment life factor of 12 and stress/strain factor of 2 as mentioned in the literature [2].
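As a rough numerical illustration of the mean curve and the adjustment factors described above, the sketch below evaluates a curve of the form ln *N* = *A* − *B* ln(*ɛ*_{a} − *C*) and applies a factor of 12 on life and 2 on strain. Only *C* = 0.112 is stated in the text; the values *A* = 6.891 and *B* = 1.920 are assumed here from the commonly quoted ANL mean air curve for austenitic stainless steel and should be checked against Ref. [2]. Taking the lower of the two adjusted lives is likewise an assumption about how the two factors are combined, and the function names are hypothetical.

```python
import math

# Assumed coefficients: A and B follow the commonly quoted ANL mean air curve;
# only C = 0.112 (%) is stated in the text.
A, B, C = 6.891, 1.920, 0.112

def mean_life(strain_amp_pct, a=A, b=B, c=C):
    """Best-fit life (cycles) for a strain amplitude in %; infinite at or below c."""
    if strain_amp_pct <= c:
        return math.inf
    return math.exp(a - b * math.log(strain_amp_pct - c))

def design_life(strain_amp_pct, life_factor=12.0, strain_factor=2.0):
    """Lower of: mean life / life_factor, and mean life at strain * strain_factor."""
    by_life = mean_life(strain_amp_pct) / life_factor
    by_strain = mean_life(strain_amp_pct * strain_factor)
    return min(by_life, by_strain)

for eps in (0.15, 0.3, 0.6):
    print(f"eps_a={eps:.2f}%  mean={mean_life(eps):.3g}  design={design_life(eps):.3g}")
```

Under these assumptions the design life never exceeds the mean life divided by 12, and the strain factor governs at low amplitudes where the curve is flat.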

### 2.2 S-N Data Scatter Modeling Using the Weibull Probabilistic Technique.

We assumed that the relationship between *ɛ*_{a} and *N*_{A} (or *N*_{W}) at high temperatures would be similar to that between *ɛ*_{a} and *N*_{Air−RT} (i.e., Eq. (2)), where *N*_{A} and *N*_{W} indicate the fatigue life in HT air and PWR water conditions, respectively. Therefore, we assumed the following relationship:

$$\ln(N_A) = \theta_{A1} - \theta_{A2}\,\ln(\epsilon_a - \theta_{A3}) \quad (3a)$$

$$\ln(N_W) = \theta_{W1} - \theta_{W2}\,\ln(\epsilon_a - \theta_{W3}) \quad (3b)$$

where *θ*_{A1}, *θ*_{A2}, *θ*_{A3}, *θ*_{W1}, *θ*_{W2}, and *θ*_{W3} are unknown parameters that should be estimated from the data (e.g., using strain versus life data given in the literature [3]). Note that *θ*_{A3} and *θ*_{W3} are unknown parameters (which are to be estimated) and are in some way related to the endurance limit of fatigue.

Before building the probabilistic model, we first used a functional (polynomial) fitting approach to model the S-N data, to check whether conventional fitting can model the entire S-N curve. However, we found that the best-fit curve obtained by functional fitting does not readily pass through the regions at or near the endurance limit (see Fig. 2). This behavior is expected, since functional fitting methods try to fit through the largest number of data points. Because few data points are available at or near the endurance limit, the best-fit curve does not pass through those regions.
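The functional-fitting check described above can be imitated with ordinary least squares. The sketch below assumes an S-N form ln *N* = *θ*_{1} − *θ*_{2} ln(*ɛ*_{a} − *θ*_{3}) with *θ*_{3} held fixed, so that the remaining problem is linear. The data points are synthetic and purely illustrative; they are not the data of Ref. [3], and the function name is hypothetical.

```python
import math

def fit_fixed_limit(strains, lives, theta3):
    """Least-squares theta1, theta2 in ln N = theta1 - theta2*ln(eps - theta3)."""
    x = [math.log(e - theta3) for e in strains]
    y = [math.log(n) for n in lives]
    n = len(x)
    xm, ym = sum(x) / n, sum(y) / n
    sxx = sum((xi - xm) ** 2 for xi in x)
    sxy = sum((xi - xm) * (yi - ym) for xi, yi in zip(x, y))
    slope = sxy / sxx
    # The model slope in ln(eps - theta3) is -theta2.
    return ym - slope * xm, -slope

# Synthetic, noise-free points generated from assumed parameters (illustration only).
true = (6.9, 1.9, 0.11)
strains = [0.2, 0.3, 0.4, 0.6, 0.8, 1.0]
lives = [math.exp(true[0] - true[1] * math.log(e - true[2])) for e in strains]

theta1, theta2 = fit_fixed_limit(strains, lives, theta3=true[2])
print(round(theta1, 3), round(theta2, 3))
```

Every point carries equal weight in such a fit, so sparsely populated regions such as the neighborhood of the endurance limit have little influence on the result.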

The fatigue life in each environment is described by a Weibull distribution:

$$F_A(N_A) = 1 - \exp\left[-\left(\frac{N_A}{\eta_A}\right)^{\beta_A}\right] \quad (4a)$$

$$F_W(N_W) = 1 - \exp\left[-\left(\frac{N_W}{\eta_W}\right)^{\beta_W}\right] \quad (4b)$$

where *F*_{A} and *F*_{W} are the cumulative distribution functions (CDFs); *β*_{A} and *β*_{W} are shape parameters; and *η*_{A} and *η*_{W} are the scale parameters of the Weibull distribution for the HT air and PWR water conditions, respectively. The parameter *β* determines the aging behavior of the material. For example, the failure rate (or hazard function) increases if *β* > 1, which indicates that the material undergoes time-dependent degradation. In most cases, *β* is considered a material constant. Therefore, we assume that *β*_{A} and *β*_{W} are not influenced by *ɛ*_{a}.

*η* denotes a characteristic fatigue life having a unit of cycles. It is the quantile at which the CDF of the Weibull distribution reaches approximately 0.632 (i.e., 1 − *e*^{−1}). When *β* = 1, *η* is exactly equal to the expected value of the Weibull distribution. Because *N*_{A} and *N*_{W} are assumed to be random variables, we can choose the characteristic lives *η*_{A} and *η*_{W} as representative values of fatigue life. Therefore, Eqs. (3*a*) and (3*b*) can be rewritten as

$$\ln(\eta_A) = \theta_{A1} - \theta_{A2}\,\ln(\epsilon_a - \theta_{A3}) \quad (5a)$$

$$\ln(\eta_W) = \theta_{W1} - \theta_{W2}\,\ln(\epsilon_a - \theta_{W3}) \quad (5b)$$
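The properties of the Weibull parameters stated above (the CDF value at *η*, the role of *β* in the hazard function, and the mean at *β* = 1) can be checked numerically. The following is a minimal sketch that assumes the standard two-parameter Weibull form and a strain-dependent scale of the form ln *η* = *θ*_{1} − *θ*_{2} ln(*ɛ*_{a} − *θ*_{3}); the parameter values are hypothetical and chosen only to exercise the functions.

```python
import math

def weibull_cdf(n, beta, eta):
    """Failure probability at life n for shape beta and scale eta."""
    return 1.0 - math.exp(-((n / eta) ** beta))

def weibull_hazard(n, beta, eta):
    """Instantaneous failure rate; increasing in n when beta > 1."""
    return (beta / eta) * (n / eta) ** (beta - 1.0)

def weibull_mean(beta, eta):
    """Expected life, eta * Gamma(1 + 1/beta)."""
    return eta * math.gamma(1.0 + 1.0 / beta)

def characteristic_life(strain_amp_pct, theta1, theta2, theta3):
    """Scale eta (cycles) from the assumed ln(eta) = theta1 - theta2*ln(eps - theta3)."""
    return math.exp(theta1 - theta2 * math.log(strain_amp_pct - theta3))

# Hypothetical parameter values, not estimates from the paper.
beta, thetas = 2.0, (6.9, 1.9, 0.11)
eta = characteristic_life(0.6, *thetas)
print(weibull_cdf(eta, beta, eta))               # CDF evaluated at the characteristic life
print(weibull_mean(1.0, eta), eta)               # mean and eta for beta = 1
print(weibull_hazard(2 * eta, beta, eta) > weibull_hazard(eta, beta, eta))
```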

Thus, in this case, we can summarize the objectives of model estimation as follows:

*HT air case*: Estimate four parameters (*β*_{A}, *θ*_{A1}, *θ*_{A2}, *θ*_{A3}) from 96 exact data points, each comprising two variables (*N*_{A}, *ɛ*_{a}).

*PWR water case*: Estimate four parameters (*β*_{W}, *θ*_{W1}, *θ*_{W2}, *θ*_{W3}) from 199 exact and four right-censored data points, each comprising two variables (*N*_{W}, *ɛ*_{a}).
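To make the estimation objective concrete, the sketch below writes a log-likelihood for a Weibull model with exact and right-censored observations. It assumes the standard treatment, in which an exact failure contributes the log-density and a run-out contributes the log-survival function, and it assumes a scale of the form ln *η* = *θ*_{1} − *θ*_{2} ln(*ɛ*_{a} − *θ*_{3}). The data records, parameter values, and function names are hypothetical.

```python
import math

def eta_of(strain, theta1, theta2, theta3):
    """Assumed scale relation: ln(eta) = theta1 - theta2*ln(strain - theta3)."""
    return math.exp(theta1 - theta2 * math.log(strain - theta3))

def log_likelihood(params, data):
    """data: (life, strain_amp_pct, failed) records; failed=False marks a run-out."""
    beta, theta1, theta2, theta3 = params
    total = 0.0
    for life, strain, failed in data:
        if beta <= 0 or strain <= theta3:
            return -math.inf  # parameters outside the admissible region
        eta = eta_of(strain, theta1, theta2, theta3)
        z = (life / eta) ** beta
        if failed:
            # Exact observation: log of the Weibull density.
            total += math.log(beta / eta) + (beta - 1.0) * math.log(life / eta) - z
        else:
            # Right-censored observation: log of the survival function.
            total += -z
    return total

# Synthetic records (illustration only): two failures and one run-out.
data = [(2.0e4, 0.6, True), (1.5e5, 0.3, True), (1.0e6, 0.2, False)]
params = (2.0, 6.9, 1.9, 0.11)  # hypothetical (beta, theta1, theta2, theta3)
print(log_likelihood(params, data))
```

Maximizing this function over the four parameters, for example with a general-purpose numerical optimizer, would give the MLE for one environment.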

The blue solid lines in Fig. 3 show the estimated relationship between *ɛ*_{a} and *η* (i.e., Eqs. (5*a*) and (5*b*)), together with the 90% confidence bounds of the fatigue failure probability described by the modified Weibull distribution (i.e., Eqs. (A2*a*) and (A2*b*) in the Appendix). The figure shows that the MLE model fits all data sets well and exhibits a trend similar to the reference ANL [2] and ASME [1] curves. The confidence interval in strain amplitude becomes narrower as the predicted fatigue life becomes longer. This behavior is understandable because the lower strain amplitudes are closer to the endurance limit. Note that when a material is loaded near its endurance limit, its behavior is usually elastic and hence relatively deterministic. In addition, the original experimental data show the scatter narrowing (see Fig. 2) as the strain amplitude decreases. The probabilistic model thus captures the behavior of the original experimental data, which is in line with the fundamental physics associated with the endurance limit. Therefore, we can conclude that the estimated Weibull-based fatigue model is plausible.
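One way to picture such bounds is through Weibull quantiles. The sketch below assumes that the 90% band is delimited by the 5% and 95% failure-probability quantiles of life at each strain amplitude; this interpretation, the scale relation ln *η* = *θ*_{1} − *θ*_{2} ln(*ɛ*_{a} − *θ*_{3}), and the parameter values are all assumptions made for illustration.

```python
import math

def weibull_quantile(p, beta, eta):
    """Life at which the failure probability equals p."""
    return eta * (-math.log(1.0 - p)) ** (1.0 / beta)

def life_band(strain, beta, theta1, theta2, theta3, lower=0.05, upper=0.95):
    """(lower-quantile life, characteristic life, upper-quantile life) at one strain."""
    eta = math.exp(theta1 - theta2 * math.log(strain - theta3))
    return weibull_quantile(lower, beta, eta), eta, weibull_quantile(upper, beta, eta)

# Hypothetical parameters, not the estimates reported in the paper.
for eps in (0.2, 0.4, 0.8):
    lo, eta, hi = life_band(eps, beta=2.0, theta1=6.9, theta2=1.9, theta3=0.11)
    print(f"eps_a={eps:.1f}%  N05={lo:.3g}  eta={eta:.3g}  N95={hi:.3g}")
```

With a strain-independent *β*, the ratio of the upper to the lower quantile is the same at every strain amplitude, so under these assumptions the band has constant width in ln *N*; a narrowing measured along the strain axis then comes from the flattening of the curve as *ɛ*_{a} approaches *θ*_{3}.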

Regarding HT air conditions, the resulting probabilistic fatigue life model and the ANL RT air mean curve are very similar, although the temperature ranges of the underlying data differ (e.g., the HT air data temperature range is 100–325 °C). This implies that the temperature effect could be ignored in air, at least below 325 °C. Meanwhile, for PWR water (see Fig. 3(b)), the 90% confidence band of the resulting probabilistic fatigue curves (mean and confidence bounds) clearly does not encompass the in-air mean curve. Therefore, it is necessary to consider the environmental effect under PWR water conditions (either by using the usual deterministic *F*_{en} approach or by using the proposed probabilistic model discussed in this paper).

### 2.3 Modeling Error Estimation Using Bootstrap Probabilistic Technique.

We presented a Weibull-based probabilistic model and estimated its parameters in Sec. 2.2. As a next step, we will quantify the estimated model uncertainty using the *Bootstrap* method, which is a powerful method for quantifying the uncertainty of model estimators [16]. Through this method, it is possible to quantify the uncertainty using only experimental data and without any pre-assumption of the distributing function of the estimation uncertainty (e.g., considering Gaussian distribution). Therefore, this method can be applied when a parametric interval estimation is complex [14].

Before the application of the Bootstrap method, it should be noted that the confidence interval of the model (estimated by the MLE method) and estimation uncertainty (model error) are different. The confidence interval does not care about the reliability (or applicability) of the estimated model itself. For example, as mentioned in Sec. 2.2, the narrow confidence interval at the low strain amplitude is caused by the endurance limit of the estimated model. However, in the MLE estimation step, we do not care about the accuracy of the estimated endurance limit. That is why we used the Bootstrap method to explicitly quantify the modeling error (if any).

The Bootstrapping procedure can be simply divided into two parts: (1) *Bootstrap sampling* and (2) *Bootstrap estimation*.

(1) *Bootstrap sampling*: A total of 200 *Bootstrap sample sets* are generated from the sample set. The sample set refers to the fatigue life data set used in probabilistic modeling of each environmental condition.

(i) Draw a random sample (e.g., a pair of *ɛ*_{a}−*N*_{A} data) from the sample set.

(ii) Repeat Step i, with replacement, until the number of draws equals the sample set size (e.g., 96 times for HT air). This gives one bootstrap sample set.

(iii) Repeat Steps i and ii 200 times to generate 200 bootstrap sample sets.

(2) *Bootstrap estimation*: A total of 200 *Bootstrap estimates* are obtained, one from each Bootstrap sample set, using the MLE method described in Sec. 2.2.
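The two-part procedure above can be sketched in a few lines. The resampling follows the steps as described (draws with replacement, sample sets of the original size, 200 repetitions). The estimator, however, is a simple placeholder (the mean of ln *N*) and not the MLE of Sec. 2.2, and the 96 data pairs are synthetic; both are assumptions made to keep the example self-contained.

```python
import math
import random

def bootstrap_sample_sets(sample_set, n_sets=200, seed=0):
    """Steps i-iii: n_sets resamples, each the size of sample_set, with replacement."""
    rng = random.Random(seed)
    size = len(sample_set)
    return [[rng.choice(sample_set) for _ in range(size)] for _ in range(n_sets)]

def bootstrap_estimates(sample_set, estimator, n_sets=200, seed=0):
    """Apply the same estimator to every bootstrap sample set."""
    return [estimator(s) for s in bootstrap_sample_sets(sample_set, n_sets, seed)]

def mean_log_life(pairs):
    """Placeholder estimator (NOT the paper's MLE): mean of ln(life)."""
    return sum(math.log(life) for _, life in pairs) / len(pairs)

# Synthetic (strain amplitude %, life) pairs standing in for 96 HT air records.
rng = random.Random(42)
sample_set = [(0.2 + 0.01 * (i % 60), rng.lognormvariate(10.0, 0.5)) for i in range(96)]

estimates = bootstrap_estimates(sample_set, mean_log_life)
ordered = sorted(estimates)
print(len(estimates), ordered[9], ordered[189])  # count and a rough 5%-95% spread
```

Replacing `mean_log_life` with a function that maximizes the Weibull log-likelihood would give one set of parameter estimates per bootstrap sample set.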

Figure 4 shows the estimates from the original sample set and bootstrap sample sets in each case. It is confirmed that all of the bootstrap estimates converged during the numerical estimation step.

The Bootstrap estimates in Fig. 4 are sufficiently distributed near the sample set estimates without bias. In this case, it seems that the uncertainties in the above eight estimators likely follow a Gaussian distribution. However, one should note that the relationships among the estimators are dependent. Therefore, discussing the uncertainty of each estimator separately is not meaningful.

In Fig. 5, we draw all 200 bootstrap estimation curves of the relationship between *ɛ*_{a} and *η* for the HT air and PWR water cases to illustrate the estimation uncertainty of Eqs. (5*a*) and (5*b*). It is known that the bootstrap uncertainty approximates the real estimation uncertainty given a sufficiently large sample size and number of bootstrapping iterations [16]. In Fig. 5, the uncertainty in the estimated model is represented as the width of the light blue bands. Thus, it can be concluded that the model estimation is uncertain when the strain amplitude of concern is too low or too high, that is, in the regions where few experimental data points are available.

With respect to Fig. 5, it should be noted that the confidence interval of the model (estimated by the MLE method) and the estimation uncertainty (visualized by the Bootstrap method) are different. The data scatter in the experimental S-N data is modeled through the Weibull probabilistic model, and the associated uncertainty can be visualized through the confidence band, whereas the Bootstrap model explicitly models the uncertainty or error associated with the model itself. For example, a smaller number of available experimental data points can lead to higher modeling error or modeling uncertainty. The two blue dashed lines in Fig. 5 represent the probabilistic model confidence bounds associated with S-N data scatter, whereas the shade in Fig. 5 shows the model error or model uncertainty. To clarify further, the Weibull confidence interval does not address the reliability of the estimated model (or modeling error); instead, it captures the physics and dependency of the data in a probabilistic way. For example, near the endurance limit the material behavior is expected to be more elastic and hence more deterministic, leading to a narrower confidence band. This is exactly what the Weibull-MLE probabilistic model captures (see the confidence bounds represented by blue dashed lines in Fig. 5). In contrast, the model-error bounds (determined through the Bootstrap technique and represented by the shade in Fig. 5) widen near and beyond the endurance limit (*N* = 10^{6}). This is also understandable, because few data are available near and beyond the endurance limit (owing to the long test durations required and the associated cost).

## 3 Discussion on Empirical Cumulative Distribution Function and Probabilistically Estimated *F*_{en} Confidence Band

### 3.1 Comparison of Model Generated With Empirical Cumulative Distribution Function.

There are two variables in the Weibull-based fatigue life model: (1) fatigue life (i.e., *N*_{A} or *N*_{W}) and (2) strain amplitude (i.e., *ɛ*_{a}). The failure probability can be calculated if these two variables are determined.

Figure 6(a) shows the estimated Weibull CDF for HT air with the sample set (i.e., black line) and bootstrap sample sets (i.e., gray lines) when *ɛ*_{a} = 0.2% or 0.6%. The original (raw) data points are used for plotting the corresponding empirical CDFs. The empirical CDFs are estimated considering strain amplitudes (*ɛ*_{a}) of either (0.2 ± 0.01%) or (0.6 ± 0.01%), respectively. Likewise, Fig. 6(b) shows the estimated Weibull CDF for PWR water with the sample set (i.e., black line) and bootstrap sample sets (i.e., gray lines) when *ɛ*_{a} = 0.3% or 0.6%. The corresponding empirical CDFs are estimated considering strain amplitudes (*ɛ*_{a}) of either (0.3 ± 0.01%) or (0.6 ± 0.01%), respectively.

It is shown that the failure probability increased (i.e., shifted left) when the given *ɛ*_{a} value increased and/or the environmental conditions became corrosive (i.e., PWR water). This finding agrees with the actual fatigue behavior. Furthermore, Fig. 6 shows a good correlation between the CDFs estimated in the Weibull model and the empirical CDFs estimated directly based on the raw or sample data. This shows that the above Weibull-based model is suitable for probabilistic prediction of fatigue life.
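The comparison between a fitted Weibull CDF and an empirical CDF built from data in a narrow strain window can be sketched as follows. The ±0.01% window follows the text; the rank/*n* plotting position, the distance measure, the data records, and the Weibull parameters are assumptions chosen for illustration.

```python
import math

def lives_near(data, target_strain, tol=0.01):
    """Lives whose strain amplitude (%) lies within target_strain +/- tol."""
    # A tiny slack absorbs floating-point round-off at the window edges.
    return [life for life, strain in data if abs(strain - target_strain) <= tol + 1e-12]

def empirical_cdf(lives):
    """Sorted (life, cumulative fraction) pairs, with fraction = rank / n."""
    ordered = sorted(lives)
    n = len(ordered)
    return [(life, (i + 1) / n) for i, life in enumerate(ordered)]

def max_gap(ecdf, beta, eta):
    """Largest absolute difference between the empirical steps and a Weibull CDF."""
    return max(abs(p - (1.0 - math.exp(-((life / eta) ** beta)))) for life, p in ecdf)

# Synthetic (life, strain amplitude %) records, not data from Ref. [3].
records = [(100.0, 0.6), (300.0, 0.605), (200.0, 0.59), (50.0, 0.3)]
ecdf = empirical_cdf(lives_near(records, 0.6))
print(ecdf)
print(max_gap(ecdf, beta=2.0, eta=220.0))  # hypothetical Weibull parameters
```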

### 3.2 Probabilistically Estimated *F*_{en} Confidence Band.

In addition to the probabilistic fatigue curves, we suggest using a probabilistic environmental correction factor *F*_{en}. The confidence band of *F*_{en} can be estimated based on different cases of Weibull parameters and using Eq. (6). Note that in Eq. (6), the Weibull parameters for in-air and PWR water conditions were estimated using the MLE and Bootstrap methods discussed in Secs. 2.2 and 2.3. Figure 7 shows the estimated *F*_{en} values as a function of strain amplitude, which were determined using Eq. (6).

Figure 7 shows that there is large uncertainty in the estimated *F*_{en} values at low and high strain amplitudes. This large uncertainty is associated with the lack of data available at those strain amplitudes, similar to the discussion in Sec. 2.3. Nevertheless, Fig. 7 shows the approximate reliable region of the *F*_{en} confidence band, where the Bootstrap modeling error is relatively low. Note that the model-error bounds (which were estimated through the Bootstrap technique discussed in Sec. 2.3) are represented by the sky-blue shade in Fig. 5. The results in Fig. 7 can be used as first-hand information to assess the maximum and minimum effect of PWR water for a given strain amplitude.
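A band of this kind can be illustrated with a simple ratio of lives. The sketch below does not reproduce Eq. (6); it assumes that *F*_{en} is the ratio of the in-air life to the PWR water life at the same strain amplitude, takes the nominal value from the characteristic lives, and bounds it by pairing opposite 5%/95% Weibull quantiles. The scale relation ln *η* = *θ*_{1} − *θ*_{2} ln(*ɛ*_{a} − *θ*_{3}) and all parameter values are hypothetical.

```python
import math

def eta_of(strain, theta):
    """Assumed scale relation: ln(eta) = theta1 - theta2*ln(strain - theta3)."""
    t1, t2, t3 = theta
    return math.exp(t1 - t2 * math.log(strain - t3))

def quantile(p, beta, eta):
    """Weibull life at failure probability p."""
    return eta * (-math.log(1.0 - p)) ** (1.0 / beta)

def fen_band(strain, air, water, lower=0.05, upper=0.95):
    """air/water: (beta, (theta1, theta2, theta3)). Returns (min, nominal, max)."""
    beta_a, th_a = air
    beta_w, th_w = water
    eta_a, eta_w = eta_of(strain, th_a), eta_of(strain, th_w)
    nominal = eta_a / eta_w
    f_min = quantile(lower, beta_a, eta_a) / quantile(upper, beta_w, eta_w)
    f_max = quantile(upper, beta_a, eta_a) / quantile(lower, beta_w, eta_w)
    return f_min, nominal, f_max

# Hypothetical parameter sets for the two environments.
air = (2.0, (6.9, 1.9, 0.11))
water = (1.8, (6.2, 1.9, 0.11))
for eps in (0.2, 0.4, 0.8):
    print(eps, fen_band(eps, air, water))
```

Repeating the calculation for every bootstrap parameter set would additionally show how the modeling error propagates into the band.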

## 4 Conclusion

This work discusses a probabilistic modeling framework for estimating environmental fatigue lives. Stainless-steel fatigue life data from the literature [3] under in-air and PWR water conditions were used as an illustrative example. The following conclusions can be drawn:

(1) The probabilistic fatigue life model is developed based on the Weibull distribution and estimated using the MLE method to account for the censored data. It is confirmed that the Weibull-based model is consistent with the raw data and the empirical CDF.

(2) The uncertainty in the estimated fatigue model (i.e., the modeling error) can be explicitly quantified using the Bootstrap technique. It is shown that the estimated fatigue life model is uncertain when the given strain amplitude is too low or too high. This is because of the lack of experimental or sample data points in the low- and high-strain-amplitude regions.

(3) The estimated scatter band of the environmental correction factor *F*_{en} can give first-hand information on the minimum and maximum effect of the PWR water environment on reactor component fatigue lives for a given strain amplitude.

Although the above results are very preliminary and based on a relatively small data set taken from JNES-SS-1005 [3], the main purpose of this paper is to demonstrate a data-analytics-based approach for probabilistic life estimation under both in-air and PWR water environmental conditions. Note that the explicit effects of temperature, strain rate, surface finish, stainless-steel grade, heat, and PWR water chemistry were not considered in the proposed Weibull-based probabilistic model and the subsequent probabilistic *F*_{en} estimations; rather, it is assumed that their combined or bulk effect is responsible for the scatter in the underlying S-N data. If appropriate test data sets associated with the individual parameters are available, explicit representation of these parameters is possible by considering a multi-dimensional probabilistic model. Nevertheless, the proposed preliminary model discusses the fundamental background behind the probabilistic life estimation framework.

## Acknowledgment

This work was conducted under the sponsorship of the DOE Light Water Reactor Sustainability (LWRS) program, program manager Dr. Keith Leonard. This work was also supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (Grant No. 2018R1D1A3B07050665; Funder ID: 10.13039/501100003725), and Korea Institute of Energy Technology Evaluation and Planning (KETEP) grant funded by the Korean government (Grant No. 20184010201660; Funder ID: 10.13039/501100007053, Advanced track for large-scale heat exchanger of Industrial plants). The Korean agencies funded the PhD internship of Mr. Jae Phil Park to conduct the majority of the discussed research work at Argonne National Laboratory.

The submitted manuscript has been created by UChicago Argonne, LLC, Operator of Argonne National Laboratory (“Argonne”). Argonne, a U.S. Department of Energy Office of Science Laboratory, is operated under Contract No. DE-AC02-06CH11357. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan. http://energy.gov/downloads/doe-public-accessplan

## Nomenclature

*f*_{A}=probability density function of HT air Weibull distribution

*f*_{W}=probability density function of PWR water Weibull distribution

*l*_{A}=log-likelihood function for HT air Weibull model

*l*_{W}=log-likelihood function for PWR water Weibull model

$\hat{x}$=estimator of *x*

*F*_{A}=CDF of Weibull distribution for HT air

*F*_{en}=environmental correction factor

*F*_{W}=CDF of Weibull distribution for PWR water

*L*_{A}=likelihood function for the HT air Weibull model

*L*_{W}=likelihood function for the PWR water Weibull model

*N*_{A}=fatigue life in HT air (cycles)

*N*_{W}=fatigue life in PWR water (cycles)

*N*_{Air−RT}=fatigue life in RT air (cycles)

*β*_{A}=shape parameter of Weibull distribution for HT air

*β*_{W}=shape parameter of Weibull distribution for PWR water

*ɛ*_{a}=strain amplitude (%)

*η*_{A}=scale parameter of Weibull distribution for HT air (cycles)

*η*_{W}=scale parameter of Weibull distribution for PWR water (cycles)

*θ*_{A1},*θ*_{A2},*θ*_{A3}=parameters in the relationship between *ɛ*_{a} and *N*_{A}

*θ*_{W1},*θ*_{W2},*θ*_{W3}=parameters in the relationship between *ɛ*_{a} and *N*_{W}

### Appendix

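The likelihood function *L* for the HT air and PWR water data can be expressed as follows:

$$L_A = \prod_{i=1}^{96} f_A(N_{A,i} \mid \epsilon_{a,i}) \quad (A1a)$$

$$L_W = \prod_{i=1}^{199} f_W(N_{W,i} \mid \epsilon_{a,i}) \prod_{j=1}^{4} \left[1 - F_W(N_{W,j} \mid \epsilon_{a,j})\right] \quad (A1b)$$

where the second product in Eq. (A1*b*) runs over the four right-censored data, and *f*_{A} and *f*_{W} are probability density functions of the Weibull distribution as defined below:

$$f_A(N_A \mid \epsilon_a) = \frac{\beta_A}{\eta_A}\left(\frac{N_A}{\eta_A}\right)^{\beta_A - 1}\exp\left[-\left(\frac{N_A}{\eta_A}\right)^{\beta_A}\right] \quad (A2a)$$

$$f_W(N_W \mid \epsilon_a) = \frac{\beta_W}{\eta_W}\left(\frac{N_W}{\eta_W}\right)^{\beta_W - 1}\exp\left[-\left(\frac{N_W}{\eta_W}\right)^{\beta_W}\right] \quad (A2b)$$

in which the scale parameters depend on the strain amplitude through Eqs. (5*a*) and (5*b*):

$$\eta_A = \exp\left[\theta_{A1} - \theta_{A2}\,\ln(\epsilon_a - \theta_{A3})\right] \quad (A3a)$$

$$\eta_W = \exp\left[\theta_{W1} - \theta_{W2}\,\ln(\epsilon_a - \theta_{W3})\right] \quad (A3b)$$

The log-likelihood function *l* for the likelihood function *L* (Eqs. (A1*a*) and (A1*b*)) is defined as follows:

$$l_A = \ln L_A \quad (A4a)$$

$$l_W = \ln L_W \quad (A4b)$$

The MLE estimates are the parameter values that maximize the *l* value. That is, the estimates should satisfy the following simultaneous equations:

$$\frac{\partial l_A}{\partial \beta_A} = \frac{\partial l_A}{\partial \theta_{A1}} = \frac{\partial l_A}{\partial \theta_{A2}} = \frac{\partial l_A}{\partial \theta_{A3}} = 0 \quad (A5a)$$

$$\frac{\partial l_W}{\partial \beta_W} = \frac{\partial l_W}{\partial \theta_{W1}} = \frac{\partial l_W}{\partial \theta_{W2}} = \frac{\partial l_W}{\partial \theta_{W3}} = 0 \quad (A5b)$$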
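The idea of solving simultaneous score equations can be illustrated on a deliberately simplified case: a two-parameter Weibull distribution with a constant scale and no censored data, which is not the four-parameter strain-dependent model used in this paper. For that case, the equation for *β* reduces to a single monotone equation that can be solved by bisection, after which *η* follows in closed form. The synthetic data, search bounds, and function names are assumptions.

```python
import math
import random

def shape_score(beta, lives):
    """Profile score for beta; zero at the MLE (complete data, constant scale)."""
    logs = [math.log(x) for x in lives]
    powered = [x ** beta for x in lives]
    weighted = sum(p * l for p, l in zip(powered, logs)) / sum(powered)
    return weighted - 1.0 / beta - sum(logs) / len(lives)

def weibull_mle(lives, lo=0.05, hi=20.0, tol=1e-10):
    """Solve the score equation for beta by bisection, then get eta in closed form."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if shape_score(mid, lives) > 0.0:
            hi = mid  # score increases with beta, so the root lies below mid
        else:
            lo = mid
    beta = 0.5 * (lo + hi)
    eta = (sum(x ** beta for x in lives) / len(lives)) ** (1.0 / beta)
    return beta, eta

# Synthetic lives drawn from a Weibull distribution with scale 1000 and shape 2.
rng = random.Random(1)
lives = [rng.weibullvariate(1000.0, 2.0) for _ in range(2000)]
beta_hat, eta_hat = weibull_mle(lives)
print(beta_hat, eta_hat, shape_score(beta_hat, lives))
```

For the full model, with strain-dependent scale and censored data, no such reduction is available and the four equations would normally be solved with a general numerical optimizer.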