Closure to “Discussion of ‘Factors of Safety for Richardson Extrapolation’ ” (2011, ASME J. Fluids Eng., 133, p. 115501)

Tao Xing

Department of Mechanical Engineering, College of Engineering, University of Idaho, P.O. Box 440902, Moscow, ID 83844-0902; e-mail: xing@uidaho.edu

Frederick Stern

IIHR-Hydroscience and Engineering, C. Maxwell Stanley Hydraulics Laboratory, The University of Iowa, Iowa City, IA 52242-1585; e-mail: frederick-stern@uiowa.edu

The private communication is available to the public upon request.

J. Fluids Eng 133(11), 115502 (Dec 09, 2011) (6 pages) doi:10.1115/1.4005030 History: Received April 10, 2011; Revised September 07, 2011; Published December 09, 2011; Online December 09, 2011

The Technical Brief by Roache [1] presents ten items of discussion of our factor of safety (FS) method for solution verification [2]. Our responses are listed below item by item, using the same numbering as Roache. The nomenclature mostly follows our own rather than Roache’s: we write pRE for the order of accuracy calculated using Richardson extrapolation, where Roache uses the observed order of convergence, and we refer to the GCI and GCI2 methods, where Roache refers to the GCI0 and the real GCI methods. However, we agree with Roache on using FS for the factor of safety in all the verification methods. In response to item (10), we have used our approach to evaluate two new variants of the GCI method and one new variant of the FS method.

(1) The GCI and FS methods can be written in the following general form: Display Formula

The FS method is substantially different from and not a variant of the GCI method. In the FS method we use P=pRE/pth to determine FS and always use p=pRE. Only the FS method, compared with different variants of the GCI method, provides a reliability R larger than 95% and a lower confidence limit (LCL) greater than or equal to 1.2 at the 95% confidence level for the true mean of the parent population of the actual factor of safety. This conclusion is true for different studies, variables, ranges of P values, and single P values where multiple actual factors of safety are available. FS is a smooth linear function of P and has no jumps.
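As a minimal sketch of the quantities this item refers to, the following Python computes pRE, P=pRE/pth, and the Richardson error estimate δRE from one grid triplet, using the standard three-grid Richardson extrapolation formulas for a constant refinement ratio r. The function names, the sample values, and the relation U = FS·|δRE| used in `uncertainty` are illustrative assumptions; the exact definitions are those of Ref. [2].

```python
import math


def richardson_estimates(s1, s2, s3, r, p_th):
    """Return (p_RE, P, delta_RE) for fine (s1), medium (s2), coarse (s3) solutions.

    Assumes a constant grid refinement ratio r > 1 and monotonic convergence.
    """
    eps21 = s2 - s1  # change between medium and fine solutions
    eps32 = s3 - s2  # change between coarse and medium solutions
    if eps21 == 0.0 or eps32 / eps21 <= 1.0:
        raise ValueError("triplet is not monotonically convergent")
    p_re = math.log(eps32 / eps21) / math.log(r)  # estimated order of accuracy
    big_p = p_re / p_th                           # distance metric P
    delta_re = eps21 / (r ** p_re - 1.0)          # error estimate for s1
    return p_re, big_p, delta_re


def uncertainty(fs, delta_re):
    """Assumed general form: uncertainty = factor of safety times |delta_RE|."""
    return fs * abs(delta_re)


# Hypothetical data: s(h) = 1 + 0.1*h**2 sampled at h = 1, 2, 4 (so r = 2).
p_re, big_p, delta_re = richardson_estimates(1.1, 1.4, 2.6, r=2.0, p_th=2.0)
print(p_re, big_p, delta_re)
```

For this manufactured triplet the estimated order recovers the theoretical one, so P is 1 and δRE equals the exact error of the fine solution to rounding.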

There are a few variants of the GCI method. We have used the definition of the GCI method that is arguably the most common version/interpretation applied in the literature [3-5]. The GCI1 method was proposed by Logan and Nitta [6]. The guideline for the GCI2 method, which Roache now refers to as the real GCI method, was communicated to us by Roache [7] in his criticisms of an earlier version of our FS method [8]. Roache’s most recent book [9] does point out that the choice of FS and p requires user judgment calls; however, no single guideline was provided. The lack of a single guideline clearly has caused considerable confusion. We take no responsibility for this confusion. Statistical analysis showed that none of the GCI variants shows R > 95% and LCL > 1.2 for different studies, variables, ranges of P values, and single P values where multiple actual factors of safety are available. As a result, using these GCI variants carries a high risk in certain circumstances, especially for P>1. Except for the original GCI method, all variants of the GCI method have jumps of FS versus P.

Our purpose is not to add to this confusion but rather to evaluate the outcomes of selecting any of these variants of the GCI method and to compare them with the FS method using our approach. The correction factor and pRE are used to define the GCI2 and other verification methods as defined by Eqs. (10)–(15) in Ref. [2] in order to compare their relative conservativeness using the same error estimate δRE.

(2) We disagree with Roache’s referring to the GCI as the GCI0 method and to the GCI2 as the GCI method, for the reasons given in item (1). The lack of a single guideline for selecting FS and p and when to use which variant of the GCI method is highlighted by Roache’s current discussion. Roache accepts using pRE when it is within a 5% difference of pth in item (2), whereas later in item (10), Roache considers two other judgment calls as reasonable.

The GCI2 method discards the “coarse” grid solution in the uncertainty estimate when P>1, which is difficult to justify. For example, four grid solutions from the coarsest grid 4 to the finest grid 1 can build two grid triplet studies, (1, 2, 3) and (2, 3, 4). Grid convergence studies for industrial applications often show the oscillation of pRE such that (1, 2, 3) could estimate P>1 but (2, 3, 4) could estimate P<1. Based on the GCI2 method, S3 should be discarded in the uncertainty estimate for (1, 2, 3) but not for (2, 3, 4). Of course, we agree that ideally one would conduct additional grid triplet studies until the solution is at or as close as possible to the asymptotic range; however, clearly this is not always possible especially for industrial applications [10].
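The four-grid example above can be sketched as follows. The solution values are synthetic and chosen only so that triplet (1, 2, 3) gives P>1 while triplet (2, 3, 4) gives P<1; the helper names and the values r=2 and pth=2 are assumptions for illustration.

```python
import math


def estimate_p(s_fine, s_mid, s_coarse, r, p_th):
    """P = p_RE / p_th for one grid triplet with constant refinement ratio r."""
    ratio = (s_coarse - s_mid) / (s_mid - s_fine)
    if ratio <= 1.0:
        raise ValueError("triplet is not monotonically convergent")
    return math.log(ratio) / math.log(r) / p_th


def consecutive_triplets(solutions):
    """Yield ((i, i+1, i+2), values) for grids numbered from 1 (finest)."""
    for i in range(len(solutions) - 2):
        yield (i + 1, i + 2, i + 3), solutions[i:i + 3]


# Synthetic solutions on grids 1 (finest) to 4 (coarsest); not data from Ref. [2].
solutions = [1.00, 1.01, 1.06, 1.16]
results = {
    grids: estimate_p(*values, r=2.0, p_th=2.0)
    for grids, values in consecutive_triplets(solutions)
}
for grids, big_p in results.items():
    side = "P > 1" if big_p > 1.0 else "P <= 1"
    print(grids, round(big_p, 3), side)
```

With these values the two triplets share grids 2 and 3 yet fall on opposite sides of P=1, which is the situation the paragraph describes.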

We agree that a grid-triplet study with P=0.08 is not desirable. However, it is not uncommon for solution verification studies (e.g., local pRE ranges from 0.012 to 8.47 in Ref. [4]). Additionally, Roache’s criticism of using P=0.08 is inconsistent with one of his previous conclusions that there is no necessity to discard results with pRE<1 (P<0.5 for a second order method in Ref. [11]).

(3) The fact that “the use of the GCI1 method is closer to a 68% than a 95% confidence level” was one of the conclusions by Logan and Nitta [6]. This conclusion was not just based on the dataset with an intentional choice of grid studies with oscillations in both exponent p and output quantity. As stated on page 367 of Ref. [6], “However, for our contrived and mechanics example NS=18 sets (most of which were non-smooth), the use of GCI = 1.25 is much closer to a 68% confidence estimate than 95%.” It was also recommended in Refs. [6] and [2] that a sample with the number of grid convergence studies much larger than 100 is needed to draw general conclusions.

We did not recommend the GCI1 method but rather evaluated it using much larger sample sizes than Ref. [6]. For the largest sample 3 with size N=329, the reliability R (Eq. (19) in Ref. [2]) is 90.3% for the GCI1 method.

(4) We disagree with Roache’s evaluation in Ref. [11] where it states that “Briefly, the net result is 14 NC (nonconservative) of 176 entries, or 8.0%.” Only 151 of the 176 grid triplet studies have the actual error E. This results in 24 nonconservative of 151 (note there are nine nonconservative grid-triplet studies that estimate U=E). So, the reliability for the GCI method [12] is actually 84.1%, which agrees very well with the reliability 83.9% estimated using our 329 grid-triplet studies (sample 3 in Ref. [2]).
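The percentages in this item follow from simple counting, as the short check below shows. It assumes that reliability is the fraction of grid-triplet studies that are conservative, with studies where U=E counted as nonconservative as stated above; the function name is illustrative.

```python
def reliability_percent(n_nonconservative, n_total):
    """Percentage of studies whose uncertainty bounds the actual error."""
    return 100.0 * (1.0 - n_nonconservative / n_total)


# Figure quoted from Ref. [11]: 14 nonconservative of 176 entries.
quoted_nc_percent = 100.0 * 14 / 176

# Recount in this closure: 24 nonconservative of the 151 studies with known E.
recounted_reliability = reliability_percent(24, 151)

print(round(quoted_nc_percent, 1), round(recounted_reliability, 1))
```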

Based on our own evaluation above and the fact that Cadafalch et al.  [12] used FS=1.25 for P>1, the method they applied was not the GCI2 method and more likely the GCI method. The claim of “an original and reasonable variant of the real GCI” [1] again is confusing.

(5) We take 95% coverage as the common uncertainty target for both experiments and computations [5]. Although the GCI2 method only misses the overall reliability by 0.8% for sample 3, more importantly it fails to provide sufficient conservatism for other samples including the reliabilities of 91.4%, 90%, and 87.5% for samples 5, 8, and 16, respectively [2]. It is possible that another dataset could slightly change our evaluations. Nonetheless, the current sample size is large and the range of P values is wide such that a further increase of the number of samples is not likely to significantly alter the FS method and its results.

(6) The FS method was calibrated/validated against the available dataset. Note that calibration/validation requires that the true error can be evaluated, i.e., the solution numerical benchmark (SNB) or solution analytical benchmark (SAB) is known. We welcome additional validation of the FS method and if necessary re-calibration and improvement, but again SNB or SAB must be known. The claim of Roache and others of the 95% reliability for the GCI method is undocumented and based on anecdotal information. We doubt that SNB or SAB is available for many of the cases cited by Roache and others. It should be a simple matter to provide proper documentation.

Note that the FS method is more conservative than the GCI2 method except for 1<P<1.136 due to the jump of the factor of safety at P=1 for the GCI2 method. If the FS method is not conservative enough for another dataset, the GCI2 method will likely be worse.
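The crossover at P=1.136 can be checked numerically under some assumptions that are not spelled out in this closure: pth=2 and r=2 (as in Fig. 1), the FS method written as (2.45-0.85P)|δRE| for P≤1 and (16.4P-14.8)|δRE| for P>1 (coefficients recalled from Ref. [2]), and the GCI2 method written relative to δRE as 1.25 for P≤1 and 3 times the correction factor for P>1. The sketch below is only a consistency check on those assumed forms.

```python
def correction_factor(big_p, r=2.0, p_th=2.0):
    """CF = (r**p_RE - 1) / (r**p_th - 1), with p_RE = P * p_th."""
    return (r ** (big_p * p_th) - 1.0) / (r ** p_th - 1.0)


def fs_factor(big_p):
    """FS method factor of safety (coefficients assumed from Ref. [2])."""
    return 2.45 - 0.85 * big_p if big_p <= 1.0 else 16.4 * big_p - 14.8


def gci2_factor(big_p, r=2.0, p_th=2.0):
    """GCI2 factor relative to delta_RE (assumed form, see lead-in)."""
    return 1.25 if big_p <= 1.0 else 3.0 * correction_factor(big_p, r, p_th)


def bisect(f, lo, hi, tol=1e-10):
    """Plain bisection; requires a sign change of f on [lo, hi]."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0.0:
        raise ValueError("no sign change on the interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


# Search just above P = 1, where GCI2 jumps to 3 and exceeds the FS method.
crossover = bisect(lambda p: fs_factor(p) - gci2_factor(p), 1.0001, 1.5)
print(round(crossover, 3))
```

Under these assumptions the two curves cross close to the value quoted above, with the GCI2 method larger below the crossover and the FS method larger above it.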

The claim that the GCI2 method has been stable for over 12 years is not well founded. Due to the lack of a single guideline on the choice of FS and p, different variants of the GCI method have been used by different users based on their own judgment calls. For example, Cadafalch et al.  [12] did not use the GCI2 method, and Logan and Nitta [6] used the GCI1 method. Furthermore, the GCI method may have been applied to O(1000) cases but no statistical evidence for reliability has been documented.

(7) We disagree with Roache’s suggestion that the FS method has problems in predicting monotonic convergence for fine grids. The uncertainty estimates in Table 6 for the FS method in Ref. [2] for the three finest grid triplets are not monotonically decreasing since P shows large oscillations, and the factor of safety for the second finest grid triplet (2, 3, 4) at P=1.49 is much larger than that for the other methods evaluated at the same P. However, the larger factor of safety is required to ensure the reliability for P>1. For the three grid triplets discussed, it is interesting to evaluate the convergence ratio R for the fine grid solution S1 (RS1), P (RP), and UG (RUG). All five verification methods have the same RS1 and RP, which show monotonic convergence. The GCI, GCI1, and CF methods show monotonic convergence for UG, whereas the GCI2 and FS methods show monotonic divergence (RUG=2.74) and oscillatory divergence (RUG=-4.53), respectively.
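The convergence labels used in this item can be sketched as a small classifier. It assumes the usual definition of the convergence ratio as the fine-to-medium change divided by the medium-to-coarse change, and a four-way classification consistent with the labels attached to RUG=2.74 and RUG=-4.53 above; the function names are illustrative.

```python
def convergence_ratio(x_fine, x_mid, x_coarse):
    """R = (x_mid - x_fine) / (x_coarse - x_mid) for three successive values."""
    denominator = x_coarse - x_mid
    if denominator == 0.0:
        raise ValueError("medium and coarse values coincide; R is undefined")
    return (x_mid - x_fine) / denominator


def classify(ratio):
    """Map a convergence ratio R to a convergence condition (assumed ranges)."""
    if 0.0 < ratio < 1.0:
        return "monotonic convergence"
    if -1.0 < ratio < 0.0:
        return "oscillatory convergence"
    if ratio > 1.0:
        return "monotonic divergence"
    if ratio < -1.0:
        return "oscillatory divergence"
    return "indeterminate"  # R equal to 0, 1, or -1


# Ratios quoted in the text for U_G with the GCI2 and FS methods.
print(classify(2.74), "|", classify(-4.53))

# Hypothetical uncertainties that halve with each refinement.
print(classify(convergence_ratio(1.0, 2.0, 4.0)))
```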

The oscillation of P may be caused by many factors. Grid 4 is still too coarse for the solution to be in the asymptotic range. Additionally, reducing the iterative error to machine zero is very difficult for large-scale computations. With the small grid refinement ratio r=2^(1/4), solution changes ɛ will be small, and the sensitivity to grid-spacing and time step may be difficult to identify compared with iterative errors UI. As shown in Fig. 6(b) in Ref. [10], UI,1/ɛ12=61.6% for the cases in Table 6 [2]. When r increases, UI/ɛ will likely decrease. For example, the grid uncertainty decreases from 5.04 for (2, 4, 6) to 4.02 for (1, 3, 5) with UI,1/ɛ13=20% for r=2. However, it should be noted that a large r may be problematic, too, as different grids may resolve different flow physics.

There are some other cases in which the GCI, GCI1, GCI2, CF, and FS methods show non-monotonic convergence for multiple grid-triplet studies, including the “well-behaved” problems Cadafalch et al. [12] and Roache [11] used to evaluate the conservativeness of the GCI method. For the radial velocity using the SMART scheme in the study of premixed methane/air laminar flat flame on a perforated burner [13-15], the uncertainty estimates using the FS and GCI2 methods monotonically decreased whereas the other three methods did not as the grid is refined. Another example is for the uncertainty estimates for temperature at a monitored location for a two-dimensional natural convection in square cavities at Ra=10^6, which had five grid-triplet studies with r=2 [16]. Uncertainty estimates using the five verification methods discussed in Ref. [2] first monotonically decreased as the grid is refined but suddenly increased for the finest grid-triplet. Thus, it is unreasonable to blame the FS method as the reason for such behavior.

The verification results for our industrial application example are far from the asymptotic range. Although we evaluated the convergence characteristics for the 98 verification variables using P and |E| as functions of Δxfine/Δxfinest [2], a standard criterion for achieving the asymptotic range is still lacking. A possible criterion is that monotonic convergence should be established based on evaluation of the convergence ratio R for fine grid solution S1 (towards SC), P (towards 1), and U (monotonically decreasing) for multiple (at least three) grid-triplets with the same grid refinement ratio r and UI<<U. In some cases, oscillatory convergence may be acceptable; however, this would require many grid triplets [17]. Although R still needs to be evaluated for all the variables in our dataset, 41.5% of the variables that have more than two grid-triplet studies do show that S1 approaches SC, P approaches 1, and U monotonically decreases as the grid is refined. For the other 58.5% of the variables, S1 also approaches SC as shown by monotonically decreased error magnitude |E|, but P and UG often show mixed convergence conditions as the grid is refined.

(8) As discussed in item (6), without statistical evidence, the claim of the conservativeness of the GCI2 method is undocumented. Furthermore, we doubt that many of these applications have SNB or SAB available. If they do, we will be glad to add them to our dataset. The work by Dr. C. J. Freitas and his group is not publicly available [9]. Therefore, the claim of achieving the 95% reliability is again undocumented and based on anecdotal information.

(9) We agree that the actual factor of safety is undefined when a solution not in the asymptotic range happens to predict the true value. If this happens, it should be excluded from the dataset used to derive the FS method. However, monotonic convergence ensures that the uncertainty estimate is always greater than zero so that a zero error will be automatically bounded by the uncertainty.

The contrived example created by Roache only proves that the average actual factor of safety (X¯) cannot be used alone to determine if a solution verification method is conservative enough. But it can be used to determine the relative conservativeness between different verification methods.

It should be noted that we used both the reliability R and LCL as defined by Eq. (22) in Ref. [2] to develop the FS method and determine if a method is conservative enough. Larger X¯ does not necessarily mean larger R (readers can refer to sample 6 in Ref. [2]).

(10) As requested by Roache, we use our approach to evaluate two new variants of the GCI method proposed by Oberkampf and Roy [18] (GCIOR ) and by Roache [1] (GCI3 ). Display Formula

Display Formula
To address Roache’s concern of using pRE when pRE>>pth, we also evaluate an alternative form of the FS method (FS1 method). The FS1 method is the same as the FS method for P<1 but uses pth instead of pRE in the error estimate for P>1. Thus, Eq. (14) in Ref. [2] becomes Display Formula
Following the same procedure described in Sec. 2.4 of Ref. [2], FS0=2.45, FS1=1.6, and FS2=6.9 are recommended, and the final form of the FS1 method is Display Formula
To compare the relative conservativeness between different verification methods, the three new methods are rewritten in terms of the same error estimate δRE. Display Formula
Display Formula
Display Formula
The factors of safety for all the verification methods discussed so far are shown in Fig. 1. One problem of the GCI2 method is the jump of factor of safety across the asymptotic range at P=1. For two grid-triplet studies with one at P=0.999 and the other at P=1.001, the factor of safety suddenly increases from 1.25 to 3 even though P only varies by less than 0.2%. Eça et al. [19] gave similar comments on this issue: “However, it is not easy ‘to accept’ a jump of a factor of 2.4 in the uncertainty when the observed order of accuracy may vary by only 0.1.” Similar problems exist for the GCIOR and GCI3 methods when pRE differs from pth by 10%. It should be noted that the GCIOR method set the lower limit of pRE to be larger than 0.5, which corresponds to P≥0.25 for a nominal second order method. Thus, the factor of safety for P<0.25 for the GCIOR method shown in Fig. 1 is only a result of the mathematical reformulation. Figure 1 also shows that the GCIOR and GCI3 methods are much more conservative than the other methods for 0.25<P<0.9 and coincide with the GCI2 method for P>1.1. The FS1 method is less and more conservative than the FS method for 1<P≤1.235 and P>1.235, respectively.

The GCIOR, GCI3, and FS1 methods are evaluated using statistical analysis of the 25 samples following Ref. [2], with focus on samples 3 to 25. Table 1 shows the statistics for samples 3 to 8 [2] based on six different P ranges for the three new methods. The FS1 method has the same reliability as the FS method for samples 3 to 8. The GCIOR and GCI3 methods almost have the same reliability, but the GCI3 method is a little more conservative. Compared to the GCI2 method, the GCIOR and GCI3 methods improve the reliability for P<1 to be larger than 95% but are not conservative enough for P≥1, especially near the asymptotic range. Examination of 18.2% of the data for 1.1≤P<2.0, which cover samples 7 and 8, shows that only the FS and FS1 methods achieve 95% reliability, but the GCIOR and GCI3 methods achieve only 90%. The largest X¯ for samples 3-5, sample 6, sample 7, and sample 8 are the GCI3, GCI2, FS, and FS1 methods, respectively. For all the verification methods, the LCLs are larger than 1.2 for all the P ranges.

Table 2 shows the statistics at the seventeen P values (samples 9 to 25) ranging from 0.705 to 1.205. For samples 9 to 19 (P<0.99), all the verification methods achieve reliabilities larger than 95% except 93.1% for the GCIOR method at P=0.905, 87.5% for the three GCI methods at P=0.955, and 84.6% for the GCIOR and GCI3 methods at P=1.105. The largest X¯ for samples 9-12, samples 13-20, samples 21-24, and sample 25 are the GCIOR and GCI3 , FS and FS1 , GCI2 , and FS1 methods, respectively. Only the FS and FS1 methods satisfy the requirement that LCL>1.2 for samples 9 to 25. The GCI2 method has LCL<1.2 for sample 20; the GCIOR method has LCL<1.2 for samples 13, 16, 17, 18, 20, and 22; and the GCI3 method has LCL<1.2 for samples 20 and 22.

The actual factor of safety for sample 3, sample 3 averaged using ΔP=0.01, and the upper and lower band of the confidence interval X¯±tSX¯ for samples 9 to 25 are shown in Fig. 2. t is the factor for the student-t distribution and SX¯ is the standard deviation of the mean of the sample, as defined in Ref. [2]. The GCIOR and GCI3 methods do not satisfy LCL>1.2 near the asymptotic range. Compared to the FS method (Fig. 4(e) in Ref. [2]), the FS1 method shows a larger actual factor of safety when solutions are farther from the asymptotic range for P>1.
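A minimal sketch of the sample statistics discussed here is given below. It assumes that the actual factor of safety of a study is X=U/|E|, that reliability is the fraction of studies with X>1, and that the confidence band is X¯±tSX¯ with SX¯ the sample standard deviation divided by the square root of the sample size; the precise definitions are Eqs. (19) and (22) of Ref. [2]. The data and the Student-t value are hypothetical and supplied by the caller.

```python
import math
import statistics


def fs_statistics(uncertainties, errors, t_value):
    """Return (reliability, mean, lower limit, upper limit) for X = U/|E|."""
    x = [u / abs(e) for u, e in zip(uncertainties, errors)]
    n = len(x)
    reliability = sum(1 for xi in x if xi > 1.0) / n   # fraction with U > |E|
    mean_x = statistics.fmean(x)
    s_mean = statistics.stdev(x) / math.sqrt(n)        # std. deviation of mean
    return reliability, mean_x, mean_x - t_value * s_mean, mean_x + t_value * s_mean


# Hypothetical sample of four studies; t = 3.182 is the two-sided 95% value
# for 3 degrees of freedom, used here only for illustration.
rel, mean_x, lcl, ucl = fs_statistics(
    uncertainties=[2.0, 3.0, 4.0, 1.0],
    errors=[1.0, -2.0, 2.0, 2.0],
    t_value=3.182,
)
print(rel, mean_x, round(lcl, 3), round(ucl, 3))
```

A method would then be judged against the two targets used throughout this closure, reliability above 95% and LCL above 1.2; this toy sample meets neither.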

The choice of FS and p in the GCI method requires user judgment calls, for which no single guideline is currently available. We recommend that a single guideline be provided.

The GCIOR and GCI3 methods have almost the same reliability, but the GCI3 method is a little more conservative. Compared to the GCI2 method, the GCIOR and GCI3 methods improve the reliability for P<1. However, they are too conservative for P<0.9 using a factor of safety 3 and not conservative enough for P≥1.1.

The FS1 and FS methods are the same for P≤1. For pth=2 and r=2, the FS1 method is less and more conservative than the FS method for 1<P≤1.235 and P>1.235, respectively. As a result, the FS1 method may have an advantage for uncertainty estimates when P>2 where the FS and other verification methods likely predict unreasonably small uncertainties due to small error estimates. However, since the current dataset is restricted to P<2, the pros/cons of using the FS or FS1 method cannot be validated. Thus, until additional data is available for P>2, all verification methods should be used with caution for such conditions and, if possible, additional grid-triplet studies conducted to obtain P<2.
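The comparison between the FS1 and FS methods for pth=2 and r=2 can be sketched numerically. The constants FS0=2.45, FS1=1.6, and FS2=6.9 for the FS1 method are taken from item (10); everything else is an assumption recalled from Ref. [2] rather than stated here: the piecewise-linear form FS1·P+FS0(1-P) for P≤1 and FS1·P+FS2(P-1) for P>1, the value FS2=14.8 for the FS method, and the use of the correction factor to express the FS1 method relative to δRE for P>1.

```python
def correction_factor(big_p, r=2.0, p_th=2.0):
    """CF = (r**p_RE - 1) / (r**p_th - 1), with p_RE = P * p_th."""
    return (r ** (big_p * p_th) - 1.0) / (r ** p_th - 1.0)


def piecewise_linear(big_p, fs0, fs1, fs2):
    """Assumed form: fs1*P + fs0*(1 - P) for P <= 1, fs1*P + fs2*(P - 1) above."""
    if big_p <= 1.0:
        return fs1 * big_p + fs0 * (1.0 - big_p)
    return fs1 * big_p + fs2 * (big_p - 1.0)


def fs_factor(big_p):
    """FS method relative to delta_RE (FS2 = 14.8 assumed from Ref. [2])."""
    return piecewise_linear(big_p, 2.45, 1.6, 14.8)


def fs1_factor(big_p, r=2.0, p_th=2.0):
    """FS1 method relative to delta_RE; uses p_th in the error estimate for P > 1."""
    base = piecewise_linear(big_p, 2.45, 1.6, 6.9)
    return base if big_p <= 1.0 else base * correction_factor(big_p, r, p_th)


def bisect(f, lo, hi, tol=1e-10):
    """Plain bisection; requires a sign change of f on [lo, hi]."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0.0:
        raise ValueError("no sign change on the interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


crossover = bisect(lambda p: fs1_factor(p) - fs_factor(p), 1.05, 1.5)
print(round(crossover, 3))
```

With these assumed forms the crossover falls within a few thousandths of the quoted 1.235; the small difference presumably reflects rounding of the published coefficients.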

The authors’ statistical approach based on many analytical and numerical benchmarks provides a robust framework for developing solution verification methods. The authors welcome additional validation of the FS method and, if necessary, re-calibration and improvement using additional rigorous verification studies with SAB or SNB available. More research is needed to establish the criterion for achieving the asymptotic range along with its use in providing high quality numerical benchmarks.

This study was sponsored by the Office of Naval Research under Grant No. N000141-01-00-1-7, administered by Dr. Patrick Purtell.

Copyright © 2011 by American Society of Mechanical Engineers



Figure 1. Factor of safety for different verification methods with pth=2 and r=2

Figure 2. Actual factor of safety for sample 3, sample 3 averaged using ΔP=0.01, and X¯±tSX¯ for samples 9 to 25: (a) GCIOR method, (b) GCI3 method, and (c) FS1 method

Table 1. Statistics for different ranges of P values using non-averaged actual factor of safety

Table 2. Statistics excluding outliers at seventeen P values

