Assessment of the value of groundwater age time-series for characterizing complex steady-state flow systems using a Bayesian approach

In this work, the effectiveness of transient environmental tracer data in reducing the uncertainty associated with the inference of groundwater residence time distribution was evaluated. A Bayesian Markov Chain Monte Carlo method was used to infer the parameters of presumed residence time distribution forms (exponential and gamma) using concentrations of five tracers: CFC-11, CFC-12, CFC-113, SF6, and 85Kr. The transient tracer concentrations were synthetically generated using the residence time distributions obtained from a model of the Plœmeur aquifer in southern Brittany, France. Several measures of model adequacy, including the Deviance Information Criterion, Bayes factors, and measures based on the deviation of the inferred cumulative residence time distribution from the true one, were used to evaluate the value of groundwater age time-series. Neither of the presumed forms of residence time distribution, exponential or gamma, perfectly represents the simulated true distribution; therefore, the method was not able to show a definitive preference for one over the other in all cases. The results show that using multiple years of tracer data not only reduces the bias of inference (as defined by the difference between the expected value of a metric of inferred residence time distribution and the true value of the same metric), but also helps quantify the uncertainty more realistically. It was found that when one year of data is used, both models could almost perfectly reproduce the observed tracer data, even when the inferred residence time distributions differed substantially from the true one. When the number of years of tracer data is increased to four years, the uncertainty associated with the distribution parameters and the model structural uncertainty increased, as the presumed forms were not able to reproduce all the data accurately. This resulted in a more realistic assessment of model uncertainty due to structural error.
It was also found that regardless of the prescribed age distribution form, the Bayesian method does a better job of capturing the cumulative ages at older ages; however, it is not able to reproduce the early ages well. The ability of the model to capture older ages improves as a greater number of years of tracer data is used, in cases of both presumed exponential and gamma distributions.


Introduction
Environmental tracers have been increasingly used for characterizing groundwater residence time distribution (RTD) (Leray et al., 2012; Massoudieh et al., 2012), and calibrating hydrogeological models (Portniaguine and Solomon, 1998; Reilly et al., 1994; Robertson and Cherry, 1989; Solomon et al., 1992). They characterize the underground flow patterns from recharge to discharge zones and provide an integrated insight into the system. Concentrations of CFCs, SF6, 3H, 39Ar, 14C, He, and 85Kr, for instance, provide essential information over a broad range of temporal scales, from several months to several decades (IAEA, 2006; Newman et al., 2010). The interest in this type of data has promoted the quest for tracers with different atmospheric concentration histories and decay rates, in order to expand the range of ages that can be effectively characterized (Busenberg and Plummer, 2008; Newman et al., 2010; Sturchio et al., 2004). The ultimate goal is to characterize the full RTD beyond some of its moments, which requires more tracers than are often available. The set of environmental tracers that can be used for this purpose is limited. The tracers must be relatively stable with respect to the ages they are used to characterize, and they must not react or undergo adsorption onto the solid matrix (Glynn and Plummer, 2005). Ideally, they should be inert tracers with low or no biodegradability, or with decay rates that are known with high confidence.
Along with seeking additional potential environmental tracers, we assess here the effectiveness of the temporal variation of the concentrations of the commonly used ones. This variation might come from a transient evolution of the flow patterns, such as what occurs when pumping starts (Leray et al., submitted for publication), and from the transient tracer atmospheric concentrations (Troldborg et al., 2008). Time series of environmental tracer concentrations have been used in a limited number of studies, mainly for interpreting field data (Troldborg et al., 2007, three-year period; Long and Putnam, 2009, 16-year period; Osenbrück et al., 2006, five-year period; Sültenfuß et al., 2011, five-year period; Bauer et al., 2001, three-year period). Manning et al. (2012) compiled a 13-year time series of multiple dissolved-gas age tracers from springs to constrain the form of the RTD and concluded that the time series of age tracers is potentially a valuable tool for constraining the age distribution in sampled water. In catchment hydrology, time-series of seasonally variable tracers in precipitation, including 18O, 2H and 3H, have been widely used to estimate transit time distributions using direct deconvolution and indirect methods based on lumped parameter models. A comprehensive review of the techniques used to estimate transit time distributions in catchment hydrology can be found in McGuire and McDonnell (2006). Parametric and non-parametric methods have been used in the past to infer the transfer functions from tracer tests (Luo et al., 2006). Because of inherent limitations of knowledge of the natural flow patterns, although the value of temporal chronicles has been emphasized, it has not been quantitatively evaluated. Temporal tracer data can both further corroborate and scrutinize previous inferences about the forms of age distribution.
Temporal chronicles have been used to help identify the parameters of an a priori known form of the RTD (Ivey et al., 2008). The form of the distribution, however, is itself part of the identification problem. It has also been proposed that the transient variations of tracer age might help predict the general form of the RTD (Zhang, 2004). However, assessment of this idea is difficult, as it requires an independent knowledge of the flow patterns, which is out of reach in most natural sites.
In this study a Bayesian approach is used to evaluate the additional information content of tracer data collected at different times in comparison to the use of samples obtained at a single time. The Bayesian approach allows us to quantify the uncertainty associated with estimation of the parameters of RTDs and therefore can provide an insight into how additional data help reduce these uncertainties. A synthetic aquifer case for which the flow patterns can be both quite complex and perfectly controlled is used for this study. It is common to test methodologies first on synthetic controlled cases to examine their relevance to natural cases. Steady-state flow conditions are considered and the capacity of the same environmental tracer, sampled at different times at the same position, to provide the needed information on the hydrosystem is evaluated. The steady-state assumption is acceptable over the long term if the overall withdrawal or recharge patterns at a site do not substantially change over the study period. Under a steady-state condition, the concentration of a tracer sampled at different dates is controlled solely by its past atmospheric chronicle (Leray et al., submitted for publication). If the historic atmospheric concentrations vary strongly enough and the sampling dates are sufficiently separated, samples of a single tracer taken at multiple times can, in effect, contain the equivalent information content of a number of different tracers. In reality, the same element sampled at different times might not provide as much information as many unrelated tracers, due to the gradual changes in the atmospheric inputs, as well as the noise that results from observation error and other spatial and temporal variability. Nonetheless, it might still be valuable as a complementary tool for evaluating the relevance of classically used RTDs (Haitjema, 1995; Maloszewski and Zuber, 1982; Amin and Campana, 1995) and providing ways to cross-validate the inferred RTDs.

Methodology
Our analysis and methods are organized into four stages: construction of the hydrogeological model; simulation of solute transport and the derivation of the time-dependent environmental tracer concentrations; independent interpretation of these data to obtain estimated RTDs; and the ultimate comparison between the estimated and simulated RTDs. The tracer concentrations used for the analysis were obtained in one case from a hypothetical gamma RTD and in the second case from a simulated RTD from a hydrogeological model of the Ploemeur aquifer in north-western France. The tracer concentrations simulated with the hydrogeological model were treated as observed tracer data for inferring the RTD using several presumed lumped parameter models. Several measures of model sufficiency were used to evaluate and compare the performance of the approach in determining the age distribution, as well as the uncertainties associated with the estimated RTDs when different periods of tracer data are used for the RTD inference.

Simulated aquifer characteristics
The hydrogeological model is based on the Ploemeur aquifer, a highly heterogeneous, hard-rock aquifer located on the south coast of Brittany (France). This aquifer has been exploited for water supply of the nearby city at a rate of $1.1 \times 10^{6}$ m$^3$/year since 1991. This aquifer offers a realistic hydrodynamic context for our methodological study, which is to assess the value of groundwater age time-series in inferring age distributions.
Based on the inversion of gravimetric data (Ruelleu et al., 2010) and observations (Touchard, 1999), the geological conceptual model of the Ploemeur site is composed of two transmissive structures at the kilometric scale, which are embedded in less permeable micaschists. The first transmissive structure, named the contact zone, is a shallow-dipping fractured zone that constitutes the interface between an underlying leucogranite, Ploemeur granite, and the overlying micaschists. The second structure is a North 20° normal fault. The almost impervious Ploemeur granite forms the southern hydraulic barrier and part of the substratum. Another leucogranite, Guidel granite, forms the northern hydraulic barrier. The mean thickness of the system composed of the shallow-dipping structure and the micaschists has been estimated at 180-280 m (Ruelleu et al., 2010) (Fig. 1).
We will use the steady-state model developed by Leray et al. (2012) for which the transmissivity of the contact zone has been constrained both by hydraulic tests and by the mean piezometric level at the pumping well, while the hydraulic conductivity of the micaschists should not be too small to ensure that the area around the pumping does not seep. The hydraulic properties of the different rock units are summarized in Table 1.
The potential recharge rate, R, is assumed to be approximately 200 mm per year, based on previous estimations (Carn, 1990; Leray et al., 2012; Touchard, 1999). The supplying area to the pumping well located in the shallowest part of the aquifer then amounts to a few square kilometers and is limited to a North-South direction by the two granites. The long-term pumping well was found to induce a radial flow all along the fractured contact zone, almost parallel to the dip plane of the contact zone, because of its high conductivity contrast with the overlying micaschists. Flows through the micaschists are vertical near the surface and slightly bend perpendicular to the fractured zone at depth. As the goal of this paper is clearly methodological and detailed information about the porosity field is not available, we used a homogeneous porosity (1%). The calibration of a model would require a more complex representation of the porosity field in case of complex systems such as fractured ones (Cook et al., 2005). However, in our case, the nontrivial shape of the residence time distribution obtained from the synthetic model shows that the model complexity is sufficient to perform our analyses. In particular, the form of the RTD is determined by the aquifer structure. The smaller depth of the aquifer close to the pumping well skews the RTD toward younger ages by locally shortening the flow paths (Leray et al., 2012). The RTD is overall controlled by the superposition and interactions between different reservoirs and circulation scales; i.e., by local as much as global effects. Thus, we do not control the shape of the distribution, nor do we expect a classical reservoir model with an exponential distribution or a dispersion model with an inverse-Gaussian distribution to reproduce these specific features. This is what makes the synthetic example relevant to real cases.

Derivation of environmental tracer concentrations
Solute concentration is classically determined by solving the advection-dispersion equation (Bear, 1991; de Marsily, 1986). In this study, we disregarded local dispersion and diffusion, as we expected that mixing occurs mainly at the pumping well, where sampling is carried out, and is much larger than the mixing due to the diffusion-dispersion processes in this medium (LaBolle and Fogg, 2001). The advection equation is solved backwards in time using a particle-tracking scheme (Wilson, 1999, 2001). A total of $5 \times 10^{6}$ particles are injected at the well according to a flow-weighted distribution. The RTD, $p(t)$, is obtained by simply looking at the distribution of the equally likely travel times that it takes for particles released at the well to arrive at the recharge locations. Mean concentration, $c_{i,j}$, which is the concentration of tracer $i$ in sample $j$ at the pumping well, is determined by the convolution of $p(t)$ with the atmospheric chronicle, $c_{in,i}(t_{w,j} - t)$ (Kreft and Zuber, 1978; Maloszewski and Zuber, 1982):
$$c_{i,j} = \int_0^{\infty} c_{in,i}(t_{w,j} - t)\, e^{-\lambda_i t}\, p(t)\, dt \qquad (1)$$
where $c_{in,i}$ represents the atmospheric concentration of tracer $i$ and $\lambda_i$ is its decay rate. Eq. (1) shows the time dependency of the sampled concentration, which, under the steady-state assumption, comes exclusively from the non-linear variation of the atmospheric concentration of the given environmental tracer (Leray et al., submitted for publication; Zhang, 2004). The steps to infer the residence time distribution, $p(t)$, in this work include first assuming a functional form representing $p(t)$ [i.e., $p(t) = f(t; \phi_1, \phi_2, \ldots)$], where $\phi_1, \phi_2, \ldots$ are the parameters needed to fully determine the distribution, and then estimating the parameters $\phi_1, \phi_2, \ldots$ using a statistical inverse modeling approach.
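The convolution in Eq. (1) can be illustrated with a short numerical sketch. This is a minimal Python example, not the code used in the study: the gamma RTD parameters, the linear input history `linear_input`, and the function names are hypothetical choices made only for illustration, and the integral is truncated at `max_age` and approximated with a midpoint rule.

```python
import math

def gamma_rtd(t, shape, scale):
    """Gamma probability density p(t) for a residence time t (years)."""
    if t <= 0.0:
        return 0.0
    log_p = ((shape - 1.0) * math.log(t) - t / scale
             - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_p)

def tracer_concentration(c_in, t_sample, rtd, decay_rate=0.0,
                         max_age=200.0, dt=0.01):
    """Midpoint-rule approximation of the convolution in Eq. (1):
    c = int_0^max_age c_in(t_sample - t) * exp(-decay_rate * t) * rtd(t) dt.
    """
    n_steps = int(round(max_age / dt))
    total = 0.0
    for k in range(n_steps):
        t = (k + 0.5) * dt  # midpoint of the k-th age bin
        total += c_in(t_sample - t) * math.exp(-decay_rate * t) * rtd(t) * dt
    return total

def linear_input(year):
    """Hypothetical input history: zero before 1950, then a linear rise."""
    return max(0.0, year - 1950.0)

# Assumed RTD for illustration only: gamma with mean age shape * scale = 10 years
rtd = lambda t: gamma_rtd(t, shape=2.0, scale=5.0)

for year in (2003, 2006, 2010):
    print(year, tracer_concentration(linear_input, year, rtd))
```

Sampling the same tracer in different years shifts the window of the input history seen by the RTD, which is why a non-linear input history yields different information at each sampling date.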
Various sources of uncertainty associated with the inferred parameters include observation error due to measurement error and the spatial and temporal variability of tracer concentrations; model structural error (epistemic uncertainty) due to the fact that the actual RTD cannot be represented accurately by any presumed functional form; and the uncertainties associated with environmental factors, such as tracer atmospheric concentration records and parameters affecting the transport of tracers in the aquifer. Assume that an RTD function with given parameters $\phi_1, \phi_2, \ldots$ results in tracer concentrations $C = [c_{i,j}]_{n \times m}$ (a matrix), that the observed tracer concentrations can be represented as $\tilde{C} = [\tilde{c}_{i,j}]_{n \times m}$, and that no prior information is present. The posterior distribution of the parameters then follows from Bayes' theorem (Eq. (2)), where $\Gamma$ is the variance-covariance matrix of the observation error (or a transformation of it), $p(\tilde{C} \mid \phi_1, \phi_2, \ldots, \Gamma)$ is the likelihood function, and $p(\tilde{C})$ is a normalizing factor that is equal to the integral of the numerator over the entire parameter space. Using a log-normally distributed observation error structure, and assuming that the observation error variances for the log-transformed observed concentrations, given the true concentrations of all the tracers, are all the same, the variance-covariance matrix can be written as a scalar value $\sigma$ multiplied by an identity matrix, and the likelihood function takes the form of Eq. (3). Combining Eqs. (2) and (3) gives the posterior distribution of the parameters defining the RTD (Eq. (4)). A Markov Chain Monte Carlo (MCMC) method (Gamerman and Hedibert, 2006; Kaipio and Somersalo, 2004) was used to draw a large number of samples from Eq. (4). A C++ code was written to perform the MCMC simulation. The number of Markov chains can be determined by the user. In the example application presented in the next section, 8 chains were used and 1,000,000 samples resulted in convergence of the MCMC method.
The first 100,000 samples were left out as a ''burn-in'' period. The criterion suggested by Geweke (1992) and Geweke and Tanizaki (2001) was used to evaluate the convergence of the MCMC algorithm. For more details on the Bayesian inference, see Massoudieh et al. (2012). Fig. 2 shows the time evolution of $C_{in}$ for the environmental tracers used in this study: CFC-11, CFC-12, CFC-113, SF6, and 85Kr in the Northern Hemisphere atmosphere (IAEA, 2006). CFCs exhibited an increasing atmospheric concentration from the late 1940s, which stabilized in the mid-1990s. SF6 has been monotonically increasing since the late 1950s. 85Kr began to increase in 1980 and flattened out in 2000. Table 2 summarizes the mean concentration, $C_w$, of the five tracers used in this study and the apparent ages for each tracer in each year. These concentrations are obtained using the RTD $p(t)$ produced by the chosen calibrated flow model (Table 1 and previous section) with a porosity of 1%. They constitute the data time series that covers the eight-year period from 2003 to 2010. This eight-year period is consistent with medium-term planning relative to the duration of site studies and projects. One interesting observation is that for the three tracers with a highly non-linear atmospheric concentration trend in the recent past (i.e., CFC-11, CFC-12 and CFC-113), the apparent ages not only deviate from the known mean age but are also inconsistent from year to year, while the apparent ages are more consistent for SF6 and 85Kr, whose atmospheric concentrations changed approximately linearly in the recent past.
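One form of Eqs. (2)-(4) consistent with the description above is sketched here. The notation is an assumption: $\phi_1, \phi_2, \ldots$ denote the RTD parameters, $\Gamma$ the error covariance, and the scalar multiplying the identity matrix is taken to be the variance $\sigma^2$ of the log-transformed errors, so that $\Gamma = \sigma^2 I$. The prior is taken as constant, so it cancels between the numerator and the normalizing factor.

```latex
% Eq. (2): Bayes' theorem with a constant (non-informative) prior
p(\phi_1, \phi_2, \ldots, \Gamma \mid \tilde{C})
  = \frac{p(\tilde{C} \mid \phi_1, \phi_2, \ldots, \Gamma)}{p(\tilde{C})}

% Eq. (3): i.i.d. log-normal observation error, \Gamma = \sigma^2 I
p(\tilde{C} \mid \phi_1, \phi_2, \ldots, \sigma)
  = \prod_{i=1}^{n} \prod_{j=1}^{m}
    \frac{1}{\sqrt{2\pi}\, \sigma\, \tilde{c}_{i,j}}
    \exp\left[ -\frac{\left( \ln \tilde{c}_{i,j} - \ln c_{i,j} \right)^2}{2 \sigma^2} \right]

% Eq. (4): posterior, up to a constant that does not depend on the parameters
p(\phi_1, \phi_2, \ldots, \sigma \mid \tilde{C})
  \propto \sigma^{-nm}
    \exp\left[ -\frac{1}{2 \sigma^2} \sum_{i=1}^{n} \sum_{j=1}^{m}
      \left( \ln \tilde{c}_{i,j} - \ln c_{i,j} \right)^2 \right]
```

Here $c_{i,j}$ depends on $\phi_1, \phi_2, \ldots$ through Eq. (1), which is how the RTD parameters enter the posterior.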
In order to make the synthetic tracer concentration resemble real-world observations more realistically, a noise that was produced based on a log-normally distributed structure and a standard deviation of 5% was added to each calculated concentration to represent the error due to measurement, heterogeneity, and model structural error.
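The noise model and the sampling step described in this section can be sketched in a few lines. This is a minimal Python illustration under stated assumptions, not the C++ code used in the study: a single chain of a plain random-walk Metropolis sampler stands in for the multi-chain MCMC, the error standard deviation is held fixed at 5%, and the forward model `ramp_exponential` (an exponential RTD with mean age `tau` fed by a hypothetical linear input history starting in 1950) is a toy substitute for Eq. (1). All function names are hypothetical.

```python
import math
import random

def add_lognormal_noise(conc, sigma, rng):
    """Multiplicative log-normal noise: c * exp(e), with e ~ N(0, sigma^2)."""
    return [c * math.exp(rng.gauss(0.0, sigma)) for c in conc]

def log_likelihood(observed, modeled, sigma):
    """Log of an i.i.d. log-normal likelihood; terms that do not depend on
    the model parameters or on sigma are dropped."""
    sse = sum((math.log(o) - math.log(m)) ** 2
              for o, m in zip(observed, modeled))
    return -len(observed) * math.log(sigma) - sse / (2.0 * sigma ** 2)

def metropolis(log_post, start, step, n_samples, rng):
    """Random-walk Metropolis sampler over a list of parameters."""
    current, current_lp = list(start), log_post(start)
    chain = []
    for _ in range(n_samples):
        proposal = [x + rng.gauss(0.0, s) for x, s in zip(current, step)]
        lp = log_post(proposal)
        # Accept with probability min(1, posterior ratio)
        if rng.random() < math.exp(min(0.0, lp - current_lp)):
            current, current_lp = proposal, lp
        chain.append(list(current))
    return chain

def ramp_exponential(years, tau, start_year=1950.0):
    """Toy forward model: exponential RTD (mean age tau) convolved with an
    input that rises linearly from zero at start_year (closed form)."""
    out = []
    for year in years:
        elapsed = year - start_year
        out.append(elapsed - tau * (1.0 - math.exp(-elapsed / tau)))
    return out

rng = random.Random(0)
years = list(range(2003, 2011))             # eight yearly samples
observed = add_lognormal_noise(ramp_exponential(years, 10.0), 0.05, rng)

def log_post(theta):
    """Flat prior on tau > 0, so the log-posterior is the log-likelihood."""
    tau = theta[0]
    if tau <= 0.0:
        return float("-inf")
    return log_likelihood(observed, ramp_exponential(years, tau), 0.05)

chain = metropolis(log_post, start=[5.0], step=[0.5], n_samples=5000, rng=rng)
kept = [s[0] for s in chain[1000:]]         # discard a burn-in period
print("posterior mean of tau:", sum(kept) / len(kept))
```

The retained samples approximate the posterior of the RTD parameter; the same structure extends to more parameters by lengthening `start` and `step`.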

Comparison of model performance and uncertainty
Two measures, the Deviance Information Criterion (DIC) and Bayes factors, were used for evaluating the goodness of different RTD forms. DIC (Spiegelhalter et al., 2002), a measure that is used to compare the goodness of each model structure, takes into account how well a model structure can reproduce the observed data. In its definition, $f$ is a standardizing term that is only a function of the observed data, $\Theta$ represents the parameters, which form a random vector distributed according to the posterior distribution, and $\bar{\Theta}$ is the expected value of the posterior distribution of the parameters. Because $2 \log f(\hat{C})$ is only a function of the observed data, it is not affected by the model structure, and therefore, it becomes irrelevant when comparing different forms of RTD functions using the same data. The DIC was easily calculated using the results of the MCMC calculations. A smaller DIC means a better model.
To compare different age distribution forms in terms of their ability to reproduce the measured data, the Bayes factor (Jeffreys, 1935; Kass and Raftery, 1995) was used. The Bayes factor for comparing two models, $M_1$ and $M_2$, assuming an equal prior probability for the models, is defined as the ratio of the marginal likelihoods of the two models, each obtained by integrating the likelihood times the prior density over the parameters. $B_{12}$ represents the ratio of the odds that model 1 is the true model to the odds that model 2 is the true model, $p(\hat{C} \mid \Theta, M)$ is the likelihood of observing concentrations $\hat{C}$ given model $M$, and $p(\Theta)$ is the prior density of the parameters. The odds of each model being the correct model when the only choices of model structures are models 1 through $m$, where $m$ is the total number of RTD forms being evaluated, can be calculated by normalizing over these $m$ models.
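For reference, the standard DIC of Spiegelhalter et al. (2002), written with the symbols defined above, is $\mathrm{DIC} = 2\,E[D(\Theta)] - D(\bar{\Theta})$ with deviance $D(\Theta) = -2 \log p(\hat{C} \mid \Theta) + 2 \log f(\hat{C})$. The sketch below shows how it could be evaluated from MCMC output; it is a minimal Python illustration, with `dic`, `log_lik`, and the toy data as hypothetical names, and with the standardizing term dropped because it cancels when models are compared on the same data.

```python
def dic(log_lik, samples):
    """Return (DIC, p_D) from posterior samples.

    log_lik: function mapping a parameter list to the log-likelihood.
    samples: list of parameter lists drawn from the posterior.
    The deviance is D(theta) = -2 * log_lik(theta).
    """
    n = len(samples)
    dim = len(samples[0])
    mean_dev = sum(-2.0 * log_lik(s) for s in samples) / n
    theta_bar = [sum(s[k] for s in samples) / n for k in range(dim)]
    dev_at_mean = -2.0 * log_lik(theta_bar)
    p_d = mean_dev - dev_at_mean   # effective number of parameters
    return mean_dev + p_d, p_d

# Toy check: Gaussian likelihood with unit variance and unknown mean
data = [1.0, 2.0, 3.0]

def toy_log_lik(theta):
    return -0.5 * sum((y - theta[0]) ** 2 for y in data)

value, p_d = dic(toy_log_lik, [[1.5], [2.5]])
print(value, p_d)
```

The term `p_d` is the complexity penalty, so a more flexible RTD form is preferred only if it lowers the mean deviance by more than it raises `p_d`.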
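Writing the marginal likelihood of model $M_k$ as $p(\hat{C} \mid M_k) = \int p(\hat{C} \mid \Theta, M_k)\, p(\Theta)\, d\Theta$, the Bayes factor is $B_{12} = p(\hat{C} \mid M_1) / p(\hat{C} \mid M_2)$ and, with equal prior model probabilities, the odds of model $k$ among $m$ candidates are $I_k = p(\hat{C} \mid M_k) / \sum_{j=1}^{m} p(\hat{C} \mid M_j)$. The following minimal Python sketch applies these two formulas to log marginal likelihoods that are assumed to have been estimated already (for instance from MCMC output); the function names and the numbers in the example are hypothetical.

```python
import math

def bayes_factor(log_evidence_1, log_evidence_2):
    """B_12 from the log marginal likelihoods of models 1 and 2."""
    return math.exp(log_evidence_1 - log_evidence_2)

def model_odds(log_evidences):
    """Odds I_k of each model, assuming equal prior model probabilities.

    Subtracting the largest log-evidence before exponentiating avoids
    numerical underflow without changing the normalized result.
    """
    top = max(log_evidences)
    weights = [math.exp(le - top) for le in log_evidences]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical log marginal likelihoods for three candidate RTD forms
log_ev = [-120.0, -122.5, -125.0]
print(bayes_factor(log_ev[0], log_ev[1]))
print(model_odds(log_ev))
```

Working in log space matters in practice because marginal likelihoods of multi-year tracer data sets are typically far too small to represent directly.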

Hypothetical gamma residence time
To evaluate the effect of transient tracer information on reduction of the uncertainty associated with inferred RTDs when there is no model structural error, the method was applied first to a case where the concentrations of four tracers (CFC-11, CFC-12, CFC-113, and SF6) were obtained based on a known gamma residence time distribution, using Eq. (1), for eight consecutive years, 2003-2010. The gamma distribution was considered because it is relatively simple, being determined by only two parameters, while it also has the flexibility to reproduce the skewness and tailing observed in commonly seen breakthrough curves. A 5% randomly generated observation error based on a log-normal and multiplicative error structure was added to each calculated tracer concentration to represent measurement and sampling error and the error due to spatial variability. The Bayesian approach was then used to estimate the parameters defining a gamma, an inverse Gaussian (dispersion model) and a log-normal distribution for cases where one, two, four, and eight years of tracer data were used. These three distributions were used to evaluate the capability of multiple years of data to identify the right form of distribution by discriminating between the correct (gamma) and incorrect (inverse Gaussian and log-normal) distributions. Fig. 3 shows the presumed (true) RTD (solid line) and 60 samples of inferred plausible RTDs when one, two, four, and eight years of data were used, while assuming gamma and log-normal distributions. As can be seen, as the number of years of tracer data used for inferring the RTDs increases, the uncertainty associated with the distribution parameters decreases.
In order to quantify the uncertainty associated with the inferred RTDs, a measure representing the spread of the posterior probability distribution of a scalar measure extracted from the RTD around the true value can be used:
$$d = \left[ E\left( n - n_t \right)^2 \right]^{1/2}$$
where $n$ is a random variable representing the scalar measure extracted from the RTD, and $n_t$ is the true value of $n$, which is known if the true RTD is known. Any single scalar quantity extracted from the RTD (e.g., mean, median, cumulative distribution at a certain age) can be considered to represent $n$. Here, we define $n$ as the fraction of water younger than a certain age $L$; therefore, $n_L = \int_0^L p(t)\, dt$. In addition, the bias of estimation can be calculated as $b_L = E(n_L) - n_t$. Using the samples drawn from the posterior distribution obtained from the MCMC algorithm, the values of the uncertainty measure, $d$, and the bias measure, $b_L$, can be calculated easily. Fig. 4 shows the 95% credible intervals of modeled concentrations of the four tracers vs. the ''observed'' concentrations for the two cases where two years and eight years of data were used for parameter estimation of RTDs using presumed gamma and log-normal distributions. The results for the inverse-Gaussian distribution are not shown, to keep the figures readable. As can be seen, as the number of years used for parameter estimation increases, the ability of the inverse-Gaussian model to reproduce the data declines compared to the gamma distribution. Table 3 shows the Bayes factors, the uncertainty measure $d_5 = \left[ E\left( n_5 - n_{t,5} \right)^2 \right]^{1/2}$, and the bias $b_5$. The subscript 5 indicates that these measures were calculated using the cumulative RTD at 5 years. A smaller $d_5$ means that a smaller uncertainty is quantified for the cumulative age distribution at 5 years for the three presumed RTD forms. $I_k$ is the odds of model $k$ being the correct one if the only possible models were the ones considered (i.e., gamma, inverse-Gaussian and log-normal).
When one and two years of tracer data are used, the method is not able to definitively identify the right model, as the odds of each model being the true one are not significantly different. However, when four years of tracer data are used, odds of 90% are assigned to the correct (gamma) model, and when eight years of tracer data are used, the correct model is definitively selected. As expected, when the correct form of RTD is assumed (gamma), using samples from a larger number of years of tracer data results in a smaller uncertainty and a smaller bias of estimation. However, when the correct form of RTD is not used, using a larger number of years does not result in an improvement in uncertainty and bias. This is an important conclusion, as in practical cases, the form of RTD is not known and there is a high probability that the simple forms selected do not match the real distribution. In these cases, the uncertainty and bias of inference do not improve when using samples from multiple years. The implication of this outcome is that additional data can affirm or dispute the presumed form of distribution. The Bayes factors show the odds of each model structure being the correct form when the only choices are the ones considered (in this case, gamma, inverse Gaussian and log-normal). What can be learned from this experiment is that when one form of RTD describes the tracer data substantially better than other forms do, using transient data can strengthen the confidence in the right form and more definitively reject the inappropriate forms.
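Given posterior samples, the uncertainty and bias measures reduce to simple averages. The sketch below is a minimal Python illustration with hypothetical names and numbers: it assumes an exponential RTD, for which the fraction of water younger than age $L$ is $1 - e^{-L/\tau}$, and a short list of mean ages `taus` standing in for MCMC output.

```python
import math

def uncertainty_and_bias(samples, true_value):
    """Return (d, b): d = sqrt(E[(n - n_t)^2]) and b = E[n] - n_t,
    with expectations taken as averages over posterior samples of n."""
    count = len(samples)
    mean = sum(samples) / count
    d = math.sqrt(sum((x - true_value) ** 2 for x in samples) / count)
    b = mean - true_value
    return d, b

def young_fraction_exponential(tau, age_limit):
    """Fraction of water younger than age_limit for an exponential RTD
    with mean age tau: integral of p(t) from 0 to age_limit."""
    return 1.0 - math.exp(-age_limit / tau)

# Hypothetical posterior samples of the mean age (years) and a "true" value
taus = [8.0, 9.0, 10.0, 11.0, 12.0]
true_tau = 10.0
n5_samples = [young_fraction_exponential(tau, 5.0) for tau in taus]
n5_true = young_fraction_exponential(true_tau, 5.0)

d5, b5 = uncertainty_and_bias(n5_samples, n5_true)
print("d_5 =", d5, "b_5 =", b5)
```

Because $d^2$ equals the posterior variance of $n$ plus $b^2$, a small $d$ requires both a narrow posterior and a small bias.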

Simulated residence time distribution from the conceptual flow model based on the Ploemeur aquifer
Gamma and exponential models were applied to the tracer concentrations calculated using the simulated RTD for the Ploemeur aquifer. Here, the simulated RTD is deemed the ''true'' RTD, and the tracer concentrations calculated from it are listed in Table 2. These calculated tracer concentrations are considered observed tracer concentrations and are referred to as such henceforth. All the tracers were considered to be conservative (non-decaying) except 85Kr, for which a decay rate of 0.0641 yr$^{-1}$ (equivalent to a half-life of 10.8 years) was considered. In order to compare the effect of one additional tracer against using multiple years, the analysis was performed once using all five tracers and once using only four of the tracers, excluding 85Kr. A random 5% log-normal and multiplicative error was added to each of the calculated tracer concentrations to represent measurement error and the effect of temporal and spatial heterogeneities on measured concentrations. Fig. 5 shows the ''true'' RTD and 60 samples produced based on random realizations generated from the posterior distributions of the RTD parameters, shown using gray lines. The 60 realizations can be thought of as plausible RTDs inferred by the algorithm from the observed tracer concentrations. It can be seen that neither the gamma nor the exponential distribution is able to reproduce the RTD over the full time range. The correct estimate at older ages (those over three years) corresponds with a poor match of the early parts of the ''true'' RTD (ages less than three years). More precisely, the very early part of the distribution, for ages under half a year, is generally overestimated, while the intermediate part, around one year, is underestimated. The shapes of the gamma and exponential distributions are too simple to represent the early part of the ''true'' distribution.
Table 3. Bayes factors, uncertainty and bias of RTD inference for the hypothetical case using gamma and exponential forms.
Fig. 5. Simulated age distribution using the inverse particle tracking approach for the Ploemeur aquifer and 60 realizations from the inferred plausible age distributions based on five tracers: CFC-11, CFC-12, CFC-113, SF6 and 85Kr. The top row represents the results when a gamma RTD is assumed and the bottom row shows the results when the RTD is assumed to be exponential. The number of years used for inference of the RTD is one in panels a and d (left), four in panels b and e (middle) and eight in panels c and f (right).
Fig. 6. Simulated age distribution using the inverse particle tracking approach for the Ploemeur aquifer and 60 realizations from the inferred plausible age distributions based on four tracers: CFC-11, CFC-12, CFC-113, and SF6. The top row represents the results when a gamma RTD is assumed and the bottom row shows the results when the RTD is assumed to be exponential. The number of years used for inference of the RTD is one in panels a and d (left), four in panels b and e (middle) and eight in panels c and f (right).

Adequacy of analytical age distributions
In the gamma distribution, the large dispersion of distributions at early ages also shows that the early part of the distribution is not very sensitive to the tested tracers, and that the information content of the environmental tracers is limited at very early ages. Differences between the ''true'' distribution and the analytical approximation come from the topographical and geological details close to the well. First, the well pumps water at depth, so a minimum time is necessary for the water to reach the well from the surface. No such constraint is taken into account in the gamma and exponential distributions. Second, the shape of the ''true'' distribution between zero and three years is sensitive to the details of the topography. More details about the effects of near-well topography on the distribution of the very young (<3 years) groundwater are provided in Leray et al. (2012). These details produce the irregular shape of the distribution, and they cannot be homogenized into a smooth curve as they are at later times, which concern a much larger part of the domain. In fact, the part of the RTD greater than three years is captured much better using both the gamma and exponential forms. The fact that the estimated distributions cannot be used for the very early ages does not preclude their applicability at later ages. In both cases of four and five tracers, the uncertainty associated with the earlier parts of the distributions increases as the number of years of data used is increased from one to four, and then it slightly decreases (Fig. 6).
This may appear counter-intuitive, as more data would seem to reduce the uncertainty. However, the reason for the increase in uncertainty when the number of years of data increases from one to four can be explained better by looking at Fig. 7, which shows the ''observed'' tracer concentrations and the 95% credible intervals of posterior concentrations.

Information content of environmental tracer concentrations
The credible intervals of tracer concentrations when using the four-year time series are about twice as large as those obtained when using a single year (Fig. 7, middle line compared to bottom line). When four years of data are used, the model is more constrained and the observation and model structural errors are reflected more appropriately in the uncertainty associated with the posterior parameters. The multiplication of the data introduces some possible tradeoffs, with some models possibly being better than others at some ages and worse at other ages. It does not mean that the models are more uncertain, but that the uncertainty of the model is better constrained.
Another observation from Fig. 7 is that the temporal trends in the concentrations of some of the tracers provide more information about the RTD than those of other tracers. For example, the temporal pattern of the concentration of 85Kr is completely disrupted by the addition of the 5% random noise, while the temporal trend is preserved for most of the other tracers. This is due to the fact that the historical atmospheric 85Kr concentrations have been relatively flat in recent years (Fig. 2). Conversely, in the case of SF6, and to a lesser degree, CFCs, evaluating the 95% credible intervals of tracer concentrations for different years shows that the temporal variations in input concentrations have a statistically significant impact on the observed concentrations at the well. The smaller uncertainty associated with the posterior SF6 compared to the CFCs can be attributed to the fact that observed SF6 concentrations varied monotonically by year for the past eight years, while the variation in CFCs was not strictly monotonic.
Table 4. Measures of goodness of models, including the normalized Bayes factor and DIC, and measures based on the ability of the method to predict the cumulative residence time at 10 years for exponential and gamma RTDs as a result of using four and five tracers and one, four and eight years of tracer data.
Various measures of adequacy of the models are summarized in Table 4. The method does not seem to provide a definitive conclusion about the forms of RTDs. When four tracers are used, the method tends to assign a larger likelihood (although not substantially) to the gamma RTD (except in the case of four years of data), and when five tracers are used, the exponential and gamma models are given roughly equal odds. This can be due to the fact that neither the gamma nor the exponential model perfectly represents the ''true'' RTD, and neither model is able to fit all parts of it perfectly, especially the early part, as seen before.
The DIC values also show that the discrimination between the models becomes smaller as more data are used. When four tracers are used, the exponential model is preferred in terms of DIC in all cases, while when five tracers are used, the gamma model is always better. Because DIC explicitly accounts for model complexity in determining the goodness of fit, this can be interpreted as meaning that four tracers are not adequate for characterizing the more complex gamma distribution, but when five tracers are used, the gamma distribution can be better determined.
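For reference, DIC is commonly computed from MCMC output as the posterior mean deviance plus an effective number of parameters. The sketch below assumes that standard definition, in which the complexity term is the mean deviance minus the deviance evaluated at the posterior mean parameters. Whether this exact variant was used here is an assumption, and the function `dic` and its arguments are hypothetical.

```python
def dic(posterior_samples, log_likelihood):
    """Return (DIC, p_D) from MCMC samples.

    posterior_samples: list of equal-length parameter tuples (assumed layout).
    log_likelihood: callable mapping a parameter tuple to the log-likelihood.
    """
    if not posterior_samples:
        raise ValueError("need at least one posterior sample")
    n = len(posterior_samples)
    # Deviance D(theta) = -2 log L(theta) for every posterior sample.
    deviances = [-2.0 * log_likelihood(theta) for theta in posterior_samples]
    mean_deviance = sum(deviances) / n
    # Posterior mean of each parameter.
    n_params = len(posterior_samples[0])
    theta_bar = tuple(
        sum(sample[i] for sample in posterior_samples) / n
        for i in range(n_params)
    )
    deviance_at_mean = -2.0 * log_likelihood(theta_bar)
    # Effective number of parameters (complexity penalty).
    p_d = mean_deviance - deviance_at_mean
    return mean_deviance + p_d, p_d
```

Under this definition a lower DIC indicates the preferred model, and the penalty term `p_d` is what allows a two-parameter gamma RTD to be ranked below a one-parameter exponential RTD when the extra parameter is poorly constrained by the data.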
It is also important to note that 85Kr, the tracer added in the five-tracer case, contains more information about the younger ages because of its relatively short half-life.
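The effect of radioactive decay on the age information carried by 85Kr can be illustrated with a short sketch. It assumes a half-life of about 10.76 years for 85Kr, a literature value that is not stated in this section, and the function name is hypothetical.

```python
# Assumed literature value for the 85Kr half-life (years); not given in the text.
KR85_HALF_LIFE_YEARS = 10.76


def remaining_fraction(age_years, half_life_years=KR85_HALF_LIFE_YEARS):
    """Fraction of the initial tracer activity left after `age_years` of decay."""
    if age_years < 0:
        raise ValueError("age must be non-negative")
    if half_life_years <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (age_years / half_life_years)
```

With this assumed half-life, water that is 30 years old retains only about 14% of its initial 85Kr activity, so the measured signal is dominated by the younger fraction of the mixture.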

Prediction capacity of analytical age distributions
The uncertainty measure, d10, and the bias, b10, decrease sharply as the number of years of data is increased (Table 4). Uncertainty d10 is reduced by a factor of 3-7 when using eight years of data rather than one year, while b10 is reduced by a factor of 3 to almost 100, depending on the number of tracers and distribution type (exponential or gamma). The reduction is systematic, except for the exponential model when going from four to eight years of data. The distribution of the fraction of water with age greater than ten years, n10, illustrates the value of the age data time series more clearly (Figs. 8 and 9). Using only one year of data, the distribution of n10 is both biased and uncertain, to the point that the ''true'' value is given a very low probability. With four years of data, the bias is significantly reduced. With eight years, the uncertainty is further reduced. Several years of data thus reduce both the bias and the uncertainty. The good predictions of n10 are also confirmed by the small values of bias b10 and uncertainty d10 for the eight-year case, except for the exponential case without 85Kr (Table 4, first row). This highlights the effectiveness of the age data time series for predicting the relatively ''older'' water age fraction.
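A minimal sketch of how the ten-year fraction and its bias and spread could be derived from posterior samples is given below for the exponential RTD, for which the fraction of water older than a given age is exp(-age / mean residence time). The bias follows the definition used in this work (expected value of the inferred metric minus its true value). Taking the spread as the posterior standard deviation is an assumption, since the exact uncertainty measure is not defined in this section, and all function names are hypothetical.

```python
import math


def fraction_older_than(age_years, mean_residence_time):
    """Fraction of water older than `age_years` for an exponential RTD."""
    if mean_residence_time <= 0:
        raise ValueError("mean residence time must be positive")
    return math.exp(-age_years / mean_residence_time)


def bias_and_spread(posterior_mean_times, true_fraction, age_years=10.0):
    """Bias and spread of the inferred fraction older than `age_years`.

    posterior_mean_times: MCMC samples of the mean residence time (years).
    true_fraction: the same fraction computed from the reference RTD.
    """
    if not posterior_mean_times:
        raise ValueError("need at least one posterior sample")
    fractions = [fraction_older_than(age_years, tau) for tau in posterior_mean_times]
    mean_fraction = sum(fractions) / len(fractions)
    # Bias: expected inferred value minus the true value.
    bias = mean_fraction - true_fraction
    # Spread: population standard deviation of the posterior (assumed measure).
    variance = sum((f - mean_fraction) ** 2 for f in fractions) / len(fractions)
    return bias, math.sqrt(variance)
```

A posterior that is narrow but centered far from the true fraction gives a small spread and a large bias, which is the situation described above for the one-year case.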
The very small influence of the distribution type on the predictions is also noteworthy. Whether the exponential or the gamma distribution is used barely affects the distributions of n10, as predictions are much more sensitive to the fraction of ages greater than 10 years than to the distribution shape. This might be explained by the close agreement of the two distribution shapes at residence times longer than ten years (Fig. 5). Depending on the targeted application, the shape of the distribution might not be an essential factor for the predictions.

Conclusion
In this paper, the value of age data time series for the inference of groundwater age distribution was studied. For this purpose, a hypothetical gamma age distribution and a simulated age distribution obtained using an aquifer model of the Ploemeur site were used to generate synthetic tracer concentrations for five tracers (three CFCs, SF6, and 85Kr) at the end of eight consecutive years ending in 2010. Bayesian inference by means of MCMC was used to infer the parameters of presumed mathematical age distribution forms, including exponential and gamma, when variable numbers of years of data were used. In addition, the effect of excluding 85Kr on the inference uncertainty was evaluated by repeating the analysis without 85Kr. Using DIC, the main conclusions that can be drawn from this study are:
- In the case where the tracer data were obtained using the hypothetical age distribution, using a larger number of data results in a much more pronounced preference for the right RTD form in terms of DIC.
- The two mathematical RTD forms considered here (exponential and gamma) are too simple to capture the early part of the true RTDs. The earlier ages cannot be captured well because they depend on local, non-homogeneous aquifer characteristics in the proximity of the pumping well.
- The actual bias and uncertainty of the inferred age distributions, based on comparing the cumulative ten-year ages (n10), generally decrease as the number of years of tracer data increases. When five tracers were used, both bias and uncertainty decreased when a larger number of years of data was used. When four tracers were used, one year of data resulted in a smaller spread for n10 but a substantially larger bias than four years of data, which could suggest unwarranted confidence in the outcome when the true RTD is unknown. This means that more data does not necessarily lead to a reduction in the uncertainty as expressed by the spread of the parameters, but it leads to a better assessment of the uncertainty. This is especially true when the true RTD cannot be fully captured by the assumed mathematical forms.
- In the case of the simulated RTD, the method cannot definitively pick one of the RTD forms over the other. However, based on the DIC measure, when the additional tracer, 85Kr, is included in the analysis, the more complex form (gamma) is given higher odds, while the exponential form is given preference when only four tracers are used.
- Regardless of the distribution shape, the reliability of the inferred RTD for predicting the ten-year cumulative age improves as more years of data are used, and the uncertainties and biases associated with the gamma and exponential models are close to each other.
- Shape-free RTDs remove the need to presume mathematical forms that restrict the age distribution to a limited range of shapes. However, because of their larger number of degrees of freedom, they require a greater number of tracers to be constrained. Transient tracer information is a promising way to provide the amount of data needed for such an approach.

Fig. 9. True fraction of water with age greater than 10 years (n10) and the inferred posterior distribution of n10 obtained using four tracers (CFC-11, CFC-12, CFC-113, and SF6) when one (left), four (middle), and eight (right) years of tracer data are used, assuming gamma (top) and exponential (bottom) RTDs.