Inverting passive margin stratigraphy for marine sediment transport dynamics over geologic time

Passive margin stratigraphy contains time‐integrated records of landscapes that have long since vanished. Quantitatively reading the stratigraphic record using coupled landscape evolution and stratigraphic forward models (SFMs) is a promising approach to extracting information about landscape history. However, there is no consensus about the optimal form of simple SFMs because there has been a lack of direct tests against observed stratigraphy in well‐constrained test cases. Specifically, the extent to which SFM behaviour over geologic space and timescales should be governed by local (downslope sediment flux depends only on local slope) versus nonlocal (sediment flux depends on factors other than local slope, such as the history of slopes experienced along a transport pathway) processes is currently unclear. Here, we develop a nonlocal, nonlinear SFM that incorporates slope bypass and long‐distance sediment transport, both of which have been previously identified as important model components but not thoroughly tested. Our model collapses to the local, linear model under certain parameterizations such that best‐fit parameter values can indicate optimal model structure. Comparing 2‐D implementations of both models against seven detailed seismic sections from the Southeast Atlantic Margin, we invert the stratigraphic data for best‐fit model parameter values and demonstrate that best‐fit parameterizations are not compatible with the local, linear diffusion model. Fitting observed stratigraphy requires parameter values consistent with important contributions from slope bypass and long‐distance transport processes. The nonlocal, nonlinear model yields improved fits to the data regardless of whether the model is compared against only the modern bathymetric surface or the full set of seismic reflectors identified in the data. Results suggest that processes of sediment bypass and long‐distance transport are required to model realistic passive margin stratigraphy and are therefore important to consider when inverting the stratigraphic record to infer past perturbations to source regions.


| INTRODUCTION
Reconstructing landscape evolution trajectories-and the environmental boundary conditions that governed them-from the geologic past is a key goal in geomorphology. Such reconstructions are challenging because erosion processes continually destroy past topography, leaving only minor traces of ancient landscapes (e.g., river terraces; Molnar et al., 1994;Schanz et al., 2018;Yuan et al., 2022) from which to deduce past landscape boundary conditions. Fortunately, every source has its sink; all sediment eroded from a terrestrial drainage basin must go somewhere. The sedimentary record, in regions where it is preserved and where there exists plausible long-term connectivity between source and sink, therefore represents our best hope of inferring time-resolved records of landscape change and its tectonic and climatic drivers with reasonable accuracy and precision. One geologic setting with particularly high potential for the preservation of relatively complete records of terrestrial erosion is marine passive margin basins, which contain Earth's most complete archives of sediment sourced from adjacent, eroding terrestrial environments (e.g., Allen & Allen, 2013;Steckler et al., 1988).
Passive margin stratigraphy can, under the right conditions, be used to reconstruct past tectonic and climatic perturbations to Earth's surface (e.g., Baby et al., 2018;Ding, Salles, Flament, Mallard, et al., 2019;Pazzaglia & Brandon, 1996;Poag, 1992;Poag & Sevon, 1989). Whilst the stratigraphic record can suffer from signal buffering, stratigraphic incompleteness and signal shredding (e.g., Jerolmack & Paola, 2010;Sadler, 1981;Straub et al., 2020), the variability that leads to these effects is thought to yield average behaviour that can be predicted at passive margin evolution timescales (tens to hundreds of Ma). Passive margin stratigraphy may reflect large-scale, long-lasting perturbations to landscapes provided that those perturbations have amplitudes and durations that exceed the background level of 'noise' in the sedimentary system (Straub et al., 2020). Historically, efforts to read the stratigraphic record of passive margins have focused on the study of sediment thickness, volume, texture, lithological/ mineralogical make-up and chemistry, yielding interpretations about past terrestrial erosion dynamics (e.g., Poag & Sevon, 1989). As numerical stratigraphic forward models (SFMs) became more common (e.g., Burgess, 2012;Burgess et al., 2006;Granjeon & Joseph, 1999;Steckler et al., 1993Steckler et al., , 1996Syvitski & Hutton, 2001), stratigraphic modellers began to use inverse techniques to extract environmental forcing information from forward simulation of the stratigraphic record (e.g., Bornholdt et al., 1999;Bornholdt & Westphal, 1998;Cross & Lessenger, 1999;Imhof & Sharma, 2006Lessenger & Cross, 1996;Falivene et al., 2014;Zhang et al., 2021). The great potential of that record for revealing past landscape evolution has led to efforts to couple landscape evolution models (LEMs) and SFMs (e.g., Ding, Salles, Flament, Mallard, et al., 2019;Granjeon & Joseph, 1999;Salles, 2019;Salles et al., 2018;Salles & Hardiman, 2016;Zhang et al., 2020) to build full source-to-sink models, and in some cases to use large ensembles of those models to directly invert observed stratigraphy for terrestrial erosion dynamics (e.g., . The idea underpinning such inversions is that misfit between observed and modelled stratigraphy can be minimized to reveal best-fit values for relevant forcing parameters such as rock uplift rate, assuming that the model is an accurate representation of erosion, transport and deposition processes integrated over geologic time. Many previous efforts focused on margin spatial scales and ca. 100 Ma timescales have used an approach in which marine sediment transport is conceptualized as being linearly dependent on local bathymetric slope, which when combined with mass conservation yields a linear diffusion-like model (e.g., Braun et al., 2013;Kenyon & Turcotte, 1985;Moretti & Turcotte, 1985;Paola, 2000;Rivenaes, 1992Rivenaes, , 1997Ross et al., 1994;Rouby et al., 2013;Zhang et al., 2020). However, this approach might not be capable of producing large-scale stratal geometries that agree with observations. In the stratigraphy of many passive margin basins, we observe substantial accumulations of sediment hundreds of kilometres from shore on the continental rise and abyssal plain that must have bypassed the higher gradient continental slope (Lowe, 1976;Syvitski et al., 1988) and then been transported long distances over negligible slopes on the basin floor (Hereema et al., 2020;Luchi et al., 2018;Talling et al., 2012;Wynn et al., 2002).
The sole dependence of sediment flux on local slope neglects both sediment transport over very low slopes and the potential influence of nonlocal transport processes, | 2113 EAGE SHOBE et al. or those processes for which the distribution of sediment travel distances is heavy tailed such that some sediment moves long distances relative to the scale of the model grid (e.g., Foufoula-Georgiou et al., 2010). Transport processes are especially likely to deviate from local slope-dependent behaviour when sediment particles are fine enough to be suspended in the water column as observed in turbidity currents and other marine mass flows (e.g., Mohrig et al., 1998;Parker et al., 1986). In a nonlocal conceptualization of downslope sediment transport, erosion or deposition at a point has some dependence on surface slope elsewhere (Doane et al., 2018;Furbish & Roering, 2013). Nonlocal processes like sediment plumes from river mouths, turbidity currents, marine landslides and debris flows are responsible for much of the long-distance transport observed along passive margins and are therefore relevant for any model that seeks to simulate passive margin stratigraphy. Such processes and deposits may not be fully consistent with the assumptions or predictions of local, linear transport models because they may require nonlocal and/or nonlinear conceptualizations of sediment transport dynamics.
Stratigraphic forward modelling studies have moved beyond local, linear diffusion models to incorporate nonlocal sediment transport dynamics with varying degrees of complexity (e.g., Ding, Salles, Flament, Mallard, et al., 2019;Falivene et al., 2019;Granjeon, 2014;Granjeon & Joseph, 1999;Harris et al., 2016;Sømme et al., 2009;Syvitski & Hutton, 2001). However, the extent to which nonlocality should play a role in large-scale SFMs remains unclear, as previous comparisons between local and nonlocal transport formulations have not always revealed clear differences (Granjeon, 2014), and few studies have focused on the deep, distal portions of margins where nonlocal process dynamics may contribute most to shaping margin form. Whilst substantial effort has been devoted to parameterizing large-scale terrestrial landscape evolution models (e.g., Barnhart et al., 2019;Barnhart et al., 2020aBarnhart et al., , 2020bBarnhart et al., , 2020cGuerit et al., 2019;Yanites et al., 2018) to test how well they predict landscape form (e.g., Barnhart et al., 2020b;DiBiase & Whipple, 2011;Hobley et al., 2011;Valla et al., 2010;van der Beek & Bishop, 2003), the same is not true of SFMs. The mathematical form of simple, long-term/large-scale seascape evolution models that best represents the development of passive margin stratigraphy is currently an open question.
Here, we test a generalized two-dimensional (2-D) SFM that moves beyond local, linear diffusion by incorporating, as suggested by previous work, sediment transport dynamics that allow sediment to bypass steep slopes and travel beyond the base of the continental slope. Our approach is intended not to simulate such processes explicitly, but to model their integrated effects over geologic time. We test the relative applicability of this nonlocal model and the local, linear model by quantitative comparison against seismic stratigraphic data from well-studied passive margin basins along the Southeast Atlantic Margin (SAM), Southern Africa. Results from model-data comparison indicate that, at least over ca. 100 Ma timescales, passive margin seascape evolution and the development of marine stratigraphy are most consistent with a model that incorporates nonlocal and nonlinear transport dynamics. This indicates that passive margin evolution may be dominated by nonlocal, nonlinear sediment transport processes that may be critical ingredients in models used to invert passive margin stratigraphy for past environmental boundary conditions.

| Model dimensionality
Below, we cast the local, linear and nonlocal, nonlinear models in a form that, by convention in the SFM literature (and in contrast to conventions governing LEMs), is referred to as 2-D because any point in the model grid can be uniquely specified by a horizontal and a vertical coordinate. This choice is essential to keep our model evaluation exercise tractable and interpretable given the available stratigraphic data, but it is important to note that fully 3-D SFMs are routinely used (e.g., Falivene et al., 2020;Zhang et al., 2021) and in some cases allow development of preferential nonlocal sediment transport pathways (e.g., submarine canyons) that the models we test here can only claim to represent on average over geologic time (e.g., Granjeon, 2014).

| The local, linear diffusion model
The simplest and longest standing approach to modelling seascape evolution (and therefore the way, by tracking the bathymetric surface through time, of modelling marine stratigraphy) is to use an analogy to the heat equation that yields a linear diffusion equation where elevation z is the variable 'diffusing' over time and where the gradient driving diffusion is the bathymetric slope z x (Kenyon & Turcotte, 1985;Ross et al., 1994). The downslope sediment flux per unit contour length q s goes linearly with local slope (S = z x for simplicity): (1) q s = − K d S, and the divergence of sediment flux sets the rate of bathymetric change: Here, K d [L 2 /T] is a transport coefficient that governs the rate of bathymetric diffusion. The key assumption in this approach is that downslope sediment flux goes linearly with the local slope, such that no variables beyond K d and bathymetry influence the rate of seascape evolution.
There is no clear physical basis for such a slopedependent diffusion equation at low slopes (i.e., on the continental shelf) and shallow water depths (see Paola, 2000 for a review), and an ad hoc solution has been to assert that the diffusion rate constant declines with water depth d (e.g., Kaufman et al., 1992;van Balen et al., 1995) as wave-and storm-driven bed shear stresses are reduced: Here, K d 0 is the diffusion rate constant at the water surface (d = 0) and d * is the e-folding depth scale that governs the decline in K d with depth below the water surface. When d * is small relative to the total basin depth (i.e., when there are substantial declines in sediment transport efficiency with depth), the linear diffusion approach yields morphologies analogous to continental shelves, shelf breaks and steeper continental slopes. Similar results are achieved by asserting that terrestrial sediment fluxes deposit at a fixed slope when they reach the shoreline and then become subject to marine sediment transport by linear diffusion . Linear diffusion models, with or without modifications in the shallow environment, deliver little sediment beyond the base of the continental slope because the governing equation asserts that the downslope sediment flux approaches zero as the local slope approaches zero.
The inconsistency of local, linear diffusion models with observations of nonlocal transport and long-distance sedimentation has long been noted (e.g., Syvitski et al., 1988) and has motivated model modifications such as adding advective components of sediment transport (Niedoroda et al., 1995;Pirmez et al., 1998;Thran et al., 2020), allowing sediment bypass on slopes above some angle (e.g., Lowe, 1976;Ross et al., 1994;Syvitski et al., 1988;Thran et al., 2020), and enforcing that only some (potentially slope dependent) proportion of the sediment flux may be deposited at any given point, with the rest being routed downslope (Ding, Salles, Flament, Mallard, et al., 2019;Thran et al., 2020). There are also several higher complexity, 3-D SFMs that incorporate nonlocal transport by explicitly simulating advective processes (e.g., Falivene et al., 2019;Granjeon, 2014;Granjeon & Joseph, 1999). Here, we generalize ideas from existing SFMs, as well as recent advances from terrestrial landscape evolution modelling, into a simple SFM that incorporates two key modifications to account for both transport over low slopes and nonlocal transport.

| A modified seascape evolution model
The modified model is a generalization of existing ideas for how seascape evolution might deviate from the local, linear model that (1) is simple enough to be applied over basin-filling timescales, (2) is parsimonious enough to allow iterative calibration of all parameters, and (3) collapses under certain parameter values to the local, linear model. The model is most intuitively cast in terms of a balance between the volumetric entrainment rate per unit bed area E and volumetric deposition rate per unit bed area D (e.g., Beaumont et al., 1992;Braun, 2021;Campforts et al., 2020;Carretier et al., 2016;Davy & Lague, 2009;Kooi & Beaumont, 1994;Shobe et al., 2017;van Balen et al., 1995;. The statement of mass conservation that governs the change in bathymetry at a point is: This framework is convenient because both of the models we propose to compare-the local, linear model and the nonlocal, nonlinear model-can be represented by altering the functional forms of E and D. As shown by Carretier et al. (2016), assuming that the entrainment rate is linearly proportional to the local slope S: that K e is an entrainment rate constant or erodibility [L/T], and that the deposition rate is the volumetric sediment flux per unit width q s over the model grid cell spacing dx: yields the local, linear model with behaviour identical to Equation 2. Its two key assumptions are that sediment entrainment depends only on local slope and that the deposition rate depends only on the downslope sediment flux.
The nonlocal, nonlinear model uses Equation 5 to calculate sediment entrainment but makes two key modifications to Equation 6 inspired by observations from passive margin depositional systems. These are intended to allow (1) a nonlinear dependence of sediment transport on local slope to account for the transition to mass failures and turbidity currents at higher slopes as well as sediment bypass on slopes unable to sustain further steepening beyond some critical slope at which frequent failures are generated, and (2) transport of sediment over negligible slopes as observed in data from deep marine deposits (e.g., Wynn et al., 2002). Our modified model rests heavily on recent advances in terrestrial and marine modelling, especially the framework proposed by Carretier et al. (2016) for hillslope sediment transport. Carretier et al. (2016) proposed altering Equation 6 to encapsulate a nonlinear dependence of the deposition rate on slope such that sediment deposition declines as slope increases towards some imposed threshold (e.g., Andrews & Bucknam, 1987;Roering et al., 1999), such that: Here, S c is the critical slope, best thought of physically as the slope at or above which no further deposition can occur and all remaining sediment continues downslope. As discussed by Carretier et al. (2016), this model is nonlocal in the sense that sediment supplied from upslope can continue downslope if the deposition rate is insufficient to disentrain all sediment. Similar approaches to sediment bypass have also been used in recent seascape evolution models (e.g., Thran et al., 2020).
Equation 7 has one feature that makes it ill-suited for modelling marine transport: at a slope of zero, all sediment in transport is deposited. This is not a problem encountered in the eroding hillslopes for which the model was developed (Carretier et al., 2016), but contradicts the observed behaviour of marine sediment transport processes like turbidity currents that can travel hundreds of km over negligible slopes. Because our goal is to simulate the integrated effects of such events over basin-filling timescales, our model must have a mechanism for transport of sediment over negligible slopes.
To allow sediment transport over near-zero slopes, we modify Carretier et al.'s (2016) model by adopting from Ding, Salles, Flament, Mallard, et al. (2019) the idea that only some proportion of sediment in transport will be deposited at any given location. We incorporate this modification by altering Equation 7 to: where is a sediment transport length scale that is at least the model grid cell spacing. When ≫ dx, only some small proportion of the amount of sediment in transport is deposited. The rest continues in transport towards the distal portion of the margin. When = dx, all sediment in transport is deposited. Whilst this approach is heuristic-values of likely depend on grain size but are not tied explicitly in our model to specific properties of the sediment or the transport system-it allows the model to incorporate the general sediment transport patterns thought to occur in the deep, distal portions of continental margins. Modelled sediment can travel long distances down the continental slope because entrainment is linearly proportional to slope (Equation 5) and because deposition becomes negligible as slopes approach the critical slope of non-deposition (Equation 8). At the base of the continental slope, low slopes drive reduced sediment entrainment rates and increased deposition rates, but the condition ≫ dx allows continued transport across the abyssal plain in lieu of direct calculations of debris flow/turbidity current transport (e.g., Parker et al., 1986). The modified model allows an approximation of nonlocal transport in the sense that the amount of sediment deposited at a given distance from shore depends not only on the local slope at that point but on all the points upslope that have contributed sediment to-or removed it from-active transport.
At a point, the rate of elevation change responds to the sediment flux per unit width q s , the entrainment coefficient K e , the slope S relative to the critical slope of nondeposition S c and the sediment transport length scale ( Figure 1). For a given , there is a shift from net deposition to net erosion as S approaches S c as the deposition rate declines and the entrainment rate increases. At a given S/S c , increasing causes a shift towards less deposition (or more net entrainment) as more sediment remains in transport. The S/S c at which there exists a shift from net deposition to net entrainment (i.e., a shift from positive z t to negative z t ) depends on . For S/S c > 1, no deposition can occur, ceases to matter, and entrainment continues to scale linearly with slope.
We follow previous work (Kaufman et al., 1992;van Balen et al., 1995) in our treatment of both the local, linear model and the nonlocal, nonlinear model by asserting that the erosion coefficient K e declines exponentially with water depth d below some surface erodibility value K e 0 : This accounts for the erosive energy that may prevent the development of steep slopes close to the shoreline. The complete governing equation for the commonly used linear, local model in the erosion-deposition framework is found by substituting Equations 5, 6, and 9 into Equation 4: where 0 is the surface porosity and h * is the e-folding length scale governing the decay of porosity with depth. We used 0 and h * values of 0.56 and 2830 m, respectively, obtained by averaging the sand and clay compaction parameters of Guillocheau et al. (2012).
We only apply Equation 11 to positive slopes (defined as sloping from the shore towards the basin). For adverse slopes, we assert for simplicity that E = 0 and D = q s dx . The formulation for adverse slopes would be important in environments where they occur more commonly, but initial tests indicated minimal influence in our simulations where most slopes tilt towards the basin floor.

| Conditions for the collapse of the nonlocal, nonlinear model to the linear, local model
The nonlocal, nonlinear model (Equation 11) is convenient because it collapses to the local, linear model (Equation 10) under certain parameter values such that the key differences between the two approaches can be undone with parameter changes alone. When the slope of non-deposition S c is infinitely large, or in practice is many times greater than the greatest slopes in the model domain, there is no slope-driven reduction in the deposition rate and therefore no sediment bypass on steep slopes. Similarly, when the sediment transport length scale is equal to the model grid spacing dx (this corresponds physically to a case in which sediment cannot travel far over near-zero slopes), there is no transport over flat regions. Parameter values in this model are therefore a direct proxy for model structure (e.g., Barnhart et al., 2020a), meaning that finding parameterizations that match observations can determine optimal model structure and yield insight into seascape evolution processes.
(12) (h) = 0 e −h∕h * , F I G U R E 1 Model behaviour-as shown by the rate of elevation change-as a function of S S c (where S = z x ) and . Decreasing the transport length scale leads to increased deposition, and therefore positive changes in elevation, when the slope is below the slope of nondeposition. When the slope is at or above the slope of non-deposition, the transport length scale ceases to matter because no deposition occurs and all sediment bypasses the cell. The sediment entrainment rate increases linearly with slope, and deposition rate decreases nonlinearly with slope, leading to net erosion as slopes increase towards the slope of non-deposition. The erosion coefficient is held constant in this figure.

PASSIVE MARGIN STRATIGRAPHY
Our goal, rather than simulating margin evolution under an assumed set of parameter values, is to develop insight into model structure by using a data-driven inversion to find the parameter values that yield the best match between modelled and measured passive margin stratigraphy. Best-fit parameter values will illuminate whether the deviations from the linear diffusion approach encoded within our model (sediment bypass and long-distance transport) are necessary to match observed stratigraphy.

Southern Africa
The SAM is a well-studied passive margin sedimentary basin off the western coast of southern Namibia and South Africa ( Figure 2). Our study area consists of the Cape, Orange, Lüderitz and Walvis basins, which are bounded on the southeast by the Agulhas fracture zone and on the northwest by the Rio Grande fracture zone. The basins were initially formed by early Cretaceous rifting that opened the South Atlantic Ocean as Africa separated from South America (e.g., Hirsch et al., 2010). Rifting initiated at ca. 250 Ma (Hirsch et al., 2010), but we focus only on post-rift stratigraphy (Guillocheau et al., 2012;Baby et al., 2018Baby et al., , 2019. The earliest post-rift units are dated to ca. 131 Ma (Baby et al., 2018). We selected the SAM because of the large number of long (in terms of distance from the shoreline) seismic sections that have been collected and interpreted (Baby et al., 2018(Baby et al., , 2019Guillocheau et al., 2012). Sections that have continuous coverage from the shoreline to the nearly flat basin floortypically reached at a distance of between 300 and 600 km from shore on the SAM-are essential to constraining the extent to which the long-distance sediment transport dynamics in our model adequately describe the development of passive margin stratigraphy.
F I G U R E 2 Study area and seismic data, modified from Baby et al. (2019). We use sections 1 and 3-8 and retain the section numbers from Baby et al. (2019) for clarity. We do not use section 2 for our parameter estimation experiments because the thickness of deposits beyond 500 km from the shoreline is unknown. FZ is Fracture Zone. Seven seismic sections interpreted by Baby et al. (2018Baby et al. ( , 2019 comprise the dataset that we will use to test the two models and determine optimal model structure and parameter values (Figure 2). We omit one of their sectionstheir section 2 ( Figure 2)-from our analysis because it is by far the shortest (<500 km) and because at its end point there are deposits approximately 3 km thick. It is not possible to evaluate models for long-distance sediment transport using section 2 because the section ends before deposits reach a negligible thickness.
The data that are most easily compared to SFM output are the geometry of seismic reflectors. We use as our benchmark data sections that have been converted from two-way travel time to depth. Each section has nine seismic reflectors of interest, each representing the top of a particular unit as defined by Baby et al. (2019). The first (deepest) reflector of interest is the contact between basement/syn-rift deposits and the first post-rift deposits, interpreted by Baby et al. (2019) to occur at ca. 131 Ma. The ninth (uppermost) reflector is the modern bathymetric surface. Because the basement/syn-rift surface will be manipulated as a model boundary condition, there remain eight reflectors that can be used for model-data comparison when determining best-fit model structure and parameter values.

| Inversion methodology
The procedure of our data-driven inversion approachmore formally classified as a parameter inference exercise using a genetic algorithm-is to run successive 'generations' (sets of realizations) of the model that are run in parallel and then compared against data using a misfit function we define below in the 'Inversion Experimental Setup' section. The first generation uses parameter values randomly drawn from a uniform distribution. Each generation yields a subset of model runs with acceptable fits; a new generation of model realizations is then created by randomly perturbing the parameter values (in our case using a Gaussian perturbation kernel (Klinger et al., 2018)) of the runs from the previous generation that were deemed acceptable. By running successive generations of realizations, the inversion procedure converges on a region of the parameter space that yields best-fit parameter values. Because parameter values represent the contributions of slope bypass and long-distance transport processes, bestfit parameter values reveal the importance, or lack thereof, of those processes to passive margin evolution. For our inversions, we used the ABC-SMC (approximate Bayesian computation-sequential Monte Carlo) algorithm implemented in PyABC (Klinger et al., 2018), an open-source Python package that allows efficient parameter estimation using the iterative procedure described above. See Sisson et al. (2007) and Toni et al. (2009) for details of ABC-SMC approaches, and Table S1 for algorithm parameters used in our study.
There are many choices that govern inversion behaviour, including the choice of the algorithm itself. Our chosen approach is purposefully similar to genetic algorithm methods used in prior efforts to infer parameters of SFMs (e.g., Bornholdt et al., 1999;Bornholdt & Westphal, 1998;Cross & Lessenger, 1999;Falivene et al., 2014;Imhof & Sharma, 2006Lessenger & Cross, 1996;, but differs in the details of how successful parameterizations are selected from each generation and perturbed to produce the next. Exploratory testing of different parameter inference algorithm choices did not lead to meaningfully different results. Conducting such an inversion exercise requires estimating or assuming initial and boundary conditions for the model that cannot be precisely known from geophysical and stratigraphic data (for example, the subsidence history of the basin floor over the past 130 Ma). We also need to define how model-data misfit will be calculated.

| Model setup and initial and boundary conditions
All model simulations run from 130 Ma, the approximate beginning of the post-rift evolution of the SAM, to present day, with a time step of 1000 years. Model grid resolution is 10 km, a large spatial discretization but one commonly used in large-scale basin modelling (e.g., Granjeon, 2014) and that is sufficient to resolve the first-order morphology of the margin. Because our goal is to invert for best-fit model parameters, rather than boundary conditions, we must assume a set of boundary conditions lest we introduce too many variables into the inversion. Assessment of inversion sensitivity to boundary conditions is a critical next step, but is not treated here. The two key boundary conditions, both of which are functions of time, are the geometry of the basement/syn-rift layer and the sediment flux to the modelled basin.

| Basement geometry
The model is supplied with a value for basement elevation at every point, both initially and at every subsequent time step. We set initial basement geometry at 130 Ma by assuming that the initial post-rift basement had approximately 1/3 the depth, relative to a steady datum, of the modern basement. We then assume that the basement subsided at an exponentially declining rate (McKenzie, 1978) between 130 Ma and present, such that the basement elevation over time at any point declines from its initial elevation to its known present elevation, rapidly at first and then more slowly (with an e-folding timescale held constant at 23.67 Ma for all sections). These simplistic assumptions are broadly consistent with expectations derived from simple thermal subsidence models (e.g., McKenzie, 1978) and give time series of basement elevations in agreement with those deduced from basin reconstruction studies from the Orange Basin (Hirsch et al., 2010). We do not model flexural subsidence due to sediment and water loading (except in the sense that the deepest portions of the basement subside the fastest from the initial to final condition) so that we can have consistent basement geometry between all model runs for a given section to aid model comparison.
The other key simplification inherent to our treatment of basement geometry is that we do not include any uplift or tilting of the margin over the course of its evolution. Stratigraphic analysis (Baby et al., 2018;Rouby et al., 2009), thermochronologic measurements (Stanley et al., 2020), basin modelling (Hirsch et al., 2010) and numerical modelling (Braun et al., 2014;Dauteuil et al., 2013;Stanley et al., 2020) suggest that portions of the SAM experienced two periods of rock uplift. The first was a pulse of tilting from ca. 81-66 Ma that affected the Orange and Lüderitz basins and could have caused a maximum of 1000 m of rock uplift in the proximal portion of the margin (the distal portions of the margin, closer to the hinge point of the tilt, would have experienced much less rock uplift; Aizawa et al., 2000;Baby et al., 2018;Hirsch et al., 2010;Paton et al., 2008). This pulse is hypothesized to result from the passage of Southern Africa over a mantle superswell (Braun et al., 2014). The second hypothesized rock uplift pulse occurred at ca. 30 Ma (though basin reconstruction studies report the pulse as occurring later at ca. 16 Ma (Hirsch et al., 2010)) and had an amplitude of approximately 300-350 m (Baby et al., 2018); the cause of this pulse remains unknown. We choose not to incorporate these perturbations into our basement boundary condition. The magnitude and timing of uplift pulses are inconsistent-and inconsistently constrained-amongst the four basins for which we have data (Baby et al., 2018), and there is still debate about the existence and importance of the more recent pulse (O'Malley et al., 2021). The magnitude of these perturbations is small relative to the up to 7 km of deposits on the SAM. We acknowledge that incorporating these uplift pulses might improve model-data misfit, but we argue that there is insufficient clarity in the data to incorporate them, and that neglecting them would not lead to different conclusions with respect to differentiating between the models we investigate.

| Terrestrial sediment flux
The model requires a value for the terrestrial sediment flux supplied to the basin at every time step. Basin-scale sediment flux reconstructions for the SAM rely on interpolation between seismic sections to derive estimates of volumetric sediment delivery to the margin over the past 130 Ma (Baby et al., 2019;Guillocheau et al., 2012). However, a cursory look at the sections of interest (Figure 2) shows that the total sediment volume, as well as the volume during any given time interval, varies significantly amongst sections within a given basin. To remove uncertainty surrounding the role of sediment flux, we take the simplest possible approach: for each stratigraphic section to which we compare our model, we calculate the sediment flux for each time period by integrating the volume of sediment per unit margin width contained between each set of reflectors along each section whilst accounting for post-deposition porosity loss due to compaction (see Shobe et al., 2022 for code). This approach yields a total sediment volume per unit basin width [L 2 ] for each unit in each section. Because the time duration represented by each section is known from previous work (Baby et al., 2018(Baby et al., , 2019Guillocheau et al., 2012), we can then divide each unit's volume per unit basin width by the time interval to get an average sediment flux to the section per unit width per unit time [L 2 /T]. Figure 3 shows the sediment flux time series obtained by integration, as well as the basin-integrated sediment flux time series from Baby et al. (2019). The sediment flux time series in any one section is reasonably similar to the basin-integrated sediment flux. Estimates from our section integration approach are subject to uncertainty due to stratigraphic incompleteness (e.g., Straub et al., 2020) caused by sediment moving into and out of the plane of the section (i.e., parallel to the margin). There also are non-terrestrial sediments (i.e., carbonates and pelagic deposits; Baby et al., 2018;Guillocheau et al., 2012) in our sections that are counted as terrestrially derived sediment fluxes under our methodology. Incompleteness and non-terrestrial sources likely introduce significant uncertainty into the terrestrial sediment flux estimates. Given that the alternative to accepting these uncertainty sources is to assume that reconstructed basin-scale sediment fluxes were evenly distributed amongst all sections in a given basin, an idea not supported by section volumes or isopach maps (Baby et al., 2019), we argue that we have made the safer assumption by conserving mass within each section we analyse to enable direct comparison of modelled and measured seismic sections. Potential effects of uncertainty in the sediment supply are worthy of future investigation.

| Sea level
We hold sea level constant throughout all model experiments. The amplitude of eustatic sea level variations (ca. 120 m) is small relative to the length and depth scales of the SAM both globally over the past 100 Ma (Bessin et al., 2017) and more recently throughout the Quaternary off Southern Africa specifically (Ramsay & Cooper, 2002). Further, the influence of eustatic sea level on sediment delivery over geologic timescales to the deep, distal portions of continental margins-the places where nonlocal transport dynamics may most influence stratigraphy-is unclear (Falivene et al., 2020;Harris et al., 2016Harris et al., , 2018Harris et al., , 2020Sømme et al., 2009).

| Inversion experimental setup
We use two approaches to compare numerical model outcomes against the stratigraphic record. The first (experiment 1) is to compare the modelled and measured modern bathymetric surface without taking into account the geometry of subsurface reflectors. This has the advantage of simplicity as it does not require accounting for the post-deposition compaction of older reflectors. The second approach (experiment 2) is to simultaneously compare between the model and the data the position of all reflectors (except for the top of the basement/syn-rift deposits, which is a boundary condition). This latter approach is more complicated, but provides a time-integrated picture of model-data (mis) fit rather than relying on only the modern surface. The multi-reflector approach may be particularly important when working with data from the SAM, as the geometry of the uppermost layer (11-0 Ma) is thought to be heavily influenced by contour currents in addition to processes transporting sediment seaward from the coast (Baby et al., 2018). In both experiments, best-fit model parameter values are constrained for each section independently. This approach allows comparison of best-fit parameter values amongst sections to assess the variability of best-fit values across the SAM.
For each set of experiments, we also run an inversion using a parameterization of our model that collapses to the standard linear diffusion model by setting the sediment transport length scale equal to the grid spacing and removing slope as a control on the sediment deposition rate. Comparison of best-fit results between the nonlocal, nonlinear model and the local, linear model will reveal whether the additional complexity we have implemented to approximate nonlocal, nonlinear sediment transport leads to model results that better match observations from the SAM. study by integrating over the depth and length of each seismic section and assuming an exponentially declining porosity profile. Given that the basins range from 500 to 1000 km wide, the two estimates agree to an order of magnitude. deposited at every point i along a section. The misfit function can be written as: where N is the number of cells in the model domainand the number of points to which the seismic section has been downsampled-such that all points except for the boundary condition tied to z = 0 are considered. is the error associated with our observations. Because we do not have an explicit estimate of at every point, which would be a quantity derived during the seismic interpretation process, the value of has no influence on the inversion process because the divisor is constant throughout all of our experiments. Only in a case of spatially or temporally varying would its value affect the search for a best-fit set of parameter values.

| Experiment 2: Calculating misfit using all reflectors
Our second, more sophisticated inversion scheme compares the elevation above basement of the eight reflectors from a given seismic section against the same measurements from each modelled section. This comparison gives rise to the misfit function: where N r is the number of reflectors being compared between each measured and modelled section (in our case N r = 8).
The set of possible misfit functions for an inverse problem is infinite, necessitating somewhat arbitrary choices. Our misfit functions are purely geometricthat is, they use deposit shape alone. This is appropriate given the simplicity of our model, but we note that additional constraints such as sand percentages derived from well-log data can allow the inference of additional model parameters (e.g., Falivene et al., 2014). Other options for constructing misfit functions include comparing deposit thickness or geometry at only a few key points (e.g.,  or, if working in more than one plan view dimension, comparing metrics of plan view margin geometry like the shelf edge (Zhang et al., 2021) or the stratigraphic centroid (Granjeon, 2014;Martin et al., 2009).

| Best-fit parameter values
Of the four parameters governing the nonlinear, nonlocal model, two dominate model behaviour and show narrow ranges that yield the best fit to the stratigraphic data ( Figure 4, Table S2). The two key parameters are the sediment transport distance and the slope of non-deposition. Inversions converge on relatively narrow best-fit regions for these two values, such that substantial deviation from the best-fit values results in much worse model-data fit. The same is not true of the surface sediment erodibility and the erodibility depth scale. For all seven sections, these parameters show large regions over which they provide fits of relatively unchanging quality. This indicates that the sediment transport distance and slope of non-deposition drive most of the variability in model outcomes. Physically, this suggests that it is the spatial pattern of deposition, rather than remobilization of previously deposited sediments, that shapes the SAM.
Comparing parameter distributions across the seven sections (best seen in the kernel density plots in Figure 4) reveals that every section converges on best-fit parameters that depart significantly from the local, linear model. The majority of sections converge on values for the sediment transport length scale of slightly over 2 × 10 5 m.
Recalling that the local model is recovered with a value of 10 4 m (our grid cell spacing), this result indicates that the shape of the modern bathymetric surface in the SAM requires significant long-distance transport even across low slopes. The best-fit slope of non-deposition is between ca. 0.02 and 0.06 for all sections except one-section 1which has no portions of the parameter space that provide a good fit to the data ( Figure 5). Such low slopes of non-deposition imply a significant role for slope bypass, or nonlocal downslope sediment transport. Best-fit S c values many times the maximum slopes observed on the SAM would indicate that sediment transport can be reasonably approximated by transport that depends only on local slope (because sediment bypass becomes negligible when S ≪ S c ; Equation 8, Figure 1). Given that our inverse analyses reveal S c values ranging from ca. 0.02 to 0.06 in the sections where we find reasonable model-data fit, we do not find support for the local transport approximation. Instead, the best fit between modelled and measured stratigraphy is achieved when sediment can bypass slopes of more than a few degrees.

| Comparison of modelled and observed stratigraphy
For five of the seven sections, the inversion yielded bestfit parameter estimates that led to best-fit simulations that qualitatively and quantitatively fit the data reasonably well ( Figure 5). These sections have gently sloping continental shelves with altitudes below, rather than level with, sea level, and smooth, convex-up shelf edges. They have concave-up continental slopes grading into gently sloping continental rise/abyssal plain deposits. Sediment is not always found as far from shore as in the data, but noticeable accumulations of sediment are observed up to ca. 1000 km from shore. Two sections, 1 and 7, yielded what we interpret to be substantially worse fits as defined by the mismatch of major morphometric features like the continental shelf edge and the curvature of the continental slope. It is difficult to know why the fits are substantially worse for sections 1 and 7. One key commonality that the two sections share is a relatively high proportion of the total sediment volume stored at the extreme distal end of the section. Whilst our approach does allow for more realistic modelling of long-runout sediment transport than the classic local, linear approach does, there is still a fundamental tension in which allowing sediment to accumulate at the very distal end of the modelled section requires too much inhibition of deposition at the proximal end. It may not be possible for our model to deposit F I G U R E 4 Results for all seven sections from the search for a best-fit parameterization of the nonlocal, nonlinear model with the inversion procedure constrained only by the modern bathymetric surface. Scatter plots show model-data misfit (colour) as a function of the four key parameters. Kernel density estimate (KDE) plots show the distribution of values for each parameter. Because the inversion procedure runs more model realizations in regions of the parameter space with reduced model-data misfit, peaks in the KDE plots can be interpreted as showing the region of each parameter's range that leads to the lowest misfit. Narrow peaks in the KDE plots indicate parameters with well-constrained best-fit values, whilst broad peaks indicate parameters for which a wide range of values produces similar misfit. Numbered sets of plots refer to the seismic section used for the inversion. Maximum and minimum misfit values vary between sections; colour values have been scaled for interpretability.

F I G U R E 5
Comparison between modelled and measured stratigraphy for the best-fit nonlocal, nonlinear model realization for each section. Bottom panels show two spatially resolved measures of misfit. Whilst all modelled reflectors are shown (and are compacted to account for overburden), only the modern bathymetric surface was used to assess model-data fit in this experiment; subsurface modelled reflectors were not compared against data to assess fit. Percent error points that appear to be missing are >100%; values of exactly 100% error typically occur where the model deposited no sediment. is total misfit given by Equation 13; VE, vertical exaggeration. enough sediment in distal reaches whilst preserving steep, well-defined shelf edges. This weakness would not be resolved in section by raising the maximum possible S c value ( Figure 4); increases in S c would further inhibit transport to the basin floor.
Comparison of modelled and observed subsurface reflectors, though it was not quantitatively incorporated into the misfit function in this experiment, shows that the pattern of reflectors is almost completely depositional. There are fewand only minor-instances of reflectors being truncated by overlying units, indicating that the story in these models is one of continuous deposition rather than episodes of deposition and re-erosion driven by variations in the terrestrial sediment flux time series. This is broadly concordant with the interpreted geologic history of the SAM, in which-barring the episodes of rock uplift that we have not modelled herethere is little erosional truncation of units except by eustatic variations in the nearshore. This concordance of modelled and observed stratigraphy suggests that our model is not only producing reasonable final bathymetry, but is building a stratigraphic record that reflects the long-term average of the processes shaping the SAM.

| Comparison between the nonlocal, nonlinear model and the local, linear model
Here, we compare inversion results between the two models to assess whether the nonlocal, nonlinear model leads to substantially better fits between modelled and measured stratigraphy. We search for the best-fit local, linear model using the same procedure as for our new model; the only two parameters to optimize in the local, linear model are the surface sediment erodibility K e and the depth scale over which it decays d * .
Using only the modern bathymetric surface as a constraint, the local, linear model converges to a narrow range of surface erodibility values and a broader region of erodibility decay depths ( Figure 6, Table S2). All sections except section 6 indicate that the model is 'searching' for erodibility decay depth values even greater than the 40,000 m maximum value in the inversion. At the maximum values of 40,000 m, erodibility in the deepest parts of the margin only declines to ca. 80% of its value at the water surface such that sediment entrainment can still occur in the deep, distal reaches of the margin wherever nonzero slopes are found. We interpret this behaviour as the local, linear model compensating for its lack of mechanisms for long-distance sediment transport by allowing substantial erosion at great depth. Interestingly, the tendency of the inversion procedure to identify d * values large enough that sediment erodibility does not meaningfully decline with depth suggests that whilst erodibility decay with depth may give rise to realistic-looking shallow marine morphometric features like shelf breaks (Kaufman et al., 1992;van Balen et al., 1995), such an approach may ultimately be counterproductive when we expand our view to include the distal portion of the margin because it yields models that cannot transport sediment far enough from F I G U R E 6 Results for all seven sections from the search for best-fit parameter values for the local, linear diffusion model constrained only by the modern bathymetric surface. The tall, narrow region of good-fitting models indicates that only a narrow range of surface erodibility values leads to minimized misfit. The majority of sections (all except 6) have converged to the maximum values of the erodibility decay depth scale, indicating that even higher values would lead to further improvements in model-data fit. Given that under our imposed maximum value of 40,000 m, erodibility in the deepest regions of the margin only declines to ca. 80% of its value at the water surface, further improvements to model-data fit from increasing the maximum decay depth would be marginal. shore without some other process or additional changes in erodibility with depth or distance from shore.
The local, linear model provides, for all sections that can be reasonably fit by either approach, a worse fit to the modern bathymetric surface than was obtained with the nonlocal, nonlinear model (Figures 7 and 8). Whilst bestfit parameterizations of the local, linear model do exhibit sediment delivery to the distal portions of the sections (achieved through large erodibility decay depths that yield non-negligible erodibility at depth), this comes at the cost of model-data fit in the nearshore environment. The large erodibility decay depths required to enable transport of sediment far from shore precludes the local, linear model from achieving the rounded, shallow continental shelf edge observed in the data. Instead, a shelf of sorts is created simply by progradation of the shoreline as sediments accumulate in the nearshore but are prevented from accumulating above sea level under the assumption that the shoreline will prograde under such conditions. Shoreline progradation, combined with an erodibility that is nearly constant throughout the depth profile, results in sharp shelf breaks grading immediately into the concave-up continental slope rather than the smooth, convex-up shelf breaks observed in the seismic data. The local, linear model is effectively being forced to choose between accurately reproducing the shelf edge and delivering sediment to the distal portions of the margin. Because our misfit function incorporates every point along each section, the model minimizes misfit if it delivers sediment far from shore even at the cost of reproducing the shelf and shelf edge. A misfit function that focused on the nearshore (e.g.,  would likely lead to the opposite endmember of this trade-off. Though our misfit function in this experiment did not incorporate comparison between observed and modelled subsurface reflectors, the local, linear model-even in its best-fit parameterizations-does not stand up to a qualitative assessment of the form of the subsurface reflectors it produces (Figure 7). To deliver sediment far from shore, the local, linear model must first deposit that sediment in a proximal location and then erode those deposits during times of low terrestrial sediment flux. The time series of reflectors produced in most of the local, linear best-fit simulations reveal a steep, prograding wedge of sediment that is then smoothed out to lower gradients through subsequent erosion. Except for the brief periods in SAM history when the margin experienced substantial rock uplift, which we do not model, there is limited evidence for significant erosional truncation beyond that occurring in the nearshore due to eustatic variations (Baby et al., 2018). The reflectors from the nonlocal, nonlinear model ( Figure 5) do not show this pattern of progradation of a steep-fronted sediment wedge followed by later truncation by erosion; they instead show consistent accumulation of sediments through time at any given location, including the distal reaches of the basin. Interpretation of the stratigraphic record suggests that the latter behaviour is more consistent with the history of the SAM.
It is unsurprising that the nonlocal, nonlinear model provides a better fit to the data than the local, linear model ( Figure 8) in all but one case where neither model provided a reasonable fit and imposed parameter ranges prevented the more complex model from fully minimizing misfit (Figure 4)-the latter model is a restricted subset of the former. The critical results of this comparison are that (1) the model requires significant deviation from linear diffusion parameter values (i.e., a large travel distance relative to the model grid cell spacing and a critical slope low enough that sediment bypass is common) to provide a reasonable match between modelled and observed bathymetry, (2) the local, linear model cannot through parameter adjustments provide fits that approximate the outcomes of the nonlocal, nonlinear model, (3) the dynamics of the local, linear model as revealed by subsurface reflectors are not supported by observations from the SAM, and (4) six of seven sections show a reduction in misfit-and four sections show at least a factor of two reduction-achieved by adding nonlocal, nonlinear transport dynamics (Figure 8). This suggests that long-distance transport and slope-dependent sediment bypass processes are required to form the canonical shapes of passive margin stratigraphy, and therefore argues that these processes are essential ingredients in SFMs, at least for passive margin settings.

| The influence of considering multiple reflectors
Parameters estimated by the inversion that takes into account all eight reflectors are surprisingly similar to those estimated when using only the modern bathymetric surface to constrain the inversion. For brevity, we show average parameter values for the 50 best-fitting model realizations from the single reflector and multiple-reflector inversions plotted against each other (Figure 9) such that points falling on the 1:1 line indicate consistent parameter values achieved by the two methods. See Table S3 and Figures S1-S4 for detailed results of multi-reflector inversions.
Inclusion of all reflectors in the misfit calculation for the nonlocal, nonlinear model resulted in a shift towards slightly greater best-fit travel distance values (Figure 9a), likely because the data require that good-fitting models be able to distribute sediment to the distal portion of the basin even relatively early in F I G U R E 7 Comparison between modelled and measured stratigraphy for the best-fit local, linear diffusion model realization for each section. Bottom panels show two spatially resolved measures of misfit. Whilst all modelled reflectors are shown (and are compacted to account for overburden), only the modern bathymetric surface was used to assess model-data fit in this experiment; subsurface modelled reflectors were not compared against data to assess fit. is total misfit given by Equation 13; VE, vertical exaggeration. the margin's evolution when there do not yet exist the slopes required to drive sediment bypass in the absence of another mechanism for long-distance transport. The critical slope of non-deposition (Figure 9b) remained remarkably consistent between the surface-only and multiple-reflector inversions, suggesting that the model most effectively adjusts to the need to deliver early deposits far from shore with changes in the travel distance, which affects transport over all slopes, rather than the critical slope, which only affects transport over meaningful gradients. Physically, this may indicate the importance of long-runout sediment transport processes (e.g., turbidity currents, marine debris flows) that may initially be generated by significant bathymetric slopes but then transport sediment up to hundreds of km over vanishingly low slopes. The erodibility and erosion depth scale (Figure 9c,d, respectively) show more scatter between inversion methods; this is not surprising given that there is a large region of good-fitting values for both parameters (e.g., Figure 4).

F I G U R E 8
Misfit values for the best-fit model for each section using the nonlocal, nonlinear model (dark blue) and the local, linear model (light blue) when the model fit is determined by comparing only against the modern bathymetric surface. The nonlocal, nonlinear model yields better fitting best-fit realizations for six of seven sections.

F I G U R E 9
Comparison between best-fit parameter values derived from the surface-only inversion and the multiplereflector inversion. Black lines indicate a 1:1 match between parameter values derived by the two methods. In the case of the nonlinear, nonlocal model (column 1; a-d), the two most important parameters fall close to the 1:1 line, indicating that the inversion method (whether subsurface information is incorporated or not) does not have a strong influence on the bestfit parameter values and therefore on predicted margin stratigraphy. In the case of the local, linear model (column 2; e-f), erodibility values are consistent between methods, whilst erosion depth scale values show more scatter.
Including all reflectors when searching for best-fit parameters for the local, linear model leads to surface erodibility values that largely fall near the 1:1 line (Figure 9e), indicating that the composition of the misfit function did not have a strong effect on the best-fit value. The same is true of the erodibility decay depth scale (Figure 9f) with the exception of two values that changed significantly between the surface-only and multiple-reflector inversion schemes. We attribute the overall consistency between parameter values derived using the two different methods to the fact that all reflectors in our seismic data show a similar pattern: long-distance transport beginning from the earliest stages of post-rift margin evolution, followed by the largely depositional draping of successive units atop previous deposits. In this respect, the modern surface is not geometrically distinct from the subsurface reflectors, which may explain why incorporating the subsurface reflectors leads to little improvement in model-data fit. A model can either achieve parameter values that allow it to develop these types of deposits (i.e., in the nonlocal, nonlinear model) in which case the specific number and age of reflectors used does not have a significant effect on inferred best-fit parameter values, or it cannot achieve parameterizations that allow long-distance, depositiondriven stratal stacking patterns (i.e., in the local, linear model) in which case the specifics of the misfit function do not matter because the fit to eight reflectors will be no better than the fit to a single one. We initially undertook the multiple-reflector inversion because the modern bathymetric surface is thought to be heavily influenced by contour currents (Baby et al., 2018). Adding seven subsurface reflectors does not substantially change inferred best-fit parameters, which may indicate that variability in contour current effects amongst units does not cause a radical enough change in stratigraphic architecturerelative to the effects of subsidence and terrestrial sediment flux-to influence our simple models.
When the misfit function incorporates all eight reflectors, the nonlocal, nonlinear model yields a better fit to the observed stratigraphic data than the local, linear model does for all seven sections ( Figure 10). The improvement in model-data fit gained from adding nonlocal, nonlinear sediment transport dynamics exists regardless of whether we use only the modern surface or all reflectors as a basis for comparison. The misfit values between the two models are much closer when all reflectors are used for the inversion (Figure 10). This arises from the introduction of seven additional constraints on the model, many of which it must inevitably fail to match ( Figure 5) even in its bestfit parameterization. However, the consistent reduction in misfit that accompanies the nonlocal, nonlinear model signals that those dynamics are required to produce stratigraphy that matches observations. The only scenario where this would not hold true is one in which a misfit function was used that did not take into account the distal portions of the basin at all. Given the substantial accumulations of sediment in the distal portions of the SAM (Figure 2), we argue that finding models that adequately simulate those deposits is a prerequisite for closing the source-to-sink mass balance.

IMPLICATIONS FOR INVERSION OF THE STRATIGRAPHIC RECORD
Our motivation in testing SFMs is to enable the inversion of the stratigraphic record for information about past terrestrial environments and geomorphic processes. If reasonably effective SFM structures and parameterizations can be identified a priori, then coupled LEM/SFMs will be more useful for inferring drivers of past landscape evolution. Our results favour the idea that SFMs should incorporate mechanisms for sediment bypass and long-distance transport, and that these processes cannot be adequately mimicked with parameter changes in the commonly used local, linear diffusion model for seascape evolution. Our study further emphasizes that both mechanisms of nonlocality (bypass and long-distance transport) are required to achieve the model-data agreement we find; Figure 4 demonstrates that one element or the other is not sufficient to place the model in the good-fitting region of the parameter space.
The nonlocal, nonlinear model we tested represents an amalgamation of ideas from previous workers that have not previously been evaluated in detail against F I G U R E 1 0 Misfit values for the best-fit model for each section using the nonlocal, nonlinear model (dark blue) and the local, linear model (light blue) when the model fit is determined by comparing against all seismic reflectors. The nonlocal, nonlinear model yields better fitting best-fit realizations for all seven sections. stratigraphic data, and our analysis reveals that it provides a substantial improvement over the more widely used local, linear model. However, the nonlocal, nonlinear model still needs improvement. Aside from subsuming a wide array of marine transport processes into two key transport parameters, its most critical shortcoming is that it only heuristically accounts for the momentum that allows transport processes like turbidity currents and marine debris flows to carry sediment into the distal portions of basins. More effective conceptualizations of sediment entrainment and disentrainment, possibly following recent advances in hillslope geomorphology (e.g., Doane et al., 2018;Furbish et al., 2021), might further improve SFMs with the understanding that the models will always need to simulate the spatial and temporal average of marine sediment transport if they are to prove feasible for inverse analyses that require 10 5 -10 6 forward model realizations. Improving model fit-especially abrupt slope breaks driven by changes in process dominance-may require multiprocess models (e.g., Granjeon & Joseph, 1999;Syvitski & Hutton, 2001), but their parameter-rich nature may hinder parameter estimation exercises and make them susceptible to overfitting to a given calibration location. There exist sufficient models in the literature that span a wide range of complexity that, as in this study, the future challenge is more about rigorously testing models against data to find the simplest workable theory than it is about developing new models.
Though we used seven seismic sections spanning four basins to evaluate different SFMs, our study is limited to a single passive margin. Best-fit regions of the parameter space for the nonlinear, nonlocal model's travel distance and critical slope of non-deposition parameters consistently showed that the model was not collapsing to its local, linear parameterization, but the key parameters still exhibited considerable variability amongst sections ( Figure 4). Whilst our analysis may have restricted the range of possible values that need to be considered when using such a model to invert the stratigraphic record, a set of global parameter values cannot be assumed. Similarly, we have not established sensitivity of inversion outcomes to initial and boundary conditions and additional processes-including eustatic sea level, lithospheric flexure and terrestrial sediment supply-which are well-understood in the SAM relative to other regions but still carry considerable uncertainty (e.g., Guillocheau et al., 2012).
Flexure is a process of particular interest given that it can influence the location of depocentres and resulting stratal geometries. We have not treated flexure here to ensure that modelled stratigraphy is compared in the context of a consistent time-evolving basement geometry. We suspect that adding flexure to the model would not alter the conclusion that nonlocal processes govern the development of passive margin stratigraphy. The generally proximal deposition in the local, linear model (Figure 7) might cause flexural subsidence in those locations, thereby potentially reducing bathymetric slopes and resulting fluxes of sediment towards the distal portions of the basin. The longer-distance deposition given by the nonlocal, nonlinear model ( Figure 5) may result in less proximal flexural subsidence and the maintenance of greater bathymetric slopes, allowing enhanced transport towards the deep, distal portions of the margin. Nonetheless, the relative importance of nonlocal transport processes in models including flexural subsidence is important to examine.
A final open question is the applicability of our findings given the reduced dimensionality of our modelling exercise. We tested 2-D implementations of our candidate models. This means that the models enforced purely margin-perpendicular sediment transport, when in reality margin-parallel components of transport-such as contour currents that are known to have influenced the SAM (Baby et al., 2018)-also occur. Our 2-D implementations also cannot simulate processes that cause the development of preferential sediment transport pathways, like submarine canyons and channels. We, therefore, must interpret the improvement in fit given by our nonlocal, nonlinear model as arising due to the model's ability to simulate average sediment transport patterns that occur as a result of nonlocal processes whose effects likely vary spatially over geologic time, like, for example, a submarine channel undergoing avulsions across a deep-sea fan. Though there exist plenty of 3-D SFMs (e.g., Falivene et al., 2019;Granjeon & Joseph, 1999;Salles et al., 2018), testing optimal SFM structure in two dimensions remains an important stepping stone towards inverting terrestrial landscape history from stratigraphy because the simplicity and parsimony of 2-D models allow relatively efficient calibration even in data-poor situations.

| CONCLUSIONS
We evaluated a simple, nonlocal, nonlinear model for marine sediment transport and the development of marine stratigraphy over geologic time. The model builds on the concepts of sediment bypass espoused by previous authors (e.g., Ding, Salles, Flament, Mallard, et al., 2019;Ross et al., 1994;Syvitski et al., 1988) that have not previously been directly tested against observed stratigraphy. Quantitative comparison of the model against seven stratigraphic sections from the SAM reveals that: 1. The nonlocal, nonlinear model can achieve parameterizations that develop realistic marine bathymetry and stratigraphy, though variability in best-fit parameter values exists amongst the seven seismic sections tested. 2. The nonlocal, nonlinear model does not converge on parameter values that result in a collapse to the local, linear model. The local, linear model cannot fit the data. It fails both to fit the modern bathymetric surface and to yield seascape evolution trajectories that match observations. 3. The key difference between the two models lies in the ability of the nonlocal, nonlinear model to deliver sediment to distal portions of the basin without compromising its ability to develop realistic nearshore morphology and stratigraphy. 4. Points (1) through (3) hold true regardless of whether model parameters are optimized using only the modern bathymetric surface or the full suite of subsurface seismic reflectors, indicating that our results are robust to the specifics of the misfit function employed. 5. Processes of sediment bypass and long-distance transport govern the architecture of the stratigraphic record over basin-filling timescales, making it essential that SFMs capture at least the spatial and temporal averages of these nonlocal processes.
Given the general lack of terrestrial evidence for past landscape evolution dynamics, the stratigraphic record represents our best chance to learn about the erosion trajectories of landscapes long gone. We tentatively suggest that the transport dynamics encapsulated in the nonlocal, nonlinear model govern the development of passive margin stratigraphy. Our ability to invert the stratigraphic record, either on its own for inferring sediment supply to basins or coupled with landscape evolution models to infer past tectonic, climatic and/or lithologic boundary conditions, would benefit from improved understanding of such nonlocal transport processes.