Calibration and Validation of Multiple - Split Sample - N
Abstract
Two main issues regarding stormwater quality models have been investigated: i) the effect of
calibration dataset size and characteristics on calibration and validation results, and ii) the
optimal split of available data into calibration and validation subsets. Data from 13 catchments
have been used for three pollutants: BOD, COD and SS. Three multiple regression models were
calibrated and validated. The use of different datasets and different models allows general
trends to be identified. It was found that multiple regression models are highly sensitive to the
calibration data: using few data for calibration leads to poor predictions despite good calibration
results. It was also found that randomly splitting the available data into halves for calibration
and validation is not optimal; more data should be allocated to calibration. The proportion of
data to be used for validation increases with the number of available data (N), reaching about
35 % for N around 55 measured events.
INTRODUCTION
Stormwater discharges from both combined and separate stormwater sewer systems are considered a
major source of pollutants entering receiving waters. Lately, growing concerns about the quality of
surface waters have led to more demanding water legislation. The design of control and treatment
facilities, as well as management strategies, often requires the estimation of discharged
pollutant loads at different time scales and at specific locations in the sewer system. A wide
variety of stormwater quality models can be found in the literature, ranging from very simple
ones such as the simple method (Schueler, 1987) to complex, fully detailed models implemented
in commercial software packages such as the well-known Infoworks CS (Wallingford Software) and
MouseTrap (Danish Hydraulic Institute). The complexity of a model can be quantified by the
number of parameters or the number of processes included in the model. Most existing models
are based on, or coupled with, statistical or empirical approaches, generally incorporating
conceptual parameters. Such hardly measurable parameters must be calibrated so that the output of
the model matches the observations used for calibration. The more complex the model, the more
observations are required for calibration to cover all possible conditions of use.
In practice, models are generally used for predictive purposes. A good match of the calibration
data does not guarantee that future predictions will be good. To increase confidence in the
predictive ability of a model, validation is performed. There is not yet complete agreement on
what validation can or should encompass. Nevertheless, some methods have been widely accepted,
such as split sample validation, which consists of splitting the available data into two
samples: one sample is used to calibrate the model and the other to test its predictive ability.
Independently of the modelling approach and research field, available data are
generally split randomly into halves (e.g. Kerlinger and Pedhazur, 1973; Schlütter, 1999; Vaze
and Chiew, 2003). Jewell et al. (1978) suggested the same split ratio but also stated that when
limited data are available, the usual practice is to use the larger portion of the data for
calibration and the smaller portion for validation.
To carry out calibration and validation properly, large datasets are required. Unfortunately, in
practice, the number of measured events is limited because of the high cost of monitoring
campaigns and the relatively short time devoted to the studies. Engineers and researchers
therefore often deal with limited datasets for model identification, calibration and
validation. In this case, what is the effect of the calibration dataset size and characteristics
on the estimated relationship, and how should the available observations be optimally split into
calibration and validation sets?
Three multiple regression models (MRMs), all having the same structure, are implemented in the
stormwater quality module of the French software Canoe (Insa/Sogreah, 1999). The models are
presented in Table 1. To explore the questions above, these models served as a bench test and
were applied to data from 13 catchments and for three pollutants: BOD, COD and SS.
M1:  EL = K · ADWP^a · Imax5^b · Vr^c        EMC = EL / Vr = K · ADWP^a · Imax5^b · Vr^(c−1)
M2:  EL = K · TP^a · D^b                     EMC = EL / Vr
M3:  EMC = K · ADWP^a · TP^b · Imax5^c
where EL is the event total load; EMC is the event mean concentration; ADWP is the antecedent
dry weather period; Imax5 is the maximum intensity over a 5-minute period; Vr is the total runoff
volume; TP is the total precipitation; D is the event duration; and K, a, b and c are calibration
parameters.
Table 1: The three MRMs used in the study
METHODS
CALIBRATION DATASET EFFECT
In order to study the effect of the size and characteristics of the calibration dataset on the
results of a model, subsets were sampled from the available data. The size n of the subsets
ranged from 4 to N − 2, where N is the number of available data. For each n, a large number of
subsets (typically 1000) was drawn randomly without replacement. The objective is to see how the
results of a model would vary if a subset of events had been measured instead of all available
data for the same period of time. Hence, within each subset an event cannot occur more than
once, and for each n all subsets were distinct.
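The sampling procedure can be sketched as follows (a minimal illustration in Python; the function
name and interface are hypothetical, as the paper does not describe its implementation):

    import random
    from math import comb

    def draw_subsets(events, n, n_draws=1000, seed=1):
        """Draw distinct random subsets of n events from the available data.

        Sampling is without replacement, so an event cannot occur more than
        once within a subset, and duplicate subsets are discarded so that
        all subsets drawn for a given n are distinct.
        """
        rng = random.Random(seed)
        # cannot draw more distinct subsets than actually exist
        n_draws = min(n_draws, comb(len(events), n))
        seen, subsets = set(), []
        while len(subsets) < n_draws:
            idx = tuple(sorted(rng.sample(range(len(events)), n)))
            if idx not in seen:
                seen.add(idx)
                subsets.append([events[i] for i in idx])
        return subsets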
CALIBRATION
The type of model determines the extent of calibration required. The desired output of the
model must be considered and kept in focus. Generally, a model is calibrated by minimizing the
distances between calculated and measured values. The sum of squared errors is one of the
criteria most used for calibration. The use of a least squares approach assumes constant error
variance across all observations. However, larger uncertainties in hydrologic variables are
known to be associated with larger variable values. To cope with this problem, a logarithmic
transformation is applied to all variables. This transformation stabilises the error variance,
improves the normality of the residuals and linearises the regression model, making it easier to
calibrate with the ordinary least squares method. To allow comparison between datasets
independently of n, the Root Mean Squared Error (Eq. 1) is used as a measure of goodness of
calibration.
RMSE = √( (1/n) · Σᵢ₌₁ⁿ (Oᵢ − Tᵢ)² )    Eq. 1

where Oᵢ and Tᵢ are the measured and calculated values for event i.
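For illustration, a minimal Python sketch of this calibration step for model M3 is given below.
It assumes the event variables are available as numpy arrays, and computes the RMSE of Eq. 1 on
the log-transformed values, where the error variance is stabilised (the function name and this
choice of error space are assumptions, not stated explicitly in the paper):

    import numpy as np

    def calibrate_m3(adwp, tp, imax5, emc):
        """Calibrate model M3 (EMC = K * ADWP^a * TP^b * Imax5^c) by ordinary
        least squares on the log-transformed variables:
            ln EMC = ln K + a ln ADWP + b ln TP + c ln Imax5
        Returns the parameters (K, a, b, c) and the calibration RMSE (Eq. 1).
        """
        X = np.column_stack([np.ones_like(adwp),
                             np.log(adwp), np.log(tp), np.log(imax5)])
        y = np.log(emc)
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        residuals = y - X @ beta                 # O_i - T_i in log space
        rmse = np.sqrt(np.mean(residuals ** 2))  # Eq. 1
        return (np.exp(beta[0]), beta[1], beta[2], beta[3]), rmse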
Let Y₁ and X₁ denote the N₁ observations used for calibration. The calibrated model, with
parameter vector β, reproduces these observations up to the calibration errors ε₁:

Y₁ᵢ = f(X₁ᵢ, β) + ε₁ᵢ    Eq. 2

Let Y₂ and X₂ denote the N₂ observations held out for validation (N₂ = N − N₁). The
prediction error is the vector ε₂:

ε₂ = Y₂ − Ŷ₂    Eq. 3

Ŷ₂ᵢ = f(X₂ᵢ, β)    Eq. 4
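As a sketch, the evaluation of a single split under Eqs. 2-4 might look as follows (the
evaluate_split name and the dict-of-arrays data layout are assumptions, and calibrate_m3 is the
hypothetical helper sketched earlier):

    import numpy as np

    def evaluate_split(data, calib_idx, valid_idx):
        """Evaluate one calibration/validation split (Eqs. 2-4) for model M3.

        The model is calibrated on the N1 events indexed by calib_idx, then
        used to predict the N2 = N - N1 held-out events; the validation RMSE
        summarises the prediction error vector eps2 = Y2 - Yhat2.
        """
        adwp, tp, imax5, emc = (data[k] for k in ("ADWP", "TP", "Imax5", "EMC"))
        (K, a, b, c), rmse_cal = calibrate_m3(adwp[calib_idx], tp[calib_idx],
                                              imax5[calib_idx], emc[calib_idx])
        # Eq. 4: predictions for the held-out events (log space)
        yhat2 = (np.log(K) + a * np.log(adwp[valid_idx])
                 + b * np.log(tp[valid_idx]) + c * np.log(imax5[valid_idx]))
        eps2 = np.log(emc[valid_idx]) - yhat2    # Eq. 3
        return rmse_cal, np.sqrt(np.mean(eps2 ** 2))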
For each value of N₁, one gets two distributions of the RMSE: one for calibration and one for
validation. Since validation aims to show that the model can predict as well as it calibrates,
the two distributions can be compared. The results showed that these distributions are not
always normal, so the use of non-parametric tests is recommended. The idea is to test whether
the two distributions are identical; the best split is the one that maximizes the probability of
having two identical distributions. The most appropriate test in our case is the Wilcoxon rank
sum test (Hollander and Wolfe, 1973). This is a test of the null hypothesis H0 that the two
samples are drawn from the same distribution, against the alternative hypothesis H1 that they
come from different distributions (Figure 1). The test returns the significance probability p
for the hypothesis that the two distributions are identical.
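Using SciPy's implementation of this test, the comparison can be sketched as follows (the
function name is hypothetical; the two RMSE samples would be collected over many random splits,
as described above):

    from scipy.stats import ranksums

    def split_p_value(rmse_cal_dist, rmse_val_dist):
        """Wilcoxon rank sum test of H0: the calibration and validation RMSE
        samples come from the same distribution (Hollander and Wolfe, 1973).
        Returns the significance probability p; the optimal split is the one
        that maximises p.
        """
        _, p = ranksums(rmse_cal_dist, rmse_val_dist)
        return p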
RESULTS
CALIBRATION AND VALIDATION VARIABILITY
The procedures above were applied to each catchment, for each pollutant, and for the three
models. Regardless of the model, pollutant or catchment used, the evolution of the calibration
RMSE distribution as a function of the number of data used in calibration is very similar
(Figure 2). With only four events used for calibration, the models were able to fit the
observations perfectly, of course with a different set of parameters for each calibration
subset. This is not surprising, as it is easy to fit 4 rainfall events perfectly with a model
having 4 calibration parameters. It was also found that, in general, more than 20 events are
necessary to obtain a calibration RMSE distribution centred on the RMSE value obtained using all
available data for calibration.
As with the calibration distribution, the evolution of the validation distribution was identical
for all catchments and pollutants using the three models. A typical plot is given in Figure 3.
For n = 4 the RMSE values are very high and lie outside the figure for scale reasons. In fact,
when few data are used for calibration, the model fits not only the data but also the
measurement errors. In addition, the data do not cover enough of the possible conditions. This
explains the weak predictive ability of the calibrated model.
An analysis of the calibration subsets giving the best results (RMSE values closest to the one
obtained using all data) will be carried out to identify potential characteristic groups of
events. This might help in optimising data collection and reducing costs.
DATA SPLIT
As mentioned previously, the Wilcoxon rank sum test was applied to test whether the RMSE
distributions for calibration and validation are identical for each split. For each data split,
the significance probability is returned. The example of the "Le Marais" catchment for COD using
model M3 is shown in Figure 4. In this case, the calibration and validation RMSE distributions
are most likely identical when using N1 = 49 events for calibration and N2 = N − N1 = 15 events
for validation. This means that using approximately 25 % of the data for validation is
statistically optimal. This ratio has been calculated for the three pollutants and the three
models; a summary of the results is shown in Figure 5.
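A sketch of the overall search for the optimal split, built on the hypothetical helpers above
(the number of splits per N₁ and the range of N₁ tested are assumptions, not specified in the
paper):

    import numpy as np

    def find_optimal_split(data, n_splits=1000, seed=0):
        """For each calibration size N1, draw many random splits, collect the
        calibration and validation RMSE distributions, and compute the
        Wilcoxon p-value; the optimal N1 is the one that maximises p.
        Uses the hypothetical evaluate_split and split_p_value sketched
        earlier.
        """
        N = len(data["EMC"])
        rng = np.random.default_rng(seed)
        p_values = {}
        for n1 in range(5, N - 1):      # assumed range of calibration sizes
            rmse_cal, rmse_val = [], []
            for _ in range(n_splits):
                idx = rng.permutation(N)
                rc, rv = evaluate_split(data, idx[:n1], idx[n1:])
                rmse_cal.append(rc)
                rmse_val.append(rv)
            p_values[n1] = split_p_value(rmse_cal, rmse_val)
        best_n1 = max(p_values, key=p_values.get)
        return best_n1, p_values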
In Figure 5, the ratio of data to be used for validation is drawn as a function of the size of
the available dataset. Each point on the figures refers to a catchment; hence, on each figure,
13 points can be seen, except for BOD where only 12 points are present. None of the optimal
splits reached a 50 % ratio for validation: all cases showed validation data ratios below 40 %.
Figure 4: Significance probability for Le Marais catchment for COD using M3 (significance
probability versus number of data used for calibration)
A generally increasing trend can be observed for BOD and COD, and a less obvious one for SS.
Where the validation data ratio equals zero, the validation and calibration RMSE distributions
are not identical for any of the tested splits. If the number N of available data is less than
30, the optimal validation proportion varies between 0 and about 25 %. This proportion ranges
from 25 to 40 % for N > 30. For SS it is more difficult to draw conclusions because the points
are more dispersed. For COD and SS, the "Le Marais" catchment (N > 60) reverses the trend and
gives a lower validation proportion at higher N: the proportion of the data to be used for
validation declines to near 20 %. A single point beyond N = 60 does not allow any generalisation
of this trend.
The scatter differs from one pollutant to another and from one model to another. Different
models and different pollutants imply different modelling errors. This suggests that N may not
be the sole determining factor for an optimal split: other factors, such as the model
error-to-signal ratio, may also influence it. Indeed, one can expect that more data must be used
to calibrate a poorer model, or when the calibration data have larger measurement errors.
However, beyond some (very large) N, any split might be expected to perform well, since even a
small proportion of the data can then cover most of the possible conditions. Unfortunately this
is not the case in this field of research, where data collection is very expensive and only a
limited number N of data is used in modelling.
[Figure 5: Proportion of data to be used for validation (%) versus number of available data,
with one panel per pollutant (BOD, COD, SS) and per model (M1, M2, M3), one point per catchment]
CONCLUSIONS
In this paper two main issues have been investigated. The first is the effect of calibration
dataset size and characteristics on the performance of multiple regression models. It was found
that using few data (fewer than 20 events) for calibration introduces large variability in the
results. In addition, in contrast to the low calibration RMSE values, validation RMSE values can
be extremely high, so the predictive ability of the model is very poor. Multiple regression
models were found to be very sensitive to the calibration data; outliers can easily affect the
calibration results.
The second issue is the optimal split of the available dataset into calibration and validation
subsets. Results for BOD and COD showed a clear correlation between the number of available data
and the optimal proportion to be used for validation, for N between 10 and 60. For N less than
20, it was found that less than 15 % of the available data should be used for validation. The
main finding here is that the usual split of the data into halves is not optimal: more data must
be allocated to calibration than to validation. The validation proportion did not exceed 40 % in
any case.
A further analysis of the relation between the validation ratio and the noise-to-signal ratio in
model performance might explain the differences between the scatters of the different pollutants
and models. Another extension of this work could be the investigation of balanced split sampling
(McCarthy, 1976) instead of random splitting. Balanced split sampling produces subsets covering
approximately the same conditions, which could partly resolve some of the above difficulties.
ACKNOWLEDGMENTS
The authors gratefully acknowledge the financial support of the RGCU “Réseau Génie Civil et
Urbain”.
REFERENCES
Chebbo G., Gromaire M.-C., Garnaud S., Gonzalez A. (1999). The Experimental Urban Catchment "Le
Marais" in Paris. Proceedings of the 8th International Conference on Urban Storm Drainage
(ICUSD), Sydney, Australia, 30 August - 3 September 1999, 1520-1527.
Driver N.E., Tasker G.D. (1990). Techniques for estimation of storm-runoff loads, volumes, and
selected constituent concentrations in urban watersheds in the United States. U.S. Geological
Survey Water-Supply Paper 2363, 44 p.
Hollander M., Wolfe D.A. (1973). Nonparametric Statistical Methods. New York (USA): Wiley.
INSA/SOGREAH (1999). CANOE User Manual. Villeurbanne (France): ALISON, INSA LYON.
Jewell T.K., Nunno T.J., Adrian D.D. (1978). Methodology for calibrating stormwater models.
Journal of the Environmental Engineering Division, 104(3), 485-501.
Kerlinger F.N., Pedhazur E.J. (1973). Multiple Regression in Behavioural Research. New York
(USA): Holt, Rinehart and Winston.
McCarthy P.J. (1976). The use of balanced half-sample replication in cross-validation studies.
Journal of the American Statistical Association, 71(September), 596-604.
Saget A. (1994). Base de données sur la qualité des rejets urbains de temps de pluie :
distribution de la pollution rejetée, dimensions des ouvrages d'interception. PhD thesis, Ecole
Nationale des Ponts et Chaussées, Paris (France), 333 p.
Saget A., Chebbo G. (1996). Qastor: The French Database about the Quality of Urban Wet Weather
Discharges. Proceedings of the 7th International Conference on Urban Storm Drainage (ICUSD),
Hannover, Germany, 9-13 September 1996, vol. 3, 1707-1713.
Schlütter F. (1999). Numerical modelling of sediment transport in combined sewer systems. PhD
thesis, Aalborg University, Aalborg (Denmark), 172 p.
Schueler T.R. (1987). Controlling Urban Runoff: A Practical Manual for Planning and Designing
Urban BMPs. Publication No. 87703. Metropolitan Washington Council of Governments, Washington,
DC (USA), 275 p.
Vaze J., Chiew F.H.S. (2003). Comparative evaluation of storm water quality models. Water
Resources Research, 39(10), 10 p.