**airGR** is a package that brings into the [**R software**](https://cran.r-project.org/) the hydrological modelling tools used and developed at the [Catchment Hydrology Team](https://webgr.irstea.fr/en/) at [Irstea (France)](http://www.irstea.fr/en/), including the [**GR rainfall-runoff models**](https://webgr.irstea.fr/en/modeles/) and a snowmelt and accumulation model, [**CemaNeige**](https://webgr.irstea.fr/en/modeles/modele-de-neige/). Each model core is coded in **Fortran** to ensure low computational time. The other package functions (i.e. mainly the calibration algorithm and the efficiency criteria calculation) are coded in **R**.

**airGR** is a package that brings into the [**R software**](https://cran.r-project.org/) the hydrological modelling tools used and developed at the [Catchment Hydrology Research Group](https://webgr.irstea.fr/en/) at [Irstea (France)](http://www.irstea.fr/en/), including the [**GR rainfall-runoff models**](https://webgr.irstea.fr/en/modeles/) and a snowmelt and accumulation model, [**CemaNeige**](https://webgr.irstea.fr/en/modeles/modele-de-neige/). Each model core is coded in **Fortran** to ensure low computational time. The other package functions (i.e. mainly the calibration algorithm and the efficiency criteria calculation) are coded in **R**.

The **airGR** package has been designed to fulfill two major requirements: to facilitate the use by non-expert users and to allow flexibility regarding the addition of external criteria, models or calibration algorithms. The names of the functions and their arguments were chosen to this end. **airGR** also contains basics plotting facilities.

...

...

@@ -34,6 +34,7 @@ The models can be called within **airGR** using the following functions:

The [**GRP**](https://webgr.irstea.fr/en/modeles/modele-de-prevision-grp/) forecasting model and the [**Otamin**](https://webgr.irstea.fr/en/modeles/otamin/) predictive uncertainty tool are not available in **airGR**.

In this vignette, we show how to prepare and run a calibration and a simulation with airGR hydrological models.

# Loading data

...

...

@@ -47,7 +48,7 @@ First, it is necessary to load the **airGR** package:

library(airGR)

```

Below is presented an example of a `data.frame` of hydrometeorological observations time series for a fictional catchment included in the **airGR** package that contains:

Below is presented an example of a `data.frame` of daily hydrometeorological observations time series for a fictional catchment included in the **airGR** package that contains:

* *DatesR*: dates in the POSIXt format

* *P*: average precipitation [mm/day]

...

...

@@ -71,7 +72,7 @@ To run a model, the functions of the **airGR** package (e.g. the models, calibra

To facilitate the use of the package, there are several functions dedicated to the creation of these objects:

* `CreateInputsModel()`: prepares the inputs for the different hydrological models (times series of dates, precipitation, observed discharge, etc.)

* `CreateRunOptions()`: prepares the options for the hydrological model run (warm-up period, calibration period, etc.)

* `CreateRunOptions()`: prepares the options for the hydrological model run (warmup period, calibration period, etc.)

* `CreateInputsCrit()`: prepares the options in order to compute the efficiency criterion (choice of the criterion, choice of the transformation on discharge: "log", "sqrt", etc.)

* `CreateCalibOptions()`: prepares the options for the hydrological model calibration algorithm (choice of parameters to optimize, predefined values for uncalibrated parameters, etc.)

...

...

@@ -118,7 +119,7 @@ As a consequence, it is possible to define in `CreateRunOptions()` the following

* `IniStates`: the initial states of the 2 unit hydrographs (20 + 40 = 60 units)

* `IniResLevels`: the initial levels of the production and routing stores

* `IndPeriod_WarmUp`: the warm-up period used to run the model, to be defined in the same format as `IndPeriod_Run`

* `IndPeriod_WarmUp`: the warmup period used to run the model, to be defined in the same format as `IndPeriod_Run`

```{r}

...

...

@@ -130,7 +131,7 @@ str(RunOptions)

The `CreateRunOptions()` function returns warnings if the default initialization options are used:

* `IniStates` and `IniResLevels` are automatically set to initialize all the model states at 0, except for the production and routing stores, which are initialized at respectively 30% and 50 % of their capacity

* `IndPeriod_WarmUp` default setting ensures a one-year warm-up using the time steps preceding the `IndPeriod_Run`, if available

* `IndPeriod_WarmUp` default setting ensures a one-year warmup using the time steps preceding the `IndPeriod_Run`, if available

## InputsCrit object

...

...

@@ -138,7 +139,7 @@ The `CreateRunOptions()` function returns warnings if the default initialization

The `CreateInputsCrit()` function allows to prepare the input in order to calculate a criterion. It is possible to define the following arguments:

* `FUN_CRIT`: the name of the error criterion function (they are introduced later on)

* `FUN_CRIT`: the name of the error criterion function (the available functions are introduced later on)

* `InputsModel`: the inputs of the hydrological model previously prepared by the `CreateInputsModel()` function

* `RunOptions`: the options of the hydrological model previously prepared by the `CreateRunOptions()` function

* `Qobs`: the observed discharge expressed in *mm/time step*

The `Calibration_Michel()` function is the only one implemented in the **airGR** package to calibrate the model, but the user can implement its own calibration function. Two vignettes explain how it can be done ([2.1](V02.1_param_optim.html) and [2.2](V02.2_param_mcmc.html)).

The `Calibration_Michel()` function is the only one implemented in the **airGR** package to calibrate the model, but the user can implement its own calibration function. Two vignettes explain how it can be done ([2.1 Plugging in new calibration](V02.1_param_optim.html) and [2.2 MCMC parameter estimation](V02.2_param_mcmc.html)).

The `Calibration_Michel()` function returns a vector with the parameters of the chosen model, which means that the number of values can differ depending on the model that is used. It is possible to use the `Calibration_Michel()` function with user-implemented hydrological models.

...

...

@@ -247,7 +248,7 @@ Moreover, if the CemaNeige model is used, the air temperature and the simulated

## Efficiency criterion

To evaluate the efficiency of the model, it is possible to use the same criterion as defined at the calibration step or to use another criterion.

To evaluate the efficiency of the model, it is possible to use the same criterion as defined at the calibration step or to use another criterion.

title: "Plugging new calibration algorithms in airGR"

title: "Plugging in new calibration algorithms in airGR"

author: "François Bourgin"

output: rmarkdown::html_vignette

vignette: >

%\VignetteEngine{knitr::rmarkdown}

%\VignetteIndexEntry{Plugging new calibration algorithms}

%\VignetteIndexEntry{Plugging in new calibration algorithms}

%\VignetteEncoding{UTF-8}

---

...

...

@@ -25,12 +25,12 @@ library(Rmalschains)

## Scope

The Michel's calibration strategy (`Calibration_Michel()` function) is the proposed calibration algorithm in **airGR**. However, other optimization methods can be used in combination with **airGR**.

We show how to use different R packages to perform parameter estimation.

The Michel's calibration strategy (`Calibration_Michel()` function) is the calibration algorithm proposed in **airGR**. However, other optimization methods can be used in combination with **airGR**.

We show here how to use different R packages to perform parameter estimation.

In this vignette, we use the **GR4J** model to illustrate the different optimization strategies.

In particular, we assume that the R global environment contains input climate data, observed discharge and functions from the **airGR** vignette, as shown below.

The calibration period is defined in the `CreateRunOptions()` function .

In particular, we assume that the R global environment contains input climate data, observed discharge and functions from the **airGR** [Get Started](V01_get_started.html) vignette, as shown below.

Please note that the calibration period is defined in the `CreateRunOptions()` function .

@@ -50,7 +50,7 @@ Parameter estimation can be performed by defining a function that takes a parame

There are two important steps: the transformation of parameters to real space and the computation of the value of the performance criterion.

Here we choose to minimize the root mean square error.

The change of the repository from the "real" parameter space to a "transformed" space ensures homogeneity of displacement in the different dimensions of the parameter space during the stepbystep procedure of the calibration algorithm of the model.

The change of the repository from the "real" parameter space to a "transformed" space ensures homogeneity of displacement in the different dimensions of the parameter space during the step-by-step procedure of the calibration algorithm of the model.

title: "Parameter estimation within an Bayesian MCMC framework"

title: "Parameter estimation within a Bayesian MCMC framework"

author: "François Bourgin"

output: rmarkdown::html_vignette

vignette: >

...

...

@@ -25,9 +25,9 @@ library(dplyr)

## Scope

Here, we give an example of parameter estimation within a Bayesian MCMC approach.

In this vignette, we give an example of parameter estimation within a Bayesian MCMC approach.

We use the **GR4J** model and we assume that the R global environment contains data and functions from the **airGR** vignette.

We use the **GR4J** model and we assume that the R global environment contains data and functions from the **airGR** [Get Started](V01_get_started.html) vignette.

Please refer to the [2.1 Parameter estimation](V02.1_param_optim.html) vignette for explanations on how to plug in a parameter estimation algorithm to **airGR**.

Please refer to the [2.1 Plugging in new calibration](V02.1_param_optim.html) vignette for explanations on how to plug in a parameter estimation algorithm to **airGR**.

Please note that this vignette is only for illustration purposes and does not provide any guidance about which parameter inference strategy is recommended for the family of the GR models.

...

...

@@ -46,7 +46,7 @@ Please note that this vignette is only for illustration purposes and does not pr

We show how to use the DRAM algorithm for SLS Bayesian inference, with the `modMCMC()` function of the [FME](https://cran.r-project.org/web/packages/FME/) package.

First, we need to define a function that returns twice the opposite of the log-likelihood for a given parameter set.

Nota: in the `RunAirGR4J()` function the computation of the the log-likelihood is simplified in order to ensure a good computing performance. It corresponds to a translation of the two following lines.

Nota: in the `RunAirGR4J()` function, the computation of the log-likelihood is simplified in order to ensure a good computing performance. It corresponds to a translation of the two following lines.

@@ -55,7 +55,7 @@ Nota: in the `RunAirGR4J()` function the computation of the the log-likelihood i

In our simplified setting of Gaussian likelihood with measurement error integrated out, the log of the sum of squared error is related to the log-likelihood.

Note that we do not use here any discharge transformation but applying Box-Cox transformation is quite popular in hydrological modelling.

Note that we do not use here any discharge transformation, although applying Box-Cox transformation is quite popular in hydrological modelling.

There are several diagnostics that can be used to check the convergence of the chains.

The R package [coda](https://cran.r-project.org/web/packages/coda/index.html) provides several diagnostic tests.

Among others, the Gelman and Rubin's convergence can be used. A value close to 1 suggests acceptable convergence

The result will be better with more iterations than 2000. As we have kept the iterations during the convergence process, we have to set the `autoburnin` argument to `TRUE` in order to consider only the second half of the series.

Among others, the Gelman and Rubin's convergence can be used. A value close to 1 suggests acceptable convergence.

The result will be better with more iterations than 2000. As we kept the iterations during the convergence process, we have to set the `autoburnin` argument to `TRUE` in order to consider only the second half of the series.

Note that we rescale model parameter sets of the GR4J model from the transformed space to the real space

Note that we rescale model parameter sets of the GR4J model from the transformed space to the real space.

The given GR4J **X4u** variable does not correspond to the actual GR4J **X4** parameter. As explained in @andreassian_seeking_2014 [section 2.1], the given GR4J **X4u** value has to be adjusted (rescaled) using catchment area (S) [km2] as follows: $X4 = X4u / 5.995 \times S^{0.3}$ (please note that **the formula is erroneous in the publication**).

So, we need to transform the **X4** parameter in the right way.

It means we need first to transform the **X4** parameter.

It is recommended to use the generalist parameter sets when the calibration period is less than 6 month.

As shown in @andreassian_seeking_2014 [figure 4], a recommended way to use the `Param_Sets_GR4J` dataframe is to run the GR4J model with each parameter set and to select the best one according to an objective function (here we use the Nash-Sutcliffe Efficiency criterion).

As shown in @andreassian_seeking_2014 [figure 4], a recommended way to use the `Param_Sets_GR4J` `data.frame` is to run the GR4J model with each parameter set and to select the best one according to an objective function (here we use the Nash-Sutcliffe Efficiency criterion).

Find below the `r nrow(Param_Sets_GR4J)` criteria corresponding to the different parameter sets.

The values are quite low (from `r OutputsCrit_Loop[which.min(OutputsCrit_Loop)]` to `r OutputsCrit_Loop[which.max(OutputsCrit_Loop)]`), which can be expected as this does not represents an actual calibration.

The criterion values are quite low (from `r OutputsCrit_Loop[which.min(OutputsCrit_Loop)]` to `r OutputsCrit_Loop[which.max(OutputsCrit_Loop)]`), which can be expected as this does not represents an actual calibration.

```{r, echo=FALSE}

OutputsCrit_Loop

```

...

...

@@ -142,11 +142,12 @@ Now we can compute the Nash-Sutcliffe Efficiency criterion on the validation per

# Calibration of the GR4J model with the buit-in `Calibration_Michel()` function

# Calibration of the GR4J model with the built-in `Calibration_Michel()` function

As seen above, the Michel's calibration algorithm is based on a local search procedure.

It is **not recommanded** to use the generalist parameter sets when the **calibration period is less than 6 month**.

It is **not recommanded** to use the `Calibration_Michel()` function when the **calibration period is less than 6 month**.

We will show below its application on the same short period for two configurations of the grid-screening step to demonstrate that it is less efficient than the generalist parameters sets calibration.

## GR4J parameter distributions quantiles used in the grid-screening step

...

...

@@ -184,9 +185,9 @@ The Nash-Sutcliffe Efficiency criterion computed on the calibration period is be

This shows that the generalist parameter sets give more robust model in this case.

# GR4J parameter sets used in the grid-screening step

## GR4J parameter sets used in the grid-screening step

It is also possible to give to the `CreateCalibOptions()` function a matrix of parameter sets used for grid-screening calibration procedure.

It is also possible to give to the `CreateCalibOptions()` function a matrix of parameter sets used for the grid-screening calibration procedure.

So, it possible is to use by this way the GR4J generalist parameter sets.

The results are the same here. The Nash-Sutcliffe Efficiency criterion computed on the calibration period is better (`r OutputsCrit_Cal$CritValue`), but the one computed on the validation period is just a little bit higher (`r OutputsCrit_Val$CritValue`).

The results are the same here. The Nash-Sutcliffe Efficiency criterion computed on the calibration period is better (`r OutputsCrit_Cal$CritValue`), but the one computed on the validation period is just a little bit lower (`r OutputsCrit_Val$CritValue`) than the classical calibration.

Generally the advantage of using GR4J parameter sets rather than the GR4J generalist parameter quantiles, is that they make more sense than a simple exploration of combinations of quantiles of parameter distributions. In addition, for the first step, the number of iterations is smaller (27 runs instead of 81), which can save time if one realizes a very large number of calibrations.

Generally, the advantage of using GR4J parameter sets rather than the GR4J generalist parameter quantiles is that they make more sense than a simple exploration of combinations of quantiles of parameter distributions (each parameter set represents a consistent ensemble). In addition, for the first step, the number of iterations is smaller (27 runs instead of 81), which can save time if one wants to run a very large number of calibrations.