# Statistical description of wave parameters

Because of the random nature of natural waves, a statistical description is normally used. A fair approximation of the observed distribution of wave heights is given by the Rayleigh distribution (see Appendix A). Statistical wave parameters are often calculated based on this distribution. The most commonly used variables in coastal engineering are described below.

## Most commonly used variables in coastal engineering

### Significant wave height

Fig. 1. Time-series of surface elevations by individual waves for a certain sea state.

An example of a wave record representative of a certain sea state is shown in Fig. 1. The significant wave height, $H_s$, is the mean of the highest third of the waves; the notation $H_{1/3}$ is also often used instead of $H_s$. $H_s$ is a good measure of the average height of the highest waves in a wave group. The significant wave height can also be computed from the wave energy. For non-breaking waves, $H_s \approx H_{m0} = 4 [\lt E\gt / (g \rho)]^{1/2}$, where $H_{m0}$ is the spectral significant wave height and $\lt E\gt$ is the average wave energy. An explanation, definitions and formulas are given in Appendix A.
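The two definitions can be compared on a small example. The sketch below is illustrative only: the function names, the rounding rule for "the highest third" and the sample wave heights are assumptions, and the spectral estimate uses $\lt E\gt /(g \rho) = m_0$, the variance of the surface elevation, as defined in Appendix A.

```python
import math

def significant_wave_height(heights):
    """Mean of the highest third of the individual wave heights (H_1/3)."""
    ranked = sorted(heights, reverse=True)
    n_top = max(1, len(ranked) // 3)  # assumption: round down, keep at least one wave
    return sum(ranked[:n_top]) / n_top

def spectral_wave_height(elevations):
    """H_m0 = 4 * sqrt(m0), with m0 the variance of the surface elevation."""
    mean = sum(elevations) / len(elevations)
    m0 = sum((e - mean) ** 2 for e in elevations) / len(elevations)
    return 4.0 * math.sqrt(m0)

# Hypothetical individual wave heights in metres
heights = [0.4, 1.1, 0.7, 1.6, 0.9, 1.3]
h_s = significant_wave_height(heights)   # mean of the two highest waves

# One full period of a sinusoid with 1 m amplitude, sampled at 360 points
eta = [math.cos(2 * math.pi * k / 360) for k in range(360)]
h_m0 = spectral_wave_height(eta)         # variance is 1/2, so H_m0 = 4 / sqrt(2)
```

For a single regular wave, $H_{m0}$ is not the crest-to-trough height (2 m here); the factor 4 is calibrated on the Rayleigh statistics of irregular waves.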

### Mean wave period

The mean wave period, $T_m$, is the mean of all wave periods in a time-series representing a certain sea state.
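Individual wave periods first have to be extracted from the elevation record. A common way to do this, assumed here because the text does not specify a method, is zero up-crossing analysis: one wave is the stretch of record between two successive upward crossings of the mean level. A minimal sketch with a synthetic record (the function name and record are hypothetical):

```python
import math

def zero_upcrossing_periods(elevations, dt):
    """Periods between successive upward crossings of the mean water level."""
    mean = sum(elevations) / len(elevations)
    eta = [e - mean for e in elevations]
    crossings = []
    for k in range(len(eta) - 1):
        if eta[k] < 0.0 <= eta[k + 1]:
            # Linear interpolation of the crossing time between samples k and k+1
            frac = -eta[k] / (eta[k + 1] - eta[k])
            crossings.append((k + frac) * dt)
    return [t2 - t1 for t1, t2 in zip(crossings, crossings[1:])]

# Synthetic record: regular wave with an 8 s period, sampled every 0.1 s for 80 s
dt = 0.1
record = [math.sin(2 * math.pi * (k * dt) / 8.0 + 1.0) for k in range(800)]
periods = zero_upcrossing_periods(record, dt)
t_m = sum(periods) / len(periods)   # mean wave period
```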

### Peak wave period

Fig. 2. Wave spectrum: $H_{m0}=1$ m, $T_{02}=3.55$ s, $T_p=5$ s (corresponding to a peak frequency of 0.2 $s^{-1}$).

The peak wave period, $T_p$, is the wave period with the highest energy. The analysis of the distribution of the wave energy as a function of wave frequency $f=1/T$ for a time-series of individual waves is referred to as a spectral analysis. The energy distribution of wind waves over frequency often follows the so-called JONSWAP or Pierson-Moskowitz spectra (see Appendix B). The peak wave period is extracted from the spectrum. As a rule of thumb, the following relation (with $T_p$ in seconds and $H_{m0}$ in metres) can be used, see Fig. 5:

$T_p \approx 5 \sqrt{H_{m0}}. \qquad (1)$

### Mean wave direction

The mean wave direction, $\theta_m$, is defined as the mean of all the individual wave directions in a time-series representing a certain sea state.
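Directions are angles, so an arithmetic mean fails when the values straddle north (for example 350° and 10°). One common remedy, used here as an assumption because the text does not prescribe a method, is to average the unit vectors of the individual directions. The sketch is unweighted; weighting by wave energy would be an equally plausible choice.

```python
import math

def mean_direction(directions_deg):
    """Unweighted circular mean of directions given in degrees (0 to 360)."""
    s = sum(math.sin(math.radians(d)) for d in directions_deg)
    c = sum(math.cos(math.radians(d)) for d in directions_deg)
    return math.degrees(math.atan2(s, c)) % 360.0

# Hypothetical directions around north; the arithmetic mean would be about 127 degrees
theta_m = mean_direction([350.0, 10.0, 20.0])   # a few degrees east of north
```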

## Description of wave conditions

These various wave parameters are often calculated from continuous or periodic time-series of the surface elevations; typically the parameters are calculated once every one or three hours, whereby a new discrete time-series of the statistical wave parameters is constructed. This time-series is thereafter analyzed statistically to arrive at a condensed description of the wave conditions as follows:

• Wave height distribution represented by $H_s$ vs. percentage of exceedance. This often follows a Weibull distribution (see Appendix A and the example in Fig. 3);
• Directional distribution of the wave heights, which is often presented in the form of a wave rose (see Appendix B and the example in Fig. 4);
• Scatter diagram of $T_p$ vs. $H_s$ (example in Fig. 5).

Fig. 3. Wave height exceedance distribution for various wave directions; wave climate from the West Coast of Denmark.

Fig. 4. Wave height directional distribution, a so-called wave rose. Wave climate from the West Coast of Denmark.

Fig. 5. Scatter diagram of $T_p$ vs. $H_{m0}$; wave climate from the West Coast of Denmark. The approximate empirical relation between $T_p$ and $H_{m0}$ for storm waves is also shown.
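The exceedance distribution and the scatter diagram in the list above are simple counts over the series of statistical wave parameters. The sketch below uses a short, made-up series of $H_s$ and $T_p$ values; the function names and the bin widths are assumptions chosen for illustration.

```python
from collections import Counter

def exceedance_percentages(hs_series, thresholds):
    """Percentage of records in which H_s exceeds each threshold."""
    n = len(hs_series)
    return [100.0 * sum(1 for h in hs_series if h > t) / n for t in thresholds]

def scatter_table(hs_series, tp_series, dh=1.0, dt=2.0):
    """Counts per (H_s bin, T_p bin); bin i covers [i*width, (i+1)*width)."""
    return Counter((int(h // dh), int(t // dt)) for h, t in zip(hs_series, tp_series))

# Hypothetical three-hourly values: H_s in m, T_p in s
hs_series = [0.5, 0.8, 1.2, 2.5, 1.9, 0.7, 3.1, 1.4]
tp_series = [3.5, 4.2, 5.5, 8.0, 7.1, 4.0, 9.0, 6.0]

pct = exceedance_percentages(hs_series, [0.5, 1.0, 2.0, 3.0])
table = scatter_table(hs_series, tp_series)
```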

Analyses of extreme wave conditions are performed on the basis of maximum wave heights in single storm events or on the basis of annual maximum wave heights. These analyses are often presented as exceedance probability vs. wave height, see Fig. 6 for an example.

Fig. 6. Extreme value analysis of wave height by Weibull distribution. Threshold = 2 m.

• Mangor, K., Drønen, N.K., Kaergaard, K.H. and Kristensen, S.E. 2017. Shoreline Management Guidelines. DHI Water and Environment, 451pp. 
• Coastal Engineering Manual, part II, chapter 1. US Army Corps of Engineers (USACE), 2008

## Appendix A: Rayleigh distribution

We consider a wave field $\eta(x,y,t)$ consisting of a superposition of $n$ independent, uncorrelated sinusoidal (i.e. linear) waves, with amplitudes $a_j$, radial frequencies $\omega_j$ and random phases $\phi_j$, originating from different nearby and remote regions. This superposition can be represented by

$\eta=Re[\sum_{j=1}^{n} a_j \exp(i\omega_j t + i\phi_j)]. \qquad (A1)$

The statistical distribution of wave heights was derived by Longuet-Higgins (1952) under two specific conditions: (a) the radial frequencies $\omega_j$ of the random waves are grouped in a single narrow band around a central frequency $\omega$ such that $|\omega_j -\omega_{j'}|/ \omega \ll 1$ for each $j, j'$; (b) the slowly varying numbers $\; a=\sum_{j=1}^{n} a_j \cos \big( \phi_j + (\omega_j - \omega)t \big), \, b=\sum_{j=1}^{n} a_j \sin\big( \phi_j + (\omega_j - \omega)t \big) \;$ are statistically independent and normally (Gaussian) distributed. Under these conditions the expression (A1) can be approximated for the time interval $-\pi / \omega \lt t\lt \pi / \omega$ by

$\eta \approx \frac{1}{2} H \, \cos(\omega t+ \phi) , \quad H = 2 \sqrt{a^2 + b^2} , \quad \phi = \tan^{-1} (b/a) . \qquad (A2)$

A well-known mathematical theorem states that the length of a two-dimensional vector with independent, zero-mean Gaussian distributed components of equal variance follows the Rayleigh distribution. In this case the vector length is the wave height $H = 2\sqrt{a^2+b^2}$ and the components are the random numbers $2a, 2b$. The Rayleigh probability density function $p_R(H)$ for the wave height $H$ reads:

$p_R(H) = \Large\frac{2H}{H_{rms}^2}\normalsize \exp\Large (-(\frac{H}{H_{rms}})^2)\normalsize . \qquad (A3)$

The root mean square wave height (also called mean energy wave height) $H_{rms}$ is related to the average wave energy $\lt E\gt$:

$H_{rms}^2 = \int_0^{\infty} p_R(H) H^2 dH =\Large\frac{8}{g \rho}\normalsize \lt E\gt . \qquad (A4)$

The average wave energy is defined as $\lt E\gt = g \rho \, m_0 \equiv g \rho \lt (\, \eta(t)-\lt \eta\gt )^2\gt$, where $g$ is the gravitational acceleration, $\rho$ the seawater density and where $\; \lt \ldots \gt \;$ designates the average over a period much longer than the characteristic wave periods. Not considering wave set-up / set-down implies $\lt \eta\gt =0$.

The average wave height $\lt H\gt$ is related to the root mean square wave height $H_{rms}$ by

$\lt H\gt = \int_0^{\infty} p_R(H) H dH = \Large\frac{\sqrt{\pi}}{2}\normalsize H_{rms} \approx 0.89 H_{rms} .\qquad (A5)$

The cumulative Rayleigh distribution (probability of wave height $\lt H$) is given by

$P_R(H)=\int_0^H p_R(H')dH' = 1-\exp\Large (-(\frac{H}{H_{rms}})^2)\normalsize .\qquad (A6)$
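Relations (A3) to (A6) can be checked numerically. The sketch below integrates the density with a midpoint rule; the integration helper, the cut-off at $8 H_{rms}$ and the choice $H_{rms}=1$ are assumptions made for the illustration.

```python
import math

def rayleigh_pdf(h, h_rms):
    """Rayleigh probability density, Eq. (A3)."""
    return 2.0 * h / h_rms ** 2 * math.exp(-((h / h_rms) ** 2))

def rayleigh_cdf(h, h_rms):
    """Cumulative Rayleigh distribution, Eq. (A6)."""
    return 1.0 - math.exp(-((h / h_rms) ** 2))

def integrate(func, a, b, n=20000):
    """Midpoint rule on n equal sub-intervals."""
    step = (b - a) / n
    return step * sum(func(a + (k + 0.5) * step) for k in range(n))

h_rms = 1.0
upper = 8.0 * h_rms   # assumption: the tail beyond 8 H_rms (about exp(-64)) is negligible

norm = integrate(lambda h: rayleigh_pdf(h, h_rms), 0.0, upper)             # should be 1
mean_h = integrate(lambda h: h * rayleigh_pdf(h, h_rms), 0.0, upper)       # Eq. (A5)
mean_sq = integrate(lambda h: h * h * rayleigh_pdf(h, h_rms), 0.0, upper)  # Eq. (A4)
```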

Assuming that wave heights are Rayleigh distributed, relations can be derived between different wave parameters that are often used in practice. For the derivation of the significant wave height $H_s \equiv H_{1/3}$ (the mean of the highest 1/3-part of the waves) we first determine the lowest height of the highest 1/3-part of the waves, $H_3$, from the condition $P_R(H_3)=2/3$, yielding $H_3=H_{rms} \sqrt{\ln(3)}$. The significant wave height is then related to the root mean square wave height by

$H_s=\Large\frac{\int_{H_3}^{\infty} p_R(H)HdH}{\int_{H_3}^{\infty} p_R(H)dH }\normalsize = 3 \, \int_{H_3}^{\infty} p_R(H)HdH \approx 1.6 \lt H\gt \approx 1.42 H_{rms}. \qquad (A7)$

From Eqs. (A4) and (A7), it follows that the significant wave height $H_s$ is related to the average wave energy $\lt E\gt$ by

$H_s \approx H_{m0} \equiv 4 \Large \sqrt{ \frac{\lt E\gt }{g \rho}}\normalsize \equiv 4 \sqrt{m_0}. \qquad(A8)$

Extreme wave heights can be derived from the Rayleigh distribution in a similar way. For example, the wave height exceeded by 1% of the waves, which follows from the condition $P_R(H)=0.99$, is given by

$H_{1/100} \approx 1.52 H_s . \qquad (A9)$
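The ratios in (A7) and (A9) can be reproduced with closed-form expressions. The sketch below is not taken from the source: it uses the integration by parts $\int_{x_0}^{\infty} 2x^2 e^{-x^2}dx = x_0 e^{-x_0^2} + \frac{\sqrt{\pi}}{2}\,\mathrm{erfc}(x_0)$ with $x = H/H_{rms}$ and $x_0=\sqrt{\ln N}$, and the function names are hypothetical. It evaluates both the height exceeded by a fraction $1/N$ of the waves and the mean of the highest $1/N$ part, so the two definitions can be compared.

```python
import math

def exceeded_by_fraction(n):
    """H / H_rms exceeded by a fraction 1/n of the waves: P_R(H) = 1 - 1/n."""
    return math.sqrt(math.log(n))

def mean_of_highest_fraction(n):
    """Mean of the highest 1/n part of the waves, in units of H_rms.

    Evaluates n * int_{x0}^inf 2 x^2 exp(-x^2) dx by parts, with exp(-x0^2) = 1/n.
    """
    x0 = exceeded_by_fraction(n)
    return x0 + n * 0.5 * math.sqrt(math.pi) * math.erfc(x0)

hs = mean_of_highest_fraction(3)        # H_s / H_rms
mean_h = 0.5 * math.sqrt(math.pi)       # <H> / H_rms, Eq. (A5)

ratio_hs_mean = hs / mean_h                                  # H_s / <H>
ratio_exceeded_1pct = exceeded_by_fraction(100) / hs         # quantile, in units of H_s
ratio_mean_top_1pct = mean_of_highest_fraction(100) / hs     # mean of top 1%, in units of H_s
```

Under the pure Rayleigh assumption this gives $H_s \approx 1.42 H_{rms} \approx 1.6 \lt H\gt$ as in (A7). The height exceeded by 1% of the waves comes out at about $1.52 H_s$, whereas the mean of the highest 1% of the waves is larger, about $1.67 H_s$.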

The Rayleigh distribution has been derived under the rather restrictive conditions (a) and (b). Since the Rayleigh distribution does not put a limit on the wave height, it allows for unrealistically high waves. When compared with wave height statistics obtained from field observations, the Rayleigh distribution tends to overestimate wave heights. This overestimation is greatest (more than 8%) for high long-period waves. It is mainly due to wave breaking, which is related to increasing sharpness of wave peaks in deep water or to depth-induced wave breaking in shallow water. A broad-banded instead of a narrow-banded wave spectrum can be another reason for wave height overestimation by the Rayleigh distribution.

Unlike wave height, wave crest height is generally underestimated by the Rayleigh distribution. The main reason is the neglect of nonlinearity in the wave field. Nonlinearity generates so-called 'skewed' waves: the wave shape changes from sinusoidal to more peaked, with narrow wave crests and shallower, wider wave troughs. The conditions underlying the Rayleigh distribution are not satisfied in the case of significant nonlinearity, because the wave components in Eq. (A1) are not random but correlated so as to generate a systematic deviation from the sinusoidal shape; see Appendix C: Distribution of nonlinear waves.

The Weibull distribution is one of the proposed alternatives to the Rayleigh distribution. The Weibull distribution reads:

$p_W(H)=\Large\frac{m}{\lambda}(\frac{H}{\lambda})^{(m-1)}\normalsize \exp\Large (-(\frac{H}{\lambda})^{m}) \normalsize . \qquad (A10)$

The Rayleigh distribution corresponds to the Weibull distribution for $m=2, \; \lambda=H_{rms}$. The Weibull distribution has an additional parameter ($m$) that allows suppression of the highest waves for $m\gt 2$ and an optimum adjustment to the observed wave data.
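The effect of the shape parameter can be illustrated with a few lines of code. The sketch assumes the two-parameter form (A10); the exceedance probability $\exp(-(H/\lambda)^m)$ follows from integrating (A10) from $H$ to infinity. Function names and sample values are hypothetical.

```python
import math

def weibull_pdf(h, lam, m):
    """Weibull probability density, Eq. (A10)."""
    x = h / lam
    return (m / lam) * x ** (m - 1) * math.exp(-(x ** m))

def weibull_exceedance(h, lam, m):
    """Probability that the wave height exceeds h (integral of (A10) from h to infinity)."""
    return math.exp(-((h / lam) ** m))

def rayleigh_pdf(h, h_rms):
    """Rayleigh probability density, Eq. (A3)."""
    return 2.0 * h / h_rms ** 2 * math.exp(-((h / h_rms) ** 2))

h_rms = 1.0
same_as_rayleigh = weibull_pdf(1.3, h_rms, 2.0) - rayleigh_pdf(1.3, h_rms)  # zero for m = 2

# Exceedance of H = 2 * lambda for the Rayleigh case and for a larger shape parameter
p_rayleigh = weibull_exceedance(2.0, 1.0, 2.0)
p_steeper = weibull_exceedance(2.0, 1.0, 3.6)   # much smaller: high waves are suppressed
```

The suppression applies to heights above $\lambda$; below $\lambda$ a larger $m$ gives a higher exceedance probability.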

This is especially relevant for shallow-water waves, which are truncated due to depth-induced wave breaking (see Breaker index). Because of this truncation, the random numbers $a$ and $b$ in Eq. (A2) are not Gaussian distributed; the wave height therefore does not follow a Rayleigh distribution. For this situation, alternative distributions have been proposed, for example by Battjes and Groenendijk (2000). According to this study, a Weibull distribution with $m=3.6$ should be used above a certain threshold, $H_{tr}$ (threshold for depth-induced wave breaking). This implies that the relationships (A7)-(A9) are not valid in shallow water. For example, if $H_{tr}\lt H_s$ (i.e. very shallow water), Eq. (A9) should be replaced by

$H_{1/100} \approx 1.28 H_s . \qquad (A11)$

Flume experiments of shallow-water wave transformation show that the value of $m$ is not constant but varies over the surf zone slope (gradual increase followed by decrease).

## Appendix B: Frequency spectrum

A wave record can further be characterized by its frequency spectrum. The energy density spectrum of a sea state is generally designated by $E(f)$. The total energy is given by

$\lt E\gt =\int_0^{\infty} E(f)df . \qquad (B1)$

The wave frequency spectrum can be determined from a wave record $\eta(t)$ by using a Fourier transform as follows. The wave energy averaged over a period $[-T/2 \lt (t -t_0)\lt T/2]$ is given by $\lt E\gt =\frac{g \rho}{T} \int_{-T/2}^{T/2} (\eta(t-t_0) - \lt \eta\gt )^2 dt$, where $\lt \eta\gt$ is the mean value. Inserting in this expression the Fourier series expansion of $(\eta(t-t_0) - \lt \eta\gt )$ gives

$\lt E\gt =\sum_{k=1}^{\infty} E(f_k) \Delta f , \quad f_k=k \Delta f, \; \Delta f = \frac{1}{T} , \quad E(f_k) \Delta f =\frac{g \rho}{8} |H_k|^2 , \quad H_k = \frac{4}{T} \int_{-T/2}^{T/2} (\eta(t-t_0)-\lt \eta\gt ) e^{-2 i \pi f_k t} dt .$
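For a sampled record, the integral for $H_k$ becomes a discrete Fourier sum. The sketch below is a direct (slow) implementation, written without external libraries; the sampling interval, record length and test wave are assumptions. For a single sinusoid of height $H$, the component at its frequency should have $|H_k| = H$, and the summed spectrum should reproduce the variance of the record.

```python
import cmath
import math

def wave_height_components(elevations):
    """Discrete analogue of H_k = (4/T) int (eta - <eta>) exp(-2 i pi f_k t) dt."""
    n = len(elevations)
    mean = sum(elevations) / n
    eta = [e - mean for e in elevations]
    components = {}
    for k in range(1, n // 2):   # positive frequencies f_k = k / T below Nyquist
        total = sum(eta[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n))
        components[k] = 4.0 * total / n
    return components

n, dt = 128, 0.5              # record length T = n * dt = 64 s
T = n * dt
H, f0 = 2.0, 8 / T            # hypothetical wave: height 2 m, frequency 0.125 Hz
eta = [0.5 * H * math.cos(2 * math.pi * f0 * j * dt + 0.3) for j in range(n)]

Hk = wave_height_components(eta)
# Sum of E(f_k) * df divided by (g * rho), i.e. the variance m0 from the spectrum
m0_spectrum = sum(abs(h) ** 2 / 8.0 for h in Hk.values())
m0_record = sum(e * e for e in eta) / n - (sum(eta) / n) ** 2
```

In practice a fast Fourier transform and some form of spectral smoothing would normally be used; the loop above only mirrors the formula.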

The wave frequency spectrum can also be determined by modelling the wind-induced wave field in a large source area. Empirical formulas have been established for fully developed wave fields under constant wind stress. For deep water without fetch restriction, it is recommended to use the adapted Pierson-Moskowitz frequency distribution $E_{PM}$:

$E_{PM}(f) = \alpha_{PM} \Large\frac{\lt E\gt }{f_p}(\frac{f_p}{f})^4 e^{-(\frac{f_p}{f})^4}\normalsize . \qquad (B2)$

For the average wave energy $\lt E\gt$ and the peak frequency $f_p$ the following empirical expressions are found:

$\lt E\gt \approx 0.005 \rho g^{-1} U_{10}^4\; , \; f_p \approx 0.123 g U_{10}^{-1} \; ,$

where $g$ is the gravitational acceleration and $U_{10}$ the wind velocity at 10 m above the sea surface. For the coefficient $\alpha_{PM}$ a usual value is $\alpha_{PM} \approx 3.26$.
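As a consistency check, the spectrum (B2) integrated over frequency should return $\lt E\gt$, in line with (B1). The sketch below evaluates the empirical expressions for a hypothetical wind speed and integrates (B2) numerically; the seawater density, the wind speed and the finite integration range are assumptions.

```python
import math

G = 9.81          # gravitational acceleration, m/s^2
RHO = 1025.0      # assumed seawater density, kg/m^3
ALPHA_PM = 3.26   # coefficient quoted in the text

def pm_parameters(u10):
    """Average wave energy <E> (J/m^2) and peak frequency f_p (Hz) from U_10 (m/s)."""
    energy = 0.005 * RHO * u10 ** 4 / G
    f_p = 0.123 * G / u10
    return energy, f_p

def pm_spectrum(f, energy, f_p):
    """Adapted Pierson-Moskowitz energy density, Eq. (B2)."""
    r4 = (f_p / f) ** 4
    return ALPHA_PM * energy / f_p * r4 * math.exp(-r4)

energy, f_p = pm_parameters(10.0)    # hypothetical wind speed of 10 m/s

# Midpoint-rule check of Eq. (B1) on an assumed range from 0.05 f_p to 20 f_p
n = 20000
lo, hi = 0.05 * f_p, 20.0 * f_p
df = (hi - lo) / n
integral = df * sum(pm_spectrum(lo + (k + 0.5) * df, energy, f_p) for k in range(n))

h_m0 = 4.0 * math.sqrt(energy / (G * RHO))   # spectral significant wave height, Eq. (A8)
t_p = 1.0 / f_p                              # peak period
```

The ratio `integral / energy` comes out very close to 1, which indicates that $\alpha_{PM} \approx 3.26$ acts as the normalisation constant of the shape function in (B2).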

For fetch-limited seas, the spectrum is more strongly peaked around the peak frequency. For this situation, the adapted empirical JONSWAP spectrum can be used. It has the form

$E_{JWP} = \alpha_{JWP} \Large\frac{\lt E\gt }{f_p}(\frac{f_p}{f})^4 e^{-(\frac{f_p}{f})^4}\normalsize \gamma^\delta \;, \quad \delta =e^{-\Large\frac{1}{2}\Large(\frac{(f/f_p)-1}{\sigma})^2}\normalsize , \qquad (B3)$

where the parameters $\alpha_{JWP}, \gamma, \sigma, f_p$ depend on the wind velocity and the fetch length and should be fitted to the wave data. The peak enhancement factor $\gamma$ can take values between 1 and 7, depending on the ocean region and on calm or stormy weather conditions. For $\gamma=1$ the Pierson-Moskowitz and JONSWAP spectra are the same. The most usual value is $\gamma=3.3$, but a lower value (between 1 and 2) is probably more appropriate for most ocean regions.
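The role of the peak enhancement factor can be seen by evaluating (B3) at and away from the peak frequency. All parameter values in the sketch below are placeholders for illustration, not fitted values; in particular $\sigma = 0.08$ and the reuse of $\alpha = 3.26$ are assumptions.

```python
import math

def pm_shape(f, f_p):
    """Dimensionless shape (f_p/f)^4 exp(-(f_p/f)^4) shared by (B2) and (B3)."""
    r4 = (f_p / f) ** 4
    return r4 * math.exp(-r4)

def jonswap_spectrum(f, energy, f_p, alpha, gamma, sigma):
    """Adapted JONSWAP energy density, Eq. (B3)."""
    delta = math.exp(-0.5 * ((f / f_p - 1.0) / sigma) ** 2)
    return alpha * energy / f_p * pm_shape(f, f_p) * gamma ** delta

# Placeholder values; in practice alpha, gamma, sigma and f_p are fitted to wave data
energy, f_p, alpha, sigma = 1.0, 0.1, 3.26, 0.08

peak_pm = jonswap_spectrum(f_p, energy, f_p, alpha, 1.0, sigma)    # gamma = 1: PM shape
peak_jwp = jonswap_spectrum(f_p, energy, f_p, alpha, 3.3, sigma)
enhancement_at_peak = peak_jwp / peak_pm          # equals gamma, since delta = 1 at f_p

far_jwp = jonswap_spectrum(2.0 * f_p, energy, f_p, alpha, 3.3, sigma)
far_pm = jonswap_spectrum(2.0 * f_p, energy, f_p, alpha, 1.0, sigma)
enhancement_far = far_jwp / far_pm                # close to 1 away from the peak
```

Because $\gamma^\delta \ge 1$ for $\gamma \ge 1$, the enhancement adds energy around the peak; keeping the integral (B1) equal to $\lt E\gt$ therefore requires a value of $\alpha_{JWP}$ that differs from $\alpha_{PM}$ when $\gamma \gt 1$.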

Different characteristic wave periods can be derived from the wave spectrum: the significant wave period $T_{01}$, the mean wave period $T_{02}$ and the mean energy period $T_E \equiv T_{m-1,0}$. They are given by the expressions

$T_{01} = \Large\frac{\int_0^{\infty} E(f)df}{\int_0^{\infty} E(f)fdf }\normalsize, \quad T_{02} = \Large \sqrt{\frac{\int_0^{\infty} E(f)df}{\int_0^{\infty} E(f) f^2 df }}\normalsize, \quad T_E \equiv T_{m-1,0} = \Large\frac{\int_0^{\infty} E(f) f^{-1} df}{\int_0^{\infty} E(f)df }\normalsize \; .\qquad (B4)$
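For a tabulated spectrum the integrals in (B4) reduce to spectral moments $m_n = \int_0^\infty E(f) f^n df$. The sketch below uses the trapezoidal rule; the function names and the toy spectrum (all energy in a single frequency bin at 0.2 Hz) are assumptions. For such a narrow spectrum all three periods should coincide with $1/f = 5$ s.

```python
def spectral_moment(freqs, density, order):
    """Trapezoidal estimate of m_n = int E(f) f^n df for a tabulated spectrum."""
    total = 0.0
    for k in range(len(freqs) - 1):
        left = density[k] * freqs[k] ** order
        right = density[k + 1] * freqs[k + 1] ** order
        total += 0.5 * (left + right) * (freqs[k + 1] - freqs[k])
    return total

def characteristic_periods(freqs, density):
    """T01, T02 and T_E = T_{m-1,0} according to Eq. (B4)."""
    m_neg1 = spectral_moment(freqs, density, -1)
    m0 = spectral_moment(freqs, density, 0)
    m1 = spectral_moment(freqs, density, 1)
    m2 = spectral_moment(freqs, density, 2)
    return {"T01": m0 / m1, "T02": (m0 / m2) ** 0.5, "TE": m_neg1 / m0}

freqs = [0.10, 0.15, 0.20, 0.25, 0.30]   # Hz; must be > 0 so that f**-1 is defined
density = [0.0, 0.0, 1.0, 0.0, 0.0]      # hypothetical single-bin spectrum
periods = characteristic_periods(freqs, density)
```

For broad spectra the higher moments converge slowly, so the result for $T_{02}$ in particular depends on the high-frequency cut-off of the table.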

The peak frequency $f_p=T_p^{-1}$ is related to the mean energy period $T_E$. Fitting the Pierson-Moskowitz distribution (B2) to field data yields $T_E/T_p \approx 0.85$; fitting the JONSWAP distribution (B3) yields $T_E/T_p \approx 0.9$. The ratio $T_E/T_p$ can also be derived directly from field data.

The periods defined by Eq. (B4) can be misleading when the frequency spectrum has multiple peaks corresponding to different kinds of waves, for example short-period sea waves, long-period swell waves and infragravity waves with the period of wave groups. In this case, the separate periods can be determined by integrating the spectrum over frequency bandwidths associated with the different kinds of waves. An analysis of hindcasted wave data for the US Atlantic and Pacific coasts yielded an overall value of $T_E/T_p = 0.81$ to $0.85$. Considering wind-wave-dominated and swell-dominated data separately, the resulting values were $T_E/T_p = 0.85$ to $0.88$ for wind waves and $T_E/T_p = 0.93$ to $0.97$ for swell waves.

In an irregular wave field, waves may come from different directions. The wave incidence direction is an important parameter for sediment transport in the coastal zone. Waves originating from different areas may have different spectra. The directional spread of incoming waves for a particular wave frequency can be represented by a distribution function $D(f,\theta)$, where $\theta$ is the wave incidence angle. We then have

$\lt E\gt \equiv \int_0^{\infty} \int_0^{2 \pi} S(f, \theta) df d \theta \equiv \int_0^{\infty} \int_0^{2 \pi} E(f) \, D(f, \theta) df d \theta \; ,\quad \int_0^{2 \pi} D(f, \theta) d \theta =1 \; .\qquad (B5)$

The directional wave spectrum $S(f, \theta)$ can be derived from directional wave buoys. In practice, it is often obtained by numerical modelling of the wave field in the major source area.
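The normalisation condition in (B5) can be illustrated with a simple spreading function. The cosine-squared form used below is an assumption chosen for illustration (the text does not prescribe a form for $D$), it is taken independent of frequency, and the mean direction is hypothetical.

```python
import math

def cos2_spreading(theta, theta_mean):
    """Hypothetical cosine-squared spreading D(theta), zero beyond 90 degrees from the mean."""
    # Angular distance to the mean direction, wrapped to the interval [-pi, pi)
    d = (theta - theta_mean + math.pi) % (2.0 * math.pi) - math.pi
    if abs(d) >= 0.5 * math.pi:
        return 0.0
    return 2.0 / math.pi * math.cos(d) ** 2

def directional_spectrum(e_f, theta, theta_mean):
    """S(f, theta) = E(f) * D(theta) for a frequency-independent spreading."""
    return e_f * cos2_spreading(theta, theta_mean)

# Midpoint-rule check that D integrates to 1 over the full circle
n = 3600
dtheta = 2.0 * math.pi / n
theta_mean = math.radians(270.0)
norm = dtheta * sum(cos2_spreading((k + 0.5) * dtheta, theta_mean) for k in range(n))
```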

## Appendix C: Distribution of nonlinear waves

Here we reproduce an expression for the probability density distribution $p(\zeta)$ of the normalized surface elevation $\zeta$ of nonlinear waves following the derivation by Fuhrman, Klahn and Zhai (2023).

In appendix A we considered a wave field consisting of sinusoidal waves. However, waves incident on the coast are typically skewed as a result of nonlinear interactions and hence cannot be considered statistically independent. So here we consider a wave field

$\zeta = \zeta_1 + \zeta_2 + \zeta_3 + \ldots , \qquad (C1)$

where the first order field $\zeta_1$ consists of densely distributed but uncorrelated waves as in (A1). These sinusoidal first order waves are modified by nonlinear interactions which generate higher harmonic waves $\zeta_2$ of second order ($O(\epsilon), \, \epsilon \ll 1$) and harmonic waves $\zeta_3, \zeta_4, \ldots$ of higher order ($O(\epsilon^k), k \ge 2$). The small parameter $\epsilon$ is a measure of the average wave steepness. The mean surface elevation $\lt \zeta\gt$ is taken equal to zero and the wave field $\zeta$ is normalized such that $\lt \zeta^2 \gt = 1$.

No theoretical analytical expression of $p(\zeta)$ exists for nonlinear waves in general, but Fuhrman et al. (2023) have shown that an exact expression can be obtained when terms of order $\epsilon^k, \; k \ge 2$ are ignored. This implies that the wave skewness $S \equiv \lt \zeta^3\gt$ is given by $S = 3 \lt \zeta_1^2 \zeta_2\gt$. Also to order $\epsilon$, expressions can be derived for the higher moments of $\zeta$:

$\lt \zeta^{2n}\gt = \Large\frac{(2n)!}{2^n n!}\normalsize \; , \quad \lt \zeta^{2n+1}\gt = \Large\frac{S}{6}\frac{(2n+1)!}{2^{n-1} (n-1)!}\normalsize . \qquad (C2)$
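A quick evaluation of (C2) for the lowest orders shows that the formulas are consistent with the normalisation $\lt \zeta^2\gt = 1$ and with the definition $S \equiv \lt \zeta^3\gt$. The sketch reads the numerator of the even moments as $(2n)!$, which gives the moments of a unit-variance Gaussian; the function names and the value $S=0.3$ are chosen for illustration.

```python
import math

def even_moment(n):
    """<zeta^(2n)> = (2n)! / (2^n n!), the moments of a unit-variance Gaussian."""
    return math.factorial(2 * n) / (2 ** n * math.factorial(n))

def odd_moment(n, skewness):
    """<zeta^(2n+1)> for n >= 1, to first order in the wave steepness."""
    return (skewness / 6.0 * math.factorial(2 * n + 1)
            / (2 ** (n - 1) * math.factorial(n - 1)))

S = 0.3
second = even_moment(1)      # variance, normalised to 1
fourth = even_moment(2)      # Gaussian kurtosis value
third = odd_moment(1, S)     # recovers the skewness S
fifth = odd_moment(2, S)     # 10 * S
```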

Now consider the moment generating function $M_{\zeta}(s) \equiv \lt \exp(\zeta s) \gt \equiv \int_{-\infty}^{\infty} p(\zeta) \exp(\zeta s) d\zeta$. Evaluated at imaginary argument, $M_{\zeta}(is)$ is the Fourier transform of $p(\zeta)$, so the probability density function $p(\zeta)$ can be found from the inverse Fourier transform

$p(\zeta) = \Large\frac{1}{2 \pi}\normalsize \int_{-\infty}^{\infty} \exp \big( \ln(M_{\zeta}(is)) - i \zeta s \big) ds . \qquad (C3)$

In order to evaluate this expression, $M_{\zeta}(s)$ is developed into a Taylor series, $M_{\zeta}(s) = \sum_{n=0}^{\infty} \Large\frac{s^n}{n!}\normalsize \lt \zeta^n\gt$. This series is substituted in the Taylor expansion $\ln(M_{\zeta}(s)) = - \sum_{n=1}^{\infty} \Large\frac{(-1)^n}{n}\normalsize (M_{\zeta}(s) -1)^n$. Ignoring all terms of order $\epsilon^2$ and higher and using the expressions (C2) gives the result $\; \ln(M_{\zeta}(s)) = \Large\frac{1}{2}\normalsize s^2 + \Large\frac{1}{6}\normalsize S s^3 .$ It is emphasized that the exact skewness $S \equiv \lt \zeta^3\gt$ is utilized here.

Substitution in Eq. (C3) yields an expression for the distribution $p(\zeta)$. Evaluating this integral gives

$p(\zeta) = \Big| \Large\frac{2}{S}\normalsize \Big|^{1/3} \exp \big( \Large\frac{1}{3 S^2}+\frac{\zeta}{S}\normalsize \big) Ai(\chi) , \qquad (C4)$

where $Ai(\chi)$ is the Airy function (Fig. C1), $\; Ai(\chi) = \Large\frac{1}{\pi}\normalsize \int_0^{\infty} \cos \big( \Large\frac{t^3}{3}\normalsize + \chi t \big) \, dt \;$ and $\; \chi = \Big| \Large\frac{2}{S}\normalsize \Big|^{1/3} \big( \Large\frac{1}{2S}\normalsize + \zeta \Large\frac{S}{|S|}\normalsize \big) .$

For $S \rightarrow 0$ the formula (C4) reduces to the Gaussian distribution $p(\zeta) = \Large\frac{1}{\sqrt{2 \pi}}\normalsize \exp(-\zeta^2 / 2) .$

Fig. C1. The Airy function.

Fig. C2. The probability density distributions Eq. (C3) for $S=0$ (blue) and $S=0.3$ (red).

The Airy function $Ai(x)$ oscillates for negative values of $x$ (Fig. C1). The probability density (C4) can therefore take negative values for large negative values of $\zeta$, which is physically unrealistic. A positive skewness $S$ corresponds to the usual situation of a peaked wave crest and shallow wave trough. Large negative $\zeta$ values (very deep wave troughs) are therefore highly improbable. The opposite holds for the unusual situation of negative skewness. Fig. C2 displays the probability density distributions for $S=0$ (the linear case) and $S=0.3$. The probability of high wave peaks is much higher in the nonlinear case than in the linear case.
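The distribution can also be evaluated without the Airy function, by integrating (C3) numerically. With $\ln(M_{\zeta}(is)) = -\frac{1}{2}s^2 - \frac{i}{6} S s^3$ the imaginary part of the integrand is odd in $s$, so that $p(\zeta) = \frac{1}{\pi}\int_0^{\infty} e^{-s^2/2} \cos\big(\frac{1}{6}S s^3 + \zeta s\big) ds$. The sketch below applies the trapezoidal rule to this integral; the truncation at $s=10$, the step sizes and the $\zeta$ range are assumptions, and the function name is hypothetical.

```python
import math

def nonlinear_pdf(zeta, skewness, s_max=10.0, n=2000):
    """p(zeta) = (1/pi) int_0^inf exp(-s^2/2) cos(S s^3/6 + zeta s) ds."""
    ds = s_max / n
    total = 0.5   # integrand equals 1 at s = 0; half weight in the trapezoidal rule
    for k in range(1, n):
        s = k * ds
        total += math.exp(-0.5 * s * s) * math.cos(skewness * s ** 3 / 6.0 + zeta * s)
    # The end point s = s_max is omitted: its contribution is of order exp(-50)
    return total * ds / math.pi

gauss_at_zero = nonlinear_pdf(0.0, 0.0)     # linear case, should equal 1/sqrt(2 pi)
linear_tail = nonlinear_pdf(3.0, 0.0)       # density at a high crest, S = 0
skewed_tail = nonlinear_pdf(3.0, 0.3)       # same elevation, S = 0.3

# Normalisation check for S = 0.3 on an assumed range -6 <= zeta <= 8
dz = 0.1
total_probability = dz * sum(nonlinear_pdf(-6.0 + k * dz, 0.3) for k in range(141))
```

Consistent with Fig. C2, the density at a high crest elevation such as $\zeta = 3$ is larger for $S=0.3$ than for the Gaussian case, while the integral over $\zeta$ remains close to 1.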

Comments of David Fuhrman and Mathias Klahn on an earlier draft of this appendix are acknowledged.