Identifying and interpreting extreme rainfall events using image classification

This study presents the first attempt to identify extreme rainfall events based on surrounding sea-level pressure anomalies, using neural network-based classification. Sensitivity analysis was also performed to identify the spatial importance of sea-level pressure anomalies. Three classification models were generated: the first classifies the patterns between extreme and regular rainfall events in the north west of England, the second classifies the patterns between extreme and regular rainfall events in the south east of England, and the third classifies between the patterns of extreme events in the north west and south east of England. All classifiers obtain accuracies between 60 and 65%, with precision and recall metrics showing that extreme events are easier to identify than regular events. Finally, a sensitivity analysis is performed to identify the spatial importance of the patterns across the North Atlantic, highlighting that for all three classifiers the local anomaly sea-level pressure patterns around the British Isles are key to determining the difference between extreme and regular rainfall events. In contrast, the pattern across the mid and western North Atlantic shows no contribution to the overall classifications.


INTRODUCTION
Flooding caused by extreme rainfall events can have severe social, environmental and economic consequences. Although variations in yearly flood trends have been studied extensively (e.g. Robson et al. 1998; Cox et al. 2002; Prosdocimi et al. 2019), such trends do not help in identifying the processes which lead to the extreme cases. For example, in February 2020 storms Ciara and Dennis passed over the UK, resulting in up to 177 mm of rainfall in a single 24-h period (Met Office 2020), and are estimated to have resulted in insured losses of up to £200 million (Finch 2020). To provide improved risk analysis and support flood management, the processes which cause these extreme rainfall events need to be better understood and differentiated from those of regular rainfall events.
The occurrence of extreme rainfall events in the UK has a strong dependence on the concurrent and prior meteorological conditions across north western Europe and the North Atlantic. For example, Brown (2018) shows the dependence of extreme daily rainfall in the UK on large-scale meteorological indices: North Atlantic Oscillation (NAO), Pacific Decadal Oscillation (PDO), El Niño-Southern Oscillation (ENSO) and Atlantic Multidecadal Oscillation (AMO). These indices represent the difference in either sea-level pressure (NAO, PDO) or sea-surface temperature (ENSO, AMO) across their specified regions. Brown found the biggest impact was made by the NAO (the difference in sea-level pressure between Iceland and the Azores): a positive NAO increases the likelihood of extra-tropical cyclones developing over the North Atlantic. This relationship is further demonstrated by Richardson et al. (2017) and Schillereff et al. (2019), both of which show that a negative NAO correlates strongly with increased high-river flows.
Extra-tropical cyclones are known to be the main contributors to extreme precipitation across the globe (Pfahl & Wernli 2012) and have also been linked to the development of atmospheric rivers (ARs) (Gimeno et al. 2020). ARs are long plumes of highly concentrated water vapour in the atmosphere which originate from the mid to lower latitudes, with those affecting the UK moving upwards towards North Western Europe from the Caribbean. Lavers et al. (2011) found ARs occurred during the 10 largest floods in the UK. Following this, Lavers & Villarini (2013) analysed the frequency and intensity of ARs with several climate change scenarios, concluding that both the intensity and frequency of the strongest ARs are expected to increase in the future. In contrast, Champion et al. (2015) found that less than 35% of winter and 15% of summer ARs are associated with an extreme rainfall event. This highlights the need to be able to determine the difference between atmospheric phenomena which cause extreme rainfall events and those which produce more moderate rainfall events of little or no interest in flood management.
Attempts to address this need have focussed on the identification of key meteorological patterns across the North Atlantic relating to extreme rainfall events derived from large-scale climate data. Neal et al. (2016) present 30 sea-level pressure anomaly (SLPA) patterns (MO-30) identified through the application of the k-means clustering algorithm (Lloyd 1957). These patterns represent the types of SLPA patterns which can be present over the North Atlantic on any day. The patterns were then combined subjectively into 8 SLPA patterns (MO-8), some of which are shown to strongly correlate with the NAO. Richardson et al. (2017) investigated the applicability of these cluster sets for identifying regional precipitation and drought climatology throughout the UK, finding that the smaller set of eight clusters do not aid in explaining precipitation variability. However, magnitude variation between patterns was observed, with some patterns producing consistently higher levels of median daily rainfall across all regions of the UK. However, this does not allow an easy distinction between extreme and regular rainfall SLPA patterns in the various regions. Ummenhofer et al. (2017) attempted to cluster SLPA patterns using the anomaly precipitation spatial variation across Europe and found a dipole in SLPA across the UK, which can determine whether precipitation anomalies in the UK will be positive in the north west or south east. This relates to the findings by Champion et al. (2019), who found that summer extreme rainfall events in the north west are typically associated with a positive SLPA region over the UK. However, no such relationship was found when analysing extremes in the south east.
The large-scale meteorological patterns being considered by the studies above are represented by images: 2D matrices with pixel colour representing the numerical value of the meteorological variable in question (e.g. SLPA). Neural networks have proved to be effective at image classification across various domains, from the classification of YouTube videos (Karpathy et al. 2014) to tumour feature extraction (Yang et al. 2019) but, up until now, they have not been used effectively to classify meteorological patterns, such as distinguishing between the SLPA patterns of extreme events. This study applies neural networks to identify the differences in SLPA patterns across the North Atlantic for extreme and regular rainfall events, applied to both the north west and south east of England. In particular, the study demonstrates the differences in SLPA between:
1. Daily extreme and regular rainfall events in the north west of England.
2. Daily extreme and regular rainfall events in the south east of England.
3. Daily extreme rainfall events in the north west and south east of England.
Following this, a sensitivity analysis was conducted to compare which regions of the North Atlantic are most important for distinguishing between the classes in each of the above cases.

Rainfall events
Extreme and regular daily rainfall series were extracted from the administrative regions of the north west and south east of England. The rainfall series used was the UK Centre for Ecology & Hydrology's CEH-GEAR data set (Tanguy et al. 2019). This data set combines data from the Met Office database of observed precipitation data (Met Office 2014) using a natural neighbour interpolation method to generate a regular grid of rainfall covering the British National Grid, with daily rainfall values from 1890 to 2017. The application of the natural neighbour interpolation allows grid cells to be generated based on the closest viable rainfall monitors. To extract representative rainfall for each region (north west and south east England), a regular grid of coordinates is generated for each region at 30 km intervals, bounded by the North West and South East administrative regions. Figure 1 shows the position of these points in both the North West (a) and the South East (b). For each day, the rainfall magnitude for a region is given by the total rainfall at each of the points identified. Any day on which the rainfall total is greater than a trace level (>0.5 mm at each location) is considered a rainfall day and is added to the non-trace rainfall series for the region. Following this, each non-trace event is standardised based on the mean and standard deviation of the month in which it occurs, using the following equation:

$$ z_{l,m} = \frac{r_{l,m} - \bar{r}_{l,m}}{\mathrm{std}(r_{l,m})} \qquad (1) $$

where $z_{l,m}$ is the standardised series, $r_{l,m}$ represents the non-trace rainfall series for region $l$ for all non-trace rainfall events in month $m$, $\bar{r}_{l,m}$ is the mean and $\mathrm{std}(r_{l,m})$ is the standard deviation of the $r_{l,m}$ series. Following standardisation, the magnitude of each event represents the deviation of that event's magnitude from the mean of its respective location and month.
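The monthly standardisation described above can be sketched in a few lines of Python. This is a minimal illustration only: the function name `standardise_by_month`, the `(month, total)` input layout and the rainfall values are hypothetical, and the use of the population standard deviation (`pstdev`) rather than the sample standard deviation is an assumption, since the text does not state which is used.

```python
from collections import defaultdict
from statistics import mean, pstdev

def standardise_by_month(events):
    """Standardise non-trace rainfall totals by the mean and standard
    deviation of the calendar month in which each event occurs.

    events: list of (month, rainfall_total) pairs for one region.
    Returns the standardised values in the same order.
    """
    by_month = defaultdict(list)
    for month, total in events:
        by_month[month].append(total)
    # Assumption: population standard deviation; a month with constant
    # totals would give a zero divisor and is not handled here.
    stats = {m: (mean(v), pstdev(v)) for m, v in by_month.items()}
    return [(total - stats[month][0]) / stats[month][1]
            for month, total in events]

# Hypothetical regional totals (mm) for January (1) and July (7).
events = [(1, 10.0), (1, 20.0), (1, 30.0), (7, 2.0), (7, 4.0), (7, 6.0)]
z = standardise_by_month(events)
```

Because each month is standardised separately, a 6 mm day in a dry month and a 30 mm day in a wet month can receive the same standardised magnitude, which is the property the subsequent event selection relies on.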
Selecting extreme rainfall days from a rainfall series can be achieved by taking the maximum rainfall day in each time period (e.g. annual maximum rainfall); alternatively, a threshold can be applied such that all days with a daily rainfall quantity greater than the threshold are selected. In this study, the latter approach is used, as maxima approaches require a rainfall day to be selected in each time period, which can lead to the selection of rainfall days which, in the context of the rainfall series, are not extreme events, such as in a particularly dry year. Selecting the events with the highest 10% of standard deviations from the mean provides a set of extreme rainfall events. The 10% of events closest to 0 can then be chosen as the regular rainfall data set, as these represent the expected or mean amounts of rainfall in a given month for a given location. This extraction resulted in 3,008 individual events (1,504 extreme and 1,504 regular rainfall events) in the North West and 2,290 events (1,145 extreme and 1,145 regular events) in the South East. The disparity here is due to there being fewer non-trace rainfall days in the South East, with the North West having 15,046 non-trace rainfall days and the South East having only 11,450.
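The threshold selection above can be illustrated with a short sketch. The function name `select_events` and the example values are hypothetical, and rounding the 10% count down to a whole number of events is an assumption, although it is consistent with the reported counts (15,046 non-trace days giving 1,504 events and 11,450 giving 1,145).

```python
def select_events(z_scores, percent=10):
    """Return index lists for extreme and regular events.

    Extreme events: the `percent` of days with the highest standardised rainfall.
    Regular events: the `percent` of days with standardised rainfall closest to 0.
    """
    n = len(z_scores) * percent // 100  # assumption: round down
    indices = range(len(z_scores))
    by_magnitude = sorted(indices, key=lambda i: z_scores[i], reverse=True)
    by_closeness = sorted(indices, key=lambda i: abs(z_scores[i]))
    return by_magnitude[:n], by_closeness[:n]

# Hypothetical standardised series for ten non-trace days, selecting 20%.
z_scores = [-1.5, -0.2, 0.05, 0.1, 0.4, 0.9, 1.3, 2.8, -0.6, 3.1]
extreme, regular = select_events(z_scores, percent=20)

# Event counts implied by the reported numbers of non-trace days.
north_west_count = 15046 * 10 // 100
south_east_count = 11450 * 10 // 100
```

Both classes have the same number of events by construction, so each regional classifier is trained on a balanced data set.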

Meteorological patterns
To classify each event, the SLPA pattern is required across the North Atlantic. For each of the regular and extreme rainfall days for both the north west and south east regions, SLPA patterns are extracted across the North Atlantic. The patterns are extracted from the 2.5° gridded NCEP/NCAR Reanalysis 1 data set (Kalnay et al. 1996), with each pattern bounded between latitudes 15° and 70° and longitudes −80° and 15°.
Each of these patterns is then standardised by the monthly mean and standard deviation patterns of the given variable across the bounded region, such that:

$$ \hat{P}_{i,m} = \frac{P_{i,m} - \bar{P}_m}{\mathrm{std}(P_m)} \qquad (2) $$

where $\hat{P}_{i,m}$ is the standardised pattern and $P_{i,m}$ represents the 2D pattern for event $i$ in month $m$, which is standardised through element-wise operations using $\bar{P}_m$ and $\mathrm{std}(P_m)$. The monthly mean $\bar{P}_m$ and standard deviation $\mathrm{std}(P_m)$ are defined using all daily patterns in month $m$, each resulting in a 2D matrix with each cell representing a single pixel across the North Atlantic.
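The element-wise character of this standardisation can be made concrete with a small sketch. The function name `standardise_patterns`, the 2 × 2 grid size and the pressure values are hypothetical, and the population standard deviation is again an assumption.

```python
from statistics import mean, pstdev

def standardise_patterns(patterns):
    """Standardise each 2D pattern cell by cell, using the mean and standard
    deviation of that cell over all patterns supplied for one month."""
    rows, cols = len(patterns[0]), len(patterns[0][0])
    mean_grid = [[mean([p[r][c] for p in patterns]) for c in range(cols)]
                 for r in range(rows)]
    # Assumption: population standard deviation per cell.
    std_grid = [[pstdev([p[r][c] for p in patterns]) for c in range(cols)]
                for r in range(rows)]
    return [[[(p[r][c] - mean_grid[r][c]) / std_grid[r][c]
              for c in range(cols)] for r in range(rows)]
            for p in patterns]

# Two hypothetical 2 x 2 sea-level pressure patterns (hPa) from the same month.
p1 = [[1000.0, 1010.0], [1005.0, 1020.0]]
p2 = [[1010.0, 1030.0], [1015.0, 1000.0]]
standardised = standardise_patterns([p1, p2])
```

Each cell has its own mean and standard deviation, so the result is an anomaly pattern in which a value of 1 means one standard deviation above the monthly mean at that location.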

CLASSIFICATION
To classify the rainfall events, a neural network-based classification method is used. The classifiers are trained using the SLPA patterns for each event over the North Atlantic. Three classifiers are required: the first classifies the north western extreme and regular event patterns, the second classifies the south eastern extreme and regular patterns, and the third classifies the extreme event patterns from both the north west and the south east. The exclusion of a fourth model distinguishing between north west and south east regular events is intentional, as the focus of this paper is on the identification and comparison of extreme events. Table 1 describes each of these classifiers (M NW , M SE and M comp ) and indicates which data sets are used in each model and which class they represent. This section introduces the neural network classification method and the optimisation procedure, including how the data is split for training.

Neural network classification
A neural network consists of at least two layers of nodes connected through edges. Figure 2 shows an example architecture with three layers: an input layer, a hidden layer and an output layer. Assigned to each edge in the graph is a weight, indicated by $w$, which is optimised through training. When propagating an input vector, the input nodes in the neural network take on the values of their relevant features; in the case of Figure 2, there are only two input nodes and hence two features in the input vector. Following this, the value propagated out of each edge can be calculated by multiplying the value of the input node by the weight assigned to the edge. In the case of edge 1, the value can be calculated as follows:

$$ e_1 = x_1 w_1 \qquad (3) $$

where $x_1$ is the value of the input node at which edge 1 starts, $w_1$ is the weight assigned to edge 1 and $e_1$ is the value propagated along the edge. After calculating the values of each edge, the next layer (the hidden layer in the case of Figure 2) sums the values of the edges which link to it. Following this, an activation function is applied, which in this study is the rectified linear unit (ReLU) function (Equation (4)), outputting the maximum of 0 and the sum of the edge values:

$$ \mathrm{ReLU}(x) = \max(0, x) \qquad (4) $$

ReLU is chosen as it both increases the speed of convergence in neural networks and can help improve accuracy, as shown in text-based classification (Cui et al. 2017). The value at C can then be calculated as follows:

$$ C = \mathrm{ReLU}\Big(\sum_{j} e_j\Big) $$

where the sum runs over the edges $e_j$ which link to node C.
Table 1 note: the numbers indicate the class of the given data set in the given model.
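The propagation from the input layer to the hidden layer can be sketched as follows. The weights and input values are hypothetical, the function names are illustrative, and no bias terms are included because none are described in the text.

```python
def relu(x):
    """Rectified linear unit: the maximum of 0 and x."""
    return max(0.0, x)

def hidden_layer(inputs, weights):
    """Propagate an input vector to a hidden layer.

    weights[j][i] is the weight on the edge from input node i to hidden node j.
    Each hidden node sums its incoming edge values and applies ReLU.
    """
    return [relu(sum(w * x for w, x in zip(row, inputs))) for row in weights]

# Two input features and two hidden nodes, as in the example architecture.
inputs = [0.5, -1.0]
weights = [[0.8, 0.2],    # edges into the first hidden node
           [-0.4, 0.6]]   # edges into the second hidden node
hidden = hidden_layer(inputs, weights)
```

Here the first hidden node receives 0.8 × 0.5 + 0.2 × (−1.0) = 0.2 and passes it on unchanged, while the second receives −0.8 and is clipped to 0 by ReLU.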

Uncorrected Proof
A similar calculation can be done for node D, and then the values at nodes C and D can be propagated forwards in the same manner to nodes E and F along their respective edges. In Figure 2, two output nodes are given, indicating a binary classification: a choice of two class nodes, either E or F. For classifiers M NW and M SE , the output nodes represent whether the input pattern is an extreme event or a regular event, and in the case of M comp , they indicate whether the pattern is a south eastern or north western extreme. To make the process of selection easier, a softmax function (Equation (5)) is applied to the output nodes such that the output from node E is

$$ \mathrm{softmax}(E) = \frac{\exp(E)}{\exp(E) + \exp(F)} \qquad (5) $$

where $E$ is the output from node E, $F$ is the output from node F and $\exp(\ldots)$ is the exponential function. Softmax forces each output to be between 0 and 1, giving the probability of the given pattern being in each of classes E and F.
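A minimal sketch of the softmax step for two output nodes is given below. The output values are hypothetical, and subtracting the maximum before exponentiating is a common numerical safeguard added here as an assumption; it does not change the result.

```python
import math

def softmax(values):
    """Convert raw output-node values into probabilities that sum to 1."""
    shift = max(values)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw outputs from nodes E and F.
p_e, p_f = softmax([1.2, 0.3])
predicted = "E" if p_e > p_f else "F"
```

The class with the larger raw output always receives the larger probability, so the predicted class is simply the output node with the highest softmax value.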

Classifier training
Each of the three models introduced at the beginning of this section (M NW , M SE and M comp ) requires different data sets, and hence three data sets are produced, each consisting of a labelled set of input vectors. Each data set is represented by a matrix of n rows and m columns, where n is determined by the number of events and m is the number of cells (or pixels) in a given pattern. The row which represents a given pattern is generated by flattening its matrix, that is, by concatenating the rows of the matrix into a single vector. Through this operation, each 22 × 38 cell pattern is converted to a single vector of size 836. Each set of events is then split into two subsets: a training set and a test set. The training set consists of 80% of the input vectors and is used to train the neural network; the remaining 20% is the testing data set and is used to validate the accuracy of the network on unseen data. The selection of events as belonging to either the training or testing data set is randomised. Training the network involves the optimisation of the weights between the nodes in the network. This optimisation is done using stochastic gradient descent (Bottou 1998) and the backpropagation algorithm (Rumelhart et al. 1986). Backpropagation is a way to propagate the error calculated at the output nodes back through the weights of a neural network (Laung & Haykin 1991). To do this, an objective function is required, and in this study, the cross-entropy loss function (Goodfellow et al. 2016) is used to determine the error between the output nodes and the intended classification label (extreme or regular). The cross-entropy error will take the log difference between the true label and its calculated probability, resulting in the two output nodes containing the probability of each class being correct.
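The flattening and the randomised 80/20 split can be sketched as below. The function names, the fixed random seed and the placeholder patterns and labels are assumptions made for illustration; the text does not specify how the randomisation was implemented.

```python
import random

def flatten(pattern):
    """Concatenate the rows of a 2D pattern into a single vector."""
    return [cell for row in pattern for cell in row]

def split_train_test(vectors, labels, train_percent=80, seed=0):
    """Randomly assign labelled input vectors to training and testing subsets."""
    indices = list(range(len(vectors)))
    random.Random(seed).shuffle(indices)  # assumption: seeded for repeatability
    cut = len(indices) * train_percent // 100
    train = [(vectors[i], labels[i]) for i in indices[:cut]]
    test = [(vectors[i], labels[i]) for i in indices[cut:]]
    return train, test

# Ten placeholder 22 x 38 patterns with alternating hypothetical class labels.
pattern = [[0.0] * 38 for _ in range(22)]
vectors = [flatten(pattern) for _ in range(10)]
labels = [i % 2 for i in range(10)]
train, test = split_train_test(vectors, labels)
```

Flattening discards the explicit 2D layout, so the network sees each cell as an independent input feature; the spatial position of a feature is recovered only when the contributions are mapped back onto the grid.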
To further provide refinements to the final model, we trial models with a varying number of hidden nodes. In the example given in Figure 2, only two hidden nodes are used; however, in the SLPA classifiers, the number of hidden nodes is trialled from 10 to 100 in increments of 10. This will enable the selection of a suitable number of hidden nodes for the final representative model.

RESULTS
The accuracy and interpretation of the three SLPA classifiers are given in this section. The first part of this section discusses the accuracy of each classifier, and the second part discusses the sensitivity of the classifiers to regions of the North Atlantic's SLPA.

North west classifier (model M NW )
The results presented in Figure 3(c) show that the testing accuracies during training plateau at approximately 60%; in contrast, the training accuracies continue to increase close to 100% for classifiers with 90 and 100 hidden nodes. The classifier with 30 hidden nodes ends the 100 epochs of training with the highest testing accuracy of 62% and hence is selected for further investigation.
Investigating the M NW,30 classifier, precision differs between the extreme and regular event classifications, at 68 and 55%, respectively. This indicates that the classifier produces fewer false positives when labelling events as extreme than when labelling them as regular. However, the recall scores of each classification, which are 60% for extremes and 63% for regular events, show that the classifier recovers a similar proportion of the true extreme and true regular events.
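For reference, precision and recall for one class can be computed from true and predicted labels as in the sketch below. The function name and the label lists are hypothetical and are not taken from the study's results.

```python
def precision_recall(true_labels, predicted_labels, positive):
    """Precision and recall for the class named by `positive`."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp)  # share of predicted positives that are correct
    recall = tp / (tp + fn)     # share of true positives that are recovered
    return precision, recall

# Hypothetical labels for eight test events (E = extreme, R = regular).
true_labels = ["E", "E", "E", "E", "R", "R", "R", "R"]
predicted = ["E", "E", "R", "R", "E", "R", "R", "R"]
precision_e, recall_e = precision_recall(true_labels, predicted, "E")
precision_r, recall_r = precision_recall(true_labels, predicted, "R")
```

In this toy example, two of the three events predicted as extreme are truly extreme (precision 2/3), while only two of the four true extremes are recovered (recall 1/2), which shows how the two metrics can diverge for the same class.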

South east classifier (model M SE )
As with the north west classifier, a plateau occurs when attempting to train a neural network to classify between south eastern extreme and regular rainfall event SLPA patterns across the North Atlantic. As presented in Figure 4, the training accuracy increases as the training error of these classifiers decreases; however, as found in the training for M NW , the testing accuracy remains consistent at around 60%.
For M SE , the optimal classifier contains 10 hidden nodes (M SE,10 ). This classifier finishes the 100 epochs with a 60% testing accuracy and 84% training accuracy. Breaking this into the precision and recall values, a similar trend to M NW,30 is again seen, with precision values for regular and extreme events being similar at 55 and 59%, respectively. Further to this, the recall values also show a similar trend, with the extreme event patterns having a higher recall than the regular event patterns (61 and 53%, respectively).

Extreme event classifier
The final classifier (M comp ) looked to distinguish between extreme event patterns for the north west and the south east. As with M NW and M SE , a plateau of testing accuracy occurs, as shown in Figure 5, but with a slight variability depending on the number of hidden nodes used. The spread of training accuracies and training errors is also marginally lower than those presented in the previous classifiers.

Despite this, the most accurate classifier has 10 hidden nodes (M comp,10 ) and gives a testing accuracy of 65% and a training accuracy of 81%. In contrast to the M NW,30 and M SE,10 models, the precision and recall values of identifying south eastern extremes (66 and 67%) are both higher than those of identifying north western extremes (55 and 54%). This is counter-intuitive as the model was trained using 31% more examples of north western extremes, indicating that the model would have more experience classifying these types of extremes.

Spatial sensitivity
To identify the regions of interest to each classifier, a saliency map is created, representing the relative contribution of each cell (input feature) to the overall classification. To calculate these contributions, the backpropagation algorithm is applied to a baseline image, which is a pattern consisting only of zeros, together with a given classification, for example, an extreme event in M NW . The error generated by the network is then propagated back through the network; the weights of edges will show a stronger difference if they are important to the given classification, whereas those with little relevance will not change by a comparatively large amount (Simonyan et al. 2014). When the errors reach the input nodes, they can be rank-ordered to identify which pixels contribute the most to the given classification. In this study, this is achieved by normalising the calculated contribution values between 0 and 1.
Figure 6 shows the spatial contribution patterns for both M NW,30 (left) and M SE,10 (right). Both maps show little contribution from cells in the mid and western regions of the North Atlantic; however, a strong contribution is present closer to the British Isles. M NW,30 presents higher levels of contribution from both the Irish and North Seas, whereas M SE,10 presents relatively weak contributions from these regions but a higher level of contribution from the coast of Brittany in north western France. Ummenhofer et al. (2017) show the difference in SLPA across the North Atlantic for various precipitation anomaly patterns; the key difference in the patterns presented is the SLPA just west and south west of the British Isles (a positive SLPA leads to negative precipitation anomalies in the north west and a negative SLPA indicates positive precipitation anomalies). Similarly, M SE,10 shows interest in the south west of the UK, which continues to match the findings of Ummenhofer et al. (2017).
Next, the saliency map for M comp,10 , which distinguishes between extreme events in the north west and extreme events in the south east, is shown in Figure 7 and presents a high level of contribution across the North of England, the Irish Sea and the coast of Brittany. This reinforces the case that local meteorological conditions create the difference not only between extreme and regular rainfall events but also between north west and south east extreme events. Further to this, both Figures 6 and 7 indicate that, on the day of occurrence, the conditions across the rest of the North Atlantic do not contribute to the resulting classifications. This raises the question of how these contributions could change if the classifiers were trained using patterns of the days prior to an event, as it is known that certain prior meteorological conditions are common to some extreme rainfall events (Allan et al. 2019).
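The idea of a gradient-based saliency map can be illustrated on a very small network. The sketch below is a simplification and not the study's implementation: it takes the gradient of a single output score (rather than the classification error) with respect to each input, it includes hidden-node bias terms as an assumption so that some hidden nodes are active for an all-zero baseline, and all weights and names are hypothetical.

```python
def saliency(inputs, w_hidden, b_hidden, w_out):
    """Absolute gradient of one output score with respect to each input
    feature, rescaled to the range 0 to 1.

    w_hidden[j][i]: weight from input i to hidden node j (ReLU activation).
    b_hidden[j]:    bias of hidden node j (an assumption of this sketch).
    w_out[j]:       weight from hidden node j to the chosen output node.
    """
    pre = [sum(w * x for w, x in zip(row, inputs)) + b
           for row, b in zip(w_hidden, b_hidden)]
    # The derivative of ReLU is 1 where the node is active and 0 elsewhere.
    active = [1.0 if p > 0 else 0.0 for p in pre]
    grads = [abs(sum(w_out[j] * active[j] * w_hidden[j][i]
                     for j in range(len(w_hidden))))
             for i in range(len(inputs))]
    lo, hi = min(grads), max(grads)
    if hi == lo:  # all features contribute equally; avoid division by zero
        return [0.0 for _ in grads]
    return [(g - lo) / (hi - lo) for g in grads]

# All-zero baseline with three input cells and two hidden nodes.
baseline = [0.0, 0.0, 0.0]
w_hidden = [[0.5, -0.1, 0.0], [0.3, 0.2, 0.0]]
b_hidden = [0.1, -0.2]
w_out = [1.0, 2.0]
contributions = saliency(baseline, w_hidden, b_hidden, w_out)
```

In this example only the first hidden node is active at the baseline, so the contributions follow the magnitude of its incoming weights: the first cell dominates, the second contributes weakly and the third not at all. Reshaping such a vector back to the 22 × 38 grid would give a map comparable in form to those described here.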

Limitations
The neural network-based approach used in this study relies heavily on the data used to train and test the models. Because the neural networks were trained to differentiate between extreme and regular rainfall events, the training process is sensitive to how extreme events were selected. In this study, a threshold method is used, which selects the top 10% of standardised rainfall days in each region to represent extreme events. The reasons for this selection are outlined in Section 2.1; however, the use of an alternative threshold (e.g. 5%) or a maxima-based method would result in a set of models which vary greatly from the models presented here.
Furthermore, the method used for calculating the representative daily rainfall total for each region and day is the result of a trade-off between computational time and accuracy. As presented in Section 2.1, the representative sample for each day is collated from a regular grid of points at 30 km intervals. The CEH-GEAR data set is provided at 1 km intervals; however, increasing the resolution of the regular grid would substantially increase the computational time required to calculate the average daily regional rainfall total. This trade-off was considered reasonable, as the present study only uses the magnitudes to determine the extreme and regular events within each region. If, however, we were to compare the magnitudes between regions, then a different approach using higher resolution representations would be necessary.
Finally, the sensitivity analyses shown in Section 4.2 highlight some interesting disparities between regular and extreme events in both regions. However, the interpretability and reliability of the sensitivity analysis are tied to how effective the optimisation of the neural network's parameters was during training. Hence, with further fine-tuning of the network's parameters, the resulting sensitivity analyses should reveal clearer disparities and offer further insight into the differences.

CONCLUSIONS
Neural network-based image classification was used to identify the concurrent SLPA differences across the North Atlantic between the following types of daily rainfall events for two homogeneous rainfall regions in the UK:
1. Extreme and regular rainfall events in the north west of England.
2. Extreme and regular rainfall events in the south east of England.
3. Extreme rainfall events in the north west and south east of England.
Through the generation and optimisation of several neural network classifiers to represent each of the above scenarios, the following conclusions can be drawn:
1. The differences in SLPA for extreme and regular rainfall events in both the north west and south east of England are close enough to make numerical classification difficult.
   a. Classifying between north western extremes and regular rainfall events has an accuracy 2 percentage points higher than classifying between south eastern extremes and regular events (62 and 60%, respectively).
   b. In both regional classifiers, the precision and recall of extreme events are higher than those of the regular rainfall patterns, indicating that extreme event SLPA patterns are more defined than regular events.
   c. In determining the differences between extreme events in the north west and those in the south east, both recall and precision are higher for south eastern extremes, indicating that the patterns relating to these events are more defined than those in the north west.
2. Saliency maps have been used to identify the spatial regions of SLPA which contribute to the classifications.
   a. The local SLPA patterns across the British Isles are key to determining the difference between extreme and regular rainfall events in both the north west and the south east.
   b. The mid and western North Atlantic, however, has been shown not to provide any substantial contribution to any of the classifications developed in this study.
Finally, the patterns presented in the sensitivity analysis indicate the potential for this method to be used to identify the spatial importance of meteorological variables in the days prior to extreme or regular events. This will help meteorologists to target the regions of importance without the need to incorporate global data, reducing computational requirements. Furthermore, applying this method to other meteorological variables such as precipitable water, sea-surface temperature and geopotential height may enable further inference to be gained on why extreme events are different.