## ABSTRACT

A new model, the super-resolution Wasserstein Generative Adversarial Network with Gradient Penalty (SRWgan-GP), is developed with an output resolution of 512 × 512 to reconstruct sliced 2D high-resolution flow fields from low-resolution data. The model is trained on flow field data obtained from Large Eddy Simulation (LES) of the flow behind trash racks. A sub-pixel convolution layer is incorporated in the framework to generate higher-resolution feature maps (512 × 512), which significantly reduces the network's memory requirements for the same output resolution. The performance of the proposed model is compared with that of other commonly used models, including the U-shaped architecture model (Unet) and the Convolutional Neural Network (CNN). The results reveal that the SRWgan-GP model excels in reconstructing the flow field along both the x and y axes, achieving the lowest error (MSE of 0.001, PSNR of 46.557, and SSIM of 0.994) in depicting turbulent structures and the Kármán vortex street. Power Spectral Density (PSD) analysis shows that the primary shedding frequency of the vortex street predicted by SRWgan-GP is consistent with LES at approximately 10 Hz. Additionally, SRWgan-GP computes second-order statistics of the flow field accurately, achieving minimal error in the instantaneous Reynolds shear stresses.

## HIGHLIGHTS

A new super-resolution model, SRWgan-GP, is developed to reconstruct the flow field structure behind the trash rack.

SRWgan-GP is capable of accurately reconstructing the flow field behind the trash rack, and its reconstruction accuracy is higher than that of CNN and UNet models.

SRWgan-GP can accurately predict the shedding frequency of the Kármán vortex street and the distribution of second-order quantities.

## INTRODUCTION

Trash racks play crucial roles in ensuring the safe operation of hydroelectric power stations (Nguyen & Naudascher 1991; Naudascher & Wang 1993) and protecting fish ecosystems (Raynal *et al.* 2013; Lučin *et al.* 2020). These devices are primarily designed to intercept large debris that might potentially harm the turbine components, while also being strategically designed to prevent fish from inadvertently entering the turbine units. However, the dynamic water flow loads that the trash racks are exposed to make them susceptible to fatigue failure. This is particularly evident when the frequency of the vortex street matches the natural frequency of the trash racks, leading to vortex-induced vibrations (Alazwari *et al.* 2022; Liu *et al.* 2022). Such vibrations can compromise the stability of the trash racks or even result in their damage. Therefore, it is essential to have a comprehensive understanding of the flow field characteristics downstream of the trash racks that are helpful in their design and maintenance.

Trash racks are commonly simplified as cylinders when analyzing the flow field downstream of their trailing edges. Currently, high-resolution wake flow fields are primarily obtained using computational fluid dynamics (CFD) methods (Hariri Asli *et al.* 2023), such as RANS models (Wu *et al.* 2022), large eddy simulation (LES) (Wu *et al.* 2020), and direct numerical simulation (DNS) (Trias *et al.* 2015). While CFD has been successful in many aspects of hydrodynamic analysis and design, obtaining a high-resolution flow field with a large number of grid points requires abundant computational resources. Given the substantial investment required for both experiments and numerical simulations, a rapid and efficient approach to deriving high-resolution flow fields and turbulence characteristics is crucial. The integration of artificial intelligence and big data methods (Guo *et al.* 2023) into fluid numerical computation, along with further exploration of the physical laws governing fluid motion, has developed rapidly (Kutz 2017; Kim *et al.* 2021).

Deep learning, an important branch of machine learning, has been extensively utilized in recent years in the field of fluid mechanics (Wang *et al.* 2018; Lee & You 2019; Raissi *et al.* 2019). Besides flow control (Zhou *et al.* 2020), deep learning can also be applied to solve the Navier–Stokes equations. Deep learning models utilized in flow control can optimize multiple parameters to achieve desired target characteristics while generating new complex turbulent motions. This advancement aids in enhancing the understanding of turbulent physical processes, which proves to be a challenging task with traditional control techniques. When solving the Navier–Stokes equations, CFD methods predominantly rely on modeling turbulent physics. On the other hand, deep learning decouples the Navier–Stokes equations primarily through the utilization of neural networks and the incorporation of residuals into the original loss function. This approach leads to the development of physics-informed neural networks (PINN) with fast convergence and high accuracy (Pang *et al.* 2019; Krishnapriyan *et al.* 2021). Kharazmi *et al.* (2021) encoded the Navier–Stokes equations using PINN and coupled them with the structural dynamic equations to derive the structural parameters, velocity fields, and dynamic structural motions. Ling *et al.* (2016) proposed a tensor-based neural network model that incorporated Galilean invariants of the Reynolds stress mean equation. This approach resulted in computational results that were superior to those obtained using nonlinear eddy viscosity models. Kim & Lee (2020) utilized generative adversarial networks and recurrent neural networks to achieve long-term sequence prediction of turbulence development. Their predicted results showed consistent spatiotemporal correlations with direct numerical simulation (DNS).

Besides the above-mentioned research, flow field results can be affected by issues such as noise, low resolution due to defects in measuring equipment, instability of observation conditions in experiments, or insufficient grid resolution in CFD simulations. Recently, deriving high-resolution (HR) flow fields from low-resolution (LR) flow fields, commonly referred to as super-resolution (SR) reconstruction, has become increasingly popular. Dong *et al.* (2014) first used deep learning models to reconstruct HR images with higher efficiency than the conventional bicubic interpolation method. Ribeiro *et al.* (2020) and Zhang *et al.* (2023) both utilized U-shaped convolutional networks to reconstruct time-averaged flow fields behind a cylinder. However, deep learning models face a significant challenge when reconstructing irregular turbulent flows, which are more complex than regular time-averaged flow fields. Kim *et al.* (2021) employed an unsupervised model called Cycle-GAN to generate high-resolution turbulent flows from different filtered scales. Moreover, they validated the coherence of the reconstructed flow by comparing high-order turbulence indicators and turbulent characteristics. Zhou *et al.* (2022) introduced a novel SR model that successfully reconstructed three-dimensional high-resolution turbulent fields from low-resolution flow field data.

Many previous studies have successfully utilized deep learning models to reconstruct turbulent flow fields. However, these models often have low input and output resolutions, typically 64 × 64 or 128 × 128. Kim *et al.* (2021) investigated the impact of different input resolutions on the reconstruction of flow fields. Their findings revealed that higher input resolutions resulted in better reconstruction of turbulent flow fields across various models. Zhou *et al.* (2024) successfully reconstructed the wake flow field structure of a cylinder using LGF-CNN with an output resolution of 55 × 37. Although a lower output resolution can enhance computational efficiency, it also introduces significant interpolation errors. In simulations of sliced 2D flow fields using LES or DNS, tens or even hundreds of thousands of grid nodes are necessary. Interpolating such a vast number of grid nodes into 128 × 128 slices of turbulent data may lead to the loss of important details, which ultimately alters the original characteristics of the flow field. Hence, it is essential to employ higher-resolution images in deep learning models.

Considering the high memory demand of deriving high-resolution flow fields, the current investigation incorporates a sub-pixel convolution layer in the computational framework, which is capable of generating detailed feature maps at a resolution of 512 × 512 without requiring a significant increase in memory. This layer learns filters that map features to higher resolutions, thus avoiding direct convolution and deconvolution computations in HR space, which significantly alleviates the GPU burden. Additionally, the proposed approach utilizes the Wasserstein distance as a loss function and incorporates a gradient penalty. Combining these techniques, the SRWgan-GP neural network framework successfully reconstructs the HR (512 × 512) turbulent flow field of the trash rack.

## METHODS

### Model input

The spacing between adjacent trash rack bars is 0.16 m. Given that the trash rack comprises several bars, the current simulations select three adjacent bars to reflect actual engineering conditions, thereby considering the influence of the outer bars on the central one. Thus, the flow field results obtained are solely for this middle bar. The bar width is *B* = 0.16 m and the thickness is *D* = 0.022 m. To reduce the number of grid cells, the height of the computational domain *H* is taken as 0.022 m.

The inlet velocity in the *x*-direction is set at 1 m/s, consistent with the velocity observed in actual engineering projects. The outlet is set as a pressure boundary, maintaining a pressure value of 0 Pa. Periodic boundary conditions are employed on the sides to accurately simulate the flow conditions of the original trash rack. The top and bottom boundaries are treated with symmetric conditions.

As shown in Figure 1(b), the original flow field is obtained from the LES simulations, and a 2D slice is then extracted at the height *Z* = 0.5 *H*. Two probes for validating the accuracy of the time series results, namely J1 and J2, are positioned at *x*/*d* = 1 and *x*/*d* = 4 on the central axis behind the middle trash rack bar. The velocity fields in the *x*- and *y*-directions within the reconstruction region behind the trash rack bars are interpolated to create an HR flow field. This HR flow field is the target for the deep learning model's reconstruction. Additionally, the flow field is interpolated to an LR 16 × 16 domain to eliminate the original flow field characteristics. This LR flow field serves as the input for the deep learning model. The main objective of this paper is to utilize the spatial information from the instantaneous LR flow field to generate the corresponding HR flow field, achieving the mapping from LR data to HR data, denoted as *HR*(*t*) = *F*(*LR*(*t*), *θ*), where *θ* represents the model parameters.
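The construction of an LR/HR training pair can be sketched as follows. This is a minimal illustration only: it assumes block averaging as a stand-in for the interpolation step, which the text does not specify, and the function name `block_average` and the toy 8 × 8 field are hypothetical.

```python
def block_average(field, factor):
    """Downsample a 2D field (list of equal-length rows) by averaging
    non-overlapping factor x factor blocks.

    Assumption: block averaging stands in for the paper's unspecified
    interpolation from the HR grid to the LR grid.
    """
    rows, cols = len(field), len(field[0])
    if rows % factor or cols % factor:
        raise ValueError("field size must be divisible by factor")
    lr = []
    for bi in range(0, rows, factor):
        lr_row = []
        for bj in range(0, cols, factor):
            block_sum = sum(
                field[i][j]
                for i in range(bi, bi + factor)
                for j in range(bj, bj + factor)
            )
            lr_row.append(block_sum / (factor * factor))
        lr.append(lr_row)
    return lr


# Hypothetical 8 x 8 "HR" snapshot standing in for a 512 x 512 slice.
hr = [[float(i * 8 + j) for j in range(8)] for i in range(8)]

# 8 x 8 -> 2 x 2 here; the case in the text is 512 -> 16, i.e. factor 32.
lr = block_average(hr, factor=4)
scale_in_text = 512 // 16
```

Each (`lr`, `hr`) pair then plays the role of (*LR*(*t*), *HR*(*t*)) in the mapping *HR*(*t*) = *F*(*LR*(*t*), *θ*).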

In the neural network training process, 400 flow field datasets are used for training, 100 for validation, and 500 for testing. Each dataset has a time interval of 0.0001 s, resulting in a total test data duration of 0.5 s.

### Model validation

In the grid independence study, three grid configurations were selected with 2.1, 5.3, and 8 million computational nodes, respectively. The height of the first grid layer is 4 × 10^{−3}, with a *y*^{+} value under 5, and the time step is set at 1 × 10^{−3}.

Table 1 lists the results derived from the different grids. Taking the results from the finest grid as the reference, the Strouhal number (St) changes by 5.6% from the coarse grid to the fine grid, and by only 0.7% from the medium grid to the fine grid.

| Cases | Re (10^{4}) | *C*_{d} | *C*_{d}′ | *C*_{l}′ | St |
|---|---|---|---|---|---|
| Mesh1-Coarse | 2.2 | 2.10 | 0.12 | 1.13 | 0.124 |
| Mesh2-Medium | 2.2 | 2.18 | 0.16 | 1.28 | 0.132 |
| Mesh3-Fine | 2.2 | 2.18 | 0.17 | 1.27 | 0.131 |
| Experiment (Minguez *et al.* 2011) | 2.0–2.2 | 2.10 | / | / | 0.130 |
| Lyn & Rodi (1994) | 2.14 | 2.10 | / | / | 0.134 |
| Bearman & Obasaju (1982) | 2.0 | 2.10 | / | 1.20 | 0.130 |
| DNS (Trias *et al.* 2015) | 2.2 | 2.18 | 0.20 | 1.71 | 0.132 |
| LES (Chen *et al.* 2020) | 2.2 | 2.25 | 0.14 | 1.45 | 0.135 |
| Cao & Tamura (2016) | 2.2 | 2.11–2.30 | 0.14–0.27 | 1.26–1.54 | 0.126–0.138 |
| Fureby *et al.* (2000) | 2.2 | 2.10 | 0.19 | 1.34 | 0.135 |
| k-ω (Wu *et al.* 2023) | 2.2 | 2.17 | 0.17 | 1.93 | 0.129 |

The drag coefficient (*C*_{d}) for both fine and medium grids is consistent, aligning with results obtained using DNS in Trias *et al.* (2015). However, the calculated fluctuation of the drag coefficient (*C*_{d}′) for the fine grid is slightly lower than that of Trias, with an absolute error of 0.03. Regarding the fluctuation of the lift coefficient (*C*_{l}′), the results calculated by LES are consistently lower than those obtained by DNS. The fluctuation of the lift coefficient (*C*_{l}′) presented in this study falls within the range calculated by Cao & Tamura (2016). To balance computation time and accuracy, a medium-density grid is employed, and further verification of key flow field indicators has been conducted.

Figure 2(a) compares the velocity in the *x*-direction along the centerline of the rectangular cylinder with experimental results (Lyn & Rodi 1994; Trias *et al.* 2015). The current results are in good agreement with them, while the calculations by Cao & Tamura (2016) are higher in the region where *x*/*D* > 2 for the rectangle. Figure 2(b) compares the pressure coefficient *C*_{p} on the wall of the rectangle. Overall, the results derived from the current numerical simulation agree well with the experimental results reported by Bearman & Obasaju (1982) and Nishimura (2001), with slight overestimation. Figure 2(c) and 2(d) present the Reynolds normal stress and Reynolds shear stress, respectively, at a *y*-directional distance of 0.125*D* from the rectangle. The results indicate that the simulated Reynolds normal stress and Reynolds shear stress generally align with the experimental findings of previous studies (Lyn & Rodi 1994; Minguez *et al.* 2011). Although there is a slight overestimation in the values, the current results present the least deviation compared with the LES results of Cao & Tamura (2016) and Chen *et al.* (2020).

Since the current medium grid and boundary condition settings meet the accuracy requirements, the same settings are applied to calculate the flow field of the trash racks (composed of three 7.23:1 rectangles).

### Neural network framework

#### Sub-pixel convolution layer

In the current literature, the high-resolution flow field used as a model input is predominantly obtained through bicubic interpolation (Dong *et al.* 2014). As a compromise with the significant demand for computational resources and time, the 2D flow field is typically interpolated onto a 64 × 64 or 128 × 128 grid. However, flow fields derived from DNS or LES can have millions of grid cells within a 2D slice. Interpolating such a vast amount of grid node data onto a 128 × 128 plane will inevitably result in significant numerical errors or even alter the original flow field characteristics. Therefore, it is necessary to interpolate the computed results into higher-resolution flow fields, such as 512 × 512 and 1,024 × 1,024.

To enhance spatial resolution without consuming significant GPU memory, Shi *et al.* (2016) proposed a network framework based on a sub-pixel convolution layer. The design concept is as follows.

In a network with *n* layers, the first *n* − 1 layers are used for feature extraction in low-resolution space:

$$f_{l}(I_{LR}) = \Phi\left(W_{l} * f_{l-1}(I_{LR}) + b_{l}\right), \quad f_{0}(I_{LR}) = I_{LR}$$

In the equation, *l* ∈ [1, *n* − 1], *W*_{l} and *b*_{l} represent network parameters, and Φ denotes the activation function. The final layer, *f*_{n}, represents the sub-pixel convolution layer that converts the LR feature map *f*_{n−1}(*I*_{LR}) into an HR image, *I*_{HR}. This implies that the neural network primarily extracts features from LR images. Reducing the input and output resolution not only reduces the computational memory of the network framework but also allows the use of smaller convolutional kernels to extract information, thereby achieving high-definition image reconstruction. The sub-pixel convolution layer, commonly referred to as the pixel shuffle (PS) layer, operates on low-resolution feature maps with high depth (number of channels), such as those of size (*h*, *w*, *c* × *r*^{2}). Its primary function is to spatially rearrange the information in the depth (channel) dimension by splitting the feature map into blocks of size *r* × *r* and arranging them in a specific order on a new two-dimensional plane. This process can be understood as 'projecting' the information in the depth direction onto the spatial axes. Ultimately, the output feature map size is expanded to (*h* × *r*, *w* × *r*, *c*) after being processed by the PS layer, leading to high-resolution reconstruction. In other words, the resolution is magnified by a factor of *r*, while the depth is reduced to 1/*r*^{2} of the original.
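The rearrangement performed by the PS layer can be sketched in plain Python. This is an illustrative sketch, not the network code used in the study: the function name `pixel_shuffle` is hypothetical, the data layout is the channel-last (*h*, *w*, *c* × *r*^{2}) form used in the text, and the channel ordering is an assumption intended to mirror the convention of PyTorch's `PixelShuffle`.

```python
def pixel_shuffle(x, r):
    """Rearrange an (h, w, c*r*r) nested list into (h*r, w*r, c).

    Assumed channel ordering: input channel k*r*r + di*r + dj of LR pixel
    (i, j) becomes output channel k at HR pixel (i*r + di, j*r + dj).
    """
    h, w, depth = len(x), len(x[0]), len(x[0][0])
    if depth % (r * r):
        raise ValueError("channel count must be divisible by r*r")
    c = depth // (r * r)
    out = [[[0.0] * c for _ in range(w * r)] for _ in range(h * r)]
    for i in range(h):
        for j in range(w):
            for k in range(c):
                for di in range(r):
                    for dj in range(r):
                        src = k * r * r + di * r + dj
                        out[i * r + di][j * r + dj][k] = x[i][j][src]
    return out


# One LR pixel with 4 channels and r = 2 becomes a 2 x 2 patch with 1 channel.
x = [[[1.0, 2.0, 3.0, 4.0]]]
y = pixel_shuffle(x, r=2)

# A (2, 3, 8) feature map with r = 2 becomes (4, 6, 2).
z = pixel_shuffle([[[0.0] * 8 for _ in range(3)] for _ in range(2)], r=2)
z_shape = (len(z), len(z[0]), len(z[0][0]))
```

No arithmetic is performed on the values; the layer only moves them, which is why all learned convolutions can stay in LR space.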

#### SRWgan-GP network

Gulrajani *et al.* (2017) introduced the Wasserstein GAN with gradient penalty (WGAN-GP), which enhances stability throughout the training process. The WGAN-GP network consists of a generator and a discriminator, which compete with each other during training to achieve an adversarial effect. One of the unique aspects of WGAN-GP is the construction of its loss function. The traditional binary cross-entropy loss function is replaced with the Wasserstein distance (also known as the Earth Mover's Distance) to measure the distance between generated data and real data. The Wasserstein distance is as follows:

$$W(P_{r}, P_{g}) = \inf_{\gamma \in \Pi(P_{r}, P_{g})} \mathbb{E}_{(x, y) \sim \gamma}\left[\lVert x - y \rVert\right]$$

where *P*_{r} and *P*_{g} are two distributions, and *Π*(*P*_{r}, *P*_{g}) is the set of all joint distributions *γ*(*x*, *y*) whose marginals are, respectively, *P*_{r} and *P*_{g}. This transformation enables the discriminator's output to represent the distribution distance between samples, encouraging the generation of complex sample distributions similar to the target while avoiding training instability issues. Moreover, to address the training complexity of generative adversarial networks, which can lead to non-convergence or mode collapse, the gradient penalty method (Gulrajani *et al.* 2017) is applied to directly constrain the gradient, preventing gradient explosions. Our paper proposes a neural network framework named SRWgan-GP for high-resolution (HR) flow field reconstruction by integrating a sub-pixel convolution layer and residual blocks into the WGAN-GP framework.
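To make the Wasserstein distance concrete, the sketch below evaluates it for two one-dimensional empirical samples of equal size, where the optimal coupling *γ* simply pairs the sorted values. This is only an illustration of the definition: in a WGAN the distance is not computed this way but approximated by the discriminator, and the function name `wasserstein_1d` and the sample values are hypothetical.

```python
def wasserstein_1d(samples_r, samples_g):
    """Wasserstein-1 distance between two equally weighted 1D samples.

    For equal sample sizes in one dimension, the optimal transport plan
    matches the i-th smallest real value to the i-th smallest generated value.
    """
    if len(samples_r) != len(samples_g):
        raise ValueError("this sketch assumes equal sample sizes")
    xs, ys = sorted(samples_r), sorted(samples_g)
    return sum(abs(a - b) for a, b in zip(xs, ys)) / len(xs)


# Hypothetical "real" and "generated" samples; the generated set is the
# real set shifted by 0.5, so the transport cost per unit mass is 0.5.
real = [0.0, 1.0, 2.0]
generated = [2.5, 0.5, 1.5]
distance = wasserstein_1d(real, generated)
```

Unlike the binary cross-entropy loss, this quantity shrinks smoothly as the generated samples move towards the real ones, which is the property the adversarial training relies on.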

*et al.* 2023). The Unet model comprises an encoder–decoder framework, which maps the input into a latent representation space and subsequently generates new images from this space. As shown in Figure 3(a) and 3(b), the specific operations of the model are described as follows:

(1) In the encoder: Multiple blocks are utilized to extract features, and the resulting feature maps from the block outputs are saved to be used in the Unet's skip connections. Each block consists of two convolutional layers with a kernel size of 3 and one max pooling layer. While the convolution layers do not alter the size of the feature maps, the pooling layer reduces the size of the feature maps by half. (2) In the latent representation space: two residual blocks are employed to enhance the model's ability to capture high-dimensional information while maintaining the constant feature map size of 64 × 64. (3) In the decoder: deconvolution operations are used to increase the size of the feature maps by a factor of 2. The output result of each block is then fused with the corresponding skip connection from the Encoder using concatenation. Finally, a pixel shuffle (PS) layer is incorporated to map the size of the feature maps to a super-high-resolution flow field of 512 × 512.

The discriminator's framework is relatively simple, with its main function being to determine the authenticity of images. Firstly, the high-resolution flow field (512 × 512) is transformed into higher-dimensional data with a smaller feature size using the PixelUnShuffle operation, effectively reducing network parameters. Subsequently, convolution operations are applied multiple times to convert the authenticity of the image into the similarity of image distributions.
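The PixelUnShuffle step is the inverse of the pixel shuffle rearrangement: spatial blocks are folded into the channel dimension. The sketch below is illustrative only. The function names are hypothetical, the channel-last layout and channel ordering are assumptions, and the downscale factor of 8 in the shape example is a placeholder, since the factor used in the discriminator is not stated in the text.

```python
def pixel_unshuffle(x, r):
    """Rearrange an (H, W, c) nested list into (H/r, W/r, c*r*r)."""
    height, width, c = len(x), len(x[0]), len(x[0][0])
    if height % r or width % r:
        raise ValueError("spatial size must be divisible by r")
    h, w = height // r, width // r
    out = [[[0.0] * (c * r * r) for _ in range(w)] for _ in range(h)]
    for i in range(h):
        for j in range(w):
            for k in range(c):
                for di in range(r):
                    for dj in range(r):
                        dst = k * r * r + di * r + dj
                        out[i][j][dst] = x[i * r + di][j * r + dj][k]
    return out


def unshuffled_shape(height, width, channels, r):
    """Shape of the feature map after pixel unshuffle."""
    return (height // r, width // r, channels * r * r)


# A 4 x 4 single-channel field with values 0..15, folded with r = 2.
field = [[[float(i * 4 + j)] for j in range(4)] for i in range(4)]
folded = pixel_unshuffle(field, r=2)

# Hypothetical factor r = 8 applied to a 512 x 512 single-channel field.
shape_512 = unshuffled_shape(512, 512, 1, r=8)
```

The number of values is unchanged, but subsequent convolutions act on a much smaller spatial grid, which is where the saving for the discriminator comes from.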

*N* is the size of the training dataset. The goal of deep learning is to find the optimal function *f* by updating the parameters Θ to minimize the loss function *L*.

The neural network model in this study is constructed using PyTorch and trained on an NVIDIA GeForce RTX 3060 Ti. The Adam optimizer is employed to minimize the loss function. Both the generator and the discriminator adopt a learning rate of 0.0001, and stable convergence is achieved after 2,000 training iterations. A total of 400 flow field datasets are used for training, 100 for validation, and 500 for testing. Each dataset has a time interval of 0.0001 s, resulting in a total test data duration of 0.5 s.

### Performance metrics

*et al.* 2004). MSE and RMSE both represent the average error between two samples *I*(*i*, *j*) and *J*(*i*, *j*), while PSNR is commonly used alongside MSE to indicate the ratio of the maximum possible power of a signal to the MSE. Higher PSNR values correspond to better sample quality. In contrast to MSE and PSNR, SSIM is designed to align more closely with human visual perception. It evaluates the structure, brightness, and contrast of samples, generating values ranging from −1 to 1. SSIM values closer to 1 indicate a greater similarity between samples. The formulas for calculating MSE, PSNR, and SSIM are as follows:

In the formula, *m*,*n* are the dimensions of the sample, MAX_{I} represents the maximum possible pixel value of the sample, *μ*_{I}, *μ*_{J} are the average pixel values, *σ*_{I},*σ*_{J} are the variance of the pixel values, *σ*_{IJ} is the covariance, and *c*_{1} and *c*_{2} are constants to stabilize the division with a weak denominator.
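The three metrics can be sketched with their standard textbook definitions, using the symbols defined above. This is an illustration under stated assumptions rather than the evaluation code of the study: SSIM is computed here once over the whole field (a single global window) instead of over local windows, the constants use the conventional choices *c*_{1} = (0.01 MAX_{I})^{2} and *c*_{2} = (0.03 MAX_{I})^{2}, and the function names and toy fields are hypothetical.

```python
import math


def mse(a, b):
    """MSE = (1 / (m n)) * sum_ij (I(i, j) - J(i, j))^2."""
    count = len(a) * len(a[0])
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / count


def psnr(a, b, max_value):
    """PSNR = 10 * log10(MAX_I^2 / MSE), infinite for identical samples."""
    err = mse(a, b)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)


def ssim_global(a, b, max_value):
    """SSIM evaluated once over the whole field (simplifying assumption)."""
    fa = [x for row in a for x in row]
    fb = [y for row in b for y in row]
    n = len(fa)
    mu_a, mu_b = sum(fa) / n, sum(fb) / n
    var_a = sum((x - mu_a) ** 2 for x in fa) / n
    var_b = sum((y - mu_b) ** 2 for y in fb) / n
    cov = sum((x - mu_a) * (y - mu_b) for x, y in zip(fa, fb)) / n
    c1 = (0.01 * max_value) ** 2  # conventional stabilizing constants
    c2 = (0.03 * max_value) ** 2
    numerator = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    denominator = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return numerator / denominator


# Hypothetical 2 x 2 reference and reconstructed fields.
truth = [[0.0, 1.0], [1.0, 0.0]]
pred = [[0.0, 1.0], [1.0, 0.2]]
err = mse(truth, pred)
quality = psnr(truth, pred, max_value=1.0)
similarity = ssim_global(truth, pred, max_value=1.0)
```

RMSE, used later for the probe time series, is simply `math.sqrt(err)`.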

A distance measure, *L*, within the compromise programming (CP) framework is used to evaluate and rank the different models, as proposed by Khan *et al.* (2023). The distance measure *L* is defined as follows:

In the formula, MSE_{ideal} is 0, PSNR_{ideal} is 60, and SSIM_{ideal} is 1.

## RESULTS AND DISCUSSION

### Flow field characteristics

When water flows through a trash rack, Kármán vortex streets form behind it, inducing multiple micro vortices. These structures are mainly concentrated around the vortex street and make the overall flow field appear disordered, which poses a great challenge for reconstructing such a complex flow using deep learning methods. The current investigation adopts SRWgan-GP to reconstruct the high-resolution instantaneous flow field. To facilitate appropriate comparisons, the performance of SRWgan-GP was assessed against the Unet and CNN models, which have been widely used in reconstructing flow fields. Specifically, the generator component of the SRWgan-GP model was implemented using the Unet framework. Comparing against Unet alone therefore allows the reconstruction performance to be evaluated without the discriminator component, which is a crucial element of the SRWgan-GP architecture.

The flow fields in the *x*- and *y*-directions and error maps of the velocity field were plotted to highlight the distribution of errors. Table 2 quantifies the errors in the samples using MSE, PSNR, and SSIM. It is evident from Figure 4(a) that SRWgan-GP exhibits superior predictive performance, achieving the lowest reconstruction error in the test set with an MSE of 0.001, PSNR of 46.557, and SSIM of 0.994. In comparison, Unet shows slightly inferior performance with MSE, PSNR, and SSIM values of 0.004, 39.160, and 0.975, respectively. While Unet accurately predicts the vortex street, it is less accurate in predicting turbulent flow, particularly in regions near the trailing edge, which suggests that without the discriminator, the ability to predict turbulent vortex structures is reduced. The CNN model's overall reconstruction is the poorest among the three, with MSE, PSNR, and SSIM values of 0.006, 38.234, and 0.954, respectively. This model shows significant prediction errors for both the vortex street and turbulent flow. The distance measure *L* has been calculated for SRWgan-GP, Unet, and CNN, with resulting values of 17.85, 18.92, and 29.9, respectively, which indicates that SRWgan-GP has the best performance. Similarly, Figure 4(b) demonstrates the ability of the three models to reconstruct high-resolution flow fields in the *y*-direction. SRWgan-GP again leads in accurately predicting the flow field in this direction, followed by Unet and then CNN. Compared with the *x*-direction, all three models exhibit greater structural errors in reconstructing the flow field in the *y*-direction, indicating more pronounced turbulence. Cao & Tamura (2016) utilized LES for a square cylinder at a Reynolds number of 22,000, consistent with the Reynolds number in this study, and found that the turbulence intensity in the *y*-direction was greater than in the *x*-direction. When water flows past a trash rack, the shedding of vortices causes unstable lateral fluctuation (Nguyen & Naudascher 1991), leading to significant turbulent fluctuations in the *y*-direction. Therefore, the models exhibit greater reconstruction errors in predicting the flow field in the *y*-direction.

| Metric | Unet | SRWGAN-GP | CNN |
|---|---|---|---|
| MSE | 0.004 | 0.001 | 0.006 |
| PSNR | 39.160 | 46.557 | 38.234 |
| SSIM | 0.975 | 0.994 | 0.954 |
| *L* | 17.85 | 18.92 | 29.90 |

### Velocity time series and power spectrum

Probes J1 and J2 are positioned at *x*/*d* = 1 and *x*/*d* = 4 on the central axis behind the middle trash rack bar. J1 is situated within the trailing edge vortex, while J2 is located in the Kármán vortex street region. Figure 5 illustrates that the *x*-directional flow velocity exhibits significant turbulence and violent fluctuations, particularly at J1 where the trailing edge vortex is present. Consequently, the error at this point is higher due to the influence of stronger vortices and disturbances. The RMSE values, which are the square roots of the MSE, for the SRWgan-GP, Unet, and CNN models are 0.0623, 0.0905, and 0.198, respectively. At J2, the more regular flow pattern within the Kármán vortex street makes the velocity easier to predict, so the errors are smaller, with values of 0.0544, 0.0607, and 0.130 for SRWgan-GP, Unet, and CNN, respectively.

*y*-direction flow velocity time series with a minimal RMSE value of 0.238 for J1 and 0.125 for J2. The increased reconstruction error in the *y*-direction, as compared to the *x*-direction, suggests a heightened complexity and turbulence intensity within the *y*-direction flow field.

*x*- and *y*-directions at J2. The results reveal a strong agreement of the SRWgan-GP and Unet results with the LES results in terms of the main frequency and energy of vortex shedding within the frequency range of 0–100 Hz. The primary frequency is approximately 10 Hz, corresponding to a Strouhal number (St) of 0.2, which aligns with previous research (Nguyen & Naudascher 1991; Naudascher & Wang 1993). The exception is the CNN, which overestimates the main frequency at up to 14 Hz. Moreover, the power spectral curves of all models exhibit a −5/3 slope, indicating consistency in the observed turbulence scales within this frequency range. In the high-frequency region beyond 100 Hz, the energy of the power spectral curves derived from the deep learning models gradually deviates from the LES results. This discrepancy suggests that the deep learning models may not accurately capture the small structures and vortices in the flow field, leading to an overestimation of energy in the flow field.
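The link between the spectral peak and the Strouhal number can be sketched as follows. This is an illustrative sketch on a synthetic signal, not the analysis code of the study: the 10 Hz sine wave, its sampling interval, and its length are chosen only for illustration, the function names are hypothetical, and the Strouhal number is assumed to be based on the bar thickness *D* = 0.022 m and the inlet velocity of 1 m/s.

```python
import cmath
import math


def periodogram(signal, dt):
    """Plain DFT periodogram of a mean-removed signal (positive frequencies).

    Two-sided scaling is used; only the peak location matters in this sketch.
    """
    n = len(signal)
    mean = sum(signal) / n
    centred = [s - mean for s in signal]
    freqs, power = [], []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            c * cmath.exp(-2j * math.pi * k * m / n) for m, c in enumerate(centred)
        )
        freqs.append(k / (n * dt))
        power.append(abs(coeff) ** 2 * dt / n)
    return freqs, power


def dominant_frequency(signal, dt):
    """Frequency at which the periodogram is largest."""
    freqs, power = periodogram(signal, dt)
    return freqs[power.index(max(power))]


def strouhal(frequency, length, velocity):
    """St = f * D / U."""
    return frequency * length / velocity


# Synthetic probe signal: a 10 Hz oscillation, 500 samples, 1 ms apart.
dt = 0.001
signal = [math.sin(2 * math.pi * 10.0 * m * dt) for m in range(500)]
f_peak = dominant_frequency(signal, dt)

# Assumed reference scales: bar thickness D and inlet velocity U.
st = strouhal(f_peak, length=0.022, velocity=1.0)
```

With these assumed scales, a 10 Hz peak gives St = 0.22, in line with the value of about 0.2 quoted above.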

### Instantaneous Reynolds shear stress

The instantaneous Reynolds shear stress is computed from the *u*_{x} and *u*_{y} flow fields to assess the capability of deep learning models in reconstructing higher-order quantities within the flow field. The Reynolds shear stress is as follows:

$$\tau = -\rho u' v'$$

where *ρ* denotes the density of water, while *u*′ and *v*′ correspond to the fluctuating flow velocities in the *x*- and *y*-directions, respectively. Owing to the significantly distorted distribution of instantaneous Reynolds shear stress computed by the CNN, Figure 8 presents the distribution of the instantaneous Reynolds shear stress and corresponding errors only for the other two deep learning models. The figure illustrates the capability of both the SRWgan-GP and Unet models to reconstruct the instantaneous second-order quantities in the flow field. The momentum exchange primarily occurs in the Kármán vortex street, which undergoes periodic oscillations, with the error concentrated in this region. Compared to Unet, which has an MSE of 0.00018, PSNR of 41.07, and SSIM of 0.88, SRWgan-GP accurately computes the instantaneous shear stress with an MSE of 0.00010, PSNR of 42.14, and SSIM of 0.90, resulting in flow field patterns that closely resemble the actual flow.
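The computation of the instantaneous Reynolds shear stress from a set of velocity snapshots can be sketched as below. This is a minimal illustration under stated assumptions: the fluctuations are taken relative to the time average over the available snapshots, the water density is assumed to be 1,000 kg/m^{3}, and the function names and the tiny two-snapshot dataset are hypothetical.

```python
def fluctuations(snapshots):
    """Subtract the time average at each grid point from every snapshot."""
    n = len(snapshots)
    rows, cols = len(snapshots[0]), len(snapshots[0][0])
    mean = [
        [sum(s[i][j] for s in snapshots) / n for j in range(cols)]
        for i in range(rows)
    ]
    return [
        [[s[i][j] - mean[i][j] for j in range(cols)] for i in range(rows)]
        for s in snapshots
    ]


def reynolds_shear_stress(u_fluct, v_fluct, rho=1000.0):
    """Instantaneous -rho * u' * v' at every point of one snapshot.

    rho = 1000 kg/m^3 is an assumed density for water.
    """
    return [
        [-rho * u * v for u, v in zip(row_u, row_v)]
        for row_u, row_v in zip(u_fluct, v_fluct)
    ]


# Two hypothetical snapshots on a 1 x 2 grid.
u_snapshots = [[[1.0, 2.0]], [[3.0, 2.0]]]
v_snapshots = [[[0.5, 0.0]], [[-0.5, 0.0]]]

u_prime = fluctuations(u_snapshots)
v_prime = fluctuations(v_snapshots)
tau_first = reynolds_shear_stress(u_prime[0], v_prime[0])
tau_second = reynolds_shear_stress(u_prime[1], v_prime[1])
```

Because the stress is a product of two fluctuating fields, errors in either reconstructed velocity component compound, which makes it a demanding test of the models.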

### Discussion

In this study, the SRWgan-GP network framework was successfully employed to reconstruct high-resolution (512 × 512) turbulent flow structures of trash racks. Most researchers (Liu *et al.* 2020; Ribeiro *et al.* 2020; Zhou *et al.* 2024) use deep learning network frameworks that output flow field resolutions at 128 × 128 or even lower. For simple flow features or laminar flow fields, this resolution might be sufficient, but for the DNS and LES turbulent flow field data, hundreds of thousands to millions of computational cells are required to capture smaller-scale turbulent structures. Additionally, higher resolutions provide a broader reconstructed field, allowing for a more comprehensive analysis of the flow field's dynamic characteristics. Therefore, we propose using a sub-pixel convolution layer in the final layer of the generator to enhance the flow field resolution from 128 × 128 to 512 × 512 which significantly improves the resolution. For fine turbulent fields that rely on higher grid resolutions, this model offers substantial advantages.

The chaotic behavior of turbulence across a wide range of spatiotemporal scales, together with its inherent irregularity and disorder, makes it difficult to accurately reconstruct a turbulent field. Kim *et al.* (2021) reconstructed isotropic turbulence from DNS calculations using a Cycle-GAN model at a lower scaling factor (*r* = 4), where the reconstructed flow field was very close to the actual field, with the smallest MSE of 0.00548. However, at a higher scaling factor (*r* = 16), the reconstructed flow field differed significantly from the actual field, resulting in a much larger MSE of 0.087. Zhou *et al.* (2024) applied their LCF-CNN model to reconstruct the PIV experimental flow field of a cylinder at a Reynolds number of 33,000. At *r* = 4 (*r* = 16), the average relative errors of the reconstructed flow field in the *x*- and *y*-directions were 0.0335 (0.0872) and 0.1274 (0.457), respectively. In our study, using a scaling factor of 32 (512/16 = 32), the proposed method based on WGAN-GP enhanced the training stability and successfully reconstructed the trash rack flow field with an MSE of 0.001, accurately capturing the turbulent structures and the Kármán vortex street of the trash racks.

In summary, the model proposed in this paper, with its higher flow field resolution, is better suited for engineering applications involving the reconstruction of flow fields with a large number of grids and extensive computational domains. Additionally, the model possesses high accuracy, capable of precisely reconstructing actual flow characteristics, thereby providing high-quality flow field reconstruction results.

## CONCLUSION

This paper presents the SRWgan-GP neural network framework for reconstructing ultra-high-resolution (512 × 512) flow fields, achieving successful reconstruction of 2D flow fields behind the trash racks. Through a comparison with two typical generative models, namely Unet and CNN, this study analyzes and validates the accuracy in reconstructing flow fields in the *x*- and *y*-directions, as well as capturing turbulence characteristics and higher-order quantities using deep learning models. The main conclusions of our study are as follows:

(1) The SRWgan-GP model demonstrates high accuracy in reconstructing the flow field in the *x*- and *y*-directions, evidenced by its MSE of 0.001, SSIM of 0.994, and PSNR of 46.557. These metrics suggest that the model is effective in capturing the nuances of the Kármán vortex street and the turbulent structures present downstream of the trash rack. Unet captures the vortex street efficiently but exhibits a larger error in reconstructing smaller turbulence structures. The flow field reconstructed by the CNN exhibits significant errors, as indicated by an MSE of 0.006, SSIM of 0.954, and PSNR of 38.234.

(2) The SRWgan-GP model demonstrates effective reconstruction of the flow velocity time series in both the *x*- and *y*-directions. Notably, the reconstruction error for the *y*-direction time series is more pronounced than in the *x*-direction, suggesting a higher turbulence intensity in the *y*-direction. The shedding frequency of the Kármán vortex street downstream of the trash rack is observed to be 10 Hz. Both the SRWgan-GP and Unet models align with this frequency, whereas the CNN model overestimates it, reaching 14 Hz. In the higher frequency domain (>100 Hz), the energy represented in the power spectra of all three models is consistently overestimated, suggesting challenges in accurately capturing flow dynamics at smaller temporal scales.

(3) The results demonstrate that SRWgan-GP exhibits the smallest error and effectively reconstructs the instantaneous Reynolds shear stress field. However, Unet shows a larger error in the vicinity of the trash racks' trailing edge.

## FUNDING

This research is supported by the National Natural Science Foundation of China (Grant Nos. 52179060, 52209081, and 51909024).

## DATA AVAILABILITY STATEMENT

Data cannot be made publicly available; readers should contact the corresponding author for details.

## CONFLICT OF INTEREST

The authors declare there is no conflict.

## REFERENCES

Nishimura (2001) *Fundamental study of bluff body aerodynamics*. Ph.D. Thesis, Kyoto University, 2001 (In Japanese).

*arXiv preprint arXiv:2004.08826*. http://doi.org/10.48550/arXiv.2004.08826