In this paper, a processing conversion and parallel control platform (PCsP) is proposed for transitioning serial hydrodynamic simulators to a cluster-computing system. We previously undertook efforts to promote the research and development of this type of platform and to demonstrate and commercialize it. Our PCsP provides distributed and parallel patterns, a centralized architecture, and user support. To validate the methodology and highlight its simplicity, we adopted the technology in various applications based on multi-grid algorithms. The methodology was shown to be reliable and feasible across computational domains, partitioning strategies, and multi-grid codes. Furthermore, its effectiveness was demonstrated using a complex engineering case in addition to code based on slightly less complex mathematical models. Eventual transition to a cluster-computing system will require further investigation of the impact of different model combinations on calculation accuracy, efficiency of operating models, and PCsP functional development.
INTRODUCTION
Owing to improvements in parallel computing and networking as well as the increasing availability of computer clusters, the development of an approach that effectively utilizes computing power has been a major objective for hydraulic engineers since the mid-2000s. This is especially applicable to hydrodynamics, in which large sets of equations must be simultaneously solved (Afshari et al. 2008). Parallel computing can leverage the problem-solving ability of computer clusters and requires parallel algorithms and a supporting framework (Holmes et al. 2011). The parallel implementation of an algorithm is traditionally based on topologies for structuring knowledge, thereby facilitating the efficient creation of complex relationships among data by means of specific languages, including different levels of abstraction (e.g. PRAM, BSP, and LogP) (Chui et al. 2009). Despite its parallel processing ability, such an implementation faces an underlying problem in that distributed and parallel programming mainly focus on coding topologies and complex relationships (Márquez et al. 2011). This makes parallel applications inaccessible and unmanageable from the perspective of hydraulic engineers, who are rarely experts in topological representations and languages (Chen & Lin 2008). Therefore, most specialized programs that have been developed by hydraulic engineers are limited to serial computation.
The need to reuse these types of programs to develop parallel applications has motivated recent research efforts. However, these efforts face two major hurdles (Jagers 2010). First, the programs generally provide very limited features because they are developed primarily to prove some concepts or investigate certain optimizations. Thus, they are unlikely to provide the required power and utility compared with their corresponding applications. In addition, the lack of a supporting framework means that these programs must be constructed from scratch and with multiple programming languages. Therefore, an urgent need exists for an innovative approach and related techniques for converting existing serial programs into parallel applications without modifying the source code and without the need for expert knowledge of topological representations and languages.
This paper relays our experience of developing such a cluster-computing environment for available facilities. The tangible contribution of this project is a processing conversion and parallel control platform (PCsP), which converts a program into a module for parallel applications solely by inserting a procedural function into the program loops. Accordingly, serial programs can be processed in parallel on a computer cluster. The objective of this approach is to improve serial programs through a simple parallel design scheme. The approach achieves an effective speedup on multi-core processors. Moreover, numerical results obtained with this method demonstrate its accuracy with respect to the computational domain (in particular, when considering hybrid computing). Furthermore, the effectiveness of the approach was demonstrated in a practical engineering case (Wang et al. 2011a) and with code based on relatively less complex mathematical models (Shang et al. 2011a, 2011b).
Ongoing projects have shifted our attention to hydrodynamic simulators developed on the basis of multi-grid algorithms. Our approach can be used to parallelize heterogeneous code from a wide range of time-dependent models. In this context, we herein detail the PCsP's development and show how all of the content in a serial program can be encapsulated in a parallel module on the basis of a geometric partitioning strategy (Fialko 2010).
The remainder of this paper is organized as follows. The next section briefly reviews related studies, followed by a section describing the architecture of the proposed PCsP. Implementation issues are discussed in the following section, and the final section summarizes our findings and concludes the paper.
RELATED WORK
The practical approaches that researchers have used when devising parallel applications are based on providing either domain-specific tools or general-purpose solutions. The domain-specific approach is to supply users with standard application programming interfaces (APIs), through which a serial program can be executed in a distributed and coordinated manner (Shang et al. 2007a). To this end, a developer must, in principle, indicate which parts of the program will benefit from being parallelized. This is accomplished by inserting appropriate calls to the APIs into the sequential code. Through the provision of APIs for executing different components of a program, the program is parallelized by dividing it into a number of distributed components that communicate through exchanged messages (Shang et al. 2011a, 2011b). The message passing interface (MPI) (Shang et al. 2007b) and the parallel virtual machine (PVM) (Gregersen et al. 2007) are two well-established parallel computing environments. Object-oriented methods are used to develop APIs based on MPI or PVM. These object-oriented methods mitigate the complexity inherent to writing parallel applications by encapsulating common distributed and parallel patterns in an intuitive API. A brief review of the development of these APIs can be found in the literature (Moore & Tindall 2005; Bugaets 2014). However, this approach has attracted significant criticism because MPI and PVM are basically low-level parallelization tools that require users to have substantial code-specific knowledge of both parallel programming and distributed deployment. Therefore, considerable manual effort is required whenever a new application is considered.
On the other hand, the general-purpose approach tends to relieve users of parallelization and deployment tasks as much as possible by raising the level of abstraction that the APIs present to users. In general, this approach adopts a centralized architecture to provide a high-level interface for accessing and manipulating components of a program by using an API that conforms to component object model (COM) automation (Shang et al. 2014). With this approach, prerequisite knowledge of coding is replaced by the need to execute a detailed trace of an application, where the trace is identified with the main communication and computation activities (Shang et al. 2007c). Thus, the main advantage of this approach is that users are not required to have highly detailed knowledge of the program being considered. Consequently, a new application can be assessed with relative ease. However, these programs must have an exposed API that enables a developer to change their inherent behavior. This implies that a parallel application can be developed from serial programs only if one knows the APIs in advance.
The PCsP enables modules to be linked with different spatial and temporal model representations. A universal plugin is inserted into the program prior to link-enabling the compiled counterpart. To improve the development environment, functionalities provided in the domain-specific and general-purpose approaches are also adopted, including distributed processing, COM automation, dynamic data exchange, and centralized control. The PCsP does not require users to learn parallel programming APIs and their code. Moreover, it enables even novice users to introduce parallelism into serial programs.
PCSP APPROACH
General description
In Figure 2, the process for solving a model is structured as a time-ordered series of iterations. In each cycle, a new solution set is obtained by solving the discrete set of algebraic equations with a fixed or varying time-step, whereby the solution set from the previous cycle is used to configure the boundary conditions of the next cycle. Based on this typical behavior, time-dependent models can be solved concurrently by using the solution set from one model to change the boundary conditions of another model on a different computer. Because the computation flows of the models can also be modified, the procedure can be processed concurrently by a cluster of computers. Moreover, the dynamic interactions of the physical process can be simulated synchronously using independent mathematical models, provided that these models are controlled at each time-step when exchanging data.
Traditionally, the open modeling interface (OpenMI) is used at run-time to exchange data between models. OpenMI adopts a ‘request and reply’ mechanism, whereby linked models may run asynchronously with respect to the time-steps. Interfaces are created using C# and Java. They are represented by a set of software interfaces that predefine how the programs are executed and how data are transferred. Models that comply with the interface standard can be configured to exchange data during computation (at run-time). To become an ‘OpenMI-compliant’ component, a time-dependent model should be programmed in accordance with the OpenMI standards, and it must pass the dimensional checks on the quantities linked.
The concept discussed above and its equivalents in terms of the PCsP structure, shown in Figure 3, correspond to a server and clients in a client–server model. In this structure, the PCsP plays the role of a master (server) that exchanges messages with each module (client) in a central-access environment. As depicted in Figure 3, by inserting a procedural function (i.e. a plugin) into the loops, a program is converted into a module for parallel application. The module changes its behavior and executes instructions according to the rules determined by the master.
First, the module passes attribute information, such as inputs, outputs, and geometries (i.e. grids) to the master. Then, the master compiles the required information and passes it along to the end-user. Based on the user's objective, the information collected from the other modules is transmitted to the target module for computation. Finally, after the module completes the computation for each time-step, it sends the output results to the master, and the master proceeds with the computation of the next cycle.
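The exchange described above can be illustrated with a minimal in-process sketch. It is not the PCsP implementation: the class names `Module` and `Master`, the link dictionary, and the two toy solvers are assumptions made for illustration, and the TCP/IP transport is replaced by direct function calls.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Module:
    """A serial program wrapped as a client; `step` stands in for one time-step."""
    name: str
    inputs: List[str]
    outputs: List[str]
    step: Callable[[Dict[str, float]], Dict[str, float]]

    def attributes(self) -> dict:
        # The module reports its attribute information to the master.
        return {"name": self.name, "inputs": self.inputs, "outputs": self.outputs}


@dataclass
class Master:
    modules: Dict[str, Module] = field(default_factory=dict)
    # (source module, output) -> (target module, input), chosen by the user.
    links: Dict[Tuple[str, str], Tuple[str, str]] = field(default_factory=dict)
    results: Dict[str, Dict[str, float]] = field(default_factory=dict)

    def register(self, module: Module) -> dict:
        self.modules[module.name] = module
        self.results[module.name] = {}
        return module.attributes()

    def link(self, source: str, output: str, target: str, input_name: str) -> None:
        self.links[(source, output)] = (target, input_name)

    def run_cycle(self) -> None:
        # Route the results of the previous cycle to the target modules ...
        gathered = {name: {} for name in self.modules}
        for (src, out), (dst, inp) in self.links.items():
            if out in self.results[src]:
                gathered[dst][inp] = self.results[src][out]
        # ... then let every module compute one time-step and report back.
        for name, module in self.modules.items():
            self.results[name] = module.step(gathered[name])


def make_upstream(q0: float):
    """Hypothetical solver: discharge rises by 10 per time-step."""
    state = {"q": q0}

    def step(inputs: Dict[str, float]) -> Dict[str, float]:
        state["q"] += 10.0
        return {"discharge": state["q"]}

    return step


def downstream_step(inputs: Dict[str, float]) -> Dict[str, float]:
    """Hypothetical solver: stage is proportional to the received inflow."""
    return {"stage": 0.01 * inputs.get("inflow", 0.0)}


master = Master()
master.register(Module("upstream", [], ["discharge"], make_upstream(100.0)))
master.register(Module("downstream", ["inflow"], ["stage"], downstream_step))
master.link("upstream", "discharge", "downstream", "inflow")
for _ in range(2):
    master.run_cycle()
```

After two cycles, the downstream module has received the discharge computed by the upstream module in the first cycle, which mirrors the one-cycle lag of the data routed through the master.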
Development of PCsP
Based on the proposed system architecture, the detailed object-design scheme and the proposed solutions for achieving the desired system features are presented in this section. TCP/IP networks (Wang et al. 2011b) are adopted for communication among the modules in hydro-informatics platforms. Owing to its interoperability with other modules, the Internet was considered the appropriate solution in our study. The PCsP connects models not only from different suppliers, domains, and concepts, but also with different spatial and temporal resolutions. The PCsP can be described at two levels. At the user level, it provides a set of standard plugins through which modules are allowed to exchange data among themselves on a time-step basis as they run. At the IT level, the PCsP is the engine for the parallel application of interest. The models involved mutually depend on each other's calculation results. Any model in the multi-grid algorithm can be configured without further programming with regard to the data exchanged at run-time. Linked modules can run synchronously with respect to the time-steps, and data represented on different geometries (i.e. grids) can be seamlessly exchanged.
To facilitate data exchange, the input and output (IO) are synthesized in a suitable geometric format; a ‘geometries’ file is written to facilitate the conversion of IO data. The IO data are written in a single file, while the grid information is preserved in another file. Both files are generated and maintained by the master. The master reads the data and suitably translates them for the input of the target module based on the geometries file in which the IO data are included.
Links
The data and information flows form a complicated network that involves all the linked modules. Data exchange is achieved within the network. The exchanged data and information can be classified into three levels of abstraction: the model, value, and grid levels.
Model level
Model-level information includes the modules for the application, the attributes of the included modules, the links between modules, and the data flow for each link. We denote the model-level links as shown in Table 1.
Source / Target | M1 | … | Mi | … | Mj | … | Mn |
---|---|---|---|---|---|---|---|
M1 | – | … | ML1,i | … | ML1,j | … | ML1,n |
… | … | … | … | … | … | … | … |
Mi | MLi,1 | … | – | … | MLi,j | … | MLi,n |
… | … | … | … | … | … | … | … |
Mj | MLj,1 | … | MLj,i | … | – | … | MLj,n |
… | … | … | … | … | … | … | … |
Mn | MLn,1 | … | MLn,i | … | MLn,j | … | MLn,n |
Here, Mi represents the ith module, and MLi,j denotes the link between source module Mi and target module Mj. This is a one-way link. The reverse direction for this link is denoted by MLj,i for unique correspondence.
Value level
Value-level links constitute a sub-set of model-level links. They relate the outputs of the source module to the inputs of the target module. A model-level link, MLi,j, generally contains multiple value-level links because both the source module and target module are predominantly of the multiple-input multiple-output (MIMO) type. An output of the source module may be sent or converted to multiple inputs of the target module. We map the value-level links as shown in Table 2.
Source / Target | QT1 | … | QTj | … | QTn |
---|---|---|---|---|---|
QS1 | QL1,1 | … | QL1,j | … | QL1,n |
… | … | … | … | … | … |
QSi | QLi,1 | … | QLi,j | … | QLi,n |
… | … | … | … | … | … |
QSm | QLm,1 | … | QLm,j | … | QLm,n |
Here, QSi represents the ith output of the source module, QTj represents the jth input of the target module, and QLi,j represents the link between QSi and QTj.
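As a minimal sketch of how the model-level and value-level links could be held in memory, the following uses a dictionary keyed by the ordered pair (source module, target module), with each entry listing the value-level links it contains. The module, output, and input names are hypothetical, and the data structure itself is an assumption rather than the one used inside the PCsP.

```python
from collections import defaultdict

# Model-level link (source module, target module) -> list of value-level
# links (source output, target input). All names are illustrative only.
model_links = defaultdict(list)


def add_value_link(source, output, target, input_name):
    """Register one value-level link inside the model-level link (source, target)."""
    model_links[(source, target)].append((output, input_name))


def targets_of(source, output):
    """Return every (target module, input) pair fed by one output of `source`."""
    return [
        (tgt, inp)
        for (src, tgt), pairs in model_links.items()
        if src == source
        for out, inp in pairs
        if out == output
    ]


# One output may feed several inputs of the same target (MIMO modules).
add_value_link("M1", "discharge", "M2", "inflow")
add_value_link("M1", "discharge", "M2", "sediment_driver")
# The reverse direction is a separate one-way link, as with MLj,i.
add_value_link("M2", "stage", "M1", "downstream_level")
```

Keying the dictionary by the ordered pair keeps MLi,j and MLj,i distinct, which matches the one-way definition of the links.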
Grid level
The output of the source module may be converted to suit the input format of the target module. The grid-level information serves as the basis for spatial conversion. Two vectors and a two-dimensional matrix are used to record the linking information at the grid level. One vector stores grid-indexing information with respect to the output data of the source module, while the other vector stores grid-indexing information with respect to the input data of the target module. The relationship between the two vectors is recorded using a two-dimensional matrix. Their correspondence is shown in Table 3.
Source / Target | GT1 | … | GTj | … | GTn |
---|---|---|---|---|---|
GS1 | GL1,1 | … | GL1,j | … | GL1,n |
… | … | … | … | … | … |
GSi | GLi,1 | … | GLi,j | … | GLi,n |
… | … | … | … | … | … |
GSm | GLm,1 | … | GLm,j | … | GLm,n |
Here, GSi represents the ith grid of the mesh from where output data are taken and GTj represents the jth grid of the mesh that will receive the input data. The conversion from GSi to GTj is denoted by GLi,j.
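The two vectors and the linking matrix can be sketched as follows. Interpreting each GLi,j as a numerical weight, so that the value on target grid j is the weighted sum of the source-grid values, is an assumption made for illustration; the paper does not specify the form of the conversion.

```python
def convert(source_values, weights):
    """Map values on m source grids onto n target grids.

    weights[i][j] plays the role of GL_{i,j}; treating it as a weight in a
    weighted sum is an assumption of this sketch.
    """
    m = len(source_values)
    if len(weights) != m:
        raise ValueError("weights needs one row per source grid")
    n = len(weights[0])
    if any(len(row) != n for row in weights):
        raise ValueError("every row of weights needs one entry per target grid")
    return [
        sum(weights[i][j] * source_values[i] for i in range(m))
        for j in range(n)
    ]


# Three source grids feeding two target grids: the first target grid averages
# source grids 1 and 2, and the second target grid copies source grid 3.
weights = [
    [0.5, 0.0],
    [0.5, 0.0],
    [0.0, 1.0],
]
target_values = convert([2.0, 4.0, 7.0], weights)  # [3.0, 7.0]
```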
Data conversion
Time-step control
The master determines which modules will be involved. It loads and launches the modules in an orderly manner according to a time-step control strategy. The time-step specifications vary among models, mainly because the models are developed from programs of different model types. Thus, it is necessary to control all the modules so that they complete their respective time-step computations within each cycle, which ensures the consistency of data operations. For example, if Module A lags behind Module B by a few time-steps, the master will push Module A forward through several consecutive time-steps so that the two modules remain within one small time-step of each other.
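A minimal sketch of this control strategy is given below, assuming that each cycle ends at a common simulation time and that every module advances by its own fixed time-step. The function name `run_cycle` and the module step sizes are hypothetical.

```python
def run_cycle(times, steps, cycle_end, eps=1e-9):
    """Advance every module to `cycle_end` and return the steps each one took.

    times: module name -> current simulation time (updated in place)
    steps: module name -> time-step size of that module
    """
    counts = {}
    for name in times:
        if steps[name] <= 0:
            raise ValueError("time-step must be positive")
        n = 0
        # A lagging module is pushed forward through consecutive time-steps
        # until it has caught up with the end of the cycle.
        while times[name] + eps < cycle_end:
            times[name] += steps[name]
            n += 1
        counts[name] = n
    return counts


# Module A uses a 10 s time-step, Module B a 30 s time-step.
times = {"A": 0.0, "B": 0.0}
steps = {"A": 10.0, "B": 30.0}
counts = run_cycle(times, steps, cycle_end=30.0)  # A takes 3 steps, B takes 1
```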
Time-stepping simulations produce data roughly in proportion to the number of time-steps computed. Simulations with many steps can result in large data files. To conserve storage, only the computation results from two consecutive time-steps are stored in the master for each module; these steps are needed for the subsequent computation cycle. Then, linear interpolation between the steps can be adopted to ensure that sufficient information is available for smooth animation.
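The interpolation between the two stored time-steps can be sketched as follows; the function name and the sample values are illustrative only.

```python
def interpolate(t, t0, v0, t1, v1):
    """Linearly interpolate grid values between two stored time-steps.

    v0 and v1 hold the values on the same grids at times t0 and t1.
    """
    if t1 <= t0 or not (t0 <= t <= t1):
        raise ValueError("require t0 < t1 and t0 <= t <= t1")
    w = (t - t0) / (t1 - t0)
    return [a + w * (b - a) for a, b in zip(v0, v1)]


# A frame one quarter of the way between the steps stored at 0 s and 60 s.
frame = interpolate(15.0, 0.0, [1.0, 2.0], 60.0, [3.0, 6.0])  # [1.5, 3.0]
```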
It should be noted that a ‘deadlock’ occurs when Modules A and B are mutually conditional: each module demands from its counterpart the results of the next cycle as the boundary conditions for the current cycle, yet the next cycle can proceed only once the current computational cycle is complete. To decouple the process execution, we use the results from the current cycle as the input for the next cycle.
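The decoupling can be sketched with two mutually dependent toy modules. Both update rules below are hypothetical; the point is only that each module reads its counterpart's result from the cycle just completed, so neither has to wait for the other within a cycle.

```python
def coupled_run(step_a, step_b, a0, b0, cycles):
    """Run two mutually conditional modules with lagged (explicit) coupling."""
    a, b = a0, b0
    history = [(a, b)]
    for _ in range(cycles):
        # Both right-hand sides are evaluated with the values of the current
        # cycle before either is overwritten, so the two calls are
        # independent of each other and could run on different computers.
        a, b = step_a(a, b), step_b(b, a)
        history.append((a, b))
    return history


def step_a(a, b):
    """Hypothetical Module A: relaxes halfway toward B's last result."""
    return a + 0.5 * (b - a)


def step_b(b, a):
    """Hypothetical Module B: relaxes a quarter of the way toward A's last result."""
    return b + 0.25 * (a - b)


history = coupled_run(step_a, step_b, 0.0, 4.0, cycles=2)
# [(0.0, 4.0), (2.0, 3.0), (2.5, 2.75)]
```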
Plugins
By adding the plugins, the program becomes a module of the application. Using an example that considers a sequential mathematical model developed in Visual Fortran for Windows operating systems (C++), this section defines the rules of such a plugin for data exchange. Plugin development involves three tasks: creating tables for cross-referencing between different data types, defining the reference style of function calls, and setting the packing and unpacking rules for datasets.
Computing languages generally adopt a different naming style for each data type. Therefore, a reference table is required for data exchange. Table 4 summarizes the cross-referencing between the basic data types in Fortran and C++.
Fortran data type | C++ data type | Fortran data type | C++ data type |
---|---|---|---|
INTEGER(1) | char | CHARACTER(1) | unsigned char |
INTEGER(2) | short | COMPLEX(4) | struct complex4{ float real, image; } |
INTEGER(4) | int, long | | |
REAL(4) | float | COMPLEX(8) | struct complex8{ … } |
Further, in practice, a Fortran program cannot call a C++ function unless this function has been referenced in the Fortran program. The reference style used in this study is outlined below:
where ‘void’ represents the types of values returned by a function. The use of ‘void’ herein indicates that any data type in Table 4 is allowed as a return value. In addition, ‘FUNNAME’ represents the name of a function, and ‘int & param’ represents the option for the parameter term of the function. The reference style for a C++ function defined by a Fortran program is given below:
Finally, different models running in an application will generate large volumes of data of various data types. For convenience, we propose packing the data in a uniform format. For example, the data in Model_Time_Info can be packed in a dataset as follows:
To ensure that the dataset is correctly unpacked, the receiving end unpacks it through the following procedure: (1) by reading the first four bytes, the receiving end determines the total number of bytes in the incoming dataset and allocates storage space accordingly; (2) by reading the next four bytes, the receiving end determines the type of received data; (3) based on the type and size of the incoming dataset, the receiving end assigns values to the corresponding variables; (4) finally, the receiving end transforms the data using the cross-reference table (e.g. Table 4), when the sending and receiving ends are programmed using different languages.
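Steps (1) to (3) of this procedure can be sketched with Python's `struct` module. Several details are assumptions of the sketch rather than part of the PCsP specification: the little-endian byte order, the convention that the byte count includes the 8-byte header, the numeric type codes in `TYPE_FORMATS`, and the sample values standing in for the content of Model_Time_Info. Step (4), the cross-language type conversion, is omitted.

```python
import struct

# Header: total byte count followed by a data-type code, 4 bytes each.
HEADER = struct.Struct("<ii")

# Hypothetical type codes mapped to struct format characters.
TYPE_FORMATS = {1: "i", 2: "f", 3: "d"}  # INTEGER(4), REAL(4), 8-byte real


def pack_dataset(type_code, values):
    """Pack homogeneous values behind a (total bytes, type code) header."""
    fmt = "<%d%s" % (len(values), TYPE_FORMATS[type_code])
    payload = struct.pack(fmt, *values)
    return HEADER.pack(HEADER.size + len(payload), type_code) + payload


def unpack_dataset(buffer):
    """Reverse of pack_dataset, following steps (1) to (3) of the procedure."""
    # (1) The first four bytes give the total size of the dataset.
    (total,) = struct.unpack_from("<i", buffer, 0)
    # (2) The next four bytes give the type of the received data.
    (type_code,) = struct.unpack_from("<i", buffer, 4)
    # (3) Type and size together determine how many values to read.
    code = TYPE_FORMATS[type_code]
    count = (total - HEADER.size) // struct.calcsize("<" + code)
    values = struct.unpack_from("<%d%s" % (count, code), buffer, HEADER.size)
    return type_code, list(values)


# Placeholder timing data: start time, time-step, and end time in seconds.
dataset = pack_dataset(3, [0.0, 60.0, 3600.0])
type_code, values = unpack_dataset(dataset)
```

Placing the size first lets the receiving end allocate the buffer before it needs to know anything about the content, which is why the size precedes the type code in the header.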
Templates
The plugins serve as the middleware for the PCsP. By virtue of the plugins, a component can receive data from any other component of the PCsP. Nevertheless, a minor modification is required in the source model to make it a component of the PCsP. In order to facilitate the use of the PCsP, we developed a suite of programming templates along with a graphical user interface. These templates are presented below for the benefit of other model developers who can copy the templates to the corresponding locations in their model's source code without further modification. Moreover, such locations are easy to find because the copy procedure merely involves the head of the main program, the IO, and the calculation boundary settings.
Transformation template for a sequential program
Template for creating the PCsP's IO table
Template for outputting results (OutputResult)
Template for setting the calculation boundaries (GetBound)
Note that the functions with the header ‘HYG_’ are modularization middleware functions. An illustration for instructions used in the templates is given in Table 5.
Code | Annotation | Code | Annotation | Code | Annotation |
---|---|---|---|---|---|
1000 | Inputs list | 1005 | Output table | 1010 | Push forward a step |
1001 | Outputs list | 1006 | Computation instruction | 2000 | Remotely request model information |
1002 | Configuration instruction | 1007 | Timing | 2001 | Remotely invoke model |
1003 | Model name | 1008 | Input data | 9998 | Computation completed |
1004 | Input table | 1009 | Output data | 9999 | Quit computation |
IMPLEMENTATION ISSUES
This study employed real examples representing different combinations of various sequential models in order to analyze the effectiveness of the PCsP, which can also be extended to more complex situations. The core problem in applying the PCsP is its feasibility and effectiveness with combinations of models. Therefore, this paper highlights the technical details of the coupling of individual models. Further details regarding individual model development can be found in the literature (Ye et al. 2013; Ye et al. 2014; Guo et al. 2015).
Parallel calculation for sequential processes
In some cases, there is no strict requirement for model coupling. One can directly transfer the value computed at time-step N by the source component to the target component and use it as the boundary condition of the target component at time-step N + 1. All components can then be calculated simultaneously. In our example, a river is divided into two sections (upstream and downstream) that are modeled using two one-dimensional hydrodynamic models (with the same model engine). The coupled sections are then calculated, and the results are compared with those calculated for the entire river.
Parallel computation of different types of serial programs
This example demonstrates that the combined model can fully exploit different sub-models in complex situations and that it provides excellent approximation of the original models as a whole. This procedure improves the accuracy and efficiency of model calculation. Moreover, component-based modeling prevents development duplication and offers considerable portability.
Verifying the efficiency and accuracy of parallelized sequential programs
The efficiency was compared by varying the number of sub-areas while performing calculations over the same number of time-steps (2,000 in this case) on the same quad-core personal computer (PC). The speedup S is defined as S = Ts/Tp, where Ts is the execution time of the sequential algorithm and Tp is the execution time of the parallel algorithm. The ratio of the speedup to the number of sub-areas, S/N, gives the single-zone efficiency, that is, the efficiency per processor when each sub-area is assigned its own processor. The test results for the sub-areas are presented in Table 6. Table 7 summarizes the differences between the different calculation methods.
Number of sub-areas N | Calculation time (s) | Speedup S | Single-zone efficiency S/N |
---|---|---|---|
1 | 2,489 | 1 | 1 |
2 | 1,248 | 1.99 | 0.995 |
4 | 883 | 2.82 | 0.705 |
6 | 996 | 2.50 | 0.417 |
X | Actual (10⁻³) | Proposed scheme (10⁻³) | Literature (Zhang & Shen 2002) (10⁻³) | This study (10⁻³) | Diff (%) |
---|---|---|---|---|---|
0.1 | 2.2224 | 2.3654 | 2.3409 | 2.3731 | 6.78 |
0.3 | 5.8184 | 6.1928 | 6.1266 | 6.2088 | 6.71 |
0.5 | 7.1919 | 7.6547 | 7.5670 | 7.6623 | 6.54 |
0.7 | 5.8184 | 6.1928 | 6.1266 | 6.2088 | 6.71 |
0.9 | 2.2224 | 2.3654 | 2.3409 | 2.3731 | 6.78 |
The results indicate that the PCsP maintains the calculation accuracy and improves the calculation efficiency. The speedup reaches its maximum value of 2.82 at the optimal number of four sub-areas. With more sub-areas, the speedup decreases because the calculation capacity of the quad-core PC is already saturated, and the additional processes are queued to await execution.
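The speedup and single-zone efficiency in Table 6 follow directly from the measured calculation times. The short sketch below recomputes them; rounding the speedup to two decimals before dividing by N is an assumption that reproduces the tabulated efficiencies.

```python
# Calculation times from Table 6: number of sub-areas -> seconds.
calculation_times = {1: 2489, 2: 1248, 4: 883, 6: 996}


def speedup_table(times):
    """Return (N, speedup S, single-zone efficiency S/N) for each entry."""
    ts = times[1]  # execution time of the sequential run
    rows = []
    for n, tp in sorted(times.items()):
        s = round(ts / tp, 2)
        rows.append((n, s, round(s / n, 3)))
    return rows


rows = speedup_table(calculation_times)
best_n = max(rows, key=lambda row: row[1])[0]  # sub-area count with the largest speedup
```

With these inputs, `best_n` evaluates to 4, consistent with the quad-core limit discussed above.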
Practical problems with the Three Gorges project
In this study, two models were employed in the PCsP to evaluate the regulation of the water level at Lianhua Pool before the Three Gorges (TG) Dam (compensation regulation). In addition to high efficiency and accuracy, the PCsP offers significant advantages in terms of simplicity, flexibility, and ease of control and feedback.
Figure 12(b) shows a flow rate of 55,000 m³/s. The maximum flood rate is reduced to 61,400 m³/s, with the water level of the TG Dam rising to 157.6 m. In this case, the compensation period lasts for 41 days, which effectively controls the flood areas. To reduce the flood rate below the safety level of 60,000 m³/s, as shown in Figure 12(c), the flow rate is reduced further to 50,000 m³/s.
In this case, however, although the maximum flood rate is reduced to 59,200 m³/s, the TG Dam sustains a water level above 160 m for a long period of time. Therefore, flooding occurs despite the regulated compensation. Furthermore, because this phenomenon is influenced by dam drainage, it lasts for 4 days when the flow rate is above 50,000 m³/s. It can be observed that the choice of flow rate has a crucial impact on the regulation of compensation.
According to the Chenglingji compensation regulation, the TG Dam is to be regulated according to the inflow of the section from the TG Dam site to the Chenglingji Project, which includes the Dongting lake system. This serves to control the water level or flow rate at Chenglingji (Lianhuatang) under the threshold value.
At present, the compensation regulation method is subject to the following rules. The TG Dam starts to store water when the flow rate at Chenglingji is greater than a certain value (starting regulation flow rate Q1). The decrease of the flow rate at the TG Dam within each time-step shall be in direct proportion to the increase of the flow rate at Chenglingji within the same time-step. The proportionality coefficient is called the coefficient of impounding velocity (k1), which is set by users. The minimum generating flow shall be no less than 25,000 m³/s, and the TG Dam will be operated by maintaining the water level after the level in front of the dam reaches 161.3 m. Water drainage shall be increased at the TG Dam, and extra water stored in the earlier stage shall be discharged when the flow rate at Chenglingji is less than a certain value (drainage flow rate Q2, with Q1 ≥ Q2). The water level in front of the dam shall be controlled during the water drainage so that it is lowered at a uniform speed (the coefficient of drainage velocity (k2) is set by users) until it decreases to the flood control level (145 m). To enhance the flexibility of the compensation regulation, the regulation parameters may take different values in different periods of time.
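The impounding part of these rules can be sketched as a single-step function. The sketch covers only the storage phase and makes several assumptions that are not stated in the rules: maintaining the water level is modeled as releasing the inflow, the release is left unchanged while the regulation is not triggered, and a falling flow rate at Chenglingji causes no change. The drainage phase governed by Q2 and k2 is omitted because it would require a level-storage relation.

```python
MIN_RELEASE = 25_000.0  # minimum generating flow, m3/s
HOLD_LEVEL = 161.3      # level in front of the dam at which it is maintained, m


def impounding_release(release, inflow, level, q_prev, q_now, q1, k1):
    """Release of the dam for the next time-step during the storage phase.

    release: current release of the dam (m3/s)
    inflow:  current inflow to the reservoir (m3/s)
    level:   water level in front of the dam (m)
    q_prev, q_now: flow rate at Chenglingji in the previous and current step
    q1: starting regulation flow rate; k1: coefficient of impounding velocity
    """
    if level >= HOLD_LEVEL:
        # Assumption: the level is maintained by passing the inflow through.
        return inflow
    if q_now <= q1:
        # Assumption: regulation not triggered, so the release is unchanged.
        return release
    rise = max(q_now - q_prev, 0.0)
    # The decrease is proportional to the rise at Chenglingji, but the
    # release never drops below the minimum generating flow.
    return max(release - k1 * rise, MIN_RELEASE)


next_release = impounding_release(
    release=40_000.0, inflow=45_000.0, level=150.0,
    q_prev=58_000.0, q_now=60_000.0, q1=55_000.0, k1=0.5,
)  # 40,000 - 0.5 * 2,000 = 39,000 m3/s
```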
Thus, by studying the Chenglingji flow rate and changes in the water level at the TG Dam, we believe there are serious shortcomings in the regulation method currently adopted, which involves taking the real-time flow rate at Chenglingji as the basis of the compensation regulation. This is because the method fails to consider the time delay for water waves to spread to Chenglingji from the dam. We recommend adoption of the ladder regulation strategy for the flow rate to implement a divided period compensation.
CONCLUSION
In this paper, a PCsP approach was proposed to address the existing need for reconstructing sequential models to suit high-performance computing platforms. With the proposed approach, a sequential model is easily transformed into a module of the PCsP by merely inserting a procedural function in its main program. The PCsP controls each module through instructions contained in the dataset, where physical quantities and parameters are represented as blocks of data and locations are expressed as grid numbers. Moreover, templates and a graphical user interface were presented to enable novice users to manage the PCsP. An auxiliary function library for simplifying the modeling process was developed using hybrid programming. This method changes neither the calculation logic nor the original model. Moreover, it adopts a simple algorithm for performing distributed and parallel calculations. We tested our proposed platform with real hydrodynamic data. The results indicate that the proposal can successfully exchange data and commands between models while maintaining accurate calculations with high efficiency.
Furthermore, two or more independent models can be combined in an organic unit by a simple operation for studying the complex relations between the models. This widens the scope of application for the PCsP. Dividing and combining the models provides an effective method for addressing complex large-scale problems. Each model can be completed by a different developer, and there is no need for the developers to communicate with each other. Each developer can choose a familiar computational environment for developing the model. When the models are combined, they not only interact through the exchange of data, but they also provide user-defined parameters for data exchange. This enhances the flexibility of the procedure. Thus, the PCsP is suitable for the development of complex mathematical models through collaboration of multiple developers.
In this paper, we furthermore discussed the basic features of the PCsP and its realization. The PCsP provides a novel method for rejuvenating traditional mathematical models by enabling the parallel computing of seamlessly converging models. Nevertheless, the PCsP's features require further improvement and expansion. Future research will focus on two aspects. The first aspect concerns the characteristics of different model combinations. Models can be combined with an explicit or implicit scheme, both of which affect the platform's performance. This paper offers a set of tools designed to couple models on a trial-and-error basis. However, it does not develop clear theoretical conclusions and questions remain. What kinds of models can be combined together and under which scheme? What about the stability and precision of the platform? Can the combined models simulate the physical processes of complex phenomena? The above questions will be covered in future studies on the characteristics of model combinations, which will be an important research contribution to the parallelized model-development approach.
The second focus for future research concerns compatibility. This paper discusses the PCsP's application for environmental modeling and software. However, the PCsP is a generic method for process simulations; in theory, it can be applied to all time-stepping mathematical models. Therefore, the PCsP is fully open and applicable to multiple domains and models that simulate the time process and comply with the program structure of time-steps. By exploring the compatibility of the PCsP with such models, we can expand its scope of application. For example, OpenMI is an interface standard based on modular development, and it is widely used to construct many excellent models. Thus, we intend to generate time-series dynamic simulations based on the OpenMI components and the PCsP's data exchange and timing control.
ACKNOWLEDGEMENTS
This work was supported by the National Natural Science Foundation of China (Grants 51579248, 51109112, 51309254), and the Open Research Fund of the State Key Laboratory of Hydroscience and Engineering (Grant 2015-B-03), State Key Laboratory of Water Resources and Hydropower Engineering Science (Grant 2014SWG03), State Key Laboratory of Simulation and Regulation of Water Cycle in River Basin (Grant IWHR-SKL-201517), and CRSRI (Grant CKWV2014224/KY). Furthermore, a special fund was received for platform development from the China Institute of Water Resources and Hydropower Research (Contract WR0145B02201500000).