Table 6 gives an overview of how the noise, the gap between anomalies, and the parameters used to run the algorithm influence the obtained results.

Table 6

Overview of the influence of dataset characteristics and of the parameters used to run the algorithm on the obtained results

Characteristic of dataset or parameter: Effect

Noise: Higher noise values lead to a decrease in the estimated amplitude of the block functions, which is especially visible in dataset 1. Higher noise values also make the algorithm more sensitive to the f1 penalty coefficient: for datasets 1 and 2 the algorithm fails to identify the block functions when higher values of the f1 penalty coefficient are used.

Gap between anomalies: Overall, the results for datasets 1 are better than those for datasets 2. The difference between sets 1 and 2 is the duration of the added anomalies: in sets 2 the anomalies last longer and the gap between them is shorter, which makes it harder for the algorithm to clearly identify two separate block functions. For datasets 2 the algorithm also has more difficulty identifying the four steps needed to describe the block functions: in several tests it uses either fewer or more steps than required. For datasets 1 and 3 the four necessary steps are correctly identified in the majority of the tests.

Number of clusters: The number of clusters strongly influences the computational time: with three and four clusters the average computational times are, respectively, 6 and 17 times longer than with two clusters. Since the generated datasets contain only two anomalies, setting the number of clusters to two is ideal. However, when applying the method to real data, in which the anomalies are not known beforehand but are to be identified, setting the number of clusters to two carries the risk of missing additional anomalies, if they exist. On the other hand, increasing the number of clusters can lead to the identification of more blocks than there are actual anomalies, mainly when anomalies occur shortly after each other and the data are noisy. A suitable value of the f1 penalty factor should be chosen to prevent this issue.

Number of steps: The number of steps considered also influences the computational time: with five or six steps instead of four, the computational times are five and eight times longer, respectively. Increasing the number of steps can improve the results, especially when more noise is added to the data, but in some cases it also leads to the identification of extra block functions. A suitable value of the f1 penalty factor should be chosen to prevent this issue.

Lx norm: Using the L2 norm to determine the step sizes leads to worse results in terms of the distance between the identified block functions and the matrix of b-factors, and this effect becomes more pronounced as the added noise increases. On the other hand, the L2 norm seems to decrease the risk of identifying a third block. Two intermediate values of the Lx norm were also considered (0.7 and 1.25): in some tests the lower value led to better results, while the higher value led to worse results.

Penalty f1: In several tests with a very small f1 penalty (0.01), the algorithm identifies a third block function located between the anomalies: with such a small penalty, the use of additional block functions is barely penalized and the algorithm adds a block that fits the added noise. Increasing the f1 penalty solves this problem. For datasets 1a–1c an f1 penalty of 0.33 is sufficient; for datasets 2a–2d, however, the algorithm benefits from higher f1 penalty values, and in some cases the f1 value must be increased to 0.7 to avoid the identification of a third block.

Penalty f2: For most of the tests the value of the f2 penalty has no influence on the results. The exception is dataset 2c, where increasing the f2 penalty avoids the identification of a third block.
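The f1 penalty essentially trades off goodness of fit against the number of block functions used. The Python snippet below is a minimal sketch of that trade-off, not the authors' actual formulation: the functions block_signal and penalized_cost, the Lp ("Lx") residual term, the per-block f1 term, and all data and parameter values are illustrative assumptions.

```python
import numpy as np

def block_signal(t, blocks):
    """Sum of rectangular block functions, each given as (start, end, amplitude)."""
    y = np.zeros_like(t, dtype=float)
    for start, end, amp in blocks:
        y += amp * ((t >= start) & (t < end))
    return y

def penalized_cost(signal, t, blocks, f1=0.33, p=1.0):
    """Lp residual between data and candidate blocks, plus an f1 penalty per block used."""
    residual = signal - block_signal(t, blocks)
    return np.sum(np.abs(residual) ** p) + f1 * len(blocks)

# Hypothetical data: two anomalies of different duration plus Gaussian noise.
rng = np.random.default_rng(0)
t = np.arange(200.0)
signal = block_signal(t, [(40, 60, 1.0), (120, 150, 0.8)]) + rng.normal(0.0, 0.1, t.size)

two_blocks = [(40, 60, 1.0), (120, 150, 0.8)]
three_blocks = two_blocks + [(80, 90, 0.15)]   # extra block that only fits the noise
for f1 in (0.01, 0.33, 0.7):
    print(f"f1={f1}: two blocks {penalized_cost(signal, t, two_blocks, f1=f1):.2f}, "
          f"three blocks {penalized_cost(signal, t, three_blocks, f1=f1):.2f}")
```

In this toy objective an extra block is only worthwhile if it reduces the residual term by more than the f1 penalty, which mirrors the behaviour summarized in Table 6: very small f1 values leave room for a noise-fitting third block, while larger values suppress it.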
