Table 6 summarizes how the noise, the gap between anomalies, and the parameters used to run the algorithm influence the obtained results.

Table 6

| Characteristics of dataset and parameter | Effect |
|---|---|
| Noise | Higher noise values lead to a decrease in the estimated amplitude of the block functions, especially visible in dataset 1 |
| | Higher noise values make the algorithm more sensitive to the f_{1} penalty coefficient: for datasets 1 and 2, the algorithm fails to identify block functions when higher values of the f_{1} penalty coefficient are used |
| Gap between anomalies | Overall, the results for datasets 1 are better than those for datasets 2. The difference between sets 1 and 2 is the duration of the added anomalies: in sets 2 the anomalies last longer and the gap between them is shorter, which makes it harder for the algorithm to clearly identify two separate block functions |
| | For datasets 2, the algorithm has more difficulty identifying the four steps needed to describe the block functions: in several tests it uses either fewer or more steps than required. For datasets 1 and 3, the four necessary steps are correctly identified in the majority of tests |
| Number of clusters | The number of clusters significantly influences the computational time: with three and four clusters, the average computational times are 6 and 17 times longer, respectively, than with two clusters. Since the generated datasets contain only two anomalies, setting the number of clusters to two is ideal. However, when applying the method to real data, where the anomalies are not known beforehand but are to be identified, setting the number of clusters to two carries the risk of missing additional anomalies if more than two exist. On the other hand, increasing the number of clusters can lead to the identification of more blocks than actual anomalies, particularly when anomalies occur shortly after one another and the data contain some noise. A suitable value of the f_{1} penalty factor should be chosen to prevent this issue |
| Number of steps | The number of considered steps also influences the computational time: with five or six steps instead of four, the computational times are five and eight times longer, respectively |
| | Increasing the number of steps can lead to better results, especially when more noise is added to the data. However, in some cases it also leads to the identification of extra block functions. A suitable value of the f_{1} penalty factor should be chosen to prevent this issue |
| L_{x} norm | Using the L_{2} norm to determine the step sizes leads to worse results in terms of the distance between the identified block functions and the matrix of b-factors, an effect that becomes more evident as the added noise increases. On the other hand, the use of the L_{2} norm seems to decrease the risk of identifying a third block |
| | Two intermediate values of the L_{x} norm were also considered (0.7 and 1.25). In some tests the lower value led to better results, while the higher value led to worse results |
| Penalty f_{1} | In several tests, a very small f_{1} penalty (0.01) causes the algorithm to identify a third block function located between the anomalies: with such a small penalty, the use of additional block functions is barely penalized, and a block is added that fits the noise. Increasing the f_{1} penalty solves this problem. For datasets 1a–1c, an f_{1} penalty of 0.33 is sufficient. For datasets 2a–2d, however, the algorithm benefits from higher f_{1} penalty values, and in some cases the f_{1} value must be increased to 0.7 to avoid the identification of a third block |
| Penalty f_{2} | For most of the performed tests, the value of the f_{2} penalty has no influence on the results. The exception is dataset 2c, where increasing the f_{2} penalty avoids the identification of a third block |
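The trade-off described above — too small a penalty fits a spurious block to the noise, too large a penalty misses real steps — can be illustrated with a minimal sketch. This is not the paper's algorithm: it is a simple greedy step-detection routine on a synthetic signal, in which a single penalty (playing a role analogous to the f_{1} coefficient) stops the addition of change points once the gain in fit quality no longer exceeds it. The function name, the penalty value, and the synthetic signal are all illustrative assumptions.

```python
import numpy as np

def fit_piecewise_constant(y, penalty, max_steps=8):
    """Greedy step detection: repeatedly add the change point that most
    reduces the squared fitting error, and stop as soon as the best
    achievable reduction falls below `penalty` (the f_1-like coefficient).
    Returns the list of interior change-point indices."""
    n = len(y)
    breakpoints = [0, n]

    def sse(a, b):
        # Squared error of segment y[a:b] around its own mean.
        seg = y[a:b]
        return float(np.sum((seg - seg.mean()) ** 2))

    for _ in range(max_steps):
        best_gain, best_bp = 0.0, None
        for i in range(len(breakpoints) - 1):
            a, b = breakpoints[i], breakpoints[i + 1]
            base = sse(a, b)
            for t in range(a + 1, b):
                gain = base - sse(a, t) - sse(t, b)
                if gain > best_gain:
                    best_gain, best_bp = gain, t
        # Stop when the next split no longer pays for its penalty.
        if best_bp is None or best_gain <= penalty:
            break
        breakpoints = sorted(breakpoints + [best_bp])
    return breakpoints[1:-1]

# Synthetic signal: two block anomalies of amplitude 2 on a noisy baseline.
rng = np.random.default_rng(0)
y = np.zeros(120)
y[20:40] += 2.0
y[70:90] += 2.0
y += rng.normal(0.0, 0.05, size=120)

# A moderate penalty recovers the four steps bounding the two blocks;
# a near-zero penalty would also accept noise-fitting splits.
steps = fit_piecewise_constant(y, penalty=1.0)
```

With the penalty set well above the noise-level gains but below the gain of a true step, the routine returns exactly the four change points that delimit the two blocks, mirroring the behaviour reported for intermediate f_{1} values.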

