| Preface | 6 |
| Contents | 7 |
| List of Figures | 9 |
| List of Symbols | 10 |
| 1 What Is Redescription Mining | 11 |
| 1.1 First Examples of Redescriptions | 11 |
| 1.2 Formal Definitions | 15 |
| 1.2.1 The Data | 15 |
| 1.2.2 The Descriptions | 16 |
| 1.2.3 The Redescriptions | 18 |
| 1.2.4 Other Constraints | 21 |
| 1.2.5 Distance Functions: Why Jaccard? | 23 |
| 1.2.6 Sets of Redescriptions | 26 |
| 1.3 Related Data Mining Problems | 28 |
| 1.4 A Short History | 30 |
| References | 31 |
| 2 Algorithms for Redescription Mining | 34 |
| 2.1 Finding Queries Using Itemset Mining | 35 |
| 2.1.1 The MID Algorithm | 37 |
| 2.1.2 Mining Redescriptions with the CHARM-L Algorithm | 38 |
| 2.2 Queries Based on Decision Trees and Forests | 39 |
| 2.2.1 The CARTwheels Algorithm | 41 |
| 2.2.2 The SplitT and LayeredT Algorithms | 44 |
| 2.2.3 The CLUS-RM Algorithm | 47 |
| 2.3 Growing the Queries Greedily | 49 |
| 2.3.1 The ReReMi Algorithm | 49 |
| 2.4 A Comparative Discussion | 53 |
| 2.5 Handling Missing Values | 55 |
| References | 57 |
| 3 Applications, Variants, and Extensions of Redescription Mining | 59 |
| 3.1 Applications of Redescription Mining | 59 |
| 3.1.1 In Biology | 60 |
| 3.1.2 In Ecology | 63 |
| 3.1.3 In Social and Political Sciences and in Economics | 64 |
| 3.1.4 In Engineering | 67 |
| 3.2 Relational Redescription Mining | 69 |
| 3.2.1 An Example of Relational Redescriptions | 69 |
| 3.2.2 Formal Definition | 71 |
| 3.3 Storytelling | 74 |
| 3.3.1 Definition and Algorithms | 75 |
| 3.3.2 Applications | 77 |
| 3.4 Future Work: Richer Query Languages | 81 |
| 3.4.1 Time-Series Redescriptions | 81 |
| 3.4.2 Subgraph Redescriptions | 83 |
| 3.4.3 Multi-Query and Multimodal Redescriptions | 84 |
| References | 87 |