Field of excellence | Academic and research excellence |
Published research |
|
Paper (1): | |
Paper title: |
ReG-Rules: An Explainable Rule-Based Ensemble Learner for Classification |
Link to the paper: |
https://ieeexplore.ieee.org/document/9364993?source=authoralert |
Publication date: |
26/02/2021 |
Paper summary: |
The learning of classification models to predict class labels of new and previously unseen data instances is one of the most essential tasks in data mining. A popular approach to classification is ensemble learning, where a combination of several diverse and independent classification models is used to predict class labels. Ensemble models are important as they tend to improve the average classification accuracy over any member of the ensemble. However, classification models are also often required to be explainable to reduce the risk of irreversible wrong classification. Explainability of classification models is needed in many critical applications such as stock market analysis, credit risk evaluation, intrusion detection, etc. Unfortunately, ensemble learning decreases the level of explainability of the classification, as the analyst would have to examine many decision models to gain insights about the causality of the prediction. The aim of the research presented in this paper is to create an ensemble method that is explainable in the sense that it presents the human analyst with a conditioned view of the most relevant model aspects involved in the prediction. To achieve this aim the authors developed a rule-based explainable ensemble classifier termed Ranked ensemble G-Rules (ReG-Rules) which gives the analyst an extract of the most relevant classification rules for each individual prediction. During the evaluation process ReG-Rules was evaluated in terms of its theoretical computational complexity, empirically on benchmark datasets and qualitatively with respect to the complexity and readability of the induced rule sets. The results show that ReG-Rules scales linearly, delivers a high accuracy and at the same time delivers a compact and manageable set of rules describing the predictions made. |
Scientific conferences |
|
Conference (1): | |
Conference title: |
36th International Conference on Artificial Intelligence (AI 2016) |
Date held: |
13/12/2016 |
Location: |
Cambridge, UK |
Type of participation: |
Paper presentation |
Title of contribution: |
Towards expressive modular rule induction for numerical attributes |
Summary of contribution: |
The Prism family is an alternative set of predictive data mining algorithms to the more established decision tree data mining algorithms. Prism classifiers are more expressive and user friendly compared with decision trees and achieve a similar accuracy compared with that of decision trees and even outperform decision trees in some cases. This is especially the case where there is noise and clashes in the training data. However, Prism algorithms still tend to overfit on noisy data; this has led to the development of pruning methods which have allowed the Prism algorithms to generalise better over the dataset. The work presented in this paper aims to address the problem of overfitting at rule induction stage for numerical attributes by proposing a new numerical rule term structure based on the Gauss Probability Density Distribution. This new rule term structure is not only expected to lead to a more robust classifier, but also lowers the computational requirements as it needs to induce fewer rule terms. |
Link: |
https://link.springer.com/chapter/10.1007/978-3-319-47175-4_16 |
Conference (2): | |
Conference title: |
37th International Conference on Artificial Intelligence (AI 2017) |
Date held: |
12/12/2017 |
Location: |
Cambridge, UK |
Type of participation: |
Paper presentation |
Title of contribution: |
Improving Modular Classification Rule Induction with G-Prism Using Dynamic Rule Term Boundaries |
Summary of contribution: |
Modular classification rule induction for predictive analytics is an alternative and expressive approach to rule induction as opposed to decision tree-based classifiers. Prism classifiers achieve a similar classification accuracy compared with decision trees, but tend to overfit less, especially if there is noise in the data. This paper describes the development of a new member of the Prism family, the G-Prism classifier, which improves the classification performance of the classifier. G-Prism is different compared with the remaining members of the Prism family as it follows a different rule term induction strategy. G-Prism’s rule term induction strategy is based on Gauss Probability Density Distribution (GPDD) of target classes rather than simple binary splits (local discretisation). Two versions of G-Prism have been developed, one uses fixed boundaries to build rule terms from GPDD and the other uses dynamic rule term boundaries. Both versions have been compared empirically against Prism on 11 datasets using various evaluation metrics. The results show that in most cases both versions of G-Prism, especially G-Prism with dynamic boundaries, achieve a better classification performance compared with Prism. |
Link: |
https://link.springer.com/chapter/10.1007/978-3-319-71078-5_9 |
|
|
Conference (3): | |
Conference title: |
17th IEEE International Conference on Machine Learning and Applications (ICMLA) |
Date held: | 17/12/2018 |
Location: |
Orlando, FL, USA |
Type of participation: |
Paper presentation |
Title of contribution: |
A Rule-Based Classifier with Accurate and Fast Rule Term Induction for Continuous Attributes |
Summary of contribution: |
Rule-based classifiers are considered more expressive, human readable and less prone to over-fitting compared with decision trees, especially when there is noise in the data. Furthermore, rule-based classifiers do not suffer from the replicated subtree problem as do classifiers induced by top down induction of decision trees (also known as ‘Divide and Conquer’). This research explores some recent developments of a family of rule-based classifiers, the Prism family and more particularly the G-Prism-FB and G-Prism-DB algorithms, in terms of local discretisation methods used to induce rule terms for continuous data. The paper then proposes a new algorithm of the Prism family based on a combination of Gauss Probability Density Distribution (GPDD), InterQuartile Range (IQR) and data transformation methods. This new rule-based algorithm, termed G-Rules-IQR, is evaluated empirically and outperforms other members of the Prism family in execution time, accuracy and tentative accuracy. |
Link: |
منال خلف هجاد المطيري
PhD
Science and Technology
The University of Reading