Seminar: Evaluation Evaluation
10 Oct 2019 11:00 AM - 12:00 PM
Presenter/Speaker: Prof. David Powers, Flinders University, Adelaide, South Australia
Poor choice of evaluation measures is holding back AI and leads to biased, easily spoofed classifiers. I am interested in biasing toward recognizing the “right” kind of feature and ignoring the “wrong” kind of artefact, and in understanding the nature of bias, in particular when a human-like bias is desirable and when a different bias is more appropriate. In the 1990s a student and I published a CoNLL paper demonstrating how tuning to optimize accuracy and F-measure was making things worse, not better, in the NLP context. Following this, I derived an empirical multiclass evaluation method for estimating Informedness, the probability that you are making an informed decision rather than guessing.
In the dichotomous case Informedness turns out to be equivalent to Peirce’s I from the 1880s and Youden’s J from the 1950s, and has also been reinvented as DeltaP' and deskewed WRAcc, the former related to Matthews Correlation, the latter derived from ROC.
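As a rough illustration of the dichotomous case, the sketch below computes Youden's J (sensitivity + specificity - 1) next to plain accuracy. The function names and the confusion-matrix counts are hypothetical, chosen only to show how a classifier that always predicts the majority class can score well on accuracy while its informedness is zero.

```python
def informedness(tp, fn, fp, tn):
    """Dichotomous informedness (Youden's J): sensitivity + specificity - 1."""
    sensitivity = tp / (tp + fn)  # recall on the positive class
    specificity = tn / (tn + fp)  # recall on the negative class
    return sensitivity + specificity - 1.0


def accuracy(tp, fn, fp, tn):
    """Fraction of all cases labelled correctly."""
    return (tp + tn) / (tp + fn + fp + tn)


# Hypothetical data set: 90 positives, 10 negatives.
# Classifier A always answers "positive", so it is pure guessing by bias.
always_positive = dict(tp=90, fn=0, fp=10, tn=0)
# Classifier B (also hypothetical) gets 80% of each class right.
informed = dict(tp=72, fn=18, fp=2, tn=8)

for name, counts in [("always positive", always_positive), ("informed", informed)]:
    print(f"{name}: accuracy={accuracy(**counts):.2f}, "
          f"informedness={informedness(**counts):.2f}")
```

With these assumed counts, the always-positive classifier has the higher accuracy but an informedness of zero, while the second classifier has lower accuracy and clearly positive informedness.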
In the balanced multiclass case, Informedness corresponds to the “correct” way of marking multiple choice exams, so that guessing gives you zero.
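The exam analogy can be sketched with one common guessing-corrected marking scheme, assumed here for illustration: on a question with k equally likely options, a correct answer earns 1 mark and a wrong answer loses 1/(k - 1) marks. The function names are hypothetical; the point is only that a pure guesser's expected mark is zero.

```python
from fractions import Fraction


def exam_score(num_correct, num_wrong, num_options):
    """Guessing-corrected mark: +1 per correct answer, -1/(k-1) per wrong one."""
    penalty = Fraction(1, num_options - 1)
    return Fraction(num_correct) - penalty * num_wrong


def expected_guess_score(num_questions, num_options):
    """Expected mark for uniform random guessing on every question."""
    p_correct = Fraction(1, num_options)
    penalty = Fraction(1, num_options - 1)
    per_question = p_correct * 1 - (1 - p_correct) * penalty
    return num_questions * per_question


# Hypothetical exam: 20 questions with 4 options each.
print(expected_guess_score(20, 4))   # expected mark when guessing throughout
print(exam_score(5, 15, 4))          # a typical guesser: 5 right, 15 wrong
print(exam_score(20, 0, 4))          # a fully informed candidate
```

Exact fractions are used so that the zero expectation is not blurred by floating-point rounding.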
My subsequent work has gone in two directions: modifying learning/boosting algorithms to optimize Informedness, and examining the relationship between Informedness and other popular measures (including F-measure, Kappa and ROC AUC). Informedness has now been demonstrated to give a better and more intuitive idea of how good a system is across a wide range of AI and CI technologies and applications.
David Powers is Professor of Artificial Intelligence and Cognitive Science at Flinders University in Adelaide, South Australia, and is recognized as a pioneer in several areas of Artificial Intelligence, Biomedical and Robotic Engineering, and Parallel Computing. Prof. Powers organized the first events in Computational Natural Language Learning and founded SIGNLL and CoNLL, the peak association and conference in that area. His intelligent computing technology has formed the basis for eight startups, selling under brands including Clipsal Homespeak, Clevertar, YourAmigo and YourAnswer. Many of his AI applications and much of his current research are focused on assistive and educational technology, helping people to overcome a variety of ageing and health related conditions including autism spectrum disorders, addiction, dementia, quadriplegia and locked-in syndrome. He has authored around 300 scientific papers and has edited and authored several books.