Ensemble Machine Learning Methods and Applications

Making decisions based on the input of multiple people or experts has been a common practice in human civilization and serves as the foundation of a democratic society. Over the past few decades, researchers in the computational intelligence and machine learning communities have studied schemes that mimic such joint decision procedures. These schemes are generally referred to as ensemble learning, which is known to reduce the variance of individual classifiers and to improve the robustness and accuracy of the overall decision system.
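The accuracy gain from a joint decision is easy to see in simulation. The sketch below is our own illustration (not from the book): it combines many independent weak classifiers, each correct 70% of the time, by majority vote.

```python
import numpy as np

rng = np.random.default_rng(0)
n_classifiers, n_cases = 21, 20_000

# Each entry is True when that classifier gets that case right.
# The classifiers are independent, each correct with probability 0.7.
correct = rng.random((n_classifiers, n_cases)) < 0.7

single_acc = correct[0].mean()                      # one weak classifier: ~0.70
votes = correct.sum(axis=0)                         # classifiers right per case
majority_acc = (votes > n_classifiers // 2).mean()  # the ensemble: ~0.97

print(f"single: {single_acc:.2f}, majority vote: {majority_acc:.2f}")
```

Because independent errors rarely line up, the majority is right far more often than any individual member; the same effect lowers the variance of an averaged regression ensemble.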

However, it was not until recently that researchers were able to fully unleash the power and potential of ensemble learning, with new algorithms such as boosting and random forest. Today, ensemble learning has many real-world applications, including object detection and tracking, scene segmentation and analysis, image recognition, information retrieval, bioinformatics, and data mining. To give a concrete example, most modern digital cameras are equipped with face detection technology. While the human neural system has evolved over millions of years to recognize human faces efficiently and accurately, detecting faces by computer has long been one of the most challenging problems in computer vision. The problem was largely solved by Viola and Jones, who developed a high-performance face detector based on boosting (more details in Chap. 8). Another example is the random-forest-based skeleton tracking algorithm adopted in the Xbox Kinect sensor, which lets people interact with games freely, without game controllers.

Despite the recent great success of ensemble learning methods, we found very few books dedicated to this topic, and even fewer that provide insights into how such methods should be applied in real-world applications. The primary goal of this book is to fill this gap in the literature: to comprehensively cover the state-of-the-art ensemble learning methods and to provide a set of applications that demonstrate the various uses of ensemble learning in the real world. Since ensemble learning is still a rapidly developing research area, we invited well-known experts in the field to contribute. In particular, this book contains chapters by researchers in both academia and leading industrial research labs. It should serve the needs of readers at different levels. For readers who are new to the subject, the book provides an excellent entry point, with a high-level introductory view of the topic as well as an in-depth discussion of the key technical details. For researchers in the area, the book is a handy reference summarizing the up-to-date advances in ensemble learning, their connections, and future directions. For practitioners, the book provides a number of applications of ensemble learning and offers examples of successful, real-world systems.

This book consists of two parts. The first part, Chaps. 1 to 7, focuses on the theoretical aspects of ensemble learning. The second part, Chaps. 8 to 11, presents several applications of ensemble learning.

Chapter 1, as an introduction to this book, provides an overview of various ensemble learning methods. A review of the well-known boosting algorithm is given in Chap. 2. In Chap. 3, the boosting approach is applied to density estimation, regression, and classification, all of which use kernel estimators as weak learners. Chapter 4 describes a “targeted learning” scheme for the estimation of non-pathwise-differentiable parameters and considers a loss-based super learner that uses the cross-validated empirical mean of the estimated loss as an estimator of risk. Random forest is discussed in detail in Chap. 5. Chapter 6 presents negative-correlation-based ensemble learning for improving diversity; it introduces the negatively correlated ensemble learning algorithm and explains that regularization is an important factor in addressing the overfitting problem for noisy data. Chapter 7 describes a family of algorithms based on mixtures of Nyström approximations, called Ensemble Nyström algorithms, which yield more accurate low-rank approximations than the standard Nyström method. Ensemble learning applications are presented in Chaps. 8 to 11. Chapter 8 explains how the boosting algorithm can be applied to object detection tasks, where positive examples are rare and detection speed is critical. Chapter 9 presents various ensemble learning techniques that have been applied to the problem of human activity recognition. Boosting algorithms for medical applications, especially medical image analysis, are described in Chap. 10, and random forest for bioinformatics applications is demonstrated in Chap. 11. Overall, this book is intended to provide a solid theoretical background and a practical guide to ensemble learning for students and practitioners.
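To give a flavor of the boosting machinery reviewed in Chap. 2, the sketch below (our own illustration; the labels and weak-learner outputs are made up) performs one round of AdaBoost-style reweighting:

```python
import numpy as np

y    = np.array([ 1,  1,  1, -1, -1, -1])   # true labels
pred = np.array([ 1,  1, -1, -1, -1,  1])   # a weak learner's guesses (2 mistakes)
w = np.ones(len(y)) / len(y)                # uniform example weights to start

err = w[pred != y].sum()                    # weighted error: 2/6
alpha = 0.5 * np.log((1 - err) / err)       # this learner's vote in the ensemble
w = w * np.exp(-alpha * y * pred)           # up-weight mistakes, down-weight hits
w /= w.sum()                                # renormalize to a distribution

# A classic AdaBoost property: after reweighting, the examples this learner
# got wrong carry exactly half the total weight, so the next weak learner
# is forced to focus on them.
print(w[pred != y].sum())                   # ~0.5 (up to floating point)
```

Repeating this loop, and classifying by the sign of the alpha-weighted sum of the weak learners, is the essence of the boosting algorithms applied in Chaps. 8 to 10.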

We would like to sincerely thank all the contributors to this book for presenting their research in an easily accessible manner and for putting their discussions into historical context. We would also like to thank Brett Kurzman of Springer for his strong support of this book.