Bibliographic Information
Fundamentals of Machine Learning for Predictive Data Analytics, Second Edition : Algorithms, Worked Examples, and Case Studies
Title / Author Fundamentals of Machine Learning for Predictive Data Analytics, Second Edition : Algorithms, Worked Examples, and Case Studies.
Authors Kelleher, John D. ; Mac Namee, Brian ; D'Arcy, Aoife.
Publication Cambridge : MIT Press, 2020.
Online Access https://ebookcentral.proquest.com/lib/dgist-ebooks/detail.action?docID=6383434

Additional Bibliographic Information
Call Number Q325.5 .K455 2020
Edition 2nd ed.
Physical Description 1 online resource (853 pages)
Series The MIT Press Ser.
Language English
Contents Intro -- Contents -- Preface -- Notation -- List of Figures -- List of Tables
I. INTRODUCTION TO MACHINE LEARNING AND DATA ANALYTICS
1. Machine Learning for Predictive Data Analytics -- 1.1 What Is Predictive Data Analytics? -- 1.2 What Is Machine Learning? -- 1.3 How Does Machine Learning Work? -- 1.4 Inductive Bias Versus Sample Bias -- 1.5 What Can Go Wrong with Machine Learning? -- 1.6 The Predictive Data Analytics Project Lifecycle: CRISP-DM -- 1.7 Predictive Data Analytics Tools -- 1.8 The Road Ahead -- 1.9 Exercises
2. Data to Insights to Decisions -- 2.1 Converting Business Problems into Analytics Solutions -- 2.1.1 Case Study: Motor Insurance Fraud -- 2.2 Assessing Feasibility -- 2.2.1 Case Study: Motor Insurance Fraud -- 2.3 Designing the Analytics Base Table -- 2.3.1 Case Study: Motor Insurance Fraud -- 2.4 Designing and Implementing Features -- 2.4.1 Different Types of Data -- 2.4.2 Different Types of Features -- 2.4.3 Handling Time -- 2.4.4 Legal Issues -- 2.4.5 Implementing Features -- 2.4.6 Case Study: Motor Insurance Fraud -- 2.5 Summary -- 2.6 Further Reading -- 2.7 Exercises
3. Data Exploration -- 3.1 The Data Quality Report -- 3.1.1 Case Study: Motor Insurance Fraud -- 3.2 Getting to Know the Data -- 3.2.1 The Normal Distribution -- 3.2.2 Case Study: Motor Insurance Fraud -- 3.3 Identifying Data Quality Issues -- 3.3.1 Missing Values -- 3.3.2 Irregular Cardinality -- 3.3.3 Outliers -- 3.3.4 Case Study: Motor Insurance Fraud -- 3.4 Handling Data Quality Issues -- 3.4.1 Handling Missing Values -- 3.4.2 Handling Outliers -- 3.4.3 Case Study: Motor Insurance Fraud -- 3.5 Advanced Data Exploration -- 3.5.1 Visualizing Relationships between Features -- 3.5.2 Measuring Covariance and Correlation -- 3.6 Data Preparation -- 3.6.1 Normalization -- 3.6.2 Binning -- 3.6.3 Sampling -- 3.7 Summary -- 3.8 Further Reading -- 3.9 Exercises
II. PREDICTIVE DATA ANALYTICS
4. Information-Based Learning -- 4.1 Big Idea -- 4.2 Fundamentals -- 4.2.1 Decision Trees -- 4.2.2 Shannon's Entropy Model -- 4.2.3 Information Gain -- 4.3 Standard Approach: The ID3 Algorithm -- 4.3.1 A Worked Example: Predicting Vegetation Distributions -- 4.4 Extensions and Variations -- 4.4.1 Alternative Feature Selection and Impurity Metrics -- 4.4.2 Handling Continuous Descriptive Features -- 4.4.3 Predicting Continuous Targets -- 4.4.4 Tree Pruning -- 4.4.5 Model Ensembles -- 4.5 Summary -- 4.6 Further Reading -- 4.7 Exercises
5. Similarity-Based Learning -- 5.1 Big Idea -- 5.2 Fundamentals -- 5.2.1 Feature Space -- 5.2.2 Measuring Similarity Using Distance Metrics -- 5.3 Standard Approach: The Nearest Neighbor Algorithm -- 5.3.1 A Worked Example -- 5.4 Extensions and Variations -- 5.4.1 Handling Noisy Data -- 5.4.2 Efficient Memory Search -- 5.4.3 Data Normalization -- 5.4.4 Predicting Continuous Targets -- 5.4.5 Other Measures of Similarity -- 5.4.6 Feature Selection -- 5.5 Summary -- 5.6 Further Reading -- 5.7 Epilogue -- 5.8 Exercises
6. Probability-Based Learning -- 6.1 Big Idea -- 6.2 Fundamentals -- 6.2.1 Bayes' Theorem -- 6.2.2 Bayesian Prediction -- 6.2.3 Conditional Independence and Factorization -- 6.3 Standard Approach: The Naive Bayes Model -- 6.3.1 A Worked Example -- 6.4 Extensions and Variations -- 6.4.1 Smoothing -- 6.4.2 Continuous Features: Probability Density Functions -- 6.4.3 Continuous Features: Binning -- 6.4.4 Bayesian Networks -- 6.5 Summary -- 6.6 Further Reading -- 6.7 Exercises
7. Error-Based Learning -- 7.1 Big Idea -- 7.2 Fundamentals -- 7.2.1 Simple Linear Regression -- 7.2.2 Measuring Error -- 7.2.3 Error Surfaces -- 7.3 Standard Approach: Multivariable Linear Regression with Gradient Descent -- 7.3.1 Multivariable Linear Regression -- 7.3.2 Gradient Descent -- 7.3.3 Choosing Learning Rates and Initial Weights -- 7.3.4 A Worked Example -- 7.4 Extensions and Variations -- 7.4.1 Interpreting Multivariable Linear Regression Models -- 7.4.2 Setting the Learning Rate Using Weight Decay -- 7.4.3 Handling Categorical Descriptive Features -- 7.4.4 Handling Categorical Target Features: Logistic Regression -- 7.4.5 Modeling Non-Linear Relationships -- 7.4.6 Multinomial Logistic Regression -- 7.4.7 Support Vector Machines -- 7.5 Summary -- 7.6 Further Reading -- 7.7 Exercises
8. Deep Learning -- 8.1 Big Idea -- 8.2 Fundamentals -- 8.2.1 Artificial Neurons -- 8.2.2 Artificial Neural Networks -- 8.2.3 Neural Networks as Matrix Operations -- 8.2.4 Why Are Non-Linear Activation Functions Necessary? -- 8.2.5 Why Is Network Depth Important? -- 8.3 Standard Approach: Backpropagation and Gradient Descent -- 8.3.1 Backpropagation: The General Structure of the Algorithm -- 8.3.2 Backpropagation: Backpropagating the Error Gradients -- 8.3.3 Backpropagation: Updating the Weights in a Network -- 8.3.4 Backpropagation: The Algorithm -- 8.3.5 A Worked Example: Using Backpropagation to Train a Feedforward Network for a Regression Task -- 8.4 Extensions and Variations -- 8.4.1 Vanishing Gradients and ReLUs -- 8.4.2 Weight Initialization and Unstable Gradients -- 8.4.3 Handling Categorical Target Features: Softmax Output Layers and Cross-Entropy Loss Functions -- 8.4.4 Early Stopping and Dropout: Preventing Overfitting -- 8.4.5 Convolutional Neural Networks -- 8.4.6 Sequential Models: Recurrent Neural Networks and Long Short-Term Memory Networks -- 8.5 Summary -- 8.6 Further Reading -- 8.7 Exercises
9. Evaluation -- 9.1 Big Idea -- 9.2 Fundamentals -- 9.3 Standard Approach: Misclassification Rate on a Hold-Out Test Set -- 9.4 Extensions and Variations -- 9.4.1 Designing Evaluation Experiments -- 9.4.2 Performance Measures: Categorical Targets -- 9.4.3 Performance Measures: Prediction Scores -- 9.4.4 Performance Measures: Multinomial Targets -- 9.4.5 Performance Measures: Continuous Targets -- 9.4.6 Evaluating Models after Deployment -- 9.5 Summary -- 9.6 Further Reading -- 9.7 Exercises
III. BEYOND PREDICTION
10. Beyond Prediction: Unsupervised Learning -- 10.1 Big Idea -- 10.2 Fundamentals -- 10.3 Standard Approach: The k-Means Clustering Algorithm -- 10.3.1 A Worked Example -- 10.4 Extensions and Variations -- 10.4.1 Choosing Initial Cluster Centroids -- 10.4.2 Evaluating Clustering -- 10.4.3 Choosing the Number of Clusters -- 10.4.4 Understanding Clustering Results -- 10.4.5 Agglomerative Hierarchical Clustering -- 10.4.6 Representation Learning with Auto-Encoders -- 10.5 Summary -- 10.6 Further Reading -- 10.7 Exercises
11. Beyond Prediction: Reinforcement Learning -- 11.1 Big Idea -- 11.2 Fundamentals -- 11.2.1 Intelligent Agents -- 11.2.2 Fundamentals of Reinforcement Learning -- 11.2.3 Markov Decision Processes -- 11.2.4 The Bellman Equations -- 11.2.5 Temporal-Difference Learning -- 11.3 Standard Approach: Q-Learning, Off-Policy Temporal-Difference Learning -- 11.3.1 A Worked Example -- 11.4 Extensions and Variations -- 11.4.1 SARSA, On-Policy Temporal-Difference Learning -- 11.4.2 Deep Q Networks -- 11.5 Summary -- 11.6 Further Reading -- 11.7 Exercises
IV. CASE STUDIES AND CONCLUSIONS
12. Case Study: Customer Churn -- 12.1 Business Understanding -- 12.2 Data Understanding -- 12.3 Data Preparation -- 12.4 Modeling -- 12.5 Evaluation -- 12.6 Deployment
13. Case Study: Galaxy Classification -- 13.1 Business Understanding -- 13.1.1 Situational Fluency -- 13.2 Data Understanding -- 13.3 Data Preparation -- 13.4 Modeling -- 13.4.1 Baseline Models -- 13.4.2 Feature Selection -- 13.4.3 The 5-Level Model -- 13.5 Evaluation -- 13.6 Deployment
14. The Art of Machine Learning for Predictive Data Analytics -- 14.1 Different Perspectives on Prediction Models -- 14.2 Choosing a Machine Learning Approach -- 14.2.1 Matching Machine Learning Approaches to Projects -- 14.2.2 Matching Machine Learning Approaches to Data -- 14.3 Beyond Prediction -- 14.4 Your Next Steps
V. APPENDICES
A. Descriptive Statistics and Data Visualization for Machine Learning -- A.1 Descriptive Statistics for Continuous Features -- A.1.1 Central Tendency -- A.1.2 Variation -- A.2 Descriptive Statistics for Categorical Features -- A.3 Populations and Samples -- A.4 Data Visualization -- A.4.1 Bar Plots -- A.4.2 Histograms -- A.4.3 Box Plots
B. Introduction to Probability for Machine Learning -- B.1 Probability Basics -- B.2 Probability Distributions and Summing Out -- B.3 Some Useful Probability Rules -- B.4 Summary
C. Differentiation Techniques for Machine Learning -- C.1 Derivatives of Continuous Functions -- C.2 The Chain Rule -- C.3 Partial Derivatives
D. Introduction to Linear Algebra -- D.1 Basic Types -- D.2 Transpose -- D.3 Multiplication -- D.4 Summary
Bibliography -- Index.
Subjects Prediction theory.
Machine learning.
Data mining.
Other Editions Print version: Kelleher, John D. Fundamentals of Machine Learning for Predictive Data Analytics, Second Edition. Cambridge : MIT Press, c2020. ISBN 9780262044691.
ISBN 9780262364911, 9780262044691