Intro -- Preface -- Contents -- 1 Basic Concept -- 1.1 Big Data -- 1.1.1 Big Data in the 5G Era -- 1.1.2 Characteristics of Big Data -- 1.1.3 Society 5.0 -- 1.1.4 5G -- 1.2 Key Concepts of Big Data Analysis -- 1.2.1 Interaction Between Real-World Data and Social Data -- 1.2.2 Universal Key -- 1.2.3 Ishikawa Concept -- 1.2.4 Single Event Data and Single Data Source -- 1.2.5 Process Flow of Big Data Analysis -- 1.3 Big Data's Vagueness Revisited -- 1.3.1 Issues -- 1.3.2 Integrated Data Model Approach -- 1.4 Hypothesis -- 1.4.1 What Is Hypothesis? -- 1.4.2 Hypothesis in Data Analysis -- 1.4.3 Hypothesis Generation -- 1.4.4 Hypothesis Interpretation -- 1.5 Design Principle and Design Pattern -- 1.6 Notes on the Cloud -- 1.7 Big Data Applications -- 1.7.1 EBPM -- 1.7.2 Users of Big Data Applications -- 1.8 Design Principles and Design Patterns for Efficient Big Data Processing -- 1.8.1 Use of Tree Structures -- 1.8.2 Reuse of Results of Subproblems -- 1.8.3 Use of Locality -- 1.8.4 Data Reduction -- 1.8.5 Online Processing -- 1.8.6 Parallel Processing -- 1.8.7 Function and Problem Transformation -- 1.9 Structure of This Book -- References -- 2 Hypothesis -- 2.1 What Is Hypothesis? -- 2.1.1 Definition and Properties of Hypothesis -- 2.1.2 Life Cycle of Hypothesis -- 2.1.3 Relationship of Hypothesis with Theory and Model -- 2.1.4 Hypothesis and Data -- 2.2 Research Questions as Hints for Hypothesis Generation -- 2.3 Data Visualization -- 2.3.1 Low-Dimensional Data -- 2.3.2 High-Dimensional Data -- 2.3.3 Tree and Graph Structures -- 2.3.4 Time and Space -- 2.3.5 Statistical Summary -- 2.4 Reasoning -- 2.4.1 Philosophy of Science and Hypothetico-Deductive Method -- 2.4.2 Deductive Reasoning -- 2.4.3 Inductive Reasoning -- 2.4.4 Generalization and Specialization -- 2.4.5 Plausible Reasoning -- 2.5 Problem Solving -- 2.5.1 Problem Solving of Pólya
2.5.2 Execution Means for Problem Solving -- 2.5.3 Examples of Problem Solving -- 2.5.4 Unconscious Work -- References -- 3 Science and Hypothesis -- 3.1 Kepler Solving Problems -- 3.1.1 Brahe's Data -- 3.1.2 Obtaining Orbit Data from Observation Data (Task 1) -- 3.1.3 Deriving Kepler's First Law (Task 2) -- 3.2 Galileo Conducting Experiments -- 3.2.1 Galileo's Law of Free Fall -- 3.2.2 Thought Experiments -- 3.2.3 Galileo's Law of Inertia -- 3.2.4 Galileo's Principle of Relativity -- 3.3 Newton Seeking After Universality -- 3.3.1 Reasoning Rules -- 3.3.2 Three Laws of Motion -- 3.3.3 The Universal Law of Gravitation -- 3.4 Darwin Observing Nature -- 3.4.1 Theory of Evolution -- 3.4.2 Population Growth Model -- 3.4.3 Fibonacci Sequence Revisited -- 3.4.4 Logistic Model -- References -- 4 Regression -- 4.1 Basics of Regression -- 4.1.1 Ceres Orbit Prediction -- 4.1.2 Method of Least Squares -- 4.1.3 From Regression to Orthogonal Regression to Principal Component Analysis -- 4.1.4 Nonlinear Regression -- 4.1.5 From Regression to Sparse Modeling -- 4.2 From Regression to Correlation to Causality -- 4.2.1 Genetics and Statistics -- 4.2.2 Galton -- 4.2.3 Karl Pearson -- 4.2.4 Neyman and Gosset -- 4.2.5 Wright -- 4.2.6 Spearman -- 4.2.7 Nightingale -- 4.2.8 Mendel -- 4.2.9 Hardy-Weinberg Equilibrium -- 4.2.10 Fisher -- References -- 5 Machine Learning and Integrated Approach -- 5.1 Clustering -- 5.1.1 Definition and Brief History of Clustering -- 5.1.2 Clustering Based on Partitioning -- 5.1.3 Hierarchical Clustering -- 5.1.4 Evaluation of Clustering Results -- 5.1.5 Advanced Clustering -- 5.2 Association Rule Mining -- 5.2.1 Applications -- 5.2.2 Basic Concept -- 5.2.3 Overview of Apriori Algorithm -- 5.2.4 Generation of Association Rule -- 5.3 Artificial Neural Network and Deep Learning -- 5.3.1 Cross-Entropy and Gradient Descent
5.3.2 Biological Neurons -- 5.3.3 Artificial Neural Network -- 5.3.4 Classification -- 5.3.5 Deep Learning -- 5.4 Integrated Hypothesis Generation -- 5.5 Data Structures -- 5.5.1 Hierarchy -- 5.5.2 Graph and Network -- 5.5.3 Digital Ecosystem -- References -- 6 Hypothesis Generation by Difference -- 6.1 Difference-Based Method for Hypothesis Generation -- 6.1.1 Classification of Difference-Based Methods -- 6.1.2 Difference Operations -- 6.2 Difference in Time -- 6.2.1 Analysis of Time Series Data -- 6.2.2 Time Difference: Case of Discovery of Satisfactory Spot -- 6.2.3 Time Difference: Case of Tankan of BOJ -- 6.2.4 Difference in Differences: Case of Effect of New Drug -- 6.2.5 Time Series Model: Smoothing and Filtering -- 6.2.6 Multiple Moving Averages: Case of Estimating Best Time to View Cherry Blossoms -- 6.2.7 Exponential Smoothing: Case of Detecting Local Trending Spots -- 6.2.8 Nested Moving Averages: Case of El Niño-Southern Oscillation -- 6.2.9 Time Series Forecasting -- 6.2.10 MQ-RNN -- 6.2.11 Difference Equation -- 6.3 Differences in Space -- 6.3.1 Image with Time Difference -- 6.3.2 Difference Analysis of Medical Images -- 6.3.3 Difference Analysis of Topographic Data -- 6.3.4 Difference in Lunar Surface Images: Case of Discovery of Newly Created Lunar Craters -- 6.3.5 Image Processing -- 6.4 Differences in Conceptual Space -- 6.4.1 Case of Creating the Essential Meaning of Concept -- 6.4.2 Case of International Cuisine Notation by Analogy -- 6.5 Difference Between Hypotheses -- 6.5.1 Case of Discovery of Candidate Installation Sites for Free Wi-Fi Access Point -- 6.5.2 Case of Analyzing Influence of Weather on Tourist Behavior -- 6.5.3 GWAS -- References -- 7 Methods for Integrated Hypothesis Generation -- 7.1 Overview of Integrated Hypothesis Generation Methods -- 7.1.1 Hypothesis Join -- 7.1.2 Hypothesis Intersection
7.1.3 Hypothetical Union -- 7.1.4 Ensemble Learning -- 7.2 Hypothesis Join: Case of Detection of High-Risk Paths During Evacuation -- 7.2.1 Background -- 7.2.2 Proposed System -- 7.2.3 Experiments and Considerations -- 7.3 Hypothesis Intersection: Case of Detection of Abnormal Automobile Vibration -- 7.3.1 Background -- 7.3.2 Proposed Method -- 7.3.3 Experiments -- 7.3.4 Considerations -- 7.4 Hypothesis Intersection: Case of Identification of Central Peak Crater -- 7.4.1 Introduction -- 7.4.2 Proposed Method -- 7.4.3 Experiments -- References -- 8 Interpretation -- 8.1 Necessity to Interpret and Explain Hypothesis -- 8.2 Explanation in the Philosophy of Science -- 8.2.1 Deductive Nomological Model of Explanation -- 8.2.2 Statistical Relevance Model of Explanation -- 8.2.3 Causal Mechanical Model of Explanation -- 8.2.4 Unificationist Model of Explanation -- 8.2.5 Counterfactual Explanation -- 8.3 Subjects and Types of Explanation -- 8.3.1 Subjects of Explanation -- 8.3.2 Types of Explanation -- 8.4 Subjects of Explanation Explained -- 8.4.1 Data Management -- 8.4.2 Data Analysis -- 8.5 Model-Dependent Methods for Explanation -- 8.5.1 How to Generate Data (HD) -- 8.5.2 How to Generate Hypothesis (HH) -- 8.5.3 What Features of Hypothesis (WF) -- 8.5.4 What Reason for Hypothesis (WR) -- 8.6 Model-Independent Methods of Explanation -- 8.6.1 LIME -- 8.6.2 Kernel SHAP -- 8.6.3 Counterfactual Explanation -- 8.7 Reference Architecture for Explanation Management -- 8.8 Overview of Case Studies -- 8.9 Case of Discovery of Candidate Installation Sites for Free Wi-Fi Access Point -- 8.9.1 Overview -- 8.9.2 Two Hypotheses -- 8.9.3 Explanation of Integrated Hypothesis -- 8.9.4 Experiments and Considerations -- 8.10 Case of Classification of Deep Moonquakes -- 8.10.1 Overview -- 8.10.2 Features for Analysis -- 8.10.3 Balanced Random Forest
8.10.4 Experimental Settings -- 8.10.5 Experimental Results -- 8.10.6 Considerations -- 8.11 Case of Identification of Central Peak Crater -- 8.11.1 Overview -- 8.11.2 Integrated Hypothesis -- 8.11.3 Explanation of Results -- 8.12 Case of Exploring Basic Components of Scientific Events -- 8.12.1 Overview -- 8.12.2 Data Set -- 8.12.3 Network Configuration and Algorithms -- 8.12.4 Visualization of Judgment Evidence by Grad-CAM -- 8.12.5 Experiments to Confirm Important Features -- 8.12.6 Seeking Basic Factors -- References -- Index