MIT Press provides another excellent book under a Creative Commons license.
Algorithms for Decision Making: free book download
I plan to buy it, and I recommend you do too. The book provides a broad introduction to algorithms for decision making under uncertainty.
The book takes an agent-based approach.
An agent is an entity that acts based on observations of its environment. Agents
may be physical entities, like humans or robots, or they may be nonphysical entities,
such as decision support systems that are implemented entirely in software.
The interaction between the agent and the environment follows an observe-act cycle or loop.
- At time t, the agent receives an observation of the environment.
- Observations are often incomplete or noisy.
- Based on these observations, the agent then chooses an action through some decision process.
- This action, such as sounding an alert, may have a nondeterministic effect on the environment.
- The book focuses on agents that interact intelligently to achieve their objectives over time.
- Given the past sequence of observations and knowledge about the environment, the agent must choose an action that best achieves its objectives in the presence of various sources of uncertainty, including:
  1. outcome uncertainty, where the effects of our actions are uncertain,
  2. model uncertainty, where our model of the problem is uncertain,
  3. state uncertainty, where the true state of the environment is uncertain, and
  4. interaction uncertainty, where the behavior of the other agents interacting in the environment is uncertain.
The book is organized around these four sources of uncertainty.
Making decisions in the presence of uncertainty is central to the field of artificial intelligence
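The observe-act cycle described above can be sketched in a few lines of code. This is a minimal illustration, not the book's own code (the book uses Julia); the `Environment` and `Agent` classes and the simple proportional decision rule are hypothetical examples chosen to show the loop structure: noisy observation, decision, nondeterministic state change.

```python
import random

class Environment:
    """A trivial environment whose true state the agent cannot see directly."""
    def __init__(self):
        self.state = 0.0

    def observe(self):
        # Observations are incomplete or noisy: return the state plus noise.
        return self.state + random.gauss(0.0, 0.5)

    def step(self, action):
        # Actions have a nondeterministic effect on the environment.
        self.state += action + random.gauss(0.0, 0.1)

class Agent:
    """Chooses actions based on the past sequence of observations."""
    def __init__(self):
        self.history = []

    def act(self, observation):
        self.history.append(observation)
        # A toy decision process: push the (noisily observed) state toward 0.
        return -0.5 * observation

env = Environment()
agent = Agent()
for t in range(10):
    o_t = env.observe()   # agent receives an observation at time t
    a_t = agent.act(o_t)  # agent chooses an action from its observations
    env.step(a_t)         # the action affects the environment

print(len(agent.history))  # prints 10: one observation per cycle
```

The point of the sketch is that the agent never sees `env.state` directly, only noisy observations of it, which is exactly the state uncertainty the book's later chapters (beliefs, POMDP planning) address.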
The table of contents:
Introduction
Decision Making
Applications
Methods
History
Societal Impact
Overview
PROBABILISTIC REASONING
Representation
Degrees of Belief and Probability
Probability Distributions
Joint Distributions
Conditional Distributions
Bayesian Networks
Conditional Independence
Summary
Exercises
Inference
Inference in Bayesian Networks
Inference in Naive Bayes Models
Sum-Product Variable Elimination
Belief Propagation
Computational Complexity
Direct Sampling
Likelihood Weighted Sampling
Gibbs Sampling
Inference in Gaussian Models
Summary
Exercises
Parameter Learning
Maximum Likelihood Parameter Learning
Bayesian Parameter Learning
Nonparametric Learning
Learning with Missing Data
Summary
Exercises
Structure Learning
Bayesian Network Scoring
Directed Graph Search
Markov Equivalence Classes
Partially Directed Graph Search
Summary
Exercises
Simple Decisions
Constraints on Rational Preferences
Utility Functions
Utility Elicitation
Maximum Expected Utility Principle
Decision Networks
Value of Information
Irrationality
Summary
Exercises
SEQUENTIAL PROBLEMS
Exact Solution Methods
Markov Decision Processes
Policy Evaluation
Value Function Policies
Policy Iteration
Value Iteration
Asynchronous Value Iteration
Linear Program Formulation
Linear Systems with Quadratic Reward
Summary
Exercises
Approximate Value Functions
Parametric Representations
Nearest Neighbor
Kernel Smoothing
Linear Interpolation
Simplex Interpolation
Linear Regression
Neural Network Regression
Summary
Exercises
Online Planning
Receding Horizon Planning
Lookahead with Rollouts
Forward Search
Branch and Bound
Sparse Sampling
Monte Carlo Tree Search
Heuristic Search
Labeled Heuristic Search
Open-Loop Planning
Summary
Exercises
Policy Search
Approximate Policy Evaluation
Local Search
Genetic Algorithms
Cross Entropy Method
Evolution Strategies
Isotropic Evolutionary Strategies
Summary
Exercises
Policy Gradient Estimation
Finite Difference
Regression Gradient
Likelihood Ratio
Reward-to-Go
Baseline Subtraction
Summary
Exercises
Policy Gradient Optimization
Gradient Ascent Update
Restricted Gradient Update
Natural Gradient Update
Trust Region Update
Clamped Surrogate Objective
Summary
Exercises
Actor-Critic Methods
Actor-Critic
Generalized Advantage Estimation
Deterministic Policy Gradient
Actor-Critic with Monte Carlo Tree Search
Summary
Policy Validation
Performance Metric Evaluation
Rare Event Simulation
Robustness Analysis
Trade Analysis
Adversarial Analysis
Summary
Exercises
MODEL UNCERTAINTY
Exploration and Exploitation
Bandit Problems
Bayesian Model Estimation
Undirected Exploration Strategies
Directed Exploration Strategies
Optimal Exploration Strategies
Exploration with Multiple States
Summary
Exercises
Model-Based Methods
Maximum Likelihood Models
Update Schemes
Exploration
Bayesian Methods
Bayes-adaptive MDPs
Posterior Sampling
Summary
Exercises
Model-Free Methods
Incremental Estimation of the Mean
Q-Learning
Sarsa
Eligibility Traces
Reward Shaping
Action Value Function Approximation
Experience Replay
Summary
Exercises
Imitation Learning
Behavioral Cloning
Dataset Aggregation
Stochastic Mixing Iterative Learning
Maximum Margin Inverse Reinforcement Learning
Maximum Entropy Inverse Reinforcement Learning
Generative Adversarial Imitation Learning
Summary
Exercises
STATE UNCERTAINTY
Beliefs
Belief Initialization
Discrete State Filter
Linear Gaussian Filter
Extended Kalman Filter
Unscented Kalman Filter
Particle Filter
Particle Injection
Summary
Exercises
Exact Belief State Planning
Belief-State Markov Decision Processes
Conditional Plans
Alpha Vectors
Pruning
Value Iteration
Linear Policies
Summary
Exercises
Offline Belief State Planning
Fully Observable Value Approximation
Fast Informed Bound
Fast Lower Bounds
Point-Based Value Iteration
Randomized Point-Based Value Iteration
Sawtooth Upper Bound
Point Selection
Sawtooth Heuristic Search
Triangulated Value Functions
Summary
Exercises
Online Belief State Planning
Lookahead with Rollouts
Forward Search
Branch and Bound
Sparse Sampling
Monte Carlo Tree Search
Determinized Sparse Tree Search
Gap Heuristic Search
Summary
Exercises
Controller Abstractions
Controllers
Policy Iteration
Nonlinear Programming
Gradient Ascent
Summary
Exercises
MULTIAGENT SYSTEMS
Multiagent Reasoning
Simple Games
Response Models
Dominant Strategy Equilibrium
Nash Equilibrium
Correlated Equilibrium
Iterated Best Response
Hierarchical Softmax
Fictitious Play
Gradient Ascent
Summary
Exercises
Sequential Problems
Markov Games
Response Models
Nash Equilibrium
Fictitious Play
Gradient Ascent
Nash Q-Learning
Summary
Exercises
State Uncertainty
Partially Observable Markov Games
Policy Evaluation
Nash Equilibrium
Dynamic Programming
Summary
Exercises
Collaborative Agents
Decentralized Partially Observable Markov Decision Processes
Subclasses
Dynamic Programming
Iterated Best Response
Heuristic Search
Nonlinear Programming
Summary
Exercises
APPENDICES
Mathematical Concepts
Measure Spaces
Probability Spaces
Metric Spaces
Normed Vector Spaces
Positive Definiteness
Convexity
Information Content
Entropy
Cross Entropy
Relative Entropy
Gradient Ascent
Taylor Expansion
Monte Carlo Estimation
Importance Sampling
Contraction Mappings
Graphs
Probability Distributions
Computational Complexity
Asymptotic Notation
Time Complexity Classes
Space Complexity Classes
Decidability
Neural Representations
Neural Networks
Feedforward Networks
Parameter Regularization
Convolutional Neural Networks
Recurrent Networks
Autoencoder Networks
Adversarial Networks
Search Algorithms
Search Problems
Search Graphs
Forward Search
Branch and Bound
Dynamic Programming
Heuristic Search
Problems
Hex World
2048
Cart-Pole
Mountain Car
Simple Regulator
Aircraft Collision Avoidance
Crying Baby
Machine Replacement
Catch
Prisoner's Dilemma
Rock-Paper-Scissors
Traveler's Dilemma
Predator-Prey Hex World
Multi-Caregiver Crying Baby
Collaborative Predator-Prey Hex World
Julia
Types
Functions
Control Flow
Packages
Convenience Functions
Book link