Research Statement
Will machines ever "think" for themselves?
I think the question will be a moot point within the foreseeable
future. I believe the gradual increase in machine intelligence will be
subtle but steady until we find ourselves attributing intentionality
and intelligence to machines without having considered any alternate
explanation.
To this end, I am interested in three foundational areas of AI that I
believe are crucial to this steady progression in machine intelligence:
learning, reasoning, and planning. I am also interested in combining the
analytic formalisms of logic, probability, and decision theory to
develop efficient algorithms that solve real problems at the confluence
of these research areas. The following lines of inquiry drive my
current research interests:
Planning
How can we plan in stochastic environments in a domain-independent manner?
- First-order Markov Decision Processes: The augmentation of the
planning framework with stochastic action outcomes is well-formalized
as a Markov decision process (MDP). However, when dealing with large
relational problem domains, grounded solution methods generally cannot
scale with the size of the domain instantiation. This problem is
remedied by the first-order MDP (FOMDP) formalism, as summarized in my
"Why FOMDPs?" slide. My PhD thesis is concerned with approximating
solutions to FOMDPs while obtaining performance guarantees on the
resulting policies. For a solution technique that placed 2nd in the
probabilistic track of the 2006 International Planning Competition,
see a recent paper I wrote on approximate planning techniques for
first-order MDPs.
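To make the grounded setting concrete, here is a minimal sketch of value iteration on a tiny, fully enumerated MDP. The states, actions, probabilities, and rewards are hypothetical and chosen only for illustration; this is the standard grounded algorithm, not the first-order technique from my thesis. Note that the table has one entry per ground state, which is exactly what fails to scale as the domain instantiation grows.

```python
from typing import Dict, List, Tuple

# Hypothetical two-state, two-action MDP (all numbers are illustrative).
# transitions[state][action] -> list of (probability, next_state, reward)
Transitions = Dict[str, Dict[str, List[Tuple[float, str, float]]]]

TRANSITIONS: Transitions = {
    "at_depot": {
        "drive": [(0.8, "at_goal", 10.0), (0.2, "at_depot", 0.0)],
        "wait": [(1.0, "at_depot", 0.0)],
    },
    "at_goal": {
        "drive": [(1.0, "at_goal", 0.0)],
        "wait": [(1.0, "at_goal", 0.0)],
    },
}


def q_value(outcomes, values, gamma):
    """Expected discounted return of one action given its stochastic outcomes."""
    return sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)


def value_iteration(transitions: Transitions, gamma: float = 0.9,
                    tol: float = 1e-8) -> Dict[str, float]:
    """Repeat Bellman backups over every ground state until values stop changing."""
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            best = max(q_value(o, values, gamma) for o in actions.values())
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values


def greedy_policy(transitions: Transitions, values: Dict[str, float],
                  gamma: float = 0.9) -> Dict[str, str]:
    """Pick, in each state, an action that maximizes the one-step lookahead."""
    return {
        s: max(actions, key=lambda a: q_value(actions[a], values, gamma))
        for s, actions in transitions.items()
    }


if __name__ == "__main__":
    v = value_iteration(TRANSITIONS)
    print(v, greedy_policy(TRANSITIONS, v))
```

In a relational domain the number of ground states grows combinatorially with the number of objects, so this explicit loop over states quickly becomes infeasible.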
Reasoning
How can we efficiently reason with very large and
expressive knowledge bases?
- Specialized Reasoners: We need to exploit common modes of reasoning for which we
can develop efficient specialized reasoners. See my online Semantic Web DL-FOL
reasoner, based on the elegant framework of ordered theory resolution, for an
example of such an approach; it combines efficient DL satisfiability checking
with first-order resolution in a sound and complete reasoning framework.
- Decomposition: We must decompose our reasoning into tractable
subtasks. Some interesting recent work on reasoning in partitioned
first-order logic KBs has focused on decomposition procedures that
permit sound and complete reasoning. But there is clearly a tradeoff
between the value of information and its impact on the complexity of
reasoning. Can we decompose our reasoning based on an approximate
estimate of relevance? And can we integrate probabilistic and
decision-theoretic notions into a logical reasoning framework to
optimally carry out such relevance-based reasoning? This brings us to
the next question...
How can we integrate logical, probabilistic, and decision-theoretic
paradigms of reasoning?
There are at least two general principles that help identify when
logic can be fruitfully combined with probability and decision theory:
- Exploiting Probabilistic and Decision-theoretic Structure for
Robustness: One well-noted problem with purely logical reasoning
is that it is very brittle in the context of uncertainty. By
assigning probabilities to logical statements, we can avoid the
parasitic consequences of inconsistent knowledge bases (KBs) while gaining
the capability to reason about various probabilistic properties of KBs.
Furthermore, adding decision-theoretic utilities allows us to
discard low-utility information during approximate reasoning while
bounding the utility loss. The APRICODD approximate stochastic planner
serves as a great example of this latter use.
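The robustness point can be illustrated with a small sketch. It swaps in one particular scheme, log-linear weights over possible worlds in the style of Markov logic, which is an assumption made for illustration and not necessarily the formalism I would adopt. The atoms, formulas, and weights are hypothetical. Taken as hard constraints the three statements are inconsistent and would entail everything; treated as weighted statements they yield a well-defined probability for any query.

```python
import itertools
import math
from typing import Callable, Dict, List, Tuple

World = Dict[str, bool]
Formula = Callable[[World], bool]

ATOMS = ["bird", "flies"]

# Hypothetical weighted KB; a higher weight means a more strongly held statement.
# As hard constraints these three statements are jointly inconsistent.
WEIGHTED_KB: List[Tuple[float, Formula]] = [
    (3.0, lambda w: w["bird"]),                      # bird
    (2.0, lambda w: (not w["bird"]) or w["flies"]),  # bird -> flies
    (1.0, lambda w: not w["flies"]),                 # not flies
]


def world_weight(world: World, kb: List[Tuple[float, Formula]]) -> float:
    """Unnormalized weight: exp of the summed weights of satisfied statements."""
    return math.exp(sum(weight for weight, formula in kb if formula(world)))


def query_probability(query: Formula, kb: List[Tuple[float, Formula]],
                      atoms: List[str]) -> float:
    """Probability of the query, by brute-force enumeration of all worlds."""
    total = 0.0
    satisfied = 0.0
    for assignment in itertools.product([False, True], repeat=len(atoms)):
        world = dict(zip(atoms, assignment))
        weight = world_weight(world, kb)
        total += weight
        if query(world):
            satisfied += weight
    return satisfied / total


if __name__ == "__main__":
    print(query_probability(lambda w: w["flies"], WEIGHTED_KB, ATOMS))
```

Enumeration is exponential in the number of atoms, so this only shows the semantics; practical reasoners need the kind of structure exploitation discussed above.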
Learning
How can we learn complex value representations from delayed reward?
- Relational and First-order Representations: If we can easily
represent our knowledge relationally, then it also makes sense to use
such representations in our learning algorithms. My recent work on relational
reinforcement learning is one potential approach to doing this,
but there are many additional extensions and alternative approaches
worth exploring.
- Factored Representations: In many reinforcement learning domains,
the value function or model cannot be represented exactly, but instead must
be represented in a manner that approximately captures the structure of the
model. When the value function or model has a probabilistic
interpretation, Bayes nets and conditional Markov random fields (CRFs) make ideal
representations, but they also raise questions as to how their parameters
and structure can be learned in a delayed-reward setting. My relational
reinforcement learning work has used restricted Bayes nets to learn in
Backgammon and Othello, and my work with the Applied Games Group at
Microsoft Research looked at reinforcement learning with CRFs and
hierarchical features in the game of Go.
How can we integrate planning and transfer learning?
- Model-based Transfer Reinforcement Learning: It is
well-known in reinforcement learning that if it is possible to obtain
a reasonable approximation of the transition and reward model, it is
far more efficient to perform off-line inference in the model than to
constantly sample from on-line experience. While much work has looked
at this model-based reinforcement learning paradigm, very little work
has focused on this problem in the context of transfer
learning, where one learned model is reused for related domains.
Here is where learning a model in the form of a domain-independent
first-order MDP (FOMDP) is ideal, because inference can be performed
at the model level while generalizing to any domain
instantiation (as long as the induced FOMDP is consistent with each
domain instantiation). While the framework itself is quite
straightforward, the difficulty lies in accurately inducing a
first-order model of action effects. Some recent work on learning
symbolic models of stochastic domains provides one potential
model-induction approach that could be used in this framework.
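The model-based idea can be sketched in a few lines for the tabular, grounded case: estimate transition probabilities and mean rewards from sampled experience, then plan off-line in the estimated model. The toy environment, state and action names, and helper functions below are all hypothetical; a first-order version would induce a FOMDP instead of the count table used here.

```python
import random
from collections import defaultdict

STATES = ["start", "goal"]
ACTIONS = ["go", "stay"]


def true_step(state, action, rng):
    """Hypothetical environment: 'go' from 'start' reaches 'goal' 70% of the time."""
    if state == "start" and action == "go":
        return ("goal", 1.0) if rng.random() < 0.7 else ("start", 0.0)
    return (state, 0.0)


def collect_samples(n, seed=0):
    """Gather (state, action, next_state, reward) tuples from random experience."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        s, a = rng.choice(STATES), rng.choice(ACTIONS)
        s2, r = true_step(s, a, rng)
        samples.append((s, a, s2, r))
    return samples


def estimate_model(samples):
    """Maximum-likelihood model: (next-state distribution, mean reward) per (s, a)."""
    counts = defaultdict(lambda: defaultdict(int))
    reward_sum = defaultdict(float)
    for s, a, s2, r in samples:
        counts[(s, a)][s2] += 1
        reward_sum[(s, a)] += r
    model = {}
    for sa, nexts in counts.items():
        n = sum(nexts.values())
        model[sa] = ({s2: c / n for s2, c in nexts.items()}, reward_sum[sa] / n)
    return model


def plan(model, states, actions, gamma=0.9, sweeps=200):
    """Off-line value iteration in the learned model; no further sampling needed.

    Assumes every state has at least one sampled action in the model.
    """
    values = {s: 0.0 for s in states}
    for _ in range(sweeps):
        for s in states:
            values[s] = max(
                model[(s, a)][1]
                + gamma * sum(p * values[s2] for s2, p in model[(s, a)][0].items())
                for a in actions if (s, a) in model
            )
    return values


if __name__ == "__main__":
    learned = estimate_model(collect_samples(500))
    print(plan(learned, STATES, ACTIONS))
```

Once the model is estimated, any number of planning sweeps can be run without touching the environment again, which is the efficiency argument made above.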
How can we learn to reason efficiently?
- A Reinforcement Learning Approach: Reasoning is a sequential decision
process and, as in any other decision-making task, better decisions
during the reasoning process lead to faster inference. While current
work examines fixed reasoning strategies in logical inference, it is
interesting to ask whether we can cast logical inference as a
reinforcement learning task and learn which reasoning decision to make
in each context. The result should be faster and more powerful
inference for common reasoning tasks.