Richard Evans (Lionhead)
Black & White is a state-of-the-art computer game which has just been
released to critical and commercial acclaim. In this talk I will describe
the AI techniques used in the game, and speculate about future extensions to
the creature architecture.
Black & White is a game in which you play the role of a God. How powerful
you are depends on how many people believe in you. People need to be shown
miracles in order to believe in you: "Show me a sign!", they demand. The
player starts with a small community of worshippers, and must expand his
influence, gaining more believers, in order to wipe out the other deities,
and eventually become the One True God.
To help you in your quest, you are provided with a magical creature.
He is The Word Made Flesh. He will learn from you
how to behave, and will, with careful training, become an invaluable tool in
your quest. In this talk, I will focus on the AI techniques used to
implement these magical creatures.
1) Creature AI in B&W
The creatures use a standard BDI (Belief, Desire, Intention) architecture,
augmented in various ways. A creature's beliefs allow for slack between the
way the world is, and the way the creature thinks it is, which creates the
possibility of error and deception. As well as beliefs about the locations
and attributes of particular objects, the creature has Opinions about what
sorts of objects are most suitable for satisfying different desires. These
Opinions are implemented as decision trees. The creature has a variety of
desires. Each desire is implemented as a perceptron with a number of
different desire-sources. (For instance, the Hunger desire starts to
increase if the creature is low on energy, if he is bored, if he sees
something tasty, or if he sees another agent eating. These different sources
each have their own separate weighting, so you can train a creature to only
get hungry when his energy is low, or alternatively encourage him to eat
whenever he is bored). The creature's deliberation involves finding the most
important goal and the most important sort of object to act on. To speed up
the planning, the search space is abstracted into two layers: large,
game-important container objects (towns, flocks of animals, citadels,
creatures) on the one hand, and small objects (villagers, rocks, animals) on
the other. To keep things fast for a real time game, planning is staggered
over a number of game-turns. Once a plan has been decided on, it is broken
down into atomic sub-actions.
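
The desire-source weighting described above can be sketched as follows. This is a minimal illustration in Python, not the game's implementation: the class name `Desire`, the source names, the clamping to [0, 1], the learning rate, and the update rule are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Desire:
    """A desire whose intensity is a weighted sum of its desire-sources."""
    name: str
    weights: dict = field(default_factory=dict)  # source name -> weight
    learning_rate: float = 0.25  # assumed value, for illustration only

    def intensity(self, sources):
        # Perceptron-style activation: weighted sum of the source signals,
        # clamped to the range [0, 1] (the clamping is an assumption).
        total = sum(self.weights.get(k, 0.0) * v for k, v in sources.items())
        return max(0.0, min(1.0, total))

    def reinforce(self, sources, feedback):
        # feedback > 0 rewards, feedback < 0 punishes. Only the sources that
        # were active when the desire fired have their weights adjusted.
        for k, v in sources.items():
            w = self.weights.get(k, 0.0) + self.learning_rate * feedback * v
            self.weights[k] = max(0.0, w)


# Hypothetical Hunger desire with two of the sources named in the text.
hunger = Desire("hunger", {"low_energy": 0.5, "boredom": 0.5})

# Punish the creature twice for eating when he is merely bored.
hunger.reinforce({"boredom": 1.0}, feedback=-1.0)
hunger.reinforce({"boredom": 1.0}, feedback=-1.0)

print(hunger.intensity({"boredom": 1.0}))     # boredom no longer causes hunger
print(hunger.intensity({"low_energy": 1.0}))  # low energy still does
```

Because each source carries its own weight, punishing one trigger leaves the others untouched, which is what lets the player shape when a desire fires rather than whether it exists at all.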
2) Varieties of learning
Creatures learn in a variety of ways. They learn how to perform actions:
they learn to fish from watching the villagers fishing, and learn how to
perform miracles from watching the gods casting miracles. They learn what
sorts of situations should trigger what sorts of desires: you can train a
creature to be playful whenever he is bored, or train him not to be
aggressive after he has been damaged. They also learn what sorts of objects
are most appropriate for satisfying which sorts of desires. I looked at
Quinlan's ID3 to implement learning distinctions: the creatures remember the
learning episodes, and build up decision-trees to minimise the entropy
between those episodes.
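
To make the entropy idea concrete, here is a small sketch of how an entropy-based tree builder in the ID3 style picks the attribute to split on. The episode format, the attribute names, and the discrete "reward"/"punish" labels are assumptions for illustration; the abstract does not say how episodes or feedback are represented in the game.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def information_gain(episodes, attribute, target="feedback"):
    """Reduction in entropy of `target` obtained by splitting on `attribute`."""
    base = entropy([e[target] for e in episodes])
    remainder = 0.0
    for value in {e[attribute] for e in episodes}:
        subset = [e[target] for e in episodes if e[attribute] == value]
        remainder += len(subset) / len(episodes) * entropy(subset)
    return base - remainder


def best_attribute(episodes, attributes, target="feedback"):
    """Pick the attribute whose split leaves the least entropy behind."""
    return max(attributes, key=lambda a: information_gain(episodes, a, target))


# Hypothetical remembered episodes: what the creature attacked, and how
# the player responded.
episodes = [
    {"allegiance": "enemy", "size": "small", "feedback": "reward"},
    {"allegiance": "enemy", "size": "large", "feedback": "reward"},
    {"allegiance": "friendly", "size": "small", "feedback": "punish"},
    {"allegiance": "friendly", "size": "large", "feedback": "punish"},
]

print(best_attribute(episodes, ["allegiance", "size"]))  # prints "allegiance"
```

A full tree builder would apply `best_attribute` recursively to each subset until the feedback within a subset is uniform; only the split-selection step is shown here.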
Perhaps the most important thing about the creature learning is that it
represents a departure from the Skinnerian carrot-and-stick approach to
training. The only approach tried before in computer games is to watch what
your critter does, and then punish or reward after the event. The trouble
with this is that it is immensely slow and frustrating. It also pushes the
critter into local minima from which he cannot escape. Let me explain.
Suppose you want to teach your creature to only ever attack enemy villagers.
Your critter arrives in a friendly village, behaves aggressively, you punish
him severely. Now the trouble is that you may have punished him so much that
he never tries to be aggressive again, ever. Your creature has learnt too
much from the lesson, and now his aggressive urge has been permanently
muted. If the only way of teaching is after-the-event reinforcement, then
once your critter has got himself into such a state, he can never get out of
it. So to solve this problem, I introduced another, much more important type
of teaching: teaching by showing. In a game environment in which you can
perform things you want your critter to do, you can let him learn from your
actions. He sees the action you are performing, makes an intelligent guess
at what your goal was, constructs a belief about the object you were acting
on, and constructs a training episode from your action. The creature keeps a
model of what goals the player has been trying to achieve. This is the first
step towards having an empathetic agent. By using training by showing, you
can always get your creature out of any rut he might be in.
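
The teaching-by-showing steps (observe the action, guess the goal, form a belief about the object, record a training episode) might be sketched like this. Everything named here is hypothetical: the action-to-goal table, the `Episode` fields, and the rule of preferring the goal the player has pursued most often are assumptions chosen to keep the example short, not a description of the game's code.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical table: which goals each observed action could plausibly serve.
ACTION_TO_GOALS = {
    "throw_rock_at": ["destroy", "play"],
    "cast_fireball_at": ["destroy"],
    "give_food_to": ["nurture"],
}


@dataclass(frozen=True)
class Episode:
    goal: str                 # the goal the creature guessed
    target_attributes: tuple  # the creature's belief about the object
    feedback: float           # a shown action counts as positive feedback


class PlayerModel:
    """Keeps a running tally of the goals the player seems to pursue."""

    def __init__(self):
        self.goal_counts = Counter()

    def guess_goal(self, action):
        candidates = ACTION_TO_GOALS[action]
        # Prefer the goal seen most often so far; ties keep table order.
        return max(candidates, key=lambda g: self.goal_counts[g])

    def observe(self, action, target):
        goal = self.guess_goal(action)
        self.goal_counts[goal] += 1
        belief = tuple(sorted(target.items()))
        return Episode(goal, belief, feedback=1.0)


model = PlayerModel()
enemy_villager = {"allegiance": "enemy", "kind": "villager"}
print(model.observe("cast_fireball_at", enemy_villager))
print(model.observe("throw_rock_at", enemy_villager))
```

Episodes produced this way have the same shape as those produced by reward and punishment, so they can feed the same learning machinery. That is why showing can pull a creature out of a rut that reinforcement alone cannot.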
3) Ways in which the architecture can be expanded / improved
There are two serious limitations to the creature architecture:
The creature plans at the goal level, but once he has chosen a goal and a
suitable object, he finds a suitable action for satisfying that goal by
looking in a precomputed plan library. (For example, the creature knows that
there are various ways of being destructive to an object: throwing it,
throwing something at it, kicking it, casting a spell at it). It would be
nice if the creature could plan dynamically how to satisfy a goal, rather
than just looking in a list of suitable actions.
The other limitation of the creature architecture is that there are a
small, finite number of desires (about 40). Real agents can construct new
goals, goals that have never been had by anyone else before.
These two limitations fit together: you can only use a plan library if you
have a finite number of goals.
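
The link between a plan library and a finite goal set is easy to see in code. The sketch below is an assumption-laden illustration: the `Goal` enum, the action names, and the `is_feasible` predicate are hypothetical, with the destructive actions taken from the example in the text.

```python
from enum import Enum, auto


class Goal(Enum):
    # A closed, finite set of goals; the real game has about 40 desires.
    DESTROY = auto()


# Precomputed plan library: each goal maps to a fixed list of actions.
PLAN_LIBRARY = {
    Goal.DESTROY: [
        "throw_it",
        "throw_something_at_it",
        "kick_it",
        "cast_spell_at_it",
    ],
}


def choose_action(goal, is_feasible):
    """Return the first library action for `goal` that is currently feasible."""
    for action in PLAN_LIBRARY[goal]:
        if is_feasible(action):
            return action
    return None  # no stored plan applies; nothing is planned dynamically


# Hypothetical situation: the object is too heavy to throw.
too_heavy = lambda action: action != "throw_it"
print(choose_action(Goal.DESTROY, too_heavy))  # prints "throw_something_at_it"
```

The table can be written out in advance only because `Goal` is a closed enumeration. A goal constructed at run time would have no entry to look up.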
The easiest way to give desires some sort of compositional structure is to
add quantifiers. A goal is now the desire to make a certain proposition
true, where that proposition is expressed in first-order logic. Although
this does give us the in-principle guarantee of an infinite number of
different desires, it is unattractive because it means we can no longer use
plan-libraries, and also because there are a number of different ways of
satisfying a proposition, some of which are not what was intended, but it is
difficult to see how to rule them out. For example, one way of "satisfying" the
desire that (for all men x) there is a woman who is married to x, is to kill
all the men, but this is probably not what was intended.
Another way to add compositional structure to desires is much more
attractive to me: to add one new desire: wanting to perform an activity, and
allowing an activity itself to have compositional structure. For example, a
tournament is an n-player activity which takes a two-player activity as a
parameter and returns a tournament in which all the permutations of the n
players play the two-player activity to find an overall winner. The fact
that activities have parameters is what gives them compositional structure,
which enables the number of different goals to be indefinitely large.
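
One way to picture an activity that takes another activity as a parameter is as a higher-order function. The sketch below is only an illustration under stated assumptions: the function names, the scoring by number of wins, and the two sample activities (`arm_wrestle`, `race`) with their strength and speed tables are all hypothetical.

```python
from collections import Counter
from itertools import permutations


def tournament(two_player_activity):
    """Take a two-player activity and return an n-player activity."""
    def play(players):
        wins = Counter()
        # Every ordered pair of distinct players plays the activity once.
        for a, b in permutations(players, 2):
            wins[two_player_activity(a, b)] += 1
        # The overall winner is the player with the most wins.
        return max(players, key=lambda p: wins[p])
    return play


# Two hypothetical two-player activities; each returns the winner.
STRENGTH = {"ape": 3, "cow": 1, "tiger": 2}
SPEED = {"ape": 1, "cow": 2, "tiger": 3}


def arm_wrestle(a, b):
    return a if STRENGTH[a] > STRENGTH[b] else b


def race(a, b):
    return a if SPEED[a] > SPEED[b] else b


creatures = ["cow", "ape", "tiger"]
print(tournament(arm_wrestle)(creatures))  # prints "ape"
print(tournament(race)(creatures))         # prints "tiger"
```

Each new two-player activity yields a new tournament for free, so the set of things a creature can want to do grows with the set of parameters rather than being fixed in advance.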