Breath Mints for Penguins: Random discrete distributions (decks) in game development

Randomness plays an important role in many games. Enemies appear randomly, or events occur at randomly spaced intervals. A relatively easy way to think of random events in a game is as a deck of note cards. The probability of a particular word being drawn is a function of the number of cards in the deck with that word on them, and the total number of cards in the deck.

This approach is called a discrete distribution.

In the case of circle the aardvark, I used a discrete distribution to determine whether an aardvark or an enemy spawned, and how long each character should remain on the screen before timing out and going away.

The game is written in Monkey X, but the code in this post is in Python.
You can find the Monkey X version here.

Read on for details

Background: Two kinds of people can spawn in the pickup trucks, Aardvarks (who you are supposed to cricle), or Otto von Bismarcks (who you are not supposed to circle). In early levels, there should be lots of aardvarks spawning and not very many Bismarcks. In later levels, most of the people who spawn should be Bismarck, and very few should be aardvarks.

Problem: We need a function that when called will return either "Aardvark" or "Bismarck". The probability of returning one or the other should not necessarily be equal, but should be controllable by the caller.

Approach: We'll create a class for discrete distributions. Conceptually, we can think of the class as a "deck" where we can any number of cards with different labels, and draw the cards randomly with replacement.

Additional details:

Items can be drawn from a discrete distribution in one of two different ways: with replacement, or without replacement. If items are drawn with replacement, then the probabilities are the same for every call (the card is put back in the deck and the deck is reshuffled every time a card is drawn). If items are drawn without replacement, then every time a particular result is returned, the chances of drawing the same result on a subsequent draw decrease (unless there is only one kind of item in the set, in which case, the chances of drawing it remain 100%). In circle the aardvark, I only used the approach with replacement, but I'll give code for both here.

Interface:

The first thing I like to do when writing a class or set of functions is to decide on an interface. Before I start worrying about which algorithms and data structures will the the best suited to the problem, I think it is important to define the problem as precisely as possible.

In this case, we'll take an object-oriented approach, and create a class called "Deck". The class will have only one public function: a constructor to instantiate new instances. Instances will have four public methods:
add, remove, draw, draw_with_replacement

The prototypes for the function and methods will be as follows:
new()

add(value, count=1) #value is the label on the card. Count is the number of instances of the card to put into the deck.

remove(value, count=None) #value is the label on the card. Count is the number of instances of that value to remove from the deck. If count == None, then all instances of value will be removed.

draw() #returns a random value from the deck and removes one instance of that value from the deck.

draw_with_replacement() #returns a random value from the deck, but does not remove any values from the deck.

Data Structure:

There are many different ways we could implement the above functions. The kind of data structure we use as the basis will largely depend on which operation we expect to use most frequently, and thus want to optimize for speed. Here are some possibilities:

The built-in list structure from python seems like the obvious choice, but because deleting items in the middle of a python list (what would be called an array in C) is a slow operation, some kind of linked might be better, especially for games using drawing without replacement. A third possibility would be to use a dictionary where keys are deck entries and values are the number of times that entry appears in the deck.

In most practical situations for games, the discrete distribution code will probably not be a speed bottleneck (you probably won't need to be calling it thousands of times per second), so it probably won't make a lot of difference which data structure you use.

For this implementation I'll use a dict as the underlying data structure because it makes the implementation simple (though not necessarily fast).

from __future__ import division #needed for test code
import random

random.seed()

class Deck():
  def __init__(self):
    self.__data = dict() #keys are card label, values are number of times the label occurs in the deck.
    self.__total = 0 #the total number of cards from the deck. Should always be equal to sum(self.__data.keys()).

  def add(self, value, count=1):
    '''
    value is the label on the card. Count is the number of instances of the card to put into the deck.
    '''
    #ensure valid parameters
    if count <= 0:
      raise ValueError("count must be > 0")
    count = int(count)
    
    #increment the card in the data dict
    
    if value not in self.__data:
      self.__data[value] = 0
    self.__data[value] += count
    
    #update total
    self.__total += count
  
  def remove(self, value, count=None):
    '''
    value is the label on the card. Count is the number of instances of that value to remove from the  deck. If count == None, then all instances of value will be removed
    '''
    #ensure valid parameters
    if count is not None and count < 0:
      raise ValueError("count must be >= 0 or None")
    if value not in self.__data:
      raise ValueError("value not in Deck")
    if count is not None:
      count = int(count)
    
    #decrease the count in the dict and decrement __total
    old_count = self.__data[value] #in some cases all cards will be removed, so count is not necessarily equal to the number that __total should be changed by. old_count helps keep track of that.
    if count is not None:
      self.__data[value] -= count
    if count is None or self.__data[value] <= 0:
      self.__total -= old_count
      del self.__data[value]
    else:
      self.__total -= count
  
  def draw_with_replacement(self):
    '''
    returns a random value from the deck, but does not remove any values from the deck.
    if deck is empty, returns None
    '''
    
    #check input
    if self.__total == 0:
      raise RuntimeError("Attempt to draw from empty Deck")
    
    #generate a random number that is less than or equal to the number of cards in the deck
    rand_int = random.randint(1, self.__total)
    
    #Find the card at the position in the deck corresponding to the random number.
    running_sum = 0
    for val, weight in self.__data.iteritems():
      running_sum += weight
      if rand_int <= running_sum:
        out = val
        break
        
    return out

  def draw(self):
    '''  
      returns a random value from the deck and removes one instance of that value from the deck.
      if deck is empty, returns None
    '''
    out = self.draw_with_replacement()
    self.remove(out, 1)
    return out
  
  def as_dict(self):
    '''
      returns a dict representation of the Deck. Where dict keys are the card names, and values are the number of times the card occurs in the deck
    '''
    return self.__data.copy()
  
  def total_cards(self):
    '''
      returns the total number of cards in the Deck
    '''
    return self.__total

Test code:

import unittest
class TestDeck(unittest.TestCase):
  def test_add(self):
    d = Deck()
    d.add("card1", 3)
    d.add("card2", 10)
    d.add("card1", 2)
    result = d.as_dict()
    self.assertEqual(result["card1"], 5)
    self.assertEqual(result["card2"], 10)
    
    self.assertRaises(ValueError, d.add, "card1", 0)
    self.assertRaises(ValueError, d.add, "card1", -1)
    self.assertRaises(ValueError, d.add, "card1", None)
    #test exceptions
    
  def test_remove(self):
    d = Deck()
    d.add("card1", 3)
    d.add("card2", 10)
    d.add("card1", 2)
    d.add("card3", 20)
    
    d.remove("card1", 1)
    d.remove("card2")
    d.remove("card3", 30)
    
    result = d.as_dict()
    
    self.assertEqual(result["card1"], 4)
    self.assertNotIn("card2", result)
    self.assertNotIn("card3", result)
    
    self.assertRaises(ValueError, d.remove, "card1", -1)
    self.assertRaises(ValueError, d.remove, "card7")
  
  def test_total(self):
    d = Deck()
    self.assertEqual(d.total_cards(), 0)
    d.add("card1", 3)
    d.add("card2", 10)
    d.add("card1", 2)
    self.assertEqual(d.total_cards(), 15)
    result = d.as_dict()
    self.assertEqual(d.total_cards(), sum(result.values()))
    
    d = Deck()
    d.add("card1", 3)
    d.add("card2", 10)
    d.add("card1", 2)
    d.add("card3", 20)
    
    d.remove("card1", 1)
    d.remove("card2")
    d.remove("card3", 30)
    self.assertEqual(d.total_cards(), 4)
    result = d.as_dict()
    self.assertEqual(d.total_cards(), sum(result.values()))
    
  def test_draw_with_replacement(self):
    d = Deck()
    c1_ct = 25
    c2_ct = 75
    d.add("card1", c1_ct)
    d.add("card2", c2_ct)
    
    r_draw = {"card1":0, "card2":0}
    total_draws=1000
    for x in range(total_draws):
      r_draw[d.draw_with_replacement()] += 1
    
    self.assertAlmostEqual(c1_ct/(c1_ct+c2_ct), r_draw['card1']/total_draws,1)
    self.assertAlmostEqual(c2_ct/(c1_ct+c2_ct), r_draw['card2']/total_draws,1)
  
  def test_draw(self):
    d = Deck()
    c1_ct = 25
    c2_ct = 75
    d.add("card1", c1_ct)
    d.add("card2", c2_ct)
    
    r_draw = {"card1":0, "card2":0}
    total_draws=c1_ct + c2_ct
    #draw all of the cards
    for x in range(total_draws):
      r_draw[d.draw()] += 1
    
    self.assertEqual(c1_ct/(c1_ct+c2_ct), r_draw['card1']/total_draws)
    self.assertEqual(c2_ct/(c1_ct+c2_ct), r_draw['card2']/total_draws)
  
if __name__ == '__main__':
    unittest.main()

Notes:
To test , paste the test code at the bottom of the file and run it from the command line.

The interfaces and test code will be the same regardless of the underlying data structure, so you can use this code as a framework for optimizing for some specific use. (Although optimization is probably not necessary in most cases).

Be sure to let me know if you find any bugs, or if you have questions, or think I'm doing something wrong!

This post is part of a series of posts about the code behind my game circle the aardvark in the back of the pickup truck.

Breath Mints for Penguins

Sunday, July 3, 2016

Random discrete distributions (decks) in game development

No comments:

Post a Comment

Labels

About Me