Let’s explore the fun, counter-intuitive world of combinatorics.

Combining values to form sets of distinct combinations can be a tricky thing. Even if you ignore order, the number of possible sets grows alarmingly.

For an array of two values [1, 2], you can generate:

  • [] (empty set)
  • [1]
  • [2]
  • [1,2] (or [2,1])

If repeats are allowed ([2, 2] for example), the increase is even greater. As the number of input values increase, the number of corresponding output sets shoots through the roof!

1*7DsXxogHWKpaynmwaTfE7Q
Combinatorial Explosion, indeed

Let’s call the input values items and each combination of those values a choice. Furthermore, let’s allow for multiple items, each with distinct choices. A good working example would be a menu. We’ll simulate the menu of Ye Olde Ice Cream Shoppe, which offers its customers combinations of ice cream, toppings, and syrup flavors.

The ice cream flavors are: CHOCOLATE, STRAWBERRY, VANILLA

Toppings: pineapple, strawberry, coconut flakes, pecans

Syrups: chocolate, marshmallow, butterscotch, maple

There are some constraints on the choices: customers can choose any two ice creams, two toppings, and one syrup. Ice cream and topping choices are exclusive, meaning that I can’t choose pineapple + pineapple, for instance. The customer may choose to have no toppings and no syrup, but must choose at least one ice cream. With these constraints, the rate of increase is exponential, of the order 2 to the nth power, which is considerably less than if order was significant and duplicates allowed.

Palatability

Ye Olde Ice Cream Shoppe is actually quite modern in its approach to business, and is developing an artificial intelligence expert system to judge which combinations of ice cream, topping, and syrup are palatable. Servers will be shown a warning on their registers when a customer chooses an unpalatable selection. The servers are then instructed to double check with the customer that their order is correct.

Step 1: Building the Data

Code for this article can be found here. I will assume you’re familiar with JavaScript and Node.js. A working knowledge of Lodash (or Underscore) is helpful. The code uses a map/reduce database for storage.

The first step will be to create a database of all ice cream, topping, and syrup combinations. The inputs will be as follows:

var menu = {
  iceCream: {min: 1, max: 2, values: ["CHOCOLATE", "STRAWBERRY", "VANILLA"]},
  topping: {min: 0, max: 2, values: ["pineapple", "strawberry", "coconut flakes", "pecans"]},
  syrup: {min:0, max: 1, values: ["chocolate", "marshmallow", "butterscotch", "maple"]}
}
menu.js: Ice cream values are capitalized so that CHOCOLATE ice cream is distinguishable form chocolate syrup

With this data, I can write a Combinator function that takes each menu item and generates all possible permitted combinations. Each combination is stored as an array. For example, ice cream combinations would look like:

[ [ ‘CHOCOLATE’, ‘STRAWBERRY’ ],
 [ ‘CHOCOLATE’, ‘VANILLA’ ],
 [ ‘CHOCOLATE’ ],
 [ ‘STRAWBERRY’, ‘VANILLA’ ],
 [ ‘STRAWBERRY’ ],
 [ ‘VANILLA’ ] ]

Once the combinations of ice cream, toppings, and syrups are determined, all that’s left is to iterate each item combination with the others:

var allChoices = [];

_.each(iceCreamChoices, function(ic) {
  _.each(toppingChoices, function(tp) {
    _.each(syrupChoices, function(sy) {
      allChoices.push([ic,tp,sy]);
    })
  })
})
buildChoices.js

This yields a combination of ice cream(s), topping(s), and syrup, like:

[ [ 'VANILLA' ], [ 'coconut flakes', 'pecans' ], [] ],
  [ [ 'VANILLA' ], [ 'coconut flakes' ], [ 'chocolate' ] ],
  [ [ 'VANILLA' ], [ 'coconut flakes' ], [ 'marshmallow' ] ],...

The choices shown translate as:

  • Vanilla ice cream with coconut flakes and pecans, no syrup
  • Vanilla ice cream with coconut flakes and chocolate syrup
  • Vanilla ice cream with coconut flakes and marshmallow syrup

Even with just a few restricted menu items, the number of permitted choices is 330!

Step 2: Storing the Data

With every combination of orderable items now determined, further work can be done. The AI system for determining palatable choice combinations is turning out to be complex and won’t be embedded in the registers’ operating system. Instead, an AJAX request will be made to a server housing the AI program. The inputs will be the customer’s menu choices, and the output will rate the palatability of those choices as one of: [ugh, meh, tasty, sublime]. A palatability rating of ugh triggers the aforesaid warning.

We need a fast response to the request, so the palatability ratings will be cached in a database. Given the nature of exponential increase, this could evolve to become a Big Data problem if more item choices are added to the menu in the future.

Let’s say it is decided to store choice combinations and ratings in a NoSQL database. Using PouchDB, each choice and palatability value are stored as JSON documents. A secondary index (a.k.a. view) with each choice as a key will allow us to quickly look up the palatability rating. Instead of pushing the data into an allChoices array as shown above in buildChoices.js, I can push JSON documents to the database for storage.

Proceeding naively, I can make a couple of changes in Step1.js to arrive at Step2.js: first of all, I need to install PouchDB via npm, then require it. Then, I create a NoSQL database called choices.

var PouchDB = require('pouchdb');
var db = new PouchDB('choices');

Now, each choice is posted to the choices database:

var count = 0;

_.each(iceCreamChoices, function(ic) {
  _.each(toppingChoices, function(tp) {
    _.each(syrupChoices, function(sy) {
      //allChoices.push([ic,tp,sy]);
      db.post({choice: [ic,tp,sy]}, function(err, doc){
        if (err) console.error(err);
        else console.log(`stored ${++count}`);
      });
    })
  })
});

console.log('done??');
pouchPost.js

This works! Sort of. As can be inferred by the callback parameter to db.post, that operation is asynchronous. What we see in the log is:

>node Step2.js
done??
stored 1
stored 2
stored 3
...

So the code says it’s done before even record 1 has been stored. This will be a problem if I have further processing to do against the database and all the records aren’t there yet.

Step 3: Fixing and Refining

There’s also a more subtle problem: potential resource exhaustion. If the database limits the number of concurrent connections, a large number of simultaneous post requests may result in connection timeouts.

For Step3.js, I did a bit of bug fixing, reformatting and refactoring of what was written in Step2.js. One bug was that each run added more and more records to the database, duplicating what was there before. The solution was to destroy the existing database, re-create it, and then run the main program:

// remove old
db.destroy(null, function () {
    db = new PouchDB('choices');
    run();
});

Next was to add a running count of documents stored and post requests in process so that the program: 1) knows when the last document is stored; 2) allows only five posts to proceed at any one time. The run() method looks like this now (with some omissions):

function run() {
    var menu = { //...
    }

    var iceCreamChoices = new Combinator({ //...
    });
    var toppingChoices = new Combinator({ //...
    });
    var syrupChoices = new Combinator({ //...
    });

    var count = 0;
    var total = iceCreamChoices.length * toppingChoices.length * syrupChoices.length;
    var postCount = 0;
    var postCountMax = 5;

    _.each(iceCreamChoices, function (ic) {
        _.each(toppingChoices, function (tp) {
            _.each(syrupChoices, function (sy) {
                var si = setInterval(() => {
                    if (postCount < postCountMax) {
                        clearInterval(si);
                        postChoice(ic, tp, sy);
                    }
                }, 10);
            })
        })
    });

    function postChoice(ic, tp, sy) {
        ++postCount;
        db.post({
            choice: [ic, tp, sy]
        }, function (err, doc) {
            --postCount;
            done(err);
        });
    }

    function done(err) {
        if (err) {
            console.error(err);
            process.exit(1);
        }

        console.log(`stored ${++count}`);
        if (count === total) {
            console.log('done');
        }
    }
}
run.js

The main changes to note are that:

  1. A postCount tracks how many posts are outstanding
  2. An interval timer checks the postCount and will post and exit when post slots are available
  3. a done() handler is called when all choices are stored

Step 4: Adding Palatability

With all possible menu choices in place, we can now have the AI determine the palatability of each. The AI is just a mock at the moment, which assigns random values to each document record in PouchDB. Those values will be stored in the database by updating each document with a taste rating.

var _ = require('lodash');

var PouchDB = require('pouchdb');
var db = new PouchDB('choices');

db.allDocs({
        include_docs: true
    })
    .then(docs => {
        _.each(docs.rows, r => {
            r.doc.taste = palatability();
            db.put(r.doc);
        });
    });

function palatability() {
    var scale = Math.round(Math.random() * 10);

    var taste;

    switch (true) {
    // this switch is a horrible hack;  don't ever do this ;-P
    case (scale < 2):
        taste = "ugh";
        break;
    case (scale < 5):
        taste = "meh";
        break;
    case (scale < 8):
        taste = "tasty";
        break;
    default:
        taste = "sublime";
        break;
    }

    return taste;
}
palatability.js

Just to verify that we stored things correctly, we can dump the docs in the database to the console:

db.allDocs({
        include_docs: true
    })
    .then(docs => {
        _.each(docs.rows, r => {
            console.log(r.doc.choice, r.doc.taste)
        });
    });
//output looks like:
/*
[ [ 'STRAWBERRY' ], [ 'coconut flakes' ], [ 'maple' ] ] 'sublime'
[ [ 'CHOCOLATE' ], [ 'pecans' ], [ 'chocolate' ] ] 'tasty'
[ [ 'CHOCOLATE', 'STRAWBERRY' ], [], [ 'chocolate' ] ] 'sublime'
[ [ 'VANILLA' ], [], [ 'marshmallow' ] ] 'meh'
[ [ 'CHOCOLATE', 'STRAWBERRY' ],
  [ 'pineapple' ],
  [ 'marshmallow' ] ] 'meh'
*/

Step 5: Looking Up Palatability

The documents are in the database, but now there needs to be a way to determine what the palatability is for a customer’s choices. This is done by defining a view, which is a function that returns a key for each document, along with a value. What should the key be?

I could use r.doc.choice as the key, but arrays have an order and that order might change if the menu items defined in Step 1 were later rearranged. The key is just an identifier of the choice selection and doesn’t carry an semantic meaning of its own. What should work is to:

  • flatten each r.doc.choice array,
  • order the elements alphabetically, then
  • concatenate them together
  • result is a key

If more choices are added in the future, though, the key length might be over the limit allowed by the database. Instead of using the key as constructed, a hash the key could be used as the real key. A SHA256 hash in hex is 64 characters long, and the likelihood of a hash collision, even for a quadrillion choices, is essentially zero. Writing the hash function for choices is easy, using the Node.js crypto module and a Lodash chain:

const crypto = require('crypto');
const _ = require('lodash')

function hash(choice) {
    var str = _.chain(choice)
        .flatten()
        .sortBy()
        .join('|')
        .value();

    return crypto.createHmac('sha256', 'old ice cream')
        .update(str)
        .digest('hex');
}

module.exports = hash;
hashChoices.js

Adding the hash to our existing documents is a simple matter of iterating through each database document, computing its hash, and updating the document with a key value:

const _ = require('lodash');
const hash = require('./hash');

const PouchDB = require('pouchdb');
const db = new PouchDB('choices');

db.allDocs({
        include_docs: true
    })
    .then(docs => {
        _.each(docs.rows, r => {
            r.doc.key = hash(r.doc.choice);
            db.put(r.doc);
        });
    })
    .catch(e => {
        console.error(e)
    });
addHashKey.js

Next, a database view is built using the document key field as an index; I’ll call it choice.

const PouchDB = require('pouchdb');
const db = new PouchDB('choices');

// doc that defines the view
var ddoc = {
    _id: '_design/choice',
    views: {
        by_key: {
            map: function (doc) {
                emit(doc.key, doc.taste);
            }.toString()
        }
    }
};

// remove any existing view, then add new one:
db.get(ddoc._id)
    .then(doc => {
        return db.remove(doc);
    })
    .then(() => {
        db.put(ddoc)
            .catch(function (err) {
                console.error(err);
            });
    });
view.js

For any document key (hash of choice array), I can find its taste via the view choice. Now everything is in place to determine whether a customer’s choice is ugh, meh, tasty, or sublime. To test this, we make some random choices and see if we can find the taste:

    const choices = [
        [['VANILLA'], ['coconut flakes', 'pecans'], ['marshmallow']],
        [['CHOCOLATE'], ['pecans'], ['chocolate']],
        [['STRAWBERRY', 'VANILLA'], ['pineapple', 'coconut flakes'], ['marshmallow']],
        [['STRAWBERRY'], ['pecans'], ['maple']],
        [['VANILLA'], ['coconut flakes', 'pineapple'], ['chocolate']],
        [['CHOCOLATE, STRAWBERRY'], ['pineapple', 'pecans'], ['butterscotch']],
    ];

    const keys = _.map(choices, c => {
        return hash(c);
    });

    db.query('choice/by_key', {
        keys: keys,
        include_docs: false,
    }, function (err, result) {
        if (err) {
            return console.error(err);
        }
        _.each(result.rows, (r, i) => {
            console.log(`${choices[i]} tastes ${r.value}`);
        })
    });
test.js

The results are:

=> node test
VANILLA,coconut flakes,pecans,marshmallow tastes ugh
CHOCOLATE,pecans,chocolate tastes sublime
STRAWBERRY,VANILLA,pineapple,coconut flakes,marshmallow tastes tasty
STRAWBERRY,pecans,maple tastes meh
VANILLA,coconut flakes,pineapple,chocolate tastes sublime

That’s it! All that’s left is to write client software that submits choices via AJAX and gets a taste (palatability) value back. If it’s ugh, then up comes a warning on the register.

In a subsequent post, I refine the algorithm used above. Check it out!

References

Exponential Growth Isn't Cool. Combinatorial Explosion Is.
So much of the tech industry is obsessed with exponential growth. Anything linear is dying, or has been dead for years…
www.torbair.com

Combinations and Permutations Calculator
Find out how many different ways you can choose items. For an in-depth explanation of the formulas please visit…
www.mathsisfun.com