Kami 2 Solver in F# - Part 2

This is the second post in a series on how to solve Kami 2 puzzles in F#.

Previously I had done the work to analyze screen shots of puzzles and reduce them to a graph. This post covers how to render these graphs using GraphViz and a primitive algorithm for solving the underlying puzzle.

Recap

Previously we converted Kami 2 puzzles from a screenshot into a graph representation:

type Region = { Color: int; AdjacentRegions: Set<int>; ... }

type Kami2Puzzle = {
    NumColors: int
    Regions: List<Region>
}

So all we need to do next is write a function to convert these puzzles into a list of moves to solve them. Simple, right?

type PuzzleMove = int * int

type SearchResults = {
    Moves: PuzzleMove list
}

let solvePuzzle (puzzle : Kami2Puzzle) : PuzzleSolution = ...

Presumably solving puzzles will be that simple. But first let’s make a detour to better understand what we are looking at.

Rendering Graphs

Just like annotating game screen shots, being able to visualize the graph representation of each puzzle would be immensely valuable for debugging. While the .NET platform offers a lot of graphical APIs (GDI+, WPF) if you are working on something other than Windows (like me) you are out of luck.

Forutnately GraphViz has our back.

GraphViz: Ugly but versitile

GraphViz is a super-powerful application for visualizing graphs and models. As it turns out, rendering a graph in a visually appealing manner is a really hard problem. While I wouldn’t claim that GraphViz produces “pretty” images, it does provide a decent solution.

The following is a side-by-side comparison of a Kami 2 puzzle and its associated graph representation.

analyzed puzzle with graph

There appears to be some good .NET wrappers for GraphViz, such as JamieDixon/GraphViz-C-Sharp.

But rather than dealing with configuration or install paths, I’ll just assume GraphViz is on my path and use shell execute directly. (Hey, I have two kids and am spending my weekends blogging about solving puzzles, don’t judge me for cutting corners here and there.)

Setting up GraphViz on a Mac is pretty easy if you are using HomeBrew. (Only mentioned here as a subtle hint that if you are using macOS, you should definitely use HomeBrew for installing apps not found in the AppStore.)

$ brew install graphviz

Dot file format

Once you have GraphViz (dot) installed, you just need to create DOT files for it to render. The wikipedia page has enough examples that you can figure out the basic syntax for simple graphs. See “DOT (graph description language)”.

The graph file from the previous image is encoded as follows:

strict graph kami_puzzle {
    // nodes
    r_0 [style="filled", fillcolor="#43362a"]
    r_1 [style="filled", fillcolor="#c1ac99"]
    r_2 [style="filled", fillcolor="#43362a"]
    r_3 [style="filled", fillcolor="#ab3f23"]
    r_4 [style="filled", fillcolor="#ab3f23"]
    r_5 [style="filled", fillcolor="#c1ac99"]
    < ... snip , you get the idea ... >
    // edges
    r_0 -- r_1
    r_0 -- r_3
    r_0 -- r_4
    r_0 -- r_7
    r_0 -- r_9
    r_1 -- r_0
    r_1 -- r_2
    r_1 -- r_4
    < ... snip , you get the idea ... >
}

To render the graph (e.g. test.dot) as an image, just use:

$ dot -T test.dot -o test.png
$ open test.png

Converting Kami 2 objects to Dot files

Converting Kami 2 puzzles to these dot files is very easy because we can use an F# Sequence Expression to build up a sequence of strings – each representing a line of the DOT file – and then write that to disk.

The code is beautifully straight forward, so I’ll only go over one stanza in detail.

// Emit the dot file representation for a puzzle's regions. (Nodes)
let emitRegionLabels (regions : seq<Region>) =
    regions
    |> Seq.map (fun region ->
        sprintf "    r_%d [style=\"filled\", fillcolor=\"%s\"]"
                region.ID
                region.ColorCode)

// Emit the dot file representation for a region's associations. (Edges)
let emitRegionAssociations (region : Region) =
    region.AdjacentRegions
    |> Seq.map (sprintf "r_%d")
    |> Seq.map (sprintf "r_%d -- %s" region.ID)


// Convert a Kami2 puzzle into a string dot file.
let ConvertToGraph (regions : seq<Region>) =
    seq {
        yield "strict graph kami_puzzle {"
        // Emit the nodes. Give each a number (since puzzles with more than
        // 26 regions are common) and assign the source color.
        yield "    // nodes"   
        yield! emitRegionLabels regions
        yield "    // edges"
        yield! regions
               |> Seq.map emitRegionAssociations
               |> Seq.concat
               |> Seq.map (sprintf "    %s")
            
        yield "}"
    }

// Renders the regions as a PNG file.
let RenderAsGraph (regions : seq<Region>) outputFilePath =
    let dotFilePath = Path.GetTempFileName()
    let dotFileLines = ConvertToGraph regions
    File.WriteAllLines(dotFilePath, dotFileLines)

    // Render, assuming `dot` is on the current path.
    let args = sprintf "-Tpng %s -o %s"
                        dotFilePath
                        outputFilePath
    Process.Start("dot", args) |> ignore

Assuming you are comfortable with sequence expressions, it is all pretty simple. There was one part of F# epicness that was a bit too terse, so let me unpack it:

        yield "    // edges"
        yield! regions
               |> Seq.map emitRegionAssociations
               |> Seq.concat
               |> Seq.map (sprintf "    %s")

The yield! keyword returns a seq<_> inside of a sequence expression, and merges it into the output sequence. Quick demo in F# Interactive:

> seq {
-     yield 1
-     yield! [2; 3; 4]
- };;
val it : seq<int> = seq [1; 2; 3; 4]

So we want to emit a single seq<string> which represents all of the edges between all of the nodes in the graph. Simple enough.

To build that list of edges, we map each region to emitRegionAssociations, which returns all the edges between that region and its adjacent neighbors. (So the function has type Region -> seq<string>.)

But we cann’t just return that in our sequence expression, because at that point we have seq<seq<string>> (from the Seq.map call). So Seq.concat is used to merge all of those sequences of sequences of strings, into a single sequence.

Finally, we map every string describing an edge in the graph to sprintf to prefix it with four spaces. This is just so that the resulting DOT file looks nicer. Note how we are using partial function application / currying to pass sprintf with the format string already specified to Seq.map.

If you are reading my blog and actually following along with this post, you are probably already well aware that F# is a beautifully concise and powerful language. This code snippet was your daily reminder.

Now that we have the ability to render graphs of Kami 2 puzzles, we can now see what we are up against when we try to apply some algorithmic might to solving them.

Here are a few couple examples of Kami 2 puzzles and their graph representations:

two puzzles with graph

Solving Puzzles

With our detour of graph rendering out of the way, we should not roll up our sleaves and start solving puzzles. To give you a visual walkthrough of how Kami puzzles work here are several graphs of a puzzle in the process of being solved.

First we start with this. While the graph representation looks complicated, you can see that the “center” region (r_5) is the most connected.

two puzzles with graph

By coloring that center region yellow, it merges with the two other yellow regions (previously r_1 and r_6) and the graph is now smaller. You can see how coloring a region “merges” it with adjacent regions of the same color.

two puzzles with graph

We repeat the process one more time, this time coloring the yellow region red. The next step is to color that red region teal, and the puzzle shall be solved.

two puzzles with graph

Formally, we want to write an algorithm that performs the “color region” operation as few times as possible to reduce a colored puzzle graph into a single node. (i.e. this is the formal definition of how to define and solve Kami 2 puzzles.)

Now we have the concept in-place, let’s go into code matters.

Command-line arguments

As part of cleaning up the code, I hooked up the Argu command-line processing library. It’s actually pretty slick. While I’m pretty sure the |> Option.isSome is unnecessary and I’m just using the library wrong, I am pretty impressed that I got it working in literally seconds.

type CommandLineArguments =
    | PuzzlesDirectory of dir:string
    | PuzzleName of name:string
    | PrintRegionInfo
    | SaveMarkupImage
    | SaveGraphImage
with
    interface IArgParserTemplate with
        member s.Usage =
            match s with
            | PuzzlesDirectory _ -> "directory containg puzzle images"
            | PuzzleName _ -> "name of a puzzle to solve"
            | PrintRegionInfo -> "print region information to STDOUT"
            | SaveMarkupImage -> "save marked up puzzle image"
            | SaveGraphImage -> "save graph output"

...

    let results = parser.ParseCommandLine argv

    let argPuzzlesDir = results.GetResult(<@ PuzzlesDirectory @>, defaultValue = "./puzzles/")
    let argPuzzleName = results.TryGetResult(<@ PuzzleName @>)
    let argPrintRegionInfo = results.TryGetResult(<@ PrintRegionInfo @>) |> Option.isSome
    let argSaveMarkupImage =  results.TryGetResult(<@ SaveMarkupImage @>) |> Option.isSome
    let argSaveGraphImage =  results.TryGetResult(<@ SaveGraphImage @>) |> Option.isSome

Search algorithm

The search algorithm itself looks complicated, but in reality it is a depth-first search of every valid “move” for a puzzle. So it is guaranteed to find a solution up to a specified maximum number of moves… but in reality the approach is limited by the (extremely) large branching factor of Kami 2 puzzles.

To combat this, a few tricks are applied to speed up computation.

First, we provide a numeric evaluation for each candidate move. This is pretty standard for tree-based search algorithms. For Kami 2, our evaluateMove function applies heuristics like the number of merged regions, total size of the resulting merged region, etc.

Second, we store a hash of every puzzle node. This dramatically speeds things up since we no longer reevaluate duplicate search nodes. i.e. if there are two sequences of moves that arrive at the same puzzle configuration, we only need to evaluate one of them.

Note I am calling JankHash rather than GetHashCode. There is an incorrect assumption I made about the default implementation of GetHashCode for F# record types and the immutable Set type. Perhaps I’ll cover the failure mode in a future blog post. But for now, I just wrote my own janky way of hashing each puzzle step. (Which if you look at the code is shamefull.)

Ultimately, here is the first pass at the search function. Each call returns a StepResult object which is only really useful for debugging and novelty diagnostics. It could easily be replaced with a PuzzleMove list option.

type StepResult = Cancelled | Culled | Duplicate | Solved of PuzzleMove list | OutOfMoves

let rec bruteForceStep (puzzleStep : Kami2PuzzleStep) movesList movesRemaining (state : SearchResults) =
    let puzzleStepHashCode = puzzleStep.JankHash()
    state.IncrementNodesEvaluated()

    // See if we can cull the search at this point.
    // User cancelled the search task?
    if state.CancellationToken.IsCancellationRequested then
        Cancelled
    // Have we already seen this puzzle step before?
    elif state.KnownNodes.Contains(puzzleStepHashCode) then
        state.IncrementDupNodes()
        Duplicate
    // Is the puzzle actually solved?
    elif puzzleStep.IsSolved then
        // We built up the move list "backwards", so reverse to make it right.
        Solved(List.rev movesList)
    // Ran out of moves? (Hopefully we never hit this, as we cull unsolveable
    // puzzles earlier in the search tree.)
    elif movesRemaining <= 0 then
        OutOfMoves
    // Otherwise, try making a move and recursing.
    else
        // Since we are in the process of evaluating this node, add it to the
        // known nodes list. So a peer / parent doesn't reevaluate.
        state.KnownNodes.Add(puzzleStepHashCode) |> ignore

        enumerateAllMoves puzzleStep
        // Evaluate each of these potential moves.
        |> Seq.map (fun (regionToColor, newColor) ->
            let evaluation = evaluateMove puzzleStep (regionToColor, newColor) movesRemaining
            (regionToColor, newColor, evaluation))
        // Filter out the ones that are unsolveable.
        |> Seq.filter (fun (_,_,eval) -> eval > 0)
        // Sort best to worst.
        |> Seq.sortByDescending (fun (_,_,eval) -> eval)
        // Recurse.
        |> Seq.map (fun (regionToColor, newColor, _) ->
            let updatedPuzzle = colorRegion puzzleStep regionToColor newColor
            assert (updatedPuzzle.JankHash() <> puzzleStepHashCode)

            let result = bruteForceStep updatedPuzzle ((regionToColor, newColor) :: movesList) (movesRemaining - 1) state
            result)
        // For each of these potential moves, did we find a solution?
        |> Seq.tryFind (function Solved(moves) -> true | _ -> false)
        |> (function Some(sln) -> sln | None -> Culled)

Enabling timeout

For really complicated puzzles (and/or super-inefficient algorithms) we need a way to stop searching for a solution and just fail with “no solution found”.

I wired that into the solver by putting the actual search step into a Task object, and then calling Wait with a specified timeout.

The only tricky part was threading through the CancellationToken object into the actual solver function itself. But I’m very happy with how this turned out. Note that the StartBruteForceSearch method returns a 3-tuple:

The Task object performing the operation. This is so we can call Wait.
A SearchResults object. This will be updated with the solution and stats while the search is taking place.
Finally a CancellationTokenSource object, which is used to signal cancelling the search if so desired.

let stopwatch = Stopwatch.StartNew()

let searchTask, searchResults, cts = Solver.StartBruteForceSearch puzzle moves
if not <| searchTask.Wait(TimeSpan.FromSeconds(timeout)) then
    cts.Cancel()

let timeResult =
    if cts.Token.IsCancellationRequested then "Timeout"
    else sprintf "%.2fs" stopwatch.Elapsed.TotalSeconds

Results

I extracted a semi random set of puzzles from the game, with at least one puzzle from every group of six. Here is how my solver fared with a five-second timeout:

two puzzles with graph

Actual CLI-output. Notice how most puzzles are solved in less than one second or not at all. That is somewhat expected for any search space with a 20-200x branching factor.

Also, note that for some puzzles there is an unusually small number of nodes that were evaluated. That means that our evaluateMove heuristic is actually pretty good.

Chriss-MacBook-Pro:kami2-solver chrsmith$ dotnet run -- --searchtimeout 5
Solving puzzle 'set 1 #1 (1)'...	0.06s	(2 nodes, 0 dupes)	SOLVED!
Solving puzzle 'set 1 #6 (6)'...	0.01s	(4 nodes, 0 dupes)	SOLVED!
Solving puzzle 'set 2 #6 (4)'...	0.00s	(41 nodes, 0 dupes)	SOLVED!
Solving puzzle 'set 3 #6 (5)'...	0.01s	(25 nodes, 0 dupes)	SOLVED!
Solving puzzle 'set 4 #1 (2)'...	0.00s	(3 nodes, 0 dupes)	SOLVED!
Solving puzzle 'set 4 #6 (4)'...	0.01s	(20 nodes, 0 dupes)	SOLVED!
Solving puzzle 'set 5 #2 (3)'...	0.01s	(37 nodes, 0 dupes)	SOLVED!
Solving puzzle 'set 5 #6 (4)'...	0.01s	(26 nodes, 0 dupes)	SOLVED!
Solving puzzle 'set 6 #6 (5)'...	0.16s	(488 nodes, 8 dupes)	SOLVED!
Solving puzzle 'set 7 #6 (3)'...	0.04s	(280 nodes, 19 dupes)	SOLVED!
Solving puzzle 'set 8 #6 (5)'...	0.01s	(31 nodes, 0 dupes)	SOLVED!
Solving puzzle 'set 9 #2 (3)'...	ERROR: Unable to determine puzzle colors.
Solving puzzle 'set 9 #5 (9)'...	0.00s	(12 nodes, 0 dupes)	SOLVED!
Solving puzzle 'set 9 #6 (7)'...	Timeout	(94915 nodes, 31149 dupes)	no solution found
Solving puzzle 'set 10 #1 (4)'...	0.15s	(1066 nodes, 330 dupes)	SOLVED!
Solving puzzle 'set 10 #6 (7)'...	0.00s	(20 nodes, 0 dupes)	SOLVED!
Solving puzzle 'set 11 #2 (6)'...	Timeout	(82838 nodes, 61025 dupes)	no solution found
Solving puzzle 'set 12 #6 (6)'...	Timeout	(7904 nodes, 2615 dupes)	no solution found
Solving puzzle 'set 13 #6 (6)'...	Timeout	(7408 nodes, 2499 dupes)	no solution found
Solving puzzle 'set 14 #6 (9)'...	2.06s	(9226 nodes, 3947 dupes)	SOLVED!
Solving puzzle 'set 15 #6 (5)'...	1.79s	(4120 nodes, 1686 dupes)	SOLVED!
Solving puzzle 'set 16 #6 (5)'...	0.00s	(20 nodes, 0 dupes)	SOLVED!
Solving puzzle 'set 17 #6 (6)'...	0.01s	(29 nodes, 0 dupes)	SOLVED!
Solving puzzle 'set 18 #6 (15)'...	Timeout	(1517 nodes, 0 dupes)	no solution found
Solving puzzle 'set 19 #1 (14)'...	Timeout	(11419 nodes, 5141 dupes)	no solution found
Solving puzzle 'set 19 #3 (17)'...	Timeout	(11346 nodes, 4822 dupes)	no solution found
Solving puzzle 'set 19 #4 (15)'...	Timeout	(6775 nodes, 2741 dupes)	no solution found
Solving puzzle 'set 19 #5 (17)'...	Timeout	(8079 nodes, 3177 dupes)	no solution found
Solving puzzle 'set 19 #6 (15)'...	Timeout	(10604 nodes, 4547 dupes)	no solution found
Solving puzzle 'user generated #1 (10)'...	Timeout	(12234 nodes, 4827 dupes)	no solution found
Solving puzzle 'user generated #2 (4)'...	0.00s	(15 nodes, 0 dupes)	SOLVED!
Solving puzzle 'user generated #3 (12)'...	Timeout	(3197 nodes, 189 dupes)	no solution found

Next steps

I’ll try to rerun the battery of tests with a much, much larger timeout (maybe a full hour?) and see how many more puzzles it can solve then. But given how I wrote the program entirely to solve puzzles 3-6 of set 19, clearly my work here is not done.

So next I’ll look into performance optimization (profilers? rewriting code?), tuning the algorithm (better handling 10+ move puzzles?), or just plain old debugging (maybe the solution is blocked by bugs?)

I’ll keep you posted. If you are interested in poking around the source, check it out on GitHub.