My Puzzle Solver Program  kordle KwonTom Addict Puzzles: 195 Best Total: 5h 8m 9s  Posted  2012.02.29 04:40:59 First time poster here.
I've been working on a program for solving these puzzles. I'm using python, and use a sat solver as a black box to to the number crunching.
I build CNF instances by associating a variable with every line segment, and asserting clauses corresponding to the following facts: For every point, either exactly 0 or exactly 2 of the segments that it touches are on. For every square with a number in it in the puzzle specification, exactly that many of the surrounding segments are on. No square has all 4 surrounding segments on.
This only gets me so far. The satisfying assignments to the variables of CNF formulas generated like this include the correct solution, but also possibly many solutions with multiple loops. As my programs finds unsatisfactory solutions (with more than one loop) returned by the SAT solver, it adds clauses asserting that precisely these loops do not occur. Often, the correct solution is found in a matter of 1 or 2 iterations of this process, but sometimes many more, which slows down solving dramatically, and is a severe impediment to the program I hope to make (for helping guide a human puzzlemaker in generating puzzles).
(Despite this, my program solves the user beasts posted in another thread (thanks!) in, typically, 20 seconds or so on my notsogreat machine.)
I have also tried including another set of variables for indicating whether or not each square is inside or outside a loop, but this has not helped me find a compact way to represent the restriction that there be but a single loop in the solution.
Now, from what I've read stalking these forums it sounds like most of the folks with programs for solving puzzles use a patternmatching style technique, and many of those who have written code for slitherlinks may not have the computer science background to understand what I'm going on about with CNF and SAT, but I thought that someone might find this interesting.
I'm interested if anybody had got a good technique for encoding the "single closed loop" restriction as a set of CNF clauses. The only way I've thought of involves creating n^2log(n) (where n is #rows*#columns) new variables, v_(i,j,k), asserting "square i and j are inside a loop and located no more than k squares away from each other" for i and j ranging over all squares, and k in {1,2,4,...,2^floor(log_2(n)) }. There is then a nice way to hook up these variables so that they calculate the distances and v_(i,j,2^floor(log_2(n))) is on for all pairs i,j iff both are in the same loop. This is then relatively clean to encode as a CNF formula.
Unfortunately, this solution uses an outrageous number of clauses and variables, making the resulting CNF formulas untenably large or, in any case, slower to solve than with the iterative loopeliminating technique I mentioned above.
So, does anybody know of a better technique? It's such a simple problem to solve outside of the CNF formula that I'd be surprised if n^2log(n) variables (and even more clauses) are really needed.
If anybody has an idea for this I would be very happy to discuss (or clarify myself, if this post has been unclear) it.
I also am happy to share with any interested parties my code, which can solve puzzles decently quickly and can verify uniqueness for (e.g.) user beasts in ~5 minutes.
Cheers, all 
Tilps KwonTom Obsessive Puzzles: 4657 Best Total: 20m 22s  Posted  2012.02.29 10:16:29 Pattern based solving is usually an augment to an underlying iterative rule or recursive search based approach, it is not the fundamental algorithm. I would think it is most likely an attempt to make puzzles interesting without introducing arbitrary difficulty.
I doubt that you will find a nice formulation for single loop using boolean algebra.
My solver doesn't have a formal SAT engine, because I am strongly focused on trying to 'control the difficulty level' for generated puzzles, so I want lots of control over the algorithm.
At the high level of the engine is try possibilities, if fail, assume opposite. It can either run in a purely iterative mode, where if a possibility doesn't fail, it's progress is undone and another possibility is tried, or a recursive mode, where possibilities are executed until failure occurs. Recursive mode can solve any puzzle, iterative mode is useful for controlling difficulty. (For solving puzzles, I run iterative until no more progress can be made, then switch to recursive on the partially solved puzzle. For generating puzzles I stick to iterative most of the time.)
The engine starts with the simplest rules like yours, number of edges defined by cell, number of edges defined by intersection. Then I added some inferred logic for the combination of a cell and one of the intersections it partakes. This gives rise to automatically putting edges or crosses on a 3 or a 1 where there is no exit opportunities at an intersection. Or that a line entering a 3 cannot exit at the same intersection. Then I added runtime derived logic. If you consider every possibility for the edges of a cell, or edges meeting at an intersection, you can determine that two edges must be the same, or opposite. These can combine to form long chains where if you know one edge, you know all of them. (Much later I extended this with deriving logic by considering every possibility for edges for two touching cells, because it is the only 'static' approach I found to get the end lines on adjacent 3's.) These derived logics can feed back into the logic generation process. As well as 'same or opposite' you can also derive 'not both' or 'not neither', but these do not easily form chains, so they are considerably less useful for solving. ('not both' overlap with 'not neither', alternating, can form a one direction chain of sorts.) Another way to do runtime derived logic, if you try two possibilities which can't both be true, but neither lead directly to failure, any common consequences are true, any conflicting consequences lead to same/opposite rules being derived between the trial and the consequence. I also implemented 'cell colouring' (identifying areas of the puzzle which are inside or outside of a given loop) which can provide nonlocal same/opposite logic and interacts with basic rules and derived logic. But it doesn't help avoid multiple loops at all.
For avoiding multiple loops my approach tracks which set of edges each edge belongs too, and if an edge would join one edge set back to itself, verifies that the entire puzzle would be solved, otherwise rejects that option. This is considered static analysis, so it can be performed inside trials.
More recently I implemented some more advanced loop deduction logic. After performing a trial it verifies that the puzzles has not been partitioned, if there are two edges which could not possibly become part of the same edge set, then the trial indicates failure. 'Not both' derived logic can limit the spread of the flood fill causing this to be quite smart at times.
I think there is even smarter path deduction algorithms, but I've failed to determine any which run in any feasible level of runtime.
Anyway, hopefully something in that ramble might be useful. 
psychman KwonTom Obsessive Puzzles: 785 Best Total: 28m 19s  Posted  2012.03.01 09:55:43 I also think that there is no straightforward way to describe single loop with boolean logic. I would propose to try heuristics to reach sooner to a solution with single loop
Some ideas are 1.Force variables that represent (inside,outside) loop to be inside 2.Force variables that represent (inside,outside) loop to be outside 3.Force variables that represent (inside,outside) loop to take a value based on a fixed probability i.e 80% inside 20% outside 4.Force variables that represent line or not line to be mostly lines 5.A combination of the above
Of course it is much easier to say these than implement them but if they are useful and you have better results let us know 
lodenkamper KwonTom Fan Puzzles: 21 Best Total: 47m 58s  Posted  2012.08.02 05:00:30 Interesting. I have written a solver and puzzle generator that is along the same basic lines  a SAT solver is the core, and then the single loop constraint is enforced. Some of the old user beasts are puzzles I generated this way.
I agree with the other comments that there is no way to cleanly do the loop constraint in a CNF, so trial and error is likely to be best. Thoughts you may want to consider:
1) The SAT solver itself is doing trial and error, so if a prematurely closed loop can be recognized at that level, the appropriate clause to prevent this can be added as part of the SAT solving process. This may improve speed by reducing the amount of back and forth between the SAT solver and your toplevel code. There are SAT solvers where the source code is available (e.g., MiniSat).
2) Early on, I added clauses to eliminate "small closed loops" e.g., 6segment loops, 8segment loops, etc., which is valid for most puzzles of interest in practice. CNFs for these are simple. This can prevent trial and error in some cases, and is clearly a partial implementation of the single loop constraint. However, this doesn't seem to reduce solving time when trial and error must be used.
3) I have found that there is a significant element of "luck" in solving hard puzzles this way  most of the time, solutions are found quickly, but in unfavorable cases, the time to solve can go way up (i.e., longer than I'm willing to wait). Seemingly trivial changes like rotating and/or reflecting the puzzle can have a big effect on solving time. Have you seen this kind of behavior?
4) Unfortunately, this is a thoroughly nonhuman way to solve a puzzle, so I've been assuming there is no real way to use this kind of solver to rate puzzle difficulty.
Last edited by lodenkamper  2012.08.02 20:56:08 
foilman KwonTom Admin Puzzles: 1726 Best Total: 24m 8s  Posted  2012.08.02 07:20:32 The puzzle generator I use to make the puzzles for this site uses a solver fairly similar to Tilps's description. I do try to keep the solving method as close as possible to how a person would go about finding a solution, so start by looking for wellknown patterns and later try different moves to see what ends in failure.
But that's because I want to end up with a rough estimate of how tricky it would be for a human to solve, so I need to do it that way. However, it is very interesting to read of other ways of finding solutions. (Or any techniques I can incorporate to make the "very hard" ones even tougher... ) 
kyrka KwonTom Addict Puzzles: 473 Best Total: 23m 56s  Posted  2012.08.04 00:44:57 Hi.
There is actually another interesting approach to solving these puzzles which is not using essentially "trial and error", but dynamic programming/memoization. The idea with that is to proceed e.g. column by column from left to right and to use just brutforce (with small optimisations) for a single column. As soon as a new column is calculated, you can "forget" about the previous ones and you only have to keep track of the lines of the last column and which open ends of the loop belong together.
Obviously this works best, if one side of the puzzle is quite short, as you can choose this side for the brutforcing and runtime will be linear in the length of the other side. However, if both sides are longer, you will run into memory problems very quickly, as the number of possible states you have to keep track of grows exponentially.
I was curious about how well this idea would turn out in practice so I implemented it for the 10x10size puzzles. (I doubt it will work for 20x14size or even bigger puzzles because of the aforementioned memory problems.) One column contains 21 line segments (11 horizontal, 10 vertical) and there are at most 10 open ends (it has to be an even number naturally), so if you think about it for a while, there are all in all 65 possible arrangements of open ends (hint: Catalan numbers ). So we have 2^21*65 (~100 million) possible states for one column and for one step we have to check in the worst case 2^10 = 1024 possibilities (if there is no further information like given numbers). At first, this seems of course too much for a ordinary computer, but in practice there are only about 1000 possible states to deal with at a time, so my program runs quite fast. I tested it on 10 different puzzles (i.e. 2 weeks from Monday to Friday) and all times were < 100 ms, average being about 27 ms.
I would also be interested in how fast your backtracking solutions are, just for comparison.^^
Here are the solving times for last week's puzzles: July 23rd: 8 ms July 24th: 88 ms July 25th: 4 ms July 26th: 20 ms July 27th: 32 ms 
arbor8 KwonTom Obsessive Puzzles: 3531 Best Total: 14m 2s  Posted  2012.08.04 18:38:27 Interesting solving times for Tuesday and Wednesday. Monday was easy and Tuesday needed some more work, but certainly not 10 times more than Monday. Again, for a noncomputer solver , Wednesday wasn't twice as easy as Monday 
lodenkamper KwonTom Fan Puzzles: 21 Best Total: 47m 58s  Posted  2012.08.07 23:43:57
Quote: Originally Posted by kordle First time poster here.
I've been working on a program for solving these puzzles. I'm using python, and use a sat solver as a black box to to the number crunching.
...
(Despite this, my program solves the user beasts posted in another thread (thanks!) in, typically, 20 seconds or so on my notsogreat machine.)
...
I also am happy to share with any interested parties my code, which can solve puzzles decently quickly and can verify uniqueness for (e.g.) user beasts in ~5 minutes.
Cheers, all 
Some follow up questions that occurred to me:
1) The user beasts vary in difficulty a great deal. Are there any puzzles which you've seen take much longer than 20 seconds to solve with your code?
2) How do you verify uniqueness with this approach? As I understand it, your method would terminate the first time the SAT solver returns a set of variable assignments that only has one loop. If so, the existence of other 1loop solutions would not be ruled out 
v_e_e_n_c_a KwonTom Obsessive Puzzles: 930 Best Total: 38m 37s  Posted  2012.08.08 21:28:50
Quote: Originally Posted by kordle (Despite this, my program solves the user beasts posted in another thread (thanks!) in, typically, 20 seconds or so on my notsogreat machine.)

My program don't use any backtrack, only rules and deductions. It can fully solve more than 90 percent of beast and the solving time is much less than second. 
