Review: German qualifier

Today I want to share some thoughts about the online qualifier for the German Puzzle Championship 2022 which took place last weekend. I will also discuss a few puzzles from the event; if you have not seen them yet and want to have an unbiased look, now would be the perfect time.

Let us start with the overall difficulty, which has been a major issue for me. As I have remarked numerous times already, the difficulty level of such events seems to have increased steadily. Over the last few years, no (German) participant has been able to solve all the puzzles, even when the time window was stretched to 150 minutes. This is a trend I view as misguided.

We should keep in mind that the drop-off from the top of the field is rather steep; less experienced participants usually manage to solve only a fraction of the puzzles the best competitors can complete in the same time. If the top solvers finish in the 80-90% range, the cut can be expected to lie around 25%, and it has repeatedly dropped below that line.

Also, the puzzles of the actual Championship are often harder than those used in the qualifier. This means that, in the worst case, some competitors there might be able to solve only a handful of puzzles and leave with virtually no points. Now, given that we do this mostly for fun, one could say it is not that bad as long as the participants enjoy themselves. But ultimately I feel such an outcome is not a good advertisement for our community, because newcomers might feel left out.

I am not saying that the above problem can easily be corrected by making the puzzles substantially easier. As a consequence of the gradient in the field, the stronger participants will then solve all the puzzles with a significant amount of time left, which is not desirable either. It feels equally wrong to have a dozen or more competitors finishing all the puzzles and using only bonus points as a tiebreaker.

Basically, the goal must be to find a sensible middle ground here. Ideally, the top competitors should be able to just barely complete the set, and certainly not finish so fast as to get bored. On the other hand, there must be enough puzzles that are accessible even to the weaker half of the field, so that they do not end up in the single-digit percentage range.

This is something I have advocated before, and I am happy to report that it seems to have worked out nicely. Two of the official competitors were able to finish all the puzzles, and a few others came close. The cut lies above 30%, which is not quite what I had anticipated, but still higher than in the previous years.

(I am – respectfully – taking our international puzzle friends out of the equation here, since the target has to be the German community for this particular event. And, with even more respect, let me note that participants like Ken or Freddie will always turn any objective regarding the difficulty level upside-down, no matter what you try.)

On the downside, the total number of participants was smaller than I had hoped. The idea behind the selection of puzzle styles (we will get there shortly) was to motivate more inexperienced solvers, and it did not work out as planned. Perhaps it has to do with the rather limited announcements we made beforehand, but in hindsight that is all just speculation. In any case, this is something we should continue to work on in the coming years.

Regarding the puzzle selection: I had made the decision to use only basic puzzle styles – in particular those from the blog you are reading – over a year ago, when I volunteered to design the puzzles for the qualifier. And I still believe in this approach; in my opinion there is no need to overcomplicate the qualifying process by adding weird variants, hybrids and innovations, which can be saved for the Championship itself instead.

As I see it, the variety of standard puzzles has enough to offer. This contest featured a reasonable assortment of filling puzzles, line segment puzzles, placement puzzles, shading puzzles and dissection puzzles (the five main categories in my book), distributed over what I consider an appropriate difficulty spectrum. There may be flaws in the selection (as I wrote in a previous article, people are typically reluctant to give negative feedback), but no major drawbacks I hope.

Nevertheless, when it comes to newbies, I keep wondering. Puzzle styles like Masyu, Tapa, Anglers or Galaxies are normal for those of us who have been around longer, yet they are not quite as accessible as we pretend they are. Many of them are not intuitive, and I could imagine that they appear scary to someone who has not seen their kind before.

Do not get me wrong; I am not suggesting that we limit ourselves entirely to Sudoku (and perhaps Minesweeper). It makes no sense to deny the treasures that can be found in the huge vault of logical puzzles. But we must make an even larger effort to reach out to those who are not familiar with anything beyond the regular 9×9 grid.

Next, about the actual puzzles. Very early into the organization of the event, I put down a list of the contest puzzles, both those already finished (including a difficulty estimate based on my own experiences, since test solving was not yet underway) and the ones still in the pipeline. The idea was to maintain balance during the entire creation process.

Well, it did not go smoothly all the way. If I remember correctly, the first puzzles I designed were the Sudoku and the Kropki. Some other puzzles which also appear in the harder half followed, and at some point I realized that the total difficulty would be too high. Hence I created a bunch of easier puzzles, but this time the pendulum swung too far in the other direction. Most of them were one-minute puzzles, and there were no puzzles of intermediate difficulty.

I had to create new versions of some puzzles (and also again at a later stage, when the test solvers had already given me the first round of results). It is probably for the better, because I was under the impression that the early releases were not quite as balanced as the final set.

Two puzzles where last-minute changes were made were the Shikaku and the Masyu. Both were originally much smaller and easier. Such puzzles – which have gone through a lot of change – are particularly hard to evaluate. Everyone who has seen the old and the new version, author and test solvers alike, will be prejudiced one way or another. I think the point values were reasonable; in the end, though, my difficulty estimate for these puzzles was little more than a guess.

Which brings us to the matter of rating the puzzles. In order to set the point value for each puzzle, I usually rely on both the results of the test solvers and my own gut feeling. No, actually, it is not just gut feeling, but rather a serious attempt to bring in my own experiences as a solver (detached as much as possible from my authorship of the puzzle).

You see, as part of the job I am trying to put myself in the position of solvers with different experience levels. It may sound strange because that is what test solvers are for. And yet, I have found it useful in order to fine-tune the ratings, beyond a simple exploitation of the available test solving times.

It may appear that I am trying to distort the results from the test solvers, but I do not think this is the case. On the contrary, I consider this an attempt to fix the weak spots of the test solving process. You see, the entire rating system rests on the notion of a fixed ratio between the solving time and the point value. However, such a ratio does not really exist globally.

For example, the Slitherlink and the Tapa were very small and had solving times in the same range as the first two puzzles (Easy as ABC and Tents) from most of the test solvers. However, I felt that these two puzzles were less straightforward, and that there would probably be a skill level or a “threshold” where solvers needed more time to complete them. I therefore decided that they should be worth more points (15 instead of 10).

I made the same adjustment for the Fillomino because, in my experience, many solvers will enter all the numbers, which takes more time despite the low difficulty level of the puzzle. Such changes are certainly not distortions; it is just that I feel some issues with the puzzle may not be apparent from test solving, especially if there is only a small amount of feedback available. By the way, let me thank again the test solvers for their support in preparation of the event.
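To make the time/value relationship concrete, here is a minimal sketch of how such a rating could be computed: a fixed points-per-minute ratio applied to the median test-solving time, rounded to the contest's point granularity, plus a manual adjustment where the raw times seem misleading. The ratio, the granularity, and the function names are my own illustrative assumptions, not the actual values used for the qualifier.

```python
from statistics import median

POINTS_PER_MINUTE = 10   # assumed global time/value ratio
GRANULARITY = 5          # point values are multiples of 5

def base_points(solve_times_min):
    """Point value derived purely from test-solving times."""
    raw = median(solve_times_min) * POINTS_PER_MINUTE
    return max(GRANULARITY, GRANULARITY * round(raw / GRANULARITY))

def rated_points(solve_times_min, adjustment=0):
    """Base value plus a manual adjustment, e.g. +5 for puzzles that
    are less straightforward than the raw times alone suggest."""
    return base_points(solve_times_min) + adjustment

# Fast for everyone, no adjustment (times in minutes, illustrative):
print(base_points([0.8, 1.1, 1.0]))       # 10
# Similar times, but bumped for the suspected skill threshold:
print(rated_points([0.9, 1.2, 1.0], +5))  # 15
```

The adjustment parameter is exactly where judgments like the Slitherlink/Tapa bump or the Fillomino notation overhead would enter, outside of what the timing data can capture.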

I should probably mention that the test solvers finished some puzzles faster than I expected. When that is the case, I take it very seriously; it may well be that they have found short cuts in the intended solving path. Two such puzzles were the Cave and the Hakyuu (see the final part of this post). In the end, it turned out that they were still among the hardest in the set, but I downgraded them a little.

A few words on manual score adjustments are in order. Several years ago it was customary for the organizer to contact the participants when they had minor errors (which could be attributed to mere typos), and also in cases when something was missing, such as one of the rows for the answer key. We stopped doing that some time ago, and I feel this is also a change for the better.

In the end, accuracy on the part of the participants costs time (which could be spent on other puzzles instead) and should be rewarded. I know competitors who invest a significant amount of time and energy checking and re-checking their results. It feels plain wrong to me to give other solvers the benefit of the doubt when they omitted this step and entered a wrong solution.

In fact, I strongly suspect that I have been lied to in the past – not often, but a few times – when I contacted participants about some of the answer keys. Perhaps some of them even believe this is legitimate: spot a mistake following an inquiry, correct it, and send back a different solution than they originally had. After all, it is something the organizers cannot prove. This is just another reason why I think it is a bad idea.

Now, this article is already longer than I wanted it to be, so let me just go over my favorite puzzles from the set quickly in the final paragraphs. (Warning: Spoilers ahead!) They are (in the order they appeared in the event, for simplicity): Smashed Sums, Star Battle, Cave, Regional Pentominoes, Hakyuu.

The key to the Smashed Sums is the “X-Wing” for the shaded cells in the rows 3 and 5: The large clues yield number cells R4C2, R4C4, R3C6, R4C6 and R5C6, and one can deduce that either the pair R3C2 + R5C4 or the pair R3C4 + R5C2 must contain shaded cells. Either way, R2C4 and R6C4 must remain empty (or else the clue for the central column would be violated), and now the clues of 1 and 3 take over.

After playing around with the initial configuration, I figured out that one can place all the shaded cells (and all the 1’s as well) using the clues on the left of the grid and the 11/12/13 combination at the top. The remaining two clues are only required to determine the larger entries. I decided to give clues of 5 and 6 in the columns 3 and 5 because of the strong visual impression of the clue arrangement.

The Star Battle plays a lot with the outer three rows/columns on each side. I will not go over every single step, but in the intended solving path one has to go around the grid a couple of times, slowly making progress by eliminating potential positions for the stars and determining their exact locations only at a rather late stage.

In an earlier draft of the puzzle, the top-left region did not include the cell R3C4. It seems a tricky step exploiting column 4 is now needed in addition to everything else, but I still prefer the harder version because it avoids using the same formation – and, consequently, the same solving steps – both in the top-left and the top-right corner.

The Cave also went through several earlier drafts. Clearly, one can work a lot with the 3/5 clue combinations in the top half and the 4/6 combinations in the bottom half; sooner or later the 9’s in the corners come into play. In particular, one finds that there must be exactly one shaded cell in each boundary row/column, with a beautiful rotational symmetry around the grid, to satisfy the four corner clues.

I liked this argument a lot because it allows jumping between the top row and the bottom row a couple of times. (I had attempted to construct something similar in a Four Winds puzzle but it occurred to me that the theme was easier to implement in a Cave puzzle.) The bad news is that these jumps are not actually required to solve the puzzle logically; this is one of the short cuts I mentioned earlier. There is enough left of the intended solving path, though, hence I decided to keep this version.

Designing the Regional Pentominoes started, as you can guess, with the 2/0/2/2 regions. Each of the 2-shaped regions can only contain one of the pentominoes U, V and Z, hence those can all be eliminated as candidates from the other regions. Based on this step I wanted to build a solving path that used many different kinds of techniques, but somehow I did not get it to work.

The next crucial step addresses the location of the X pentomino. It can only lie inside the 0-shaped region, eliminating a couple of cells around it and in particular in the region to its left. At this point, another pentomino group is forming (I, L, T and Y) which can only be used in certain regions, so it is basically the same argument over and over again. Still, I liked the final layout a lot.
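The argument that repeats here is essentially a naked-subset elimination over candidate sets: if some group of k pentominoes can only go into k regions, those regions are locked, and the group can be struck from every other region's candidates. A short sketch of that rule follows; the region names and candidate sets are illustrative placeholders, not the actual contest layout.

```python
def naked_subset(candidates, group):
    """If exactly len(group) regions are restricted to `group`,
    remove those pentominoes as candidates from all other regions."""
    locked = [r for r, c in candidates.items() if c <= group]
    if len(locked) == len(group):
        for r in candidates:
            if r not in locked:
                candidates[r] -= group
    return candidates

# Illustrative candidate sets (hypothetical, not the real puzzle):
cands = {
    "two_a": {"U", "V", "Z"},
    "two_b": {"U", "V", "Z"},
    "two_c": {"U", "V", "Z"},
    "zero":  {"U", "V", "X", "W", "P"},
    "other": {"Z", "I", "L", "T", "Y"},
}
naked_subset(cands, {"U", "V", "Z"})
print(sorted(cands["zero"]))   # ['P', 'W', 'X']
print(sorted(cands["other"]))  # ['I', 'L', 'T', 'Y']
```

With the three 2-shaped regions each restricted to {U, V, Z}, one application removes all three pentominoes from the rest of the grid, and the later I/L/T/Y step is another application of the same rule.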

Finally, the Hakyuu. It uses an interesting global step as a starting point: There are 15 regions of size 4 or larger, and since the grid has only 12 columns, one must find three columns to accommodate two 4’s each (in the top and bottom row, for lack of alternatives). It turns out there is only one possible formation for this, and once a couple of 4’s have been located, the next logical steps are available.
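The opening step boils down to a pigeonhole count, which the following snippet spells out. Every region of size 4 or larger must contain a 4, and each column can hold at most two 4's (which, by the Hakyuu spacing rule, forces such a pair into the top and bottom rows). The numbers come straight from the description above.

```python
large_regions = 15   # regions of size 4 or larger, one 4 in each
columns = 12         # width of the grid

# With at most two 4's per column, the overflow beyond one-per-column
# tells us how many columns must carry a second 4:
doubled_columns = large_regions - columns
print(doubled_columns)  # 3 columns take two 4's (top and bottom rows)
```

Three columns with a fixed top/bottom pair of 4's is a strong constraint, which is why locating even a few of them opens up the next steps.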

Again, there is a short cut which uses only local arguments to arrive at – more or less – the same conclusion. Plus, there is more than one way to proceed after the 4’s. This is not unusual for a Hakyuu, which is generally a very rigid puzzle style (meaning that the solving path often tends to develop a life of its own, with a unique solution emerging from different directions).

In the end, Ulrich’s solving path looked completely different from mine, but once again I accepted it without further ado. Apart from being under some time pressure, I was not sure if it was possible at all to tweak the first draft into something that would use the intended solving steps and no others.

So much for the puzzles (and other aspects) of the qualifier. I do not know whether I will do this again anytime soon. Anyway, thanks to all the participants; I hope you enjoyed the contest.
