Post by FurbyFubar in Optimal starting guesses? I did an analysis and here's the result!

But a binary search is the way to go when it's possible. Sure, this isn't a sorted list, but starting by cutting the search space in half still sounds great to me, as it's the closest we can get to a binary search. Guessing a single E will always remove about half the possible answers from the list (with the info from just that one E, whether it turns out to be there or not), so that must be the most info we can (always) get from a single letter in the first word.

So yes, you will get more info if you guess another letter than E and find it. But you will get less info in every case where you don't find that letter. Going for a split as close to 50/50 as possible will result in the fastest search on average. For example, think of teaching a robot to look up a word in a strange dictionary with one word per page. If we tell it to open the dictionary in the middle and look at the word, then it can remove 50% of the book no matter what. For the next step it looks at the middle of the pages it was left with, and repeats this until it's either looking at the word it's looking for, or it tries to go half a page forward (and thus knows that the word it's looking for wasn't in the dictionary). This can always be done in O(log n) steps, where n is the number of pages in the dictionary.
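The robot's procedure is an ordinary binary search. Here is a minimal sketch of it; the function name `binary_search` and the eight-word "dictionary" are made up for illustration, and it also counts how many pages get opened.

```python
from typing import Optional, Sequence, Tuple


def binary_search(pages: Sequence[str], target: str) -> Tuple[Optional[int], int]:
    """Return (index of target or None, number of pages opened).

    `pages` must be sorted, one word per page.
    """
    lo, hi = 0, len(pages) - 1
    steps = 0
    while lo <= hi:
        mid = (lo + hi) // 2          # open the middle of what is left
        steps += 1
        if pages[mid] == target:
            return mid, steps
        if pages[mid] < target:
            lo = mid + 1              # target can only be in the later half
        else:
            hi = mid - 1              # target can only be in the earlier half
    return None, steps                # ran out of pages: word is not in the book


# Hypothetical eight-page dictionary.
pages = sorted(["apple", "brick", "crane", "dwarf", "eagle", "flint", "grape", "house"])
print(binary_search(pages, "flint"))  # found on the second page opened
print(binary_search(pages, "zebra"))  # not found after four pages opened
```

With 8 pages the robot never opens more than 4 of them, which is the O(log n) behaviour described above.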

The analogy here is that guessing a letter that's in half the possible answers is like the robot choosing to look at the center of the dictionary first. If we guess an uncommon letter in Wordle that's in 10% of words, that's like programming the robot to open the dictionary 90% of the way towards the end and hoping that the word it's searching for still comes later in the dictionary, because then we've made the search even faster! Well, if it's lucky and its word is in the last 10%, then we have made that search faster. But we can't ignore that in 90% of cases the random word it's looking for comes in the first 90% of the book. (Yes, I know that a human would assume things about where the word they're looking for is expected to be in a normal dictionary and thus make their first guess better for normal dictionaries, but this is still how a program searches in an ordered list, because it's mathematically proven to be the fastest way to do it when you don't have info about what's in the list.)
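To put rough numbers on the 50% versus 10% comparison, here is a small sketch. It assumes the answer is equally likely to be any word still on the list and that we only learn "letter present" or "letter absent"; the function names are made up for illustration.

```python
import math


def expected_remaining_fraction(p: float) -> float:
    """Average fraction of the list left after one present/absent answer.

    p is the fraction of candidate words that contain the guessed letter.
    With probability p we keep a fraction p; with probability 1 - p we keep 1 - p.
    """
    return p * p + (1 - p) * (1 - p)


def expected_bits(p: float) -> float:
    """Shannon entropy (in bits) of a yes/no answer that is 'yes' with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


for p in (0.5, 0.1):
    print(p, round(expected_remaining_fraction(p), 2), round(expected_bits(p), 3))
# 0.5 -> 0.5 of the list left on average, 1.0 bit
# 0.1 -> 0.82 of the list left on average, 0.469 bits
```

So the lucky 10% case cuts the list to a tenth, but averaged over all answers the uncommon letter leaves about 82% of the list, against 50% for the even split.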

I'll admit that it's still *possible* that a single word without an E could result in 5 different pseudo "binary lookups" that, when combined (always or on average), cut the list down more than the 5 lookups of any single word that includes an E would, but that seems unlikely to me.

Of course, this whole argument is all ignoring any info green squares could give.
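One way to check the "5 combined lookups" question on a given list is to bucket the candidate answers by which of the guess's letters they contain, ignoring positions entirely, as the argument above does. This is only a sketch: `presence_pattern`, `expected_remaining`, and the eight-word list are made up for illustration, and the list is far too small to say anything about the real Wordle list.

```python
from collections import Counter
from typing import Iterable, Tuple


def presence_pattern(guess: str, answer: str) -> Tuple[bool, ...]:
    """For each distinct letter of the guess: is it anywhere in the answer?

    Positions (green squares) are deliberately ignored.
    """
    return tuple(letter in answer for letter in sorted(set(guess)))


def expected_remaining(guess: str, candidates: Iterable[str]) -> float:
    """Average number of candidates left after the guess, assuming each
    candidate is equally likely to be the answer."""
    candidates = list(candidates)
    buckets = Counter(presence_pattern(guess, answer) for answer in candidates)
    # A bucket of size n is hit with probability n / total and leaves n words.
    return sum(n * n for n in buckets.values()) / len(candidates)


# Hypothetical toy list, NOT the Wordle answer list.
WORDS = ["crane", "slate", "pious", "mound", "light", "eerie", "tulip", "abbey"]
print(expected_remaining("crane", WORDS))
print(expected_remaining("pious", WORDS))
```

On this made-up list the guess without an E happens to leave fewer words on average, which only shows that the outcome depends on the list; running the same function over the real answer list would be the actual test.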

I'd go to a forum talking about Absurdle to see what starting guess(es) leave(s) the fewest *possible* answers given the worst possible luck (because someone is bound to have done that analysis already). I couldn't find it while trying to speed read on the game's own page https://qntm.org/absurdle And the first result I found with Google was some guy on a blog NOT using Wordle/Absurdle's word list in their analysis, so yeah - Your mileage may vary.
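For anyone who wants to run that worst-case analysis themselves, here is a rough sketch. It scores a guess with the usual green/yellow/gray feedback and then looks at the largest group of answers that share the same feedback. The function names and the six-word list are made up for illustration, and this is not taken from Absurdle's own code.

```python
from collections import Counter
from typing import Iterable


def feedback(guess: str, answer: str) -> str:
    """Wordle-style feedback string: g = green, y = yellow, b = gray."""
    result = ["b"] * len(guess)
    unmatched = Counter()
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "g"
        else:
            unmatched[a] += 1         # answer letters still available for yellows
    for i, g in enumerate(guess):
        if result[i] == "b" and unmatched[g] > 0:
            result[i] = "y"
            unmatched[g] -= 1
    return "".join(result)


def worst_case_remaining(guess: str, candidates: Iterable[str]) -> int:
    """Size of the largest group of candidates that give identical feedback."""
    buckets = Counter(feedback(guess, answer) for answer in candidates)
    return max(buckets.values())


def best_worst_case_guess(guesses: Iterable[str], candidates: Iterable[str]) -> str:
    """Guess whose worst case leaves the fewest candidates (ties: alphabetical)."""
    candidates = list(candidates)
    return min(guesses, key=lambda g: (worst_case_remaining(g, candidates), g))


# Hypothetical toy list, NOT the Wordle/Absurdle word list.
WORDS = ["light", "might", "night", "right", "sight", "tight"]
print(worst_case_remaining("light", WORDS))   # 5 words share the feedback "bgggg"
print(worst_case_remaining("storm", WORDS))   # 3 words share the feedback "bybbb"
print(best_worst_case_guess(["light", "storm"], WORDS))
```

Feeding it the real word list and every allowed guess would give the worst-luck ranking; the toy list is only there to make the sketch runnable.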

randytayler 3 years ago

I can't back my argument up with big O notation or such, but I don't feel like this truly compares to a binary search. Is it because I only do hard mode?

(My gut feel is unscientific, and thus my hypothesis is pretty damn dubious, but let me provide one more useless piece of anecdotal data: have you seen the Wordle solver that lets you put in the final word, and it tries to guess it? I match or beat it 4 out of every 5 times. But maybe it's not the best solver?)

Maybe it's this: In hard mode, you HAVE to use the letters you've found, right?

On my first guess, I find there is an E somewhere.

Now, because I have to use that E IN MY SUBSEQUENT GUESSES, I only have four spaces per guess now, and I still don't know where the E goes.
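The hard-mode restriction can be written as a small check. This sketch assumes the rule is "green letters stay in their spot, yellow letters must appear somewhere in the next guess"; the function name and the g/y/b pattern strings are made up for illustration.

```python
from collections import Counter


def allowed_in_hard_mode(next_guess: str, prev_guess: str, prev_pattern: str) -> bool:
    """Check a guess against the hints from the previous guess.

    prev_pattern has one character per letter of prev_guess:
    g = green, y = yellow, b = gray (assumed encoding).
    """
    required = Counter()
    for i, (letter, mark) in enumerate(zip(prev_guess, prev_pattern)):
        if mark == "g":
            if next_guess[i] != letter:
                return False          # green letter must stay in place
            required[letter] += 1
        elif mark == "y":
            required[letter] += 1     # yellow letter must be reused somewhere
    have = Counter(next_guess)
    return all(have[letter] >= n for letter, n in required.items())


# SIREN found a yellow E, so every later guess has to spend a slot on E.
print(allowed_in_hard_mode("ghoul", "siren", "bbbyb"))  # False
print(allowed_in_hard_mode("ledge", "siren", "bbbyb"))  # True
```

Every guess that fails this check is off the table in hard mode, including guesses that would have tested five fresh letters.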

Blah. I'm not very convincing. Maybe what it comes down to is letter combos. Like if you rule out H, you also rule out CH, SH, TH, PH, WH, and GH. Those are two-letter combos ruled out for the price of one letter. With those combos gone, the likelihood of each companion letter goes down, too! (Or if you find an H, you're statistically way more likely to have one of those combos than not. (I think.))

But E... E is ubiquitous. It can go just about anywhere, with any adjacent letter.

In short, I think this game can't be simplified to binary searches. Words have patterns that shortcut things faster than a 50/50 search on one letter. That's my hypothesis.

dfmchfhf 3 years ago

concur: without the restricted wordlist and in hard mode, it does feel like locking in a letter hurts a lot; many of the popularised starts can easily get trapped after the first guess:

* SIREN gets trapped by [B/C/D/F/K/L/M/N/P/S/T/V/W]INES

* AROSE/SOARE gets trapped by RA[C/G/J/K/L/P/T/V/X/Z]ES (and maybe RARES/RASES) as well as S[C/E/H/N/P/T/W]ARE

* SALET gets trapped by [D/F/H/M/P/R/V/W/Z]EALS (but not TEALS or SEALS)

* ORATE/ROATE gets trapped by TA[B/K/L/M/P/S/V/W/X]ER (and maybe TATER)

most, if not all, of the "best strategy" presentations depend on the reduced solution list to avoid corridor traps ([B/D/E/F/H/L/M/N/R/S/T/W]IGHT is reduced to [E/F/L/M/N/R/S/T/W]IGHT for example; also almost all S-terminal plurals and present verbs are not on the list). given that, by an exhaustive search posted by Alex Selby, the best word averages 3.42 guesses and the worst word averages 4.10 guesses (in normal mode), it's probably more important not to fall into any traps while discovering information on hard mode, since the worst average has 1.90 guesses to spare (noting though that the worst guess ends up falling back to the 'optimal' guess in half the cases, and fails to guess the word some of the time).
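a rough sketch for spotting corridor traps in whatever word list is in use: group words that are identical except for one position and keep the big groups. the function name, the `min_size` threshold, and the non-IGHT filler words are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def corridor_traps(words: Iterable[str], min_size: int = 3) -> Dict[str, List[str]]:
    """Map each one-blank pattern (e.g. '_ight') to the words that fit it,
    keeping only patterns shared by at least `min_size` words."""
    groups = defaultdict(list)
    for word in words:
        for i in range(len(word)):
            groups[word[:i] + "_" + word[i + 1:]].append(word)
    return {pat: sorted(ws) for pat, ws in groups.items() if len(ws) >= min_size}


# the reduced-list IGHT family mentioned above, plus a few unrelated words.
WORDS = ["eight", "fight", "light", "might", "night", "right", "sight",
         "tight", "wight", "crane", "crate", "slate"]
traps = corridor_traps(WORDS)
for pattern, members in traps.items():
    print(pattern, len(members), members)
```

on hard mode, any pattern with more members than remaining guesses is a possible loss once four letters are locked in.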

from the human side of things, digraphs probably do play a significant role in making guesses; if R, H, T, S, L, and N are all ruled out, for example, you can be fairly certain the word has at least two vowels, without having to guess the vowels and restricting further guesses.

it doesn't feel like a binary search is optimal in hard mode, since confirming a letter reduces the ability to get more information, unless the search space is sufficiently small (which is true with the restricted word list, but not as much with the full word list)

(personally, CRWTH is my starter of choice, followed with confirmed letters with S+L+P / S+L+N / S+N+D)

FurbyFubar 3 years ago

I absolutely agree that you have a good point if we're talking about hard mode! In hard mode finding letters is not only the goal of the game, but also something that hinders you in achieving that goal. So finding common letters later can clearly be an advantage then, especially if the strategy you're running is to set aside some number of guesses at the start for letter-hunting.

This is just my gut feeling, but in English it feels like finding a vowel is less helpful for narrowing down what letters can go next to it than finding a consonant, especially an uncommon consonant. The only languages I really know are Swedish, English, and JavaScript, but it feels like the irregular spelling of English words relative to their sounds means that vowels can go anywhere they damn please given just one or two other fixed (green) letters around them. Consonants following each other can do weird things on occasion, but if you have a word starting with T, you know that if it's not a vowel coming up, it's pretty much sure to be H, R or W. T's in the middle of a word add C and another T as other common possibilities. But for vowels, loan words with spellings from languages that treat vowels differently mean that such rules are much fewer and much more frequently broken. Very often when I get stumped trying to figure out what word could possibly fit with just one unknown and four greens, the reason I overlooked the word is that I pronounced it wrong when I checked the possibilities. This is less of an issue (but can still happen) when I play word games in Swedish, and I don't think it's mainly because it's my first language; having had a spelling reform, even if it was in 1906, means that Swedish is a bit more regular in how it uses its vowels.
