
Feb 28, 2022

Wordle Dashboardle: Beating Wordle with Tableau

By: Donovan Gleeson

Over the first few months of 2022, I’ve been on a bit of a Wordle craze. In early January, the game already had as many as 300,000 daily players. As soon as I learned about the daily word puzzle, I immediately started wondering how I could use data to optimize my guessing strategy. 

A short Wordle recap

First, a quick catchup for those who haven’t heard of Wordle: It’s a daily word puzzle designed by Josh Wardle, where players attempt to guess a five-letter word (or “Wordle”) in six or fewer tries.

After each guess, Wordle will give you hints in the form of colored tiles. These tiles are: 

  • Gray: In this case, the letter indicated by the gray tile is not in the Wordle. For example, if the Wordle is SAUCE and you guess RULES, the letters R and L will be gray.
  • Yellow: Here, the letter indicated by the yellow tile is in the Wordle but in the wrong position. For example, if you guess BREAD while the Wordle is EMBER, the B, R, and E in BREAD will be yellow, because all three appear in EMBER but at different positions. 
  • Green: The green tile shows that the letter is in the Wordle and in the correct position. For example, if the Wordle is EMBER and you guess EMAIL, the E and M in EMAIL will be green.

You then incorporate this information into future guesses in the hope of getting it right in no more than six attempts.
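The tile rules above can be sketched in a few lines of Python. This is a minimal illustration rather than the game's actual code: the function name `score_guess` is hypothetical, and the handling of repeated letters (each letter in the Wordle can "pay for" only one colored tile) is an assumption that goes beyond the three rules described here.

```python
from collections import Counter

def score_guess(guess: str, answer: str) -> list[str]:
    """Return one color per letter of `guess`: 'green', 'yellow', or 'gray'."""
    guess, answer = guess.upper(), answer.upper()
    colors = ["gray"] * len(guess)

    # First pass: mark greens and count the answer letters left unmatched.
    unmatched = Counter()
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            colors[i] = "green"
        else:
            unmatched[a] += 1

    # Second pass: a non-green letter is yellow only if the answer still
    # has an unmatched copy of it (assumed duplicate-letter behavior).
    for i, g in enumerate(guess):
        if colors[i] != "green" and unmatched[g] > 0:
            colors[i] = "yellow"
            unmatched[g] -= 1
    return colors

# The gray example from the list above: R and L come back gray.
print(score_guess("RULES", "SAUCE"))
```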

Beating Wordle with Tableau

If you want to see the final product before reading through how I built it, check out the dashboard on my Tableau Public here.

First, I needed the word list. Because the JavaScript that runs Wordle is publicly accessible, anyone can export the official word list used by the game, which has 12,972 entries (the complete list of valid guesses). A subset of these words are the “answers” (words that have each been assigned a day on which they will be the right answer). 

I pushed this word list through Alteryx to create a Tableau-friendly data source. First, I wanted wide-form data, because, in order to filter data by letters in specific positions, it’s easiest to spread each word across five columns, one per letter. However, in order to count overall letter frequency across all positions, I wanted long-form data, where each word was spread across five rows (one per letter), with a column for the position in the word and a column for the letter at that position. To achieve both objectives, I created a data source in Alteryx that was both wide AND long. As shown below, the final data source had nine columns: one for the word itself, one to flag whether the word would ever be used as an answer (“IsAnswer”), five columns by letter position (the wide aspect), and two additional columns for the long aspect (the final two columns shown). Thus, I was able to filter using wide data while analyzing the letter pool using long data. 
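For readers without Alteryx, the same wide-and-long shape can be sketched in plain Python. This is an illustration of the layout only, not the Alteryx workflow itself; apart from “IsAnswer”, the column names (`Word`, `Letter1` to `Letter5`, `Position`, `Letter`) and the function name are hypothetical.

```python
def to_wide_long(words, answers):
    """Build five rows per word, each carrying the wide and long columns."""
    answer_set = {w.upper() for w in answers}
    rows = []
    for word in words:
        word = word.upper()
        # Wide aspect: one column per letter position, repeated on every row.
        wide = {f"Letter{i}": ch for i, ch in enumerate(word, start=1)}
        # Long aspect: one row per (position, letter) pair.
        for position, letter in enumerate(word, start=1):
            rows.append({
                "Word": word,
                "IsAnswer": word in answer_set,
                **wide,
                "Position": position,
                "Letter": letter,
            })
    return rows

# Tiny illustrative input, not the real 12,972-word list.
for row in to_wide_long(["crane"], answers=["crane"]):
    print(row)
```

Each word yields five rows of nine columns, so filters can use the `Letter1` to `Letter5` columns while frequency counts use `Position` and `Letter`.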

Early Iterations

For my early iterations, I only looked at overall letter frequency and letter frequency by position. Here, Tableau showed me that A, E, and S were by far the most common letters used, followed by O, R, I, L, T, and N. As I expected, lots of vowels. 

I then broke it down to see how letter frequency changed by position in the word. 

S is a common first letter, but it’s an even more common last letter, due to pluralizing a lot of shorter words. 
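Both frequency views can be reproduced outside Tableau with a pair of counters. The sketch below is only illustrative: `letter_frequencies` is a hypothetical helper, and the four sample words stand in for the real word list.

```python
from collections import Counter

def letter_frequencies(words):
    """Count letters overall and separately for each of the five positions."""
    overall = Counter()
    by_position = [Counter() for _ in range(5)]
    for word in words:
        for i, ch in enumerate(word.upper()):
            overall[ch] += 1
            by_position[i][ch] += 1
    return overall, by_position

# Illustrative sample only; the real analysis used the full word list.
sample = ["cares", "stone", "trees", "slate"]
overall, by_position = letter_frequencies(sample)
print(overall.most_common(3))   # most frequent letters across all positions
print(by_position[0]["S"])      # how often S is the first letter
print(by_position[4]["S"])      # how often S is the last letter
```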

Using these frequencies, I could narrow my guessing to strategic words containing most of the commonly-used letters. Since the aim of the game is to guess the Wordle in as few attempts as possible, guessing strategic words seems logical.  

A real challenge

However, once I made my first guess, I had new information provided by the color-coded tiles. I wanted to incorporate that information into my second guess, which led me to create an interactive filtering dashboard that adjusts the frequency distributions based on all prior information learned. 

To test my filtering dashboard, I used the web archive of games from previous days. As mentioned earlier, Wordle provides three types of hints indicated by the colored tiles. I filtered by parsing free-text string parameters, rather than standard filters, as Tableau’s filter defaults use “equals” logic rather than “contains” logic. Using parameters allowed me to work around this default logic and provided a more user-friendly experience. 

I set up three different sets of parameters: 

  • Gray parameter: With this parameter, I can perform a row-level calculation that tests EACH word to determine if it contains a letter in the gray parameter string. If the word contains ANY of the letters in the parameter, that word is dropped.  
  • Yellow parameters: Since a yellow tile means the target Wordle contains that letter, we first need to filter to only words that CONTAIN the letters from each yellow parameter. However, a yellow letter also means that the letter is NOT at that position, meaning we simultaneously need to filter to only words that do NOT have any of the yellow letters at those positions.
  • Green parameters: Finally, we can use the green parameters to filter the word list based on letters appearing at specific positions. So, if a green tile shows that the Wordle starts with an E, we can filter the list to include only words that start with an E.

By adding the new hint information to these parameters after each guess, Tableau filters the list of remaining possibilities to help me guess the Wordle more easily. 
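The three parameter rules translate into a single row-level test. The Python below is a sketch of that logic, not the Tableau calculation from the dashboard: the function name `matches_hints` and its argument layout (one gray string, plus one yellow string and one green letter per position) are assumptions made for illustration. It follows the gray rule exactly as described, so it does not handle a guess in which the same letter is gray in one position and colored in another.

```python
def matches_hints(word, gray="", yellow=("",) * 5, green=("",) * 5):
    """Return True if `word` is still possible given the hints so far."""
    word = word.upper()
    # Gray: drop the word if it contains ANY gray letter.
    if any(ch in word for ch in gray.upper()):
        return False
    for i in range(5):
        # Yellow: the letter must be in the word, but not at this position.
        for ch in yellow[i].upper():
            if ch not in word or word[i] == ch:
                return False
        # Green: the letter must be at exactly this position.
        if green[i] and word[i] != green[i].upper():
            return False
    return True

# Illustrative hint values: A, I, L gray; R yellow in position 2;
# E and M green in positions 1 and 2.
candidates = ["EMBER", "EMAIL", "AMBER", "EMCEE", "EMBED"]
remaining = [
    w for w in candidates
    if matches_hints(w, gray="AIL",
                     yellow=("", "R", "", "", ""),
                     green=("E", "M", "", "", ""))
]
print(remaining)
```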

Optimizing guesses

Although the above method allowed me to narrow down the available guesses considerably, I wondered if I could also use Tableau to determine which of the possible remaining guesses were “better” than others. 

On a basic level, strategic guesses are ones that gather as much information as possible. Thus, it makes sense to guess words with commonly-used letters, to maximize the “narrowing” of the word pool with each guess. Example: JUDGY doesn’t capture nearly as much of the pool as CARES, because the letters are far less common. 

Let’s say there are 20 words left that meet all established hints and criteria, meaning 100 total letters in the “letter pool”. Suppose the guess was BRAVE. First, we count the total number of B’s in that pool of 100 letters and get 7. We do the same to get the number of R’s (6), A’s (8), V’s (2), and E’s (14). Summing those counts, we get a cumulative letter frequency of 7+6+8+2+14=37. Thus, guessing BRAVE reveals information about 37 of the 100 remaining letters in the letter pool. 

Conversely, if another possibility was HABIT, with a (hypothetical) cumulative letter frequency of 21, it would not be ranked as highly as BRAVE, as it only reveals information about 21 of the 100 remaining letters. 

The tiebreaker is a similar process, but it looks at letter frequencies only by position in the word, rather than across all positions. (Example: SHEAR and HARES will always have identical cumulative letter frequencies, but HARES is a more strategic guess, as S is an extremely common last letter.) 
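The ranking and its tiebreaker can be sketched as a sort on two scores. This is a minimal illustration under stated assumptions, not the dashboard's Tableau calculation: `rank_guesses` is a hypothetical name, the word pool is made up, and a letter that appears twice in a guess is counted twice, which the description above does not specify.

```python
from collections import Counter

def rank_guesses(remaining):
    """Sort remaining words by cumulative letter frequency, then by
    position-specific letter frequency as the tiebreaker."""
    words = [w.upper() for w in remaining]
    # The "letter pool": every letter of every remaining word.
    overall = Counter(ch for w in words for ch in w)
    by_position = [Counter(w[i] for w in words) for i in range(5)]

    def scores(word):
        cumulative = sum(overall[ch] for ch in word)
        positional = sum(by_position[i][ch] for i, ch in enumerate(word))
        return cumulative, positional

    return sorted(words, key=scores, reverse=True)

# Illustrative pool only. SHEAR and HARES tie on cumulative frequency,
# so the positional score decides their order.
print(rank_guesses(["SHEAR", "HARES", "CARES", "JUDGY"]))
```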

In closing

Hopefully, this fun exercise illustrates how Tableau can help you solve the daily Wordle puzzle. As a reminder, the point of the game is to solve the Wordle by yourself, so you probably shouldn’t count your score if you’re using the dashboard. 

That said, the dashboard includes a link to an archive of past Wordles, so try it out on an old one you don’t remember doing and see how it works! 
