The Compression Puzzle Challenge
Life is short, exciting and full of small events. Small events can change your daily life and sometimes inspire you beyond your wildest imagination. This is one of those stories when a small and neat idea took off and challenged people beyond my or everyone’s imagination…
My dear friend David brought this small and exciting coding challenge/puzzle to our hackerspace a few weeks ago. A challenge that is, in essence, brutally simple.
Write a program that will take a string “AAABBAAC” and compress it into its compressed form “3A2B2A1C”.
Now. If you’ve ever been to a place where talent is in abundance, where people are creative and where nothing is impossible, you’ll correctly conclude that we had three solutions in the next 3 minutes and that in the next hour, we had five more.
Ten solid solutions in an hour… more than enough, some might say. Done. Nah,… We can do better.
Level up
One of the most appreciated and head to thought virtue to teach a creative person is for sure the “curiosity”, and this little trait that some people have and some don’t is hard to cherish. Make no mistake. It’s worth nurturing. It is also one trait that makes some of the best engineers best at what they do.
As soon as we put our solutions together, the discussions started. Looking at one’s solutions to understand what’s the approach became exciting. And to build on this momentum, people started replicating methods in more languages and began looking for inspiration outside of our loop.
The next logical step was to put it on GitHub, define some basic rules and slowly scale up the operation and the challenge to more people.
This could be opened up to the “whole universe”. But perhaps opening to everyone would also challenge no one. So I decided to pick the very best engineers from my network that I know and have worked with in the past to give this challenge a go. My “selection” was primarily based on languages that might be appealing to wider audiences and might have mechanisms and capabilities that would make solving the given task easier…
Basic approaches and beauty
At this point, I could break down a couple of solutions and compare the approaches that the engineers took and compare these approaches across programming languages. I could also benchmark each solution against the other. I could benchmark solutions within languages, benchmark on small data sets, or find the solution with the minimum memory footprint but the best FP approach. That would all for sure be an exciting comparison. However, I would like to emphasise the elegance of solutions and the readability of the implementation. As I’m pretty sure by now that you, dear reader, already have a solution of your own in your mind.
Let’s start by appreciating three Python solutions submitted by Andraž Brodnik and Luka Kacil. First, one uses mutations basic recursion and is one of few submissions where the author took time to comment on the code. The second one uses Python’s powerful intertools library for “efficient looping” and its groupby function. Although there are many more Python solutions now in the collection, I’m highlighting Luka’s “tail-recursion” solution as a third example. It also highlights the author’s consideration of challenges associated when dealing with recursion.
Looking into the abyss with JavaScript should always be on the table. So it makes also sense to see how we approached one of the most popular languages on the planet. In the first JS example below, you’ll see David’s solution that splits the sequence of input characters and does so-called reduction or folding with reduce() function (a.k.a. fold higher-order function). The author then increments counters accordingly. We can then follow the same theme in the second example submitted by Klemen Kogovšek, where the author also cleverly uses spread syntax (…) to convert string to a sequence of characters and conditional (ternary) operator as an elegant replacement for if…else statement. A similar approach was then also put to “extreme” within Goran Kodrun’s TypeScript solution, which can be found here. The third example in this group comes from me. It uses a regular expression to break together string into groups; for that, it uses the so-called negative lookahead (lookbehind). After these groups are made, it computes the length and picks the character representing the group, following with the nice flatMap to flatten everything into a sequence. After some discussion with other authors, we made a so-called “gentlemen’s agreement” that regular expressions should be considered as cheating and are not encouraged as wanted. :)
At this point, you get the gist. It’s time to look into other popular languages.
For the sake of elegance. Let’s compare modern C++, C# and Go next. The C++ opens with string::reserve and clever usage of mutation with string::push_back. The following C# submission pops with while loop and Go nails with its powerful rune function. It was submitted by Nejc Ilenič, Peter A. Pirc and Tit Petrič respectfully. However, the “rune-less” Go solution was also submitted by Mitja Živković.
Now that we sifted through the OOP and not-so-oop solutions, it’s time to uncover more powerful FP solutions and solutions somewhere between those two. So we have a neat comparison of F#, Clojure and Scala. Examples submitted by Peter Keše, Simon Belak and myself. In F# and Scala, you’ll first notice the usage of pattern matching to deconstruct strings and their “tails”. In Simon’s Clojure example, however, you should also see the use of partition-by that applies f to each value in the collection, splitting it each time f returns a new value. Clojure also nicely shows transduce and mapcat on top of collections.
Now that we understand that splitting the string when the sequence changes from one letter to another could be helpful, we can start getting the grip of the following three examples submitted by Simon Žlender. First two in Elixir and the last one in Rust. In Elixir, we can see more precise usage of pattern matching and Enum#chunk_while, and in Rust, we can get some power from its Peekable - next_if_eq.
Hold on, Oto! Did you get any FAANG engineers to do this? Sure.
These two Kotlin solutions were respectfully submitted by Jernej Virag (Google) and Miha Novak (ex-Outfit7). Although the solutions both follow similar patterns to other patterns above with their usage of the fold, reduce and map, we can also see Kotlin’s let function. A standard-library extension function to the Template class takes a lambda function as a parameter, apply a contract on it and ultimately return the execution of the lambda we passed as a parameter to it.
Exotic beauty
Now, before we continue, I would like to emphasise that I’m a firm believer that beauty is, as they say, in the eye of the beholder. And that beauty comes in so many shapes and forms, and perhaps language itself may not be your cup of tea. It might be about other things that make it unique.
Martek nicely mixed pattern matching, folding (foldr), and monadic binding/composition operators (=«). The first round of such solutions should be given to Marek Fajkus’s Haskell solution and Klemen Kogovšek’s ReScript variation. Klemen shows us that modern superscripts such as ReScript can bring new capabilities even to JavaScript so we can see pattern matching / destructing something that is “impossible” on common terms.
Blushing a bit. One of the languages I had never seen before that blew me away doing this exercise was a language called Red. Red is a programming language designed to overcome the limitations of the programming language Rebol. It is a so-called successor, designed to replace his older brother. Below you’ll find three solutions, written by Boleslav “Rebolek” Březovský, Gregg Irwin and Boris. One of the themes that make Rebol, Red and these solutions “unique” is that they are all “syntax free”. The theory being as you become fluent, you end up writing code in sentences, somewhat similar to human languages. Because our brains are well optimised for sequences of words, Rebol sentences feel natural, and you become more productive.
Now that you’re familiar with Rebol and its clean syntax, I also invite you to take a peek at Krištof Črnivec’s Ruby solutions. In both of his solutions, you’ll see the usage of modern Ruby’s numerated parameters, conditional slicing with Enumerable#slice_when, Enumerable#chunk_while and tally for counting.
Exotic exotic beauty
Although the solutions above are all awe-inspiring on their own, they are still written in languages that people use and know. But there is also this submission that blew my mind. My friend Janko Metelko has been working on his new programming language for the last couple of years. If you know anything about programming, you’ll appreciate how very hard it is to design and build a helpful programming language. I’m incredibly privileged to be able to invite Janko to give his new programming language as a spin. Please enjoy the compression puzzle implemented in a new programming language named Rye.
Janko reimplemented the solutions seen in Python, Go and finished in style by adding support for higher-order functions to the Rye language so that he could submit the third solution.
Impressive. Stats?
Today - 4th of March 2022 - we’ve collected 45 solutions spanning 19 programming languages written by 23 authors. Most of the authors were invited to participate, and many authors submitted several solutions in several languages.
The next step
Now that we established basic rules, procedures and GitHub actions with the CI pipeline (via Nix and Docker), the next step is to invite you, dear reader. What have we missed? What deserves to be among the best? It’s an easy challenge, after all. :)
P.s: yes, there is also SQLite solutions in the repo. But PHP, Java, Brain** and Assembler are missing…
Summary
I’m deeply thankful to everyone who took the time and gave their best shot. This is already a good story to tell! Thank you!
Share if you care, join if you dare.
This article was also published on my LinkedIn profile as an “The Compressin Puzzle Challange.”