As a programmer from a non-computer science background, most of the programming I have learned has been on the job. Probably the biggest downside has been a lack of opportunities to dive deeper into the elegance of theory that make the complex compute ecosystem run. Automata theory happens to one such topic.
Write a compiler.. seriously?
The rarity of circumstances under which one might decide to write a new language is understandable. And given this rarity, one might not encounter automata theory at a job. But considering the ubiquitous nature of programming language, the compilers had captured my imagination ever since I compiled the first Pascal program in 1999. Finally, in 2020 the stars aligned and I could dedicate time to scratch the surface of the topic.
Mostly I have noticed developers complaining about the compilers for either strictness or being too “unaware” of what they intend to express (I have done that too). But over the years I came to realize that the compiler is the only link between a developer and the runtime. The compiler is the last predictable scenario that a developer has under control. Once the binary is generated the chaotic universe takes over.
The above piece of C or C++ code technically does nothing, but wait how does one assume that? As a reader, if you happen to know these languages then the syntax gives away the context. How to explain that context to the machines? Welcome to the compilers odyssey. In essence, the conversion of the above code the following machine code is what automata theory allows programmers to accomplish.
push rbp mov rbp, rsp mov eax, 0 pop rbp ret
So study them but why?
The main purpose of this post is not to introduce one to compilers instead convince programmers to dig deeper and understand it for themselves. I have been procrastinating about reading the formal text on compilers for close to 10 years. It always seemed a daunting task that seemed too esoteric to be applicable to the daily programming problems I was solving. Like many who think this way, I was wrong. Compilers are not monoliths, they are a chain of programs that accomplish a complex task.
Compilers work in stages and communicate via the predefined format of messages. Sounds familiar? It should since compilers are programs too. They too parse strings, allocate memory, hold state in data structures, and need efficient algorithms to be used by millions of developers. The mathematics involved is not too difficult to understand. It does take some effort but there are enough resources out there to accomplish the same. Some books are listed here. Another great resource is the course on compiler design by Stanford on Edx.
Compilers deal with string parsing, n-ary tree traversals, quick lookups, graph theory, backtracking (or dynamic programming), and assembly language (which implies some computer architecture basics as well). It is a very vast subject and is actively researched as well. So anyone who has missed studying compilers is recommended to at least take a closer look at this elegant combination of ideas and implementations because it’s COOL.