I'm a Rockstar Developer!—Building a Rockstar Language Transpiler
It's been a long-running joke in the developer community that a lot of companies want to hire "rockstar developers." But what does it even mean? Do I need to be able to play an instrument, or, worse yet, be able to sing?
Lately, someone had a really silly idea to actually create a Rockstar language, so that question finally has an answer.
The language’s GitHub repository has a pretty detailed specification, and a bunch of transpilers are linked at the bottom of its readme. This article is a write-up of a few lessons I’ve learned by building a Rockstar language transpiler.
In the first part, we’ll start with the basics: making the gem skeleton, filling it with some base code, and publishing it on RubyGems. The following parts will deal with more problems I’ve encountered while parsing the language. The tool I’m using, Parslet, is unfortunately a bit lacking in both examples and documentation, so I hope this write-up will be useful. All the example code is also available on GitHub.
Generating a Gem Skeleton
First things first: we should have some structure for our project. The easiest way of starting a new gem is to use Bundler to generate a basic skeleton of everything a gem requires to work. All you need to do is to run a simple command:
$ bundle gem kaiser-tutorial
This may ask you a few questions about licenses and so on, and will generate a project folder with all the required files. The next step is to open your .gemspec file and fill in all the missing information there, as Bundler won’t let you publish the gem if there are still TODOs in it. This information is later used on the RubyGems website, so it’s best to put something useful there.
Also, the .gemspec is the place where you should put all the dependencies.
spec.add_development_dependency "pry"
spec.add_dependency "parslet"
Take care to always keep development tools like pry or rspec as development dependencies, so that the gem doesn’t later pollute the user’s Gemfile and Gemfile.lock with unnecessary things or, even worse, with necessary things in severely outdated versions, which has happened to me a few times.
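As a rough sketch of how RubyGems sees that split, the snippet below builds a throwaway specification in memory and lists both kinds of dependencies. The summary and author values are placeholders made up for illustration; this is not the article’s actual .gemspec.

```ruby
require "rubygems"

# Hypothetical in-memory spec; summary and author are placeholders.
SPEC = Gem::Specification.new do |s|
  s.name    = "kaiser-tutorial"
  s.version = "0.1.0"
  s.summary = "Rockstar to Ruby transpiler (tutorial)"
  s.authors = ["Example Author"]

  s.add_dependency "parslet"          # needed at runtime by users of the gem
  s.add_development_dependency "pry"  # only needed while working on the gem
end

puts "runtime:     #{SPEC.runtime_dependencies.map(&:name).inspect}"
puts "development: #{SPEC.development_dependencies.map(&:name).inspect}"
```

Only the runtime list is installed alongside the gem for its users, which is why pry belongs in the second one.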
We also might have to tidy up the skeleton if we entered the gem’s name with a hyphen (-), because Bundler treats the hyphen as a namespace separator and nests the generated files one level deeper (lib/kaiser/tutorial.rb). You do need to keep the main structure intact: a kaiser_tutorial.rb file in the lib directory, and a version.rb in the kaiser_tutorial subdirectory. We could keep the generated nesting, but this layout has one less level and is a bit more readable.
Don’t forget to update the require paths in the .gemspec, spec/spec_helper.rb and bin/console files.
Finally, running bundle inside the project’s folder will install all the dependencies.
Starting with Some Tests, Like a Proper Developer Should
Now that we have a skeleton, it’s time to flesh it out. We’ll use Parslet to help us write all the rules for parsing the Rockstar language files into actual Ruby code. To make sure we follow the specification, we’ll start with writing some tests and then develop the parser along so that they pass. We can be rock stars, but even rock stars have to follow the rules once in a while.
Let’s start with simple variable assignment. The specification has an example:
Tommy was a lean mean wrecking machine
This should assign 14487 to the variable Tommy. Since Parslet parses everything depth-first, it will first parse Tommy into a variable name, and then a lean mean wrecking machine will be converted to a number before it all gets put together. So let’s write a couple of tests for that.
RSpec.describe KaiserTutorial do
  context 'proper variable name' do
    it 'converts a word to a variable name' do
      expect(KaiserTutorial.transpile("Tommy")).to eq "tommy"
    end
  end

  context 'poetic number assignment' do
    it "assigns a number to a variable" do
      expect(KaiserTutorial.transpile("Tommy was a lean mean wrecking machine")).to eq "tommy = 14487"
    end
  end
end
This should be pretty self-explanatory to anyone familiar with RSpec testing: we take an input string, feed it into a KaiserTutorial.transpile method and expect the result to be a snippet of valid Ruby code.
These tests will obviously fail horribly if we try to run them right now, so it’s time to make them work.
Parsing the Rockstar Code with Parslet
The Parslet gem helps us transform text into a tree of names and values. Let’s start with the RockstarParser class first.
module KaiserTutorial
  class RockstarParser < Parslet::Parser
    rule(:proper_word) { match['A-Z'] >> match['A-Za-z'].repeat }
    rule(:proper_variable_name) { (proper_word >> (space >> proper_word).repeat).repeat(1).as(:variable_name) }

    rule(:string_as_number) { match['^\n'].repeat.as(:string_as_number) }

    rule(:poetic_number_keywords) { str('is') | str('was') | str('were') }
    rule(:poetic_number_literal) do
      (
        proper_variable_name.as(:left) >>
        space >> poetic_number_keywords >> space >>
        string_as_number.as(:right)
      ).as(:assignment)
    end

    rule(:space) { match[' \t'].repeat(1) }
    rule(:string_input) { poetic_number_literal | proper_variable_name }
    root(:string_input)
  end
end
This might seem a bit cryptic at first glance, so let’s explain it a bit.
Parslet works with something it calls atoms. Here we’re using match['A-Z'], which matches a single character against a regular-expression character class; str('is'), which matches a literal string; and >>, which chains atoms into a sequence that has to match one after another as we go forward through the input. We also use | (alternative) and repeat.
This makes the rules pretty readable, unlike the pure regular expression form which we would need to use otherwise.
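For comparison, here is a rough sketch of what the same poetic assignment might look like as a plain regular expression. The constant and method names are made up for this illustration, and the pattern only covers the single case from our tests.

```ruby
# Plain-regex counterpart of the parser rules, for comparison only.
WORD  = /[A-Z][A-Za-z]*/          # one capitalized word
SPACE = /[ \t]+/                  # one or more spaces or tabs
NAME  = /#{WORD}(?:#{SPACE}#{WORD})*/
POETIC_ASSIGNMENT =
  /\A(?<left>#{NAME})#{SPACE}(?:is|was|were)#{SPACE}(?<right>[^\n]*)\z/

# Returns the two sides of the assignment, or nil when the line does not match.
def split_assignment(line)
  match = POETIC_ASSIGNMENT.match(line)
  match && { left: match[:left], right: match[:right] }
end

p split_assignment("Tommy was a lean mean wrecking machine")
p split_assignment("Shout Tommy")
```

Even with the pieces named, every new statement type would need another hand-written pattern, and there is no failure report comparable to the one Parslet produces.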
All this is put together in the rule definitions, which are then used by the Parslet::Parser class. You can also nest the rules so that the resulting file is more readable. Finally, the root definition points at the starting point.
Let’s go through the above code line by line.
First, we will handle our proper Rockstar variable name, which is one or more capitalized words. The first rule matches a single capitalized word and the second matches several capitalized words separated by spaces.
rule(:proper_word) { match['A-Z'] >> match['A-Za-z'].repeat }
rule(:proper_variable_name) { (proper_word >> (space >> proper_word).repeat).repeat(1).as(:variable_name) }
The next rule is a poetic representation of a number. It’s called “poetic” because that’s what the Rockstar specification calls it. Here in the Parser class, it’s just everything up to the end of the line, stopping at the newline character.
rule(:string_as_number) { match['^\n'].repeat.as(:string_as_number) }
The next set of rules deals with the actual assignment.
First we declare the keywords our assignment can have: in Rockstar, Tommy was a rebel, Tommy is a rebel, and Tommy were a rebel are all the same expression that assigns the number 15 to the variable Tommy. The poetic_number_literal is then declared in the next rule. As expected, it has a variable name, a keyword and then some value.
rule(:poetic_number_keywords) { str('is') | str('was') | str('were') }
rule(:poetic_number_literal) do
  (
    proper_variable_name.as(:left) >>
    space >> poetic_number_keywords >> space >>
    string_as_number.as(:right)
  ).as(:assignment)
end
The last three rules are just helpers to make everything easier.
We declare a space (or a bunch of spaces really, as .repeat(1) requires at least one matched element) to simplify the rest of the rules.
rule(:space) { match[' \t'].repeat(1) }
Then we make both of our tests pass. The :string_input rule lists the whole assignment and the bare variable name as alternatives, and the last line makes it the root so that the Parslet parser knows what it should look for in the input.
rule(:string_input) { poetic_number_literal | proper_variable_name }
root(:string_input)
Running the parser on its own with Tommy was a lean mean wrecking machine as the input returns the following hash, which matches what we wrote in the :poetic_number_literal rule. The numbers after the strings, @0 and @10, are not really important to us right now, as they just point at where in the input the string was matched.
{
  assignment: {
    left: { variable_name: "Tommy"@0 },
    right: { string_as_number: "a lean mean wrecking machine"@10 }
  }
}
That’s very cool, but also not exactly useful, yet.
Transforming a Tree into Code
To get some actually usable code from the above structure, we need to use another part of Parslet—the Transform class.
module KaiserTutorial
  class RockstarTransform < Parslet::Transform
    rule(variable_name: simple(:str)) { |context| parameterize(context[:str]) }
    rule(string_as_number: simple(:str)) { |context| str_to_num(context[:str]) }
    rule(assignment: { left: simple(:left), right: simple(:right) }) { "#{left} = #{right}" }

    # "Tommy Lee" becomes "tommy_lee"
    def self.parameterize(string)
      string.to_s.downcase.gsub(/\s+/, '_')
    end

    # Each word contributes one digit: its length modulo 10
    def self.str_to_num(string)
      string.to_s.split(/\s+/).map { |e| e.length % 10 }.join.to_i
    end
  end
end
Since this class doesn’t parse the input and instead deals with the tree hashes from the Parser class, it’s even simpler and easier to read.
First, we take a variable name (that can be more than one word) and snake_case it. The simple(:str) part is Parslet’s way of saying that it expects a single value there, not a nested hash or an array. We have to use the one-argument |context| block form to be able to call the class’s other methods from inside the rule.
rule(variable_name: simple(:str)) { |context| parameterize(context[:str]) }

def self.parameterize(string)
  string.to_s.downcase.gsub(/\s+/, '_')
end
Next we make an actual number out of our set of words by taking the lengths of the words, doing a modulo 10 operation on them and joining them together. This means that metal will have a value of 5, while rock and roll will equal 434 and so on.
rule(string_as_number: simple(:str)) { |context| str_to_num(context[:str]) }

def self.str_to_num(string)
  string.to_s.split(/\s+/).map { |e| e.length % 10 }.join.to_i
end
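To make the arithmetic easier to follow, here is a stand-alone sketch of the same word-length rule that keeps the intermediate digits visible. The method names poetic_digits and poetic_number are made up for this illustration and are not part of the gem.

```ruby
# Pairs every word with the digit it contributes (its length modulo 10).
def poetic_digits(text)
  text.split(/\s+/).map { |word| [word, word.length % 10] }
end

# Joins the digits into a single integer, just like str_to_num does.
def poetic_number(text)
  poetic_digits(text).map { |_word, digit| digit }.join.to_i
end

p poetic_digits("a lean mean wrecking machine")
# => [["a", 1], ["lean", 4], ["mean", 4], ["wrecking", 8], ["machine", 7]]
p poetic_number("a lean mean wrecking machine") # => 14487
p poetic_number("a fun programming language")   # => 1318
```

The second number shows the modulo at work: “programming” has 11 letters, so it contributes a 1 rather than an 11.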
Then we take an :assignment hash that has one variable name and one value, and put it together into a valid Ruby string.
rule(assignment: { left: simple(:left), right: simple(:right) }) { "#{left} = #{right}" }
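The assignment rule can interpolate left and right directly because the inner hashes have already been rewritten by the time it runs. The sketch below imitates that bottom-up order with plain Ruby hashes. It is only an illustration of the idea, not how Parslet::Transform is actually implemented, and the rewrite method is a made-up name.

```ruby
# Parslet-free sketch of bottom-up rewriting over a nested hash.
def rewrite(node)
  return node unless node.is_a?(Hash)

  # Children go first, so a parent only ever sees finished values.
  children = node.transform_values { |child| rewrite(child) }

  if children.keys == [:variable_name]
    children[:variable_name].downcase.gsub(/\s+/, "_")
  elsif children.keys == [:string_as_number]
    children[:string_as_number].split(/\s+/).map { |w| w.length % 10 }.join.to_i
  elsif children.keys == [:assignment]
    pair = children[:assignment]
    "#{pair[:left]} = #{pair[:right]}"
  else
    children # no rule applies, pass the rewritten hash up unchanged
  end
end

TREE = {
  assignment: {
    left:  { variable_name: "Tommy" },
    right: { string_as_number: "a lean mean wrecking machine" }
  }
}.freeze

p rewrite(TREE) # => "tommy = 14487"
```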
The last thing that I’m going to show here is the KaiserTutorial module itself, which calls all of our methods. It’s pretty short as well.
require 'parslet'
require 'kaiser_tutorial/rockstar_parser'
require 'kaiser_tutorial/rockstar_transform'

module KaiserTutorial
  def self.parse(input)
    KaiserTutorial::RockstarParser.new.parse(input)
  rescue Parslet::ParseFailed => failure
    puts failure.parse_failure_cause.ascii_tree
  end

  def self.transform(tree)
    KaiserTutorial::RockstarTransform.new.apply(tree)
  end

  def self.transpile(input)
    transform(parse(input))
  end
end
This should not require much explanation, so I’m just adding it here for completeness’ sake. And with this, all our tests should finally pass.
Running the Code
Obviously you shouldn’t just believe me that this will work, so we should test it. We can run the rspec command and see all the tests passing, but there’s also another way: running the console.
I like to update it first so that it runs Pry instead of the basic IRB: Pry is much more useful with code completion and command history. The bin/console file actually should include instructions on how to do this, but let’s show what goes into that file anyway:
#!/usr/bin/env ruby

require "bundler/setup"
require "kaiser_tutorial"
require "pry"

Pry.start
We can then try out our code easily like this:
$ bin/console
[1] pry(main)> KaiserTutorial.transpile 'Mary is a great developer'
=> "mary = 159"
[2] pry(main)>
As you can see, it works as expected.
Extending the Parser and Transformer
Since we have our first element of the Rockstar language working, we should continue by adding something a bit more advanced. First, let’s add a print statement, which in Rockstar looks like this: Shout Tommy. Once again, we’ll start with writing a test for this case:
context 'print statement' do
  it "prints a variable" do
    expect(KaiserTutorial.transpile("Shout Tommy")).to eq "puts tommy"
  end
end
Running this test fails, but in a bit of an unexpected way:
Failures:

  1) KaiserTutorial print statement prints a variable
     Failure/Error: expect(KaiserTutorial.transpile("Shout Tommy")).to eq "puts tommy"

       expected: "puts tommy"
            got: "shout_tommy"
As you can see, the parser thought we were declaring another variable name, which isn’t really what we want to do here, but it doesn’t know that yet. We should add a separate rule for this, and we have to add it to our :string_input rule so that the parser actually uses it.
rule(:print_function) do
  (str('Shout') >> space >> proper_variable_name.as(:output)).as(:print)
end

rule(:string_input) { print_function | poetic_number_literal | proper_variable_name }
One important thing to notice is that the ordering of the rules really matters. Getting it wrong can introduce all sorts of problems later that are very hard to find and debug. We have to test for 'Shout ' before we start trying to make a variable out of it, or our print rule will never get a chance to match. Now we can run the test again and see that it does:
  1) KaiserTutorial print statement prints a variable
     Failure/Error: expect(KaiserTutorial.transpile("Shout Tommy")).to eq "puts tommy"

       expected: "puts tommy"
            got: {:print=>{:output=>"tommy"}}
We got our result from the parser, so we can write another transformer rule to make a Ruby statement out of it. Since our parser returns a :print hash that has only :output in it, the rule is really simple as well:
rule(print: { output: simple(:output) }) { "puts #{output}" }
And this is all we need to make our tests pass.
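The ordering pitfall mentioned earlier can be reproduced without Parslet. The sketch below is a simplified stand-in for ordered choice: it walks a list of name and pattern pairs and returns the first one that matches. The constants and the first_match method are made up for this illustration.

```ruby
# First-match-wins lookup over [name, pattern] pairs, mimicking ordered choice.
NAME_PATTERN = /[A-Z][A-Za-z]*(?:[ \t]+[A-Z][A-Za-z]*)*/
VARIABLE = [:variable, /\A#{NAME_PATTERN}\z/].freeze
PRINT    = [:print, /\AShout[ \t]+#{NAME_PATTERN}\z/].freeze

def first_match(rules, input)
  name, _pattern = rules.find { |_name, pattern| pattern.match?(input) }
  name
end

p first_match([VARIABLE, PRINT], "Shout Tommy") # => :variable
p first_match([PRINT, VARIABLE], "Shout Tommy") # => :print
p first_match([PRINT, VARIABLE], "Tommy")       # => :variable
```

With the variable pattern listed first, “Shout” is simply read as the first word of a two-word name, which is the same thing that happened in the failing test.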
Time for Something Less Boring
Let’s finish the first part of the tutorial by making something a bit more advanced, so it’s less boring. So far all we’ve been doing is parsing single strings, but programs usually have more than one of these. We proceed as always.
First, we write a test:
context 'transpiles multiple lines' do
  let(:input) do
    <<~END
      Jane is a dancer
      World was spinning
    END
  end

  it 'makes multiple lines properly' do
    expect(KaiserTutorial.transpile(input)).to eq <<~RESULT
      jane = 16
      world = 8
    RESULT
  end
end
The squiggly heredocs (<<~END with the tilde inside; that’s what they’re officially called, and they’ve been available from Ruby 2.3 onward) strip the common leading indentation from the text up to the ending keyword, so we can indent it nicely and still treat it as if it didn’t have all these additional spaces at the start of each line.
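A quick way to see the difference is to compare a squiggly heredoc with the older dash variant, which keeps the indentation. The method names below are only there to give the two strings a home.

```ruby
# <<~ removes the indentation shared by all lines of the body.
def squiggly
  <<~TEXT
    jane = 16
    world = 8
  TEXT
end

# <<- only allows the closing keyword to be indented; the body is kept as typed.
def dashed
  <<-TEXT
    jane = 16
    world = 8
  TEXT
end

p squiggly # => "jane = 16\nworld = 8\n"
p dashed   # the same two lines, but each still starts with its spaces
```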
This test will also obviously fail, but let’s look at the error output from Parslet for a moment, as it tells us what the parser tried to do and why it failed.
Expected one of [PRINT_FUNCTION, POETIC_NUMBER_LITERAL, PROPER_VARIABLE_NAME] at line 1 char 1.
|- Failed to match sequence ('Shout' SPACE output:PROPER_VARIABLE_NAME) at line 1 char 1.
|  `- Expected "Shout", but got "Jane " at line 1 char 1.
|- Failed to match sequence (left:PROPER_VARIABLE_NAME SPACE POETIC_NUMBER_KEYWORDS SPACE right:STRING_AS_NUMBER) at line 1 char 9.
|  `- Extra input after last repetition at line 1 char 17.
|     `- Failed to match [^\\n] at line 1 char 17.
`- Don't know what to do with " is a danc" at line 1 char 5.
makes multiple lines properly (FAILED - 1)
First it tried to match a print function, but the input didn’t have ‘Shout’ at the beginning, so it skipped to the next rule, which is assignment. The “line 1 char 17” mentioned in the error message is the newline character, which we explicitly disallowed in the :string_as_number rule, so this fails as well.
Next, the parser tries to make a proper variable name out of what it got, but it would have to be “Jane Is,” not “Jane is,” so that’s not going to work either. The parser has exhausted its rules, so it throws an error.
What do we need to make it work? Obviously, we need to handle lines that end with a newline character. We do that by extending our parser with a few additional rules:
rule(:eol) { match["\n"] }
rule(:line) { (string_input >> eol.maybe).as(:line) }
rule(:lyrics) { line.repeat.as(:lyrics) }
root(:lyrics)
We declare a newline character in the :eol rule to make the next rule look better. Then we take our :string_input rule and make it a :line that might or might not end with a newline character. Finally, we declare :lyrics as a set of repeated lines and make it our root rule (we have to delete our previous root rule as well, since there should only ever be one). If we run our new test again, we can see that the parser works correctly:
  1) KaiserTutorial transpiles multiple lines makes multiple lines properly
     Failure/Error:
       expect(KaiserTutorial.transpile(input)).to eq <<~RESULT
         jane = 16
         world = 8
       RESULT

       expected: "jane = 16\nworld = 8\n"
            got: {:lyrics=>[{:line=>"jane = 16"}, {:line=>"world = 8"}]}
We can see here that the transformer already transformed the contents of the lines into Ruby, but it doesn’t yet know what to do with the :lyrics or the :line itself. Fortunately, that’s easy to fix by extending the RockstarTransform class a little:
rule(line: simple(:line)) { line }
rule(lyrics: sequence(:lines)) { lines.size > 1 ? lines.join("\n") + "\n" : lines.join }
The :line content is already transformed, so there’s nothing else to do there other than simply output it. The :lyrics rule just joins the lines together with newlines and adds an additional newline at the end if we got more than one line. This is so that we don’t have to go back and fix all of our earlier tests by adding an extra newline to their results.
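The joining logic is easy to check in isolation. Here is a plain-Ruby copy of the expression from the :lyrics rule, wrapped in a method whose name, join_lines, is made up for this sketch.

```ruby
# Same expression as in the :lyrics rule, applied to an array of strings.
def join_lines(lines)
  lines.size > 1 ? lines.join("\n") + "\n" : lines.join
end

p join_lines(["tommy = 14487"])          # => "tommy = 14487"
p join_lines(["jane = 16", "world = 8"]) # => "jane = 16\nworld = 8\n"
p join_lines([])                         # => ""
```

A single line comes back without a trailing newline, which is what keeps the earlier single-line tests green.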
With this we’ve successfully made our very basic implementation of the Rockstar language. But we still have things to do.
A Common Variable and Its Assignment
Another type of variable in Rockstar is a common variable. It consists of one of the words a or the (and some others too, but we’re only going to use these two now), followed by a lowercase word. Sounds simple? It’s because it is. At the same time we’re going to introduce another type of assignment to make it more interesting.
Put the love into the heart
This should be parsed into the_heart = the_love. Oh, and by the way, we just described what we need our code to do and we almost got another test ready. Now we need to wrap it up with some RSpec. Isn’t TDD great?
context 'common variable name' do
  it 'converts words to a variable name' do
    expect(KaiserTutorial.transpile("the world")).to eq "the_world"
  end
end

context 'assignment' do
  it 'assigns variables' do
    expect(KaiserTutorial.transpile("Put the love into the heart")).to eq "the_heart = the_love"
  end
end
First off, let’s make some variables—we need to match one of our keywords with a space and then just a word with lowercase letters.
rule(:common_variable_name) do
  ((str('A ') | str('a ') | str('The ') | str('the ')) >> match['[[:lower:]]'].repeat).as(:variable_name)
end
We return this to the Transformer as a :variable_name, so it uses the same rule as it does for the proper variables we introduced before.
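Next, we make a rule for the assignment. We can also reuse the Transform :assignment rule here, so it expects a :left side and a :right side, which we have to mark appropriately, as the order is reversed in the Rockstar assignment. The rule refers to a variable_names helper that accepts either kind of variable name; its definition below is an assumption, as it isn’t shown elsewhere in this article.
# Assumed helper: either kind of variable name
rule(:variable_names) { common_variable_name | proper_variable_name }

rule(:basic_assignment_expression) do
  (str('Put ') >> variable_names.as(:right) >> str(' into ') >> variable_names.as(:left)).as(:assignment)
end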
Finally, we need to update our helper methods for the parser to include our new rules.
rule(:string_input) { print_function | basic_assignment_expression | poetic_number_literal | proper_variable_name | common_variable_name }
This final rule makes our tests pass again.
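To see the reversed sides in isolation, here is a regex-only sketch of the same statement. It is not how the gem does it; the constants and the transpile_put method are made up for this illustration, with the named groups playing the role of the :right and :left markers.

```ruby
# Either a common variable ("the heart") or a proper one ("Tommy Lee").
VARIABLE_NAME = /(?:(?:A|a|The|the) [[:lower:]]+|[A-Z][A-Za-z]*(?: [A-Z][A-Za-z]*)*)/
# The value comes first in Rockstar, so the first group is the right-hand side.
PUT_INTO = /\APut (?<right>#{VARIABLE_NAME}) into (?<left>#{VARIABLE_NAME})\z/

def snake_case(name)
  name.downcase.gsub(/\s+/, "_")
end

# Returns a Ruby assignment string, or nil when the line is not a Put statement.
def transpile_put(line)
  match = PUT_INTO.match(line)
  return nil unless match

  "#{snake_case(match[:left])} = #{snake_case(match[:right])}"
end

p transpile_put("Put the love into the heart") # => "the_heart = the_love"
p transpile_put("Put Tommy into the world")    # => "the_world = tommy"
```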
Publishing the Gem on RubyGems
We are now at a decent point of coding: we have a gem skeleton with our code in it, it does things that we expect it to do, and it even has passing tests. We can finish this post by actually releasing our code into the wild and putting it onto the RubyGems website.
Releasing a gem is a very simple process. All we need is a public git repository with our code and an account on the RubyGems website.
The next step is easy. Thanks to the Bundler gem, rake release will build the gem, create a release tag and push it to our repository on GitHub, and finally push the package to the RubyGems website. It should result in our gem being registered and visible on the website, except that it will throw an error instead.
rake aborted!
ERROR: While executing gem ... (Gem::CommandLineError)
    Too many gem names (/Volumes/Projects/kaiser-tutorial/pkg/kaiser-tutorial-0.1.0.gem, Set, to, http://mygemserver.com); please specify only one
This error comes from a few critical lines that we left in our .gemspec on purpose and that explicitly prevent pushing to RubyGems. For a real gem we would remove them. We can still easily use our gem locally, though: we can run rake install:local instead and then just require 'kaiser_tutorial' where we want:
$ rake install:local
kaiser-tutorial 0.1.0 built to pkg/kaiser-tutorial-0.1.0.gem.
kaiser-tutorial (0.1.0) installed.
$ pry
[1] pry(main)> require 'kaiser_tutorial'
=> true
[2] pry(main)> KaiserTutorial.transpile('Rockstar is a fun programming language')
=> "rockstar = 1318"
[3] pry(main)>
And that’s it for the first part of this article. In the next one, we’ll take a look at some of the more advanced tricks and gotchas of parsing text with Parslet so that we can write and run a pretty simple program in Rockstar. We’ll also use the Thor gem to put together a CLI for our transpiler to make the code usable on its own. And of course, there will be tests for everything.