recursive descent go parser workflow

06-26-2014

The parser is pretty much done :D I only have a couple non-terminals left to finish, and of course all of the bugs to iron out, but I never thought I would get to this point! So happy right now.

To sort of immortalize how I spent the last two weeks of my time, I’m documenting my workflow for writing the parser. I noticed I very quickly fell into a couple of patterns, and I fell into a certain flow of work with it. Which I thought was cool!

If you want to dig into the code, it’s hosted on GitHub.

grammar.txt

First I delineate which part of the grammar I want to implement. After writing the test cases for it, I determine the top level terminal sets for the non-terminals. A top level terminal set is the set of all terminals that a non-terminal can start with (I’m also 90% sure I made the phrase up.) For instance, the top-level terminal set for Index is “[“, and the top level terminal set for PrimaryExprPrime is the union of the top level terminal sets of Selector, Index, Slice, TypeAssertion, and Call.

toplevelsets.go

As you can see, top level sets can be very clearly defined in Go as slices of tokens (if you are unfamiliar with Go, slices are lightweight, resizable arrays.) The append calls simply take a slice and append any number of arguments to that slice. The “…” syntax takes a slice of tokens and splats it as a set of arguments. I use those together to create slices of tokens from other slices. I don’t worry about redundancies: I’m more concerned with communicating the meaning than having the top level sets be as efficient as possible.

This work is done in a top-down fashion: I write the most general of the non-terminals’ top level sets first, creating them from the top level sets of their child grammars.

toplevelsets.go

You can see how the specification for non-terminals is clearly reflected in the syntax for the top level sets in Go.

Creating the top level sets is not challenging, and is more of an exercise in remembering which top level sets are plain tokens, and which are slices of tokens, so I can combine them with the correct syntax.

nodes.go

Then I create all of my node types. Nodes are defined by the following interface:

type Node interface {
        Valid() bool
        Eval()  string
}

Interfaces in Go are similar to interfaces in Java, or pure abstract classes in C++. For a type to fulfill the interface Node, it needs the function Valid() that returns a bool, and Eval() that returns a string. Valid is used to check that a node and its child nodes are constructed correctly. The Erro type always returns false, terminals and simple nodes return true, and all other nodes return true if all of their child nodes also return true and the node itself is well formed. A well formed node is a node that has all the children that are required of it by the grammar.

At first, I did not have the Valid method defined for nodes, but I needed to define it in order to implement backtracking later in the project. By the time I realized I needed the Valid method, I had already defined forty different nodes. It took a lot of willpower and about an hour of emacs/regex magic to create all of the Valid methods :)

The nodes themselves follow pretty easily from the grammar. Almost every non-terminal has its own node because I didn’t want to “simplify” and reduce the number of nodes I had before I had understood how the grammar worked. I’ll chop them down in a refactor.

This is also done in a top-down fashion: I create the more general nodes first and define them as combinations of their child nodes (though the Node type is general, the names I use for a node’s children are significant.)

grammar.go

This is core of the parser, the bit that is actually recursive descent :) This part is also the most fun to write. Each non-terminal gets a function with the same name, but starting with a lowercase letter. Keeping the names consistent with the grammar.txt makes writing the grammar really easy, and makes it easy to see how the grammar function follows from the specification.

If you look at ArgumentList, you can see it consists of an ExpressionList and an optional “…”. This is reflected in the function directly: First, and Args node is created, and then expressionList is used to create the Args’ child node, Exprs. Then, you check to see if the next token is “…”. If it is, then the token is thrown away with the p.next() call, and the DotDotDot field is set to true for Args. Then the node is returned.

Notice how I didn’t check to make sure that it was valid to create an ExpressionList by calling expressionList. Each function assumes that it is in the correct state when it starts, and you must check to make sure that you can start parsing a node before you call the function to create it. This invariant simplifies the code, and is a lesson I learned from writing my lexer, where I did error-checking before and after entering new states.

conclusions

To quote a fellow hacker schooler, “parsers take a while.” This took two weeks, and while I’m happy with how much it parses, it doesn’t parse the entire language, and I’ve yet to shake out all of the bugs. But I’m incredibly happy with what I’ve finished: there was a point where I didn’t think I would finish the parser at all, especially towards the end as I started tackling larger parts of the grammar like statements and expressions. Can’t believe I made it through :) I learned much more about dealing with frustration and tedious work than I thought I would. Parsers are a lot of work!

Having wrapped up my parser, I will never feel guilty for using a parser generator in the future :)

an experiment with focus and exercise

06-25-2014

I have had issues working on the tasks that I need to work on for as long as I can remember. The issue isn’t a lack of focus, but more of an impairment of my ability to change my focus from whatever I’m doing at the moment to what I need to be doing. There are two ways I usually get around this: if I can get hyped for whatever task I need to get done, I can usually sustain a reasonable amount of effort working on it. Otherwise, I drink enough coffee to laser in on what I need to get done. However, there are lots of times when that isn’t enough, especially for frustrating tasks (like writing a parser!) And so I need a little boost.

I’ve been reading about the impact of exercise on mood and productivity, and I’ve decided to replace the infinite stream of blog posts and Wikipedia articles that I seem to find when I’m distracted with running. When I feel myself slipping from the task at hand, I’m going to go outside and run until I can’t anymore. I’m also going to have one scheduled run a day, which I will do in the morning right after check-ins.

I’m going to keep track of the running breaks with Emacs. I’ll note how many breaks I take a day, the task I was supposed to be working on when I took the break, how long I ran, and my mood (1-10 on the completely objective bad-good scale :P) before and after the run. I don’t know how to measure how productive I am, so I’m going to use my mood as a proxy. If anyone has measured their productivity in a way that works, I’d like to hear about it.

I’m also going to start eating away from my laptop and stay hydrated throughout the day by always having a container of water near me. These will confound the data for my running thing, but I’m more interested in increasing my focus and productivity than running a completely legitimate experiment :P

Also, smileys are punctuation :) See? New sentence :D

parsers take a while

06-24-2014

I didn’t do much work Tuesday and I mostly felt overwhelmed by how much I needed to get done. I’ve finished enough of the parser to start playing with the abstract syntax tree, but I want to parse more of the language before I move on. I was reading my previous post about my plan when I started the parser, and I realized that I started to lose track of myself when I stopped writing test cases. So I’m going to write tests for what I want to get done today and tomorrow and set out to finish as much as I can before Thursday, which is my hard deadline. After Thursday, I’m moving on from the parser.

Also, I broke like everything over the weekend haha. Outgoing emails no longer work, and switching to https broke my piwik, so I can’t see how many people read my blog D: I’ll worry about those things when Friday comes.

cached conf reflections

06-23-2014

Cached Conf was a lot of fun! I came out of the “conference” with a ton of ideas, which normally doesn’t happen!

Bret Victor - Inventing on Principle

Good talk! Bret is super into ideas, to the point where he feels like an injustice has been done to the world when an idea isn’t able to grow. Which is interesting! And he showed off a bunch impressive visual programming things. Programming is about building and running a mental model of computation in your head, but Bret thinks that there’s no need to “play computer” if you have a computer sitting on your desk that you can harness it to do the heavy lifting for you. I think there’s some truth to that, because I’ve tutored people before and everybody struggles with thinking about the state the machine is in. Having an immediate visual representation of that state would be an incredible tool for teaching. Maybe it would also help for other tasks, but I’d have to try it myself to form an opinion.

One of his big points is how the motivation for something is more important than the end result. People don’t do significant things just because, they do them as the byproduct of chasing something else. For example, I am guided by a reverence for life, but the “end results” of this are reflected in my diet and my views on software. If you can only process the byproducts of someone’s philosophy, you will not understand why they act the way they do.

Gary Bernhardt - Wat

https://www.destroyallsoftware.com/talks/wat

Funny talk, illuminates how having a complex set of coercion rules may as well be magic: if it’s more clear to make a cast, the non-cast version may as well be undefined. Otherwise, you’re just causing confusion.

Guy Steele - Growing a Language

Eye-opening talk! Guy Steele took every one syllable word as assumed knowledge and could only use longer words after he defined them to draw an analogy to using small programming languages. The meaning of his words were clear, but harder to express topics could be more awkward, and he had to define a significant set of words to use his “language”. The trade off that he made was subtlety and expressiveness for clarity, and that’s true for big vs small languages as well. When you’re working in a large language, you can be more expressive, but only after a significant effort to learn the language and maintain that knowledge. And to draw an analogy to English, the biggest language I know: I use a lot of words where I’m not exactly sure of the meaning, but better express what I’m thinking. When your domain allows this sort of ambiguity, this is alright (or even desirable!), but software bugs thrive on “ambiguous” expressions that mean something different than what you’ve assumed. This complicates the task of debugging, because debugging is the act of going through every one of your assumptions about your code and finding the ones that don’t actually hold, and there are many more corners for bugs to hide in big languages.

His big idea is that a programming language can only be successful if it starts small and grows with its users. Sounds reasonable. What he said about “growing” a language gave me a couple ideas for how I can make my version of Go more extensible, because Go’s designers froze it at 1.0 as a pretty rigid language. An extensible Go would be fun to play around with.

status update

06-22-2014

Two weeks in! Just a quick check in for the things I’ve been doing:

Parser: Almost done with a reasonable subset of the language. Should wrap that up tomorrow so I can move on. I’ve learned a lot about parsing and my ability to grapple with tasks that are simulatenously challenging and tedious! I understand why parser generators exist now.
Daily ritual: I’ve actually gone through my checklist every day. It seems to get me in the right mindset to work because it answers the question of “what do I do” when I get to Hacker School in the morning. Big fan of this.
Github streak: Not happening! Haha, I didn’t really commit (kek) to this, and so there were days (like today!) where it fell apart. It might be fun to dedicate myself to a streak, though. I could easily roll this into my daily ritual by making one of the things I’m supposed to get done every day something that github would track.
Blogging streak: I’ve written a blog post for every day I’ve been here, which kind of blows my mind. I come up with ideas for blog posts more often now, and writing is getting slightly easier. My ideas have become more elaborate, though, and I need to give myself more time to research and draft them out.
Oats: Still eating lots of them!

I’ve been having a blast so far. It blows my mind how good Hacker School is for me. Ya’ll rock.

sim city 2000

06-21-2014

Played sim city 2000 today.

This is an overview of my city:

Look how big it is @_@ I usually give up before my cities grow to this size. My population was at 227k when I stopped playing, and I was in the year 2181. This time, I used the levelling tool to take out mountains and plant rows of houses.

This is the other corner of my city:

I enjoy maximizing the amount I zone the space. And you can see some rubble on the hill from when half my city set on fire!

A close up of Hollywood:

The land is elevated, and I have no other reason for why this corner of the map is Hollywood.

Abandoned buildings:

Though the land around it is so prosperous, Haunted Hill just can’t seem to get itself together. Check out that radiation.

Rich people:

I like how the rich all settle down in the same area. Just like in real life! Their proximity to the police station helps keep out undesirables.

break day

06-20-2014

Hey, another filler blog post. Gotta keep up my streak! I did the jobs challenge today in go. We had to build a phonebook command line applications, and I wrote mine in Go. I thought the task would be super easy, but it uncovered some deficits in my knowledge of Go, specifically when it came to writing and reading from files. I had more trouble with that than I’m comfortable to admit :) But it was a nice change of pace to work on a short, well-contained task. Also got groceries and had like five glasses of soy milk, I swear to god they actually put crack in it.

mental blocks: climbing out of the mud

06-19-2014

I spent a lot of time today spinning my wheels on the design of the parser, and a lot of time trying to get over mental blocks. These are blocks that seem to stem from my anxiety about the percieved difficulty of the task at hand, and feel like the wrong response because the most effective way of removing my anxiety about work is to actually start working. Also, feeling anxious about work that you want to complete probably isn’t healthy!

So I realized today that I would need backtracking to implement range clauses (and a lot of other statements). It was really, really hard to start even thinking about how to add backtracking to my parser on my own. I ended up roping Bert in to pear-design (heh) this addition to the parser, and it was actually a lot of fun, and I felt like we came up with a reasonably clear design. We designed a one dimensional backtracker: once a grammar requested backtracking, that grammar owned the backtracking, and none of the child-grammars of that grammar could use backtracking themselves. We settled on this because it was the easiest to design and implement, with the expectation that we would modify the backtracker if these constraints were too limiting.

In the middle of implementing the range clause after finishing the one dimensional backtracker, I realized that the tracker needed to be two dimensional in order to parse the range clause correctly. This was my second mental block, and I thought it would be best to take a break. I ended up going out with Bert and Mihai to get food (I got bananas, yum), and then with Dana to geek over vegan things (nobody has given me crap for being vegan in nyc! that shouldn’t be a big deal, but it is. ya’ll rock). When I got back, I still wasn’t in the mood to work, so I opened up a new buffer in emacs, wrote “design of 2D backtracker” at the top, and I thought about how much I really didn’t want to work for the next five seconds. Then I put aside those feelings and started designing the two dimensional backtracker.

To my surprise, I actually finished the design without thinking too much about how I didn’t want to do it, and the design for the two dimensional backtracker was much clearer than the one for the one dimensional backtracker. Implementing it was easy, and I thought I was back on track, until I hit my third mental block of the day: a grammar is invalid if its type is an error type, OR if any of its children are error types, OR if any of its children’s children are error types, etc! I wasn’t checking any of grammar’s children’s types, and that meant that I wouldn’t be able to accurately say if a grammar was invalid.

The change involved in order to see if a type is valid is substantial and tedious: each node needs a method that checks to see if any of its children are an error type. And I have like 30 different types of nodes ;_; (there’s probably a fair amount of redundancy, but I’m planning on cutting them down after I finish the parser). And that’s the mental block I’m stuck on now. I went home early to rest. I want to get a lot of sleep tonight and come in early tomorrow morning to start implementing this error checking method.

I am unsure of what my behavior should be when I get to a mental block. I took advantage of the ability to be social and productive with Bert to get over my first mental block, and my “think terrible thoughts and then put them aside” method worked for the second mental block, but I felt as if I had no way to tackle the third mental block, and I basically killed time until presentations. But with all three mental blocks, I still spent a significant amount of time staring at the computer screen feeling overwhelemed by the task at hand. That feels wrong; I want to come up with a systematic way to either get over mental blocks, or sidestep them long enough to come back with a fresh mind.

For those of you reading this who have tackled similar issues, I’d like to hear what has worked for you.

more hacker schoolz

06-18-2014

This is more of a filler blog post before I get back to work. I didn’t get as much done during the day as I wanted to (zero commits on github, broke my streak ;_;). Staying up late with Bert right now. We talked about programming languages and the differences between what appeal to us. Found out that we’re both super into jangle pop (well, I’m super into jangle pop, and he’s just getting started haha). Working on my parser, going to see how many statements I can stuff into it today. If I can parse the main today that would be cool… might be able to present it. Didn’t do as much this week as I did last week, probably because the parser is a much larger project. It’s much easier to slack when the thing I’m trying to accomplish is hard. Morrissey, give me strength.

hacker school, week one

06-17-2014

I implemented a couple non-terminals today, and worked on a midi-generator with Mihai, which was a lot of fun. The project was in C++, and I forgot how much fun it is to write clever C(++, technically, but I don’t normally write in C++) code! Clever things in any language make me giddy with excitement haha… Though I realize they have no place in production, they will always have a place in my heart. <3

I’ve had a fantastic time so far! I love the city, my batchmates, and the fact that I’ve actually gotten work done. I didn’t used to think I could function in a self-directed environment, because of how badly I function in directed ones, but I’ve been really thriving here!

I have been kind of anxious about how large of a project the Go compiler is, though. I’m going to try to come in early tomorrow morning and try to knock out a lot of work before people show up. There’s no harm in cutting down the scope of the project, but it’s something I would rather not do.

Also, a friend show me this, (｢･ω･)｢, and I can’t get over how adorable it is!!! (｢･ω･)｢ (｢･ω･)｢ (｢･ω･)｢

adventures in elisp

recursive descent go parser workflow

grammar.txt

toplevelsets.go

toplevelsets.go

nodes.go

grammar.go

conclusions

an experiment with focus and exercise

parsers take a while

cached conf reflections

Bret Victor - Inventing on Principle

Gary Bernhardt - Wat

Guy Steele - Growing a Language

status update

sim city 2000

break day

mental blocks: climbing out of the mud

more hacker schoolz

hacker school, week one

About