EBNF As a Mental Model to Learn Programming Languages
4 min read

EBNF As a Mental Model to Learn Programming Languages

Learning how syntax can be expressed gives you an extra mental model to learn new programming languages.

What is and why learn EBNF?

At some point, everybody gets syntax errors: a misspelled keyword or a missing semicolon. Yet, have you ever wondered how did you learn the syntax rules? Most of the time, when you start to code, you don't learn all the syntax rules directly. The same way that when you begin to talk or write, you don't understand about subject, verb, direct object, etc. You know the syntax by doing.

But when you want to become proficient at writing, you need to learn and understand the syntax rules: what is a direct object, where is the subject, to identify the verb, etc. Therefore, if you want to become a proficient software engineer, you also need to learn the syntax rules of the programming languages.

EBNF is a language to describe the syntax rules of programming languages.

An example of Python grammar:

  | 'if' named_expression ':' block elif_stmt
  | 'if' named_expression ':' block [else_block]

Syntax versus Semantics

The best example of syntax versus semantics is this sentence in English: "Colorless green ideas sleep furiously." This is a syntactically correct English sentence, but it has no meaning. The syntax is right, but not the semantics.

EBNF only helps to know the syntax, not the semantics. Yet, by understanding syntax, we learn to differentiate between syntax and semantics.

Another example in javascript is:

const a = 10;
a = 20;

The previous code is syntactically valid, yet the semantics of JS doesn't allow reassigning a const variable.

Learning the difference between the syntax and semantics rules gives you more tools and models to learn new languages.


Let's learn more about this language to describe grammar.

EBNF stands for Extended Backus-Narus Form. The Backus-Narus Form first appeared in 1959 in the definition of the ALGOL programming language. The author was John Backus.

EBNF definition of integer

Everything is better with examples. So let's look at the notation of integers in the EBNF.

digit := 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
integer := [+|-] digit { digit }

EBNF is a list of rules:

  • Left-hand side: The name of the rule.
  • Right-hand side: The description of the rule.

Note: EBNF is more of a family of notations, they do not always conform to the same characters to denote the same concepts, but the concepts are the same.

  • := denotes the separation of left and right sides
  • | means OR

The digit rule is considered a terminal because it's composed only of tokens, not of other rules, like integer. On the other hand, the rule integer depends on the digit terminal; therefore, it's considered a non-terminal rule.

  • [ and ] mean optional
  • { and } mean zero or more,

In English, the rule for the integer is read as: "An integer is defined as an optional symbol of + or - followed by a digit and possibly by zero, one or more digits.

A few examples:

  • 0 -> is an integer
  • -+10 -> is NOT an integer. Only + or - are valid, not both.
  • -50 -> is an integer
  • 1.54 -> is not an integer. The dot is not correct in the EBNF rules.

EBNF for Python

The Python grammar is listed here in the official docs, but the notation does not follow precisely the EBNF rules. Like they mention in the docs: "The notation is a mixture of EBNF and PEG."

Let's review the example from the beginning:

  | 'if' named_expression ':' block elif_stmt
  | 'if' named_expression ':' block [else_block]

We can see that it has two options:

  • 'if' named_expression ':' block elif_stmt
  • 'if' named_expression ':' block [else_block]

Let's take a look at the second one: 'if' named_expression ':' block [else_block].

It starts with the "if" keyword, then a rule called named_expression. Then a colon ":" then a rule called block and an optional else_block.

The rule accepts the following:

if 10 < 20:
  print(‘in da if”)

But not the following:

if: 10 < 20
  print(‘in da if”)

EBNF for Javascript

We can find a notation of the JS grammar here. For example, the if statement says the following:

IfStatement ::= "if" "(" Expression ")" Statement ( "else" Statement )?

The notation is also not precisely EBNF, but close enough to understand: "if" "(" Expression ")" Statement ( "else" Statement )?

  • It starts with the "if" keyword.
  • Followed by an opening parenthesis "(".
  • Then another rule: Expression.
  • A closing parenthesis ")".
  • Then another rule: Statement.
  • Finally, an optional "else" keyword with another Statement rule.

A question you might be wondering is, isn't else if allowed? The answer is yes, and it's actually accepted in this notation as well: The Statement rule can be another IfStatement. This is why in JS, the notation is else if. The else is an optional keyword in the ifStatement rule, and whatever comes with the new "if" is considered a new statement syntactically.

This is a different approach than the python one, where the elif_stmt was another statement.

EBNF As a Mental Model

You don't need to learn or even know all the rules of EBNF or any other notation. The important aspect is that you understand and have the tools to learn a new programming language faster.

Understanding how grammar is defined gives you a model to apply when learning a new language. You can better understand why something is allowed or not allowed and use it in other scenarios.

If you like this post, consider sharing it with your friends on twitter or forwarding this email to them 🙈

Don't hesitate to reach out to me if you have any questions or see an error. I highly appreciate it.

And thanks to Sebastià for reviewing this article 🙏

Further Readings

Thanks for reading, don't be a stranger 👋

GIMTEC is the newsletter I wish I had earlier in my software engineering career.

Every other Wednesday, I share an article on a topic that you won't learn at work.

Join more than 3,000 subscribers below.

Thanks for subscribing! A confirmation email has been sent.

Check the SPAM folder if you don't receive it shortly.

Sorry, there was an error 🤫.

Try again and contact me at llorenc[at]gimtec.io if it doesn't work. Thanks!