Creating Your Own Programming Language Using Python
Ohidur Rahman Bappy
MAR 22, 2025
Introduction
In this guide, you'll learn how to create your own programming language using Python. We'll use SLY (Sly Lex-Yacc) to simplify the process of lexical analysis and parsing.
Install SLY
Start by installing SLY for Python:
pip install sly
Building a Lexer
The first phase of a compiler is to convert character streams into token streams through lexical analysis. SLY simplifies this process.
First, import the necessary module:
from sly import Lexer
Create a BasicLexer class that extends the Lexer class. This lexer will handle simple arithmetic operations, requiring tokens such as NAME, NUMBER, and STRING. Set ignore so that spaces and tabs are skipped, and add rules that discard line comments and count newlines:
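class BasicLexer(Lexer):
    # Token names the parser can refer to
    tokens = { NAME, NUMBER, STRING }
    # Characters skipped between tokens (tab and space)
    ignore = '\t '
    # Single-character tokens passed through as-is
    literals = { '=', '+', '-', '/', '*', '(', ')', ',', ';' }

    NAME = r'[a-zA-Z_][a-zA-Z0-9_]*'
    STRING = r'".*?"'

    @_(r'\d+')
    def NUMBER(self, t):
        # Convert the matched digits to a Python int
        t.value = int(t.value)
        return t

    @_(r'//.*')
    def COMMENT(self, t):
        # Returning nothing discards the comment
        pass

    @_(r'\n+')
    def newline(self, t):
        # Advance the line counter by the number of newlines matched
        self.lineno += t.value.count('\n')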
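To make the idea of a token stream concrete, here is a minimal sketch that mimics the rules above using only the standard library's re module. It is not how SLY works internally; the names TOKEN_SPEC, MASTER, and tokenize are assumptions made up for this illustration.

```python
import re

# Stdlib-only approximation of BasicLexer's rules; not SLY's implementation.
# COMMENT is listed before LITERAL so that '//' is not read as two '/' tokens.
TOKEN_SPEC = [
    ("COMMENT", r"//.*"),
    ("NAME",    r"[a-zA-Z_][a-zA-Z0-9_]*"),
    ("STRING",  r'".*?"'),
    ("NUMBER",  r"\d+"),
    ("LITERAL", r"[=+\-/*(),;]"),
    ("SKIP",    r"[ \t]+"),
    ("NEWLINE", r"\n+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(text):
    """Return a list of (type, value) pairs for the given source text."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"Illegal character {text[pos]!r} at index {pos}")
        kind, value = match.lastgroup, match.group()
        pos = match.end()
        if kind in ("COMMENT", "SKIP", "NEWLINE"):
            continue                          # discarded, like the lexer's ignore rules
        if kind == "NUMBER":
            tokens.append((kind, int(value)))
        elif kind == "LITERAL":
            tokens.append((value, value))     # a literal's type is the character itself
        else:
            tokens.append((kind, value))
    return tokens

print(tokenize('total = 10 + b // running sum'))
# [('NAME', 'total'), ('=', '='), ('NUMBER', 10), ('+', '+'), ('NAME', 'b')]
```

The comment and the whitespace never reach the output, which is exactly what the parser in the next section relies on.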
Building a Parser
Import the Parser class:
from sly import Parser
Create a BasicParser class extending the Parser class. Reuse the token names from BasicLexer and set the precedence rules, listed from lowest to highest:
class BasicParser(Parser):
    # Reuse the token names defined by the lexer
    tokens = BasicLexer.tokens

    # Lowest precedence first; UMINUS is a fictitious token for unary minus
    precedence = (
        ('left', '+', '-'),
        ('left', '*', '/'),
        ('right', 'UMINUS'),
    )

    def __init__(self):
        self.env = {}

    # An empty line is a valid (empty) statement
    @_("")
    def statement(self, p):
        pass

    @_("var_assign")
    def statement(self, p):
        return p.var_assign

    @_("NAME '=' expr")
    def var_assign(self, p):
        return ('var_assign', p.NAME, p.expr)

    @_("NAME '=' STRING")
    def var_assign(self, p):
        return ('var_assign', p.NAME, p.STRING)

    @_("expr")
    def statement(self, p):
        return p.expr

    @_("expr '+' expr")
    def expr(self, p):
        return ('add', p.expr0, p.expr1)

    @_("expr '-' expr")
    def expr(self, p):
        return ('sub', p.expr0, p.expr1)

    @_("expr '*' expr")
    def expr(self, p):
        return ('mul', p.expr0, p.expr1)

    @_("expr '/' expr")
    def expr(self, p):
        return ('div', p.expr0, p.expr1)

    @_("'-' expr %prec UMINUS")
    def expr(self, p):
        # Represent -x as 0 - x so the value is actually negated
        return ('sub', ('num', 0), p.expr)

    @_("NAME")
    def expr(self, p):
        return ('var', p.NAME)

    @_("NUMBER")
    def expr(self, p):
        return ('num', p.NUMBER)
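To see what the precedence table buys you, the following sketch builds the same kind of tuple tree by hand with a small precedence-climbing routine. SLY generates an LALR(1) parser instead, so this only illustrates the effect of precedence and left associativity, not SLY's algorithm. The names BINARY and parse_expr are hypothetical, and representing unary minus as 0 - x is an assumption of this sketch.

```python
import re

# Binding level per operator, mirroring the precedence table: '*' and '/' bind tighter.
BINARY = {"+": (1, "add"), "-": (1, "sub"), "*": (2, "mul"), "/": (2, "div")}

def parse_expr(text):
    """Parse an arithmetic expression into nested tuples (hypothetical helper)."""
    tokens = re.findall(r"\d+|[a-zA-Z_][a-zA-Z0-9_]*|[-+*/]", text)
    pos = 0

    def operand():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok == "-":                       # unary minus binds tighter than any binary operator
            return ("sub", ("num", 0), operand())
        if tok in BINARY:
            raise SyntaxError(f"unexpected operator {tok!r}")
        if tok.isdigit():
            return ("num", int(tok))
        return ("var", tok)

    def expression(min_level):
        nonlocal pos
        left = operand()
        while pos < len(tokens) and tokens[pos] in BINARY:
            level, tag = BINARY[tokens[pos]]
            if level < min_level:
                break                        # a looser operator belongs to an outer call
            pos += 1
            right = expression(level + 1)    # level + 1 makes the operator left-associative
            left = (tag, left, right)
        return left

    return expression(1)

print(parse_expr("a + b * 2"))
# ('add', ('var', 'a'), ('mul', ('var', 'b'), ('num', 2)))
```

Because multiplication sits at a higher level than addition, b * 2 becomes a subtree of the add node rather than the other way around.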
With these rules, the parser turns each statement into a parse tree built from nested tuples. Once the interpreter from the Execution section evaluates those trees, a session looks like this:
GFG Language > a = 10
GFG Language > b = 20
GFG Language > a + b
30
Execution
The interpreter walks the parse tree recursively, evaluating the child nodes before their parent, and prints the final result:
class BasicExecute:
    def __init__(self, tree, env):
        self.env = env
        result = self.walkTree(tree)
        # Print numeric results (division yields a float) and quoted strings
        if isinstance(result, (int, float)):
            print(result)
        if isinstance(result, str) and result.startswith('"'):
            print(result)

    def walkTree(self, node):
        # Leaves: plain values evaluate to themselves
        if isinstance(node, int):
            return node
        if isinstance(node, str):
            return node
        if node is None:
            return None

        if node[0] == 'program':
            if node[1] is None:
                self.walkTree(node[2])
            else:
                self.walkTree(node[1])
                self.walkTree(node[2])

        if node[0] == 'num':
            return node[1]
        if node[0] == 'str':
            return node[1]

        # Binary operators: evaluate both operands, then combine them
        if node[0] == 'add':
            return self.walkTree(node[1]) + self.walkTree(node[2])
        elif node[0] == 'sub':
            return self.walkTree(node[1]) - self.walkTree(node[2])
        elif node[0] == 'mul':
            return self.walkTree(node[1]) * self.walkTree(node[2])
        elif node[0] == 'div':
            return self.walkTree(node[1]) / self.walkTree(node[2])

        if node[0] == 'var_assign':
            # Store the evaluated value under the variable's name
            self.env[node[1]] = self.walkTree(node[2])
            return node[1]

        if node[0] == 'var':
            try:
                return self.env[node[1]]
            except LookupError:
                print("Undefined variable '"+node[1]+"' found!")
                return 0
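The recursive evaluation is easier to follow in isolation. The sketch below replays the earlier session (a = 10, b = 20, a + b) on hand-written trees. It is a simplified stand-in for walkTree, not a replacement: the names BINARY_OPS and evaluate are made up here, and it handles only tuple nodes, not the bare strings that the NAME '=' STRING rule produces.

```python
import operator

# Map each binary node tag to the Python operator that evaluates it.
BINARY_OPS = {
    "add": operator.add,
    "sub": operator.sub,
    "mul": operator.mul,
    "div": operator.truediv,
}

def evaluate(node, env):
    """Recursively evaluate a tuple-based parse tree against env (a dict)."""
    tag = node[0]
    if tag == "num":
        return node[1]
    if tag == "var":
        return env[node[1]]                  # raises KeyError for an undefined name
    if tag == "var_assign":
        env[node[1]] = evaluate(node[2], env)
        return None
    if tag in BINARY_OPS:
        # Children are evaluated first, then combined by the parent node
        return BINARY_OPS[tag](evaluate(node[1], env), evaluate(node[2], env))
    raise ValueError(f"unknown node type: {tag!r}")

env = {}                                     # shared across statements, like the REPL's env
evaluate(("var_assign", "a", ("num", 10)), env)
evaluate(("var_assign", "b", ("num", 20)), env)
print(evaluate(("add", ("var", "a"), ("var", "b")), env))  # prints 30
```

Passing the same env dictionary to every call is what lets a variable assigned on one line be read on the next.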
Displaying the Output
To display the interpreter's output, integrate the lexer, parser, and execution:
if __name__ == '__main__':
    lexer = BasicLexer()
    parser = BasicParser()
    print('GFG Language')
    env = {}

    while True:
        try:
            text = input('GFG Language > ')
        except EOFError:
            break

        if text:
            tree = parser.parse(lexer.tokenize(text))
            BasicExecute(tree, env)
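SLY's parser reports a syntax error when the input doesn't match the grammar rules. By default, however, the lexer raises an exception on a character it doesn't recognize, which ends this loop unless you define an error method on the lexer or catch the exception yourself.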
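If you want the session to survive a bad line, one option is to catch exceptions per line. The sketch below shows that pattern on its own, without SLY. The names run_lines and toy_handler are hypothetical, and toy_handler is only a stand-in for the "tokenize, parse, execute" step.

```python
def run_lines(lines, handle_line):
    """Feed each line to handle_line, reporting failures instead of stopping."""
    failures = []
    for number, text in enumerate(lines, start=1):
        try:
            handle_line(text)
        except Exception as exc:             # broad on purpose: keep the session alive
            failures.append((number, str(exc)))
            print(f"line {number}: {exc}")
    return failures

def toy_handler(text):
    # Hypothetical stand-in for "tokenize, parse, execute"; accepts only integers
    print(int(text))

failures = run_lines(["1", "x", "3"], toy_handler)
```

Here the second line fails, the error is reported, and the third line still runs. In the real loop you would wrap the parser.parse and BasicExecute calls the same way.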
To run your program, use:
python your_program_name.py
This setup provides a foundational understanding of creating a basic programming language using Python and SLY.
Source: GeeksforGeeks