What is a lexer?
When you compile a program, it goes through many steps before ending up as the final “machine code” or “binary code”. The very first step in this process is lexical analysis. Your source file is nothing more than a stream of characters. The lexer breaks this stream into separate “words”, which in computer science we call “tokens”. A token is a conversion unit, and each token belongs to some category, like a keyword, constant, variable, …
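To make the idea of tokens and categories concrete, here is a minimal sketch of a lexer in plain Python. It has nothing to do with QScintilla's own lexers: the keyword set, the regular expression and the function name `tokenize` are all assumptions made up for this illustration, covering only a tiny C-like subset.

```python
import re

# Hypothetical keyword set for a tiny C-like language (illustration only).
KEYWORDS = {"int", "return", "if", "else", "while"}

# One named group per raw token kind; exactly one alternative matches at a time.
TOKEN_PATTERN = re.compile(r"""
    (?P<constant>\d+)                      # integer literals
  | (?P<word>[A-Za-z_]\w*)                 # keywords or variable names
  | (?P<operator>[+\-*/=<>!]=?|[;(){}])    # operators and punctuation
  | (?P<space>\s+)                         # whitespace, skipped below
""", re.VERBOSE)


def tokenize(source):
    """Split a character stream into a list of (category, text) tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_PATTERN.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character {source[pos]!r} at {pos}")
        pos = match.end()
        kind, text = match.lastgroup, match.group()
        if kind == "space":
            continue  # whitespace separates tokens but is not a token itself
        if kind == "word":
            # A word is a keyword if it is reserved, otherwise a variable name.
            kind = "keyword" if text in KEYWORDS else "variable"
        tokens.append((kind, text))
    return tokens


if __name__ == "__main__":
    for category, text in tokenize("int x = 42;"):
        print(category, text)
```

Calling `tokenize("int x = 42;")` yields five tokens: a keyword, a variable, an operator, a constant and another operator. A real lexer handles far more (strings, comments, floating-point numbers), but the principle is the same.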
Once the lexer has split the stream into tokens, it can go a step further and actually color them. Of course, that’s not needed for compilation. But we’re not interested in compilation right now: we just want to use a lexer for syntax highlighting.
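As a rough illustration of "coloring" tokens, the sketch below maps each token category to a color and prints the result to a terminal. This is not how QScintilla applies styles; the terminal only stands in for the editor, and the mapping `ANSI_COLORS` and the function `colorize` are hypothetical names chosen for this example.

```python
# Hypothetical category-to-color mapping, using ANSI terminal escape codes.
ANSI_COLORS = {
    "keyword": "\033[34m",   # blue
    "constant": "\033[32m",  # green
}
RESET = "\033[0m"            # back to the terminal's default color


def colorize(tokens):
    """Join (category, text) tokens into one string, colored per category."""
    parts = []
    for category, text in tokens:
        # Categories without an entry fall back to the default color.
        color = ANSI_COLORS.get(category, RESET)
        parts.append(f"{color}{text}{RESET}")
    return " ".join(parts)


if __name__ == "__main__":
    print(colorize([("keyword", "return"), ("constant", "0"), ("operator", ";")]))
```

The point is only that highlighting needs nothing beyond the token categories: once every token has a category, choosing a color is a simple lookup.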
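A parser goes one level deeper than the lexer. It takes the tokens produced by the lexer and tries to determine if proper sentences have been formed. Herein lies the difference: the lexer only recognizes the individual words, while the parser checks whether those words add up to a valid sentence.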
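The following sketch shows what "checking for a proper sentence" could look like for one very small grammar rule. It assumes tokens are given as `(category, text)` pairs; the rule itself (`variable = value ;`) and the function name `is_assignment` are invented for this illustration and are not part of QScintilla.

```python
def is_assignment(tokens):
    """Return True if tokens form: variable '=' (constant | variable) ';'."""
    if len(tokens) != 4:
        return False
    (kind0, _), (kind1, text1), (kind2, _), (kind3, text3) = tokens
    return (
        kind0 == "variable"
        and (kind1, text1) == ("operator", "=")
        and kind2 in ("constant", "variable")
        and (kind3, text3) == ("operator", ";")
    )


if __name__ == "__main__":
    good = [("variable", "x"), ("operator", "="), ("constant", "42"), ("operator", ";")]
    bad = [("variable", "x"), ("constant", "42"), ("operator", "="), ("operator", ";")]
    print(is_assignment(good))
    print(is_assignment(bad))
```

Both token lists contain only valid tokens, so a lexer would accept either one. Only the parser-style check notices that the second list has them in an order that does not form a sentence.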
A lexer in QScintilla is an instance of the QsciLexer class – or one of its subclasses. QScintilla provides an extensive set of complete lexers for various languages like Python, C/C++, C#, … which can be used out of the box.
This is how you install a lexer on your editor:
# 1. Create a C++ lexer object
self.__lexer = QsciLexerCPP(self.__editor)

# 2. Install the lexer onto your editor
self.__editor.setLexer(self.__lexer)
Do you remember the class hierarchy from the introduction chapter? Here is a snippet from that figure, listing the lexers that QScintilla provides:
|AVS lexer||Bash lexer|
|Batch lexer||CMake lexer|
|CoffeeScript lexer||C++ lexer|
|C# lexer||IDL lexer|
|CSS lexer||D lexer|
|Diff lexer||Fortran lexer|
|HTML lexer||XML lexer|
|JSON lexer||Lua lexer|
|Makefile lexer||Markdown lexer|
|Matlab file lexer||Octave file lexer|
|Pascal lexer||Perl lexer|
|PO lexer||PostScript lexer|
|POV lexer||Properties lexer|
|Python lexer||Ruby lexer|
|Spice lexer||SQL lexer|
|TCL lexer||TeX lexer|
|Verilog lexer||VHDL lexer|
If you want to create your own custom lexer, you need to subclass the QsciLexerCustom class. It requires more work than simply using a pre-cooked lexer, but you’ll get maximal flexibility and you’ll learn how to make some tokens clickable! Every modern IDE has clickable functions and variables that let you jump to the original definitions. That’s our goal.
This is how you install a custom lexer on your editor:
# 1a. Subclass QsciLexerCustom [explained on next page...]
# 1b. Create a lexer object from your subclass
self.__lexer = MyLexer(self.__editor)

# 2. Install the lexer onto your editor
self.__editor.setLexer(self.__lexer)