Multiline highlighting

Coloring multiple lines

Using some clever programming, we will attempt to color multiline comments green. Let’s consider the following code snippet:

    def styleText(self, start, end):
        # 1. Initialize the styling procedure
        # ------------------------------------
        self.startStyling(start)

        # 2. Slice out a part from the text
        # ----------------------------------
        text = self.parent().text()[start:end]

        # 3. Tokenize the text
        # ---------------------
        p = re.compile(r"[*]\/|\/[*]|\s+|\w+|\W")

        # 'token_list' is a list of tuples: (token_name, token_len),
        # with the length counted in UTF-8 bytes
        token_list = [(token, len(bytearray(token, "utf-8"))) for token in p.findall(text)]

        # 4. Style the text in a loop
        # ----------------------------
        # self.setStyling(number_of_chars, style_nr)
        multiline_comm_flag = False
        for token in token_list:
            if multiline_comm_flag:
                # Inside a multiline comment: green style
                self.setStyling(token[1], 3)

                if token[0] == "*/":
                    multiline_comm_flag = False

            else:
                if token[0] in ["for", "while", "return", "int", "include"]:
                    # Red style
                    self.setStyling(token[1], 1)

                elif token[0] in ["(", ")", "{", "}", "[", "]", "#"]:
                    # Blue style
                    self.setStyling(token[1], 2)

                elif token[0] == "/*":
                    # Start of a multiline comment: green style
                    multiline_comm_flag = True
                    self.setStyling(token[1], 3)

                else:
                    # Default style
                    self.setStyling(token[1], 0)


Before the loop starts, we create a multiline_comm_flag variable. When the token /* is encountered, the flag is set to True. From that moment on, every subsequent token is colored green, whatever it contains. By the way, I’ve made a new style for this purpose: style 3 is bold and green.
Once the token */ is encountered, it too is colored green, the flag is reset to False, and the multiline comment ends.
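
To watch the flag at work without an editor, here is a minimal sketch that mimics the loop in plain Python. Instead of calling setStyling, it collects (token, style_nr) pairs. The function name assign_styles and the sample text are made up for this illustration; they are not part of QScintilla.

```python
import re

# Same tokenizer regex as in styleText()
TOKEN_RE = re.compile(r"[*]\/|\/[*]|\s+|\w+|\W")

KEYWORDS = {"for", "while", "return", "int", "include"}
BRACKETS = {"(", ")", "{", "}", "[", "]", "#"}


def assign_styles(text, in_comment=False):
    """Hypothetical helper: return (token, style_nr) pairs like the loop would style them."""
    result = []
    for token in TOKEN_RE.findall(text):
        if in_comment:
            result.append((token, 3))      # green: inside a multiline comment
            if token == "*/":
                in_comment = False         # the comment ends after this token
        elif token in KEYWORDS:
            result.append((token, 1))      # red
        elif token in BRACKETS:
            result.append((token, 2))      # blue
        elif token == "/*":
            in_comment = True              # the comment starts with this token
            result.append((token, 3))
        else:
            result.append((token, 0))      # default

    return result


styled = assign_styles("int x; /* for\nlater */ return")
for token, style_nr in styled:
    print(repr(token), style_nr)
```

Note how the keyword for inside the comment gets style 3 rather than style 1, while return after the closing */ is a red keyword again.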

Please note that I’ve changed the regex a little. Normally the symbols / and * would end up in two separate tokens. By adding the entry \/[*] to the regex, the two symbols appear as one single token /*. In the same way, the entry [*]\/ yields */ as a single token. Both entries come before \W in the alternation, so they are tried first.
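
The effect of the two extra entries is easy to check in isolation. The sketch below compares the regex from the code above with a version that lacks the two entries. That shorter pattern is my assumption of what the regex looked like before the change, and the sample string is made up.

```python
import re

# Assumed earlier regex: the current one without the two comment entries
old_pattern = re.compile(r"\s+|\w+|\W")
# Regex used in styleText() above
new_pattern = re.compile(r"[*]\/|\/[*]|\s+|\w+|\W")

sample = "/* note */"

old_tokens = old_pattern.findall(sample)
new_tokens = new_pattern.findall(sample)

print(old_tokens)  # '/' and '*' are separate tokens
print(new_tokens)  # '/*' and '*/' are single tokens
```

With the old pattern the loop would never see a token equal to /* or */, so the flag could never be switched on or off.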

The result is satisfying: the whole comment, from /* up to and including */, shows up in bold green.


Looking back

There is still a huge problem in the code above. When you copy-paste a big chunk of code into the editor, everything is okay. But try to type an extra multiline comment into the editor and you will notice that it doesn’t get highlighted properly.
Remember the start and end parameters? That’s right. QScintilla passes them as a suggestion for the range to restyle, so that you can perform syntax highlighting on a small piece of the text instead of the whole document. But the flag always starts out as False, so if that piece begins in the middle of a multiline comment, how would you know?
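
The problem can be reproduced with the tokenizer alone. In this small sketch the sample text and the chosen start position are made up; the slice simply stands in for the part of the text that styleText would be asked to restyle.

```python
import re

TOKEN_RE = re.compile(r"[*]\/|\/[*]|\s+|\w+|\W")

text = "/* first line\nsecond line */ int x;"

# Pretend the restyle request starts at the second line of the comment
start = text.index("second")
tokens = TOKEN_RE.findall(text[start:])

print(tokens)
print("/*" in tokens)  # the opening token lies outside the slice
```

The slice contains the closing */ but not the opening /*, so a loop that starts with the flag set to False treats "second line" as ordinary code.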

The answer is: look back. You have to ask QScintilla what style the character just before start has. If it is styled with the multiline comment style (style 3), then you need to set the flag before starting the highlight loop.

        multiline_comm_flag = False
        editor = self.parent()
        if start > 0:
            previous_style_nr = editor.SendScintilla(editor.SCI_GETSTYLEAT, start - 1)
            if previous_style_nr == 3:
                multiline_comm_flag = True
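
The same decision can be tried out without an editor. In the sketch below, a plain list of style numbers is a hypothetical stand-in for the editor's style buffer (what SCI_GETSTYLEAT reads from), and the function name starts_inside_comment is made up for this illustration.

```python
COMMENT_STYLE = 3  # style number used for multiline comments in this tutorial


def starts_inside_comment(styles, start):
    """Return True if the position just before 'start' carries the comment style.

    'styles' is a hypothetical stand-in for the editor's style buffer:
    a list holding one style number per already-styled position.
    """
    return start > 0 and styles[start - 1] == COMMENT_STYLE


# "/* ab" was styled as a comment earlier (5 positions); styling resumes at 5
styles_so_far = [3, 3, 3, 3, 3]

print(starts_inside_comment(styles_so_far, 5))   # previous position has style 3
print(starts_inside_comment(styles_so_far, 0))   # nothing to look back at
print(starts_inside_comment([1, 1, 1, 0], 4))    # previous position has the default style
```

One thing to keep in mind: in the loop above the closing */ is also given style 3, so this check by itself cannot tell "still inside a comment" from "right after one". That case may need extra handling.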