third_party.pylibs.pylint.src/pylint/checkers/misc.py

# Copyright (c) 2006, 2009-2013 LOGILAB S.A. (Paris, FRANCE) <contact@logilab.fr>
# Copyright (c) 2013-2014 Google, Inc.
# Copyright (c) 2014 Alexandru Coman <fcoman@bitdefender.com>
# Copyright (c) 2014-2016 Claudiu Popa <pcmanticore@gmail.com>

# Licensed under the GPL: https://www.gnu.org/licenses/old-licenses/gpl-2.0.html
# For details: https://github.com/PyCQA/pylint/blob/master/COPYING


"""Check that source code is ASCII only or has an encoding declaration (PEP 263)."""

# pylint: disable=W0511

import re

import six

from pylint.interfaces import IRawChecker
from pylint.checkers import BaseChecker


MSGS = {
    'W0511': ('%s',
              'fixme',
              'Used when a warning note such as FIXME or XXX is detected.'),
    'W0512': ('Cannot decode using encoding "%s", unexpected byte at position %d',
              'invalid-encoded-data',
              'Used when a source line cannot be decoded using the specified '
              'source file encoding.',
              {'maxversion': (3, 0)}),
}


class EncodingChecker(BaseChecker):

    """Check for:
    * warning notes in the code, such as FIXME or XXX
    * encoding issues.
    """
    __implements__ = IRawChecker

    # configuration section name
    name = 'miscellaneous'
    msgs = MSGS

    options = (('notes',
                {'type': 'csv', 'metavar': '<comma separated values>',
                 'default': ('FIXME', 'XXX', 'TODO'),
                 'help': ('List of note tags to take into consideration, '
                          'separated by a comma.')}),)

    def _check_note(self, notes, lineno, line):
        # First, simply check if the notes are in the line at all. This is an
        # optimisation to prevent using the regular expression on every line,
        # but rather only on lines which may actually contain one of the notes.
        # This prevents a pathological problem with lines that are hundreds
        # of thousands of characters long.
        for note in self.config.notes:
            if note in line:
                break
        else:
            return

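        match = notes.search(line)
        if not match:
            return
        # Strip trailing whitespace instead of slicing off the last character,
        # so a final line without a newline is reported in full.
        self.add_message('fixme', args=line[match.start(1):].rstrip(),
                         line=lineno)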

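    def _check_encoding(self, lineno, line, file_encoding):
        """Return the decoded line, or None if it cannot be decoded."""
        try:
            return six.text_type(line, file_encoding)
        except UnicodeDecodeError as ex:
            # ex.args[2] is the offset of the first undecodable byte.
            self.add_message('invalid-encoded-data', line=lineno,
                             args=(file_encoding, ex.args[2]))
            return None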

    def process_module(self, module):
        """Inspect the source file for encoding problems and FIXME-like
        notes.
        """
        if self.config.notes:
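            # Escape each tag so that regex metacharacters in a configured
            # note are matched literally.
            notes = re.compile(
                r'.*?#\s*(%s)(:*\s*.*)'
                % "|".join(re.escape(note) for note in self.config.notes))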
        else:
            notes = None
        if module.file_encoding:
            encoding = module.file_encoding
        else:
            encoding = 'ascii'

        with module.stream() as stream:
            for lineno, line in enumerate(stream, start=1):
                line = self._check_encoding(lineno, line, encoding)
                if line is not None and notes:
                    self._check_note(notes, lineno, line)


def register(linter):
    """Required function to auto-register this checker."""
    linter.register_checker(EncodingChecker(linter))
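
The two-step matching in `_check_note` (a cheap substring test first, then the regular expression) can be tried in isolation. The following is a minimal sketch under stated assumptions, not the checker's own code: `NOTES`, `NOTE_RE`, and `find_note` are hypothetical names standing in for `self.config.notes` and the checker's internals, and each tag is passed through `re.escape` so that it is matched literally.

```python
import re

# Hypothetical stand-in for self.config.notes; the tags mirror the defaults.
NOTES = ('FIXME', 'XXX', 'TODO')

# Same pattern shape as the one compiled in process_module, with every tag
# escaped (an assumption of this sketch).
NOTE_RE = re.compile(r'.*?#\s*(%s)(:*\s*.*)' % '|'.join(map(re.escape, NOTES)))


def find_note(line):
    """Return the note text found in a comment on ``line``, or None."""
    # Cheap pre-check: only run the regex if some tag occurs in the line.
    if not any(note in line for note in NOTES):
        return None
    match = NOTE_RE.search(line)
    if match is None:
        return None
    # Report from the start of the tag to the end of the line,
    # without trailing whitespace.
    return line[match.start(1):].rstrip()


if __name__ == '__main__':
    print(find_note("x = 1  # TODO: refactor\n"))
    print(find_note("TODO_LIST = []\n"))
```

A line such as `TODO_LIST = []` passes the substring test but is rejected by the regular expression, because the tag does not follow a `#`. The pre-check only filters out lines that cannot match; it never decides a match on its own.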
Even more granular copyrights (thanks to copyrite) 2016-07-22 21:22:28 +00:00			`# Copyright (c) 2006, 2009-2013 LOGILAB S.A. (Paris, FRANCE) <contact@logilab.fr>`
			`# Copyright (c) 2013-2014 Google, Inc.`
			`# Copyright (c) 2014 Alexandru Coman <fcoman@bitdefender.com>`
			`# Copyright (c) 2014-2016 Claudiu Popa <pcmanticore@gmail.com>`

Add the new shorter license header, including to missing files. Close #894. 2016-06-01 15:11:29 +00:00			`# Licensed under the GPL: https://www.gnu.org/licenses/old-licenses/gpl-2.0.html`
			`# For details: https://github.com/PyCQA/pylint/blob/master/COPYING`

forget the past. forget the past. 2006-04-26 10:48:09 +00:00
Keep a consistent copyright notice across the board. This was changed automatically in #894, but apparently we need to have the copyright notice somewhere. 2016-07-19 14:41:59 +00:00			`"""Check source code is ascii only or has an encoding declaration (PEP 263)"""`

			`# pylint: disable=W0511`
forget the past. forget the past. 2006-04-26 10:48:09 +00:00
pylint fixes 2012-08-22 09:42:40 +00:00			`import re`
forget the past. forget the past. 2006-04-26 10:48:09 +00:00
Fix new import related errors in pylint's codebase. 2015-11-25 13:12:59 +00:00			`import six`

forget the past. forget the past. 2006-04-26 10:48:09 +00:00			`from pylint.interfaces import IRawChecker`
			`from pylint.checkers import BaseChecker`

d-t-w 2010-04-16 15:52:38 +00:00
forget the past. forget the past. 2006-04-26 10:48:09 +00:00			`MSGS = {`
			`'W0511': ('%s',`
Closes #104572: symbolic warning names in output (by Martin Pool) triggered whatever the format using a command line option 2012-09-19 15:15:43 +00:00			`'fixme',`
forget the past. forget the past. 2006-04-26 10:48:09 +00:00			`'Used when a warning note as FIXME or XXX is detected.'),`
Fix a typo (unexcepted -> unexpected) in the description of the invalid encoding warning. 2013-06-19 11:21:58 +00:00			`'W0512': ('Cannot decode using encoding "%s", unexpected byte at position %d',`
Added a new warning 'invalid-encoded-data' for files that contain an encoding, but whose contents cannot be decoded with this encoding. 2013-06-19 09:43:19 +00:00			`'invalid-encoded-data',`
			`'Used when a source line cannot be decoded using the specified '`
Ignore invalid-encoded-data on python3 It became a syntax error. 2013-06-19 16:27:43 +00:00			`'source file encoding.',`
			`{'maxversion': (3, 0)}),`
Fixing Issue #149 (W0511 false positive) 2014-07-08 22:44:35 +00:00			`}`
forget the past. forget the past. 2006-04-26 10:48:09 +00:00
Added a new warning 'invalid-encoded-data' for files that contain an encoding, but whose contents cannot be decoded with this encoding. 2013-06-19 09:43:19 +00:00
forget the past. forget the past. 2006-04-26 10:48:09 +00:00			`class EncodingChecker(BaseChecker):`
Fixing Issue #149 (W0511 false positive) 2014-07-08 22:44:35 +00:00
d-t-w 2010-04-16 15:52:38 +00:00			`"""checks for:`
			`* warning notes in the code like FIXME, XXX`
Added a new warning 'invalid-encoded-data' for files that contain an encoding, but whose contents cannot be decoded with this encoding. 2013-06-19 09:43:19 +00:00			`* encoding issues.`
forget the past. forget the past. 2006-04-26 10:48:09 +00:00			`"""`
			`__implements__ = IRawChecker`

			`# configuration section name`
			`name = 'miscellaneous'`
			`msgs = MSGS`

			`options = (('notes',`
Fixing Issue #149 (W0511 false positive) 2014-07-08 22:44:35 +00:00			`{'type': 'csv', 'metavar': '<comma separated values>',`
			`'default': ('FIXME', 'XXX', 'TODO'),`
			`'help': ('List of note tags to take in consideration, '`
			`'separated by a comma.')}),)`
forget the past. forget the past. 2006-04-26 10:48:09 +00:00
Clean up misc checker, use single regex instead of several ones. 2013-06-19 08:59:37 +00:00			`def _check_note(self, notes, lineno, line):`
Notes (TODO, XXX etc) are now searched for using a simple `in` before resorting to the regular expression, in order to avoid using the regexp on every line and to prevent a pathological problem for extremely long lines (50k+ characters) 2014-08-21 14:02:41 +00:00			`# First, simply check if the notes are in the line at all. This is an`
			`# optimisation to prevent using the regular expression on every line,`
			`# but rather only on lines which may actually contain one of the notes.`
			`# This prevents a pathological problem with lines that are hundreds`
			`# of thousands of characters long.`
			`for note in self.config.notes:`
			`if note in line:`
			`break`
			`else:`
			`return`

Code refactoring. 2014-07-09 08:06:37 +00:00			`match = notes.search(line)`
			`if not match:`
			`return`
			`self.add_message('fixme', args=line[match.start(1):-1], line=lineno)`
Added a new warning 'invalid-encoded-data' for files that contain an encoding, but whose contents cannot be decoded with this encoding. 2013-06-19 09:43:19 +00:00
			`def _check_encoding(self, lineno, line, file_encoding):`
			`try:`
Don't call unicode() directly --HG-- branch : python_6 2014-08-29 16:26:21 +00:00			`return six.text_type(line, file_encoding)`
Modernize to the point of working for Python 2.7 still --HG-- branch : python_6 2014-08-29 15:16:29 +00:00			`except UnicodeDecodeError as ex:`
Only emit symbolic warnings from the misc checker. 2014-04-17 09:07:03 +00:00			`self.add_message('invalid-encoded-data', line=lineno,`
some pylint and style fixes 2013-07-31 07:05:01 +00:00			`args=(file_encoding, ex.args[2]))`
Added a new warning 'invalid-encoded-data' for files that contain an encoding, but whose contents cannot be decoded with this encoding. 2013-06-19 09:43:19 +00:00
			`def process_module(self, module):`
python3: deal with astroid's module.file_stream returning bytes Use tokenize.tokenize() which wants a byte stream. Everywhere else, decode as necessary. 2013-06-19 11:31:16 +00:00			`"""inspect the source file to find encoding problem or fixmes like`
forget the past. forget the past. 2006-04-26 10:48:09 +00:00			`notes`
			`"""`
Do not emit [fixme] for every line if the config value 'notes' is empty, but [fixme] is enabled. Also added a very basic test for checkers/misc.py. 2013-07-24 09:17:37 +00:00			`if self.config.notes:`
Fix a false positive regarding W0511. Closes issue #149. 2014-07-09 08:26:27 +00:00			`notes = re.compile(`
Fix #793 [MISCELLANEOUS] code tag without message doesn't track 2016-01-23 13:52:54 +00:00			`r'.*?#\s*(%s)(:*\s*.*)' % "|".join(self.config.notes))`
Do not emit [fixme] for every line if the config value 'notes' is empty, but [fixme] is enabled. Also added a very basic test for checkers/misc.py. 2013-07-24 09:17:37 +00:00			`else:`
			`notes = None`
Added a new warning 'invalid-encoded-data' for files that contain an encoding, but whose contents cannot be decoded with this encoding. 2013-06-19 09:43:19 +00:00			`if module.file_encoding:`
			`encoding = module.file_encoding`
			`else:`
			`encoding = 'ascii'`
Fixing Issue #149 (W0511 false positive) 2014-07-08 22:44:35 +00:00
Use the new Module.stream, since Module.file_stream is deprecated. 2015-01-03 08:00:00 +00:00			`with module.stream() as stream:`
			`for lineno, line in enumerate(stream):`
			`line = self._check_encoding(lineno + 1, line, encoding)`
			`if line is not None and notes:`
			`self._check_note(notes, lineno + 1, line)`
Fixing Issue #149 (W0511 false positive) 2014-07-08 22:44:35 +00:00
d-t-w 2010-04-16 15:52:38 +00:00
forget the past. forget the past. 2006-04-26 10:48:09 +00:00			`def register(linter):`
			`"""required method to auto register this checker"""`
			`linter.register_checker(EncodingChecker(linter))`
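
The decoding step behind `invalid-encoded-data` can be illustrated without pylint. This is a minimal Python 3 sketch, not the checker's own code: `decode_line` is a hypothetical helper, and it calls `bytes.decode` where the checker calls `six.text_type(line, file_encoding)`. The offset it returns is the value the checker reads from `ex.args[2]`.

```python
def decode_line(raw_line, encoding='ascii'):
    """Decode one raw source line.

    Returns (text, None) on success, or (None, offset) where offset is the
    position of the first byte that cannot be decoded.
    """
    try:
        return raw_line.decode(encoding), None
    except UnicodeDecodeError as ex:
        # ex.start holds the same value as ex.args[2]: the index of the
        # first undecodable byte in raw_line.
        return None, ex.start


if __name__ == '__main__':
    raw = b"caf\xc3\xa9\n"  # "cafe" with an accented e, encoded as UTF-8
    print(decode_line(raw, 'utf-8'))
    print(decode_line(raw))  # falls back to ASCII, as process_module does
```

With the `'ascii'` fallback, the UTF-8 line above fails at offset 3, the first byte outside the ASCII range. That offset is the number the message template reports as "unexpected byte at position %d".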