Commit Graph

3 Commits

Author SHA1 Message Date
Pierre Sassoulas
391c8aa96c Update copyright notice with copyrite 2021-02-28 22:33:34 +01:00
Pierre Sassoulas
ac85223e5b Apply copyrite --contribution-threshold 2021-02-21 16:56:19 +01:00
Frank Harrison
b41e8d940d mapreduce| Fixes -jN for map/reduce Checkers (e.g. SimilarChecker)
This integrate the map/reduce functionality into lint.check_process().

We previously had `map` being invoked, here we add `reduce` support.

We do this by collecting the map-data by worker and then passing it to a
reducer function on the Checker object, if available - determined by
whether they confirm to the `mapreduce_checker.MapReduceMixin` mixin
interface or nor.

This allows Checker objects to function across file-streams when using
multiprocessing/-j2+. For example SimilarChecker needs to be able to
compare data across all files.

The tests, that we also add here, check that a Checker instance returns
and reports expected data and errors, such as error-messages and stats -
at least in a exit-ok (0) situation.

On a personal note, as we are copying more data across process
boundaries, I suspect that the memory implications of this might cause
issues for large projects already running with -jN and duplicate code
detection on. That said, given that it takes a long time to perform
lints of large code bases that is an issue for the [near?] future and
likely to be part of the performance work. Either way but let's get it
working first and deal with memory and perforamnce considerations later
- I say this as there are many quick wins we can make here, e.g.
file-batching, hashing lines, data compression and so on.
2021-01-02 09:56:39 +01:00