Commit Graph

3 Commits

Author SHA1 Message Date
Frank Harrison
b41e8d940d mapreduce| Fixes -jN for map/reduce Checkers (e.g. SimilarChecker)
This integrates the map/reduce functionality into lint.check_process().

We previously had `map` being invoked, here we add `reduce` support.

We do this by collecting the map-data by worker and then passing it to a
reducer function on the Checker object, if available - determined by
whether they conform to the `mapreduce_checker.MapReduceMixin` mixin
interface or not.

This allows Checker objects to function across file-streams when using
multiprocessing/-j2+. For example SimilarChecker needs to be able to
compare data across all files.
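
As a rough illustration of the flow described above (not the code from this
commit), the sketch below simulates the workers sequentially instead of
using real processes. All names in it (MapReduceSketch, DuplicateLineChecker,
run_worker, check_all) are hypothetical stand-ins, and the method names
get_map_data/reduce_map_data are assumptions about what such an interface
could look like.

```python
from collections import defaultdict


class MapReduceSketch:
    """Hypothetical stand-in for the mixin interface named in the commit."""

    def get_map_data(self):
        raise NotImplementedError

    def reduce_map_data(self, data):
        raise NotImplementedError


class DuplicateLineChecker(MapReduceSketch):
    """Toy checker: reports lines that appear in more than one file."""

    def __init__(self):
        self.seen = []      # (filename, stripped line) pairs from this worker
        self.messages = []  # filled in by the reduce step only

    def process_file(self, filename, text):
        # "map" step: record per-file data, no cross-file comparison yet
        for line in text.splitlines():
            if line.strip():
                self.seen.append((filename, line.strip()))

    def get_map_data(self):
        # Data that would be sent back across the process boundary
        return list(self.seen)

    def reduce_map_data(self, data):
        # "reduce" step: merge every worker's data and compare across files
        files_by_line = defaultdict(set)
        for chunk in data:
            for filename, line in chunk:
                files_by_line[line].add(filename)
        for line, files in sorted(files_by_line.items()):
            if len(files) > 1:
                self.messages.append((line, sorted(files)))


def run_worker(files):
    # Each simulated worker gets its own checker instance
    checker = DuplicateLineChecker()
    for filename, text in files:
        checker.process_file(filename, text)
    return checker.get_map_data()


def check_all(file_batches):
    # Collect the map data by worker, then reduce once in the main process
    map_data = [run_worker(batch) for batch in file_batches]
    main_checker = DuplicateLineChecker()
    if isinstance(main_checker, MapReduceSketch):  # only reduce if supported
        main_checker.reduce_map_data(map_data)
    return main_checker.messages
```

Neither simulated worker sees both files, so a duplicate that spans two
batches can only be found in the reduce step, which is why a checker such as
SimilarChecker needs it.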

The tests, that we also add here, check that a Checker instance returns
and reports expected data and errors, such as error-messages and stats -
at least in an exit-ok (0) situation.

On a personal note, as we are copying more data across process
boundaries, I suspect that the memory implications of this might cause
issues for large projects already running with -jN and duplicate code
detection on. That said, given that it takes a long time to perform
lints of large code bases, that is an issue for the [near?] future and
likely to be part of the performance work. Either way, let's get it
working first and deal with memory and performance considerations later
- I say this as there are many quick wins we can make here, e.g.
file-batching, hashing lines, data compression and so on.
2021-01-02 09:56:39 +01:00
manderj
136ab0f960 lint multiprocessing pool shutdown 2020-10-30 22:13:16 +01:00
Frank Harrison
2970bf2ff2 check_parallel| Fixes module-hiding by function via file-rename 2020-10-19 12:26:17 -07:00