Integration System documentation
Packages
scheme - package with classes to validate data schemas
metrics - package with classes that implement various distance metrics
comparators - package with classes that apply metrics to compare entries
content_parser - package with classes that extract data from heterogeneous data sources
entity_resolution - package that resolves entities after extraction
entity_resolution.blocking - package with classes that implement blocking of entries
entity_resolution.pairwise_matching - package with classes that perform pairwise matching
entity_resolution.clustering - package with classes that cluster entries to identify entities
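The package list above implies a flow of metric, blocking, pairwise matching, and clustering. The following is a minimal sketch of that flow, assuming the project is written in Python. Every name in it (`similarity`, `block`, `match_pairs`, `cluster`, the sample entries, and the 0.85 threshold) is hypothetical and chosen for illustration; none of it is the actual API of these packages. The metric is a stand-in built on the standard library's `difflib.SequenceMatcher`, and clustering is done with a simple union-find over matched pairs.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Metric (hypothetical stand-in): string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def block(entries, key):
    """Blocking: group entry indices by a cheap key so that only
    entries sharing a block are compared later."""
    blocks = defaultdict(list)
    for idx, entry in enumerate(entries):
        blocks[key(entry)].append(idx)
    return blocks


def match_pairs(entries, blocks, field, threshold):
    """Pairwise matching: compare entries inside each block and keep
    the index pairs whose similarity reaches the threshold."""
    matches = []
    for indices in blocks.values():
        for i, j in combinations(indices, 2):
            if similarity(entries[i][field], entries[j][field]) >= threshold:
                matches.append((i, j))
    return matches


def cluster(n, matches):
    """Clustering: merge matched pairs into connected components
    using union-find; each component is treated as one entity."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for i, j in matches:
        parent[find(i)] = find(j)

    groups = defaultdict(list)
    for idx in range(n):
        groups[find(idx)].append(idx)
    return sorted(groups.values())


# Illustrative sample data (not taken from the project).
entries = [
    {"name": "Jonathan Smith", "city": "Boston"},
    {"name": "Jonathon Smith", "city": "Boston"},
    {"name": "Maria Garcia", "city": "Austin"},
    {"name": "Jon Smith", "city": "Denver"},
]

blocks = block(entries, key=lambda e: e["city"])
matches = match_pairs(entries, blocks, field="name", threshold=0.85)
clusters = cluster(len(entries), matches)
print(clusters)  # [[0, 1], [2], [3]]
```

Note that "Jon Smith" in Denver is never compared with the Boston entries, because blocking by city places it in a different block. This is the usual trade-off of blocking: fewer comparisons at the cost of possibly missing matches that fall across blocks.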
Supplementary Directories:
| Directory | Description |
|---|---|
| test | collection of tests that follows the structure of the project |
| bin | collection of starter scripts that run stages of integration |
| config | configuration of the integration system |
| config/extraction_rule | collection of rules used to extract data from sources |
| config/schemas | collection of data schemas that describe the domains in which the integration system operates |
| data | directory that contains temporary supplemental data |
Architecture: