| 1 | +aboutcode.federated |
| 2 | +=================== |
| 3 | + |
| 4 | +This is a library of utilities to compute ids and file paths for AboutCode
| 5 | +federated data based on Package URL (PURL).
| 6 | + |
| 7 | + |
| 8 | +The goal of the federated data utilities is to handle content-defined and
| 9 | +hash-addressable Package data keyed by PURL and stored in many Git repositories.
| 10 | +This approach to federating decentralized data is called FederatedCode.
| 11 | + |
| 12 | + |
| 13 | +Overview |
| 14 | +======== |
| 15 | + |
| 16 | +The main design elements for these utilities are: |
| 17 | + |
| 18 | +1. **Data Federation**: A Data Federation is a database representing a consistent,
| 19 | +non-overlapping set of data kind clusters (like scans, vulnerabilities or SBOMs)
| 20 | +across many package ecosystems, also known as PURL types.
| 21 | +A Federation is similar to a traditional database.
| 22 | + |
| 23 | +2. **Data Cluster**: A Data Federation contains Data Clusters, where the purpose
| 24 | +of a Data Cluster is to store the data of a single kind (like scans) across multiple
| 25 | +PURL types. The cluster name is the data kind name and is used as the prefix for
| 26 | +repository names. A Data Cluster is akin to a table in a traditional database.
| 27 | + |
| 28 | +3. **Data Repository**: A Data Cluster consists of one or more Git Data Repositories,
| 29 | +each storing datafiles of the cluster data kind for a single PURL type, spreading
| 30 | +the datafiles across multiple Data Directories. The repository name is
| 31 | +data-kind+PURL-type+hashid. A Repository is similar to a shard or tablespace in a
| 32 | +traditional database.
| 33 | + |
| 34 | +4. **Data Directory**: In a Repository, a Data Directory contains the datafiles for
| 35 | +PURLs. The directory name is PURL-type+hashid.
| 36 | + |
| 37 | +5. **Data File**: This is a Data File of the Data Cluster's Data Kind that is
| 38 | +stored in subdirectories structured after the PURL components::
| 39 | + |
| 40 | + namespace/name/version/qualifiers/subpath: |
| 41 | + |
| 42 | +- Either at the PURL name level: namespace/name,
| 43 | +- Or at the PURL version level: namespace/name/version,
| 44 | +- Or at the PURL qualifiers+PURL subpath level.
| 45 | + |
| 46 | +A Data File can be, for instance, a JSON scan results file or a list of PURLs in
| 47 | +YAML.
| 48 | + |
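The name and version levels described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: ``datafile_subpath`` is a hypothetical helper, not a function documented by this library, and it leaves out the qualifiers and subpath level entirely.

```python
from pathlib import PurePosixPath
from typing import Optional


def datafile_subpath(
    name: str,
    filename: str,
    namespace: Optional[str] = None,
    version: Optional[str] = None,
) -> str:
    """Return a relative Data File path built from PURL components.

    Hypothetical helper: empty components (for example a missing
    namespace or version) are skipped, so the same function covers
    both the name level and the version level.
    """
    parts = [part for part in (namespace, name, version) if part]
    return str(PurePosixPath(*parts, filename))


# Name level: namespace/name (this package has no namespace).
print(datafile_subpath("random_password_generator", "purls.yml"))

# Version level: namespace/name/version.
print(datafile_subpath("file", "scancode.yml", version="3.24.3"))
```

Skipping empty components keeps a package without a namespace from producing an empty path segment.
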
| 49 | +For example, a list of PURLs as a Data Kind would be stored at the name
| 50 | +subdirectory level::
| 51 | + |
| 52 | + gem-0107/gem/random_password_generator/purls.yml |
| 53 | + |
| 54 | +Or a ScanCode scan as a Data Kind at the version subdirectory level:: |
| 55 | + |
| 56 | + gem-0107/npm/file/3.24.3/scancode.yml |
| 57 | + |
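As a rough sketch of how the Repository and Directory names relate to such paths, the following Python snippet composes and splits them. It assumes that the name parts are joined with a hyphen (as the ``gem-0107`` directory above suggests) and treats the hashid as an opaque string; how the hashid is computed is not covered here. All three function names, and the ``purls`` data kind used in the example, are hypothetical and not part of this library's documented API.

```python
from typing import Tuple


def directory_name(purl_type: str, hashid: str) -> str:
    """Compose a Data Directory name as PURL-type + hashid (hyphen assumed)."""
    return f"{purl_type}-{hashid}"


def repository_name(data_kind: str, purl_type: str, hashid: str) -> str:
    """Compose a Data Repository name as data-kind + PURL-type + hashid."""
    return f"{data_kind}-{directory_name(purl_type, hashid)}"


def split_datafile_path(path: str) -> Tuple[str, str, str]:
    """Split a path into (PURL type, hashid, remaining path).

    The first path segment is taken as the Data Directory name, and the
    hashid is assumed to be everything after its last hyphen.
    """
    directory, _, remainder = path.partition("/")
    purl_type, _, hashid = directory.rpartition("-")
    return purl_type, hashid, remainder


print(directory_name("gem", "0107"))
print(repository_name("purls", "gem", "0107"))
print(split_datafile_path("gem-0107/gem/random_password_generator/purls.yml"))
```

Splitting on the last hyphen keeps a PURL type that itself contains a hyphen intact, as long as the hashid does not contain one.
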
| 58 | + |
| 59 | + |
| 60 | +License |
| 61 | +------- |
| 62 | + |
| 63 | +Copyright (c) AboutCode and others. All rights reserved. |
| 64 | + |
| 65 | +SPDX-License-Identifier: Apache-2.0 |
| 66 | + |
| 67 | +See https://github.com/aboutcode-org/vulnerablecode for support or download. |
| 68 | + |
| 69 | +See https://aboutcode.org for more information about AboutCode OSS projects. |