|
1 | | -# Context |
2 | 1 |
|
3 | | -We build and release software by massively consuming and producing software |
4 | | -packages such as npm packages, RPMs, Ruby gems, etc. |
5 | | - |
6 | | -Each package manager, platform, type or ecosystem has its own conventions and |
7 | | -protocols to identify, locate and provision software packages. |
8 | | - |
9 | | -# Problem |
10 | | - |
11 | | -When tools, APIs and databases process or store multiple package types, it is |
12 | | -difficult to reference the same software package across tools in a uniform way. |
13 | | - |
14 | | -For example, these tools, specifications and API use relatively similar |
15 | | -approaches to identify and locate software packages, each with subtle |
16 | | -differences in syntax, naming and conventions: |
17 | | - |
18 | | -- Grafeas uses a scheme, namespace, name and version in a URL-like string. |
19 | | - https://github.com/Grafeas/Grafeas |
20 | | - |
21 | | -- Here.com OSRK uses a package manager, name and version field and a colon- |
22 | | - separated URL-like string |
23 | | - https://github.com/heremaps/oss-review-toolkit |
24 | | - |
25 | | -- JFrog XRay uses a scheme, namespace, name and version in a URL-like string |
26 | | - https://www.jfrog.com/confluence/display/XRAY/Xray+REST+API#XrayRESTAPI-ComponentIdentifiers |
27 | | - |
28 | | -- Libraries.io uses a platform, name and version |
29 | | - https://libraries.io/ |
30 | | - |
31 | | -- OpenShift fabric8 analytics uses ecosystem, name and version |
32 | | - https://github.com/fabric8-analytics/ |
33 | | - |
34 | | -- ScanCode and AboutCode.org use a type, name and version |
35 | | - https://github.com/nexB/scancode-toolkit |
36 | | - |
37 | | -- SPDX has an appendix for external repository references and uses a type and a |
38 | | - locator with a type-specific syntax for component separators in a URL-like |
39 | | - string |
40 | | - https://spdx.github.io/spdx-spec/latest/package-information/ |
41 | | - |
42 | | -- versioneye uses a type, name and version |
43 | | - https://github.com/versioneye/ |
44 | | - |
45 | | -- Sonatype Lifecycle uses a format id followed by format specific coordinates. |
46 | | - https://links.sonatype.com/products/nxiq/doc/component-identifier |
47 | | - |
48 | | -# Solution |
49 | | - |
50 | | -A `purl` or package URL is an attempt to standardize existing approaches to |
51 | | -reliably identify and locate software packages. |
52 | | - |
53 | | -A `purl` is a URL string used to identify and locate a software package in a |
54 | | -mostly universal and uniform way across programming languages, package managers, |
55 | | -packaging conventions, tools, APIs and databases. |
56 | | - |
57 | | -Such a package URL is useful to reliably reference the same software package |
| 2 | +# Package-URL (PURL) Specification |
| 3 | + |
| 4 | +Software ecosystems have evolved into highly interconnected networks of |
| 5 | +components, packages, and dependencies. Managing this complexity demands a |
| 6 | +robust, uniform mechanism to identify and track software packages across |
| 7 | +diverse ecosystems and tools. Package-URL (PURL) was developed to address |
| 8 | +this challenge by providing a simple, consistent, and flexible approach to |
| 9 | +identifying software packages with precision and clarity. |
| 10 | + |
| 11 | +PURL introduces a standardized URL-based syntax that uniquely identifies |
| 12 | +software packages, independent of their ecosystem or distribution channel. |
| 13 | +Unlike traditional identification methods, PURL embeds critical metadata |
| 14 | +directly into its structure, enabling efficient, accurate package |
| 15 | +identification at scale. This standardization ensures interoperability |
| 16 | +between tools and ecosystems, fostering greater collaboration and reducing |
| 17 | +ambiguity in software supply chain management. |
| 18 | + |
| 19 | +Challenges addressed by PURL: |
| 20 | + |
| 21 | +- **Ambiguity in Package Identification:** With diverse naming conventions |
| 22 | + across ecosystems, identifying software packages reliably has historically |
| 23 | + been a challenge. PURL eliminates this ambiguity by creating a universal |
| 24 | + identifier with a predictable structure. |
| 25 | +- **Cross-Ecosystem Interoperability:** Developers, organizations, and tools |
| 26 | + often work across multiple ecosystems, each with its own package management |
| 27 | + systems. PURL harmonizes these differences, enabling seamless |
| 28 | + interoperability. |
| 29 | +- **Enhanced Traceability and Risk Management:** In an era where supply chain |
| 30 | + security is critical, PURL provides the foundation for identifying and |
| 31 | + tracing packages to their origins, dependencies, and potential |
| 32 | + vulnerabilities. |
| 33 | +- **Tooling and Automation:** By standardizing package identification, PURL |
| 34 | + simplifies tooling development, automation, and integration for tasks such |
| 35 | + as software composition analysis, vulnerability management, and license |
| 36 | + compliance. |
| 37 | + |
 | 38 | +PURL is an Ecma standard, [ECMA-427](https://tc54.org/purl/), and is |
 | 39 | +in the process of also becoming an ISO standard. |
| 40 | + |
| 41 | +## Why use PURL? |
| 42 | + |
| 43 | +A PURL is useful to reliably reference the same software package |
58 | 44 | using a simple and expressive syntax and conventions based on familiar URLs. |
59 | 45 |
|
60 | | -Check also this short `purl` presentation (with video) at FOSDEM 2018 |
61 | | -https://fosdem.org/2018/schedule/event/purl/ for an overview. |
| 46 | +PURL is used as a standard identifier for software components in: |
| 47 | +- A CycloneDX or SPDX SBOM |
| 48 | +- Most software vulnerability databases such as [OSV](https://osv.dev/), |
 | 49 | +[Sonatype OSS Index](https://ossindex.sonatype.org/), and [VulnerableCode](https://public2.vulnerablecode.io/) |
| 50 | +- Many package repositories, such as [Crates.io](https://crates.io/) and |
| 51 | +[Packagist](https://packagist.org/) |
62 | 52 |
|
63 | | -## purl |
| 53 | +PURL was recently added to the standard [CVE Record Format v5.2.0](https://github.com/CVEProject/cve-schema/releases/tag/v5.2.0). |
64 | 54 |
|
65 | | -`purl` stands for **package URL**. |
| 55 | +## Getting started |
66 | 56 |
|
67 | | -A `purl` is a URL composed of seven components: |
| 57 | +A PURL is a URL composed of seven components: |
68 | 58 |
|
69 | 59 | scheme:type/namespace/name@version?qualifiers#subpath |
70 | 60 |
|
71 | 61 | Components are separated by a specific character for unambiguous parsing. |
72 | 62 |
|
73 | | -The definition for each components is: |
74 | | - |
75 | | -- **scheme**: this is the URL scheme with the constant value of "pkg". One of |
76 | | - the primary reason for this single scheme is to facilitate the future official |
77 | | - registration of the "pkg" scheme for package URLs. Required. |
78 | | -- **type**: the package "type" or package "protocol" such as maven, npm, nuget, |
79 | | - gem, pypi, etc. Required. |
80 | | -- **namespace**: some name prefix such as a Maven groupid, a Docker image owner, |
81 | | - a GitHub user or organization. Optional and type-specific. |
82 | | -- **name**: the name of the package. Required. |
83 | | -- **version**: the version of the package. Optional. |
84 | | -- **qualifiers**: extra qualifying data for a package such as an OS, |
85 | | - architecture, a distro, etc. Optional and type-specific. |
86 | | -- **subpath**: extra subpath within a package, relative to the package root. |
87 | | - Optional. |
| 63 | +The definition for each component is: |
88 | 64 |
|
89 | | -Components are designed such that they form a hierarchy from the most significant component |
90 | | -on the left to the least significant component on the right. |
| 65 | +| Component | Requirement | Description| |
| 66 | +| ---------- | ----------- |:------------------------------------------------------ | |
| 67 | +| scheme | Required | The URL scheme with the constant value of "pkg". One of the primary reasons for this single scheme is to facilitate the future official registration of the "pkg" scheme for Package-URLs. | |
| 68 | +| type | Required | The package "type" or package "protocol" such as maven, npm, nuget, gem, pypi, etc. | |
| 69 | +| namespace | Optional | A name prefix such as a Maven groupid, a Docker image owner, a GitHub user or organization. Namespace is type-specific. | |
| 70 | +| name | Required | The name of the package. | |
| 71 | +| version | Optional | The version of the package. | |
| 72 | +| qualifiers | Optional | Qualifier data for a package such as OS, architecture, repository, etc. Qualifiers are type-specific. | |
| 73 | +| subpath | Optional | Subpath within a package, relative to the package root. | |
91 | 74 |
|
92 | | -A `purl` must NOT contain a URL Authority i.e. there is no support for |
93 | | -`username`, `password`, `host` and `port` components. A `namespace` segment may |
94 | | -sometimes look like a `host` but its interpretation is specific to a `type`. |
| 75 | +Components are designed such that they form a hierarchy from the most |
| 76 | +significant component on the left to the least significant component on the |
| 77 | +right. |
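
To make the component order concrete, here is a minimal Python sketch that assembles a PURL string from its parts. The function name `build_purl` and the helper `_encode` are hypothetical and not part of any official library. The percent-encoding rule is a simplifying assumption (unreserved characters and `:` stay literal, everything else is encoded); ECMA-427 remains the authority on the exact rules.

```python
from typing import Mapping, Optional
from urllib.parse import quote


def _encode(value: str) -> str:
    # Assumption: keep letters, digits, "-", ".", "_", "~" and ":" literal;
    # percent-encode everything else, including "/" and "@".
    return quote(value, safe=":")


def build_purl(
    ptype: str,
    name: str,
    namespace: Optional[str] = None,
    version: Optional[str] = None,
    qualifiers: Optional[Mapping[str, str]] = None,
    subpath: Optional[str] = None,
) -> str:
    """Assemble scheme:type/namespace/name@version?qualifiers#subpath."""
    parts = ["pkg:", ptype.lower(), "/"]
    if namespace:
        # A namespace may have several "/"-separated segments; encode each one.
        segments = [s for s in namespace.strip("/").split("/") if s]
        if segments:
            parts.append("/".join(_encode(s) for s in segments) + "/")
    parts.append(_encode(name))
    if version:
        parts.append("@" + _encode(version))
    if qualifiers:
        # Lowercase keys, drop empty values, and sort by key for a stable result.
        pairs = sorted((k.lower(), v) for k, v in qualifiers.items() if v)
        if pairs:
            parts.append("?" + "&".join(f"{k}={_encode(v)}" for k, v in pairs))
    if subpath:
        segments = [s for s in subpath.strip("/").split("/") if s not in ("", ".", "..")]
        if segments:
            parts.append("#" + "/".join(_encode(s) for s in segments))
    return "".join(parts)


print(build_purl("npm", "animation", namespace="@angular", version="12.3.1"))
# prints: pkg:npm/%40angular/animation@12.3.1
```

For production use, one of the maintained PURL libraries is likely a better choice than a hand-rolled builder like this one.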
95 | 78 |
|
96 | | -## Some `purl` examples |
97 | | - |
98 | | - pkg:bitbucket/birkenfeld/pygments-main@244fd47e07d1014f0aed9c |
| 79 | +### Some PURL examples |
99 | 80 |
|
100 | 81 | pkg:deb/debian/curl@7.50.3-1?arch=i386&distro=jessie |
101 | | - |
102 | 82 | pkg:docker/cassandra@sha256:244fd47e07d1004f0aed9c |
103 | | - pkg:docker/customer/dockerimage@sha256:244fd47e07d1004f0aed9c?repository_url=gcr.io |
104 | | - |
105 | 83 | pkg:gem/jruby-launcher@1.1.2?platform=java |
106 | | - pkg:gem/ruby-advisory-db-check@0.12.4 |
| 84 | + pkg:golang/google.golang.org/genproto#googleapis/api/annotations |
| 85 | + pkg:maven/org.apache.xmlgraphics/batik-anim@1.9.1?repository_url=repo.spring.io%2Frelease&packaging=sources |
| 86 | + pkg:npm/%40angular/animation@12.3.1 |
| 87 | + pkg:nuget/EnterpriseLibrary.Common@6.0.1304 |
| 88 | + pkg:pypi/django@1.11.1 |
| 89 | + pkg:rpm/fedora/curl@7.50.3-1.fc25?arch=i386&distro=fedora-25 |
| 90 | + pkg:rpm/opensuse/curl@7.56.1-1.1.?arch=i386&distro=opensuse-tumbleweed |
107 | 91 |
|
108 | | - pkg:github/package-url/purl-spec@244fd47e07d1004f0aed9c |
| 92 | +(NB: the checksum for the docker example is truncated for brevity) |
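
As an illustration of how such strings break down into the seven components, the following Python sketch splits a PURL using only the standard library. The names `ParsedPurl`, `parse_purl`, and `_split_right` are hypothetical. The splitting order (subpath, then qualifiers, scheme, type, version, name, namespace) is an assumption about a reasonable parsing strategy, and type-specific normalization is deliberately left out; see ECMA-427 for the normative rules.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple
from urllib.parse import unquote


@dataclass
class ParsedPurl:
    type: str
    name: str
    namespace: Optional[str] = None
    version: Optional[str] = None
    qualifiers: Dict[str, str] = field(default_factory=dict)
    subpath: Optional[str] = None


def _split_right(text: str, sep: str) -> Tuple[str, Optional[str]]:
    # Split once on the last occurrence of sep; tail is None if sep is absent.
    head, found, tail = text.rpartition(sep)
    return (head, tail) if found else (text, None)


def parse_purl(purl: str) -> ParsedPurl:
    remainder, raw_subpath = _split_right(purl, "#")
    remainder, query = _split_right(remainder, "?")

    scheme, found, remainder = remainder.partition(":")
    if not found or scheme.lower() != "pkg":
        raise ValueError("a PURL must start with the 'pkg:' scheme")

    ptype, found, remainder = remainder.strip("/").partition("/")
    if not found or not ptype:
        raise ValueError("a PURL needs both a type and a name")

    remainder, raw_version = _split_right(remainder, "@")
    raw_namespace, _, raw_name = remainder.rpartition("/")
    if not raw_name:
        raise ValueError("a PURL needs a name")

    namespace = "/".join(unquote(s) for s in raw_namespace.split("/") if s) or None

    qualifiers: Dict[str, str] = {}
    for pair in (query or "").split("&"):
        key, _, value = pair.partition("=")
        if key and value:
            qualifiers[key.lower()] = unquote(value)

    subpath = None
    if raw_subpath:
        segments = [s for s in raw_subpath.strip("/").split("/") if s not in ("", ".", "..")]
        subpath = "/".join(unquote(s) for s in segments) or None

    return ParsedPurl(
        type=ptype.lower(),
        name=unquote(raw_name),
        namespace=namespace,
        version=unquote(raw_version) if raw_version else None,
        qualifiers=qualifiers,
        subpath=subpath,
    )


parsed = parse_purl("pkg:npm/%40angular/animation@12.3.1")
print(parsed.namespace, parsed.name, parsed.version)
# prints: @angular animation 12.3.1
```

Splitting the version from the right on `@` is what lets the percent-encoded `%40angular` namespace coexist with the `@` version separator.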
109 | 93 |
|
110 | | - pkg:golang/google.golang.org/genproto#googleapis/api/annotations |
| 94 | +### PURL specification |
111 | 95 |
|
112 | | - pkg:maven/org.apache.xmlgraphics/batik-anim@1.9.1?packaging=sources |
113 | | - pkg:maven/org.apache.xmlgraphics/batik-anim@1.9.1?repository_url=repo.spring.io/release |
| 96 | +The PURL specification consists of a core syntax definition and specific |
| 97 | +PURL type definitions: |
114 | 98 |
|
115 | | - pkg:npm/%40angular/animation@12.3.1 |
116 | | - pkg:npm/foobar@12.3.1 |
| 99 | +The core PURL syntax is defined in the Package-URL Specification / [ECMA-427](https://tc54.org/purl/). See ECMA-427 *Clause 5 Package-URL specification* |
| 100 | +for syntax details. |
117 | 101 |
|
118 | | - pkg:nuget/EnterpriseLibrary.Common@6.0.1304 |
| 102 | +Each package manager, platform, type, or ecosystem has its own conventions |
| 103 | +and protocols to identify, locate, and provision software packages. The |
| 104 | +package **type** is the component of a Package-URL that is used to capture |
| 105 | +this information with a short string such as **maven**, **npm** and **pypi**. |
| 106 | +See ECMA-427 *Clause 6 Package-URL Type Definition Schema* for PURL type |
| 107 | +definition details. |
119 | 108 |
|
120 | | - pkg:pypi/django@1.11.1 |
| 109 | +## Package-URL type definitions |
121 | 110 |
|
122 | | - pkg:rpm/fedora/curl@7.50.3-1.fc25?arch=i386&distro=fedora-25 |
123 | | - pkg:rpm/opensuse/curl@7.56.1-1.1.?arch=i386&distro=opensuse-tumbleweed |
 | 111 | +PURL type definitions are maintained in a set of JSON Schema files, with a |
 | 112 | +separate file for each PURL **type** and a simple index of all currently registered PURL types. You can find comprehensive PURL type information in |
 | 113 | +this repository as follows: |
124 | 114 |
|
125 | | -(NB: some checksums are truncated for brevity) |
| 115 | +- One JSON file for each PURL type definition at: |
| 116 | + https://github.com/package-url/purl-spec/tree/main/types |
126 | 117 |
|
127 | | -## Specification details |
| 118 | +- Markdown documentation, generated from each PURL type JSON |
| 119 | + definition at: |
| 120 | + https://github.com/package-url/purl-spec/tree/main/types-doc |
128 | 121 |
|
129 | | -The `purl` specification consists of a core syntax definition and independent |
130 | | -type definitions: |
| 122 | +- The JSON Index listing of all registered PURL types at: |
| 123 | + https://github.com/package-url/purl-spec/tree/main/purl-types-index.json |
131 | 124 |
|
132 | | -- `Package URL core <PURL-SPECIFICATION.rst>`_: Defines a versioned and |
133 | | - formalized format, syntax, and rules used to represent and validate `purl`. |
| 125 | +## Adopters |
134 | 126 |
|
135 | | -- `Type definitions <PURL-TYPES.md>`_: Defines `purl` types (e.g. maven, npm, |
136 | | - cargo, rpm, etc) independent of the core specification. Definitions also |
137 | | - include types reserved for future use. |
 | 127 | +See the *ADOPTERS.md* file in this repository for a partial list of FOSS projects and software companies that have adopted PURL as a standard software identifier. |
138 | 128 |
|
139 | | -## Known implementations |
| 129 | +## Implementations |
| 130 | + |
| 131 | +In addition to the broad and growing adoption of PURL as a standard software |
| 132 | +identifier, there are many FOSS projects that implement PURL for languages or software ecosystems. A partial list is: |
140 | 133 |
|
141 | 134 | - .NET: https://github.com/package-url/packageurl-dotnet |
142 | 135 | - Erlang / Elixir: https://github.com/erlef/purl |
143 | 136 | - Go: https://github.com/package-url/packageurl-go |
144 | | -- Java: https://github.com/package-url/packageurl-java, |
145 | | - https://github.com/sonatype/package-url-java |
| 137 | +- Java: https://github.com/package-url/packageurl-java |
146 | 138 | - JavaScript: https://github.com/package-url/packageurl-js |
147 | 139 | - Kotlin: https://github.com/iseki0/PUrlKt |
148 | 140 | - Perl: https://github.com/giterlizzi/perl-URI-PackageURL |
149 | 141 | - PHP: https://github.com/package-url/packageurl-php |
150 | 142 | - Python: https://github.com/package-url/packageurl-python |
| 143 | +- Raku: https://github.com/lizmat/PURL |
151 | 144 | - Ruby: https://github.com/package-url/packageurl-ruby |
152 | 145 | - Rust: https://github.com/package-url/packageurl.rs |
153 | 146 | - Swift: https://github.com/package-url/packageurl-swift |
154 | 147 |
|
155 | | -## Users, adopters and links |
| 148 | +## Support |
| 149 | + |
| 150 | +If you have a specific problem, suggestion or bug, please submit a |
| 151 | +[GitHub issue](https://github.com/package-url/purl-spec/issues). |
| 152 | + |
| 153 | +For quick questions or socializing, join the PURL community discussions |
| 154 | +at: |
| 155 | +- [AboutCode Slack](https://aboutcode-org.slack.com/archives/C08LBQMA7DE) |
| 156 | +- [CycloneDX Slack](https://cyclonedx.slack.com/archives/C06KTE3BWEB) |
| 157 | +- [Gitter](https://matrix.to/#/#package-url_Lobby:gitter.im) |
| 158 | + |
| 159 | +## License |
| 160 | + |
| 161 | +Copyright (c) the purl authors |
| 162 | + |
| 163 | +License for the **purl-spec** software: |
| 164 | + |
| 165 | +- SPDX-License-Identifier: MIT |
156 | 166 |
|
157 | | -See the [dedicated adopters list](ADOPTERS.md). |
| 167 | +License for the ECMA-427 standard: |
| 168 | +- SPDX-License-Identifier: LicenseRef-scancode-ecma-standard-copyright-2024 |