2 changes: 2 additions & 0 deletions README.md
@@ -152,6 +152,8 @@ TODO: most of these are way underspecified
- /ipcidr
- /dns4, /dns6
- [/dnsaddr](protocols/DNSADDR.md)
- [/unix](protocols/unix.md)
- [/unix-abstract](protocols/unix.md)
- /tcp
- /udp
- /utp
6 changes: 5 additions & 1 deletion protocols.csv
@@ -2,6 +2,9 @@ code, size, name, comment
4, 32, ip4,
6, 16, tcp,
273, 16, udp,
403, 0, stream,
404, 0, seqpacket,
405, 0, dgram,
Comment on lines +5 to +7
Member

@nabijaczleweli these seem to be off by one.

multiformats/multicodec#382 uses 402-404, while this PR uses 403-405. The two need to be unified, and I assume we don't want a gap, so this PR should be

Suggested change
-403, 0, stream,
-404, 0, seqpacket,
-405, 0, dgram,
+402, 0, stream,
+403, 0, seqpacket,
+404, 0, dgram,

?

33, 16, dccp,
41, 128, ip6,
42, V, ip6zone, rfc4007 IPv6 zone
@@ -13,7 +16,8 @@ code, size, name, comment
132, 16, sctp,
301, 0, udt,
302, 0, utp,
400, V, unix, Percent-encoded path to a Unix domain socket
400, V, unix, Percent-encoded path to a unix-domain socket
401, V, unix-abstract, Percent-encoded address of a linux abstract unix-domain socket
421, V, p2p, preferred over /ipfs
421, V, ipfs, backwards compatibility; equivalent to /p2p
444, 96, onion,
33 changes: 30 additions & 3 deletions protocols/unix.md
@@ -1,6 +1,6 @@
# `unix`

This protocol encodes a Unix domain socket path to a resource. In the string
This protocol encodes a Unix-domain socket path to a resource. In the string
representation, the path is encoded in a way consistent with a single URI Path
segment per [RFC 3986 Section 3.3](https://datatracker.ietf.org/doc/html/rfc3986#autoid-23).

@@ -10,6 +10,9 @@ representation, no encoding is needed as the value is length prefixed.
When comparing multiaddrs, implementations should compare their binary
representation to avoid ambiguities over which characters were escaped.
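
The comparison rule above can be illustrated with a minimal Python sketch. It assumes the general multiaddr binary layout of an unsigned-varint protocol code followed, for variable-size values, by an unsigned-varint length and the raw bytes; the diff itself only says the value is "length prefixed". The helper names `uvarint` and `unix_to_binary` are hypothetical, and only a single `/unix/<segment>` component is handled.

```python
from urllib.parse import unquote_to_bytes

UNIX_CODE = 400  # code for `unix` in protocols.csv


def uvarint(n: int) -> bytes:
    """Encode a non-negative integer as an unsigned varint (7 bits per byte)."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def unix_to_binary(addr: str) -> bytes:
    """Convert a single-component /unix multiaddr string to binary form.

    Assumed layout: varint(code) + varint(len(path)) + raw path bytes.
    """
    prefix = "/unix/"
    if not addr.startswith(prefix):
        raise ValueError("expected a /unix multiaddr")
    path = unquote_to_bytes(addr[len(prefix):])
    return uvarint(UNIX_CODE) + uvarint(len(path)) + path


# Two spellings of the same path differ as strings but not as bytes.
print(unix_to_binary("/unix/a%2Db") == unix_to_binary("/unix/a-b"))
```

Because percent-decoding happens before the bytes are emitted, any choice of which characters to escape collapses to one binary value.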

The absence of a `/` character at the start of the decoded address indicates a
Member

Suggested change
-The absence of a `/` character at the start of the decoded address indicates a
+The absence of a `/` character (encoded as `%2F`) at the start of the decoded address indicates a

relative path, otherwise the path is absolute.
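
As a rough illustration of the string form and the relative/absolute rule, here is a small Python sketch. The function names are hypothetical, and escaping every reserved character with `quote(..., safe="")` is one conservative choice rather than something the spec mandates; RFC 3986 allows some characters to stay unescaped within a path segment.

```python
from urllib.parse import quote, unquote


def unix_to_multiaddr(path: str) -> str:
    """Encode a filesystem path as a /unix multiaddr string.

    safe="" forces "/" to be written as %2F so the whole path
    remains a single URI path segment.
    """
    return "/unix/" + quote(path, safe="")


def multiaddr_to_unix(addr: str) -> str:
    """Decode the path from the first segment after /unix/."""
    prefix = "/unix/"
    if not addr.startswith(prefix):
        raise ValueError("expected a /unix multiaddr")
    segment = addr[len(prefix):].split("/", 1)[0]
    return unquote(segment)


def is_absolute(decoded_path: str) -> bool:
    """A leading "/" in the decoded path marks it as absolute."""
    return decoded_path.startswith("/")


encoded = unix_to_multiaddr("/run/fsid.sock")
print(encoded)
print(is_absolute(multiaddr_to_unix(encoded)))
print(is_absolute(multiaddr_to_unix("/unix/fsid.sock")))
```

The check is applied to the decoded path, so in string form an absolute path is recognisable by a leading `%2F` in the segment.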

## Examples

The following is a table of examples converting some common Unix paths to their
@@ -33,5 +36,29 @@ Multiaddr string form.
appear anywhere, for example in the case where we route through some sort of
proxy server or SSH tunnel.

The absence of a `/` character at the start of the decoded address indicates a
relative path, otherwise the path is absolute.
Comment on lines -36 to -37
Member

These lines may have been removed by accident? They should probably remain in.

Author

applied

# `unix-abstract`

This protocol encodes a Linux Unix-domain abstract socket address,
which are distinguished by their first byte being 0.
It is encoded the same way as `unix`;
the marker byte is not part of the path.
Comment on lines +39 to +44
Member

@lidel lidel Jan 10, 2026

The spec should give a clear rationale for using a separate namespace instead of putting `%00` in `/unix`, because there will be people and LLMs that naively try to put `%00` in `/unix`.

That being said, @nabijaczleweli @achingbrain, apologies if this was discussed elsewhere, but why are we adding a second namespace rather than encoding the first byte explicitly?

For example, instead of

  • /unix-abstract/f87a1c847a4ecaf3%2Fbus%2Fsystemd%2Fbus-api-system

could we just prefix `%00`:

  • /unix/%00f87a1c847a4ecaf3%2Fbus%2Fsystemd%2Fbus-api-system

Is there any downside to doing this and removing `/unix-abstract`? (Please document this one way or another.)


## Examples

In the following table, the address column follows the userspace convention
of 0 bytes in the address being rendered as an `@` for abstract addresses
for display only.

| Rendered Address | multiaddr string form |
| -------------------------------------------- | ------------------------------------------------------------------ |
| @f87a1c847a4ecaf3/bus/systemd/bus-api-system | `/unix-abstract/f87a1c847a4ecaf3%2Fbus%2Fsystemd%2Fbus-api-system` |
| @/run/fsid.sock@@@@@@@@@@@@@@@@@@@@@@@@@@... | `/unix-abstract/%2Frun%2Ffsid.sock%00%00%00%00%00%00%00%00%00...` |
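
The rows above can be reproduced with a short Python sketch. It takes the abstract name without its leading 0 marker byte, as the proposed text describes, and derives the multiaddr string, the address bytes a Linux socket call would receive, and the `@`-rendered display form. All three function names are hypothetical, and escaping every reserved byte is a conservative assumption rather than a requirement of the spec.

```python
from urllib.parse import quote


def abstract_to_multiaddr(name: bytes) -> str:
    """Encode an abstract socket name (marker byte excluded) as a multiaddr.

    safe="" escapes "/" as %2F; 0 bytes inside the name become %00.
    """
    return "/unix-abstract/" + quote(name, safe="")


def abstract_sockaddr(name: bytes) -> bytes:
    """Prepend the 0 marker byte that distinguishes abstract addresses."""
    return b"\0" + name


def render(name: bytes) -> str:
    """Display form: every 0 byte in the full address is shown as '@'."""
    return abstract_sockaddr(name).replace(b"\0", b"@").decode("utf-8", "replace")


name = b"f87a1c847a4ecaf3/bus/systemd/bus-api-system"
print(render(name))
print(abstract_to_multiaddr(name))
```

Padding 0 bytes, as in the second row, are part of the name and therefore survive as `%00` in the string form.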

# `stream`, `seqpacket`, `dgram`
Member

nit: these are generic terms that exist outside of Unix sockets (streams are very generic, and so are datagrams). Would it be better to keep a `sock-` prefix for all three?


These correspond to the *type* of Unix-domain socket:
`SOCK_STREAM`, `SOCK_SEQPACKET`, `SOCK_DGRAM`.
Comment on lines +57 to +60
Member

Usage position unclear: Where do these appear in a multiaddr? After /unix? Example needed.


Previous versions of this specification did not contain these types;
for compatibility, their absence should not indicate an error,
if a default makes sense when decoding.
Comment on lines +62 to +64
Member

Default behavior underspecified: The doc says "their absence should not indicate an error, if a default makes sense when decoding" but doesn't specify what the default should be. Document that SOCK_STREAM should be assumed when no type is present?

This brings up the other gap, an undocumented use case: why would someone need to specify the socket type at the multiaddr level? Most Unix socket usage is SOCK_STREAM. What's the practical scenario requiring SOCK_SEQPACKET or SOCK_DGRAM in a multiaddr?
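
To make the two open questions concrete, here is one possible reading as a Python sketch. Both key behaviours are assumptions that the PR does not state: that the type component directly follows `/unix/<path>`, and that a missing type falls back to SOCK_STREAM as the review comment suggests. The function name and the string labels are hypothetical.

```python
from urllib.parse import unquote

# Protocol names from protocols.csv mapped to the socket types they denote.
SOCKET_TYPES = {
    "stream": "SOCK_STREAM",
    "seqpacket": "SOCK_SEQPACKET",
    "dgram": "SOCK_DGRAM",
}

# Assumption: the default proposed in review, not fixed by the spec.
DEFAULT_TYPE = "SOCK_STREAM"


def parse_unix_with_type(addr: str) -> tuple[str, str]:
    """Parse "/unix/<path>" or, by assumption, "/unix/<path>/<type>"."""
    parts = addr.split("/")
    # "/unix/<path>" splits into ["", "unix", "<path>"].
    if len(parts) not in (3, 4) or parts[0] != "" or parts[1] != "unix":
        raise ValueError("expected /unix/<path> with an optional type")
    path = unquote(parts[2])
    if len(parts) == 3:
        # Older addresses carry no type; do not treat that as an error.
        return path, DEFAULT_TYPE
    if parts[3] not in SOCKET_TYPES:
        raise ValueError(f"unknown socket type: {parts[3]}")
    return path, SOCKET_TYPES[parts[3]]


print(parse_unix_with_type("/unix/%2Frun%2Ffoo.sock"))
print(parse_unix_with_type("/unix/%2Frun%2Ffoo.sock/dgram"))
```

Whichever position and default the spec settles on, writing them down this explicitly would resolve both review comments.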