[Ietf-behave] review of rfc3489bis-04
Review of draft-ietf-behave-rfc3489bis-04
Bruce B. Lowekamp
College of William and Mary and
SIPeerior Technologies, Inc.
I waited to write this review until we completed implementing this
STUN draft inside the reSIProcate stack. A number of these comments
stem from the design of that implementation, which does initial
classification and parsing at the stack layer and then passes the
message up for further processing at the dialog/application level when
necessary.
I've grouped comments into two sections: protocol issues, where I
think the current specification should be changed in some way, and
document clarifications, where I think the draft has a mistake that
doesn't reflect the intention of the protocol. I also made some notes
on a few more general clarity and grammar issues that I am forwarding
to Jonathan separately.
Having worked with a previous implementation guided only by the
original 3489 and also worked with this version, 3489bis is
significantly cleaner and easier to read, understand, and implement.
Tiffany Broadbent did a lot of the implementation and many of these
comments stem from questions she brought up while developing.
Protocol issues:
There are several issues that are motivated by the combination of
supporting new usages as extensions and trying to integrate our
implementation into an entire SIP stack.
- It would be much easier to implement a stack-level for STUN if
specific bits indicated whether a message is a request, response,
error response, or indication. In several spots the draft refers to
response and error response types as 0x100 and 0x110 higher, which I
assume is intended to indicate bits to be set, but I don't think it
technically reads that way. However, no way of determining whether
a type is an indication is mentioned in the draft. Providing for
future message types to be identified without recompiling the code
with a new list of types for a new usage would be very helpful.
- There are a number of issues if future usages are defined that are
not idempotent. For example, Section 6 allows bit-wise identical
requests to use the same transaction ID. I assume this is intended
to simplify periodic keepalives for low-power devices, however I
don't think that assuming re-using transaction IDs is safe for future
usages is a good idea. I suggest revising the two sentences (or one
if changed) in that paragraph that allow TID reuse to indicate that
TIDs cannot be reused except for retransmissions of the same request
unless allowed by the particular usage.
- In a similar vein, there is currently no specification for how long
a server must remember a previous TID to ensure that a
retransmission or duplication is not processed twice. Also, there
is no longer any specification of what a client should do if it
receives a duplicate response. Seems like in both cases there
should be a suggestion to remember the previous TID for 9500ms to
avoid processing the same transaction multiple times.
- Why is DTLS not mentioned?
- I would like to see an overall transaction timer. Multiplying the
9500ms request timeout by a set of shared secret, nonce,
ALTERNATE-SERVER, and SRV cycles, the worst-case time for a single
transaction is quite bad. A misconfigured set of servers could set
up an ALTERNATE-SERVER loop.
- Are NONCEs intended to be used to challenge each individual
request/response or as a per-session value? Section 8.3.2 says that
NONCE values should be cached and reused on subsequent
request/response. 8.3.1, item 2 says: "The message would not
contain the NONCE attribute," and at the end of the list the next
paragraph says that the NONCE is included only if the request is a
retry as a consequence of an error response. I think item 2 of
8.3.1 should say "The message MUST NOT contain the REALM attribute,"
regardless of the NONCE issue. My impression previously has been
that the intention of the NONCE is to default to a per-session value
unless the server wants to authenticate more often, but 8.3.1
doesn't seem to support that.
- I don't see an error code for unknown message type. Perhaps 405?
(generating these properly also requires being able to identify whether
the unknown type is a request or an indication)
- 8.3.2 436 response: this would be the wrong error code for an
expired short-term, but should the client get another
shared-secret if the username was acquired through a
shared-secret request?
- 8.3.2 500 response: several?
- We had a discussion at the last IETF of whether to retain the
current fingerprint, switch to a CRC, use a keyed CRC, or drop it.
I believe the consensus was to switch to a keyed CRC, but after
implementing this draft, I would favor making the fingerprint
optional. IMHO, the application multiplexing stun and application
data is in the best position to identify whether it is needed, and
in the vast majority of cases, it isn't needed. Even with our generic
implementation, it would be trivial for us to add an option to
include or not include FINGERPRINT when we're multiplexing over a
single socket, and none of our current applications would require
it.
- 11.6: for our implementation, it was useful to use 408 and 484 to
deliver errors from the stack to the application layer. Since
STUN isn't designed to be proxied, it may not make sense
to include them here, but it does make the implementation easier
to use them.
- 11.9 why not just pad out the extra field?
- in section 12, there is an explicit assumption that binding requests
will always require integrity, but keepalives won't. I don't think
I agree with this in all cases. If using STUN for binding inside a
TLS connection, I don't think integrity is required. In the
keepalive case, I also don't think I agree that re-establishing a
session is necessarily a trivial cost in time or in lost behavior.
First, a lot of work has been done with sip-outbound to stage this
sort of reconnection. Secondly, since STUN will be applied to other
protocols beside SIP, I think this distinction assumes
characteristics of the application protocol that are not known in
advance. I would prefer SHOULD-level strength here, and some
discussion of how to choose when to require message integrity and
when not to.
- 12.3.1: should a changed binding be interpreted as a clue to reduce
the refresh interval? (I assume the answer is yes, and mostly
think it should be explicitly suggested by the text)
- 12.3.2: assuming that keepalives are used for a fixed server also
extends from assuming SIP is the application protocol. This
assumption should probably be weakened/removed.
- 12.4.1/12.4.7: one says shared-secrets are specific to SRV Priority,
the other to domain name
Document clarifications:
section 6: as mentioned on the mailing list, the header includes the
magic cookie and the body has at least one attribute in the current
spec.
section 6, 3rd par: stun can also run over TLS/TCP, right?
end of 7.1 and 7.2: I think these are supposed to say the same thing
about trying multiple SRV results, but it's a bit confusing since
they're worded completely differently. I prefer the 7.1 version.
end of 8.2: should refer to a section
8.3.2 first par: what if there is a MESSAGE-INTEGRITY in the response
but not request
8.3.2 starting after the second paragraph: Every paragraph has a
sentence or two, worded slightly differently, about using a NONCE if
returned. Can this be grouped into a single paragraph before the
long list? Also, toward the end of the section, there is the
paragraph that refers to caching NONCEs. This paragraph should be
moved here along with the global NONCE rules.
8.3.2 starting after the second paragraph: After each code, put the
parenthesized reason phrases from 11.6. It's too hard to read
with just the codes.
8.3.2 third par: remind that the long-term password is pre-provisioned
here
- 8.3.2 300 response: should the same s-t credentials be used?
depends on domain or on srv response?
- 8.3.2 500 paragraph: Unknown response codes should start a new
paragraph, 3XX should be included.
- 8.3.2 paragraph after 500: I think this should say that the unknown
optional attributes should be ignored, not a response containing
unknown optional attributes.
- 8.3.2 last two paragraphs: why are these two paragraphs here?
Seems like they should be later on in the sections describing
handling the attributes.
- 8.3.1 and 8.4: UDP can be pipelined, as well.
- 9.1 the one sentence before 9.1.1 seems to apply only to 9.1.1
- 9.1.1 1st par: can do stun binding over tls too, as well
- 9.1.1: there are two different paragraphs that refer to generating a
432 if the USERNAME is not present.
- 9.1.1: can there be one paragraph describing the possibility of
attaching a NONCE to almost every error response?
- 9.1.2: exclusive-or of the source IP address and port from the
request -> exclusive-or of the source transport address and
magic cookie.
- 9.1.2: before the FINGERPRINT paragraph, there should be a paragraph
that if the request contained MESSAGE-INTEGRITY, the response
must have MESSAGE-INTEGRITY generated to match 8.3.2
- 9.1.2: FINGERPRINT paragraph : indication->response
- 9.1.2 last sentence: requests defined -> usages defined
- end of 9.2: depends on the message type -> depends on the usage.
for consistency with earlier client discussion.
- 10 end of 1st par: requests and responses -> messages and
indications
- 11 Figure 4: I think MESSAGE-INTEGRITY should be C for Response and
Error Response based on 8.3.2 requiring it if the request used it.
- 11.1: should probably say IP addr is in network byte order
- 11.3: used as the HMAC *key* in the
- 11.4: why "generally"?
- 11.4 last sentence: shared secret->password to include both short
and long term?
- 12.2.1/12.2.7: 12.2.1 "assumed" that the rendezvous gives the shared
secret, 12.2.7 says it MUST
- 12.2.2 last sentence: actual communications -> application data traffic.
- 12.4.1 is 9 minutes a MUST?
- 12.4.9: the very end refers to REFLECTED-FROM. I know this is a
historical leftover, but I can't figure out what the sentence
is trying to say.
- 14.1: since this draft allows for and refers to extensions,
including stun-relay explicitly, I think it's better to say
something like STUN is intended to provide the functionality
necessary to describe how to connect two endpoints regardless of
the location of type of NATs in the topology.