< Previous by Date Date Index Next by Date >
  Thread Index Next in Thread >

[Ietf-behave] review of rfc3489bis-04


Review of draft-ietf-behave-rfc3489bis-04

Bruce B. Lowekamp
College of William and Mary and
SIPeerior Technologies, Inc.


I waited to write this review until we completed implementing this
STUN draft inside the reSIProcate stack.  A number of these comments
stem from the design of that implementation, which does initial
classification and parsing at the stack layer and then passes the
message up for further processing at the dialog/application level when
necessary.

I've grouped comments into two sections: protocol issues, where I
think the current specification should be changed in some way, and
document clarifications, where I think the draft has a mistake that
doesn't reflect the intention of the protocol.  I also made some notes
on a few more general clarity and grammar issues that I am forwarding
to Jonathan separately.

Having worked with a previous implementation guided only by the
original 3489 and also worked with this version, 3489bis is
significantly cleaner and easier to read, understand, and implement.

Tiffany Broadbent did a lot of the implementation and many of these
comments stem from questions she brought up while developing.



Protocol issues:

There are several issues that are motivated by the combination of
supporting new usages as extensions and trying to integrate our
implementation into an entire SIP stack.

- It would be much easier to implement a stack-level for STUN if
 specific bits indicated whether a message is a request, response,
 error response, or indication.  In several spots the draft refers to
 response and error response types as 0x100 and 0x110 higher, which I
 assume is intended to indicate bits to be set, but I don't think it
 technically reads that way.  However, no way of determining whether
 a type is an indication is mentioned in the draft.  Providing for
 future message types to be identified without recompiling the code
 with a new list of types for a new usage would be very helpful.

- There are a number of issues if future usages are defined that are
 not idempotent.  For example, Section 6 allows bit-wise identical
 requests to use the same transaction ID.  I assume this is intended
 to simplify periodic keepalives for low-power devices, however I
 don't think that assuming re-using transaction IDs is safe for future
 usages is a good idea.  I suggest revising the two sentences (or one
 if changed) in that paragraph that allow TID reuse to indicate that
 TIDs cannot be reused except for retransmissions of the same request
 unless allowed by the particular usage.

- In a similar vein, there is currently no specification for how long
 a server must remember a previous TID to ensure that a
 retransmission or duplication is not processed twice.  Also, there
 is no longer any specification of what a client should do if it
 receives a duplicate response.  Seems like in both cases there
 should be a suggestion to remember the previous TID for 9500ms to
 avoid processing the same transaction multiple times.

- Why is DTLS not mentioned?

- I would like to see an overall transaction timer.  Multiplying the
 9500ms request timeout by a set of shared secret, nonce,
 ALTERNATE-SERVER, and SRV cycles, the worst-case time for a single
 transaction is quite bad.  A misconfigured set of servers could set
 up an ALTERNATE-SERVER loop.


- Are NONCEs intended to be used to challenge each individual
 request/response or as a per-session value?  Section 8.3.2 says that
 NONCE values should be cached and reused on subsequent
 request/response.  8.3.1, item 2 says: "The message would not
 contain the NONCE attribute," and at the end of the list the next
 paragraph says that the NONCE is included only if the request is a
 retry as a consequence of an error response.  I think item 2 of
 8.3.1 should say "The message MUST NOT contain the REALM attribute,"
 regardless of the NONCE issue.  My impression previously has been
 that the intention of the NONCE is to default to a per-session value
 unless the server wants to authenticate more often, but 8.3.1
 doesn't seem to support that.

- I don't see an error code for unknown message type.  Perhaps 405?
 (generating these properly also requires being able to identify whether
 the unknown type is a request or an indication)

- 8.3.2 436 response: this would be the wrong error code for an
     expired short-term, but should the client get another
     shared-secret if the username was acquired through a
     shared-secret request?

- 8.3.2 500 response: several?

- We had a discussion at the last IETF of whether to retain the
 current fingerprint, switch to a CRC, use a keyed CRC, or drop it.
 I believe the consensus was to switch to a keyed CRC, but after
 implementing this draft, I would favor making the fingerprint
 optional.  IMHO, the application multiplexing stun and application
 data is in the best position to identify whether it is needed, and
 in the vast majority of cases, it isn't needed.  Even with our generic
 implementation, it would be trivial for us to add an option to
 include or not include FINGERPRINT when we're multiplexing over a
 single socket, and none of our current applications would require
 it.

- 11.6: for our implementation, it was useful to use 408 and 484 to
    deliver errors from the stack to the application layer.  Since
    STUN isn't designed to be proxied, it may not make sense
    to include them here, but it does make the implementation easier
    to use them.

- 11.9 why not just pad out the extra field?

- in section 12, there is an explicit assumption that binding requests
 will always require integrity, but keepalives won't.  I don't think
 I agree with this in all cases.  If using STUN for binding inside a
 TLS connection, I don't think integrity is required.  In the
 keepalive case, I also don't think I agree that re-establishing a
 session is necessarily a trivial cost in time or in lost behavior.
 First, a lot of work has been done with sip-outbound to stage this
 sort of reconnection.  Secondly, since STUN will be applied to other
 protocols beside SIP, I think this distinction assumes
 characteristics of the application protocol that are not known in
 advance.  I would prefer SHOULD-level strength here, and some
 discussion of how to choose when to require message integrity and
 when not to.

- 12.3.1: should a changed binding be interpreted as a clue to reduce
      the refresh interval?  (I assume the answer is yes, and mostly
      think it should be explicitly suggested by the text)

- 12.3.2: assuming that keepalives are used for a fixed server also
      extends from assuming SIP is the application protocol.  This
      assumption should probably be weakened/removed.

- 12.4.1/12.4.7: one says shared-secrets are specific to SRV Priority,
      the other to domain name




Document clarifications:

section 6: as mentioned on the mailing list, the header includes the
magic cookie and the body has at least one attribute in the current
spec.

section 6, 3rd par: stun can also run over TLS/TCP, right?

end of 7.1 and 7.2: I think these are supposed to say the same thing
about trying multiple SRV results, but it's a bit confusing since
they're worded completely differently.  I prefer the 7.1 version.

end of 8.2: should refer to a section

8.3.2 first par: what if there is a MESSAGE-INTEGRITY in the response
   but not request

8.3.2 starting after the second paragraph: Every paragraph has a
   sentence or two, worded slightly differently, about using a NONCE if
   returned.  Can this be grouped into a single paragraph before the
   long list?  Also, toward the end of the section, there is the
   paragraph that refers to caching NONCEs.  This paragraph should be
   moved here along with the global NONCE rules.

8.3.2 starting after the second paragraph: After each code, put the
   parenthesized reason phrases from 11.6.  It's too hard to read
   with just the codes.

8.3.2 third par: remind that the long-term password is pre-provisioned
   here


- 8.3.2 300 response: should the same s-t credentials be used?
     depends on domain or on srv response?

- 8.3.2 500 paragraph: Unknown response codes should start a new
     paragraph, 3XX should be included.

- 8.3.2 paragraph after 500: I think this should say that the unknown
 optional attributes should be ignored, not a response containing
 unknown optional attributes.

- 8.3.2 last two paragraphs:  why are these two paragraphs here?
     Seems like they should be later on in the sections describing
     handling the attributes.

- 8.3.1 and 8.4: UDP can be pipelined, as well.

- 9.1 the one sentence before 9.1.1 seems to apply only to 9.1.1

- 9.1.1 1st par: can do stun binding over tls too, as well

- 9.1.1: there are two different paragraphs that refer to generating a
     432 if the USERNAME is not present.

- 9.1.1: can there be one paragraph describing the possibility of
     attaching a NONCE to almost every error response?

- 9.1.2: exclusive-or of the source IP address and port from the
     request -> exclusive-or of the source transport address and
     magic cookie.

- 9.1.2: before the FINGERPRINT paragraph, there should be a paragraph
     that if the request contained MESSAGE-INTEGRITY, the response
     must have MESSAGE-INTEGRITY generated to match 8.3.2

- 9.1.2: FINGERPRINT paragraph : indication->response

- 9.1.2 last sentence: requests defined -> usages defined

- end of 9.2: depends on the message type -> depends on the usage.
 for consistency with earlier client discussion.

- 10 end of 1st par: requests and responses -> messages and
 indications

- 11 Figure 4: I think MESSAGE-INTEGRITY should be C for Response and
 Error Response based on 8.3.2 requiring it if the request used it.

- 11.1: should probably say IP addr is in network byte order

- 11.3: used as the HMAC *key* in the

- 11.4: why "generally"?

- 11.4 last sentence: shared secret->password to include both short
    and long term?

- 12.2.1/12.2.7: 12.2.1 "assumed" that the rendezvous gives the shared
      secret, 12.2.7 says it MUST

- 12.2.2 last sentence: actual communications -> application data traffic.

- 12.4.1 is 9 minutes a MUST?

- 12.4.9: the very end refers to REFLECTED-FROM.  I know this is a
      historical leftover, but I can't figure out what the sentence
      is trying to say.

- 14.1: since this draft allows for and refers to extensions,
    including stun-relay explicitly, I think it's better to say
    something like STUN is intended to provide the functionality
    necessary to describe how to connect two endpoints regardless of
    the location of type of NATs in the topology.