Re: [Ietf-behave] comments on draft-macdonald-behave-nat-behavior-discovery-00
Jonathan,
Thanks for the feedback. Responses inline.
On 11/7/06, Jonathan Rosenberg <jdrosen@xxxxxxxxx> wrote:
Overall this is pretty decent. I'm still uncomfortable overall with some
of the tone which suggests its usage to adjust application behavior, and
I have comments in several places below about that.
My other big technical issues are on usage determination and discovery
and backwards compatibility with rfc3489.
Specific comments:
> 1. Applicability
>
> This STUN usage does not allow an application behind NAT to make an
> absolute determination of the NAT's characteristics. NAT devices do
> not behave consistently enough to predict future behaviour with any
> guarantee. Applications requiring reliable reach must establish a
> communication channel through a NAT using another technique such as
> ICE [I-D.ietf-mmusic-ice] or OUTBOUND [I-D.ietf-sip-outbound]. Where
> reliability is not as strong a requirement, applications can use this
> STUN usage to select operating modes and optimizations.
I think more meat is needed here. In particular, I think you need to
call out that this is primarily meant for diagnostics. A little bit more
on the unreliability of this is needed. You also need to discuss that
this mechanism is basically "black box testing" based on modeled
properties of nat which may or may not be true over time.
agreed. In particular, we probably should have a very large explicit
warning not to use this to decide not to do appropriate NAT traversal,
such as doing ICE passive mode.
> OPEN ISSUE: should we have an SRV target for this usage, or piggyback
> on 3489-bis SRV discovery. If we had a separate SRV we wouldn't have
> to rely on 420 when dns is used.
I think you need a separate SRV target. The reason is that you are
assuming that the server supports extra features (change requests and
the like), and this means a "discovery stun server" could be different
from one used just for binding requests.
I think it's fine to add an additional SRV target, but I think we need
to be careful to not essentially prohibit backward compatibility with
3489 with that requirement, i.e. if we say MUST NOT send a behavior
discovery question unless it finds an SRV mapping, a client wouldn't
contact a legacy server. Personally, I think that UNKNOWN-ATTRIBUTE
is sufficient for what we need, but I agree that it's better to try an
SRV target first.
The overview in section 3.1 - 3.5 would be more useful if it explained a
little on how each worked, rather than just talking about which
attributes get used.
agreed
> STUN Binding Requests allow a client to determine whether it is
> behind a NAT that supports hairpinning of datagrams. To perform this
> test, the client first sends a Binding Request to its STUN server to
> determine its mapped address. The client then sends a STUN Binding
> Request to this mapped address from a different port. If the client
> receives its own request, the NAT hairpins datagrams. This test
> applies to UDP, TCP, or TCP/TLS connections.
Hmm. I wonder if the fact that the source IP address matches the target
will have an impact on whether this hairpinning works or not. Have you
tested this?
Interesting question. I've never tested, but it's probably true that
we should try to send packets in each direction in case filtering is
applied. While I think it would be better to use different IP
addresses, I don't think that's a possibility for most clients.
> All of the previous tests can be performed with PADDING if a NAT's
> fragment behavior is important for an application, or only those
> tests which are most interesting to the application can be retested.
> PADDING only applies to UDP datagrams.
Is there any expectation that behavior varies based on padding?
Older netgear NATs (at least) drop hairpinned fragments. I think
Cullen told me that a couple other vendors have this same bug. This
was a PITA to diagnose...
> change behaviour under load and over time. An application must
> assume that NAT behaviour can become more restrictive at any time.
> The tests described here are for UDP connectivity, NAT mapping
> behaviour, and NAT filtering behaviour; additional tests could be
> designed. Definitions for NAT filtering and mapping behaviour are
What happened to hairpinning and timeout?
good point
> 4.1. Checking if UDP is Blocked
>
> The client sends a STUN Binding Request to a server. This causes the
> server to send the response back to the address and port that the
should clarify that this omits these various attributes defined here.
Also should note that the request gets sent over UDP.
You can repeat the test for tcp too btw.
> This will require at most three tests. In test I, the client
> performs the UDP connectivity test but includes the OTHER-ADDRESS
> attribute in the binding request, with the attribute length set to 0.
> The server will return its alternate address and port in OTHER-
> ADDRESS in the binding response.
So these attributes are different than rfc3489, so this won't be
backwards compatible. Was there a reason you changed this?
Sorry, this wasn't clear. The intention is to be backward compatible,
but both Derek and I agreed that the CHANGED-ADDRESS name didn't make
any sense to us when we first read 3489, and even after reading it
we found the name confusing, so we decided to propose renaming it to
OTHER-ADDRESS. However, the definition of the field and its
attribute code are identical, so it's still the same protocol and
should be backward compatible.
> The client examines the XOR-MAPPED-
> ADDRESS attribute. If this address and port are the same as the
> local IP address and port of the socket used to send the request, the
> client knows that it is not NATed and the effective mapping will be
> Endpoint Independent.
Should caveat that if the user is behind multi-layer nat, the reflexive
address may actually match the source (1 in 65536 probability in such a
topology).
> local IP address and port of the socket used to send the request, the
> client knows that it is not NATed and the effective mapping will be
> Endpoint Independent.
if you're not natted there is really no such thing as mapping. I
wouldn't characterize this case as "endpoint independent".
> If not, test III is performed: the client sends a Binding Request to
> the alternate address and port.
Need to clarify that this is a different port than test II.
all agreed
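
To make the mapping discussion above concrete, here is a rough sketch of
the comparison logic, folding in the point that an un-NATed client should
not be labelled "endpoint independent". The comparison rules for tests II
and III (same mapped address as the previous test means the mapping did
not change) are my reading of the usual procedure, not text quoted above,
and the function name and result strings are purely illustrative.

```python
from typing import Optional, Tuple

Addr = Tuple[str, int]  # (IP address, port)


def classify_mapping(local: Addr,
                     test1: Addr,
                     test2: Optional[Addr] = None,
                     test3: Optional[Addr] = None) -> str:
    """Classify NAT mapping behaviour from XOR-MAPPED-ADDRESS values.

    local: local address and port of the socket used for all tests
    test1: mapped address seen by the server's primary address and port
    test2: mapped address seen by the alternate IP, primary port
    test3: mapped address seen by the alternate IP and alternate port
    (the labels and the test II/III rules are assumptions of this sketch)
    """
    if test1 == local:
        # No translation observed. A multi-layer NAT can produce a
        # matching address by coincidence, so this is a hint, not proof.
        return "no NAT detected"
    if test2 is None:
        return "undetermined: run test II"
    if test2 == test1:
        return "endpoint-independent mapping"
    if test3 is None:
        return "undetermined: run test III"
    if test3 == test2:
        return "address-dependent mapping"
    return "address-and-port-dependent mapping"
```

Passing None for a test that has not been run yet lets the caller stop
after test I or test II when the answer is already known, which matches
the "at most three tests" wording.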
> This will also require at most three tests. If these tests should be
> performed using a port that wasn't used for mapping or other tests as
> packets sent during those tests may affect results.
this doesn't parse.
Remove the "If" and maybe change the "as" to "because", giving:
This will also require at most three tests. These tests should be
performed using a port that wasn't used for mapping or other tests
because packets sent during those tests may affect results.
> Mapping behaviour
> would not matter as ICE [I-D.ietf-mmusic-ice] would establish
> connectivity if an incoming packet can be received from the peer.
> This could be accomplished in only two tests: test I and test II of
> filtering discovery.
You don't need any of this with ICE.
You're right, that paragraph is trying to cover too many concepts too
quickly. Definitely need to fix.
> Care must be taken when parallelizing tests, as some NAT devices have
> an upper limit on how quickly bindings will be allocated.
you need to give more guidance.
> For this reason
> applications that require reliable bindings should send keep-alives
> as frequently as required by:
>
> OPEN ISSUE: ice specifies 15 seconds, outbound specifies 24-29..is
> there consensus here? Or can we say "the appropriate keep-alive
> interval" Some applications may wish to trade reliability for
> bandwidth savings. Applications can use this test to determine the
> best estimate of the NAT binding lifetime.
actually I don't think it's in the scope of this document to even say
what this should be.
Agreed, I think in both cases the right approach is to reference
nat-udp and the other docs that go into more detail on these issues.
> It is possible that the client can get inconsistent results each time
> this process is run. For example, if the NAT should reboot, or be
> reset for some reason, the process may discover a lifetime that is
> shorter than the actual one. For this reason, implementations are
> encouraged to run the test numerous times, and be prepared to get
> inconsistent results.
also note load might impact this.
> OPEN ISSUE: The only attack made possible by the RESPONSE-ADDRESS
> that wasn't possible before is using the STUN server to reflect a DoS
> against a client that is behind an Address-Dependent Filtering NAT and
> has opened permission to the STUN server. A direct DoS against the
> same endpoint would prevent this attack. As REFLECTED-FROM already
> provides traceability the shared secret requirement could be replaced
> with rate-limiting behaviour on the server. This would allow the
> deployment of STUN servers w/out shared secrets.
I prefer having the shared secret and am glad that is what you
specified. This is a disaster waiting to happen.
> The client indicates it is interested in the NAT Behavior Discovery
> usage by including one of the attributes defined in this draft.
I don't think this really means anything.
That depends on the SRV issue and backward compatibility issue above.
The client indicates it's interested in the behavior usage by
including one of the behavior attributes. Otherwise a new client
can't talk with an old server (one that's not properly listed with the
SRV) and it's not possible to multiplex the two usages on the same port.
> A client MUST be prepared for receiving a 420 (Unknown Attribute)
> error to its requests. This response indicates that the server does
> not implement that particular attribute. An unsupported attribute
> prevents some tests from being run, but others may work. For
> example, only RESPONSE-ADDRESS is necessary for binding lifetime
> discovery, whereas mapping and filtering discovery do not require it,
> but do require OTHER-ADDRESS and CHANGE-REQUEST.
Yuck. This introduces a lot of complexity in the client. You should use
a different SRV for this usage and require servers to implement all
attributes.
I disagree (well, yes, it's a bit more complex, but it shouldn't be
*that* controversial to say a client must be prepared for a 420
error). It's harder to deploy a server with multiple addresses than
with only a single address, and binding lifetime discovery can be very
useful by itself. Part of the goal was to provide tools that allow
useful behavior tests to be built, but not necessarily prescribe only
a specific set of tests (the tests themselves are descriptive, not
normative).
In this particular case, though, it's a bit simpler, in that a client
shouldn't send a request with a CHANGE-REQUEST attribute without
having previously received a response from the server with an
OTHER-ADDRESS field. Will make sure that is specified.
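
For a sense of how much client complexity the 420 handling adds, here is
a minimal sketch. The attribute requirements per test are taken from the
quoted draft text; the table layout, the function names, and the idea of
tracking "unsupported" attributes as a set are assumptions made only for
illustration.

```python
from typing import Iterable, List

# Attributes each test needs, per the quoted draft text.
REQUIRED = {
    "binding lifetime": {"RESPONSE-ADDRESS"},
    "mapping": {"OTHER-ADDRESS", "CHANGE-REQUEST"},
    "filtering": {"OTHER-ADDRESS", "CHANGE-REQUEST"},
}


def runnable_tests(unsupported: Iterable[str]) -> List[str]:
    """Return the tests that can still run, given the attributes the
    server has rejected with a 420 (Unknown Attribute) error."""
    rejected = set(unsupported)
    return sorted(name for name, needed in REQUIRED.items()
                  if not (needed & rejected))


def may_send_change_request(previous_response_attrs: Iterable[str]) -> bool:
    """A client should only send CHANGE-REQUEST after the server has
    already returned OTHER-ADDRESS in an earlier response."""
    return "OTHER-ADDRESS" in set(previous_response_attrs)
```

So a server with only a single address would reject OTHER-ADDRESS, and
the client would simply fall back to binding lifetime discovery.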
> The server MUST NOT include any attribute defined in this document in
> the Response to a Request that does not contain at least one
> attribute defined in this document.
Huh?
This allows a server to multiplex both the behavior discovery usage
and traditional binding usage on the same port. Otherwise, a plain
3489bis client could receive a response with an OTHER-ADDRESS
attribute and reject it.
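
A sketch of what that rule means on the server side, under the assumption
that the attributes "defined in this document" are the four named in this
thread (the real list would come from the draft itself). The function and
constant names are hypothetical.

```python
from typing import Iterable, List

# Assumed set of behavior-discovery attributes, for illustration only.
BEHAVIOR_ATTRS = frozenset(
    {"OTHER-ADDRESS", "CHANGE-REQUEST", "RESPONSE-ADDRESS", "PADDING"}
)


def response_attributes(request_attrs: Iterable[str],
                        candidate_attrs: Iterable[str]) -> List[str]:
    """Filter the attributes a server puts in a Binding Response.

    If the request carried no behavior-discovery attribute, the client
    may be a plain binding client, so behavior attributes are dropped.
    """
    if BEHAVIOR_ATTRS.isdisjoint(request_attrs):
        return [a for a in candidate_attrs if a not in BEHAVIOR_ATTRS]
    return list(candidate_attrs)
```

With this check a single port can serve both usages: a plain client gets
only the attributes it already understands.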
> If the Request contains a RESPONSE-ADDRESS attribute, but the message
> does not contain a MESSAGE-INTEGRITY attribute and a USERNAME, the
> server MUST generate an error response of type 401. If RESPONSE-
> ADDRESS is included, then the server must verify that it has
> previously received a binding request from the same address as is
> specified in RESPONSE-ADDRESS. If it has not, or if sufficient time
> has passed that it no longer has a record of having received such a
> request due to limited state, it MUST respond with an error response
> of type 430.
>
> OPEN-ISSUE: is this too strong? There is no amplification attack
> here. Maybe either require a shared secret, but don't track
> addresses, or track addresses that have done binding requests but not
> relate them to shared secret or just put in a rate limit and leave it
> at that?
This introduces a big burden on the server. I'm on the fence about
whether it's needed. Will think about it more...
Agreed. We talked about this for a while. Really seems to depend on how
bad you think the threat of misuse of RESPONSE-ADDRESS is and how much
you trust clients before you give them a shared secret in the first
place. If RESPONSE-ADDRESS isn't too big of a threat and short term
credentials are only issued to trusted nodes, it's probably
unnecessary. If untrusted nodes are involved, I'm still not sure.
It's not an amplification, but it sort of anonymizes the attack and it
could be obnoxious if used to attack an unsecured protocol.
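
To show the kind of state the quoted text asks the server to keep, here
is a rough sketch of the 401/430 checks. The class name, the 60 second
retention time, and matching on the full (IP, port) pair rather than the
IP alone are assumptions of this sketch, not something the draft states.

```python
import time
from typing import Callable, Dict, Mapping, Optional, Tuple

Addr = Tuple[str, int]  # (IP address, port)


class ResponseAddressGuard:
    """Tracks recent Binding Request sources to vet RESPONSE-ADDRESS."""

    def __init__(self, ttl: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl          # how long a source is remembered (assumed)
        self.clock = clock      # injectable clock, for testing
        self.seen: Dict[Addr, float] = {}

    def record_binding_request(self, source: Addr) -> None:
        """Remember that a Binding Request arrived from this source."""
        self.seen[source] = self.clock()

    def check(self, attrs: Mapping[str, object]) -> Optional[int]:
        """Return an error code for the request, or None if acceptable."""
        target = attrs.get("RESPONSE-ADDRESS")
        if target is None:
            return None  # nothing to vet
        if "MESSAGE-INTEGRITY" not in attrs or "USERNAME" not in attrs:
            return 401
        last = self.seen.get(target)
        if last is None or self.clock() - last > self.ttl:
            return 430  # no (or expired) record of that address
        return None
```

The per-source table is the burden being discussed: dropping the 430
check and relying on the shared secret or a rate limit would remove it.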
> The server MUST add a SOURCE-ADDRESS attribute to the Binding
> Response, containing the source address and port used to send the
> Binding Response.
where does this get used?
preserved from 3489, the definition of SOURCE-ADDRESS in 7.3 suggests
that it could be used to detect twice NAT configurations, which I
think means trying to detect when the STUN server is behind a NAT.
> If it supports an alternate address and port and the Binding request
> contained a zero-length OTHER-ADDRESS attribute the server MUST add
> an OTHER-ADDRESS attribute to the Binding Response. This contains
> the source IP address and port that would be used if the client had
> set the "change IP" and "change port" flags in the Binding Request.
> As summarized in Table 1, these are Ca and Cp, respectively,
> regardless of the value of the CHANGE-REQUEST flags.
So you are using the zero-length OTHER-ADDRESS as a flag to return the
actual value in the response? Yuck. Per above, you should just always
send it and use SRV to differentiate usage.
Again, both multiplexing and backward compatibility come into play
here. I think the other idea to support multiplexing was to have a
CHANGE-REQUEST without either bit set.
> 8.1. Problem Definition
>
> >From RFC 3424 [RFC3424], any UNSAF proposal must provide:
extra ">"
> diagnose the cause of problems experienced by that or other
> applications or for an application to modify its behavior based on
> the current behavior of the NAT.
I'm uncomfortable with the second half of this.
how about "optimize" or a statement to the effect that it can be used
for trying to pick a starting algorithm for the application, but that
the application must be able to change (and detect the need to) if the
nat behavior changes from what is initially discovered?
> The STUN NAT Behavior Discovery usage does not itself provide an exit
> strategy. Instead, that is provided by other protocols.
> Specifically, the Interactive Connectivity Establishment (ICE)
> [I-D.ietf-mmusic-ice] mechanism allows two cooperating clients to
> interactively determine the best addresses to use when communicating,
> regardless of the type of NAT involved.
I disagree with the statements here. If you stick to just diagnostics,
then the exit path is easy - no NATs, no problems, no diagnostics needed.
true.
Bruce