
[SFtrack] Updated: (XECS-1589) SipXecs does not gracefully handle broken TCP streams


     [ http://track.sipfoundry.org/browse/XECS-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dale R. Worley updated XECS-1589:
---------------------------------

    Attachment: 0004f21a90d2-sipx-phone.cfg
                0004f21a90d2-app.log
                reboot3.pcap

Here is a log of the Polycom phone 0004f21a90d2 being rebooted and failing to 
establish a BLF subscription.  The phone is set to N.A. Eastern daylight time 
(GMT-4).

The Wireshark capture is reboot3.pcap.  Frames 597, 602, and 612 contain the 
SUBSCRIBE and the NOTIFY that never receives a response.

The Polycom phone log is 0004f21a90d2-app.log.  The last reboot sequence 
matches the capture.  Note that it took the phone a while to contact NTP, so 
there are many lines without real timestamps.  The last BLF SUBSCRIBE starts 
around line 1101 of the log file.

The configuration files are 0004f21a90d2-sipx-phone.cfg, 
0004f21a90d2-sipx-sip.cfg, and 0004f21a90d2.cfg.


> SipXecs does not gracefully handle broken TCP streams
> -----------------------------------------------------
>
>                 Key: XECS-1589
>                 URL: http://track.sipfoundry.org/browse/XECS-1589
>             Project: sipXecs
>          Issue Type: Improvement
>         Environment: sipXecs 3.10
>            Reporter: Brad Marusiak
>            Assignee: Dale R. Worley
>         Attachments: 0004f21a90d2-app.log, 0004f21a90d2-sipx-phone.cfg, 
> polycomRepro.pcap, reboot1.pcap, reboot2.pcap, reboot3.pcap, sipx-snapshot.zip
>
>
> Restating Brad's analysis to clarify where sipX needs improvement:
> First, the Polycom phone does not properly tear down its TCP connection 
> when it reboots.  On a restart the phone attempts to unsubscribe from the 
> BLF dialog (packet 93), but the TCP stream is never closed with a FIN 
> packet.  Ideally the phone should terminate the TCP stream to ensure that 
> the other SIP element does not attempt to use the stream afterward, but 
> that cannot be enforced, as the phone may restart due to a power failure.
> Second, after the phone restarts, re-registers, and resubscribes, the sipX 
> server (RLS) sends the CSeq 0 NOTIFY (packet 718) in answer to the new 
> SUBSCRIBE (packet 716) on the old TCP stream.  sipX does not initiate a new 
> TCP stream because it believes that the old one is still working.
> The NOTIFY cannot be received, since the phone no longer knows of that TCP 
> stream, so the phone returns an RST packet to sipX.  The sipX SIP stack 
> reports a transmission failure to the RLS "Subscribe Server", which 
> terminates the subscription, but the SIP stack makes no attempt to re-send 
> the NOTIFY.
> After 30 minutes have elapsed, the phone will initiate its resubscribe.  As 
> the RLS has destroyed the subscription, the phone will receive a 481 
> response, in which case it should attempt a new subscription.  Alternatively, 
> the phone may attempt a new subscription outright.  In either case, the phone 
> establishes a new subscription.
> Note that SIP does NOT require the NOTIFYs for a SUBSCRIBE to be sent over 
> any particular transport or stream, other than as specified by the Contact 
> URI of the SUBSCRIBE.
> Strictly speaking, sipX's failure to handle broken TCP connections 
> gracefully is probably not a violation of the specification, but as seen 
> above, it can cause operational problems.  We should have a better strategy 
> for handling broken TCP connections.
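
One possible "better strategy", given only as a rough sketch and not as the 
sipX implementation: when a send on a cached TCP connection fails, discard 
that connection, open a fresh one to the same peer, and retry the message 
once before reporting failure to the application.  The class and method names 
below (ConnectionCache, send) are hypothetical and do not come from the sipX 
code base.  The sketch also assumes that the failure is reported 
synchronously by the send call; with real TCP the RST usually arrives after 
the first write has already returned, so an actual SIP stack would have to 
tie the asynchronous error back to the message that was lost.

```python
import socket
from typing import Callable, Dict, Tuple

Address = Tuple[str, int]


class ConnectionCache:
    """Hypothetical cache of outbound TCP connections, keyed by peer address."""

    def __init__(
        self,
        connect: Callable[[Address], socket.socket] = socket.create_connection,
    ) -> None:
        # 'connect' is injectable so the retry logic can be exercised
        # without a network.
        self._connect = connect
        self._conns: Dict[Address, socket.socket] = {}

    def _get(self, peer: Address) -> socket.socket:
        # Reuse the cached stream if there is one, else open a new one.
        conn = self._conns.get(peer)
        if conn is None:
            conn = self._connect(peer)
            self._conns[peer] = conn
        return conn

    def _drop(self, peer: Address) -> None:
        # Forget a stream that has proven to be dead.
        conn = self._conns.pop(peer, None)
        if conn is not None:
            try:
                conn.close()
            except OSError:
                pass

    def send(self, peer: Address, message: bytes) -> int:
        """Send 'message' to 'peer'; return the number of attempts used (1 or 2).

        If the first attempt fails (e.g. ECONNRESET or EPIPE on a stale
        stream), the cached stream is discarded and the message is retried
        once on a new connection.  A second failure is raised to the caller.
        """
        for attempt in (1, 2):
            conn = self._get(peer)
            try:
                conn.sendall(message)
                return attempt
            except OSError:
                self._drop(peer)
                if attempt == 2:
                    raise
        raise AssertionError("unreachable")
```

With this approach the application (here, the RLS) would see a transmission 
failure only when the peer is unreachable on a new connection as well, rather 
than whenever a cached stream has gone stale.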
