From technical articles to tid-bits of important news and information, stay up to date on the latest UC happenings. Unify Square Blog
Helping you deploy the world's leading platform for Unified Communications.
 

This blog is a response to a question somebody posted in the OCS forums about how to configure load balancers for OCS AV Edge servers.  I’ve copied the answer here and included a couple of sample Visio topology diagrams.

The A/V Edge enables relaying of real time media traffic (RTP packets), which can be carried over UDP or TCP.  UDP is preferred because when one packet is lost, delivery of subsequent packets is not disturbed, and the OCS media stack can tolerate a surprisingly large number of dropped packets without noticeable audio quality degradation.  With TCP, losing one packet will cause the TCP stack to halt delivery of subsequent packets to the media stack until that packet is retransmitted and received.  This introduces higher latency in environments with higher packet loss, which is why UDP is preferred and TCP is used only if UDP isn't available.
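To make the difference concrete, here is a rough Python sketch of when each packet would reach the media stack under the two delivery models.  It is only a toy model: the function name, the 20 ms packet spacing, and the 100 ms retransmission delay are assumptions for illustration, not measurements from OCS.

```python
def delivery_times(arrivals, lost, retransmit_delay, in_order):
    """Return when each packet reaches the media stack (None = never).

    arrivals         -- first-transmission arrival time of each packet, in ms
    lost             -- set of packet indexes whose first transmission is lost
    retransmit_delay -- assumed extra ms before a retransmitted copy arrives
    in_order         -- True models TCP-style delivery, False models UDP
    """
    delivered = []
    released_at = 0  # earliest time the next in-order packet can be handed up
    for seq, t in enumerate(arrivals):
        if not in_order:
            # UDP: a lost packet is simply missing; later packets are unaffected.
            delivered.append(None if seq in lost else t)
            continue
        # TCP: a lost packet arrives late, and everything behind it waits.
        arrived = t + retransmit_delay if seq in lost else t
        released_at = max(released_at, arrived)
        delivered.append(released_at)
    return delivered


arrivals = [0, 20, 40, 60]  # one packet every 20 ms (assumed)
print(delivery_times(arrivals, {1}, 100, in_order=False))  # [0, None, 40, 60]
print(delivery_times(arrivals, {1}, 100, in_order=True))   # [0, 120, 120, 120]
```

With UDP the media stack loses one packet and carries on; with TCP the two packets behind the lost one are held back until the retransmission shows up.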

Because of this need to relay UDP, the AV Edge looks different than a typical web server that listens only on TCP.  All load balancers support TCP affinity due to the connection oriented nature of TCP, and the OCS clients do rely on this for TCP connections.  Specifically, when OC makes a TCP connection to the A/V Edge via the load balancer, the connection remains through the load balancer.

With UDP, some load balancers can do UDP "stickiness", but it's not as deterministic because UDP is a connectionless protocol.  So when we designed the A/V Edge load balancing architecture (when I was on the OCS product team), we implemented a design that did not rely on UDP affinity in the load balancer.  When a client sends its first UDP packet through the load balancer, that packet gets forwarded to a particular AV Edge instance.  In the response UDP packet, the AV Edge includes its own IP address as a parameter.  The client receives the IP address of this AV Edge instance in the response packet and sends subsequent packets directly to that IP address.
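The flow just described can be sketched in a few lines of Python.  This is a toy simulation, not the actual wire protocol: the class names, the round-robin choice on the VIP, the "edge_ip" field name, and the 203.0.113.x addresses are all assumptions made for illustration.

```python
import itertools


class AVEdge:
    def __init__(self, ip):
        self.ip = ip
        self.packets = 0

    def handle(self, payload):
        self.packets += 1
        # The response carries this instance's own IP address.
        return {"edge_ip": self.ip}


class LoadBalancer:
    """Acts as a VIP for the first packet and as a plain router afterwards."""

    def __init__(self, vip, edges):
        self.vip = vip
        self.edges = {edge.ip: edge for edge in edges}
        self._next_edge = itertools.cycle(edges)

    def send(self, dst_ip, payload):
        if dst_ip == self.vip:
            edge = next(self._next_edge)  # VIP traffic: pick any instance
        else:
            edge = self.edges[dst_ip]     # routed straight to one instance
        return edge.handle(payload)


def run_client(lb, packet_count):
    dst = lb.vip  # the client only knows the VIP at first
    for i in range(packet_count):
        reply = lb.send(dst, f"packet {i}")
        dst = reply["edge_ip"]  # later packets go directly to that instance
    return dst


edge_a, edge_b = AVEdge("203.0.113.11"), AVEdge("203.0.113.12")
lb = LoadBalancer("203.0.113.10", [edge_a, edge_b])
print(run_client(lb, 5), edge_a.packets, edge_b.packets)
```

Only the first packet depends on the load balancer's choice; no UDP stickiness is needed because every later packet is addressed to the instance itself.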

You can see how this UDP behavior differs considerably from a typical load balanced server that only listens on TCP.  This is what usually trips people up with load balancers in OCS.  In a two-armed topology, the load balancer must be configured to act both as a load balancer exposing the appropriate VIPs AND as a router so traffic can be sent directly to the AV Edge.  The requirement for this router behavior is one reason why you can't do NAT in a load balanced AV Edge deployment.  (Another reason is that if you need to expose the 50,000 port range, those ports need to be reached directly by clients without any NAT taking place.)

You may wonder why NAT is supported in non-load balanced environments.  Remember that "self" IP address that the AV Edge instance returns in response to the first UDP packet?  Well, in OCS 2007 R2, the AV Edge now exposes a setting that tells the AV Edge to change that IP from a "self" IP to the IP resolved by the external AV Edge FQDN.  So subsequent UDP requests go to that same public IP instead of the NATed IP address.  This works in a non-load balanced environment because there's only one AV Edge instance to forward to.  (Now in theory, an admin could configure this setting in a load balanced environment and rely on the load balancer’s UDP stickiness behavior to do the right thing.  This is not a supported configuration, and I have never heard of anyone trying this.  But it would be an interesting exercise to do in a lab environment.)

Anyhow, I've attached a couple of diagrams that show some sample deployments for load balancing the edge in a one-armed and two-armed topology.  Where needed, the router symbol on the Load Balancer indicates that you need to configure routing functionality in addition to VIP functionality.

Hope that helps…

Thanks,

Alan

 

Load Balanced One-armed.jpg (158.57 kb)

Load Balanced Two-armed.jpg (141.57 kb)


October 5, 2009 08:04 by alanshen


One of the papers I authored back in June of 2008 was the following whitepaper, published through Microsoft.  http://www.microsoft.com/downloads/details.aspx?familyid=5ec060fd-ba9a-4c52-8bd8-148f502b791f&displaylang=en  It references a bandwidth calculator that I created and have posted as a link to this blog.  The calculator is just in Beta form and only covers OCS 2007 scenarios.  It may contain bugs and there are some limitations, such as lack of support for time zone considerations and not factoring in address book downloads.  It also requires fairly intimate knowledge of OCS scenarios, so you wouldn't want a customer to fill this out without some guidance.  If you have any customers that need guidance on bandwidth planning, let me know and we'd be happy to assist.

We're currently working on an R2 offering that will take this spreadsheet to the next level.  I'll keep you posted when I have more concrete details on that.

 -Alan

BW Calculation v1.0 Beta.xlsx (350.50 kb)


March 18, 2009 09:02 by alanshen


It’s been a while since I wrote this blog on OCS 2007 media traversal: http://communicationsserverteam.com/archive/2008/03/25/133.aspx.  I’ve since left Microsoft to join a Unified Communications consulting company called Unify Square, but media traversal is still near and dear.  This blog describes some of the improvements in media traversal that have been implemented in OCS 2007 R2.

Some things haven’t changed
The overall architecture of media endpoints using ICE and the STUN/TURN capabilities of the A/V Edge server has not changed.  Signaling is still protected by TLS encryption, media is still protected by SRTP encryption. STUN/TURN allocations against the A/V Edge are still protected by a digest authentication mechanism whose password rotates every eight hours, and obtaining this allocation password is still protected within a TLS encrypted SIP SERVICE message.  That said, a lot of improvements have been made in OCS 2007 R2.  Let’s take a look at some of them.

Support of Early Media
In OCS 2007, negotiation of a media path (i.e. ICE connectivity checks) started when the called party answered the call.  Specifically, ICE candidates were sent by the caller in the INVITE and by the callee in the 200 OK.  This resulted in a slight delay between the called party answering and when media would actually flow.  (The one exception to this was outbound calls to PSTN.  To support PSTN gateways that started sending audio before the 200 OK, the mediation server would actually send ICE candidates in a 180 RINGING in addition to the 200 OK.  This enabled a poor man’s version of early media where one-way audio could be transferred from the mediation server to the calling endpoint before the full ICE negotiation occurred, preventing any initial “Hello?” audio from being clipped.)

OCS 2007 R2 endpoints support early media, a feature which enables negotiation of media before the call is accepted by the called party.  This addresses the audio clipping issue and enables a number of other scenarios such as playing custom ring back tones to the caller.  Practically speaking, this means that ICE must be negotiated before the 200 OK.  What you’ll notice is that the called party will send back ICE candidates in a 183 SESSION PROGRESS message.  Under the covers, this triggers a full ICE negotiation, enabling the media path to be ready the instant the called party actually answers the call.  (Note that the called party still sends candidates in the 200 OK message and a final ICE negotiation still happens, though this rarely results in a switch of the media path.)
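As a quick summary of the message flows described above, the following Python sketch lists which SIP messages carry ICE candidates in each case.  The scenario labels and the helper name are my own shorthand for this illustration; only the message names come from the description above.

```python
# SIP messages that carry ICE candidates, per the scenarios described above.
ICE_CANDIDATE_MESSAGES = {
    "OCS 2007": ["INVITE", "200 OK"],
    "OCS 2007, outbound PSTN": ["INVITE", "180 Ringing", "200 OK"],
    "OCS 2007 R2": ["INVITE", "183 Session Progress", "200 OK"],
}


def answerer_candidates_before_answer(scenario):
    """True if the answering side sends candidates in a provisional (1xx) response."""
    return any(msg.startswith("1") for msg in ICE_CANDIDATE_MESSAGES[scenario])


for scenario, messages in ICE_CANDIDATE_MESSAGES.items():
    early = answerer_candidates_before_answer(scenario)
    print(f"{scenario}: {' -> '.join(messages)} (candidates before answer: {early})")
```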

If a called user has multiple R2 endpoints registered, each will allocate ICE endpoints and negotiate an early media ICE path with the caller.  However, as soon as the caller receives an audio packet from one of the dialed endpoints, it will stop listening on the other early media paths.  In theory, the media path could switch after the final ICE negotiation occurs with the 200 OK.  (e.g. Let’s say an incoming call is set to simulring a user’s OC endpoint and his cell phone.  The cell phone system generates a custom ring back tone, but the user ends up answering on OC.)  Typically, the endpoint that sent early media audio packets will be the same endpoint that actually answers the call and sends the 200 OK.

App Sharing Use of ICE/STUN/TURN
OCS 2007 R2 introduces a new modality called App Sharing, built upon the same RDP protocol used in Terminal Services.  Though functionally similar to the desktop sharing feature in Live Meeting, it functions as a totally separate modality outside of a Live Meeting conference.  For app sharing sessions involving two OC endpoints, the app sharing media stream flows point to point.  For conferences that use app sharing or if a CWA endpoint is involved, the media flows through the new app sharing MCU.  In either case, the same ICE/STUN/TURN mechanism used to negotiate an audio and video path is also used to negotiate an app sharing media path…with one key difference.  Unlike audio and video, the RDP protocol is not designed to be run over an unreliable transport protocol like UDP.  Therefore, the app sharing modality uses ICE/STUN/TURN in a TCP-only mode.  One interesting note is that in this TCP-only mode, TCP candidates are actually supported on the endpoint hosts, enabling a point to point TCP media stream.  For voice and video, only a point to point UDP stream is possible.

Support of ICE version 19
In OCS 2007 R2, all endpoints support ICE version 19.  In actuality, OCS 2007 R2 endpoints support both ICE version 19 and the legacy ICE version 6 implemented in OCS 2007.  Full treatment of the differences between these two versions is beyond the scope of this blog and probably not something you’ll ever need to know, but let’s look at an SDP fragment from an R2 OC client to get a sense for some of the key differences:

------=_NextPart_000_0149_01C9A22E.BDA43360
Content-Type: application/sdp
Content-Transfer-Encoding: 7bit
Content-Disposition: session; handling=optional; ms-proxy-2007fallback

v=0
o=- 0 0 IN IP4 192.168.5.150
s=session
c=IN IP4 192.168.5.150
b=CT:99980
t=0 0
m=audio 50010 RTP/AVP 114 111 112 115 116 4 8 0 97 13 118 101
k=base64:ROFyvlcWFwsPej5xrWlQj+PFsw9Uyy0OSHoFv62mLTPvXdpnn5XvqcxI556k
a=candidate:Y821qEyRKswvPiFeMBgkQBTTL0vJDm//txizLAGyhKQ 1 o4IBYszjQDYWPTb58I7szQ UDP 0.830 192.168.5.150 50010
a=candidate:Y821qEyRKswvPiFeMBgkQBTTL0vJDm//txizLAGyhKQ 2 o4IBYszjQDYWPTb58I7szQ UDP 0.830 192.168.5.150 50008
a=candidate:VS7Zjeu4CJwh6kMO3xTuwAOhW6gGpoC9NpqEv7S8geA 1 9cJV/DeRmf+hwEws92rRNQ TCP 0.190 64.105.253.213 56653
a=candidate:VS7Zjeu4CJwh6kMO3xTuwAOhW6gGpoC9NpqEv7S8geA 2 9cJV/DeRmf+hwEws92rRNQ TCP 0.190 64.105.253.213 56653
a=candidate:cnsB1P6I85tVDpl/UgjTWRl8rFOYSkXOa8nPvnl2RJU 1 +Mkh11586TV6kN8IpnLVMQ UDP 0.490 64.105.253.213 58140
a=candidate:cnsB1P6I85tVDpl/UgjTWRl8rFOYSkXOa8nPvnl2RJU 2 +Mkh11586TV6kN8IpnLVMQ UDP 0.490 64.105.253.213 55208
a=candidate:/YhjMGvsupfnJrUraPnPUwnSUV3IsMpMLHwZIqW4aQI 1 Fvf+CecTZF6sVN/Svuunrg TCP 0.250 10.0.0.2 50014
a=candidate:/YhjMGvsupfnJrUraPnPUwnSUV3IsMpMLHwZIqW4aQI 2 Fvf+CecTZF6sVN/Svuunrg TCP 0.250 10.0.0.2 50014
a=candidate:VCZf8gadJG6G8Pb3xS7bj/4CVK/P+GeIhuew2tHBy9k 1 DIX0ZzFlrnlzdLGqfqWB0w UDP 0.550 10.0.0.2 50005
a=candidate:VCZf8gadJG6G8Pb3xS7bj/4CVK/P+GeIhuew2tHBy9k 2 DIX0ZzFlrnlzdLGqfqWB0w UDP 0.550 10.0.0.2 50017

a=cryptoscale:1 client AES_CM_128_HMAC_SHA1_80 inline:yEiOl3HA+vbDHvqSmvplV9BGpfg19jSxwjFElAPz|2^31|1:1
a=crypto:2 AES_CM_128_HMAC_SHA1_80 inline:HdnKHORdSJgC/rcYZ1y3uMRbKvybFruyFiD+UkoZ|2^31|1:1
a=maxptime:200
a=rtcp:50008
a=rtpmap:114 x-msrta/16000
a=fmtp:114 bitrate=29000
a=rtpmap:111 SIREN/16000
a=fmtp:111 bitrate=16000
a=rtpmap:112 G7221/16000
a=fmtp:112 bitrate=24000
a=rtpmap:115 x-msrta/8000
a=fmtp:115 bitrate=11800
a=rtpmap:116 AAL2-G726-32/8000
a=rtpmap:4 G723/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:0 PCMU/8000
a=rtpmap:97 RED/8000
a=rtpmap:13 CN/8000
a=rtpmap:118 CN/16000
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-16
a=encryption:required

------=_NextPart_000_0149_01C9A22E.BDA43360
Content-Type: application/sdp
Content-Transfer-Encoding: 7bit
Content-Disposition: session; handling=optional

v=0
o=- 0 0 IN IP4 192.168.5.150
s=session
c=IN IP4 192.168.5.150
b=CT:99980
t=0 0
m=audio 50003 RTP/AVP 114 111 112 115 116 4 8 0 97 13 118 101
k=base64:ROFyvlcWFwsPej5xrWlQj+PFsw9Uyy0OSHoFv62mLTPvXdpnn5XvqcxI556k
a=ice-ufrag:VXim
a=ice-pwd:OKEB+HhXDUoNP4lrx8AH+syY
a=candidate:1 1 UDP 2130706431 192.168.5.150 50003 typ host
a=candidate:1 2 UDP 2130705918 192.168.5.150 50006 typ host
a=candidate:2 1 TCP-PASS 6556159 64.105.253.213 53119 typ relay raddr 64.105.253.213 rport 53119
a=candidate:2 2 TCP-PASS 6556158 64.105.253.213 53119 typ relay raddr 64.105.253.213 rport 53119
a=candidate:3 1 UDP 16648703 64.105.253.213 54183 typ relay raddr 64.105.253.213 rport 54183
a=candidate:3 2 UDP 16648702 64.105.253.213 51646 typ relay raddr 64.105.253.213 rport 51646
a=candidate:4 1 TCP-ACT 7076863 64.105.253.213 53119 typ relay raddr 64.105.253.213 rport 53119
a=candidate:4 2 TCP-ACT 7076350 64.105.253.213 53119 typ relay raddr 64.105.253.213 rport 53119
a=candidate:5 1 TCP-ACT 1684797951 10.0.0.2 50001 typ srflx raddr 192.168.5.150 rport 50001
a=candidate:5 2 TCP-ACT 1684797438 10.0.0.2 50001 typ srflx raddr 192.168.5.150 rport 50001
a=candidate:6 1 UDP 1694234623 10.0.0.2 50011 typ srflx raddr 192.168.5.150 rport 50011
a=candidate:6 2 UDP 1694234110 10.0.0.2 50009 typ srflx raddr 192.168.5.150 rport 50009

a=cryptoscale:1 client AES_CM_128_HMAC_SHA1_80 inline:yEiOl3HA+vbDHvqSmvplV9BGpfg19jSxwjFElAPz|2^31|1:1
a=crypto:2 AES_CM_128_HMAC_SHA1_80 inline:HdnKHORdSJgC/rcYZ1y3uMRbKvybFruyFiD+UkoZ|2^31|1:1
a=maxptime:200
a=rtcp:50006
a=rtpmap:114 x-msrta/16000
a=fmtp:114 bitrate=29000
a=rtpmap:111 SIREN/16000
a=fmtp:111 bitrate=16000
a=rtpmap:112 G7221/16000
a=fmtp:112 bitrate=24000
a=rtpmap:115 x-msrta/8000
a=fmtp:115 bitrate=11800
a=rtpmap:116 AAL2-G726-32/8000
a=rtpmap:4 G723/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:0 PCMU/8000
a=rtpmap:97 RED/8000
a=rtpmap:13 CN/8000
a=rtpmap:118 CN/16000
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-16
a=encryption:required

------=_NextPart_000_0149_01C9A22E.BDA43360--

The first thing you notice is that this contains two complete sets of SDP.  The first SDP block contains a version 6 ICE candidate list and the second contains one for version 19.  You can see that the “ms-proxy-2007fallback” string identifies which one is the legacy block.  This is called a multipart SDP and explains how OCS 2007 R2 endpoints are still able to negotiate media with Exchange 2007 UM and other legacy OCS 2007 endpoints.  If the caller is R2 and the callee is legacy, both SDPs are offered and the legacy endpoint responds with an ICE version 6 SDP only.  This tells the R2 endpoint to go into legacy mode.  If the caller is legacy and the callee is R2, the offer will contain just a legacy ICE SDP, which indicates to the callee that it should only respond with a legacy ICE SDP.  Keep in mind that because app sharing is a new feature of OCS 2007 R2, you will never see any app sharing candidate lists or media offer in a legacy SDP block.
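If you want to pick the two blocks apart in a trace, a small Python sketch like the following would do it.  It is a minimal illustration that assumes you already know the MIME boundary string and that the body looks like the fragment above; the function name is made up, and the sample body is a heavily shortened version of the trace.

```python
def split_sdp_parts(body, boundary):
    """Split a multipart SDP body and flag the legacy (ICE version 6) block."""
    parts = []
    for chunk in body.replace("\r\n", "\n").split("--" + boundary):
        chunk = chunk.strip()
        if not chunk or chunk == "--":  # preamble or closing delimiter
            continue
        head, _, sdp = chunk.partition("\n\n")
        headers = {}
        for line in head.splitlines():
            name, _, value = line.partition(":")
            headers[name.strip()] = value.strip()
        disposition = headers.get("Content-Disposition", "")
        parts.append({"legacy": "ms-proxy-2007fallback" in disposition, "sdp": sdp})
    return parts


BOUNDARY = "----=_NextPart_000_0149_01C9A22E.BDA43360"
SAMPLE = "\n".join([
    "--" + BOUNDARY,
    "Content-Type: application/sdp",
    "Content-Disposition: session; handling=optional; ms-proxy-2007fallback",
    "",
    "v=0",
    "m=audio 50010 RTP/AVP 114",
    "",
    "--" + BOUNDARY,
    "Content-Type: application/sdp",
    "Content-Disposition: session; handling=optional",
    "",
    "v=0",
    "m=audio 50003 RTP/AVP 114",
    "a=ice-ufrag:VXim",
    "",
    "--" + BOUNDARY + "--",
])

for part in split_sdp_parts(SAMPLE, BOUNDARY):
    print("legacy" if part["legacy"] else "version 19", part["sdp"].splitlines()[1])
```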

You’ll also notice the version 19 candidate list is shorter and more readable.  Rather than encoding a unique username/password per candidate, a common one is used for the entire set of candidates.  The type of ICE candidate is also encoded, where HOST is a candidate on the endpoint itself, SRFLX (short for Server Reflexive) is a STUN candidate on the NAT, and RELAY is a candidate on the A/V Edge.  You’ll also notice that TCP candidates are denoted as ACT (Active) or PASS (Passive), indicating whether the candidate will initiate or receive connectivity check requests.  In OCS 2007, TCP A/V Edge candidates behaved as both active and passive, but TCP NAT candidates were passive only.  However, this was not apparent from looking at the candidate list SDP.  Another difference is the priority encoding.  ICE version 6 used a three-digit decimal to encode the priority and required floating point math to compute the combined priority of a candidate pair.  In ICE version 19, the priority is now an integer, which makes the computation less intensive.
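To show what the integer priority buys you, here is a short Python sketch that parses a version 19 candidate line and works with its priority using nothing but integer arithmetic.  The priority layout and the pair formula are taken from the published ICE specification (RFC 5245), which grew out of the ICE drafts; the values in the trace above happen to decode cleanly with that layout, but I am assuming rather than asserting that the OCS 2007 R2 stack computes them exactly this way.  The function and type names are my own.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    foundation: str
    component: int
    transport: str
    priority: int
    ip: str
    port: int
    typ: str


def parse_candidate(line):
    """Parse an 'a=candidate:' line in the version 19 style shown above."""
    fields = line.split(":", 1)[1].split()
    foundation, component, transport, priority, ip, port = fields[:6]
    typ = fields[fields.index("typ") + 1]
    return Candidate(foundation, int(component), transport, int(priority),
                     ip, int(port), typ)


def decode_priority(priority):
    """Split a priority into (type preference, local preference, component)."""
    return priority >> 24, (priority >> 8) & 0xFFFF, 256 - (priority & 0xFF)


def pair_priority(controlling, controlled):
    """Candidate pair priority per RFC 5245: integer math only."""
    g, d = controlling, controlled
    return (1 << 32) * min(g, d) + 2 * max(g, d) + (1 if g > d else 0)


host = parse_candidate("a=candidate:1 1 UDP 2130706431 192.168.5.150 50003 typ host")
relay = parse_candidate("a=candidate:3 1 UDP 16648703 64.105.253.213 54183 "
                        "typ relay raddr 64.105.253.213 rport 54183")
print(host.typ, decode_priority(host.priority))
print(relay.typ, decode_priority(relay.priority))
print(pair_priority(host.priority, host.priority) >
      pair_priority(host.priority, relay.priority))
```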

Again, the details of the SDP differences between the two ICE versions are not terribly important.  Just remember the multipart nature of the SDP and how an R2 endpoint negotiates with legacy ICE endpoints.

Differences in A/V Edge 50,000 port range requirement
In OCS 2007, the external side of the A/V Edge server role required ports 50,000-59,999 to be open for UDP and TCP in the inbound and outbound direction.  Although this was a secure solution (see my original blog post), networking administrators perceived this to be a security threat and were very resistant to deploying the A/V Edge role.  To mitigate this deployment hurdle, OCS 2007 R2 reduces the requirement to just allowing ports 50,000-59,999 for TCP outbound only.  Moreover, the product documentation now states that this outbound TCP port support is only required to support federation with other OCS 2007 R2 environments.  To support remote users only, opening ports UDP 3478 and TCP 443 is sufficient.  (This remote-only mode worked in OCS 2007, but was not officially supported.)  What changed in the A/V Edge?  Well, the A/V Edge now supports federation over a “tunneled” link.
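Before looking at the tunnel itself, here is the 50,000-59,999 requirement from the paragraph above written out as a small Python lookup.  It covers only that port range (UDP 3478 and TCP 443 are left out), and the scenario labels and helper name are my own shorthand.

```python
MEDIA_RANGE = (50000, 59999)

# (protocol, direction) pairs that must be open for the 50,000-59,999 range
# on the external side of the A/V Edge, as stated in the text above.
RANGE_RULES = {
    "OCS 2007": [("UDP", "inbound"), ("UDP", "outbound"),
                 ("TCP", "inbound"), ("TCP", "outbound")],
    "R2, remote users only": [],
    "R2, federation with R2": [("TCP", "outbound")],
}


def describe(scenario):
    low, high = MEDIA_RANGE
    rules = RANGE_RULES[scenario]
    if not rules:
        return f"{scenario}: no ports in {low}-{high} need to be opened"
    opened = ", ".join(f"{proto} {direction}" for proto, direction in rules)
    return f"{scenario}: open {low}-{high} for {opened}"


for name in RANGE_RULES:
    print(describe(name))
```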

Let’s say an R2 OC endpoint within the Contoso company network calls an R2 OC endpoint within the Litware company network.  Both endpoints still advertise allocated ports in the 50,000-59,999 range in their candidate lists.  Now let’s say connectivity checks are happening and the Contoso R2 A/V Edge receives a UDP STUN connectivity check destined for the Litware A/V Edge.  Instead of sending that to the Litware A/V Edge using a source and destination port in the 50,000-59,999 range, the Contoso A/V Edge actually encapsulates this connectivity check in a new TURN tunnel message and sends it to the Litware A/V Edge using a UDP source and destination port of 3478.  Keep in mind that the intended source and destination IP/port numbers are passed within this tunnel packet.  When the Litware R2 A/V Edge receives this tunnel packet, it unpacks the message, looks at the intended source/destination IP/port info, and treats the packet as if it came to the destination IP/port from the source IP/port.

The idea is that conveying the knowledge of the intended source and destination IP/port for this connectivity check provides security equivalent to actually sending the connectivity check along that route.  This explains why UDP ports in the 50,000-59,999 range are no longer needed.  Why is TCP needed in the outbound direction only?  It turns out TCP also supports the same tunneling mechanism.  However, TCP’s connection oriented nature means problems can arise if the listening port is used as the source port when opening a TCP connection.  So in the connectivity check example used above, the Contoso A/V Edge opens a TCP connection to port 443 on Litware’s A/V Edge, choosing an ephemeral source port in the 50,000-59,999 port range.
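The encapsulation idea can be illustrated with a few lines of Python.  To be clear, the real TURN tunnel message format is not described in this post; the 12-byte header below (source IP, source port, destination IP, destination port) is an invented layout, and the function names and addresses are hypothetical.  The point is only that the intended addressing rides inside the packet while the packet itself travels over port 3478.

```python
import socket
import struct

# Hypothetical header: src IPv4, src port, dst IPv4, dst port (network order).
_HEADER = struct.Struct("!4sH4sH")


def wrap(intended_src, intended_dst, payload):
    """Prefix a connectivity check with its intended source and destination."""
    header = _HEADER.pack(socket.inet_aton(intended_src[0]), intended_src[1],
                          socket.inet_aton(intended_dst[0]), intended_dst[1])
    return header + payload


def unwrap(packet):
    """Recover the intended addressing and the original payload."""
    src_ip, src_port, dst_ip, dst_port = _HEADER.unpack_from(packet)
    payload = packet[_HEADER.size:]
    return ((socket.inet_ntoa(src_ip), src_port),
            (socket.inet_ntoa(dst_ip), dst_port),
            payload)


tunneled = wrap(("198.51.100.7", 54183), ("203.0.113.9", 51646), b"stun-check")
print(len(tunneled), unwrap(tunneled))
```

The receiving edge would then process the payload as if it had arrived on the intended destination IP/port from the intended source IP/port.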

Supporting federation with legacy A/V Edge servers
The example above works for two R2 OCS deployments.  What would happen if Litware was still on OCS 2007?  Again, both OC endpoints will advertise A/V Edge candidates in the 50,000-59,999 port range.  In order for connectivity to succeed, Contoso’s R2 A/V Edge must be able to send a connectivity check to Litware’s A/V Edge and vice versa.  In the first direction, Contoso doesn’t know that Litware’s A/V Edge is only on OCS 2007, so it tries to send the tunneled connectivity check packet, but Litware’s A/V Edge is legacy, so it drops these packets.  Hearing no response, the Contoso A/V Edge will then flip to direct mode where it will send the packet using a source and destination port in the 50,000-59,999 port range.  Similarly in the other direction, the Litware A/V Edge has no ability to send a tunneled connectivity check, so it sends directly in the 50,000-59,999 port range as well.  The same logic applies to TCP connectivity checks.  You can now see why opening the 50,000-59,999 port range for UDP and TCP in the inbound and outbound direction is required to support federation with legacy OCS 2007 A/V Edge deployments.
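The fallback behavior on the R2 side boils down to "try tunneled, then go direct".  Here is a minimal Python sketch of that decision; the function name, the callable parameters, and the half-second timeout are assumptions for illustration, not details of the A/V Edge implementation.

```python
def send_connectivity_check(send_tunneled, send_direct, timeout_s=0.5):
    """Try the tunneled path first; fall back to direct mode on silence.

    send_tunneled / send_direct are callables that take a timeout and return
    the peer's reply, or None if nothing came back in time.
    """
    reply = send_tunneled(timeout_s)
    if reply is not None:
        return "tunneled", reply
    # A legacy A/V Edge drops tunnel packets, so no reply means go direct.
    reply = send_direct(timeout_s)
    if reply is not None:
        return "direct", reply
    return "failed", None


# Peer is an R2 edge: the tunneled check is answered.
print(send_connectivity_check(lambda t: "ok", lambda t: "ok"))
# Peer is a legacy edge: the tunneled check is dropped, direct succeeds.
print(send_connectivity_check(lambda t: None, lambda t: "ok"))
```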

Port Range Implications
Supporting two versions of ICE in an INVITE does have implications on the number of ports allocated at the start of a call.  In the SDP snippet above, you’ll notice the version 6 ICE candidates are totally different than the version 19 ICE candidates, meaning two full candidate sets are allocated instead of just one set in OCS 2007.  Early media could also have an impact on the number of allocated ports if a called user has multiple points of presence.  Each called endpoint will allocate a set of candidates and perform a full ICE negotiation prior to the call being answered.  The fact that application sharing also uses ICE could further increase port usage.

The majority of these ports are short lived and will be de-allocated within 10 seconds of the call being answered.  The only ports that remain for the duration of a call are those actually used to send and receive media.  Nonetheless, this increased port usage at the start of a call could be an issue for enterprises who have narrowed the allowed port range of their endpoints or reduced the number of ports in the A/V Edge’s 50,000 port range.  For these reasons, the OCS team recommends the media port range for R2 Office Communicator clients to be at least 40 ports, twice the recommendation provided in OCS 2007.

Conclusion
Although the fundamental architecture of media traversal remains the same in OCS 2007 R2, a number of enhancements have been made.  Key impacts include: faster negotiation of the media stream through early media ICE negotiations, leveraging ICE/STUN/TURN for new modalities such as application sharing, and easing the port range requirements on the A/V Edge server through a tunneled federation mode.  This revised implementation of ICE/STUN/TURN will serve as a great foundation for enabling connectivity of new media scenarios in future versions of the Microsoft Unified Communications product line.

Alan Shen | Director


March 18, 2009 08:53 by alanshen


Privacy  |  Contact  |  Terms of Use  |  www.unifysquare.com | Copyright © 2009 Unify2  -  All rights reserved.
Microsoft and the Microsoft Logos are trademarks of Microsoft, Inc. Unify2 is a trademark of Unify Square, Inc.