||H.323 was designed with a good understanding of the requirements for multimedia communication over IP networks, including audio, video, and data conferencing. It defines an entire, unified system for performing these functions, leveraging the strengths of the IETF and ITU-T protocols.
As a result, it might be reasonable for users to expect about the same level of robustness and interoperability as is found on the PSTN today, although this admittedly varies across the globe.
H.323 was designed to scale to add new functionality. The most widely deployed use of H.323 is "Voice over IP" followed by "Videoconferencing", both of which are described in the H.323 specifications.
|SIP was designed to set up a "session" between two points and to be a modular, flexible component of the Internet architecture. It has a loose concept of a call (that being a "session" with media streams), has no support for multimedia conferencing, and the integration of sometimes disparate standards is largely left up to each vendor.
As a result, SIP is now a 10-year-old protocol with a vast number of interoperability problems. While SIP has been successfully deployed in some environments, those are generally "closed" environments where the means of interoperability has been PSTN gateways.
H.323 has defined a number of features to handle failure of intermediate network entities, including "alternate gatekeepers", "alternate endpoints", and a means of recovering from connection failures.
|SIP has not defined procedures for handling device failure. If a proxy fails, the user agent detects this through timer expiration. It is the responsibility of the user-agent to send a re-INVITE to another proxy, leading to long delays in call establishment.
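The timer-driven failover described for SIP can be sketched as a simple retry loop. This is a minimal illustration, not a real user-agent implementation: `send_invite`, `fake_send`, and the proxy host names are hypothetical, and a timer expiration is modelled as a `TimeoutError`.

```python
from typing import Callable, Iterable, Optional


def invite_with_failover(
    proxies: Iterable[str],
    send_invite: Callable[[str], bool],
) -> Optional[str]:
    """Return the first proxy that accepts the INVITE, or None if all fail.

    `send_invite` is a hypothetical callable that raises TimeoutError when
    the transaction timer expires without a response from the proxy.
    """
    for proxy in proxies:
        try:
            if send_invite(proxy):
                return proxy
        except TimeoutError:
            # Timer expired: assume this proxy is down and try the next one.
            continue
    return None


# Illustrative stand-in for a real transport: the first proxy never answers.
def fake_send(proxy: str) -> bool:
    if proxy == "proxy1.example.com":
        raise TimeoutError
    return True


chosen = invite_with_failover(
    ["proxy1.example.com", "proxy2.example.com"], fake_send
)
```

Each failed proxy costs a full timer interval before the next attempt, which is where the call-establishment delay comes from.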
||ASN.1, a standardized, extremely precise, easy-to-understand structural notation that is used by many other systems.
||ABNF, or Augmented Backus-Naur Form, a syntactical notation. SIP uses the ABNF as defined in RFC 2234.
||H.323 encodes messages in a compact binary format that is suitable for narrowband and broadband connections. Messages are efficiently encoded and decoded by machines, with decoders widely available (e.g., Ethereal).
||SIP messages are encoded in ASCII text format, suitable for humans to read. As a consequence, the messages are large and less suitable for networks where bandwidth, delay, and/or processing are a concern.
SIP messages get so large that they sometimes exceed the MTU size when going over WAN links, resulting in delays, packet loss, etc. As a result, effort has been made to compress SIP messages (e.g., RFC 3485 and RFC 3486).
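The size concern can be made concrete with a small check. The sketch below assumes the rule from RFC 3261 that a request larger than 1300 bytes (when the path MTU is unknown), or within 200 bytes of a known path MTU, should use a congestion-controlled transport such as TCP. The function name and the sample message are illustrative only.

```python
from typing import Optional

UDP_SAFE_LIMIT = 1300  # bytes; threshold used when the path MTU is unknown
MTU_MARGIN = 200       # bytes of headroom required below a known path MTU


def needs_reliable_transport(message: str, path_mtu: Optional[int] = None) -> bool:
    """Return True if the encoded message is too large to send safely over UDP."""
    size = len(message.encode("utf-8"))
    if path_mtu is None:
        return size > UDP_SAFE_LIMIT
    return size > path_mtu - MTU_MARGIN


# A deliberately small, illustrative request; real INVITEs carry many more
# headers plus an SDP body.
sample = (
    "INVITE sip:bob@example.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP client.example.com\r\n"
    "Max-Forwards: 70\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)
sample_needs_tcp = needs_reliable_transport(sample)
```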
|H.323 is extended with non-standard features in such a way as to avoid conflicts between vendors. Globally unique identifiers prevent feature and data element collision.
||SIP is extended by adding new header lines or message bodies that may be used by different vendors to serve different purposes, thus risking interoperability problems.
The risk is admittedly small, but this problem has already been seen in the real world with similar extension schemes.
|H.323 is extended by the standards community to add new features to H.323 in such a way as to not impact existing features. However, new revisions of H.323 are published periodically, which introduce new functionality that is mandatory, yet done in such a way as to preserve backward compatibility.
||SIP is extended by the standards community to add new features to SIP in such a way as to not impact existing features. However, new revisions of SIP are potentially not backward compatible (e.g., RFC 3261 was not entirely compatible with RFC 2543). In addition, several extensions are "mandatory" in some implementations, which cause interoperability problems.
|H.323 has the ability to load balance endpoints across a number of alternate gatekeepers in order to scale a local point of presence. In addition, endpoints report their available and total capacity so that calls going to a set of gateways, for example, may be best distributed across those gateways.
||SIP has no notion of load balancing, except "trial and error" across pre-provisioned devices or devices learned from DNS SRV records. There is no means of detecting the load on a particular gateway or of knowing whether a device has failed, meaning that proxies simply have to try a PSTN gateway, wait for the call to time out, and then try another.
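To illustrate the difference, here is a minimal sketch of capacity-based selection, the kind of decision that reported capacity makes possible. The `Gateway` record and its field names are hypothetical and do not mirror the actual H.323 message fields.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Gateway:
    name: str
    total_calls: int       # total capacity reported by the gateway
    available_calls: int   # capacity currently free


def pick_gateway(gateways: Iterable[Gateway]) -> Optional[Gateway]:
    """Choose the gateway with the most free capacity, or None if all are full."""
    candidates = [g for g in gateways if g.available_calls > 0]
    if not candidates:
        return None
    return max(candidates, key=lambda g: g.available_calls)


pool = [
    Gateway("gw-a", total_calls=100, available_calls=5),
    Gateway("gw-b", total_calls=100, available_calls=40),
    Gateway("gw-c", total_calls=50, available_calls=0),
]
best = pick_gateway(pool)
```

Without reported capacity, the only alternative is to try each device in turn and wait for a timeout on the ones that are full or down.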
|When an H.323 gatekeeper is used, it may simply provide address resolution through one RAS message exchange, or it may route all call signaling traffic. In large networks, the direct call model may be used so that endpoints connect directly to one another.
||When using a SIP proxy to perform address resolution for the SIP device, the proxy is required to handle at least 3 full message exchanges for every call. In large networks, such as IMS networks, the number of messages on the wire may be excessive. A basic call between two users may require as many as 30 messages on the wire!
|An H.323 gatekeeper can be stateless using the direct call model.
||A SIP proxy can be stateless if it does not fork, use TCP, or use multicast.
|H.323 defines an interface between the endpoint and gatekeeper for address resolution using ARQ or LRQ. The H.323 gatekeeper may use any number of protocols to discover the destination address of the callee, including LRQs to other gatekeepers, Annex G/H.225.0, TRIP, ENUM, and/or DNS. The endpoint does not have to be concerned with the mechanics of this process, and the processing requirements for address resolution placed on the gatekeeper by H.323 are for just a single message exchange.
Although out of scope of H.323, an H.323 endpoint may perform its own address resolution using ENUM and/or DNS and then place a direct call to the resolved address or provide the resolved address to the gatekeeper as an "alias".
|While SIP has no address-resolution protocol, per se, a SIP user agent may route its INVITE message through a proxy or redirect server in order to resolve addresses. The SIP proxy may use various protocols to discover the destination address of the callee, including TRIP, ENUM, and/or DNS. The endpoint does not have to be concerned with the mechanics of this process. Unfortunately, the processing requirements placed on the SIP proxy are higher than with H.323 because at least 3 message exchanges must take place between the SIP device, SIP proxy, and the next hop.
Although out of scope of SIP, a SIP user agent may perform its own address resolution using ENUM and/or DNS and then place a direct call to the resolved address or through a proxy.
||Flexible addressing mechanisms, including URIs, e-mail addresses, and E.164 numbers.
H.323 supports these aliases:
- E.164 dialed digits
- generic H.323 ID
- transport address
- email address
- party number
- mobile UIM
- ISUP number
H.323 also supports overlap sending with no additional overhead, except conveyance of the newly received digits in a single message.
|SIP only understands URI-style addresses. This works fine for SIP-SIP devices, but causes some confusion when trying to translate various dialed digits. The unofficial convention is that a "+" sign is inserted in the SIP URI (e.g., "sip:+email@example.com") in order to indicate that the number is in E.164 format, versus a user ID that might be numeric.
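The "+" convention can be sketched as a simple check on the user part of the URI. This is an illustration under stated assumptions: the helper names and the sample URIs are hypothetical, and the pattern only enforces the E.164 limit of at most 15 digits.

```python
import re

# "+" followed by 2 to 15 digits, the first of which is not zero.
_E164_USER = re.compile(r"\+[1-9]\d{1,14}")


def user_part(sip_uri: str) -> str:
    """Return the user part of a 'sip:user@host' URI."""
    if not sip_uri.startswith("sip:"):
        raise ValueError("not a SIP URI")
    return sip_uri[len("sip:"):].split("@", 1)[0]


def looks_like_e164(sip_uri: str) -> bool:
    """True if the user part follows the '+digits' convention for E.164."""
    return _E164_USER.fullmatch(user_part(sip_uri)) is not None


is_number = looks_like_e164("sip:+15551234567@example.com")
is_numeric_user_id = looks_like_e164("sip:15551234567@example.com")
```

The second URI shows the ambiguity: without the "+", a numeric user part cannot be told apart from a user ID that happens to be all digits.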
SIP has support for overlapped signaling defined in RFC 3578, though each additional digit received requires transmission of three messages on the wire (a new INVITE, a 484 response to indicate that the address is incomplete, and an ACK).
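The overhead difference is simple arithmetic. The sketch below assumes the worst case in which every additional digit is sent on its own, using the per-digit message counts given in the text; the constant and function names are illustrative.

```python
SIP_MESSAGES_PER_DIGIT = 3   # new INVITE, 484 response, ACK
H323_MESSAGES_PER_DIGIT = 1  # one message conveying the new digits


def overlap_messages(extra_digits: int, messages_per_digit: int) -> int:
    """Messages on the wire for digits that arrive after the initial request."""
    if extra_digits < 0:
        raise ValueError("extra_digits must be non-negative")
    return extra_digits * messages_per_digit


# Four digits dialed after the first request was sent.
sip_cost = overlap_messages(4, SIP_MESSAGES_PER_DIGIT)
h323_cost = overlap_messages(4, H323_MESSAGES_PER_DIGIT)
```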
||Even with H.323's direct call model, the ability to successfully bill for the call is not lost because the endpoint reports to the gatekeeper the beginning and end time of the call via the RAS protocol. Various pieces of billing information may be present in the ARQ and DRQ messages at the start and end of the call.
||If the SIP proxy wants to collect billing information, it has no choice but to stay in the call signaling path for the entire duration of the call so that it can detect when the call completes. Even then, the statistics are skewed because the call signaling may have been delayed. Otherwise, there is no mechanism in SIP to perform any accounting/billing function.
||A call can be established in as few as 1.5 round trips using UDP:
Of course, more elaborate call establishment procedures may be required to negotiate complex capabilities, negotiate complex video modes, etc.
|A call can be established in as few as 1.5 round trips using UDP:
INVITE ->
<- 200 OK
ACK ->
Most real-world flows are more complex, as they often pass through one or more proxy devices, have intermediary response messages, and "negotiate" capabilities through a "trial and error" process that is far from scientific.
||H.323 entities may exchange capabilities and negotiate which channels to open, including audio, video, and data channels. Individual channels may be opened and closed during the call without disrupting the other channels.
||SIP entities have limited means of exchanging capabilities. RFC 3407 is the state of the art, which is more or less a "declaration" mechanism, not a negotiation procedure. The end result is still a "trial and error" approach in case the called party does not support the proposed media.
||An H.323 gatekeeper can control the call signaling and may fork the call to any number of devices simultaneously.
||SIP proxies can control the call signaling and may fork the call to any number of devices simultaneously.
||H.323 borrows from traditional PSTN protocols, e.g., Q.931, and is therefore well suited for PSTN integration. However, H.323 does not employ the PSTN's circuit-switched technology--like SIP, H.323 is completely packet-switched. How Media Gateway Controllers fit into the overall H.323 architecture is well-defined within the standard.
||SIP has no commonality with the PSTN and such signaling must be "shoe-horned" into SIP. SIP has no architecture that describes the decomposition of the gateway into the Media Gateway Controller and the Media Gateways. This has been a recent study of 3GPP and others in the form of IMS. Presently, there are about 4 "IMS" variants: 3GPP, ITU NGN, 3GPP2, and PacketCable. Pick the architecture you like best, I suppose.
||Services may be provided to the endpoint through a web-browser interface using HTTP or a feature server using Megaco/H.248. In addition, services may be provided to an endpoint as it places a call, as a call arrives, or during the middle of a call by a gatekeeper or other entity that routes the call signaling. As a result, H.323 is well-suited to providing new services.
||SIP devices can receive service from a SIP proxy as the endpoint places a call, as a call arrives, or during the middle of a call. There is no defined way within SIP of providing services via a web browser or a feature server, as everything is done within the context of a "session".
One may provide ad-hoc services through other means, such as XML, SOAP, or CPL. However, there are no standards for this.
|Video and Data Conferencing
||H.323 fully supports video and data conferencing. Procedures are in place to provide control for the conference as well as lip synchronization of audio and video streams.
||SIP has limited support for video and no support for data conferencing protocols like T.120. SIP has no protocol to control the conference and there is no mechanism within SIP for lip synchronization. There is no standard means of recovering from packet loss in a video stream (to parallel H.323's "video fast update" command).
||H.323 does not require a gatekeeper. A call can be made directly between two endpoints.
However, most devices do utilize a gatekeeper for the purpose of registration and address resolution.
|SIP does not require a proxy. A call can be made directly between two user agents.
However, most devices do utilize a SIP proxy for the purpose of registration, address resolution, and call routing.
||H.323 supports any codec, standardized or proprietary. No registration authority is required to use any codec in H.323.
||SIP supports any IANA-registered codec (as a legacy feature) or other codec whose name is mutually agreed upon.
||Provided by H.323 "proxy" or by the endpoint, both in conjunction with a gatekeeper residing in the public network. Refer to H.460.17, H.460.18, and H.460.19.
||SIP does not define a NAT/FW traversal mechanism, as this is left to other standards. Some standards that have been defined or are being defined are STUN, TURN, ANAT, and ICE. (All of this has been work in progress for years, with most workable solutions done by agreed convention.)
||Reliable or unreliable, e.g., TCP or UDP. Most H.323 entities use a reliable transport for signaling.
||Reliable or unreliable, e.g., TCP or UDP. Most SIP entities use an unreliable transport for signaling.
||Routing gatekeepers can detect loops by looking at the CallIdentifier and destinationAddress fields in call-processing messages. If the combination of these matches an existing call, it is a loop. Infinite loops may be prevented by utilizing the hopCount field in the SETUP message.
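The loop check described above can be sketched as a lookup on the pair of call identifier and destination address, with a hop counter as a backstop. This is a minimal illustration: the class, method names, and the exact way `hop_count` is handled are assumptions, not the procedure defined in the standard.

```python
from typing import Set, Tuple


class LoopDetector:
    """Tracks active calls by (call identifier, destination address)."""

    def __init__(self) -> None:
        self._active: Set[Tuple[str, str]] = set()

    def admit(self, call_identifier: str, destination: str, hop_count: int) -> bool:
        """Return True if the setup may be routed onward, False if it is rejected."""
        if hop_count <= 0:
            # Hop budget exhausted: stop forwarding regardless of loop state.
            return False
        key = (call_identifier, destination)
        if key in self._active:
            # Same call and same destination seen again: treat it as a loop.
            return False
        self._active.add(key)
        return True

    def release(self, call_identifier: str, destination: str) -> None:
        """Forget a call once it has ended."""
        self._active.discard((call_identifier, destination))


detector = LoopDetector()
first = detector.admit("call-1", "bob", hop_count=5)
looped = detector.admit("call-1", "bob", hop_count=4)
```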
||The Via header facilitates this. However, there has been talk about deprecating Via as a means of loop detection due to its complexity. Instead, the Max-Forwards header seems to be the preferred method of limiting hops and therefore loops. In November 2005, a presentation was given on issues with Max-Forwards. So, what is the right solution?
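Hop limiting with Max-Forwards can be sketched in a few lines. The sketch assumes the RFC 3261 behavior: a proxy decrements the value before forwarding, rejects a request whose value is already 0 with a 483 (Too Many Hops) response, and inserts a default of 70 when the header is absent. The function name and return shape are illustrative.

```python
from typing import Optional, Tuple

DEFAULT_MAX_FORWARDS = 70
TOO_MANY_HOPS = 483


def next_hop_max_forwards(current: Optional[int]) -> Tuple[Optional[int], Optional[int]]:
    """Return (value to forward with, error status); exactly one is None."""
    if current is None:
        # Header missing: add one with the default value.
        return DEFAULT_MAX_FORWARDS, None
    if current <= 0:
        # Hop budget used up: do not forward, answer with an error instead.
        return None, TOO_MANY_HOPS
    return current - 1, None


forwarded, error = next_hop_max_forwards(70)
```

A request caught in a loop is therefore dropped after a bounded number of hops, even if no proxy recognizes the loop itself.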
||Yes, location requests (LRQ) and auto gatekeeper discovery (GRQ).
||Yes, e.g., through group INVITEs.
|Third-party Call Control
||Yes, through third-party pause and re-routing which is defined within H.323. More sophisticated control is defined by the related H.450.x series of standards.
||Yes, through SIP as described in RFC 3725.
|Minimum Ports for VoIP Call
||3 (Call signaling, RTP, and RTCP.)
||3 (SIP, RTP, and RTCP.)
||Yes, an MC is required for this, but it could be co-located in a participating endpoint, or all endpoints could contain an MC. A stand-alone conference bridge may provide this functionality and H.323 has well-defined procedures for such entities.
What distinguishes H.323 is not that it requires yet another onerous physical entity for conferencing (it does not) but that it just has a name for this functionality, an "MC," and that it provides a flexible means of implementing that functionality.
|No; however, SIP user agents may perform conferencing themselves. A stand-alone conference bridge may also provide this functionality.
||"VISUAL TELEPHONE SYSTEMS AND EQUIPMENT FOR LOCAL AREA NETWORKS WHICH PROVIDE A NON-GUARANTEED QUALITY OF SERVICE"
It is now, "Packet-based multimedia communications systems."
Despite the word, "VISUAL," in the original title, H.323 has never described just a videoconferencing solution--support for video and data has always been optional. And the reference to LANs may be misleading because H.323 was intended from the start to support simple and "complex topologies" and not just single-segment networks, which "LOCAL AREA NETWORKS" may imply.
|"Application-level protocol for inviting users to multimedia conferences [emphasis ours]"
It is now, "SIP: Session Initiation Protocol."
Note that the "multimedia conferences" referred to in the original title are loosely coupled multicast conferences, à la MBone. This is because SIP was intended to be just a point-to-point version of SAP and not the "carrier-class solution addressing a wide area" that many would have you believe.
||H.323 is based on H.324, not H.320. However, H.324 was designed to be a better H.320.
As you can see, H.323 is no more a "legacy" protocol than SIP. Both are very modern protocols.
- 1990 - H.320 approved.
- 1995 - H.324 approved.
- 1995 - H.323 working draft circulated.
- 1996 - H.323 approved.
- 2000 - H.323v4 approved.
- 2003 - H.323v5 approved.
- 2006 - H.323v6 approved.
|SIP is frequently allied with the Internet and the World Wide Web by way of HTTP.
- 1990 - WWW and HTTP described and implemented.
- 1996 - SIP Internet Draft circulated.
- 1999 - SIP (RFC 2543) approved.
- 2002 - SIP (RFC 3261) approved.
While backward compatibility was not maintained between the 1999 and 2002 documents, the version number remained the same: "version 2.0".
||Yes, e.g., OpenH323.
||Yes, e.g., reSIProcate.
||Unicast, multicast, star, and centralized.
||Unicast, multicast, star, and centralized.
||Yes, via H.235.
||Yes, via HTTP (Digest and Basic), SSL, PGP, S/MIME, or various other means.
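As an illustration of one of the listed options, the sketch below computes an HTTP Digest response in its simplest form (MD5, no `qop`), as defined in RFC 2617. The credentials, realm, and nonce are made-up values, and a real implementation would also handle `qop`, `cnonce`, and nonce counts.

```python
import hashlib


def _md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(
    username: str, realm: str, password: str,
    method: str, uri: str, nonce: str,
) -> str:
    """Digest response without qop: MD5(HA1:nonce:HA2)."""
    ha1 = _md5_hex(f"{username}:{realm}:{password}")  # identifies the user
    ha2 = _md5_hex(f"{method}:{uri}")                  # identifies the request
    return _md5_hex(f"{ha1}:{nonce}:{ha2}")


# Hypothetical values for a REGISTER challenge.
response = digest_response(
    "alice", "example.com", "secret",
    "REGISTER", "sip:example.com", "abc123",
)
```

The password itself never crosses the wire; the server repeats the same computation and compares the results.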
||Yes, via H.235 (including use of SRTP, TLS, IPSec, etc.).
||Yes, via SSL, PGP, S/MIME, or various other means.
||Three ways, with the alphanumeric choice of the H.245 UserInputIndication message being the baseline carriage common to all H.323 endpoints.
||Three ways. There is no baseline carriage, which presents issues of interoperability. However, transport of DTMF via the INFO method and RFC 2833 are most common.
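For the RFC 2833 option, each DTMF digit is carried in a 4-byte RTP payload: an 8-bit event code, an end bit, a reserved bit, a 6-bit volume, and a 16-bit duration. A minimal encoder might look like the sketch below; the function name is illustrative, and RTP header construction is left out.

```python
import struct

# Event codes for DTMF: digits 0-9, then "*", "#", and A-D.
DTMF_EVENTS = {ch: code for code, ch in enumerate("0123456789*#ABCD")}


def encode_dtmf_event(digit: str, end: bool, volume: int, duration: int) -> bytes:
    """Pack one telephone-event payload (event, E/R/volume, duration)."""
    if digit not in DTMF_EVENTS:
        raise ValueError(f"not a DTMF digit: {digit!r}")
    if not 0 <= volume <= 63:
        raise ValueError("volume must fit in 6 bits")
    if not 0 <= duration <= 0xFFFF:
        raise ValueError("duration must fit in 16 bits")
    flags_and_volume = (0x80 if end else 0x00) | volume  # reserved bit stays 0
    return struct.pack("!BBH", DTMF_EVENTS[digit], flags_and_volume, duration)


# Final packet for digit "5": volume 10, duration 800 timestamp units.
payload = encode_dtmf_event("5", end=True, volume=10, duration=800)
```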
||Refer to the H.323 Information Site.
||Refer to the SIP Information Site.