Audio, Video and Data Conferencing over the Internet: Technologies and Standards
Bernd J. Kurz Faculty of Computer Science, University of New Brunswick Fredericton, N.B., Canada E3B 5A3
1. What is it? - Driving Forces
2. A brief History
3. IP Networks and R/T data: a Challenge
4. R/T Protocol Suites
5. Managed IP Nets: QoS
6. Conferencing Standard H.323
7. Directory Servers
8. Multi-Party Conferencing
9. IP Telephony
10. Future Unifying IP Networks
11. Conclusions
12. References

Fredericton, N.B., November 1999

==========================================================
1. What is it? - Driving Forces
- Personal communication over the Internet
  beyond e-mail and WWW
  multi-media environment consisting of:
  - Audio - transmission of digitized sound
      one-way and two-way
      radio broadcast, telephony
  - Video - transmission of digitized moving images
      adding a visual medium
      one-way and two-way
      TV broadcast, video conferencing
  - Data - traditional medium
      text messaging
      file transfer
      whiteboarding - graphics, photo
      application sharing
      application collaboration
- Synergy of above media in multi-media environment
  state-of-the-art conferencing services
  point-to-point and multi-party capability
- Exploit accessibility of the Internet
  enabling technologies:
    Voice-over-IP (VoIP)
    Video-over-IP

Fig 1.1 A Typical Internet Videoconference

==========================================================
Driving Forces
Desktop client/server applications are dominant
- Distance learning
  exploiting MM approach
- Telecommuting
  enabling mobility of workforce
- Commercial services
  advertisement with MM information
  news dissemination
    'live': radio, TV broadcasting
    'canned': audio and video-on-demand
  technical on-line support with MM
  CTI systems with IV(V)R
- Entertainment
  chat lines
  virtual meeting places
  Buddy sites, ICQ, AOL
- Kids' stuff
  education and fun combined
  adventure games

Fig 1.2 Education
Fig 1.3 YTV
Fig 1.4 ICQ Panel

==========================================================
2. A brief History
The migration of voice (and later data and video) from Switched-Circuit networks (SCN) to IP-routed networks
- Pre 1980 PSTN
  analog networks, voice only
  CCT-switched, voice-grade 4 kHz
- Post 1980 N-ISDN
  digital networks, voice and data services
  CCT-switched and packet-switched, 16 (D) to 64...128 kbps (B)
- Post 1990 B-ISDN
  digital broadband networks, voice, video and data
  fast-cell switching (ATM), CCT-emulation, 155+ Mbps
- Post 1985 Internet
  IP-routed networks, initially data
  world-wide access, inexpensive, available bandwidth

Fig 2.1 A Circuit-Switched Network

==========================================================
- Switching versus Routing
CCT-switching (Layer 2) PSTN
  fixed physical path end-to-end, reserved bandwidth
  reliable ordered delivery
  fast table-lookup
  100% bandwidth reservation
  guaranteed high QoS
Packet-switching (Layer 2) X.25, FR, ATM
  fixed virtual path end-to-end
  virtual CCT (VCI, VPI, LCI)
  reliable ordered delivery
  fast table-lookup
  path inventory is known
  bandwidth reservation possible
  mostly statistical multiplexing used
  statistically guaranteed QoS (<100%)
IP-routing (Layer 3) Internet
  datagram technique
  variable path per packet, no resource reservation
  unreliable, out-of-order, best-effort delivery
  computer-intensive routing
  path inventory not known
  no effective reservation possible
  no QoS guarantee
  robust, adaptive and self-healing

Fig 2.1 Circuit-Switched Network
Fig. 2.2 IP-Routed Network

==========================================================
- Early audio/video conferencing systems, 1980 ...
  CCT-switched PSTN, nx56...nx64
  H.260 codec standards
  predictable performance over reserved links
  UNB's PictureTel
  1...3 N-ISDN lines or FR, ATM
  costly
- Emerging audio/video conferencing systems, 1995 ...
  PC-based desktop conferencing
  IP-based, available low bit rates (...20...100 kbps)
  over non-reserved, unreliable, non-guaranteed QoS links
  Internet
  Internetphone 1995
  affordable to free
- Evolution of ITU-T H.32x standard family
  H.320 (1990)  N-ISDN, 2..23xB + D
  H.321 (1996)  B-ISDN, ATM
  H.322         LANs with guaranteed QoS
  H.323 (1996)  unreliable unreserved packet-based nets: Internet, Enterprise IP nets, IP LANs
  H.324 (1995)  PSTN, V.90, V.32+ modem speed
- Focus is on H.323 standard framework

Fig. 2.3 Desktop Videoconferencing
Fig. 2.4 H.323 Architecture

==========================================================
3. IP Networks and R/T Data: a Challenge
- Routing in IP networks
  Layer 3 operation
  connection-less
  datagram technique
  independent store-and-forward per packet at every node
  variable path per packet
  computer-intensive routing
  generally no resource reservation
  unreliable, out-of-order delivery
  unpredictable, best-effort
  no QoS guarantee
  robust and self-healing

Fig. 3.1 IP-Routed Network

- Internet Protocol suite
  Layer 4 operation, end-to-end
  TCP/UDP packets carried by IP packets
  TCP: reliable, connection-oriented, slow
  UDP: unreliable, fast and minimum delay

Fig. 3.2 UDP/TCP-IP Protocol Stack

- Real-Time data sources
  data streams with time-constraints
  voice, video
  analog signals
    continuous in amplitude and time
  discretization
    regular sampling in time (Nyquist, Shannon)
    uniform quantization of amplitude (resolution, jnd)
    ADC/DAC
  telephone quality
    4 kHz BW
    8000 samples/s at 7..8 bits/sample
    PCM coding at 56...64 kbps data rate
  VGA video
    640x480 pels at 30 fps, 256 colors
    approx. 75 Mbps
- Need for effective data compression
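
The raw data rates quoted for telephone-quality PCM and for VGA video can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, 256 colors is assumed to mean 8 bits per pixel, and no packet or framing overhead is counted.

```python
def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int) -> int:
    """Uncompressed PCM data rate in bits per second."""
    return sample_rate_hz * bits_per_sample


def raw_video_bit_rate(width: int, height: int, fps: int, bits_per_pixel: int) -> int:
    """Uncompressed video data rate in bits per second."""
    return width * height * fps * bits_per_pixel


# Telephone quality: 4 kHz bandwidth sampled at 8000 samples/s (Nyquist rate).
voice_7bit = pcm_bit_rate(8000, 7)   # 56 000 bit/s
voice_8bit = pcm_bit_rate(8000, 8)   # 64 000 bit/s

# VGA video: 640x480 pels, 30 fps, 256 colors (assumed 8 bits per pixel).
vga = raw_video_bit_rate(640, 480, 30, 8)   # 73 728 000 bit/s, roughly 75 Mbps

# Compression ratio needed to squeeze raw VGA video into one 64 kbps channel.
ratio_to_one_b_channel = vga / voice_8bit

print(f"PCM voice: {voice_7bit / 1000:.0f}...{voice_8bit / 1000:.0f} kbps")
print(f"Raw VGA video: {vga / 1e6:.1f} Mbps")
print(f"Compression ratio needed for 64 kbps: {ratio_to_one_b_channel:.0f}:1")
```

The ratio of more than 1000:1 between raw VGA video and a single 64 kbps channel is what motivates the codecs discussed next.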
==========================================================
- Data compression
  codecs reduce/remove redundancies in data stream
  lossless and lossy
  CBR and VBR
- Voice codecs
  waveform coding in time-domain
    DPCM, ADPCM, ADM
    μ-law and A-law
    compression ratios 2 ... 8
    CBR
  vocoder coding in spectral domain
    model, analyse and synthesize voice
    formants, pitch, excitation, ...
    send signature of voice frame
    reconstruct at destination
    high compression ratios 10 ... 20
    computer-intensive or HW-based
    mostly VBR with silence compression
- Standard codec types ITU-T
  G.711    waveform, PCM, (48, 56,) 64 kbps
  G.727    waveform, ADPCM, 8k x 5,4,3,2 = 40...16 kbps
  G.723.1  vocoder, low-bit-rate CELP, 5.3...6.3 kbps
  G.729    vocoder, ACELP, 8 kbps

Fig. 3.3 ADC and Pulse Code Modulation (PCM)
Fig. 3.4 Block diagram of CELP Vocoder

==========================================================
- Video codecs
  transform coding in spectral domain
  JPEG-based
    VBR
    DCTs on 8x8 pixel blocks
    quantization of spectral coefficients
    truncation of diagonal scan (energy clustering)
  MPEG-based
    CBR
    enhanced by motion-prediction
    frame-to-frame redundancy reduction
    very high compression ratio, down to V.32+ data rates
- Standard codec types ITU-T
  H.261  N-ISDN and IP, LAN, 64 kbps
  H.263  V.34+ modems, 28.8 kbps
  H.262  MPEG-2, T1...OC3 (ATM), 1.544+ Mbps
- Consequences for real-time data over routed IP nets
  source sends regular data stream
  destination needs timely in-order delivery for reconstruction
  but no delivery guarantee, no QoS guarantee
  we have a problem
- Quality of Service
  affecting quality of received voice and video
  some criteria
    sustained data rate
    propagation delay
    delay-jitter
    packet loss and error ratio
    out-of-order delivery
    echo
- Need special end-to-end protocols to overcome lack of QoS
  R/T protocols partially restore pseudo-QoS at destination
  penalties arise

==========================================================
4. R/T Protocol Suites - a Partial Solution
- Packet arrivals at destination of an IP network
  subject to
    delay
    delay-jitter
    out-of-order packets
    loss of packets
  reconstruct voice from packets in streaming mode
- Non-buffered receiver
  reconstruct voice from ordered packet stream

Fig. 4.1 Arrival and Processing of Voice Packets, Non-buffered

  wait for late packets
    gaps of silence and temporal distortion
    cumulative delay in reconstruction
    complex buffer management
  missing packets
    max permissible wait time W
    discard late arrivals >W
    silence gaps are less disturbing than temporal distortions

==========================================================
- Pre-buffering at receiver
  counteract effects of irregular packet arrivals

Fig. 4.2 Pre-buffering and Operation

  pre-buffer L seconds of R/T data
  insert late arrivals
  release data regularly for voice reconstruction
  large L: less probability of empty buffer
  inverted queuing system
  optimal buffer length L depends on
    delay-jitter statistics
    application
      one-way delivery (radio): large L permissible, ...60 sec
      two-way telephony: <200 msec
  trade-off
    initial delay <-> glitches
    responsiveness <-> voice quality
  auto-adaptation to traffic statistics, net congestion
- Requirement for time information with each packet
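
The trade-off between pre-buffer length L and glitches can be made concrete with a small simulation. This is a minimal sketch under stated assumptions, not a description of any particular product: `Packet`, `playout_schedule` and the delay values are hypothetical, the pre-buffer is fixed rather than auto-adapting, and times are integer milliseconds to keep the comparison exact. Each packet carries a sequence number and a source timestamp, which is the time information the receiver needs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    seq: int         # sequence number stamped by the source
    send_ms: int     # source timestamp, in milliseconds
    arrival_ms: int  # time the packet reached the receiver, in milliseconds


def playout_schedule(packets, prebuffer_ms):
    """Split packets into (played, dropped) sequence numbers for a fixed pre-buffer L."""
    if not packets:
        return [], []
    # Playback starts prebuffer_ms after the earliest arrival; every packet is
    # then due at its own source timestamp plus this constant offset.
    first = min(packets, key=lambda p: p.arrival_ms)
    offset_ms = first.arrival_ms + prebuffer_ms - first.send_ms
    played, dropped = [], []
    for p in sorted(packets, key=lambda p: p.seq):
        deadline_ms = p.send_ms + offset_ms
        if p.arrival_ms <= deadline_ms:
            played.append(p.seq)   # arrived in time, released in sequence order
        else:
            dropped.append(p.seq)  # late arrival: discarded, heard as a gap
    return played, dropped


# Hypothetical stream: one voice frame every 20 ms, with made-up network delays.
# Packet 2 is delayed so much that it arrives after packets 3 and 4.
SEND_INTERVAL_MS = 20
NETWORK_DELAYS_MS = [50, 50, 120, 60, 50]
PACKETS = [
    Packet(seq=i, send_ms=i * SEND_INTERVAL_MS, arrival_ms=i * SEND_INTERVAL_MS + d)
    for i, d in enumerate(NETWORK_DELAYS_MS)
]

for prebuffer_ms in (0, 20, 80):
    played, dropped = playout_schedule(PACKETS, prebuffer_ms)
    print(f"L = {prebuffer_ms:3d} ms  played {played}  dropped {dropped}")
```

With these made-up delays, L = 0 ms drops packets 2 and 3, L = 20 ms drops only packet 2, and L = 80 ms plays everything at the cost of 80 ms of extra initial delay.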
==========================================================
- Real-time protocol suite standard
  Layer 4 operation end-to-end
  typically above UDP
  RTP (ITU-T, IETF)
    Real-time Transport Protocol
    defines R/T data encapsulation
    sequence numbering and time-stamping
    does not guarantee timely delivery
  RTCP (ITU-T, IETF)
    companion protocol to RTP
    utilizes time information for
      delivery quality monitoring
      feedback to source
      control of source delivery
      synchronization of multiple R/T data streams
        audio - video lip synchronization
  RTP & RTCP do not establish QoS within IP network
    but reduce effect of lack of QoS at destination
  pre-buffering is implemented at higher layers (e.g. application)
- Alternative real-time protocols
  Apple QuickTime
  MS Active Video

Fig. 4.3 RTP Packet Encapsulation
Fig. 4.4 RTP/UDP/IP Protocol Stack

==========================================================
- RTSP
  - Real-Time Streaming Protocol, IETF
  - developed by Netscape, RealNetworks
    for user-interaction with the presentation of MM
  - control protocol for timed delivery of MM streams
    particularly one-way streaming
    on-demand audio and video from servers
  - does not deliver R/T data itself
    can co-operate with RTP using timing information
  - offers functionalities beyond RTCP
    VCR-style controls 'play', 'seek', 'pause', 'stop'
  - used by RealPlayer G2
  - supports 'canned' services:
    on-demand audio (music, radio, news, ...)
    on-demand video (movies, TV, on-line learning, ...)

Fig. 4.5 Real-Player and VCR-style Controls

==========================================================
5. Managed IP Networks: QoS
- Lack of guaranteed QoS in traditional IP network
  routed IP and datagram technique:
    variable path end-to-end per packet
    resource reservation infeasible
- Unsuitability for high-quality R/T data streams
  e.g. audio and video conferencing
- Current efforts to improve QoS by managing networks
  resource reservation
  differentiated service classes: R/T versus computer data
- Managed IP networks
  exercise control over connections and traffic
  avoid overbooking of resources
  allow for (statistical) reservation of resources
- Tiered access to resources
  prioritization of IP packet processing
  relative to current traffic load
  from speedy to deferred to discard
  IPv4 Type-of-Service (TOS) field
  IPv6 Priority (PRI) field
  no reservation is effected
  overbooking of resources is not avoided
  but symptoms are dealt with
  expect varying tariffs in future
- RSVP Resource Reservation Protocol (IETF)
  suitable for connection-less IP services
  signaling protocol to request reservation
  does not actually reserve
  requires lower-layer reservation capabilities
    e.g. ATM network

Fig 5.1 IPv4 and IPv6 Frame Encapsulation (TOS, PRI)

==========================================================
- Gatekeeper
  central controller in an IP network zone
  provides full connection management
    controls admission of end stations
    limits (statistical) overbooking of routers
    can do bandwidth allocation network-wide
  all connection requests via Gatekeeper
    admission or refusal
    direct or indirect signaling for connection setup
    RAS Registration-Admission-Status protocol
  no actual router resource reservation
    but helps avoid net-congestion
- Layer 3 Switching
  introduce switching into network layer
  emulation of virtual circuit/path in connection-less environment
  switching along temporary fixed paths
    set up to destination by traditional routing
    identified by tag per destination
  routers maintain connection tables
  IP packets are assigned tag and switched towards destination
  fast table look-up
  inventory of paths in routers enables better reservation
  commercial products
    Ipsilon Networks' IP Switching, 1995
    Cisco's NetFlow (WAN) and Tag Switching (LAN), 1997

Fig. 5.2 Gatekeeper-controlled Network
Fig. 5.3 Cisco's Tag Switching LAN Structure

==========================================================
- Excess raw bandwidth
  popular thought and trivial solution
  overbooking of resources avoided by generous reserves
  can work if
    bandwidth and router CPU-power stay ahead of expectations
    no dominant user traffic is permitted
      policing of traffic entering the network needed
    access bottle-necks are avoided (e.g. shared local-loops)
  incremental upgrade of resources
    costly regular investment
  inefficient use of network resources
  not applicable to popular wireless networks
    limited frequency real estate
  will it work over the long term?

==========================================================
6. Conferencing Standard H.323
- Evolution
  1996 Microsoft and Intel developed 'Open Conferencing' standard
    later adopted by ITU-T as H.323
  family of ITU-T standards H.32x, designed for interoperability
    H.320 (N-ISDN)
    H.321 (B-ISDN)
    H.322 (LAN c/w QoS)
    H.323 (packet network, IP)
    H.324 (PSTN)
- H.323 Standard
  'Standard for Multi-Media Conferencing over Unreliable Unreserved Packet-based Networks'
    e.g. Internet, Enterprise IP nets, IP LANs
  specifies: components, protocols and procedures
  support for:
    audio, video and data conferencing
    gatekeepers and gateways for PSTN integration
  1996, 1998 (VoIP standardized, gatekeeper, gateway)
  1999...2000 ? (fax, MCU support)
- Architecture of framework
  modular building blocks
    codecs
    real-time protocols
    data collaboration protocols
    signaling & negotiation protocols
    transport & networking protocols
- Minimum capabilities are specified
  audio G.711 support
  connection support including gatekeeper

Fig. 6.1 H.323 Architecture

==========================================================
Fig. 6.1 H.323 Architecture and detailed Protocol Stacks

- Codecs
  audio
    G.711 (PCM), optional G.723.1, G.729 and more
    point-to-point over UDP
  video
    H.261, optional H.263 and more
    point-to-point over UDP
- Data collaboration
  T.120 supporting applications in multi-party mode
  collection of standards T.121...T.128
    message chat
    application sharing
    file transfer
    application collaboration
    whiteboarding
- Real-time data streaming
  RTP and RTCP over UDP
- Call signaling
  H.225 over TCP for connection management with peer(s)
  H.225 RAS over UDP for registration, etc. with gatekeeper
- Control signaling
  H.245 over TCP
    for negotiation of capabilities of end stations
    for creation of media streams

==========================================================
- Popular Commercial Application Packages
MS Netmeeting (V.3.01)
  fully H.323 compliant
  multi-window design
    control and media windows
    directory window
  freeware
White Pine Cu-SeeMe (V4.1)
  originally developed by Cornell University
  fully H.323 compatible
  proprietary mode for multi-party audio, video conferencing
    using 'Reflector' servers
  excellent codec selection
    including Digitalk CELP, MPEG
  multi-window design
    control and media windows
    WWW-based directory window
  purchase approx. $99
Some other products
  VDOPhone, iVisit, ...

Fig. 6.2 Netmeeting (V.3) Windows layout
Fig. 6.3 Cu-SeeMe (V.4.1) Windows layout

==========================================================
7. Directory Servers
- Connection establishment under H.225
  by IP address, e.g. 131.202.129.180
- Dynamic IP address
  ISP assigns dynamic IP address upon sign-in (NBNet, ...)
  varying address, valid temporarily for this session
- Directory Server
  tool to find current dynamic IP address of an Internet user
  typically mapping e-mail address -> IP address
  not part of H.323
  supported by most application packages
  can serve as 'Address-book' to publicize users
  central to 'Buddy' systems (ICQ, AOL, ...)
    with popular personal communication functionalities
    built around the directory server
  ils, uls, dls servers, e.g. ils.four11.com
- Architecture and Operation
  X.500 directory structure, ITU-T standard
  LDAP access protocol, subset of DAP
  supports 'post', 'search' and 'retrieve'
    post upon log-in: name, location, ..., e-mail addr, IP addr (hidden)
    search by: name, ..., e-mail addr (not IP address)
    retrieve by: e-mail addr -> get IP address
  current IP address used by H.225 to connect

Fig. 7.1 Directory Window
Fig. 7.2 Directory Server, Data Flow and X.500 Database

==========================================================
8. Multi-Party Conferencing
- Current H.323 (1998) standard supports
  point-to-point two-way audio and video conferencing
  multi-party two-way data conferencing
  H.332 one-way broadcast extension
- Multipoint Control Unit (MCU)
  separate high-performance server
  replicator of media streams to all logged-in peers
  next-generation H.323 (1999...2000 ?) to support MCU
- Today: proprietary 'MCU' available, Cu-SeeMe (V1.0...4.1)
  Reflector (mirror)
  replicates audio, video and data streams to all logged-in peers
  quadratic growth of compound traffic: N peers produce N(N-1) replicated streams
  suitable for 3...5 peers over Internet (<100 kbps sustained)
  popular as 'Virtual Meeting Places', 'Rooms', etc.
- Traffic growth is problematic
  quadratic traffic growth with number of peers
  single point of congestion at MCU
  better approaches: network-oriented multicasting, MBONE
    reduce rapid growth of media traffic

Fig. 8.1 MCU Integration into Network and Data Flow

==========================================================
9. IP Telephony
- H.323 supports inter-operation with incompatible networks
  described by H.246
  most popular is IP Telephony (Internet Telephony)
- Integration of IP-based audio conferencing with PSTN
  carry long-distance portion of call over Internet
  use only local PSTN service to Central Office
  significant long-distance savings
  'long-distance calls at local call cost'
- Services for seamless integration
  PC-to-phone
  PC-to-fax
  phone-to-phone
- Enabling technologies
  voice-over-IP, VoIP
  gateways
  support by H.323 terminals
- Gateways
  separate high-performance edge servers
  interconnection of incompatible networks
    e.g. IP-based Internet and PSTN (POTS, ISDN)
  conversion of media formats
    IP-encapsulated RTP packets <-> analog, T1, ISDN
  translation of call signaling protocols
    address translation to E.164
    call establishment/tear-down requests
    e.g. in-band H.225 signaling packets to SS7

Fig. 9.1 Network Interconnection by Gateway (IP-PSTN)

==========================================================
- Some commercial service providers
  gateway access requires subscription
  VocalTec
  Net2Phone
  Delta3
  MediaRing
- Some application software packages
  many packages are freeware
  MS Netmeeting
  MediaRing
  Internetphone
  Net2Phone and Net2Fax
  VDOPhone
  PhoneFree
- Typical calling costs
  $0.05 within N.A., $0.07 France, $0.10 Germany
  voice quality still below traditional telephone quality (on average)
- Future offerings
  IP-based corporate phone systems over installed LAN, MAN
    Selsius' first IP-based H.323 PBX (1998)
  Centrex-over-IP
  take-anywhere phone-IP interface with integrated gateway
    MediaRing IP-Phone Bridge, US$149

Fig. 9.2 Net2Phone
Fig. 9.3 MediaRing IP-Phone Bridge

==========================================================
10. Future Unifying IP-Networks
- Objective
  to build one single multi-service network
  to provide a full range of communication services
  efficiently and affordably
- Current trend
  rely on global end-to-end IP-based managed networks
    e.g. Internet, Enterprise networks, LANs, MANs
- The pioneering effort
  Level 3 Communications' first international IP-network
    announced August 1998
    to be completed by 2001
  local and long-distance telephony, computer data, later video
  QoS guarantee
  security guarantee
  inter-operability with IP-networks, ATM, PSTN, PBX, CaTV
  protocol stack founded on emerging IETF standard

Fig. 10.1 Interconnection of multiple incompatible Networks and Inter-Gateway Communication (MGCP)

==========================================================
- Emerging Protocol Suite
  IETF promoted
  part of MMUSIC (Multiparty Multimedia Session Control)
  inter-gateway/gatekeeper control protocols
  optimize inter-network routing, traffic balancing
  SIP-based
    SIP   Session Initiation Protocol
    SAP   Session Announcement Protocol
    SDP   Session Description Protocol
    IPDC  Internet Protocol Device Control (IP SS7), Level 3 Comm
    SGCP  Simple Gateway Control Protocol (Bellcore, Cisco)
    MGCP  Media Gateway Control Protocol = IPDC + SGCP
    MDCP  Media Device Control Protocol
    PGRP  Peer Gatekeeper Routing Protocol
  much functional overlap and much work left to be done
- Co-existence of H.323's control protocols and the IETF suite?
  remains to be seen
- PacketCable initiative
  extend IP-voice conferencing services (telephony) to CaTV
  Cable IP voice service (Rogers, Time Warner, ...)
  DOCSIS cable modem interface
  SIP-based protocol stack

Fig. 10.2 SIP/RTP Protocol Stack
Fig. 10.3 Cable IP Service using DOCSIS

==========================================================
11. Conclusions
- ‘The Internet used to run over the phone - now the phone runs over the Internet'
- last 3 years have seen
  an enormous increase of audio and video traffic over the Internet
  a rapid improvement of media quality
    better codecs
    better R/T protocols
    higher available bandwidth
    lower hardware and software cost
  a move from proprietary to international standards
- ITU's H.323 standard dominating, with RTP/RTCP for VoIP
  enabling vendor-independent audio, video and data conferencing
- recent seamless integration of IP-nets and PSTNs
  IP telephony
  shrinking long-distance costs
- however the Internet is still unmanaged
  prone to net congestion
  often poor quality of audio and video
- need better QoS
  managed IP networks, excess bandwidth, or both?
  a trend towards unifying IP-networks
  will tariffs go up?

Fig. 11.1 H.32x Standard Family
Fig. 11.2 Gateway-Interconnected Networks
Fig. 11.3 IP Soft-Telephone

==========================================================
12. References
Daniel and Emma Minoli, "Delivering Voice over IP", John Wiley & Sons, 1998
Marcus Goncalves, "Voice Over IP Networks", McGraw Hill, 1998
Uyless Black, "Voice Over IP", Prentice Hall, 1999
D. Briere, et al., "Internet Telephony for Dummies", IDG Books, 1997
MS Netmeeting Resource Kit, Microsoft, http://msdn.microsoft.com/netmeeting/reskit/main.htm, March 1999
NetPhone Review, http://www.cnet.com/content/Reports/Reviews/NetPhones/

==========================================================
13. Demonstrations
Pre-record on videotape, then play back in groups
- Live radio broadcast: Sunset radio
- Live video TV: CBC or Realplayer
- Audio-on-demand: RealPlayer
- IP Telephony: MediaRing demo call, one-way
- IP Telephony: Net2Phone, actual NB-wide call
- Conferencing: Netmeeting, video-conference
    whiteboarding (graphics)
    application sharing
- Multi-party conferencing: Cu-SeeMe, conference groups
Last revised: 11 November 1999, BJK