Audio, Video and Data Conferencing over the Internet: Technologies and Standards

Bernd J. Kurz Faculty of Computer Science, University of New Brunswick Fredericton, N.B., Canada E3B 5A3

1. What is it ? - Driving Forces
2. A brief History
3. IP Networks and R/T data: a Challenge
4. R/T Protocol Suites
5. Managed IP Nets: QoS
6. Conferencing Standard H.323
7. Directory Servers
8. Multi-Party Conferencing
9. IP Telephony
10. Future Unifying IP Networks
11. Conclusions
12. References

Fredericton, N.B., November 1999
==========================================================

1. What is it ? - Driving Forces

- Personal communication over the Internet

beyond e-mail and WWW
multi-media environment consisting of:
  - Audio - transmission of digitized sound
      one-way and two-way
      radio broadcast, telephony
  - Video - transmission of digitized moving images
      adding a visual medium
      one-way and two-way
      TV broadcast, video conferencing
  - Data - traditional medium
      text messaging
      file transfer
      whiteboarding - graphics, photo
      application sharing
      application collaboration

- Synergy of above media in multi-media environment

state-of-the-art conferencing services
point-to-point and multi-party capability

- Exploit accessibility of the Internet

enabling technologies:
  Voice-over-IP (VoIP)
  Video-over-IP

Fig. 1.1 A Typical Internet Videoconference
==========================================================

Driving Forces

Desktop client/server applications are dominant

- Distance learning

exploiting MM approach

- Telecommuting

enabling mobility of workforce

- Commercial services

advertisement with MM information
news dissemination
  'live': radio, TV broadcasting
  'canned': audio and video-on-demand
technical on-line support with MM
CTI systems with IV(V)R

- Entertainment

chat lines
virtual meeting places
Buddy sites, ICQ, AOL

- Kids' stuff

education and fun combined
adventure games

Fig. 1.2 Education
Fig. 1.3 YTV
Fig. 1.4 ICQ Panel
==========================================================

2. A brief History

The migration of voice (and later data and video) from Switched-Circuit networks (SCN) to IP-routed networks

- Pre 1980 PSTN

analog networks, voice only
CCT-switched, voice-grade 4 kHz

- Post 1980 N-ISDN

digital networks, voice and data services
CCT-switched and packet-switched, 16 (D) to 64...128 kbps (B)

- Post 1990 B-ISDN

digital broadband networks, voice, video and data
fast-cell switching (ATM), CCT-emulation, 155+ Mbps

- Post 1985 Internet

IP-routed networks, initially data
world-wide access, inexpensive, available bandwidth

Fig. 2.1 A Circuit-Switched Network
==========================================================

- Switching versus Routing

CCT-switching (Layer 2) PSTN

  fixed physical path end-to-end, reserved bandwidth
  reliable ordered delivery
  fast table-lookup
  100% bandwidth reservation
  guaranteed high QoS

Packet-switching (Layer 2) X.25, FR, ATM

  fixed virtual path end-to-end
  virtual CCT (VCI, VPI, LCI)
  reliable ordered delivery
  fast table-lookup
  path inventory is known
  bandwidth reservation possible
  mostly statistical multiplexing used
  statistically guaranteed QoS (<100%)

IP-routing (Layer 3) Internet

  datagram technique
  variable path per packet, no resource reservation
  unreliable, out-of-order, best-effort delivery
  computer-intensive routing
  path inventory not known
  no effective reservation possible
  no QoS guarantee
  robust, adaptive and self-healing

Fig. 2.1 Circuit-Switched Network
Fig. 2.2 IP-Routed Network
==========================================================

- Early audio/video conferencing systems, 1980 ...

CCT-switched PSTN, nx56...nx64
H.260 codec standards
predictable performance over reserved links
UNB's PictureTel
  1...3 N-ISDN lines or FR, ATM
costly

- Emerging audio/video conferencing systems, 1995 ...

PC-based desktop conferencing
IP-based, available
low bit rates (...20...100 kbps)
over non-reserved, unreliable, non-guaranteed QoS links
  Internet
Internetphone 1995
affordable to free

- Evolution of ITU-T H.32x standard family

H.320 (1990)  N-ISDN, 2..23xB + D
H.321 (1996)  B-ISDN, ATM
H.322         LANs with guaranteed QoS
H.323 (1996)  unreliable unreserved packet-based nets
              Internet, Enterprise IP nets, IP LANs
H.324 (1995)  PSTN, V.90, V.32+ modem speed

- Focus is on H.323 standard framework

Fig. 2.3 Desktop Videoconferencing
Fig. 2.4 H.323 Architecture
==========================================================

3. IP Networks and R/T Data: a Challenge

- Routing in IP networks

Layer 3 operation
connection-less datagram technique
independent store-and-forward per packet at every node
variable path per packet
computer-intensive routing
generally no resource reservation
unreliable, out-of-order delivery
unpredictable, best-effort
no QoS guarantee
robust and self-healing

Fig. 3.1 IP-Routed Network

- Internet Protocol suite

Layer 4 operation, end-to-end
TCP/UDP packets carried by IP packets
TCP  reliable, connection-oriented, slow
UDP  unreliable, fast and minimum delay

Fig. 3.2 UDP/TCP-IP Protocol Stack

- Real-Time data sources

data streams with time-constraints
  voice, video
analog signals
  continuous in amplitude and time
discretization
  regular sampling in time (Nyquist, Shannon)
  uniform quantization of amplitude (resolution, jnd)
  ADC/DAC
telephone quality
  4 kHz BW
  8000 samples/s at 7..8 bits/sample
  PCM coding at 56...64 kbps data rate
VGA video
  640x480 pels at 30 fps, 256 colors
  approx. 75 Mbps

- Need for effective data compression
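
The raw data rates quoted for telephone-quality voice and VGA video can be checked with a few lines of arithmetic. This is a minimal sketch; the function names are illustrative and not part of the seminar material, and 256 colors is taken to mean 8 bits per pel.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample):
    """Uncompressed PCM bit rate in bits per second."""
    return sample_rate_hz * bits_per_sample


def raw_video_bit_rate(width, height, bits_per_pel, frames_per_second):
    """Uncompressed video bit rate in bits per second."""
    return width * height * bits_per_pel * frames_per_second


# Telephone quality: 4 kHz bandwidth -> 8000 samples/s, 7 or 8 bits per sample.
voice_low = pcm_bit_rate(8000, 7)
voice_high = pcm_bit_rate(8000, 8)

# VGA video: 640x480 pels, 256 colors (assumed 8 bits per pel), 30 frames/s.
vga = raw_video_bit_rate(640, 480, 8, 30)

print(f"voice: {voice_low // 1000}...{voice_high // 1000} kbps")
print(f"VGA video: {vga / 1e6:.1f} Mbps")
```

The video figure comes out slightly below the rounded 75 Mbps, and more than a thousand times the voice rate, which is why compression is unavoidable on low-bit-rate Internet links.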

==========================================================

- Data compression

Codecs
reduce/remove redundancies in data stream
lossless and lossy
CBR and VBR

- Voice codecs

waveform coding in time-domain
  DPCM, ADPCM, ADM
  µ-law and A-law
  compression rates 2 ... 8
  CBR
vocoder coding in spectral domain
  model, analyse and synthesize voice
  formants, pitch, excitation, ...
  send signature of voice frame
  reconstruct at destination
  high compression rates 10 ... 20
  computer-intensive or HW-based
  mostly VBR with silence compression

- Standard codec types ITU-T

G.711    waveform, PCM (48, 56,) 64 kbps
G.727    waveform, ADPCM, 8k x 5,4,3,2 = 40...16 kbps
G.723.1  vocoder, low-bit rate CELP, 5.3...6.3 kbps
G.729    vocoder, ACELP, 8 kbps

Fig. 3.3 ADC and Pulse Code Modulation (PCM)
Fig. 3.4 Block Diagram of CELP Vocoder
==========================================================
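
The µ-law companding mentioned under the voice codecs can be illustrated with the continuous µ-law curve (µ = 255). This is a hedged sketch only: G.711 itself uses a segmented, bit-exact approximation of this curve, and the helper names and the simple uniform quantizer below are assumptions made for the illustration.

```python
import math

MU = 255  # companding parameter of the mu-law curve


def mu_law_compress(x, mu=MU):
    """Compress a sample in [-1, 1] logarithmically (continuous mu-law curve)."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y, mu=MU):
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


def quantize(value, bits=8):
    """Uniform quantizer on [-1, 1] with 2**(bits-1) - 1 steps per polarity."""
    levels = 2 ** (bits - 1) - 1
    return round(value * levels) / levels


def uniform_error(x, bits=8):
    """Absolute error of plain uniform quantization."""
    return abs(quantize(x, bits) - x)


def companded_error(x, bits=8):
    """Absolute error when the sample is companded before quantization."""
    return abs(mu_law_expand(quantize(mu_law_compress(x), bits)) - x)


for sample in (0.01, 0.1, 0.9):
    print(sample, uniform_error(sample), companded_error(sample))
```

Companding spends the available quantization steps on quiet samples, where the ear is most sensitive, at the cost of coarser steps for loud ones.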

- Video codecs

transform coding in spectral domain
JPEG-based
  VBR
  DCTs on 8x8 pixel blocks
  quantization of spectral coefficients
  truncation of diagonal scan (energy clustering)
MPEG-based
  CBR
  enhanced by motion-prediction
  frame-to-frame redundancy reduction
  very high compression ratio to V.32+ data rates
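
The JPEG-style steps (block transform, quantization of coefficients, truncation of the diagonal scan) can be sketched in plain Python. JPEG applies a discrete cosine transform (DCT) to each 8x8 block. The single quantizer step size, the number of coefficients kept and the function names are assumptions for illustration; real coders use per-coefficient quantization tables and entropy coding.

```python
import math

N = 8  # block size used by JPEG-style coders


def dct_2d(block):
    """Naive orthonormal 2-D DCT-II of an N x N block (list of rows)."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    def basis(i, k):
        return math.cos((2 * i + 1) * k * math.pi / (2 * N))

    return [[scale(u) * scale(v) * sum(block[x][y] * basis(x, u) * basis(y, v)
                                       for x in range(N) for y in range(N))
             for v in range(N)]
            for u in range(N)]


def zigzag_order(n=N):
    """(row, col) pairs of an n x n block in diagonal (zigzag) scan order."""
    def key(rc):
        row, col = rc
        diagonal = row + col
        # Odd diagonals run top-right to bottom-left, even ones the other way.
        return (diagonal, row if diagonal % 2 else col)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)


def encode_block(block, step=16, keep=10):
    """Quantize the DCT coefficients and keep only the first `keep` of the scan."""
    coeffs = dct_2d(block)
    scan = [round(coeffs[r][c] / step) for r, c in zigzag_order()]
    return scan[:keep]


# A smooth test block: brightness rises gently from left to right.
block = [[100 + 4 * col for col in range(N)] for _ in range(N)]
print(encode_block(block))
```

For smooth image content the energy clusters in the first few coefficients of the scan, so dropping the tail loses little visible detail.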

- Standard codec types ITU-T

H.261  N-ISDN and IP, LAN, 64 kbps
H.263  V.34+ modems, 28.8 kbps
H.262  MPEG-2 video, T1...OC3 (ATM), 1.44+ Mbps

- Consequences for real-time data over routed IP nets

source sends regular data stream
destination needs timely in-order delivery for reconstruction
but no delivery guarantee, no QoS guarantee
we have a problem

- Quality of Service

affecting quality of received voice and video
some criteria
  sustained data rate
  propagation delay
  delay-jitter
  packet loss and error ratio
  out-of-order delivery
  echo

- Need special end-to-end-protocols to overcome lack of QoS

R/T protocols
partially restore pseudo-QoS at destination
penalties arise
==========================================================

4. R/T Protocol Suites - a Partial Solution

- Packet arrivals at destination of an IP network

subject to
  delay
  delay-jitter
  out-of-order packets
  loss of packets
reconstruct voice from packets in streaming mode

- Non-buffered receiver

reconstruct voice from ordered packet stream

Fig. 4.1 Arrival and Processing of Voice Packets, Non-buffered

wait for late packets
  gaps of silence and temporal distortion
  cumulative delay in reconstruction
  complex buffer management
missing packets
  max permissible wait time W
  discard late arrivals >W
silence gaps are less disturbing than temporal distortions
==========================================================

- Pre-buffering at receiver

counteract effects of irregular packet arrivals

Fig. 4.2 Pre-buffering and Operation

pre-buffer L seconds of R/T data
insert late arrivals
release data regularly for voice reconstruction
large L -> less probability of empty buffer
inverted queuing system
optimal buffer length L depends on
  delay-jitter statistics
  application
    one-way delivery (radio): large L permissible, ...60 sec
    two-way telephony: <200 msec
trade-off
  initial delay <-> glitches
  responsiveness <-> voice quality
auto-adaptation to traffic statistics, net congestion

- Requirement for time information with each packet
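
The effect of the pre-buffer length L can be sketched with a toy playout schedule. The sketch assumes that every packet carries a sequence number and a fixed amount of audio, and that playback starts L seconds after the first packet arrives; the function name and the arrival times are hypothetical.

```python
def playout_schedule(arrivals, interval, prebuffer):
    """
    arrivals:  {sequence_number: arrival_time_s} for the packets that arrived
    interval:  seconds of audio carried per packet (e.g. 0.020)
    prebuffer: L, the delay in seconds inserted before playback starts
    Returns (played, late): sequence numbers that met or missed their deadline.
    """
    first_seq = min(arrivals)
    start = arrivals[first_seq] + prebuffer      # playback time of first packet
    played, late = [], []
    for seq in sorted(arrivals):
        deadline = start + (seq - first_seq) * interval
        (played if arrivals[seq] <= deadline else late).append(seq)
    return played, late


# Hypothetical arrival times (seconds) with jitter; packet 2 is badly delayed.
arrivals = {0: 0.000, 1: 0.025, 2: 0.090, 3: 0.065, 4: 0.085}

for L in (0.0, 0.03, 0.06):
    played, late = playout_schedule(arrivals, interval=0.02, prebuffer=L)
    print(f"L = {L * 1000:.0f} ms: played {played}, discarded as late {late}")
```

A larger L lets more late packets meet their deadline, at the price of a longer initial delay, which is the trade-off between responsiveness and voice quality.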

==========================================================

- Real-time protocol suite standard

Layer 4 operation end-to-end
typically above UDP
RTP (ITU-T, IETF) Real-time Transport Protocol
  defines R/T data encapsulation
  sequence numbering and time-stamping
  does not guarantee timely delivery
RTCP (ITU-T, IETF) companion protocol to RTP
  utilizes time information for
    delivery quality monitoring
    feedback to source
    control of source delivery
  synchronization of multiple R/T data streams
    audio - video lip synchronization
RTP & RTCP
  do not establish QoS within IP network
  but reduce effect of lack of QoS at destination
  pre-buffering is implemented at higher layers (e.g. application)
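
The sequence numbering and time-stamping added by RTP can be made concrete by packing the 12-byte fixed RTP header. This is a minimal sketch without CSRC entries, padding or header extensions; the SSRC value and the helper names are arbitrary choices for the example.

```python
import struct

RTP_VERSION = 2
HEADER_FORMAT = "!BBHII"   # network byte order, 12 bytes in total


def pack_rtp_header(payload_type, seq, timestamp, ssrc, marker=False):
    """Build the fixed RTP header (no CSRC list, padding or extension)."""
    first = RTP_VERSION << 6                     # V=2, P=0, X=0, CC=0
    second = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack(HEADER_FORMAT, first, second,
                       seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)


def unpack_rtp_header(data):
    """Decode the fixed RTP header from the first 12 bytes of a packet."""
    first, second, seq, timestamp, ssrc = struct.unpack(HEADER_FORMAT, data[:12])
    return {"version": first >> 6, "marker": bool(second >> 7),
            "payload_type": second & 0x7F, "seq": seq,
            "timestamp": timestamp, "ssrc": ssrc}


# Two consecutive 20 ms G.711 packets: payload type 0 (PCMU), 8000 Hz clock,
# so the timestamp advances by 160 per packet. Payload is 160 bytes of filler.
packets = [pack_rtp_header(0, seq, seq * 160, 0x1234ABCD) + bytes(160)
           for seq in (0, 1)]
for packet in packets:
    print(unpack_rtp_header(packet))
```

The receiver uses the sequence number to restore order and detect loss, and the timestamp to schedule playout; the header itself guarantees nothing about delivery.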

- Alternative real-time protocols

Apple QuickTime
MS Active Video

Fig. 4.3 RTP Packet Encapsulation
Fig. 4.4 RTP/UDP/IP Protocol Stack
==========================================================

- RTSP

  - Real-Time Streaming Protocol, IETF
  - developed by Netscape, RealNetworks
      for user-interaction with the presentation of MM
  - control protocol for timed delivery of MM streams
      particularly one-way streaming
      on-demand audio and video from servers
  - does not deliver R/T data itself
      can co-operate with RTP using timing information
  - offers functionalities beyond RTCP
      VCR-style controls 'play', 'seek', 'pause', 'stop'
  - used by RealPlayer G2
  - supports 'canned' services:
      on-demand audio (music, radio, news, ...)
      on-demand video (movies, TV, on-line learning, ...)

Fig. 4.5 Real-Player and VCR-style Controls
==========================================================

5. Managed IP Networks: QoS

- Lack of guaranteed QoS in traditional IP network

routed IP and datagram technique:
  variable path end-to-end per packet
  resource reservation infeasible

- Unsuitability for high-quality R/T data streams

e.g. audio and video conferencing

- Current efforts to improve QoS by managing networks

resource reservation
differentiated service classes: R/T versus computer data

- Managed IP networks

exercise control over connections and traffic
avoid overbooking of resources
allow for (statistical) reservation of resources

- Tiered access to resources

prioritization of IP packet processing
  relative to current traffic load
  from speedy to deferred to discard
IPv4 Type-of-Service (TOS) field
IPv6 Priority (PRI) field
no reservation is effected
overbooking of resources is not avoided
  but symptoms are dealt with
expect varying tariffs in future
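
Tiered packet handling based on the IPv4 TOS byte can be sketched as follows. The bit layout (3-bit precedence plus low-delay, high-throughput and high-reliability flags) is the original IPv4 definition; the queue class, its capacity and its drop policy are hypothetical and far simpler than a real router's forwarding path.

```python
import heapq


def parse_tos(tos):
    """Split an IPv4 Type-of-Service byte into precedence and service flags."""
    return {"precedence": (tos >> 5) & 0x7,
            "low_delay": bool(tos & 0x10),
            "high_throughput": bool(tos & 0x08),
            "high_reliability": bool(tos & 0x04)}


class TieredQueue:
    """Toy output queue: serve high precedence first, discard low when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._heap = []
        self._counter = 0          # arrival order, breaks ties within a tier

    def enqueue(self, tos, packet):
        """Queue a packet; return the packet discarded on overflow, else None."""
        precedence = parse_tos(tos)["precedence"]
        heapq.heappush(self._heap, (-precedence, self._counter, packet))
        self._counter += 1
        if len(self._heap) > self.capacity:
            victim = max(self._heap)       # lowest precedence, newest arrival
            self._heap.remove(victim)
            heapq.heapify(self._heap)
            return victim[2]
        return None

    def dequeue(self):
        """Return the next packet to forward, or None if the queue is empty."""
        return heapq.heappop(self._heap)[2] if self._heap else None


queue = TieredQueue(capacity=2)
queue.enqueue(0x00, "data-1")               # routine precedence
queue.enqueue(0xB0, "voice-1")              # precedence 5, low delay
print("discarded:", queue.enqueue(0xB0, "voice-2"))
print("forwarded:", queue.dequeue(), queue.dequeue())
```

Nothing is reserved here: when the queue overflows, some packet is still lost, which is the sense in which prioritization deals with the symptoms rather than avoiding overbooking.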

- RSVP Resource Reservation Protocol IETF

suitable for connection-less IP services
signaling protocol to request reservation
does not actually reserve
requires lower-layer reservation capabilities
  e.g. ATM network

Fig. 5.1 IPv4 and IPv6 Frame Encapsulation (TOS, PRI)
==========================================================

- Gatekeeper

central controller in an IP network zone
provides full connection management
controls admission of end stations
limits (statistical) overbooking of routers
can do bandwidth allocation network-wide
all connection requests via Gatekeeper
  admission or refusal
  direct or indirect signaling for connection setup
RAS Registration-Admission-Status protocol
no actual router resource reservation
  but helps avoid net-congestion
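
Gatekeeper admission control can be sketched as bookkeeping against a zone-wide bandwidth budget. The class, its method names and the budget figures are hypothetical; the comments only loosely mirror the RAS admission and disengage exchanges, and no router resources are reserved.

```python
class Gatekeeper:
    """Toy admission control for one zone with a fixed bandwidth budget."""

    def __init__(self, zone_budget_kbps):
        self.budget = zone_budget_kbps
        self.calls = {}                      # call id -> admitted kbps

    def available(self):
        """Bandwidth not yet allocated to admitted calls."""
        return self.budget - sum(self.calls.values())

    def admission_request(self, call_id, kbps):
        """Admit the call if it fits (cf. ARQ -> ACF), else refuse (ARJ)."""
        if kbps <= self.available():
            self.calls[call_id] = kbps
            return True
        return False

    def disengage(self, call_id):
        """Release the bandwidth of a finished call (cf. DRQ)."""
        self.calls.pop(call_id, None)


gk = Gatekeeper(zone_budget_kbps=256)
print(gk.admission_request("a", 128))   # fits
print(gk.admission_request("b", 128))   # fits, budget now exhausted
print(gk.admission_request("c", 64))    # refused
gk.disengage("a")
print(gk.admission_request("c", 64))    # fits after "a" has ended
```

Because every connection request passes through the gatekeeper, refusing calls at the edge is enough to limit overbooking, even though the routers themselves remain unaware of the allocation.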

- Layer 3 Switching

introduce switching into network layer
emulation of virtual circuit/path in connection-less environment
switching along temporary fixed paths
  set up to destination by traditional routing
  identified by tag per destination
  routers maintain connection tables
IP packets are assigned tag and switched towards destination
  fast table look-up
inventory of paths in routers enables better reservation
Commercial products
  Ipsilon Networks' IP Switching, 1995
  Cisco's NetFlow (WAN) and Tag Switching (LAN), 1997

Fig. 5.2 Gatekeeper-controlled Network
Fig. 5.3 Cisco's Tag Switching LAN Structure
==========================================================

- Excess raw bandwidth

popular thought and trivial solution
overbooking of resources avoided by generous reserves
can work if
  bandwidth and router CPU-power stay ahead of expectations
  no dominant user traffic is permitted
    policing of traffic entering the network needed
  access bottle-necks are avoided (e.g. shared local-loops)
incremental upgrade of resources
  costly regular investment
inefficient use of network resources
not applicable to popular wireless networks
  limited frequency real estate
will it work over the long term ?
==========================================================

6. Conferencing Standard H.323

- Evolution

1996 Microsoft and Intel developed 'Open Conferencing' standard
later adopted by ITU-T as H.323
Family of ITU-T standards H.32x, designed for interoperability
  H.320 (N-ISDN)
  H.321 (B-ISDN)
  H.322 (LAN c/w QoS)
  H.323 (packet network, IP)
  H.324 (PSTN)

- H.323 Standard

'Standard for Multi-Media Conferencing over Unreliable Unreserved Packet-based Networks'
  e.g. Internet, Enterprise IP nets, IP LANs
specifies: components, protocols and procedures
support for:
  audio, video and data conferencing
  gatekeepers and gateways for PSTN integration
1996, 1998 (VoIP standardized, gatekeeper, gateway)
1999...2000 ? (fax, MCU support)

- Architecture of framework

Modular building blocks
  codecs
  real-time protocols
  data collaboration protocols
  signaling & negotiation protocols
  transport & networking protocols

- Minimum capabilities are specified

audio G.711 support
connection support including gatekeeper

Fig. 6.1 H.323 Architecture
==========================================================
Fig. 6.1 H.323 Architecture and detailed Protocol Stacks

- Codecs

audio
  G.711 (PCM), optional G.723.1, G.729 and more
  point-to-point over UDP
video
  H.261, optional H.263 and more
  point-to-point over UDP

- Data collaboration

T.120
  supporting applications in multi-party mode
  collection of standards T.121... T.128
    Message chat
    Application sharing
    File transfer
    Application collaboration
    Whiteboarding

- Real-time data streaming

RTP and RTCP over UDP

- Call signaling

H.225 over TCP
  for connection management with peer(s)
H.225 RAS over UDP
  for registration, etc. with gatekeeper

- Control signaling

H.245 over TCP
  for negotiation of capabilities of end stations
  for creation of media streams
==========================================================
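
The idea behind the H.245 capability negotiation can be sketched as picking the first codec that both end stations support, with the mandatory G.711 as fallback. This is a strong simplification for illustration: the function name and the preference lists are hypothetical, and the real exchange of capability sets and logical channels is considerably richer.

```python
def select_audio_codec(local_preferences, remote_capabilities, mandatory="G.711"):
    """
    Return the first locally preferred codec that the remote side can receive.
    Falls back to the mandatory codec that every terminal has to support.
    """
    for codec in local_preferences:
        if codec in remote_capabilities:
            return codec
    return mandatory


# A low-bit-rate terminal prefers vocoders; the peer offers G.729 and G.711.
print(select_audio_codec(["G.723.1", "G.729", "G.711"], {"G.729", "G.711"}))

# No optional codec in common: both sides can still talk using G.711.
print(select_audio_codec(["G.723.1"], {"G.729"}))
```

The mandatory minimum capability is what keeps terminals from different vendors interoperable even when their optional codec sets do not overlap.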

- Popular Commercial Application Packages

MS Netmeeting (V.3.01)

fully H.323 compliant
multi-window design
  Control and media windows
  Directory window
freeware

White Pine Cu-SeeMe (V4.1)

originally developed by Cornell University
fully H.323 compatible
proprietary mode for multi-party audio, video
  conferencing using 'Reflector' servers
excellent codec selection including Digitalk CELP, MPEG
multi-window design
  Control and media windows
  WWW-based Directory window
purchase approx. $99

Some other products

VDOPhone, iVisit, ....

Fig. 6.2 Netmeeting (V.3) Windows layout
Fig. 6.3 Cu-SeeMe (V.4.1) Windows layout
==========================================================

7. Directory Servers

- Connection establishment under H.225

by IP address 131.202.129.180

- Dynamic IP address

ISP assigns dynamic IP address upon sign-in (NBNet, ...)
varying address, valid temporarily for this session

- Directory Server

tool to find current dynamic IP address of an Internet user
typically mapping e-mail address -> IP address
not part of H.323
supported by most application packages
can serve as 'Address-book' to publicize users
central to 'Buddy' systems (ICQ, AOL, ...)
  with popular personal communication functionalities
  built around the directory server
ils, uls, dls servers, e.g. ils.four11.com

- Architecture and Operation

X.500 directory structure, ITU-T standard
LDAP access protocol, subset of DAP
supports 'post', 'search' and 'retrieve'
  post upon log-in: name, location, ..., e-mail addr, IP addr (hidden)
  search by: name, ..., e-mail addr (not IP address)
  retrieve by: e-mail addr -> get IP address
current IP address used by H.225 to connect

Fig. 7.1 Directory Window
Fig. 7.2 Directory Server, Data Flow and X.500 Database
==========================================================
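
The 'post', 'search' and 'retrieve' operations of a directory server can be sketched with an in-memory table. This toy class is hypothetical and is not an LDAP or X.500 implementation; the names, e-mail addresses and IP addresses are made-up examples.

```python
class DirectoryServer:
    """Toy directory: e-mail address -> current (dynamic) IP address."""

    def __init__(self):
        self._entries = {}

    def post(self, email, name, location, ip):
        """Called at log-in: publish the user and record the current IP."""
        self._entries[email] = {"name": name, "location": location, "ip": ip}

    def search(self, **criteria):
        """Return the public fields (never the IP) of all matching entries."""
        hits = []
        for email, entry in self._entries.items():
            public = {"email": email, "name": entry["name"],
                      "location": entry["location"]}
            if all(public.get(key) == value for key, value in criteria.items()):
                hits.append(public)
        return hits

    def retrieve(self, email):
        """Return the current IP address for an e-mail address, or None."""
        entry = self._entries.get(email)
        return entry["ip"] if entry else None


directory = DirectoryServer()
directory.post("alice@example.org", "Alice", "Fredericton", "192.0.2.10")
directory.post("bob@example.org", "Bob", "Moncton", "192.0.2.77")

print(directory.search(location="Fredericton"))
print(directory.retrieve("alice@example.org"))   # address handed to call setup
```

A user who signs in again with a new dynamic address simply posts again; callers keep using the stable e-mail address as the key.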

8. Multi-Party Conferencing

- Current H.323 (1998) standard supports

point-to-point two-way audio and video conferencing
multi-party two-way data conferencing
H.332 one-way broadcast extension

- Multi-Party Conferencing Unit (MCU)

separate high-performance server
replicator of media streams to all logged-in peers
next-generation H.323 (1999...2000 ?) to support MCU

- Today: proprietary 'MCU' available, Cu-SeeMe (V1.0...4.1)

Reflector (mirror)
replicates audio, video and data streams to all logged-in peers
quadratic growth of compound traffic
suitable for 3...5 peers over Internet (<100 kbps sustained)
popular as 'Virtual Meeting Places', 'Rooms', etc.

- Traffic growth is problematic

quadratic traffic growth with number of peers
single point of congestion at MCU
better approaches:
  network-oriented multicasting, MBONE
  reduce rapid growth of media traffic

Fig. 8.1 MCU Integration into Network and Data Flow
==========================================================
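
The traffic load on a reflector or MCU can be estimated with simple counting: each of N peers sends one stream in, and each stream is replicated to the N-1 other peers, giving N(N-1) outgoing copies, which grows with the square of N. The per-stream rate of 20 kbps and the function name are assumptions for the illustration.

```python
def reflector_load(peers, stream_kbps):
    """Traffic at a reflector that copies every peer's stream to all others."""
    return {"inbound_kbps": peers * stream_kbps,
            "outbound_kbps": peers * (peers - 1) * stream_kbps,
            "per_peer_down_kbps": (peers - 1) * stream_kbps}


# Assumed 20 kbps per media stream; the peer counts are examples only.
for n in (2, 3, 5, 10):
    load = reflector_load(n, stream_kbps=20)
    print(f"{n:2d} peers: reflector sends {load['outbound_kbps']:5d} kbps, "
          f"each peer receives {load['per_peer_down_kbps']:4d} kbps")
```

With these assumed rates a peer on a link sustaining less than 100 kbps is saturated at about five participants, and the reflector itself becomes the single point of congestion well before that on a modest uplink.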

9. IP Telephony

- H.323 supports inter-operation with incompatible networks

described by H.246
most popular is IP Telephony (Internet Telephony)

- Integration of IP-based audio conferencing with PSTN

carry long-distance portion of call over Internet
use only local PSTN service to Central Office
significant l.d. savings
  'long-distance calls at local call cost'

- Services for seamless integration

PC-to-phone
PC-to-fax
phone-to-phone

- Enabling technologies

voice-over-IP, VoIP
Gateways
support by H.323 terminals

- Gateways

separate high-performance edge servers
interconnection of incompatible networks
  e.g. IP-based Internet and PSTN (POTS, ISDN)
conversion of media formats
  IP encapsulated RTP packets <-> analog, T1, ISDN
translation of call signaling protocols
  address translation to E.164
  call establishment/tear-down requests
  e.g. in-band H.225 signaling packets to SS7

Fig. 9.1 Network Interconnection by Gateway (IP-PSTN)
==========================================================

- Some commercial service providers

Gateway access requires subscription
  Vocaltech
  Net2Phone
  Delta3
  MediaRing

- Some application software packages

many packages are freeware
  MS Netmeeting
  MediaRing
  Internetphone
  Net2Phone and Net2Fax
  VDOPhone
  PhoneFree

- Typical calling costs

$0.05 within N.A., $0.07 France, $0.10 Germany
voice quality still below traditional telephone quality (on average)

- Future offerings

IP-based corporate phone systems over installed LAN, MAN
  Selsius' first IP-based H.323 PBX (1998)
Centrex-over-IP
Take-anywhere phone-IP interface with integrated gateway
  MediaRing IP-Phone Bridge, US$149

Fig. 9.2 Net2Phone
Fig. 9.3 MediaRing IP-Phone Bridge
==========================================================

10. Future Unifying IP-Networks

- Objective

To build one single multi-service network to provide a full range of communication services efficiently and affordably

- Current trend

rely on global end-to-end IP-based managed networks
  e.g. Internet, Enterprise networks, LANs, MANs

- The pioneering effort

Level 3 Communications' first international IP-network
announced August 1998
to be completed by 2001
local and long-distance telephony, computer data, later video
QoS guarantee
security guarantee
inter-operability with IP-networks, ATM, PSTN, PBX, CaTV
protocol stack founded on emerging IETF standard

Fig. 10.1 Interconnection of multiple incompatible Networks and Inter-Gateway Communication (MGCP)
==========================================================

- Emerging Protocol Suite

IETF promoted
part of MMUSIC (Multiparty Multimedia Session Control)
inter-gateway/gatekeeper control protocols
optimize inter-network routing, traffic balancing
SIP-based
  SIP   Session Initiation Protocol
  SAP   Session Announcement Protocol
  SDP   Session Description Protocol
  IPDC  Internet Protocol Device Control (IP <-> SS7), Level 3 Comm
  SGCP  Simple Gateway Control Protocol (Bellcore, Cisco)
  MGCP  Media Gateway Control Protocol = IPDC + SGCP
  MDCP  Media Device Control Protocol
  PGRP  Peer Gatekeeper Routing Protocol
much functional overlap and much work left to be done

- Co-existence of H.323's control protocols and the IETF suite ?

remains to be seen

- Packet cable initiative

extend IP-voice conferencing services (telephony) to CaTV
CableIP voice service (Rogers, Time Warner, ...)
DOCSIS cable modem interface
SIP-based protocol stack

Fig. 10.2 SIP/RTP Protocol Stack
Fig. 10.3 Cable IP Service using DOCSIS
==========================================================

11. Conclusions

- ‘The Internet used to run over the phone - now the phone runs over the Internet'

- last 3 years have seen

an enormous increase of audio and video traffic over the Internet
a rapid improvement of media quality
  better codecs
  better R/T protocols
  higher available bandwidth
lower hardware and software cost
a move from proprietary to international standards

- ITU's H.323 standard dominating, with RTP/RTCP for VoIP

enabling vendor-independent audio, video and data conferencing

- recent seamless integration of IP-nets and PSTNs

IP telephony shrinking long-distance costs

- however the Internet is still unmanaged

prone to net congestion
often poor quality of audio and video

- need better QoS

managed IP networks, excess bandwidth, or both ?
a trend towards unifying IP-networks
will tariffs go up ?

Fig. 11.1 H.32x Standard Family
Fig. 11.2 Gateway-Interconnected Networks
Fig. 11.3 IP Soft-Telephone
==========================================================

12. References

Daniel and Emma Minoli, "Delivering Voice over IP", John Wiley & Sons, 1998
Marcus Goncalves, "Voice Over IP Networks", McGraw Hill, 1998
Uyless Black, "Voice Over IP", Prentice Hall, 1999
D. Briere, et al, "Internet Telephony for Dummies", IDG Books, 1997
MS Netmeeting Resource Kit, Microsoft, http://msdn.microsoft.com/netmeeting/reskit/main.htm, March 1999
NetPhone Review, http://www.cnet.com/content/Reports/Reviews/NetPhones/
==========================================================

13. Demonstrations

Pre-record on videotape, then play back in groups
- Live radio broadcast: Sunset radio
- Live video TV: CBC or RealPlayer
- Audio-on-demand: RealPlayer
- IP Telephony: MediaRing demo call, one-way
- IP Telephony: Net2Phone, actual NB-wide call
- Conferencing: Netmeeting, video-conference
    whiteboarding (graphics)
    application sharing
- Multi-Party conferencing: Cu-SeeMe, conference groups


Last revised: 11 November 1999, BJK