Information Theory in the real world

Mathematical models of the real world

Physics

Mathematics

Information Theory is a branch of mathematics

Funadamental Concepts

Entropy

displaymath501

Relative Entropy

displaymath502

Mutual Information

displaymath503

Entropy Rate

displaymath504

Fundamental Properties

Fundamental Results: AEP

displaymath506

Typical set

displaymath507

Lossless Source Coding

Universal Source Coding

Fundamental results: Channel Capacity

Capacity of a channel tex2html_wrap_inline557 is defined

displaymath509

Channel Coding theorem There exist codes with probability of error arbitrarily small for all rates less than the capacity of the channel, and all rates above capacity have a non-vanishing probability of error.

Examples of Channel Capacity

Binary Symmetric Channel

C= 1-H(p)

Gaussian Channel

displaymath510

Fundamental Results: Rate Distortion Theory

Distortion measure tex2html_wrap_inline561 - eg. Hamming distortion or mean square distortion

Represent random variable X to within average distortion D.

R(D) = minimum rate of code that achieves distortion tex2html_wrap_inline569 .

Rate Distortion Theorem:

displaymath511

Quantization and Vector Quantization

Quantization: The process of finding a discrete set of values to represent a real number.

Uniform quantization: Divide range into bins.

Non-uniform quanitzation: Lloyd algorithm

Vector quantization: Treat blocks of random variables as vectors in high dimensional space and divide the space into regions.

Sigma-delta modulation:

Coding Theory

 

Channel capacity theorem promises the existence of codes with rates less than capacity with an arbitrarily low probability of error.

Coding theory: Find efficient coding and decoding techniques.

Block Codes

Key paremeters of block code: n: Block length, k: no. of information bits, d: minimum distance

Convolutional Codes

Block codes with memory

Parameters of a convolutional code: Free distance, constraint length

Viterbi Algorithm: Dynamic programming to find most likely path through trellis.

Real Information Sources

Model data by a stationary ergodic stochastic process.

Text and computer data: discrete valued process

Speech: one-dimensional continuous valued process

Images: two-dimensional continous valued process

Assumptions: Source is

Computer files

Computer files: Text, numbers, computer source code, binary files, etc.

English text has an entropy rate of about 1.5 bits/letter.

Compression algorithms:

Speech

Telephone quality speech is sampled at 8 kHz, with 8 bits/sample - 64,000 bits/second.

Compression schemes:

Music

Human ear cannot hear above 20 KHz.

CD quality music is samples at 44.1 KHz, 16 bits/sample, stereo, for a bit rate of 176.4 KBytes/sec or 1.411 Mbits/sec.

CD's data capacity is about 800 MB.

The new Sony Minidisc and the Phillips DCC both incorporate audio compression algorithms.

MIDI: Musical Instrument Digital Interface

Images: FAX

8.5 by 11 inch paper tex2html_wrap_inline577 1728 by 1186 pixels = 2.05 MBits =430 seconds at 4800 bits/sec.

Run length encoding

READ (Relative Element Address Designate) code for vertical corellation.

Average transmission time after compression = 57 seconds.

Group 3/Group 4 coding

New standard: JBIG based on arithmetic coding using contexts. Uses progressive compression.

Images: Still Images

Typical images on a computer screen: 640 by 480 by 24 bit color is 921 KBytes. Compression could save a lot of space.

JPEG Compression

Fractal Compression

Fractal: Fixed point of recursive mapping.

Video Compression: MPEG

Spatial and Time Correlation.

Uncompressed Video Stream: about 100 Mbits/sec.

MPEG-1: 1.5 Mbits/sec.

MPEG-2: 4-9 Mbits/sec

Communication Channels

Discrete channels:

Continuous input discrete time channels

Waveform channels

The promise of infinite bandwidth

Optical networks promise large bandwidth ( > 10 Gb/s) with extremely low error probability tex2html_wrap_inline587 .

Issues

Storage channels

Storage channels transmit information over time.

Magnetic disks

Information stored in transitions

(d,k) constraints: The number of 0's between any two 1's is at least d and at most k.

Eg. (1,3) constraint - 0100100010010001

Capacity of the constraint tex2html_wrap_inline595 where N(t) is the number of sequences satisfying the constraint of length t.

State Splitting or Bounded Delay codes

100's of Mbits/sq. inch

Compact disks

Data is stored by mean of pits on the surface - under the layer of protective plastic.

Each disk stores about 800 MB, and can output information at a data rate of 175 kB/sec.

Uses a (2,10) run length limited code, and interleaved Reed-Solomon codes to correct bursts of up to 4000 errors.

Oversampling D/A converters

CD-ROM, CD-I, Photo-CD, CD-Video all use the same technology.

2x CD-ROM drives

Low bandwidth channels

Rates less than 1 MBit/sec

High Speed Channels

Putting it all together: The telephone network

The Cellular Telephone Network: TDMA vs CDMA

The Information Age

Parkinson's Law for Communications: Demand will expand to exceed supply for

Information theory remains an active area of research and, in the information age, provides insight and direction for the design and analysis of communication and computer systems.


Joy Thomas
Sat Aug 15 06:26:30 EDT 1998