Timing Redundancy in Telecommunication Systems
A white paper
By Slobodan Milijevic, Senior Applications Engineer,
Zarlink Semiconductor
In a typical
telecommunication product, all cards are synchronized to the same
clock. The failure of this clock disrupts the data traffic on all
cards. To avoid this problem and increase network reliability,
telecom products are designed with at least two clocks — active and
redundant. If the active clock fails, the system avoids failure by
switching to the redundant clock. This article outlines the
importance of clock redundancy, presents two methods (parallel and
serial) used to implement timing redundancy, and discusses the
advantages and disadvantages of both approaches.
Highly reliable operation

Figure 1: Functional diagram of a typical telecommunication product
Telecommunications systems must provide highly reliable operation
under all network conditions. To do this, the most critical
components within the system are made redundant. A typical
telecommunications product is a 19-inch standard telecom rack
populated by up to 18 one-inch vertically inserted cards.
As shown in Figure 1, a typical system consists of two control
cards and multiple line cards that communicate over a common
backplane. The two control cards are identical and run in parallel.
Only one control card is active at any given time, and the other
takes over if the first fails. Switching from one control card to
the other should not cause any interruption or failure in the
system.
The control card includes a system control processor, switching
fabric and system timing. It is important to note that in more
complex, larger systems, timing is implemented on separate cards to
further increase the flexibility of the product. This article covers
only the timing aspects of telecommunication systems.
Timing card architecture

Figure 2: Block diagram of a typical timing card
Having two timing cards protects against an internal failure where
one of the cards fails. To protect from external clock reference
failures, the timing cards are designed to be able to synchronize to
more than one reference.
A timing card accepts references from multiple sources, selects one,
filters phase noise from it with a digital phase-locked loop (DPLL),
and distributes it to the line cards via the backplane. The DPLL is
the most important part of the timing card. Depending on the
targeted application of the product and region of deployment, the
DPLL needs to be compliant with the appropriate timing
specifications, such as Telcordia GR-1244-CORE, Telcordia GR-253-CORE
or ITU-T G.813. The DPLL needs to provide an array of crucial
features, including:
- Hitless reference switching: if the reference the DPLL is locked
  to fails, the DPLL locks to another available reference without
  phase disturbances at its output.
- Holdover mode: the DPLL constantly calculates the average
  frequency of the locked reference. If the reference fails and none
  of the other references is available, the DPLL goes into holdover
  mode, where it generates an output clock based on the calculated
  average value. Holdover stability depends on the resolution of the
  DPLL averaging algorithm and on the frequency stability of the
  oscillator used as the DPLL master clock.
- Reference monitoring: the DPLL needs to constantly monitor the
  quality of its input references. If the reference the DPLL is
  locked to deteriorates (disappears or drifts in frequency), the
  DPLL raises an alarm (interrupt) and switches to another valid
  reference.
- Narrow loop bandwidth: the DPLL can be viewed as a phase noise
  filter. The narrower the loop bandwidth, the better the phase noise
  attenuation. Some specifications, such as G.813, explicitly provide
  the loop bandwidth. Others, including GR-253-CORE, specify a narrow
  loop bandwidth implicitly through the wander transfer requirement.
  Ideally, the DPLL should have programmable loop bandwidth so the
  timing card can be easily used for different applications.
- High jitter and wander tolerance: the DPLL should tolerate large
  phase noise at its input and still maintain synchronization.
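The interplay of reference monitoring, reference switching and holdover described in the list above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the class name, the 12 ppm validity limit and the 64-sample averaging window are hypothetical and do not come from any timing specification or device datasheet, and phase alignment (the "hitless" part) is not modeled at all.

```python
from collections import deque


class DpllReferenceSelector:
    """Toy model of reference monitoring, switching and holdover.

    All names and thresholds are hypothetical, not a vendor API.
    """

    def __init__(self, priorities, max_offset_ppm=12.0, history=64):
        self.priorities = list(priorities)    # reference names, best first
        self.max_offset_ppm = max_offset_ppm  # assumed validity limit
        self.history = deque(maxlen=history)  # recent offsets while locked
        self.locked_to = None
        self.mode = "free-run"

    def update(self, measured_ppm):
        """measured_ppm maps reference name -> frequency offset in ppm,
        or None if the reference has disappeared.
        Returns the frequency offset (ppm) the output clock follows."""
        valid = [r for r in self.priorities
                 if measured_ppm.get(r) is not None
                 and abs(measured_ppm[r]) <= self.max_offset_ppm]
        if self.locked_to not in valid:
            # Current reference failed: take the best valid one, if any.
            self.locked_to = valid[0] if valid else None
        if self.locked_to is not None:
            self.mode = "locked"
            offset = measured_ppm[self.locked_to]
            self.history.append(offset)       # keep the running average fresh
            return offset
        if self.history:
            self.mode = "holdover"            # replay the averaged frequency
            return sum(self.history) / len(self.history)
        self.mode = "free-run"                # never locked: oscillator only
        return 0.0
```

Feeding the selector a sequence in which BITS0 disappears and then BITS1 disappears moves it from BITS0 to BITS1 and finally into holdover, where it returns the average of the offsets it saw while locked.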
Timing card DPLL references can come externally from a Building
Integrated Timing Supply (BITS) or internally from line cards. The
BITS is defined as the most accurate clock in an office, and is used
as a master clock for all intraoffice equipment. The BITS can be
viewed as a stand-alone timing card, usually with Stratum 2 (0.1
parts per billion) holdover stability. The BITS is timed by two T1
signals and its outputs are distributed to equipment with T1 or
Composite Clock (CC) signals. It should be noted that BITS is a
North American term, while the rest of the world uses
Synchronization Supply Unit (SSU). Where BITS uses T1 for clock
reception and distribution, SSU uses E1 links.
All nodes in a public telecommunication network must be synchronized
to timing references that are traceable to a Primary Reference
Source (PRS). A PRS provides a clock with Stratum 1 accuracy (0.01
parts per billion). A PRS can be an on-site cesium clock, or it can
be derived from cesium clock-controlled radio signals such as the
Global Positioning System (GPS) and the Long Range Navigation
System, Version C (LORAN-C). Due to the high cost of cesium clocks, a
PRS usually uses GPS, with LORAN-C as a backup if GPS fails. Because
it is not economically viable to have a PRS at each network node,
only a few (usually two) nodes have their BITS synchronized directly
to a PRS.

Figure 3: Block diagram of timing card where the extracted line
timing is passed to external BITS
The other nodes in the network use line timing where their BITS/SSU
is synchronized to one of the extracted line clocks. The clock path
sequence is shown in Figure 3. In this case, an additional low-cost
wideband DPLL is needed to convert the frequency of the line card
extracted clock to the frequency needed by T1/E1/CC Line Interface
Units (LIU). LIUs are used for the transmission of the timing
references between the timing card and BITS and vice versa. For
example, if the extracted line clock originates from an OC-3 line
card, its frequency is usually 19.44 MHz so the wideband DPLL is
needed to convert from 19.44 MHz to 1.544 MHz (T1), 2.048 MHz (E1),
or 64 kHz (CC).
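The arithmetic behind this conversion can be checked with exact fractions. The short sketch below reduces each target rate over the 19.44 MHz line clock, using the standard T1 rate of 1.544 MHz. None of the ratios is a simple integer division, which suggests why a PLL rather than a plain divider is used here. The variable and function names are illustrative only.

```python
from fractions import Fraction

LINE_CLOCK_HZ = 19_440_000  # extracted OC-3 line clock, as in the text

# Frequencies needed by the T1/E1/CC line interface units
TARGETS_HZ = {"T1": 1_544_000, "E1": 2_048_000, "CC": 64_000}


def conversion_ratio(target_hz, source_hz=LINE_CLOCK_HZ):
    """Exact multiply/divide ratio that maps source_hz onto target_hz."""
    return Fraction(target_hz, source_hz)


ratios = {name: conversion_ratio(hz) for name, hz in TARGETS_HZ.items()}

for name, ratio in ratios.items():
    print(f"{name}: multiply by {ratio.numerator}, divide by {ratio.denominator}")
```

The reduced ratios are 193/2430 for T1, 128/1215 for E1 and 4/1215 for CC.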
Optionally, the timing card can be used to source the BITS/SSU clock
if an external BITS/SSU source with better holdover accuracy is not
available. In this case, the timing card DPLL is synchronized to one
of the extracted line clocks. Its output is fed to the backplane and
to the LIUs via the wideband DPLL.
Timing card redundancy
Timing card redundancy is implemented in one of two ways — parallel
redundancy or serial redundancy. Parallel redundancy is shown in
Figure 4, while serial redundancy (commonly referred to as
“master/slave” timing redundancy) is illustrated in Figure 5.

Figure 4: Parallel implementation of the redundant timing
As seen in Figures 4 and 5, DPLLs on the active and redundant cards
drive the active and redundant clocks to the corresponding traces on
the backplane. Each DPLL usually drives common clock frequencies
such as 8 kHz (DS0), 1.544 MHz (DS1), 2.048 MHz (E1) and 19.44 MHz (SONET/SDH).
The active and redundant clocks on the backplane should have the
same frequency and phase. Ideally, the phase difference should be
equal to zero. In practice, a phase difference of a few
nanoseconds is achievable.
The active and redundant clocks are distributed via the backplane to
the line cards. As seen in Figures 4 and 5, the line cards each have
a DPLL followed by an analog PLL (APLL). The DPLL is used for
hitless switching between the active and redundant clocks and to
provide clock continuity for a short period, such as when the active
clock unexpectedly disappears before the system detects active
reference failure and switches the line card DPLL to lock to the
redundant reference.

Figure 5: Serial (Master/Slave) implementation of the timing
redundancy
The APLL is used only for jitter reduction and frequency
multiplication. It is possible to have hitless reference switching
with an APLL. However, good clock continuity is difficult to achieve
because oscillators used on APLLs (usually LC-based) have very low
holdover stability relative to DPLLs that use crystal oscillators.
Typically, a DPLL has short-term holdover accuracy of 0.01 ppm
(parts per million) or better, whereas an APLL has holdover accuracy
worse than 100 ppm.
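To put these figures in perspective, the phase error accumulated during holdover is roughly the fractional frequency offset multiplied by the elapsed time. The sketch below applies that to the two holdover accuracies quoted above. The 1 ms failure-detection time is a hypothetical value chosen for illustration, not a figure from the article, and the offset is assumed constant over the interval.

```python
def accumulated_phase_error_ns(offset_ppm, elapsed_s):
    """Phase error (ns) built up by a constant fractional frequency
    offset (in ppm) over elapsed_s seconds."""
    fractional_offset = offset_ppm * 1e-6       # ppm -> dimensionless
    return fractional_offset * elapsed_s * 1e9  # seconds -> nanoseconds


DETECTION_TIME_S = 1e-3  # hypothetical time to detect the failed clock

dpll_error_ns = accumulated_phase_error_ns(0.01, DETECTION_TIME_S)
apll_error_ns = accumulated_phase_error_ns(100.0, DETECTION_TIME_S)
print(f"DPLL: {dpll_error_ns:.2f} ns, APLL: {apll_error_ns:.0f} ns")
```

Under these assumptions the DPLL drifts by about 0.01 ns during the gap, while an APLL at 100 ppm drifts by about 100 ns.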
Parallel timing redundancy
In this scheme, as illustrated in Figure 4, DPLLs on both timing
cards are locked to either an extracted line clock from one of the
line cards or the BITS reference. Both DPLLs should be locked to the
same input reference and should have identical loop bandwidth (e.g.,
0.1 Hz for Telcordia GR-253-CORE). In this case, if the active card
does a reference switch from BITS0 to BITS1, the redundant card
should simultaneously do the same. Because the DPLLs on the active
and redundant timing cards have the same bandwidth and are fed with
the same input reference, the outputs should be closely phase
aligned regardless of the jitter/wander on the input reference.
However, this is only partially true due to intrinsic wander issues.
We will look at this later in the article.
Serial (master/slave) redundancy
A serial redundancy timing scheme is implemented by locking the
secondary timing card to the output of the primary timing card, as
shown in Figure 5. The loop bandwidth of the DPLL on the active
timing card should be set in accordance with requirements (for
Telcordia GR-253-CORE it is 0.1 Hz). However, the loop bandwidth of
the DPLL on the redundant card should be set as wide as possible, at
least 10 times wider than that of the DPLL on the active card. The wider
bandwidth allows the DPLL to track clock changes at its input much
faster, thus keeping the active and redundant clocks closely aligned
at all times.
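As a rough illustration of why the wider bandwidth tracks faster, the sketch below treats the DPLL as a simple first-order loop. This is a simplifying assumption; real timing DPLLs may be higher order and behave differently. In that model the time constant is 1/(2πf) for loop bandwidth f, and the untracked part of an input phase step decays exponentially. The function names are illustrative.

```python
import math


def time_constant_s(bandwidth_hz):
    """Time constant of an assumed first-order loop with the given
    -3 dB bandwidth."""
    return 1.0 / (2.0 * math.pi * bandwidth_hz)


def residual_fraction(bandwidth_hz, elapsed_s):
    """Fraction of an input phase step not yet tracked after elapsed_s."""
    return math.exp(-elapsed_s / time_constant_s(bandwidth_hz))


for bw in (0.1, 1.0):  # active card vs. a redundant card set 10x wider
    print(f"{bw} Hz: tau = {time_constant_s(bw):.3f} s, "
          f"residual after 1 s = {residual_fraction(bw, 1.0):.4f}")
```

In this model a 0.1 Hz loop has a time constant of about 1.59 s, while a 1 Hz loop has one of about 0.159 s. One second after a phase step, the narrow loop has tracked roughly half of it and the wide loop nearly all of it.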
If it is detected that the clock generated by the active card has
failed, the DPLL on the secondary card will go into holdover mode
and signal to the board controller. The controller will now promote
the secondary card to act as the primary card by selecting the
narrowband loop filter on the DPLL and locking the DPLL to the same
reference input (if available) that the active card was locked to
before it failed. When the failed timing card is replaced, the new
card will assume the role of the redundant timing card.
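The promotion sequence described above can be sketched as a small piece of controller logic. Everything here is hypothetical: the data structure, the field names and the 1 Hz wide-bandwidth setting are assumptions made for illustration (the article only requires the redundant card to be at least 10 times wider than the active card's 0.1 Hz), and a real board controller would drive device registers rather than a Python object.

```python
from dataclasses import dataclass

NARROW_BW_HZ = 0.1  # GR-253-CORE value quoted in the text
WIDE_BW_HZ = 1.0    # assumed: 10x the narrow setting


@dataclass
class TimingCardDpll:
    role: str            # "active" or "redundant"
    bandwidth_hz: float
    source: str          # what the DPLL is locked to
    mode: str = "locked"


def on_active_clock_failure(dpll, last_reference, reference_available):
    """Promote the redundant card's DPLL after the active clock fails."""
    dpll.mode = "holdover"            # first, ride through on holdover
    dpll.role = "active"              # controller promotes the card
    dpll.bandwidth_hz = NARROW_BW_HZ  # select the narrowband loop filter
    if reference_available:
        dpll.source = last_reference  # relock to the old active reference
        dpll.mode = "locked"
    return dpll


redundant = TimingCardDpll("redundant", WIDE_BW_HZ, "active_card_clock")
promoted = on_active_clock_failure(redundant, "BITS0", True)
print(promoted)
```

If the original reference is not available, the sketch leaves the promoted DPLL in holdover, which matches the "(if available)" condition in the text.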
In serial timing redundancy, the phase offset between the active and
the redundant clocks can be calculated from:
D = dPLL + dRxBuffer + dMux + dTxBuffer
where:
dRxBuffer is a typical propagation delay of the receive clock buffer
on the slave card,
dMux is a typical propagation delay of the clock multiplexer,
dTxBuffer is a typical propagation delay of the clock driver on the
slave card, and
dPLL is a typical phase offset between input and the output
reference after reference alignment is performed.
Some advanced DPLLs intended for timing card design have the ability
to advance the output clock relative to the input with a resolution
below 1 nanosecond. This feature can be used to minimize delay D.
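A worked example of this delay budget follows. The component delays are made-up placeholder values, not measurements or datasheet figures. Modeling the DPLL's output advance as a term subtracted from D is one reading of how the feature would be applied, not something the formula above states.

```python
def phase_offset_ns(d_pll, d_rx_buffer, d_mux, d_tx_buffer, output_advance=0.0):
    """Phase offset D (ns) between active and redundant clocks:
    D = dPLL + dRxBuffer + dMux + dTxBuffer, minus any programmed
    output advance (assumed to subtract directly from D)."""
    return d_pll + d_rx_buffer + d_mux + d_tx_buffer - output_advance


# Hypothetical typical delays in nanoseconds
budget = dict(d_pll=0.5, d_rx_buffer=1.5, d_mux=1.0, d_tx_buffer=2.0)

uncompensated = phase_offset_ns(**budget)
compensated = phase_offset_ns(**budget, output_advance=4.5)
print(f"D = {uncompensated} ns, with 4.5 ns advance: {compensated} ns")
```

With these placeholder numbers D is 5.0 ns, and advancing the output clock by 4.5 ns leaves a 0.5 ns residual offset.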
Comparing redundancy schemes
In practice, designers use serial redundancy more often because it
has several important advantages.
If the product is in island mode (not locked to the network
reference or to the BITS clock), its timing cards must work in a
free-run mode. In this mode, the DPLL output frequency will be based
on crystal oscillators used as the DPLL master clock. As a result,
the active and redundant clocks in the parallel method will drift
relative to each other at a rate proportional to the fractional
frequency difference between crystal oscillators on the active and
redundant cards. However, in the serial redundancy method the active
and redundant clocks will always be aligned because the DPLL on the
redundant card locks to the clock generated by the free-running DPLL
on the active card.
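The drift in the parallel case can be estimated from the oscillator offsets alone. The sketch below computes how long two free-running clocks take to move apart by a given phase limit. The oscillator offsets and the 10 ns limit are hypothetical example values, and the offsets are assumed constant over the interval.

```python
import math


def seconds_until_offset(osc_a_ppm, osc_b_ppm, limit_ns):
    """Time for two free-running clocks to drift limit_ns apart, given
    each oscillator's fractional frequency offset in ppm."""
    # 1 ppm of frequency difference equals 1000 ns of drift per second
    drift_ns_per_s = abs(osc_a_ppm - osc_b_ppm) * 1e3
    if drift_ns_per_s == 0.0:
        return math.inf  # identical oscillators never drift apart
    return limit_ns / drift_ns_per_s


# Hypothetical oscillators at +2.0 ppm and -1.0 ppm, 10 ns phase limit
t = seconds_until_offset(2.0, -1.0, 10.0)
print(f"clocks are 10 ns apart after {t * 1e3:.2f} ms")
```

With a 3 ppm difference the two clocks are 10 ns apart after about 3.3 ms. In the serial scheme this term does not arise, because the redundant DPLL tracks the active card's output.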
Since DPLLs on the active and redundant timing cards have the same
bandwidth in the parallel redundancy method, and because they are
fed with the same input reference, one would expect that the outputs
would be closely phase-aligned regardless of the jitter/wander on
the input reference. However, the active and redundant clock may
drift back and forth relative to one another due to intrinsic wander
generated by the DPLL. This intrinsic wander is dependent on the
short-term frequency fluctuations of the crystal oscillator and on
the bandwidth of the DPLL. When fed with a clean input reference
clock, a DPLL can compensate for those short-term fluctuations and
provide clean clocks at its output.
However, the DPLL’s ability to do so is dependent on its bandwidth.
The wider the bandwidth, the better the compensation. Because the
DPLLs on the active and redundant cards in the parallel redundancy
method have the same narrow bandwidth, they will both have intrinsic
wander. Since each card has its own crystal oscillator, the wander
generated by the two DPLLs will be uncorrelated. Thus, the active and
redundant clocks may drift back and forth relative to each other.
The maximum phase difference between them can be more than 10
nanoseconds when the DPLL is set to 0.1 Hz loop bandwidth, even when
very stable oscillators such as oven-controlled crystal oscillators
(OCXOs) are used. This problem is not present in the serial redundancy mode
because the DPLL on the redundant card compensates for all frequency
fluctuations caused by the crystal oscillator due to its wide loop
bandwidth.
Yet, the parallel redundancy scheme is easier to implement because
it does not require reconfiguration of the DPLL on the redundant
card when the active clock/card fails.
Conclusion
Timing card redundancy is implemented in telecommunications products
to prevent data loss and increase network reliability. This article
presented the typical timing card architecture and two common ways
of implementing timing card redundancy. Although slightly more
complicated to implement, serial redundancy has several advantages
over parallel redundancy.
References:
Alain Blanchard, Phase-Locked Loops, Wiley, 1976
Synchronous Optical Network (SONET) Transport Systems: Common
Generic Criteria GR-253-CORE, Issue 3, 2000
Clocks for the Synchronized Network: Common Generic Criteria
GR-1244-CORE, Issue 2, 2000
Digital Network Synchronization Plan, GR-436-CORE, Issue 1, Revision
1, June 1996
Timing characteristics of SDH equipment slave clocks (SEC), ITU-T
Recommendation G.813, 1998
Transport Systems Generic Requirements (TSGR): Common Requirements,
GR-499-CORE, Issue 2, 1998
About the author: Slobodan Milijevic is a Senior Applications
Engineer with Zarlink Semiconductor. He can be reached at
slobodan.milijevic@zarlink.com.