

# STARCORE DSPs BOOST VOIP

Freescale Designs Its Latest DSPs for Packet-Telephony Applications

By Tom R. Halfhill {5/18/04-01}

Two decades of deregulation have slashed the cost of long-distance phone calls to pennies a minute, but even pennies aren't free. Business and residential customers eager for lower-cost alternatives are eyeing voice-over-Internet-Protocol (VoIP) telephony, which piggybacks digitized voice packets onto existing Internet services. Result: talk gets even cheaper.

The compound annual growth rate of the IP-telephony market is about 35% for residential users and about 53% for business users, according to Norm Bogen, director of networking market analysis at In-Stat/MDR. To feed that growing market, Freescale Semiconductor—Motorola's former semiconductor group—is introducing a new family of five DSPs based on the StarCore VLIW architecture.

Although Freescale's new MSC711x-series DSPs are useful for any 16-bit fixed-point signal processing, they are especially suited for packet telephony. Two of the chips have Ethernet media-access controllers, and all have time-division multiplexers (TDMs), DDR memory controllers, 32-channel DMA, and generous amounts of on-chip SRAM. The DSPs are designed to work in tandem with Freescale's PowerQuicc communications processors, but they will function as slaves to virtually any host processor via an Ethernet connection or their 8/16-bit host data interface (HDI).

In an interesting departure, all five of the new DSPs are designed around the SC1400 synthesizable DSP core licensed from StarCore LLC, an independent offshoot of Infineon, Lucent/Agere, and Motorola. (See the sidebar, "StarCore LLC Offers Soft DSPs," in *MPR 10/20/03-01*, "Motorola Enhances StarCore DSP.") Previous Motorola DSPs, including other StarCore-based devices from Motorola, have been full-custom hand-packed designs. Freescale says the synthesizable SC1400 core, which accelerated the design project, allows the company to rapidly create additional variations, provides better portability across different fabrication processes, and sacrifices a negligible amount of performance versus a hard core.

Members of the MSC711x family are priced relatively low for high-performance DSPs with their integrated features, ranging from about \$12 for the MSC7110 to about \$24 for the MSC7116, in 10,000-unit volumes. They are mutually pin compatible and are binary compatible with software written for MSC81xx-series StarCore DSPs. Samples of the chips will be available this summer, with production scheduled to begin in November and December.

## First DSPs With Glueless DDR Interfaces

Freescale expects customers to use the MSC711x-series DSPs in VoIP gateways that support multiple voice channels over a broadband Internet connection. Fewer than 20% of U.S. households currently have broadband access, but it's more common in some European and Asian countries, and it's almost universal among business customers worldwide.

Freescale estimates that an MSC711x DSP will support 8 to 16 "premium-voice" channels, the company's worst-case measure of capacity for packet telephony. Freescale's definition of a premium-voice channel is one that uses the G.711, G.726, and G.729AB vocoders with 128-millisecond echo cancellation. Omitting the G.726 and G.729AB vocoders requires more bandwidth but less processing power, so the DSP can handle twice as many voice channels. In large installations, multiple MSC711x DSPs can work together to support dozens or even hundreds of voice channels. Figure 1 is a block diagram of such a system.
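The capacity figures above reduce to simple arithmetic. The following Python sketch is only an illustration: the function name is hypothetical, and it assumes that G.711-only capacity is exactly double the premium-voice figure and that capacity adds linearly across chips. Neither assumption is a Freescale sizing rule.

```python
def farm_capacity(chips: int, premium_per_chip: int) -> dict:
    """Estimate voice channels for a card holding `chips` identical DSPs.

    Assumptions (illustrative only): capacity adds linearly across chips,
    and dropping G.726/G.729AB exactly doubles the channel count.
    """
    if chips < 1 or premium_per_chip < 1:
        raise ValueError("chips and premium_per_chip must be positive")
    premium = chips * premium_per_chip
    return {"premium_voice": premium, "g711_only": 2 * premium}


# Four parts rated at 16 premium-voice channels each.
print(farm_capacity(4, 16))
```

Under these assumptions, four 16-channel parts come to 64 premium-voice channels or 128 G.711-only channels.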

To deliver the required level of signal-processing performance, Freescale started by licensing StarCore LLC's most powerful DSP core, the SC1400, which has twice as many ALU/multiply-accumulate (MAC) units as the SC1200 (four versus two). Note that the SC1400 is essentially a synthesizable version of the original StarCore SC140 DSP core introduced in 1999. (See *MPR 5/10/99-03*, "StarCore Reveals Its First DSP.") The SC1400 lacks the features added to the enhanced SC140e DSP core that Motorola announced at **Microprocessor Forum 2003**. (See *MPR 10/20/03-01*, "Motorola Enhances StarCore DSP.")

Although the SC1400 is somewhat long in the tooth, it still has enough processing power for VoIP and many other 16-bit fixed-point DSP applications. Clocked at 200–250MHz, the SC1400 can execute a peak 800 million to 1 billion MACs per second. More important are the fast memory system and other integrated features. MSC711x chips are the first DSPs to incorporate a glueless DDR SDRAM controller, which effectively doubles the memory bandwidth. With a 32-bit data bus, 14-bit address bus, and effective DDR frequency of 200–250MHz, the maximum theoretical throughput is 800–1,000MB/s. (Optionally, the bus can operate at a width of 16 bits, which would halve the throughput.) However, some other DSPs, such as Freescale's own StarCore MSC8122, have wider, faster SDR buses that match or exceed the DDR bandwidth of the MSC711x. The advantage of a narrower bus that achieves similar bandwidth by doubling the data rate is a smaller package with fewer pins.
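The peak figures quoted above can be reproduced with two one-line formulas. This Python sketch uses hypothetical function names and assumes that each of the four ALU/MAC units completes one MAC per clock, that the "effective" DDR frequency already counts both clock edges, and that 1MB means 10^6 bytes. These are theoretical peaks, not sustained rates.

```python
def peak_mmacs(mac_units: int, core_mhz: float) -> float:
    """Peak millions of MACs per second: one MAC per unit per clock (assumed)."""
    return mac_units * core_mhz


def peak_bus_mb_per_s(bus_bits: int, effective_mhz: float) -> float:
    """Peak bus throughput in MB/s: bytes per transfer times transfers per second."""
    return (bus_bits / 8) * effective_mhz


for mhz in (200, 250):
    print(
        f"{mhz}MHz: {peak_mmacs(4, mhz):.0f} MMACs, "
        f"{peak_bus_mb_per_s(32, mhz):.0f}MB/s at 32 bits, "
        f"{peak_bus_mb_per_s(16, mhz):.0f}MB/s at 16 bits"
    )
```

At 200MHz this gives 800 MMACs and 800MB/s on the 32-bit bus; at 250MHz it gives 1,000 MMACs and 1,000MB/s. The 16-bit bus option yields half the 32-bit throughput at either speed.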



In addition to their DDR memory interfaces, the MSC711x DSPs have relatively large amounts of on-chip memory. All have a block of SRAM known as M1 memory: 64K in the MSC7110 and 192K in the other members of the family. The DSP core can access the four-ported M1 memory in a single clock cycle. The MSC7115 and '16 have an additional block of M2 memory with 192K of SRAM, which the DSP core can access in two clock cycles. These generous on-chip memories allow the processors to keep critical data close to the DSP core without the real-time uncertainties of data caches. (The core has a 16K, 16-way set-associative instruction cache, but no data cache.) A 32-channel DMA controller allows the processor to manage multiple memory transactions in the background. Figure 2 is a block diagram of the MSC7116.

## Juggling Interfaces and Peripherals

Two of the new DSPs—the MSC7113 and '16—have 10–100Mb/s Ethernet media-access controllers. These controllers allow direct connections to a network (through a PHY chip) and also provide an alternative to the 8/16-bit HDI host port that's standard on all MSC711x DSPs. As prices for Ethernet switches decline, some customers are finding it cheaper and easier to connect the DSP to the rest of the system via Ethernet instead of using a conventional host interface.

Another feature that distinguishes members of the MSC711x family from each other is the number of on-chip TDMs, which are responsible for allocating time slots to multiplexed datastreams. The best-endowed family member in this respect is the MSC7115, which has three TDMs, each capable of multiplexing 128 channels of data. The MSC7112, '13, and '16 each have two 128-channel TDMs, and the '10 has one. The trade-off here is that customers must sacrifice Ethernet to get the largest number of TDM channels: the Ethernet-equipped '13 and '16 have only two TDMs.

Other MSC711x on-chip resources are fairly commonplace: a UART (which can provide an RS-232 interface), an I<sup>2</sup>C interface, a JTAG interface with an 8K trace buffer, 8K of bootstrap ROM, a 120-channel interrupt controller, a real-time clock, two 16-bit quad general-purpose timers, and up to 37 general-purpose I/O (GPIO) pins. Interestingly, the on-chip peripherals and serial controllers are integrated as dedicated logic blocks, unlike other Motorola StarCore DSPs, which use a RISC-based communications processor module (CPM) to implement serial I/O.

**Figure 1.** This VoIP reference design (available from Freescale later this year) uses four MSC711x-series StarCore chips on a "DSP farm card" to assist a PowerQuicc II MPC8260 communications processor. Such a system might be capable of handling 64 premium-voice channels or more than 100 G.711 channels.

Dedicated logic blocks give Freescale more design flexibility, because the company can rapidly create new members of the MSC711x family by adding or subtracting blocks. For example, the only real difference between the MSC7115 and '16 is that the latter chip substitutes an Ethernet controller for one of the TDMs. Likewise, the only difference between the MSC7112 and '13 is the latter's Ethernet controller, which in this case doesn't come at the expense of a TDM. Table 1 summarizes the features of the MSC711x-series DSPs and compares them with those of Freescale's MSC8122, a previously announced StarCore-architecture DSP that integrates four SC140 cores on a single chip.
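Because the family members differ only in a handful of blocks, choosing a part amounts to a small filter over the Table 1 data. The Python sketch below is illustrative: the `Part` class and `cheapest` function are hypothetical names, and the values are copied from Table 1 (memory sizes in K, prices in 10,000-unit quantities).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Part:
    name: str
    m1_k: int        # M1 SRAM, in K
    m2_k: int        # M2 SRAM, in K (0 if absent)
    tdms: int        # number of 128-channel TDMs
    ethernet: bool   # has a 10/100 Ethernet MAC
    price_usd: float


FAMILY = [
    Part("MSC7110", 64, 0, 1, False, 12.05),
    Part("MSC7112", 192, 0, 2, False, 17.40),
    Part("MSC7113", 192, 0, 2, True, 18.59),
    Part("MSC7115", 192, 192, 3, False, 22.76),
    Part("MSC7116", 192, 192, 2, True, 23.95),
]


def cheapest(min_tdms: int = 1, need_ethernet: bool = False,
             need_m2: bool = False) -> Optional[Part]:
    """Return the lowest-priced part meeting the requirements, or None."""
    matches = [
        p for p in FAMILY
        if p.tdms >= min_tdms
        and (p.ethernet or not need_ethernet)
        and (p.m2_k > 0 or not need_m2)
    ]
    return min(matches, key=lambda p: p.price_usd, default=None)


print(cheapest(need_ethernet=True))               # lowest-priced Ethernet part
print(cheapest(min_tdms=3))                       # lowest-priced three-TDM part
print(cheapest(min_tdms=3, need_ethernet=True))   # no part offers both
```

The last query returns `None`, which reflects the trade-off described earlier: no current family member combines three TDMs with Ethernet.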

## StarCore Performance Remains Competitive

Berkeley Design Technology Inc. (BDTI) recently benchmarked an MSC711x DSP in simulation (samples of the chips weren't available) and obtained good results for a processor built on a synthesizable core with a five-year-old microarchitecture. The BDTI score agrees with our own extrapolations, which are based on BDTI's two previous tests of the SC1400 soft core and an actual DSP chip that uses the closely related SC140 hard core.

When simulated at 200MHz, the MSC711x achieved a BDTIsimMark2000 score of 2,240. At this writing, five other DSPs have posted higher BDTIsimMark2000 scores: the Tensilica Xtensa LX (6,150 at 370MHz); the CEVA X1620 (3,620 at 450MHz); the StarCore SC1400 (3,420 at 305MHz); the StarCore SC1200 (2,690 at 340MHz); and the LSI Logic ZSP500 (2,570 at 325MHz). Note that all those cores were simulated at significantly higher clock speeds than the 200MHz MSC711x. The configurable Xtensa LX (actually a general-purpose RISC microprocessor core, not a DSP) is a special case. Announced at Embedded Processor Forum this week, the Xtensa LX was configured with custom instructions designed to ace the BDTI benchmark tests. (*MPR* is preparing a detailed report about the Xtensa LX for a future issue.)
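One way to account for the clock-speed differences is to divide each score by its simulated clock frequency. The Python sketch below does that for the scores quoted above. The per-MHz normalization is an illustration only, not a BDTI metric, and it assumes that scores scale linearly with clock frequency.

```python
# (BDTIsimMark2000 score, simulated clock in MHz), as quoted in the text.
SCORES = {
    "Tensilica Xtensa LX": (6150, 370),
    "CEVA X1620": (3620, 450),
    "StarCore SC1400": (3420, 305),
    "StarCore SC1200": (2690, 340),
    "LSI Logic ZSP500": (2570, 325),
    "Freescale MSC711x": (2240, 200),
}


def score_per_mhz(score: float, mhz: float) -> float:
    """Benchmark points per MHz (assumes linear scaling with clock)."""
    return score / mhz


# Rank the processors from highest to lowest points per MHz.
ranked = sorted(SCORES.items(),
                key=lambda item: score_per_mhz(*item[1]),
                reverse=True)

for name, (score, mhz) in ranked:
    print(f"{name:20s} {score_per_mhz(score, mhz):6.2f} points/MHz")
```

On this basis the MSC711x (11.2 points per MHz) ranks third, essentially level with the SC1400 soft core it is built on, and trails only the specially configured Xtensa LX by a wide margin.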

Although BDTI's benchmark tests aren't specifically designed to measure DSP performance in packet-telephony applications, they do include some relevant signal-processing filters and functions, such as fast Fourier transforms (FFT) and Viterbi decoders. They also include a decision-making control kernel that's relevant to packet processing, but that task will probably be handled by a PowerQuicc chip or similar communications processor working alongside the DSP.

## Price & Availability

Freescale plans to release samples of the MSC711x-series DSPs in June, July, and August, depending on the specific part (see Table 1). Production quantities are scheduled to be available in November and December 2004. Prices for chips purchased in 10,000-unit quantities range from \$12.05 to \$23.95. For more information, visit www.motorola.com/dsp.

BDTI hand-codes the benchmark tests in assembly language for each DSP instead of using a C compiler, and the BDTIsimMark2000 tests run on a cycle-accurate instruction-set simulator. (For more information about BDTI benchmarks, see www.bdti.com/bdtimark/BDTImark2000.htm.)

Before BDTI published its official score for the MSC711x, *MPR* obtained similar results by extrapolating from BDTI's earlier benchmarks of the SC1400 soft core. Because of the nature of the BDTI benchmarks, the scores will scale in a linear relationship with clock frequency. When BDTI simulated the SC1400 at 305MHz—a clock speed higher than any reached by current members of the MSC711x family but certainly attainable in their 0.13-micron CMOS process—it recorded a BDTIsimMark2000 score of 3,420. Scaled to 200–250MHz, that score translates



**Figure 2.** Rich in on-chip memory and peripherals, the Freescale MSC711x-series chips are the first DSPs with glueless DDR SDRAM controllers. Other members of the family are minor variations of this MSC7116 design, generally having less on-chip memory, fewer time-division multiplexers, and, in some cases, omitting the Ethernet controller.


| Feature | Freescale MSC7110 | Freescale MSC7112 | Freescale MSC7113 | Freescale MSC7115 | Freescale MSC7116 | Freescale MSC8122 |
|---|---|---|---|---|---|---|
| DSP Core Features | | | | | | |
| Architecture | StarCore VLIW | StarCore VLIW | StarCore VLIW | StarCore VLIW | StarCore VLIW | StarCore VLIW |
| Microarchitecture | SC1400 | SC1400 | SC1400 | SC1400 | SC1400 | SC140 |
| DSP Type | 16-bit fixed-pt | 16-bit fixed-pt | 16-bit fixed-pt | 16-bit fixed-pt | 16-bit fixed-pt | 16-bit fixed-pt |
| On-Chip DSP Cores | 1 | 1 | 1 | 1 | 1 | 4 |
| Core Freq | 200MHz | 200MHz | 200MHz | 200–250MHz | 200–250MHz | 300–400MHz |
| Effective Bus Freq | 200MHz (DDR) | 200MHz (DDR) | 200MHz (DDR) | 200–250MHz<br>(DDR) | 200–250MHz<br>(DDR) | 133MHz (SDR) |
| I-Cache (16-way) | 16K | 16K | 16K | 16K | 16K | 4 x 16K |
| ALU / MAC Units | 4 | 4 | 4 | 4 | 4 | 16 |
| MMACs | 800 @ 200MHz | 800 @ 200MHz | 800 @ 200MHz | 1,000 @ 250MHz | 1,000 @ 250MHz | 6,400 @ 400MHz |
| Address Gen Units | 2 | 2 | 2 | 2 | 2 | 8 |
| Branch Units | 1 | 1 | 1 | 1 | 1 | 4 |
| Data Registers | 16 x 40 bits | 16 x 40 bits | 16 x 40 bits | 16 x 40 bits | 16 x 40 bits | 48 x 40 bits |
| Address Registers | 27 x 32 bits | 27 x 32 bits | 27 x 32 bits | 27 x 32 bits | 27 x 32 bits | 108 x 32 bits |
| On-Chip Peripherals, Memory, and Interfaces | | | | | | |
| M1 SRAM | 64K | 192K | 192K | 192K | 192K | 4 x 224K |
| M2 SRAM | – | – | – | 192K | 192K | 476K |
| Bootstrap ROM | 8K | 8K | 8K | 8K | 8K | 4K |
| Ethernet MAC | – | – | 1 x 10/100 | – | 1 x 10/100 | 1 x 10/100 |
| Time-Div Muxes | 1 x 128 ch | 2 x 128 ch | 2 x 128 ch | 3 x 128 ch | 2 x 128 ch | 4 x 256 ch |
| SDRAM Controller | 16/32-bit DDR | 16/32-bit DDR | 16/32-bit DDR | 16/32-bit DDR | 16/32-bit DDR | 32/64-bit SDR |
| Host Interface | 8/16-bit HDI | 8/16-bit HDI | 8/16-bit HDI | 8/16-bit HDI | 8/16-bit HDI | 32/64-bit HDI |
| RS-232 Interface | 1 (via UART) | 1 (via UART) | 1 (via UART) | 1 (via UART) | 1 (via UART) | 1 (via UART) |
| DMA Controller | 32-channel | 32-channel | 32-channel | 32-channel | 32-channel | 16-channel |
| UART | 1 | 1 | 1 | 1 | 1 | 1 |
| I<sup>2</sup>C Interface | 1 | 1 | 1 | 1 | 1 | 1 |
| Timers | 2 x 16-bit quad | 2 x 16-bit quad | 2 x 16-bit quad | 2 x 16-bit quad | 2 x 16-bit quad | 32 x 16-bit<br>or 16 x 32-bit |
| GPIO | 37 | 37 | 37 | 37 | 37 | 32 |
| Interrupt Controller | 120 channels | 120 channels | 120 channels | 120 channels | 120 channels | 384 channels |
| Additional Specifications | | | | | | |
| PV Channels\* | 8 | 12 | 12 | 16 | 16 | 168 |
| Voltage (Core) | 1.2V | 1.2V | 1.2V | 1.2V | 1.2V | 1.0–1.2V |
| Voltage (I/O) | 3.3V I/O<br>2.5V DRAM | 3.3V I/O<br>2.5V DRAM | 3.3V I/O<br>2.5V DRAM | 3.3V I/O<br>2.5V DRAM | 3.3V I/O<br>2.5V DRAM | 3.3V I/O<br>3.3V SDRAM |
| Power (200MHz)\*\* | 300–400mW | 300–400mW | 300–400mW | 300–400mW | 300–400mW | 1W @ 300MHz |
| Package | MAPBGA-400\*\*<br>17x17mm | MAPBGA-400\*\*<br>17x17mm | MAPBGA-400\*\*<br>17x17mm | MAPBGA-400\*\*<br>17x17mm | MAPBGA-400\*\*<br>17x17mm | FC-CBGA-431<br>20x20mm |
| IC Process | 0.13µm CMOS | 0.13µm CMOS | 0.13µm CMOS | 0.13µm CMOS | 0.13µm CMOS | 90nm CMOS |
| General Sampling | 8/04 | 8/04 | 8/04 | 6/04 | 7/04 | 6/04 |
| Availability | 12/04 | 12/04 | 12/04 | 11/04 | 11/04 | 12/04 |
| Price (10K) | \$12.05 | \$17.40 | \$18.59 | \$22.76 | \$23.95 | \$106–139 |

Table 1. Freescale's MSC711x family debuts with five DSPs: the MSC7110, '12, '13, '15, and '16. All are based on the same SC1400 synthesizable DSP core and run at similar clock frequencies; on-chip peripherals and memories account for their differences. The table also shows the previously announced MSC8122, a more powerful DSP with four SC140 cores. The MSC711x chips are packaged in lead-free 400-contact multiarray plastic ball grid arrays (MAPBGA) and are mutually pin compatible. \*Freescale's estimate of the number of premium-voice packet-telephony channels each chip can support. \*\*Freescale's estimate of worst-case power consumption with all peripherals and I/O interfaces active. Typical power consumption will depend on the code that's running and the number of active blocks; all these chips use extensive clock gating to reduce power.

to 2,242–2,803, virtually the same as BDTI's actual result for the MSC711x at 200MHz.
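The extrapolation is a straight proportional scaling by clock frequency. A minimal Python sketch, assuming (as the text states) that BDTI scores scale linearly with clock speed; the function name is hypothetical.

```python
def scale_score(score: float, measured_mhz: float, target_mhz: float) -> float:
    """Scale a benchmark score linearly from one clock frequency to another."""
    return score * target_mhz / measured_mhz


# SC1400 soft core: BDTIsimMark2000 of 3,420 simulated at 305MHz.
low = scale_score(3420, 305, 200)
high = scale_score(3420, 305, 250)

# Truncating to whole points reproduces the range quoted in the text.
print(int(low), int(high))  # 2242 2803

# Relative gap between the extrapolation and BDTI's 2,240 at 200MHz.
print(f"{(low - 2240) / 2240:.2%}")
```

The extrapolated 200MHz figure lands within about 0.1% of BDTI's simulated result.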

For another data point, consider BDTI's tests of the Motorola MSC8101, which uses the SC140 hard core (albeit without an instruction cache). As a hand-optimized design, the SC140 core can run at higher clock frequencies than its synthesizable SC1400 counterpart when the SC140 is manufactured in a similar fabrication process, so some clock-speed scaling is necessary.

At 300MHz, the MSC8101 achieved a respectable BDTImark2000 score of 3,370. Scaling for the MSC711x at 200–250MHz, that score translates into a range of 2,247–2,808, again very close to BDTI's actual result for the MSC711x. Only four other fixed-point DSPs have better BDTImark2000 scores: Texas Instruments' TMS320C64x (9,130 at 1GHz); Analog Devices' ADSP-TS201S TigerSharc (6,150 at 600MHz); Analog Devices' ADSP-BF5xx Blackfin (4,190 at 750MHz); and Motorola's MSC8101. Note that all the other DSPs run at significantly higher clock speeds than the MSC711x, up to 1GHz in the case of the 'C64x.

We draw two conclusions from these comparisons: the five-year-old SC140/SC1400 core remains competitive with newer DSPs, and, most remarkably, it delivers good performance at lower clock frequencies, which allows lower power consumption. By wrapping the SC1400 core in an integrated design that includes a DDR memory controller, more on-chip memory than is commonly found in DSPs, and communications-oriented features (TDMs and Ethernet controllers), the MSC711x family is well positioned for packet telephony and many other signal-processing tasks. $\diamondsuit$

To subscribe to Microprocessor Report, phone 480.609.4551 or visit www.MDRonline.com

© IN-STAT/MDR

MAY 18, 2004

MICROPROCESSOR REPORT