ICSI Speech FAQ:
6.7 Tell me about the MultiSPERT systems.

Answer by: dpwe - 2000-08-10


MultiSPERT hardware

The SPERT boards are custom SBUS cards manufactured at ICSI that plug into Sun SPARCstations and are used for accelerated neural net processing. The SBUS can of course accommodate a number of devices, and in theory up to fifteen (or something) SPERT boards could be connected to a single host.

In practice, it's a little more complicated. Firstly, the SPERT boards are double-width (taking up two SBUS sockets), not least because their current draw (peaking at over 2.5 amps) exceeds the SBUS spec for a single card. Most SPARCstations have only two or three SBUS slots, and can thus accommodate only one SPERT card.

However, SPARC-10s and 20s have two rows of two SBUS sockets, and can thus host two SPERT cards. Moreover, for a while it was possible to buy alleged SBUS expanders - a single SBUS card that allowed the connection of a second chassis containing four more SBUS sockets, with the possibility of installing several such expanders in a single host. Finally, although the SPERT cards straddle two SBUS connectors, they only actually use the signals from the left-hand connector, so through an arrangement of small extenders it might be possible to plug a separate SPERT board into every available SBUS socket.

A single host with multiple SPERT boards attached is called a MultiSPERT. In order to support the greater SBUS activity associated with MultiSPERT systems, a second generation of SPERT boards was manufactured with a new bus protocol (requiring a larger Xilinx part, which is why the original SPERTs can't be used as MultiSPERT slaves). There are only 13 of these boards in existence, I think, mostly with serial numbers in the 3xx range, although a couple of the original boards were heroically upgraded by Jim.

At the time of writing we have four MultiSPERT machines. We have two DuoSPERTs - SPARC-10/20s with one SPERT board plugged into each of their two pairs of SBUS connectors. Of the two, RAVIOLI is slightly faster than PANFORTE (faster processor and/or faster SBUS clock rate).

We also have two TetraSPERTs: GUINNESS has two of the alleged SBUS expander chassis, each holding two SPERT cards. We went through eight or nine of these units before finding any that worked, but apart from a few power-supply deaths and SPERT overheating problems, the setup on GUINNESS has been pretty reliable.

The second TetraSPERT is KHEER, which uses highly unprofessional one-inch ribbon-cable extensions to fit one SPERT card into each of the four SBUS sockets of a SPARC-20. The ribbon cables are electrically questionable, and the power-supply loading is way out of spec, but this system too operated reliably for about a year. At the time of writing, however, it is misbehaving, and the true problem has yet to be determined.

There is also one MonoSPERT - a MultiSPERT-capable SPERT card plugged alone into a host (HOP?). The value of this is that it can run qnstrn with slaves=1, which splits the processing between the SPARC host (floating-point feature preprocessing) and the SPERT (neural-net calculation and weight updates).

MultiSPERT software

Of course, physically connecting multiple SPERTs doesn't mean that they are immediately useful. Our particular interest in thinking about MultiSPERTs was to support the training of nets too large to fit in a single SPERT. SPERTs have no support for virtual memory, so the training process must fit within the 8 MB local memory, which must also accommodate the kernel etc. I think the training process uses 6 bytes per weight (4 bytes to hold the weight, plus 2 bytes for the intermediate activation or back-propagated error). In any case, the practical limit for a single SPERT is around 600,000-700,000 weights (or 2000 HUs for a 250-input, 50-output net).
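As a sanity check on those numbers, here is a back-of-the-envelope sizing calculation (a sketch in Python of the arithmetic above, not anything from QuickNet; the 700,000-weight ceiling is just the rough practical limit quoted in the previous paragraph):

    # Rough single-SPERT sizing (my sketch, not QuickNet code).
    # Figures from the text: 6 bytes of training state per weight,
    # and a practical ceiling of roughly 700,000 weights per board.
    BYTES_PER_WEIGHT = 6
    MAX_WEIGHTS = 700000

    def n_weights(n_in, n_hid, n_out):
        # Fully-connected 3-layer net: input->hidden plus hidden->output.
        return n_in * n_hid + n_hid * n_out

    w = n_weights(250, 2000, 50)              # the example above
    print(w)                                  # 600000 weights
    print(w * BYTES_PER_WEIGHT / 1e6, "MB")   # 3.6 MB of training state
    print(w <= MAX_WEIGHTS)                   # True: fits on one SPERT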

Fortunately, a visiting post-doc (Philipp Faerber) took on the project of adapting QuickNet to drive multiple slave SPERTs, and these extensions are now part of the standard qnstrn software (selected by the slaves=nn option). MultiSPERT qnstrn runs rather differently from either host-only qnstrn (which does all calculations locally) or single-SPERT qnstrn (which executes a complete qnstrn process compiled to run on the SPERT rather than the host).

MultiSPERT operation runs a special slave process on each SPERT; these slaves are handed sub-problems to calculate via a protocol called the SPERT Procedure Call (SPC). The main process, running on the host, breaks up each net calculation by dividing the hidden units among the SPERTs, then gathers the partial results and combines them on the host CPU. For trainings, the host CPU also has to calculate the total error and broadcast it back to the individual boards for back-propagation through their weights. As a result, MultiSPERT training is communications-bound by the limitations of the SBUS for all but the very largest trainings (i.e. you're better off using fewer SPERTs if the job will fit on them).
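Schematically, the division of labor looks something like the following (a toy Python simulation of the scheme just described, with made-up class and method names and the hidden-layer nonlinearity omitted for brevity; it is not the actual SPC protocol or QuickNet code):

    import random

    N_IN, N_HID, N_OUT, N_SLAVES = 4, 8, 3, 4

    class Slave:
        # Each slave owns one slice of the hidden layer: the
        # input->hidden weights for its slice, plus that slice's
        # columns of the hidden->output weights.
        def __init__(self, n_slice):
            self.w_ih = [[random.gauss(0, 0.1) for _ in range(N_IN)]
                         for _ in range(n_slice)]
            self.w_ho = [[random.gauss(0, 0.1) for _ in range(n_slice)]
                         for _ in range(N_OUT)]

        def forward(self, x):
            # Compute this slave's hidden slice, then its partial
            # contribution to each output unit.
            h = [sum(w * xi for w, xi in zip(row, x)) for row in self.w_ih]
            return [sum(w * hi for w, hi in zip(row, h)) for row in self.w_ho]

    slaves = [Slave(N_HID // N_SLAVES) for _ in range(N_SLAVES)]
    x = [random.random() for _ in range(N_IN)]

    # Host: broadcast the input, gather the partial outputs, combine.
    y = [sum(parts) for parts in zip(*(s.forward(x) for s in slaves))]

    # Host: compute the total error centrally, then broadcast it back
    # so each slave can back-propagate through its own weight slice.
    target = [1.0] + [0.0] * (N_OUT - 1)
    err = [t - yi for t, yi in zip(target, y)]
    print(y, err)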

However, for the very large trainings (e.g. 8000 HUs with 250 inputs and 54 outputs), a four-SPERT system is required, and works very well - getting something over 370 MCUPS (millions of connection updates per second), which is excellent considering that a single SPERT, running locally, rarely exceeds 100 MCUPS.
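To see why four boards are needed, apply the 6-bytes-per-weight figure from above (my arithmetic, using only the numbers already quoted): the 8000-HU net has 250*8000 + 8000*54 = 2,432,000 weights, or around 14.6 MB of training state, nearly double a single board's entire 8 MB memory. Split four ways, each SPERT holds 2000 hidden units' worth - about 3.6 MB - which fits comfortably.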

To run a MultiSPERT training, consult the qnstrn man page, paying particular attention to the cpu= and slaves= options. To find a machine to run on, try finger -l spert; by convention, the .plan file of the pseudo-user spert is kept up to date with the names of the available SPERT machines and their current users, if any. You'll recognize the MultiSPERTs because multiple SPERT card serial numbers are listed for a single host.
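For example, a TetraSPERT training would be launched with something along the lines of qnstrn slaves=4 cpu=... plus the usual net-geometry and feature-file options (this is only a sketch - the exact values that cpu= accepts, and the rest of the required options, are in the man page).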


Previous: 6.6 Tell me about the SPERT boards. - Next: 6.8 How does neural net size affect performance?
Back to ICSI Speech FAQ index
