CMOS VLSI chip 1980-1989

The PF474

A CMOS VLSI device that performs intelligent string comparisons while maintaining an on-chip ranked list of best matches.

The PF474 was featured on the cover of Electronics Magazine in December 1983.

Description: The PF474 is a special-purpose CMOS VLSI processor that compares strings and produces a numerical indication of their similarity. A ranked list of the 16 best matches is maintained on silicon. Its 9-stage pipeline uses 6 independent RAMs, one of which can be set to zero in a single cycle. The device count is roughly 55,000. Prototypes were built using 4-micron NMOS technology, but the first commercial versions used 3-micron CMOS. Mark Heising led the CMOS design effort. At its maximum clock speed of 4 MHz, 20,000-30,000 string comparisons and rankings per second were possible. See the author's 1978 master's thesis or A Bipartite Matching Approach to Approximate String Comparison and Search for a discussion of the kind of string similarity function computed by the PF474.
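The compare-and-rank behavior described above can be imitated in software. The following is only a minimal sketch, not the chip's algorithm: the `similarity` function is a stand-in built on Python's `difflib.SequenceMatcher` (the PF474 computed a different measure, discussed in the thesis and paper cited above), and the names `best_matches` and `TOP_N` are hypothetical. What it illustrates is the idea of scoring each candidate string against a query while keeping only a fixed-size ranked list of the best results.

```python
import heapq
from difflib import SequenceMatcher

TOP_N = 16  # the PF474 kept its 16 best matches on chip


def similarity(a: str, b: str) -> float:
    """Stand-in similarity score in [0, 1]; NOT the function the PF474 computed."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_matches(query, candidates, top_n=TOP_N):
    """Return up to top_n (score, candidate) pairs, most similar first."""
    # Min-heap of (score, -index, candidate): the weakest kept match sits at heap[0].
    # Using -index breaks ties in favor of candidates seen earlier.
    heap = []
    for index, candidate in enumerate(candidates):
        entry = (similarity(query, candidate), -index, candidate)
        if len(heap) < top_n:
            heapq.heappush(heap, entry)
        elif entry > heap[0]:
            # The new candidate beats the weakest kept match, so replace it.
            heapq.heapreplace(heap, entry)
    ranked = sorted(heap, reverse=True)
    return [(score, candidate) for score, _, candidate in ranked]


if __name__ == "__main__":
    names = ["smyth", "jones", "smith", "smithe", "schmidt"]
    for score, name in best_matches("smith", names, top_n=3):
        print(f"{score:.3f}  {name}")
```

Because the ranked list has a fixed size, the memory needed does not grow with the number of candidates scanned, which is presumably part of what made an on-chip list practical.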

The PF474 3-micron CMOS die

Background: The PF474 arose more or less directly from my 1978 Mathematics master's thesis at Emory University, which presented an easily computed notion of string similarity. This mathematical work was continued much later (see the links below). The motivation for the PF474 was the observation that computers were originally developed for numerical tasks but were clearly heading for widespread, general-purpose (nonnumerical) use. While many had floating-point units, little hardware existed for nonnumerical applications. We suggested that the computation of a string similarity measure might be of enough general interest to warrant its inclusion in hardware. The hardware was 100-600 times faster at this task than CPUs of the early 1980s.

A summary of applications for PF474 functionality

This speed allowed us to demonstrate never-before-seen user-interface concepts in which the screen showed similar records, refreshing itself with every character typed (see the PBASE image below). Another application (CleanMail) used the chip to analyze mailing lists and help detect near duplicates. The PF474 found several commercial and defense applications but never widespread use. Ultimately our focus turned entirely to software embodiments of similar functions.

A screen from the PBASE program. This program scanned an in-memory database of records using the PF474 and refreshed its display with each key typed. The records are displayed in order of similarity to the typed query.

The CleanMail program used the PF474 to look for near duplicates in mailing lists
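As a rough illustration of near-duplicate detection of this kind, the sketch below compares every pair of records and reports those whose similarity exceeds a threshold. It is not how CleanMail worked internally, which this page does not describe: the `similarity` function is again a stand-in based on `difflib.SequenceMatcher`, and `near_duplicates` and the 0.85 threshold are hypothetical choices made for the example.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Stand-in similarity score in [0, 1]; NOT the function the PF474 computed."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def near_duplicates(records, threshold=0.85):
    """Return (score, i, j) for each pair of records scoring at or above threshold."""
    pairs = []
    # Compare every unordered pair of records exactly once.
    for (i, a), (j, b) in combinations(enumerate(records), 2):
        score = similarity(a, b)
        if score >= threshold:
            pairs.append((score, i, j))
    # Most similar pairs first, so likely duplicates are reviewed first.
    return sorted(pairs, reverse=True)


if __name__ == "__main__":
    mailing_list = ["John Smith", "Jon Smith", "Mary Jones"]
    for score, i, j in near_duplicates(mailing_list):
        print(f"{score:.3f}  {mailing_list[i]!r} ~ {mailing_list[j]!r}")
```

Comparing all pairs takes time quadratic in the number of records, which suggests why a fast comparison engine would matter for large mailing lists.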

The PF474 was described in a databook published by Proximity Technology (ISBN 0-926390-00-7), written primarily by Tom Kearns and me

The PF474 was packaged in a 40-pin ceramic DIP with a memory-like pinout

PF474-equipped circuit boards were available for the (then new) IBM-PC and the Apple II

The second-generation circuit board for the IBM-PC bus was much smaller and easier to program