## Integrated Smart Image Sensors in Digital CMOS Technology

by

Mark Ted Grigoleit B.Sc. (Honours), Simon Fraser University, 1986

## A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF APPLIED SCIENCE

in the School of **Engineering Science**

© Mark Ted Grigoleit 1991 Simon Fraser University April 1991

All rights reserved. This thesis may not be reproduced in whole or part, by photocopy or other means, without the permission of the author.

## Approval

| NAME:            | Mark Ted Grigoleit                                           |
|------------------|--------------------------------------------------------------|
| DEGREE:          | Master of Applied Science (Engineering Science)              |
| TITLE OF THESIS: | Integrated Smart Image Sensors in Digital CMOS Technology    |

EXAMINING COMMITTEE:

Chairman: Dr. John Jones

Dr. Marek J. Syrzycki Senior Supervisor

Dr. Shawn Stapleton Supervisor

Dr. Glenn Chapman Examiner

DATE APPROVED: April 11, 1991

#### PARTIAL COPYRIGHT LICENSE

I hereby grant to Simon Fraser University the right to lend my thesis, project or extended essay (the title of which is shown below) to users of the Simon Fraser University Library, and to make partial or single copies only for such users or in response to a request from the library of any other university, or other educational institution, on its own behalf or for one of its users. I further agree that permission for multiple copying of this work for scholarly purposes may be granted by me or the Dean of Graduate Studies. It is understood that copying or publication of this work for financial gain shall not be allowed without my written permission.

#### Title of Thesis/Project/Extended Essay

"Integrated Smart Image Sensors in Digital CMOS Technology"

Author:

(signature)

Mark Grigoleit

(name)

April 18, 1991

(date)

## Abstract

Focal-plane processing is a general approach to 'smart' image sensor design that incorporates local processing circuitry within the sensor itself, in an effort to overcome the high data bandwidths needed for conventional full-frame digital image collection and processing. Current approaches to focal-plane processing use dense, regular arrays of detectors, interleaved with analog and digital circuitry. The state of the art is represented by simple image processing done in arrays of up to 64 x 64 sensors. With current trends in feature size reduction and device yields, designs of full-frame arrays of 512 x 512 'smart' sensors may be feasible soon.

Despite these advances in existing designs, there is still room for improvement in such areas as large die sizes, device defects and low yields, and the high power consumption of existing circuits. As a way of overcoming these limitations a novel smart image sensor design is presented, based on the receptive field paradigm. The receptive field circuits consist of several photoreceptors connected by preprocessing circuitry, able to perform contrast enhancement and Gaussian smoothing, as well as more specialized tasks such as edge detection and motion detection. Built-in redundancy is provided by having several photoreceptors within a given cell. Receptive field circuits can be configured according to the task required, and very low power consumption makes them good candidates for wafer-scale integration (WSI) implementation.

This thesis examines the application of very-low-power current-mode analog processing circuitry to perform the local processing and detection of image primitives, in conjunction with photodiodes which produce currents from 10pA to 1$\mu$A. In addition, by using a specially-designed voltage-controlled oscillator, the output signal of each receptive field is converted to a waveform with an output frequency range of up to 5 decades, thus eliminating the need for on-chip A/D converters, while at the same time providing high noise immunity.

The individual components of this sensor design have all been fabricated in 3$\mu$m and 5$\mu$m digital CMOS processes, and have been shown to work acceptably over the expected operational range of the photodetectors. A typical cell consisting of 5 detectors, current mirrors, and VCO has a power consumption of only 60$\mu$W with a 3 volt supply.

## Dedication

in memory of

#### Jean Ann Grigoleit 1939 - 1977

- a wife and mother who, by chance of an early death, was prevented from seeing her son grow to become a man.

## Acknowledgements

I would like to express my thanks to the following people:

**Dr. Marek Syrzycki**, for his supervision and guidance, his constructive criticisms and suggestions, and for his financial support for the duration of this project,

**Dr. Glenn Chapman**, for his advice and fruitful discussions on the laser measurement setup and principles of the WSI LARS approach, and

**Dr. Andrew Rawicz**, for providing the initial motivating concept of a silicon implementation of the vertebrate retina.

I would also like to thank my family, Dad and Karen, who have always been there for me.

This thesis was prepared on a Macintosh Plus computer, using MicroSoft Word 4.0, Cricket Graph, SuperPaint, and StatView software.

## Table of Contents

| Section | Title                                        |
|---------|----------------------------------------------|
|         | Approval                                     |
|         | Abstract                                     |
|         | Dedication                                   |
|         | Acknowledgements                             |
|         | Table of Contents                            |
|         | List of Figures                              |
|         | List of Tables                               |
| 1.      | Introduction                                 |
| 1.1.    | Previous Work                                |
| 1.2.    | New Approach                                 |
| 1.3.    | Outline of this work                         |
| 2.      | The Receptive Field Model                    |
| 2.1.    | Organic Receptive Fields                     |
| 2.2.    | Applications to Computational Vision         |
| 2.2.1.  | Low-level Image Processing                   |
| 2.2.2.  | Marr's Size-Tuned Filters                    |
| 2.2.3.  | Image Encoding                               |
| 2.2.4.  | Optical Flow                                 |
| 2.2.5.  | Summary                                      |
| 2.3.    | Microelectronic Receptive Field              |
| 3.      | Image Sensor Design                          |
| 3.1.    | Photodetectors                               |
| 3.1.1.  | Photodiodes                                  |
| 3.1.2.  | Vertical Bipolar NPN Phototransistor         |
| 3.1.3.  | Photo-MOSFET                                 |
| 3.1.4.  | Devices Chosen                               |
| 3.2.    | Processing Photodetector Outputs             |
| 3.2.1.  | Digital vs. Analog                           |
| 3.2.2.  | Current-Mode Methods                         |
| 3.2.3.  | Subthreshold Current Mirrors                 |
| 3.3.    | Transmission of Detector Signals             |
| 3.3.1.  | Conversion to Frequency                      |
| 3.3.2.  | Voltage Controlled Oscillator Designs        |
| 4.      | Implementation & Testing of Image Sensor Components |
| 4.1.    | Photodiodes & Phototransistors               |
| 4.1.1.  | Light Source Setup                           |
| 4.1.2.  | Bias Voltages                                |
| 4.1.3.  | Measured Photocurrents                       |
| 4.2.    | Current Mirrors                              |
| 4.2.1.  | Simple Current Mirrors                       |
| 4.2.2.  | Photodiode with Current Mirror               |
| 4.2.3.  | Cascading Current Mirrors                    |
| 4.2.4.  | Addition & Subtraction using Current Mirrors |
| 4.2.5.  | Summary of Current Mirrors                   |
| 4.3.    | Pseudo-DTL CMOS VCO Design                   |
| 4.3.1.  | Interface to VCO                             |
| 4.4.    | A Complete Receptive Field Cell              |
| 5.      | Application Issues                           |
| 5.1.    | Receptive Field Simulations                  |
| 5.1.1.  | Simulation Setup                             |
| 5.1.2.  | Simulation Results                           |
| 5.2.    | Power Consumption and WSI                    |
| 5.3.    | Potential Resistance to Fabrication Defects  |
| 5.4.    | Future Work                                  |
| 6.      | Conclusions                                  |
| 7.      | Bibliography                                 |

## List of Figures

| Figure      | Caption |
|-------------|---------|
| Figure 2.1  | 3x3 weighting function for image smoothing |
| Figure 2.2  | 5x5 weighting function for Gaussian smoothing |
| Figure 2.3  | Weighting functions for gradient operators |
| Figure 2.4  | $\nabla^2 G$ or DOG weighting function |
| Figure 2.5  | 3x3 DOG weighting function |
| Figure 2.6  | Weighting functions for (a) vertical bar and (b) horizontal edge detection |
| Figure 2.7  | HOP transform |
| Figure 2.8  | Block diagram of Microelectronic Receptive Field (MERF) components |
| Figure 3.1  | Cross-sectional view of photodiode structures in CMOS technology |
| Figure 3.2  | Photodiode configured in the photoconductive mode |
| Figure 3.3  | Cross-sectional view of vertical NPN phototransistor |
| Figure 3.4  | Cross-sectional view of the photo-MOSFET |
| Figure 3.5  | Circuit diagram of the photo-MOSFET detector [27] |
| Figure 3.6  | Current mirror circuits |
| Figure 3.7  | Current mirror circuits for (a) addition and (b) subtraction |
| Figure 3.8  | Circuit of an nMOS VCO [41] |
| Figure 3.9  | Transfer function for the nMOS VCO [41] |
| Figure 4.1  | Photomicrograph of large and small photodiodes |
| Figure 4.2  | Measurement setup with laser light source |
| Figure 4.3  | Measurement configuration for photodiodes |
| Figure 4.4  | Dark current of photodiodes for a range of reverse bias voltages |
| Figure 4.5  | Photocurrent of an 80 x 80µm photodiode for a range of reverse bias voltages at a constant illumination |
| Figure 4.6  | Output current of photodiodes with a red laser |
| Figure 4.7  | Output current of photodiodes with a blue laser |
| Figure 4.8  | Ratio of photocurrents from photodiodes of areas 160 x 160µm and 80 x 80µm under blue and red laser light |
| Figure 4.9  | Circuit layout of long-channel pMOS devices |
| Figure 4.10 | Current mirrors for smart vision sensors |
| Figure 4.11 | Current gain of wide-channel unity-gain nMOS current mirrors (Vdd=3V) |
| Figure 4.12 | Current gain of wide-channel unity-gain pMOS current mirrors (Vdd=3V) |
| Figure 4.13 | Gain of pMOS current mirror circuits with gain=2 and gain=3 |
| Figure 4.14 | Circuit diagram of photodiode with pMOS current mirror |
| Figure 4.15 | Cascading pMOS and nMOS current mirrors |
| Figure 4.16 | Current gain of cascaded wide-channel unity-gain pMOS and nMOS current mirrors (Vdd=3V) |
| Figure 4.17 | Current mirror addition of two inputs in the 100nA range |
| Figure 4.18 | Current mirror subtraction in the 100nA range |
| Figure 4.19 | Summary of current mirror addition and subtraction with 2 inputs in the 100nA range |
| Figure 4.20 | Logic NAND gate designs using (a) bipolar DTL and (b) pseudo-DTL CMOS design |
| Figure 4.21 | Circuit diagram of the pseudo-DTL CMOS VCO |
| Figure 4.22 | Circuit layout of the pseudo-DTL CMOS VCO |
| Figure 4.23 | Photomicrograph of the pseudo-DTL CMOS ring oscillator |
| Figure 4.24 | VCO Frequency vs. Input Voltage (Vdd=3V) |
| Figure 4.25 | VCO Current Consumption vs. Output Frequency |
| Figure 4.26 | VCO Output Voltage vs. Output Frequency for 3- and 5-stage ring oscillators (Vdd=3V) |
| Figure 4.27 | P-channel pullup diode interface to VCO |
| Figure 4.28 | Output Voltage of pMOS pullup diodes of different aspect ratios for drain currents of 10pA to 10$\mu$A (for Vdd=3V) |
| Figure 4.29 | Circuit diagram of a complete MERF cell |
| Figure 4.30 | Calculated transfer function of the MERF cell |
| Figure 5.1  | Weighting functions for simulated receptive fields |
| Figure 5.2  | Cell layout for simulation of receptive field functions |
| Figure 5.3  | Response of 3x3 DOG receptive fields to an edge stimulus |
| Figure 5.4  | Response of 5x5 DOG receptive fields to an edge stimulus |
| Figure 5.5  | 3x3 DOG receptive field output for different contrast ratios |
| Figure 5.6  | 5x5 DOG receptive field output for different contrast ratios |
| Figure 5.7  | Adjacent edge detector receptive field outputs for edge stimulus |
| Figure 5.8  | Edge detector receptive field output for different contrast ratios |
| Figure 5.9  | Bar detector outputs for moving bar of width = 1.0 |
| Figure 5.10 | Bar detector receptive field output for bar of width = 1.0 at different contrast ratios |
| Figure 5.11 | Bar detector receptive field output for different bar widths |
| Figure 5.12 | 3x3 DOG receptive field output with 1 or 2 damaged detectors |

## List of Tables

| Table     | Caption |
|-----------|---------|
| Table 4.1 | Values of responsivity R for tested photodiodes under red and blue light |

## 1. Introduction

The last 10 years have seen significant advances in the area of machine vision, most notably in the design of improved hardware for image sensing. The development of image sensors has seen a push to incorporate preprocessing circuitry (to perform signal conditioning and image filtering) at the sensor level, in many cases integrating both the sensor and the processing circuitry on the same IC, thus creating a *smart image sensor*. The purpose of such designs is to off-load some of the rudimentary image processing tasks onto the sensor itself, reducing the need for extremely high data I/O rates for real-time image processing [1]. Application of such preprocessing to detector arrays is known as *focal plane* or *image plane processing*.

### 1.1. Previous Work

Early attempts at image plane processing were from a background of digital VLSI architecture, using blocks of arithmetic logic unit (ALU) functions in various configurations [2]. The standard approach to distributed processing would be to lay out a regular array or network of processing elements and local memory cells, in a single instruction/multiple data stream (SIMD) or multiple instruction/multiple data stream (MIMD) configuration. Allowing each processing element to be programmable would classify such a sensor as 'intelligent', although some versatility could be sacrificed by hard-wiring the functions, to create a 'smart' sensor. Both of these methods produce structures that perform relatively simple functions yet are far too large to be considered for use in dense single-die arrays of sensors. More recent efforts have turned to analog computation circuitry, in order to reduce the circuit layout size of each cell. The results obtained by some of the major researchers in the field are summarized below.

Carver Mead and others [3,4] have focussed their work on a hexagonal tiling of logarithmic photodiode detector cells. These designs use a unique *horizontal resistor* network of MOS transistors to connect adjacent cells. The circuit implements gain adaptation, low-pass spatio-temporal filtering, and (most recently) orientation selection. The MOS transistor network depends on a special bias circuit at each node, which consumes 50% of the total cell area, while the photodetector accounts for only about 25%. The maximum array size is 48 x 48 pixels.

Tremblay and Poussart [5,6] also used a hexagonally-tiled detector array with the addition of a custom chip set, including an analog filter/convolver and digital controller module, to provide a sensor system with gain adaptation, edge detection, and a convolution kernel of 49 pixels. This performs more low-level image processing than do Mead's devices, but not strictly at the sensor level, since the actual computation is performed in a set of circuits external to the sensor. The original design, implemented in a 3$\mu$m digital CMOS technology, contained a 64 x 64 array. At the time of this writing the design was being converted to 1.2$\mu$m technology, to achieve an array size of 200 x 200 pixels, and a detector pitch of 63$\mu$m.

Ginosar and Zeevi [7] used a local intensity averaging circuit to implement what they call *adaptive sensitivity*. This has the effect of compensating the gain of each pixel relative to its neighborhood, which dynamically adjusts the gain in a local area of an image. While this does not perform any traditional image processing function, it does allow for dynamic range compression of brightly and darkly lit areas of an image. The main purpose of this approach is to offer an improvement over the standard video camera, which has only global sensitivity adjustment.

Finally, Fossum [1] has concentrated on CCD photodetectors and a mix of analog and digital circuitry to create a sensor cell containing 4 detectors plus associated processing in an area of 360$\mu$m x 360$\mu$m. A 24 x 24 array of such cells (48 x 48 detectors) can perform image smoothing, thresholding, edge detection, and A/D conversion. This array is comparable in size to the one produced by Tremblay and Poussart, but performs all the processing within the one die. As in Mead's design, the photodetector accounts for only a small fraction (about 5%) of the cell area, while parallel data lines take up about 40% of the cell.
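As a rough check on the scale of this design, the figures quoted above imply the following dimensions. This is only a sketch: it assumes the cells are tiled edge to edge with no extra routing or pad area, and that the 5% figure refers to the combined area of the detectors in one cell. Neither assumption is stated in the source.

$$
24 \times 360\,\mu\text{m} = 8640\,\mu\text{m} \approx 8.6\,\text{mm per side}
$$

$$
A_{\text{cell}} = 360\,\mu\text{m} \times 360\,\mu\text{m} = 129\,600\,\mu\text{m}^2, \qquad
A_{\text{detector}} \approx 0.05 \times A_{\text{cell}} \approx 6500\,\mu\text{m}^2
$$

Under these assumptions roughly 95% of each cell is occupied by processing circuitry and data lines rather than by light-sensitive area.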

All these designs succeed in performing some basic image processing tasks at the sensor level. They all, however, face the same problems: limited area for each cell, and an upper limit on the die size. The more complex the processing tasks are, the larger the cell becomes, decreasing the sensor density and limiting the resolution of the sensor. Even the design by Tremblay [5], which does not include processing circuitry *per se* at the photodetector level, requires circuitry to gate the detector output to three separate axes. In addition, those methods that perform A/D conversion at the cell level, in order to speed data throughput, pay a further space penalty for the area consumed by the data buses.

In short, the amount of processing currently available at the sensor level is still limited to simple filtering and gain adjustment, and only on arrays of fewer than 100 x 100 detectors. Implementation concerns place a barrier to ever-increasing size and complexity, although current trends in die size and minimum feature size may allow single-die array sizes of up to 512 x 512 in the near future. All of the designs previously mentioned treat this problem as a TV camera sensor with some image processing, using uniform arrays of sensor cells on a single die.

The ultimate goal of smart image sensor design is to reproduce, as much as possible, the function of the vertebrate retina in the form of a large-area image sensor suitable for machine vision applications. This ideal image sensor would include local image processing, and built-in redundancy to overcome defective devices. The main obstacles to achieving this goal are (a) the limited area of a single die, and (b) the vulnerability to fabrication defects encountered by regular arrays of sensors.

### 1.2. New Approach

In contrast to the single-die approach to circuit design, the advent of wafer-scale integration (WSI) offers new possibilities for image sensor design. Techniques exist that allow the interconnections between dies on a wafer to be configured after fabrication, to use only working cells [8], thus opening up virtually the entire area of a wafer for circuit layout. One can now think in terms of architectures that use 100 times the number of devices as on a single die. This new environment thus changes one of the basic assumptions governing sensor array design - the use of dense, regular arrays of cells. The large contiguous area available on a wafer would allow photodetectors to be laid out in non-regular arrays of varying density, much the same way that photoreceptors are distributed in the vertebrate retina to achieve foveal and peripheral vision.

Several other factors also change in a WSI environment. Because of the expanded number of devices available, the power consumption of each cell must be kept to a minimum. Low yield factors are no longer as critical, as the defective cells can be wired around using techniques of the Large Area Reconstructurable Systems (LARS) approach [8]. The large number of sensors and the distances the weak signals must travel across a wafer also lead to the need for noise-free communication and possible encoding of the detector outputs.

In order to take advantage of the WSI environment for image sensor design, a new sensor architecture based on the structure of biological vision systems is proposed. This new paradigm for smart sensor architecture consists of three main concepts: built-in redundancy to compensate for fabrication defects; compression and encoding of image primitives to reduce the volume of data produced; and frequency-domain transmission of the sensor output to ensure reliable communication of the signals.

Central to this architecture is the *receptive field*. It is known from studies of vertebrate vision systems that early processing and encoding of image information occurs at the retinal level, through the use of specialized structures containing photoreceptors and other cells [9,10]. These groupings of photoreceptors, connected by horizontal and bipolar cells, perform basic image smoothing and encoding primitives by virtue of their size, placement, and interconnection. Such groups of photoreceptors and their connecting cells are called receptive fields. The receptive field model has already been successfully applied to computational vision [11] and image encoding [12].

The receptive field model has several advantages for implementation, as it does not require particularly regular layouts, nor does it require that every device work in order to function. This makes it ideal for use in WSI, where a vast area is available for circuit layout, but where 100% functionality can never be guaranteed because of processing defects.

This new model for sensor design introduces several new design constraints. As always, the processing circuitry needs to be compact, but analog methods can be used if a degree of numerical precision can be sacrificed in favour of circuit layout area. The need to produce circuits that use an absolutely minimal amount of current forces the operation of MOS transistors into the subthreshold region. This effect would have to be examined, to verify that all processing circuits still work as intended.

### 1.3. Outline of this work

In order to prove the practicality of the receptive field model, certain basic building-block circuits must first be realized. The goal of this thesis will be to provide the design of extremely low-power circuits, patterned after biological models, that can sense, process, and encode image data in non-regular layouts at the sensor level. These component cells could then serve as the building blocks of a future smart vision sensor, one with an architecture based on the receptive field model of the vertebrate retina. With such an approach it is possible to incorporate image sensing, low-level filtering, data compression and encoding in the sensor itself, thus adding a level of data complexity to machine vision sensors. While the final design of a WSI vision sensor is beyond the scope of this work, it will be shown that such building-block sensor cells are ideally suited for implementation in digital CMOS WSI fabrication.

The remainder of this work is divided into five chapters. Ch. 2 discusses the fundamentals of early vision and image processing, and introduces the concept of the receptive field model and how it relates to the design of a smart image sensor. Ch. 3 discusses the nature of photodetectors, analog processing circuitry, and the conversion of detector signals to frequency in the context of digital CMOS technology. Ch. 4 details the implementation and testing of the photodetector and preprocessing circuits in digital CMOS technology, including experimental results, and describes the design and performance of a novel voltage-controlled oscillator, which is used to encode the current-domain photodetector outputs into a frequency-domain signal. Ch. 5 discusses how the design and performance of the component cells relate to the design issues of a WSI image sensor. Ch. 6 concludes with a summary of all results.

## 2. The Receptive Field Model

This chapter first presents the receptive field model in two contexts - the structure of the vertebrate retina, and the algorithms of computational vision. The hardware implementation of these algorithms applied to smart sensor design in a structure similar to that of the organic receptive field will then lead to a third context - the IC implementation of the receptive field model in the Microelectronic Receptive Field (MERF).

### 2.1. Organic Receptive Fields

The ultimate example of a smart image sensor can be found in biological vision systems, in particular the vertebrate retina. This enormously complex organ, adapted for the special needs of every different creature, contains a wealth of image detection, low-level processing and image encoding mechanisms, and remains the standard by which man-made image sensors are measured.

In contrast to the simple video camera approach to machine vision, organic vision systems show an amazingly integrated structure, allowing them to perform well in spite of such factors as damaged components, cell response times that are comparatively slow, and transmission of enormous amounts of data over long pathways to the visual cortex of the brain. From the simplest vertebrates to humans, several key concepts are used in all organic vision systems to provide effective and versatile performance: photoreceptor distribution optimized for detail (center) and motion detection (peripheral), cellular interconnections within the retina providing data reduction and compensation, and reliable transmission of the vast amounts of image data along the optic nerve to the brain.

Consider the human retina, for example, which contains some 100 million photoreceptors. These are distributed nonuniformly across the optic plane, with the highest concentration in the fovea, or center of vision [13]. This nonuniform distribution results in a varying level of spatial resolution, more in the center and less towards the edges, allowing a reduction in the amount of information gathered from the entire field of vision. The individual photoreceptor outputs pass through several layers of interconnected cells, which perform some simple but powerful signal processing in a massively parallel fashion. The final output of the retina is then encoded in the frequency domain as pulse streams and transmitted along the optic nerve, which contains some 1 million nerve fibers, to the visual cortex of the brain. Note that an overall data reduction of approximately 100:1 takes place in the retina [10].

Much of the performance of the vertebrate retina is a direct result of the structure and interconnection of the photoreceptors and connecting cells, forming the biological receptive field model. The concept of the receptive field first arose out of studies of vertebrate vision, beginning in the 1960's and continuing to the present day [14,15]. Through tests on lab animals and human subjects using moving bar patterns, these studies have concluded that the individual photoreceptor cells in the retina are locally connected via horizontal cells, resulting in regions on the retina that are particularly sensitive to such features as moving bar and line patterns of differing speed and orientation. In addition, the phenomenon in human vision known as Mach bands, which is responsible for 'phantom' dark patches between sharp dark edges, has long been known to be produced by local gain compensation mechanisms in the retina [16].

All this evidence supports a modeling of the retina based on the distribution of locally-connected groups of photoreceptors, with differing size, function, orientation, and density, across the surface of the retina. By having these receptive field cells of various size and orientation distributed across the retina, the entire field of vision can be properly monitored, and some image primitives such as edges, bars, and lines can be encoded within the retina by a single receptive field output.

### 2.2. Applications to Computational Vision

In the field of computational vision the receptive field model has been used to construct efficient image encodings, extract motion from a sequence of digital images, and provide a local model for image preprocessing. All of these point to the usefulness of the receptive field model in machine vision, and in particular show the simple mechanisms by which image sensors can be constructed so as to incorporate these kinds of functions. Beginning with low-level image processing, this section will examine the ways that the receptive field model has found its way from biological studies into computational vision.

#### 2.2.1. Low-level Image Processing

As mentioned before, the conventional TV camera sensor does not contain any signal processing, so that any noise or offsets in the raw image data due to low contrast, uneven lighting, or detector variation must later be removed by digital image processing. Many of these low-level image processing algorithms operate on small windows of the image, making them compatible with the receptive field model. Several algorithms, such as those detailed in [17], are commonly used to enhance digital images, and are briefly described below.

Perhaps the simplest of all enhancement operations, *histogram enhancement* attempts to improve the range of intensities in an image by redistributing the profile of intensity values. A sensed image may contain an unusually narrow distribution of intensities, resulting in an image that looks flat and grey, obscuring almost all detail. A simple remapping of the intensities that spreads out the narrow cluster into a broader range is often enough to bring out the latent detail. Because this remapping is applied to a single pixel at a time, it reduces to a simple table lookup. However, it also requires knowledge of the entire image, and is thus better suited for use on static images.
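The table-lookup nature of this remapping can be illustrated with a minimal sketch. It assumes a simple linear stretch of the occupied intensity range over 256 grey levels; the function names `build_stretch_lut` and `apply_lut` are hypothetical and chosen only for illustration, and other remappings (such as histogram equalization) would fill the table differently.

```python
def build_stretch_lut(pixels, levels=256):
    """Return a lookup table that stretches [min, max] of `pixels` to [0, levels-1]."""
    lo, hi = min(pixels), max(pixels)      # needs knowledge of the whole image
    if hi == lo:
        return list(range(levels))          # perfectly flat image: identity mapping
    scale = (levels - 1) / (hi - lo)
    # Intensities outside [lo, hi] do not occur in this image, so clamp them.
    return [round((min(max(v, lo), hi) - lo) * scale) for v in range(levels)]


def apply_lut(pixels, lut):
    """Remap every pixel independently with a single table lookup."""
    return [lut[p] for p in pixels]


flat_grey = [100, 110, 120, 130, 140]       # narrow cluster of intensities
lut = build_stretch_lut(flat_grey)
stretched = apply_lut(flat_grey, lut)
print(stretched)
```

Building the table requires the global minimum and maximum, which is the whole-image dependence noted above, while applying it is a purely per-pixel operation.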

*Smoothing* tries to reduce the amount of change in intensity between adjacent pixels. This may be desirable for reducing speckle noise or *snow*, which is usually caused by transmission errors, but which also might arise from a faulty detector cell. Smoothing of an image can be achieved in the spatial domain by performing a neighborhood averaging on a small window of the image, usually 3x3 or 5x5. This works by replacing the value of the center pixel by the average intensity within the window. In the case of a 3x3 window, each pixel has the weight  $w_i=1/9$ . The larger the window, the more pronounced will be the smoothing. Figure 2.1 shows an example of this type of weighting function, with the normalizing scale factor of 1/9 being understood.

Figure 2.1 3x3 weighting function for image smoothing.
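As a hedged illustration of neighborhood averaging, the following sketch applies a 3x3 window of unit weights with a normalizing factor of 1/9 to a small test image containing a single bright speckle. The helper `apply_window` is a hypothetical name, and leaving the border pixels unchanged is an assumption made only to keep the example short.

```python
def apply_window(image, weights, norm=1.0):
    """Weighted sum over a window centred on each interior pixel.

    Border pixels are copied unchanged (an assumption of this sketch).
    """
    h, w = len(image), len(image[0])
    kh, kw = len(weights), len(weights[0])
    oy, ox = kh // 2, kw // 2
    out = [row[:] for row in image]
    for y in range(oy, h - oy):
        for x in range(ox, w - ox):
            acc = 0.0
            for j in range(kh):
                for i in range(kw):
                    acc += weights[j][i] * image[y + j - oy][x + i - ox]
            out[y][x] = acc / norm
    return out


ones_3x3 = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
img = [[10] * 5 for _ in range(5)]
img[2][2] = 100                              # a single speckle on a flat field
smoothed = apply_window(img, ones_3x3, norm=9)
print(smoothed[2][2])                        # speckle is pulled toward its neighbors
```

The speckle is reduced from 100 to the window average of 20, but its energy is also spread into the eight surrounding pixels.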

One possible criticism of this method is that it tends to excessively blur edges and sharp details [17]. Instead of averaging, the new pixel value may be chosen as the *median* value within the window, thus giving a median filter. This filter is much more discriminating, by outright rejecting pixels that fall outside an acceptable limit, instead of incorporating their values into an average. This algorithmic method requires a sort of the neighborhood pixel values, and cannot be expressed as a simple weighting function.
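A minimal sketch of a 3x3 median filter is given below, assuming that border pixels are left unchanged. The function name `median_filter_3x3` is hypothetical. The example shows the two properties described above: an isolated speckle is rejected outright, while a sharp step edge passes through unblurred.

```python
from statistics import median


def median_filter_3x3(image):
    """Replace each interior pixel by the median of its 3x3 neighborhood."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]          # borders are copied unchanged
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [image[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = median(window)       # implies a sort of nine values
    return out


speckle = [[10] * 5 for _ in range(5)]
speckle[2][2] = 100
step = [[0, 0, 0, 100, 100] for _ in range(5)]

despeckled = median_filter_3x3(speckle)
filtered_step = median_filter_3x3(step)
print(despeckled[2][2], filtered_step[2])
```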

Another weighting function method of achieving image smoothing is to convolve the image with a 2-dimensional Gaussian weighting function, given by

$$g(x,y) = \exp\left[-\frac{x^2 + y^2}{2\sigma^2}\right] \tag{2.1}$$

where x and y are the pixel coordinates from the origin, and  $\sigma$  is the standard deviation. A discrete, unnormalized form of this function is illustrated by the weighting function shown in Fig. 2.2.

$$\begin{bmatrix} 0 & 1 & 1 & 1 & 0 \\ 1 & 2 & 2 & 2 & 1 \\ 1 & 2 & 4 & 2 & 1 \\ 1 & 2 & 2 & 2 & 1 \\ 0 & 1 & 1 & 1 & 0 \end{bmatrix}$$

Figure 2.2 5x5 weighting function for Gaussian smoothing.

This low-pass filter, and other ones like it, attenuate the high-frequency components in the Fourier transform of an image. If the size of the  $\sigma$  in the Gaussian function is chosen carefully, the blurring effect in the edges and sharp details can be minimized, while still smoothing the speckle noise.
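The effect of $\sigma$ on the weights can be explored with a short sketch that samples a Gaussian on a discrete window and normalizes it. It assumes the standard form $\exp[-(x^2+y^2)/2\sigma^2]$; the function name `gaussian_kernel` is hypothetical. The integer weights of Fig. 2.2 are included for comparison, with their sum giving the implied normalizing factor.

```python
import math


def gaussian_kernel(size, sigma):
    """Sample exp[-(x^2 + y^2) / (2 sigma^2)] on a size x size grid and normalize."""
    half = size // 2
    raw = [[math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            for x in range(-half, half + 1)]
           for y in range(-half, half + 1)]
    total = sum(sum(row) for row in raw)
    return [[v / total for v in row] for row in raw]


# Integer weights as printed in Fig. 2.2 (unnormalized).
FIG_2_2 = [
    [0, 1, 1, 1, 0],
    [1, 2, 2, 2, 1],
    [1, 2, 4, 2, 1],
    [1, 2, 2, 2, 1],
    [0, 1, 1, 1, 0],
]

kernel = gaussian_kernel(5, 1.0)
fig_2_2_sum = sum(sum(row) for row in FIG_2_2)   # normalizing factor is 1/32
print(fig_2_2_sum, round(kernel[2][2], 4))
```

A smaller $\sigma$ concentrates the weight at the center pixel and blurs less; a larger $\sigma$ approaches the uniform averaging window.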

Sharpening can be achieved by enhancing, rather than attenuating, the high frequency components of an image by convolving the image with an *inverted* Gaussian. This has the effect of highlighting areas in the image where intensities show a marked change, such as at edges. Such an operation might be useful for an image that had low contrast, or in an extreme case, for extracting the edges from an image. The simplest form of this function is the gradient operator. The purpose of this operator is to produce the magnitude of the gradient of the image intensity function f(x,y) between two given pixels. The gradient function,

$$G[f(\mathbf{x},\mathbf{y})] = \sqrt{(\partial f/\partial \mathbf{x})^2 + (\partial f/\partial \mathbf{y})^2}$$
(2.2)

can be approximated with a difference operator, when working on digital images. Two such possible functions are given by

$$G[f(x,y)] = \sqrt{[f(x,y) - f(x+1,y)]^2 + [f(x,y) - f(x,y+1)]^2}$$
(2.3)

$$G[f(x,y)] = \sqrt{[f(x,y) - f(x+1,y+1)]^2 + [f(x+1,y) - f(x,y+1)]^2}$$
(2.4)

Eqn (2.3) operates on pixels that lie directly below and beside, while eqn (2.4), also known as Roberts' gradient [18], operates on cross-differences. Notice that both these difference operators consider only immediate neighboring pixels. The weighting functions for various gradients are shown in Fig. 2.3. Notice that the isotropic Laplacian, used on a uniform intensity field, will reduce all values to zero.
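The two difference operators of eqns (2.3) and (2.4) can be written directly as code. In this sketch the image is assumed to be stored as a list of rows, so that $f(x,y)$ corresponds to `f[y][x]`; the function names are hypothetical.

```python
import math


def simple_gradient(f, x, y):
    """Eqn (2.3): differences with the pixels directly beside and below."""
    return math.hypot(f[y][x] - f[y][x + 1], f[y][x] - f[y + 1][x])


def cross_gradient(f, x, y):
    """Eqn (2.4): cross-differences over a 2x2 neighborhood."""
    return math.hypot(f[y][x] - f[y + 1][x + 1], f[y][x + 1] - f[y + 1][x])


uniform = [[7, 7, 7], [7, 7, 7]]
vertical_step = [[0, 0, 100, 100], [0, 0, 100, 100]]

print(simple_gradient(uniform, 0, 0), cross_gradient(uniform, 0, 0))
print(simple_gradient(vertical_step, 1, 0), cross_gradient(vertical_step, 1, 0))
```

Both operators return zero on a uniform field. On the vertical step, the simple difference returns the step height, while the cross-difference returns the step height multiplied by $\sqrt{2}$, since both diagonal differences straddle the edge.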

It can be seen from the descriptions of these functions that preprocessing can be viewed as a convolution of the image with a particular weighting function, with the weights varying according to the purpose. Moreover, the window size of the convolution filter is typically very small, posing the possibility of a highly parallel local computation architecture. Indeed, this local processing has been implemented by several researchers, as outlined in Ch. 1. These operations perform mostly just image enhancement. However, similar functions can be used in the same way to detect image primitives, as illustrated in Sec. 2.2.2.

| $\left[\begin{array}{rr} +1 & -1 \\ -1 & +1 \end{array}\right]$ | $\left[\begin{array}{rrr} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{array}\right]$ | (not legible) |
|------------------------------------------------------------------|-------------------------------------------------------------------------------------|---------------|
| (a)                                                              | (b)                                                                                 | (c)           |

**Figure 2.3** Weighting functions for gradient operators: (a) simple difference, (b) 2-D isotropic Laplacian, and (c) edge gradient.

#### 2.2.2. Marr's Size-Tuned Filters

Marr [9] has also used the receptive field model as a basis for his analysis of image representation. He has identified several basic functions performed by receptive fields found in the vertebrate retina, and has applied these to the extraction of image primitives by the use of *size-tuned filters* or *channels*. It has been theorized that the size, orientation, and weighting of these receptive field channels is responsible for the detection and higher-level encoding of images. It may be possible to use a similar set of channels in the design of a smart image sensor, producing an image sensor that performs the encoding at the detector level. A brief description of these receptive field functions is given below.

The *Difference of Gaussians* (DOG) function is used to describe the on-center/off-surround receptive field function of the vertebrate retina, and is actually an engineering approximation to a more complicated gradient function used to extract edges from a digital image [9]. The second-order Laplacian operator, given by

$$\nabla^2 = \left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}\right) \tag{2.5}$$

is often used to detect zero-crossings. In order to apply this operator to intensity changes at different spatial frequencies, it can be combined with a Gaussian function, as given in eqn (2.1). The resulting filter function, tuned to give a maximum output for intensity swings at a given scale, is called the  $\nabla^2 G$  operator, and is given by

$$\nabla^2 G(r) = \frac{-1}{\pi \sigma^4} \left[ 1 - \frac{r^2}{2\sigma^2} \right] \exp\left[ \frac{-r^2}{2\sigma^2} \right] \tag{2.6}$$

where r is the radius relative to the pixel in question, and  $\sigma$  is the standard deviation. This circularly symmetric function resembles a Mexican hat in shape, and is shown in Fig. 2.4. The term Difference of Gaussians arises from the fact that this numerically complex  $\nabla^2 G$  function is virtually identical to the difference between a broad negative Gaussian and a narrow positive one. The equivalence is best when the ratio of the space constants is 1:1.6 [9].

Marr argued that this DOG function forms the basis of the size-tuned filters believed to be responsible for the early detection of edge, line, and bar features in human vision. These feature-extraction mechanisms form the basis for Marr's model of early vision.
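The shapes of the two functions can be compared numerically with a short sketch. It assumes the standard form of the $\nabla^2 G$ operator, with the bracketed term $1 - r^2/2\sigma^2$, and models the DOG as a narrow unit-volume Gaussian minus a broad one with a space constant 1.6 times larger. The function names are hypothetical, and the unit-volume normalization of each Gaussian is an assumption of this sketch rather than something stated in the text.

```python
import math


def laplacian_of_gaussian(r, sigma):
    """The del-squared-G operator evaluated at radius r."""
    s2 = sigma * sigma
    return (-1.0 / (math.pi * s2 * s2)) * (1.0 - r * r / (2.0 * s2)) \
        * math.exp(-r * r / (2.0 * s2))


def difference_of_gaussians(r, sigma_narrow, ratio=1.6):
    """Narrow positive Gaussian minus a broad one (space constants 1 : ratio)."""
    def unit_gaussian(s):
        return math.exp(-r * r / (2.0 * s * s)) / (2.0 * math.pi * s * s)
    return unit_gaussian(sigma_narrow) - unit_gaussian(ratio * sigma_narrow)


for r in (0.0, 1.0, 2.0, 3.0):
    print(r, laplacian_of_gaussian(r, 1.0), difference_of_gaussians(r, 1.0))
```

With the sign convention of the operator as written, $\nabla^2 G$ is negative at the center and positive in the surround, crossing zero at $r = \sqrt{2}\,\sigma$. The DOG as modelled here has the opposite polarity (positive center, negative surround), so the two profiles match in shape up to a sign and a scale factor.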



**Figure 2.4**  $\nabla^2 G$  or DOG weighting function.

From the point of view of sensor design, however, the purpose of this operation is twofold. First, it uses a local neighborhood of pixels to provide a local weighted average of intensity, for comparison with the pixel of interest. This local average is used as a negative weight against the central pixel, so as to reduce the effect of an unusually bright or dark (i.e. noisy) pixel. A second function of this operator is to enhance the contrast ratio at the boundary of two different intensity regions, by producing an output that is proportional to the difference in intensity between two adjacent regions. This effect is observed in the phenomenon of Mach bands in human vision [16]. Although the 1:1.6 space constant ratio is difficult to achieve with a discrete weighting function, an approximation to this is shown in Fig. 2.5. Notice how closely this resembles the Laplacian of Fig. 2.3b.

$$\left[\begin{array}{rrrr} 0 & -1 & 0 \\ -1 & 5 & -1 \\ 0 & -1 & 0 \end{array}\right]$$

Figure 2.5 3x3 DOG weighting function

This turns out to be one of the most useful receptive field functions, as it achieves both noise reduction *and* edge enhancement, functions that separately require different Gaussian weighting functions.
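The behaviour of the 3x3 weights of Fig. 2.5 can be checked on a small test image, as in the following sketch. The helper `apply_window` is a hypothetical name, and border pixels are left unchanged for brevity. The weights are taken as printed in Figs. 2.5 and 2.3b.

```python
def apply_window(image, weights):
    """Weighted sum over a 3x3 window at each interior pixel (borders copied)."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(weights[j][i] * image[y + j - 1][x + i - 1]
                            for j in range(3) for i in range(3))
    return out


DOG_3X3 = [[0, -1, 0], [-1, 5, -1], [0, -1, 0]]      # Fig. 2.5
LAPLACIAN = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]       # Fig. 2.3b
IDENTITY = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]

# The Fig. 2.5 weights equal the identity minus the Laplacian.
identity_minus_laplacian = [[IDENTITY[j][i] - LAPLACIAN[j][i] for i in range(3)]
                            for j in range(3)]

step = [[10, 10, 10, 50, 50, 50] for _ in range(3)]
enhanced = apply_window(step, DOG_3X3)
print(enhanced[1])
```

Because the weights sum to one, a uniform region is passed through unchanged. At the step, the output undershoots on the dark side (to -30) and overshoots on the bright side (to 90), which is the contrast enhancement associated with Mach bands.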

Another set of receptive field functions are the edge and bar detector functions. These are very similar to the difference operators discussed earlier. The purpose of these functions is to give a maximum output when an edge or bar occurs in the orientation determined by the weighting function. For all other inputs the output is less than optimal. With a proper threshold on the output, these filters can be made to act as simple feature detectors. The weighting functions for a sample of these detectors are shown in Fig. 2.6.

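| $\left[\begin{array}{rrr} 0 & 0.5 & 0 \\ -0.75 & 1 & -0.75 \\ -0.75 & 1 & -0.75 \\ 0 & 0.5 & 0 \end{array}\right]$ | $\left[\begin{array}{rrr} 0 & -0.5 & 0 \\ -1 & -0.75 & -1 \\ 1 & 0.75 & 1 \\ 0 & 0.5 & 0 \end{array}\right]$ |
|-------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------|
| (a)                                                                                                               | (b)                                                                                                         |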

**Figure 2.6** Weighting functions for (a) vertical bar and (b) horizontal edge detection.
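The thresholded use of these weighting functions can be sketched as follows. The weights are those of Fig. 2.6 as read from the figure, applied to 4x3 test patches; the threshold value of 2.0 and the function names are assumptions made purely for illustration.

```python
# Weights as read from Fig. 2.6 (four rows by three columns).
VERTICAL_BAR = [[0, 0.5, 0], [-0.75, 1, -0.75], [-0.75, 1, -0.75], [0, 0.5, 0]]
HORIZONTAL_EDGE = [[0, -0.5, 0], [-1, -0.75, -1], [1, 0.75, 1], [0, 0.5, 0]]


def response(patch, weights):
    """Weighted sum of a patch with a receptive field weighting function."""
    return sum(w * p
               for w_row, p_row in zip(weights, patch)
               for w, p in zip(w_row, p_row))


def detects(patch, weights, threshold=2.0):
    """Simple feature detector: output exceeds an (assumed) threshold."""
    return response(patch, weights) > threshold


bar_patch = [[0, 1, 0], [0, 1, 0], [0, 1, 0], [0, 1, 0]]       # bright vertical bar
edge_patch = [[0, 0, 0], [0, 0, 0], [1, 1, 1], [1, 1, 1]]      # dark above, bright below
uniform_patch = [[1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1]]

for name, patch in (("bar", bar_patch), ("edge", edge_patch), ("uniform", uniform_patch)):
    print(name, response(patch, VERTICAL_BAR), response(patch, HORIZONTAL_EDGE))
```

Each weighting function responds strongly only to its own feature: the bar patch gives 3.0 on the bar detector and 0 on the edge detector, the edge patch gives 3.25 on the edge detector and 0 on the bar detector, and the uniform patch gives 0 on both because each set of weights sums to zero.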

Using this set of receptive field functions, small groupings of detectors can be used to filter noise out of an image, and produce feature detection outputs. The exact layout of the individual receptive fields would be determined by their distance from the center of the sensor and the level of such detail required at that location. Indeed, just such a method has been used in an efficient image encoding scheme, described in the following section.

#### 2.2.3. Image Encoding

The receptive field model has been applied by Alan Watson to an efficient digital image encoding [12]. In this case, the receptive field was modelled as an orthogonal set of seven kernels, laid out in a hexagonal fashion, each of the seven functions (one center, six surround) tuned to give a maximum output for either lines or edges in one of three orientations. Fig. 2.7(a) shows the receptive field as a linear combination of 7 receptors, while the orthogonal set of kernels used to encode the oriented line and edge primitives are shown in Fig. 2.7(b). The resulting scheme was called the *Hexagonally-Oriented Pyramid* (HOP) encoding.



Figure 2.7 HOP transform: (a) a single receptive field cell and (b) orthogonal set of kernels (after [12]).

This set of kernels was applied in a recursive hexagonal fashion to the entire image. The output of this is a pyramid of coefficients, similar to the Laplacian pyramid encoding [19]. The resulting encoding is similarly efficient, requiring about 1 bit/pixel.

#### 2.2.4. Optical Flow

The receptive field model has also been applied successfully by Heeger [11] to the problem of extracting optical flow (i.e. motion tracking) from a sequence of digital images. Using both a Gaussian pyramid encoding of the image and a set of 12 Gabor energy functions, Heeger has been able to compute the optical flow of a sequence of images. This is done by choosing a set of ellipsoid-shaped filters, tuned to detect image gradients in one of 12 spatio-temporal orientations, and convolving them with the encoded images. The outputs of the filters over time show the direction and magnitude, to within 10%, of the velocity within the scene.

While many details have been overlooked, the key feature to observe is that the receptive field model, in the form of a local oriented energy function, has been used as the basis for this computational implementation, and shows that the receptive field model has general application to machine as well as biological vision systems.

#### 2.2.5. Summary

It can be seen from the preceding examination of low-level image processing functions and receptive field functions that they are all (in a computational sense) just a convolution of a weighting function with the original image.

Moreover, many of the low-level processing functions have direct counterparts in receptive field functions. As the discrete form of these functions is simply a linear function of the pixels within a receptive field, such functions require only a way of performing weighted addition and subtraction, in order to be implemented on a large scale at the sensor level. In this way, what appear to be simple weighting functions can actually be used to achieve sensor-level encoding of vision primitives. By including these weighting functions within the structure of an image sensor, it should be possible to use the architecture of the sensor itself to perform the preprocessing and image encoding algorithms discussed.

## 2.3. Microelectronic Receptive Field

Regular arrays of smart image sensors achieve only the equivalent of a TV camera with built-in low-level image processing. Due to their regular structure, these image sensors cannot provide any inherent image coding. An improvement on the current designs would be to use the positional layout of the detectors to provide encoding of image primitives at different scales and orientations, in addition to the existing noise filtering. The heart of the Microelectronic Receptive Field model is the use of several detectors in conjunction with some processing circuitry, to produce an output that is a function of more than one detector. In this way the sensor's structure is changed from merely a TV camera with some added noise compensation to a genuine *vision sensor*.

The added information comes at a price, and this is that many more detectors are required for this type of sensor than for a simple TV image sensor. Using receptive fields to encode image primitives would mean laying out receptive field circuits of differing size and orientation across the entire area of the sensor. Clearly, in order to provide the number of devices needed for such a sensor the single-die approach must be abandoned. It might be possible to use a multi-chip assembly constructed from separate cut dies, but this would leave gaps in the sensor plane between the individual dies. A better way would be to take advantage of the much larger circuit layout areas available with WSI techniques.

An image sensor constructed in a WSI environment would have some new design considerations. First, a suitable silicon sensor must be found, one that produces a large enough signal with a reasonable size and a minimum of support circuitry. Second, the processing circuits need to be as compact as possible. Digital processing requires conversion of the analog detector output to a digital form, and A/D conversion circuits are known to be space and power consuming. If some accuracy can be sacrificed, then analog processing methods can be used, avoiding the conversion circuitry altogether. A third consideration is power consumption. A single die can be cut and packaged so as to dissipate even a few watts of power safely, but it would be difficult for an entire wafer to dissipate a proportional amount of heat. Supplying power to the entire wafer and dissipating the resulting heat forces the use of micropower circuits. There is also the problem of communicating the vast number of detector outputs from the sensor, and ensuring that the signals do not get corrupted by noise in transmission. Conversion of the receptive field outputs to some other domain, such as frequency, would allow the large number of outputs to travel safely across a wafer and out along connecting wires without being lost in crosstalk and line losses, as is typical with analog signals. Finally, re-routing around defective devices and signal lines using a laser-link approach [8] is only feasible if the layout of the sensor is not strictly dependent on 100% utilization of the devices.

An interesting and useful result follows from the nonregular structure of the receptive field model. Because of this structure, and the fact that the receptive fields each use a number of detectors, this type of image sensor would be inherently less vulnerable to fabrication defects. By aggregating the detectors and including many inputs to each receptive field, this removes the dependence on any particular detector, so that a single defective device does not cripple the sensor array. This becomes a critical feature in larger sensors, as the probability of encountering fabrication defects increases with the size of the die.

All of the receptive field algorithms discussed, from low-level image processing to image encoding, can be reduced to a weighted summing function. This is not to say that a receptive field function could not also include nonlinear operations such as square and square root. Even other sensor-related functions, such as dark/light adaptation and internal sensor calibration, could be added. Implementing the receptive field functions described, however, would form the basis from which more complex algorithms could later be introduced as necessary.

In summary, using the internal structure and layout of a smart image sensor based on the receptive field model to perform the kinds of receptive field functions identified in the vertebrate retina would lead to a new class of image sensor, one that might better be called a true *vision sensor*. Implementing this sensor on a WSI scale would provide both the number of devices and the layout area needed for such a design.

As illustrated in Fig. 2.8, a smart image sensor must perform three separate tasks: sense, process, and transmit image data. The goal of this work will be to examine how these three objectives might be achieved using simple CMOS microelectronic structures, based on the concept of the receptive field, and using low-power analog current-mode circuits. The new smart image sensor circuits will be called Microelectronic Receptive Field (MERF) cells.




Figure 2.8 Block diagram of Microelectronic Receptive Field (MERF) components.

#### 3. Image Sensor Design

This chapter moves from the theoretical aspects of the receptive field model to a discussion of the underlying devices that are involved in sensing, processing, and communicating sensor signals, as a foundation to the circuit implementation of the Microelectronic Receptive Field model in CMOS VLSI technology.

## 3.1. Photodetectors

The first stage of an image sensor is the photodetector, which transduces optical power into an electrical signal. There are several silicon devices that produce a current from the quantum interaction between photons and silicon-based semiconductors [20]. Each has its own strengths and weaknesses, and a consideration of these issues will lead us to a choice of which device to use.

#### 3.1.1. Photodiodes

The simplest device available in silicon is the photodiode, which can be implemented in digital CMOS technology either between the n+ diffusion of an n-channel MOS transistor and the p-well, or between a p+ diffusion and the n-type substrate, as shown in Fig. 3.1.


Figure 3.1 Cross-sectional view of photodiode structures in CMOS technology.

Within a lattice of silicon atoms the top level electrons, which are normally bound in the valence band, can be excited by photon or thermal energy into the conduction band [21]. Unless there is an electric field present, these electron-hole pairs will simply recombine. All silicon photodetectors, including photodiodes, work on the principle of capturing these photogenerated electrons before they can recombine. By reverse biasing the p-n junction, any photogenerated electron-hole pairs that occur within and in the neighborhood of the depletion region are separated and appear as a reverse current.

The response of the photodiode is a function of wavelength, limited by both the bandgap energy of the semiconductor and the depth of the p-n junction below the surface. The maximum wavelength of light that can be absorbed is determined by the bandgap energy  $E_g$ , the energy required by a photon to excite a valence electron into conduction, and in silicon this upper limit is 1107nm [20]. The minimum wavelength that can be absorbed is determined by the depth of the depletion layer. If the p-n junction is too far below the surface, the photons of shorter wavelength are absorbed before they reach the depletion region, and therefore the photogenerated carriers simply recombine. For n+ diffusion depths on the order of 1µm the lower wavelength limit is between 300 and 400nm [20]. In addition, the protective layer of SiN coverglass absorbs the shorter wavelengths. Consequently, photodiodes manufactured in a digital CMOS technology usually have a peak sensitivity to red light of around 800nm, with half-power cutoff points at 400nm and 1000nm [22].
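The upper wavelength limit follows from the photon energy relation $E = hc/\lambda$, and can be checked with a short calculation. The sketch below assumes a room-temperature silicon bandgap of 1.12 eV, a value not stated in the text; the function name is hypothetical.

```python
PLANCK_EV_S = 4.135667696e-15       # Planck constant in eV*s
LIGHT_SPEED_M_S = 2.99792458e8      # speed of light in m/s
SILICON_BANDGAP_EV = 1.12           # assumed room-temperature value for silicon


def cutoff_wavelength_nm(bandgap_ev):
    """Longest wavelength whose photon energy still equals the bandgap."""
    return PLANCK_EV_S * LIGHT_SPEED_M_S / bandgap_ev * 1e9


cutoff_nm = cutoff_wavelength_nm(SILICON_BANDGAP_EV)
print(round(cutoff_nm))             # approximately 1107 nm
```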

Photodiodes are usually operated in the photoconductive mode, as shown in Fig. 3.2. With a constant reverse bias V and load resistor R, the output voltage is a function of the photocurrent  $I_p$  produced by the incident light intensity.



Figure 3.2 Photodiode configured in the photoconductive mode.

There will always be some reverse current even in the absence of light, owing to the thermal generation of carriers. An accurate expression for the *dark current density* J of a reverse-biased diode is a function of diffusion ( $J_{Od}$ ) and recombination ( $J_{Or}$ ) components [22] and is given by the expression

$$J = J_{Od} \exp\left[\frac{qV}{kT}\right] + J_{Or} \exp\left[\frac{qV}{nkT}\right]$$
(3.1)

where V is the reverse bias voltage, and n is the diode slope proportionality factor. Practical values given for  $J_{Od}$  and  $J_{Or}$  in [22] are  $5pA/cm^2$  and  $20nA/cm^2$ , respectively. The parameter  $J_{Or}$  is by far the larger of the two, and for a photodiode with a diameter of  $100\mu m$  it would have a value of about 2pA. The dark current is an exponential function of reverse bias, as seen in eqn (3.1), so in order to minimize the dark current p-n junction photodiodes are usually biased with a low voltage, on the order of a few volts [21].
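The figure of about 2pA can be reproduced by multiplying the recombination current density by the area of a circular detector, as in the following sketch. The current densities are those quoted from [22]; the function name is hypothetical.

```python
import math

J_0D_A_PER_CM2 = 5e-12      # diffusion component quoted from [22]
J_0R_A_PER_CM2 = 20e-9      # recombination component quoted from [22]


def current_for_circular_diode(diameter_um, density_a_per_cm2):
    """Current density multiplied by the area of a circular junction."""
    radius_cm = diameter_um * 1e-4 / 2.0
    return density_a_per_cm2 * math.pi * radius_cm ** 2


recombination_a = current_for_circular_diode(100.0, J_0R_A_PER_CM2)
diffusion_a = current_for_circular_diode(100.0, J_0D_A_PER_CM2)
print(recombination_a, diffusion_a)
```

For a diameter of 100µm the area is about $7.9 \times 10^{-5}$ cm², giving roughly 1.6pA for the recombination component, while the diffusion component is smaller by a factor of 4000.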

One practical measure of performance is the *responsivity* R of the detector, the amount of detector current per watt of incident optical radiation [22]. A related figure is the quantum efficiency  $\eta$ , which is defined as the number of excess carriers generated per incident photon. The quantum efficiency can be calculated theoretically for a given wavelength, and the measured value compared to this theoretical maximum.
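The two figures are linked by the standard relation $R = \eta q \lambda / hc$, which is not given explicitly in the text but follows from the definitions above. A minimal sketch, with hypothetical function names, is:

```python
ELECTRON_CHARGE_C = 1.602176634e-19
PLANCK_J_S = 6.62607015e-34
LIGHT_SPEED_M_S = 2.99792458e8


def responsivity_a_per_w(quantum_efficiency, wavelength_nm):
    """Detector current per watt of incident light at a given wavelength."""
    wavelength_m = wavelength_nm * 1e-9
    return quantum_efficiency * ELECTRON_CHARGE_C * wavelength_m \
        / (PLANCK_J_S * LIGHT_SPEED_M_S)


def quantum_efficiency(responsivity, wavelength_nm):
    """Invert the relation to recover efficiency from a measured responsivity."""
    return responsivity / responsivity_a_per_w(1.0, wavelength_nm)


ideal_800nm = responsivity_a_per_w(1.0, 800.0)
print(round(ideal_800nm, 3))        # upper bound at 800 nm, in A/W
```

At 800nm an ideal detector ($\eta = 1$) would produce about 0.645 A/W, which is the theoretical maximum against which a measured value can be compared.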

The technological process parameters are fixed, including the doping concentrations and diffusion depth, so that the only variable left to the designer is how big to make the detector. The detector area must be chosen large enough so that the lowest expected value of the photocurrent is much larger than the dark current, producing a response that varies linearly with light intensity over the largest possible range. Experiments with photodiodes [23] using the 3µm digital CMOS process available from the Canadian Microelectronics Corporation [24] found that the lower size limit was about 30µm x 30µm.

Silicon photodiodes have a simple structure and a good spectral response, and devices implemented in a digital CMOS process can generate a useful photocurrent with a modest size, making them good candidates for use in the Microelectronic Receptive Field. In this case, structures using the n+ diffusion within a p-well will be preferable for applications requiring a number of separate photodiodes, as this will offer better isolation between devices.

#### 3.1.2. Vertical Bipolar NPN Phototransistor

With a p-well CMOS process, the same structure that is used to produce an nMOS transistor can also be configured as a vertical bipolar transistor, as shown in Fig. 3.3 below.



Figure 3.3 Cross-sectional view of vertical NPN phototransistor.

The isolated p-well forms the base, the n+ diffusion is the emitter, and the substrate forms the collector. This is precisely the same structure used by Mead and Mahowald as the primary photodetector in their retina chip, producing 100 to 1000 electrons for every photon absorbed [4]. The same device has also been used in an optically coupled neural network [25] implemented in the 3 $\mu$ m digital CMOS process previously mentioned.

Normally in a bipolar transistor, the emitter diffusion is much smaller than is shown; however, the structure shown can double as both a photodiode and a vertical phototransistor.

Photogenerated carriers are produced by the same mechanism as in the photodiode, since the structure is the same, but in this case the carriers produced in the p-well become the base current of the transistor, and are amplified the same way as in a conventional bipolar transistor. The current gain  $\beta$  depends on the particular process parameters, but values of between 50 and 200 have been reported in a standard CMOS process [26]. Such a current gain would allow this device to produce a photocurrent two orders of magnitude larger than if it was configured as a photodiode, for the same level of light intensity.

The high gain of this device is also a potential problem, however, in that the device could easily produce currents that are too high. In an image sensor that uses several hundred thousand detectors, the current consumption would have to be kept to a minimum, perhaps under 1µA per detector. Thus the vertical bipolar phototransistor is probably better suited for use at lower light levels.
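The current budget can be made concrete with a short calculation. The sketch below assumes a 512 x 512 array as a representative case of several hundred thousand detectors, and uses the reported range of current gain; the function name is hypothetical.

```python
BUDGET_PER_DETECTOR_A = 1e-6        # suggested upper limit per detector
ARRAY_DETECTORS = 512 * 512         # assumed array size for illustration


def max_base_photocurrent(budget_a, beta):
    """Largest photogenerated base current that keeps the output within budget."""
    return budget_a / beta


total_supply_a = ARRAY_DETECTORS * BUDGET_PER_DETECTOR_A
for beta in (50, 200):
    print(beta, max_base_photocurrent(BUDGET_PER_DETECTOR_A, beta))
print(total_supply_a)
```

Even at the 1µA limit the full array would draw about 0.26 A, and with a gain of 200 that limit is reached for a base photocurrent of only 5nA, which illustrates why the device suits low light levels.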

#### 3.1.3. Photo-MOSFET

Yet another way to collect photogenerated carriers is utilized in a relatively new device developed by Chamberlain and Yee [27]. This device is reported to produce an output voltage swing of 1.0 to 7.0 volts, for a seven-decade change in light intensity. The circuit is shown in Fig. 3.4.

The photo-MOSFET consists of a short-channel nMOS transistor, with a large source diffusion acting as the light collector, which is shorted to the gate. The device works in the following way. The source acts as the n connection of a p-n junction, with the p-well being the other end. With the drain biased at some large voltage, any photogenerated electrons in the source are attracted to the channel and into the drain. This causes the voltage at the source to vary logarithmically with the generation of photocurrent, as in a typical photodiode. This process relies on two factors: a short channel, on the order of 3.5 to 4.5  $\mu$m, and a connection between the gate and the source, to ensure that the device operates in the subthreshold region.

Figure 3.4 Cross-sectional view of the photo-MOSFET.

The complete configuration of the detector circuit is shown in Fig. 3.5. A second nMOS transistor is used to translate the changing voltage at the first drain to a current through a load resistor. The voltage drop across the second transistor serves as the output voltage.

This device has been shown to respond linearly for light intensities between 28.4 W/cm<sup>2</sup> and 2.84  $\mu$W/cm<sup>2</sup>, a range of 10<sup>7</sup>. Despite the impressive performance of these devices, practical large-scale implementations continue to suffer from the large currents through the load resistor, which can range up to several mA. In commercially available devices, the detector area is only  $14\mu m \times 14\mu m$  but the power consumption per pixel is on the order of 10mW [28]. To date, no arrays greater than 512 x 512 of these sensors have been fabricated.
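The scale of the power problem can be seen from a simple calculation using the figures quoted above. The sketch assumes a full 512 x 512 array with every pixel dissipating 10mW; the variable names are chosen only for illustration.

```python
import math

MAX_INTENSITY_W_PER_CM2 = 28.4
MIN_INTENSITY_W_PER_CM2 = 2.84e-6
POWER_PER_PIXEL_W = 10e-3           # order-of-magnitude figure quoted from [28]
ARRAY_PIXELS = 512 * 512            # largest array size mentioned

dynamic_range_decades = math.log10(MAX_INTENSITY_W_PER_CM2 / MIN_INTENSITY_W_PER_CM2)
array_power_w = ARRAY_PIXELS * POWER_PER_PIXEL_W

print(dynamic_range_decades, array_power_w)
```

The intensity range spans seven decades, but at 10mW per pixel a 512 x 512 array would dissipate on the order of 2.6kW, which is far beyond what a single die or wafer could safely handle.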



Figure 3.5 Circuit diagram of the photo-MOSFET detector [27].

#### 3.1.4. Devices Chosen

Photodiodes have several advantages over the other devices: they are very simple, and produce a useful current with modest size; because of their simple layout they are less sensitive to spot defects [29], which can cripple a large-area, regular layout image sensor; and perhaps most importantly, they have extremely low current consumption. Although the vertical bipolar phototransistor offers better sensitivity to low light levels, photodiodes were chosen for use in this work in order to test the lower limits of current consumption.

## 3.2. Processing Photodetector Outputs

Once the optical signal has been transduced by the photodetector it must be compared and combined with the output of neighboring detectors. There are two domains, digital or analog, in which to process sensor signals. A choice must be made as to which domain to perform the processing in, based on the nature of the sensor outputs, the complexity of the processing to be performed, and the inherent limitations of either method. This section will examine these issues and propose a way of processing the photodetector outputs that is both compact and sufficiently accurate.

#### 3.2.1. Digital vs. Analog

As mentioned in Ch. 1, the conventional approach to VLSI implementations of sensor-level processing is to first digitize the data, then perform digital processing using local processing elements. Performing the digitization at an early stage, and performing all operations on digital values certainly preserves more accuracy and precision than any other method, but it does so at quite a cost. Performing analog-to-digital conversion at every sensor consumes both circuit area and power. This high power dissipation problem has been cited by others [1, 30] as a major limiting factor. The data bus lines required to communicate the sensor data off-chip can consume up to one third of the chip area [1]. The processing elements themselves can take up far more area than the actual sensor, resulting in low sensor densities. The only way around these problems so far has been the use of Z-plane architectures, in which identical IC's of processing circuitry are sandwiched and bonded, at right angles, to the back of the sensor array IC [30]. Devices of this design are approaching array sizes of 256x256, and promise to offer a way of retaining digital processing with high sensor densities. However, for conventional 2-D sensor arrays, most researchers are abandoning digital processing methods in favour of analog methods.

Analog processing circuits offer the advantage of requiring much less space, and do not require the expensive A/D conversion at each sensor. However, there is a penalty in data precision, as analog signals are far more sensitive to noise and variation of process parameters than digital signals. Analog signals can also be degraded during transmission along long lines, especially in a WSI environment.

Within the realm of analog processing, the signals can be treated in either the voltage or current domain. Since the output of a photodetector (and in particular a photodiode) is a current, it would make sense to process the signals in the current domain. Current-mode processing methods would also eliminate the need for either A/D conversion circuitry or op amps for voltage addition and subtraction, both of which consume a lot of IC layout area.

### 3.2.2. Current-Mode Methods

An area of analog processing that has been gaining increasing popularity in recent years is current-mode circuit design [31]. MOS current-mode circuits have been used successfully in digital multiplier circuits [32] and have even been proposed for use in smart image sensor design [33], although no published results have yet appeared.

The basic idea behind current-mode circuit design is to convey signals in the current domain, instead of the usual voltage domain. This has several advantages: current signals have a much larger dynamic range than voltages, usually several decades as opposed to several volts; current signals are less susceptible to environmental noise; and the circuits required are small, and can be easily implemented in CMOS technology. By comparison, voltage-mode circuits fabricated in CMOS technology using minimum-sized devices can lead to large response differences from cell to cell, in some cases producing fixed offset differences that are roughly as large as the actual signal [3].

One of the chief advantages of current-mode processing is that the addition of signals is reduced to joining wires together, doing away with the resistors and op amps that are required for voltage addition. Examples of current-mode processing circuits include current mirrors, current conveyors, and transconductance amplifiers, the current-mode equivalent of the op amp. The most basic of these is the current mirror, which can be used for scaling (i.e. multiplication), copying, and sign reversal. Examples of current mirror circuits are shown in Fig. 3.6.

The current mirror works in the following manner. The input usually consists of a current source connected in series with one of the power rails. This causes a drain current to flow through the first device to the same power rail, and since the gates of the two devices are connected to the same potential, the output current will depend only on the ratio of the two aspect ratios. The scaling of a current is achieved by scaling the relative aspect ratios of the two transistors in the circuit, and currents can be copied n times by including n transistors in parallel, instead of the single one at the output. Both p-channel and n-channel versions of the current mirror can be used.
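The scaling and copying behavior described above can be illustrated with a minimal sketch of an ideal current mirror. It assumes perfectly matched devices and ignores channel-length modulation and subthreshold effects; the function name and its parameters are hypothetical and are not taken from the original work.

```python
def mirror_output(i_in, w_in, l_in, w_out, l_out, copies=1):
    """Ideal current-mirror model (assumption: matched devices, no
    channel-length modulation). The output current depends only on the
    ratio of the output aspect ratio to the input aspect ratio."""
    if min(w_in, l_in, w_out, l_out) <= 0:
        raise ValueError("device dimensions must be positive")
    gain = (w_out / l_out) / (w_in / l_in)
    # n identical output transistors in parallel give n copies of the current
    return [gain * i_in] * copies


# A W/L = 1/6 input device driving a W/L = 1/2 output device scales by 3.
print(mirror_output(2e-9, 1, 6, 1, 2))
# Three unit-sized output devices give three copies of the input current.
print(mirror_output(1e-9, 6, 6, 6, 6, copies=3))
```

In this idealized model the gain is set purely by geometry, which is the property the receptive field weighting relies on.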



Figure 3.6 Current mirror circuits: (a) scaling using p-channel devices, and (b) copying using n-channel devices.

Circuits for addition and subtraction of two currents using current mirrors are shown in Fig. 3.7. The n-channel and p-channel current mirrors can be cascaded because their input conductance is much larger than their output conductance [34]. Notice that the only difference between the addition and subtraction circuits is that an n-channel current mirror has been added to change the sign of one of the operands.



Figure 3.7 Current mirror circuits for (a) addition and (b) subtraction.

From the discussion of Ch. 2 it was found that the Microelectronic Receptive Field model requires certain mechanisms for implementation, such as a way to (a) scale currents, (b) reverse the sign of a current, (c) make several copies of the current output of a photodetector, and (d) sum different currents together. These mechanisms would allow the weighted summing of photodetector outputs, as well as the sharing of a single output by several adjacent receptive fields. Current mirrors can be used to perform all of these functions, and for this reason have been selected for implementing the local processing of the receptive field model. As the photogenerated currents feeding the current mirrors will most likely be less than 1µA, the current mirrors will be operating in the subthreshold region, and it must be determined whether the current mirrors will operate linearly in this region.
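The four mechanisms listed above combine into a weighted sum of photocurrents. The following is a minimal sketch of that operation with ideal mirrors; the centre-surround weights and the photocurrent values are hypothetical illustrations, not the weighting functions derived in Ch. 2.

```python
def receptive_field_output(photocurrents, weights):
    """Ideal weighted sum of photocurrents (assumption: perfect mirrors).
    |w| models a mirror's scaling factor, a negative w models the sign
    reversal by an extra mirror, and the sum models wiring the mirror
    outputs to a single node."""
    if len(photocurrents) != len(weights):
        raise ValueError("one weight is needed per photocurrent")
    return sum(w * i for w, i in zip(weights, photocurrents))


# Hypothetical centre-surround field: centre weighted +4, four neighbours -1.
weights = [4, -1, -1, -1, -1]
uniform = [100e-12] * 5                               # uniform illumination
spot = [100e-12, 10e-12, 10e-12, 10e-12, 10e-12]      # bright centre only
print(receptive_field_output(uniform, weights))       # close to zero
print(receptive_field_output(spot, weights))          # positive response
```

With these example weights a uniformly lit field produces essentially no output, while a bright centre produces a net positive current.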

### 3.2.3. Subthreshold Current Mirrors

MOS transistors in a digital CMOS process are not usually operated in their subthreshold region of operation intentionally. In an effort to reduce the power consumption of circuits such as those used for neural networks, many researchers are now beginning to explore the subthreshold area of device operation [35], and work is being done to develop consistent models for all three regions of operation - subthreshold, linear, and saturation [44].

These results are not limited to analog IC processes, but have found use in digital IC processes as well. Earlier work on analog circuits in digital CMOS showed that useful analog circuits could be realized in an IC process optimized for digital circuits [45]. More recently, several researchers have made use of unity-gain current mirrors operating in the subthreshold region, and have successfully applied these in neural network circuits implemented in a 3µm bulk CMOS process [34,35]. These unity-gain current mirrors used MOS transistors with dimensions of W x L = 6µm x 6µm, and were able to operate linearly on inputs from 5pA up to 1µA, a range of nearly 6 decades.

This range of currents corresponds quite well with the expected output of the photodiodes that are possible in a similar CMOS process, making it possible to use these current mirrors as the means of processing the photodetector outputs.

## 3.3. Transmission of Detector Signals

Once the detector signals have been sensed and processed, they must be communicated to the host computer, which may be some distance from the actual sensor. Using an entire wafer for an image sensor system poses some special problems for signal transmission: first, the sheer number of data lines that are involved; and second, the distance the data must travel along those transmission lines. Choosing to leave the detector signals in the analog current domain serves well for the preprocessing of the image sensor outputs, but there still remains the problem of analog signal degradation during transmission. This makes it necessary to convert the current-mode signals into a form more resistant to transmission errors, while at the same time preserving the wide dynamic range of these signals. Signal conditioning and encoding at the sensor level can take many forms, including serial, parallel, frequency, phase or pulse encoding [36], but perhaps the best of these methods is frequency encoding.

### 3.3.1. Conversion to Frequency

It is possible to encode sensor signals into the frequency domain using a circuit that has already been used extensively in the design of integrated sensors - the *voltage controlled oscillator* or VCO. The idea is to convert a continuous, linearly-varying current into a pulse stream of fixed amplitude, with a frequency that is proportional to the input current signal. Several examples exist of integrated sensors with frequency encoding, including temperature and pressure sensors [37,38].

Frequency encoding has several advantages [39]. The frequency-encoded signals are virtually immune to environmental noise, can be electrically isolated, or transmitted along other media such as optic fibers, and require only one wire per output. In addition, the pulse stream can be easily converted to a digital value for use by a digital computer. This leads to the one drawback with the VCO, and that is the need for conversion at the host computer. Simple timers and counters are all that is required for this step, and while this step might require a significant amount of extra circuitry, it is not necessary to include this within the sensor system itself, and is therefore not an issue with the Microelectronic Receptive Field sensor design. As a point of interest, the use of frequency encoding also parallels the way in which data is transmitted along the optic nerve from the eye to the brain [10].
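The host-side conversion with timers and counters can be sketched as follows. This is only an illustration under stated assumptions: the gate time, the pulse count, and the linear conversion constant `hz_per_amp` are hypothetical values, and a real VCO would need a measured calibration curve rather than a single constant.

```python
def frequency_from_count(pulse_count, gate_time_s):
    """Estimate the pulse-stream frequency from a counter value that was
    accumulated over a fixed timer interval (the gate time)."""
    if gate_time_s <= 0:
        raise ValueError("gate time must be positive")
    return pulse_count / gate_time_s


def current_from_frequency(freq_hz, hz_per_amp):
    """Recover the sensor current, assuming (hypothetically) that the output
    frequency is proportional to the input current."""
    return freq_hz / hz_per_amp


# Hypothetical reading: 1500 pulses counted during a 10 ms gate.
f = frequency_from_count(1500, 0.010)
i = current_from_frequency(f, hz_per_amp=1.5e14)   # assumed calibration
print(f"{f:.0f} Hz corresponds to {i:.2e} A")
```

The resolution of this scheme is one count per gate interval, so a longer gate time improves precision at low frequencies at the cost of a slower update rate.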

The only potential drawback with using frequency encoding is that the output frequency is a function of the incident light intensity, so that for one end of the input range (either very bright or very dark) the input may change faster than the ability of the frequency output to track this change. Use of a sufficiently high frequency, however, should be able to avoid this potential problem.

A discussion of current VCO circuits and their application to integrated sensors is presented in the following section.

### 3.3.2. Voltage Controlled Oscillator Designs

Simple VCO's can be made from any odd number of inverting stages, such as logic gate inverters, along with an input to control the propagation delay time of each stage. Both bipolar and MOS technologies have been used to implement these circuits, and some recent examples are discussed below.

A pressure sensor consisting of a single resistor and a 9-stage ring oscillator using $I^2L$ technology has been reported [40]. This circuit uses a pressure-sensitive resistor to supply a variable injection current to the ring oscillator stages. The output range of such an oscillator is some 2.5 decades of frequency, for an injection current also varying over 2.5 decades. The typical supply current is about 120µA, with 30µA considered to be a low level. Such an oscillator is used mainly for transducers that produce either a changing resistance or current.

Another pressure sensor using MOS ring oscillators has also been reported [38]. In this case the ring oscillator itself serves as the pressure sensor, with the transistors exhibiting a piezoresistive effect, thus causing a change in the frequency of oscillation. Pullup transistors with very long aspect ratios  $(20/200\mu m)$  were used in order to provide a sufficient propagation delay at each stage, resulting in a circuit layout area of some  $80,000\mu m^2$ . This 9-stage pMOS oscillator has an operating frequency of 150 kHz, with a supply voltage of V<sub>B</sub> = -20V and a supply current of  $100\mu A$ .

A sensor for temperature that provides a variable capacitance has also been used to control the frequency of a ring oscillator [37]. Since a temperature sensor does not need to be integrated in any kind of array there is much less importance on achieving optimum space efficiency. This device uses a ceramic capacitive sensor and a discrete timing resistor to control a CMOS 4047 oscillator IC, all integrated on an 8mm x 25mm substrate. The entire circuit assembly consumes 0.2mW of power from a 5V supply and provides a sensitivity of 20 Hz/°C over the range -25°C to +85°C.

All of these designs achieve reasonably small space and low power consumption, but are not suitable for use in high-density arrays where size, output frequency range, and power consumption must all be optimized. Since the photodetectors and processing circuitry can be implemented in a CMOS process, it would be desirable if a small, low-power ring oscillator could also be made using CMOS technology. The design that comes closest to conventional CMOS design is an nMOS ring oscillator similar to the MOS pressure sensor mentioned previously, and is described in [41]. This oscillator can be made with as few as 5 stages, and is shown in Fig. 3.8.



Figure 3.8 Circuit of an nMOS VCO [41].

The nMOS inverter uses a depletion mode transistor as a pullup resistor. An enhancement mode transistor is used as a voltage controlled resistor, in conjunction with a metal-polysilicon capacitor to give the variable propagation delay of each stage. The frequency produced by this 5-stage ring oscillator is given by the expression

$$f = \frac{1}{2nRC \ln\left(1 - \frac{V_{th}}{V_{dd}}\right)}$$
(3.2)

where  $V_{th}$  represents the switching threshold voltage of the nMOS inverter, n is the number of stages, and the resistance R is the combined effective resistance of the two nMOS transistors labelled R1 and R2. The approximate response of the nMOS VCO is shown in Fig. 3.9. Although this curve is basically nonlinear, it has a linear part over which the output frequency varies over three decades for a 0.5V change in the control voltage. It is this linear part of the curve that is used.
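A short numerical sketch may help in reading Eq. 3.2. Because the logarithm of a value between 0 and 1 is negative, the sketch below takes the magnitude of the logarithm so that the frequency comes out positive; that choice, the function name, and all component values are assumptions made for illustration and are not taken from [41].

```python
import math


def ring_vco_frequency(n_stages, r_ohm, c_farad, v_th, v_dd):
    """Evaluate Eq. 3.2 using the magnitude of ln(1 - Vth/Vdd).
    Taking the magnitude is an assumption made here so that the result
    is a positive frequency."""
    if not 0 < v_th < v_dd:
        raise ValueError("expected 0 < Vth < Vdd")
    log_term = abs(math.log(1.0 - v_th / v_dd))
    return 1.0 / (2.0 * n_stages * r_ohm * c_farad * log_term)


# Hypothetical values: 5 stages, C = 1 pF, Vth = 2.5 V, Vdd = 5 V.
for r in (1e5, 1e6, 1e7, 1e8):   # effective resistance set by the control voltage
    f = ring_vco_frequency(5, r, 1e-12, 2.5, 5.0)
    print(f"R = {r:.0e} ohm -> f = {f:.3e} Hz")
```

The frequency is inversely proportional to R, so a control voltage that changes the effective transistor resistance by three decades changes the output frequency by three decades as well.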



Figure 3.9 Transfer function for the nMOS VCO [41].

One of the main drawbacks of the nMOS VCO has to do with the amount of circuit layout area used by the capacitors. According to the circuit layout shown in [41] each of the capacitors in Fig. 3.8 accounts for 75% of the circuit area for a given inverter. The value of a CMOS capacitor is determined by the thickness of the oxide between the polysilicon and metal layers, and by its area. Useful values of capacitance are therefore obtained only with large areas. Both capacitors and resistors are expensive to make in terms of layout area. MOS transistors can be configured to provide a voltage-controlled resistance, as shown in the nMOS circuit of Fig. 3.8, but no such substitution can be made for capacitors.

For applications involving high density arrays of integrated sensors, none of the ring oscillators discussed would be small enough. What is needed for an image sensor is a VCO which has at least 3 or 4 decades of range in output frequency, a minimum of current consumption, and a compact layout size. Removing the need for capacitors entirely would offer the best chance for improving these performance factors. Sec. 4.3 presents the design of a new compact VCO, one that meets all of these requirements. Moreover, the new design has been implemented in a digital CMOS process, making it compatible with the photodetectors and processing circuitry previously discussed in this chapter.

# 4. Implementation & Testing of Image Sensor Components

Based on the design decisions discussed in Ch. 3, several component circuits were fabricated using both the 3µm digital CMOS process available through the CMC [24] and a Plessey 5µm digital CMOS gate array [42]. The purpose in implementing only component circuits and testing them separately was to prove only the basic feasibility of the Microelectronic Receptive Field (MERF) concept. It is enough to show that the building block circuits function properly, in order to use them in future complete MERF designs.

The measurement results and analysis of the three MERF component parts are described in the following sections, with a final assembly of these parts into a complete working cell.

## 4.1. Photodiodes & Phototransistors

The implementation of photodiodes in analog and digital CMOS technology has been exhaustively studied by others [4,6], so the main concern with regard to this research was to find the size of photodiode that would give an acceptable magnitude and range of photocurrent. Two sizes of n+/p-well photodiodes were fabricated following the design in Fig. 3.1, one with a diffusion area of 80 x 80µm and the other with a diffusion area of 160 x 160µm. A photomicrograph of these two devices is shown in Fig. 4.1. The substrate contact is shown in the upper right hand corner. These photodiodes can also be configured as phototransistors as described in Sec. 3.1.2.



Figure 4.1 Photomicrograph of large and small photodiodes.

The photocurrent produced by these devices in the presence of light from red and blue lasers was measured, while they were configured as photodiodes. The laser light offered a single frequency of light of sufficient power to produce over 4 decades of light intensity with neutral density filters. The measurement setup is discussed first, followed by an analysis of the measurement results.

### 4.1.1. Light Source Setup

Two gas lasers were used as light sources, one with a blue output of 442nm and one with a red output of 633nm. The output of each laser has a Gaussian distribution, making it difficult to obtain a uniform light source from it. Special optics do exist for collimating the non-uniform laser output into a uniform beam; however, a simpler though less efficient method can also be used. A small uniform area exists at the center of the output, and can be isolated using a pinhole of the right size. The laser is placed several meters away from the detector, in order to allow the beam to disperse. Then the light passes through a pinhole made in an aluminum foil screen. The actual detector is placed about 5cm behind the pinhole. This setup is shown in Fig. 4.2.

Figure 4.2 Measurement setup with laser light source.

A pinhole was made with a surgical needle in a piece of aluminum foil, with the edges of the hole flared to reduce light scattering. The laser was positioned far enough away to produce a spot of about 1cm<sup>2</sup> at the pinhole, and the light intensity at this point was measured with a Newport 840 power meter and a square 1.00cm<sup>2</sup> detector. The red and blue lasers had a total power output of 1.79mW and 3.55mW, respectively, with the edge of the spot having a power half of that at the center. The effective pinhole size was determined using a distant white light source (a halogen lamp) and by comparing the power meter reading with and without the pinhole. The resulting measurement showed the area of the pinhole to be 1.07mm<sup>2</sup>, with an uncertainty of 7%. This precise figure for the pinhole size was needed for later calculations of the power density and the photodiode responsivity. The pinhole was placed at the center of the laser spot, and thus produced a fairly uniform light source big enough to illuminate both adjacent photodiodes. The same pinhole was used for all measurements.
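The pinhole calibration described above amounts to a simple ratio calculation, sketched below. The sketch assumes that the white light illuminates the 1.00cm<sup>2</sup> detector uniformly, so that the transmitted power scales with the open area; the function names and the two power readings are hypothetical and are chosen only to reproduce an area near the reported 1.07mm<sup>2</sup>.

```python
def pinhole_area_mm2(power_with_pinhole_w, power_without_pinhole_w,
                     detector_area_mm2=100.0):
    """Effective pinhole area from two power-meter readings, assuming the
    detector (1.00 cm^2 = 100 mm^2) is uniformly illuminated."""
    if power_without_pinhole_w <= 0:
        raise ValueError("reference power must be positive")
    return detector_area_mm2 * power_with_pinhole_w / power_without_pinhole_w


def power_through_pinhole_w(power_density_w_per_cm2, area_mm2):
    """Optical power passed by the pinhole for a given power density."""
    return power_density_w_per_cm2 * area_mm2 / 100.0   # 100 mm^2 per cm^2


# Hypothetical readings: 100 uW without the pinhole, 1.07 uW with it.
area = pinhole_area_mm2(1.07e-6, 100e-6)
print(f"effective pinhole area: {area:.2f} mm^2")
print(f"power through pinhole at 1 mW/cm^2: {power_through_pinhole_w(1e-3, area):.3e} W")
```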

The circuit configuration and instruments used to measure the photodiodes are shown in Fig. 4.3. A Keithley 617 Programmable Electrometer was used to measure the photodetector currents over the range 10pA to 1µA. For the photodiode measurements the substrate contact had to be connected to the p-well, in order to short out the effect of the p-well/nsub diode junction. This p-n junction also functions as a photodiode; unfortunately, because no metal shielding was provided in the circuit layout of these devices, connecting the substrate contact to V<sub>dd</sub> would create two photodiodes in parallel. The p-well/nsub diode collects photogenerated carriers over a much larger area, resulting in a photocurrent 10 times that of the n+/p-well junction. With the substrate contact effectively shorted out, only the current from the interior of the photodiode is collected. The use of metal shielding would normally remove this consideration, and allow the substrate to be connected to V<sub>dd</sub>.



Figure 4.3 Measurement configuration for photodiodes.

Finally, a set of neutral density filters was used to cut down the intensity of light falling on the photodetectors. Filters with values of  $10^{-n}$ , for n = 0.20, 0.40, 0.60, 0.80, 2.00, and 3.00, were used in combinations of up to three at a time to provide over 4 decades of light intensity.
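The range of attenuation available from stacking these filters can be enumerated with a short sketch. It assumes that exactly one filter of each listed value is available and that the densities of stacked filters simply add; both are assumptions for illustration.

```python
from itertools import combinations

# Filter densities n, where each filter transmits 10**(-n) of the light.
FILTER_DENSITIES = (0.20, 0.40, 0.60, 0.80, 2.00, 3.00)


def available_densities(densities=FILTER_DENSITIES, max_stack=3):
    """Total densities reachable by stacking up to max_stack filters,
    assuming one filter of each value and additive densities."""
    totals = {0.0}                       # no filter in the beam
    for k in range(1, max_stack + 1):
        for combo in combinations(densities, k):
            totals.add(round(sum(combo), 2))
    return sorted(totals)


totals = available_densities()
print(f"{len(totals)} distinct settings, up to {max(totals):.2f} decades")
for n in totals[:5]:
    print(f"n = {n:.2f} -> transmission = {10 ** -n:.3f}")
```

Under these assumptions the densest stack is 0.80 + 2.00 + 3.00, which is consistent with the statement that more than 4 decades of light intensity are covered.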

### 4.1.2. Bias Voltages

The photodiodes fabricated in a CMOS process have relatively low breakdown voltages, and must therefore be operated at low reverse bias voltages. Tests were made in dark and light conditions to determine the appropriate bias voltage that would maximize the photocurrent while at the same time minimizing the dark current.

Fig. 4.4 shows the results of the first test, a measure of the photodiode dark currents as a function of reverse bias voltage in total darkness. As the plot in Fig. 4.4 shows, the dark currents start to increase exponentially at 15V, which is recognized as the upper limit of the useful reverse voltage for this application. For both photodiodes, the dark current remained under 1pA with a reverse bias of only 3V.



Figure 4.4 Dark current of photodiodes for a range of reverse bias voltages.

In addition to the dark current, a second test was made to determine how the photocurrent depended on the magnitude of the reverse bias voltage. For this test the light intensity was fixed at  $2\mu W/cm^2$  (measured at 800nm) and the resulting photocurrent was measured for a range of different reverse bias voltages. These data are shown in Fig. 4.5.



**Figure 4.5** Photocurrent of an 80 x 80µm photodiode for a range of reverse bias voltages at a constant illumination.

As Fig. 4.5 indicates, the reverse bias has only a moderate effect on the photocurrent below a certain point, with the photocurrent increasing only 23% as the reverse bias increases from 3V to 12V. The increases in photocurrent above 12V are due to the increase in dark current, as shown before in Fig. 4.4. From an analysis of the dark current of Fig. 4.4 and the photocurrent shown in Fig. 4.5, a reverse bias as high as 12V could be used. At very low signal levels, however, it is more important to minimize the dark current by reducing the reverse bias as much as possible, in order to ensure that the minimum photocurrent is 5 to 10 times as large as the dark current. For this reason, as well as to make the photodiode compatible with micropower techniques [43], a reverse bias of 3V has been chosen.

### 4.1.3. Measured Photocurrents

Having chosen the proper bias voltage, light source, and measurement setup, the response of the photodiodes to laser light, and the comparative response of photodiodes and phototransistors to white light were measured. The first of these plots, in Figs. 4.6 and 4.7, show that the log of the photodiode current varies linearly with the log of the light intensity, as expected, and both photodiode outputs are linear over more than 4 decades of light intensity. The values of light intensity have an uncertainty of about 2%, while the measured photocurrents have an uncertainty of about 5%. Thus, the drawn data points include the margin of error, with the exception of the data points under 100pA. The linear fit of these plots is very good, with the slopes of all four lines falling within $1.02 \pm 0.01$.
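The slope of such a log-log plot can be obtained with an ordinary least-squares fit, as sketched below. The data points are synthetic and are generated from an assumed power law purely to exercise the code; they are not the measured values from this work.

```python
import math


def loglog_fit(intensities, currents):
    """Least-squares line through (log10 intensity, log10 current).
    Returns (slope, intercept); a slope near 1 indicates a photocurrent
    proportional to the light intensity."""
    xs = [math.log10(v) for v in intensities]
    ys = [math.log10(v) for v in currents]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic example: intensity in W/cm^2 over 4 decades, current = k * E**1.02.
intensity = [10 ** e for e in (-7, -6, -5, -4, -3)]
current = [2e-5 * e ** 1.02 for e in intensity]
slope, intercept = loglog_fit(intensity, current)
print(f"fitted slope = {slope:.3f}")
```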



Figure 4.6 Output current of photodiodes with a red laser.







Figure 4.8 Ratio of photocurrents from photodiodes of areas 160 x 160 $\mu$ m and 80 x 80 $\mu$ m under blue and red laser light.

Fig. 4.8 shows the ratio of photocurrents from the two photodiodes. The two mean values fall easily within one standard deviation of the expected value of 4.00 (the ratio of the two diffusion areas), showing that the measurements made with the method discussed in Sec. 4.1.1 are fairly consistent.

The main figure of merit for the photodiode is the *responsivity* R [Sec. 3.1.1] which is the amperes of current produced by the detector per watt of incident optical power. This was calculated for the two photodiodes over the range of measured intensities of both red and blue light. The results are listed in Table 4.1. These values are in agreement with those achieved by others [22]. In addition, notice that the values of R for blue and red light are about the same, indicating that the depth of the n+ diffusion must be shallow enough to allow efficient collection of the shorter wavelengths.

| Light source                    | Photodiode size | Mean R (A/W) | Standard deviation (A/W) |
|---------------------------------|-----------------|--------------|--------------------------|
| Red light ($\lambda = 633$ nm)  | 80 x 80 μm      | 0.358        | 0.030                    |
| Red light ($\lambda = 633$ nm)  | 160 x 160 μm    | 0.355        | 0.026                    |
| Blue light ($\lambda = 442$ nm) | 80 x 80 μm      | 0.383        | 0.042                    |
| Blue light ($\lambda = 442$ nm) | 160 x 160 μm    | 0.375        | 0.041                    |

**Table 4.1** Values of responsivity R for tested photodiodes under red and blue light.
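The responsivity calculation can be sketched as the photocurrent divided by the optical power falling on the diode. The sketch assumes uniform illumination and takes the active area to be the drawn diffusion area; the intensity and photocurrent in the example are hypothetical values chosen to land near the tabulated mean for the small photodiode.

```python
def responsivity_a_per_w(photocurrent_a, intensity_w_per_cm2, width_um, height_um):
    """Responsivity R = photocurrent / incident optical power, assuming
    uniform illumination over a rectangular active area."""
    area_cm2 = (width_um * 1e-4) * (height_um * 1e-4)   # 1 um = 1e-4 cm
    optical_power_w = intensity_w_per_cm2 * area_cm2
    if optical_power_w <= 0:
        raise ValueError("incident optical power must be positive")
    return photocurrent_a / optical_power_w


# Hypothetical reading for the 80 x 80 um photodiode at 100 uW/cm^2.
r = responsivity_a_per_w(2.29e-9, 1e-4, 80, 80)
print(f"R = {r:.3f} A/W")
```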

Another figure of merit is the *response non-uniformity* or RNU. This is a statistical measure of the uniformity for a sample of identical detectors, and is useful in determining how closely a large array of detectors will be matched, given local process variations. There were only 5 ICs fabricated, each with exactly one pair of the photodiodes, and each coming from a different wafer, so that not enough data could be collected to estimate the parameter spread within a die or wafer. However, this small sample can give at least some indication of the spread between wafers. The photocurrents of the 5 different devices were measured under the same light conditions, and the outputs varied only $\pm 2\%$ about the mean. While this is quite a small sample size, it does show that the devices tested had some degree of consistency.
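A spread of this kind can be computed as the largest deviation from the sample mean, expressed as a percentage. The sketch below uses that simple measure, which is an assumption and may differ from a formal RNU definition; the five photocurrent values are hypothetical.

```python
def spread_about_mean(values):
    """Return (mean, worst-case deviation from the mean in percent).
    This is a simple stand-in for a uniformity figure, not a formal RNU."""
    if not values:
        raise ValueError("at least one measurement is required")
    mean = sum(values) / len(values)
    worst = max(abs(v - mean) for v in values)
    return mean, 100.0 * worst / mean


# Hypothetical photocurrents (in nA) from five devices under the same light.
mean, spread = spread_about_mean([98.0, 99.0, 100.0, 101.0, 102.0])
print(f"mean = {mean:.1f} nA, spread = +/-{spread:.1f}%")
```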

In summary, the photodiodes tested provide a photocurrent large enough to overcome the dark current of the device, yet low enough to ensure low power consumption. Better detectors may yet be found that produce the same or higher photocurrents for a smaller size, but these photodiodes can at least be used to prove the feasibility of the Microelectronic Receptive Field model.

## 4.2. Current Mirrors

In order to test the operation of CMOS current mirrors, both n-channel and p-channel current mirrors were fabricated using transistors available from two CMOS processes: the 3µm digital CMOS process available through the Canadian Microelectronics Corporation, and a 5µm digital CMOS gate array produced by Plessey, and custom-configured using the Quick-Chip facilities in Engineering Science at SFU. The CMOS3 devices, shown in Fig. 4.9, all have long, narrow channels, and aspect ratios ranging from 1/2 to 1/6. These devices could be connected together to produce p-channel current mirrors with current multiplication factors of 1.5, 2 and 3 times. Identical gate array devices were used to produce wide-channel n-channel and p-channel current mirrors with current multiplication factors of exactly 1.



Figure 4.9 Circuit layout of long-channel pMOS devices.

Previously reported research [34] into subthreshold current mirrors focussed only on the use of unit-sized devices, which were  $6\mu m \times 6\mu m$ , operated with a supply of 2.5V. These unity-gain current mirrors were reported to be quite accurate over the range 5pA to 1 $\mu$ A. The devices chosen instead for this research have extreme aspect ratios - either very long or very wide. This was done to see what effect the extreme sizes would have on gain factors, and whether they would still accurately mirror currents in the subthreshold region. As will be seen in the following section the results were not exactly as expected, with the wide-channel devices proving unsuitable in the subthreshold region.

Fig. 4.10 shows the current mirrors that were used in these measurements: unity-gain mirrors constructed from the wide-channel Plessey devices, and long-channel pMOS mirrors with current gains of 2 and 3 constructed using the CMOS3 devices. These current mirrors were tested using an HP 4145A Semiconductor Parametric Analyser (SPA), over the range 10pA to 10µA, in order to verify that they worked consistently over the entire 6-decade range of input currents possible from the photodetectors. Details of the measurements done and a discussion of the results are given in the following sections.



**Figure 4.10** Current mirrors for smart vision sensors: (a) unity gain p-type, (b) gain of 2 p-type, (c) gain of 3 p-type, and (d) unity gain n-type.

### 4.2.1. Simple Current Mirrors

The tests began with the different mirrors alone, to see how linear the response would be over the entire input range. The gate array ICs contained 4 nMOS current mirrors and 5 pMOS current mirrors, and all were tested using the SPA to provide the input current and monitor the output current. The input was programmed as a logarithmic current source from 10pA to 10µA, and the output current was measured by programming the second channel as a constant voltage source and monitoring the resulting current. In this mode the SPA has a measurement uncertainty of at most 0.5% in each current range, so that the resulting current gain figures, being the ratio of two such measurements, have a 1.0% uncertainty. The results are shown in the graphs of Figs. 4.11 and 4.12.



Figure 4.11 Current gain of wide-channel unity-gain nMOS current mirrors (V<sub>dd</sub>=3V).



Figure 4.12 Current gain of wide-channel unity-gain pMOS current mirrors ( $V_{dd}=3V$ ).

In Fig. 4.11 the current gain of the wide-channel unity-gain nMOS current mirror is shown to be quite linear above  $10^{-10}$  A, with a gain of about 1.3. Fig. 4.12 shows the response of the wide-channel unity-gain pMOS current mirror over the same range, but this circuit shows an actual current gain that ranges from about 3.8 down to 1.4, with some variation between devices. This is a marked departure from the expected gain of 1. It can be seen from either plot that the current gain tends toward unity as the input current increases. The non-unity current gain behavior observed is the result of subthreshold region operation, and is not due to device failure.

Accurate Spice simulation models of MOSFETs in the subthreshold region require many specialized parameters [44], and identifying these parameters from a process that is optimized for digital device operation is a tedious and time-consuming task, and was not the focus of this work. Simple Level 2 Spice models can point to this non-unity gain in the wide-channel subthreshold nMOS and pMOS current mirrors, although they lack the accuracy required for the entire 6-decade range in input current.

Several factors could contribute to this behavior. Perhaps the most important one is the degree to which the gate voltage is able to control the drain current, and how this parameter varies with the drain current in the subthreshold region [44,45]. This behavior is pronounced in devices with wide channels, so future work with subthreshold current mirrors should instead focus on unit-sized or long-channel devices. In spite of this current gain offset of the wide-channel current mirrors, they can still be used to test the linearity of addition and subtraction operations using current mirrors in the subthreshold region.

The two long-channel pMOS current mirrors were tested in the same way, and the results can be seen in Fig. 4.13.




Figure 4.13 Gain of pMOS current mirror circuits with gain=2 and gain=3.

Fig. 4.13 shows the current gain of the long-channel pMOS mirrors, for which gains of 2 and 3 are expected from the transistor geometries. These values are approximately achieved, with the maximum deviations from the expected values being +8.7% and -18% for the gain=2 device (Fig. 4.10b), and +10.9% and -20.7% for the gain=3 device (Fig. 4.10c). These results are acceptable for use with the photodiodes. In addition, it was determined in Ch. 2 that small unit gains of up to 5 would be required in the receptive field weighting functions, so that these circuits could be used to scale the photodetector output currents, as shown in the following section.
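The quoted percentage deviations translate directly into a range of measured gain for each mirror. The small sketch below performs that conversion; the function name is hypothetical, and the inputs are the nominal gains and maximum deviations quoted above.

```python
def gain_bounds(nominal_gain, max_pos_dev_pct, max_neg_dev_pct):
    """Convert maximum percentage deviations into (lowest, highest) gain."""
    highest = nominal_gain * (1.0 + max_pos_dev_pct / 100.0)
    lowest = nominal_gain * (1.0 + max_neg_dev_pct / 100.0)
    return lowest, highest


for nominal, pos, neg in ((2, 8.7, -18.0), (3, 10.9, -20.7)):
    low, high = gain_bounds(nominal, pos, neg)
    print(f"gain={nominal}: measured gain between {low:.2f} and {high:.2f}")
```

For the quoted deviations this gives roughly 1.64 to 2.17 for the gain=2 mirror and roughly 2.38 to 3.33 for the gain=3 mirror.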

### 4.2.2. Photodiode with Current Mirror

The next step was to see how the photodiodes would work in conjunction with the current mirrors, since current mirrors could be used both to provide the reverse bias voltage and duplicate the photocurrent output of the photodiode. A concern about leakage currents always exists when dealing with MOS transistors in the subthreshold region of operation, especially in this case where the signal currents themselves are very small. Fig. 4.14 shows how the photodiode and current mirror are connected.



Figure 4.14 Circuit diagram of photodiode with pMOS current mirror.

In order to measure the leakage current of the current mirror through the photodiode, the SPA was used to provide 0V at the output of the current mirror and monitor the resulting output current. Initial tests using the wide-channel unity-gain pMOS current mirror with V<sub>dd</sub> = 3V resulted in a constant leakage current of 2nA, even in total darkness. This is quite significant, as the dark current of the 80 x 80µm photodiode at this reverse bias voltage is less than 1pA (see Fig. 4.4). In order to reduce or eliminate this leakage current, thought to be caused by the diffusion of carriers across the very wide channel of the gate array transistors, the long-channel pMOS current mirrors were used instead, under different supply voltages. By using the pMOS current mirror with a gain of 3 and a supply voltage of only 3V the leakage current could be reduced to 0.2pA. Thus, in addition to providing a reverse bias for the photodiode, the long-channel pMOS current mirror also provides a low leakage current comparable to the photodiode dark current, with a current gain of 3.

## 4.2.3. Cascading Current Mirrors

Before the testing of current mirror addition and subtraction can proceed it must first be determined that the cascade combination of pMOS and nMOS current mirrors functions properly. Because there were no long-channel nMOS devices available, and only a limited number of the long-channel pMOS devices, the wide-channel nMOS and pMOS current mirrors were used in the configuration shown in Fig. 4.15. The SPA was used to provide the input current and monitor the output current over the standard 6-decade range of input current, with the results shown in Fig. 4.16.



Figure 4.15 Cascading pMOS and nMOS current mirrors.



Figure 4.16 Current gain of cascaded wide-channel unity-gain pMOS and nMOS current mirrors (V<sub>dd</sub>=3V).

As the plot in Fig. 4.16 indicates, the combination of the two current mirrors reflects the same current gain offset, ranging from 3.5 to just over 1.3, that was evident in the test of each mirror separately in Sec. 4.2.1. This demonstrates that the pMOS and nMOS current mirrors can be cascaded together, and that the final output reflects the characteristics of the separate current mirrors, as predicted in [34]. Even with this current gain offset, these devices *can* be used to measure the feasibility of addition and subtraction with subthreshold current mirrors, with the understanding that long-channel devices would be better suited to this task.

## 4.2.4. Addition & Subtraction using Current Mirrors

In this final test of the current mirrors, the addition and subtraction of two inputs was tested using the wide-channel pMOS and nMOS unity-gain current mirrors configured as shown in Fig. 3.7.

The addition and subtraction operations are only meaningful if the input signals are of roughly the same order of magnitude, so separate measurements were made of each operation in each of 6 ranges, from [0 .. 100pA] to  $[0 .. 10\mu\text{A}]$ . In each range, one input (Iin1) was held at one of 11 equal steps (0 to 10), while the second input (Iin2) was swept through the same range, 0 to 10.

Fig. 4.17 and 4.18 of the 0-100nA range are representative of all the ranges, and show that these circuits do indeed operate linearly. There is a multiplicative factor involved in each range, due to the gain offset of the wide-channel pMOS current mirror, but this remains relatively constant within a given range. In the case of addition, all the outputs are linear and parallel, with the 11 different slopes of Fig. 4.17 agreeing within  $\pm 3\%$  of the average. Also, the linear regression fit parameter  $R^2$  (calculated using CricketGraph) for all 11 outputs is 0.999, indicating that the linearity is quite good. For subtraction, the plots are similar, but because the current mirror output cannot go below 0V, the output is clamped at 0V. These two plots are combined in Fig. 4.19 for clarity.



Figure 4.17 Current mirror addition of two inputs in the 100nA range.



Figure 4.18 Current mirror subtraction in the 100nA range.



Figure 4.19 Summary of current mirror addition and subtraction with 2 inputs in the 100nA range.

From these plots, it is clear that the current mirror circuits shown are capable of performing addition and subtraction, and that they can function even at very low current levels.
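The measurement procedure above can be mimicked with an idealized model. The following is only a sketch: it treats the gain offset as a single constant within a range (the text says it is "relatively constant"), it assumes the second input is the one subtracted, and the function names are hypothetical.

```python
def mirror_add(i1, i2, gain=1.0):
    """Sum two input currents (in amps) and apply a constant mirror gain offset."""
    return gain * (i1 + i2)


def mirror_subtract(i1, i2, gain=1.0):
    """Subtract i2 from i1; the mirror output cannot go negative, so clamp at zero."""
    return max(0.0, gain * (i1 - i2))


# One input is held at each of 11 equal steps while the other is swept
# through the same steps, here for the 0..100 nA range.
FULL_SCALE = 100e-9
steps = [k * FULL_SCALE / 10 for k in range(11)]

add_table = [[mirror_add(a, b) for b in steps] for a in steps]
sub_table = [[mirror_subtract(a, b) for b in steps] for a in steps]
```

Each row of `add_table` is a straight line with the same slope, which is the "linear and parallel" behavior reported for Fig. 4.17, while each row of `sub_table` flattens at zero once the swept input exceeds the held input.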

#### 4.2.5. Summary of Current Mirrors

The wide-channel and long-channel current mirrors chosen for use in this work showed quite different results. The long-channel pMOS devices worked well with the photodiode, providing both the reverse bias voltage and a current gain of either 2 or 3 times, while limiting the leakage current through the photodiode. The wide-channel current mirrors produced non-unity current gains, particularly the pMOS current mirror which produced gain offsets between 3.8 and 1.3, depending on the drain current. Notice, however, that both devices produce current gains approaching 1 as the drain current increases towards  $10\mu$ A. These devices were used to verify that current mirror addition and subtraction could work in the subthreshold region. While these experiments prove the basic feasibility of using current mirrors in the subthreshold region, further studies and more accurate simulation are required to obtain more optimal aspect ratios of the current mirror devices.

# 4.3. Pseudo-DTL CMOS VCO Design

As discussed in Sec. 3.3.2, the Microelectronic Receptive Field model requires a small, low-power CMOS voltage-controlled oscillator (VCO). Such a device has been designed and fabricated in a  $3\mu$ m digital CMOS process, and has proven to be suitable for use in this work. This section describes the design of the new VCO, and examines the performance of the device.

The name "pseudo-DTL CMOS" derives from the use of standard CMOS components in a configuration borrowed from diode-transistor logic (DTL) gate design. This type of logic gate design produces a gate with a voltage-controlled propagation delay, which can be used in a ring oscillator configuration to produce a voltage-controlled oscillator.

As shown in Fig. 4.20(a) the basic structure of a DTL gate consists of one or more input diodes, a pullup resistor, and an inverting buffer. The diodes and resistor form a diode logic AND gate, while the inverter provides both current drive capability and a logic inversion (diode logic alone has no inverter). Level-shifting diodes are included before the base of the npn transistor to make sure that the transistor remains off when the diode logic input is at a 0.7V low.



Figure 4.20 Logic NAND gate designs using (a) bipolar DTL and (b) pseudo-DTL CMOS design.

The main design concept of pseudo-DTL CMOS [46] is to (a) replace the diodes by MOS transistors connected in a diode configuration, (b) replace the resistor-transistor logic (RTL) inverter with a CMOS inverter, and (c) replace the pullup resistor by a long-channel pMOS transistor. The pMOS transistor pullup has its gate connected to ground, so that it is always on, and the saturation current of the device serves to limit the current, much like an ohmic resistor. The level-shifting diodes are no longer necessary, as the voltage drop across the MOS diode is lower than the threshold voltage of the CMOS inverter, provided that  $V_{dd}$  is at least 3V.

The propagation delay of each gate can be controlled by varying the gate voltage of the pMOS pullup close to $V_{dd}$, causing the pullup transistor to operate close to the subthreshold region. The resulting long propagation delay of this gate makes it an ideal component of a ring oscillator, allowing a ring oscillator to be constructed with as few as 3 stages, producing a VCO with only 12 transistors. The circuit diagram of the new design, and the circuit layout and photomicrograph of the circuit implemented in a $3\mu$m digital CMOS process [24], are shown in Fig. 4.21, 4.22, and 4.23, respectively.
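As a point of reference, the usual first-order relation between the frequency of a ring oscillator and the delay of its stages can be written as follows. The symbols $N$ (number of inverting stages) and $t_p$ (average propagation delay per stage) are introduced here for illustration and are not taken from the measurements in this work.

$$f_{osc} \approx \frac{1}{2\,N\,t_p}$$

With $N = 3$, any lengthening of $t_p$ through the pullup gate voltage lowers $f_{osc}$ in direct proportion. The count of 12 transistors is also consistent with four devices per stage (one pullup, one MOS diode, and a two-transistor CMOS inverter), assuming a single input diode per gate.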



Figure 4.21 Circuit diagram of the pseudo-DTL CMOS VCO.



Figure 4.22 Circuit layout of the pseudo-DTL CMOS VCO.



Figure 4.23 Photomicrograph of the pseudo-DTL CMOS ring oscillator.

The parasitic capacitance of the central node of each gate, combined with the controllable limited current through the pMOS pullup, ensures that the necessary timing delay can be achieved without the use of large capacitors, extra MOS resistors, or a large number of stages. This results in a very compact design.

A 5-stage ring oscillator design was also fabricated for comparison purposes, and the two devices were measured with respect to output frequency, supply current, and upper cut-off frequency. A supply voltage of 3V was used throughout. The measurement results are shown in the following graphs.



Figure 4.24 VCO Frequency vs. Input Voltage (V<sub>dd</sub>=3V).



Figure 4.25 VCO Current Consumption vs. Output Frequency.



Figure 4.26 VCO Output Voltage vs. Output Frequency for 3- and 5-stage ring oscillators (V<sub>dd</sub>=3V).

From Fig. 4.24 it can be seen that the output frequency of the ring oscillator can be made to vary over 5.5 decades, with the linear response occurring for  $V_{in}$  between 2.20V and 2.65V. This corresponds to about 100mV per decade in the linear portion. This output frequency range is greater than any of the designs previously discussed in Sec. 3.3.2.
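The linear portion of this transfer function can be approximated by a log-linear model. The sketch below uses only the slope and input range quoted above. The calibration point (`f_ref` at `v_ref`) is hypothetical, and the sign of the slope is an inference from the circuit description: a control voltage closer to $V_{dd}$ weakens the pullup, lengthens the delay, and so lowers the frequency.

```python
SLOPE_V_PER_DECADE = 0.100    # about 100 mV per decade, as quoted in the text
V_MIN, V_MAX = 2.20, 2.65     # linear input range quoted in the text


def vco_frequency(v_in, f_ref, v_ref=V_MAX):
    """Log-linear VCO model: one decade of frequency per 100 mV of input.

    f_ref is the output frequency (Hz) at v_ref; both are hypothetical
    calibration values, not measured data from the thesis.
    """
    if not (V_MIN <= v_in <= V_MAX):
        raise ValueError("v_in is outside the linear region of this model")
    # Lower input voltage -> stronger pullup -> shorter delay -> higher frequency.
    return f_ref * 10 ** ((v_ref - v_in) / SLOPE_V_PER_DECADE)
```

For example, `vco_frequency(2.55, f_ref=10.0)` returns a value one decade above `f_ref`, since the input is 100 mV below the reference point.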

Fig. 4.25 shows that the current consumption depends on the output frequency, varying between 14 and  $22\mu$ A in the linear portion between 10Hz and 40kHz, with significant increases occurring at  $10^5$  Hz. The 3-stage device shows a slightly lower current consumption in this high frequency range. This figure for supply current is lower than any of the designs discussed in Sec. 3.3.2.

Notice from Fig. 4.26 that as the output frequency increases beyond $10^5$ Hz, the amplitude of the output drops off, so that at 500 kHz the output drops below 2V. At this point the current consumption shown in Fig. 4.25 also rises significantly, making this point the practical upper limit to the VCO's operational range.

From these three graphs it can be seen that the 3-stage design performs better than the 5-stage design: the linear output region is longer, the current consumption is lower, and the layout area is smaller. Only in respect to the maximum output frequency is the 5-stage design superior, and even then only marginally.

In terms of layout size, this design consumes approximately 10,000 $\mu$m<sup>2</sup>, compared to the 39,000 $\mu$m<sup>2</sup> of the 5-stage nMOS ring oscillator discussed in [41]. This represents roughly a 75% reduction in circuit area, due primarily to the elimination of the capacitors. The sensitivity of the new device is about 100mV per decade of output frequency in the linear region, compared to 170mV per decade for the nMOS ring oscillator. While power consumption data were not published for the nMOS design, it is reasonable to assume that since the capacitance of each pseudo-DTL CMOS cell is simply a small parasitic capacitance, the current needed to charge and discharge it is much smaller. Thus, by providing a greater range of output frequency than the designs discussed in Sec. 3.3.2, and using very low current with a minimal device count, this VCO circuit makes an ideal interface device for use in the Microelectronic Receptive Field.

### 4.3.1. Interface to VCO

In order for the VCO to convert the photocurrent to a frequency, the logarithmic current must first be converted to a linear voltage, preferably with a range of about 0.5 volts, as this would coincide with the linear portion of the transfer function of Fig. 4.24. This can easily be achieved by using a pMOS transistor configured as a diode, as shown in Fig. 4.27. This circuit provides a high impedance match between the photodiodes and the VCO, as the VCO input is simply an MOS gate. This use of an MOS drain voltage to control the gate voltage of a second device closely parallels the structure of the photo-MOSFET shown in Fig. 3.5.
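The reason a diode-connected MOS transistor performs this conversion can be seen from the standard first-order subthreshold model. The symbols $I_0$ (a process-dependent scale current), $n$ (the subthreshold slope factor), and $V_T = kT/q$ (the thermal voltage, about 26mV at room temperature) are introduced here for illustration and were not extracted from the devices measured in this work.

$$I_D \approx I_0 \exp\!\left(\frac{V_{SG}}{n V_T}\right) \quad\Longrightarrow\quad V_{SG} \approx n V_T \ln\!\left(\frac{I_D}{I_0}\right)$$

Each decade of photocurrent therefore changes the diode voltage by a fixed step of about $n V_T \ln 10 \approx 60n$ mV, so a current that spans several decades maps onto a voltage that varies linearly with the logarithm of that current. This holds only while the device remains in the subthreshold region; at higher currents the voltage drop grows more quickly.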



Figure 4.27 P-channel pullup diode interface to VCO.

The choice of optimal aspect ratio for this pMOS pullup diode is driven primarily by the need for a linear voltage output to feed the VCO. The three available devices listed in Table 4.2 were tried, in order to determine whether a long-channel or wide-channel device was more suitable. The SPA 4145A was used to measure the voltage at the drain of the MOS diode, but needed to be configured in a special way, as the input impedance of the SPA voltage monitor channel is only 1M $\Omega$ . On the other hand, the input resistance of the current source/voltage measurement channels is on the order of 10<sup>12</sup>  $\Omega$ . One of these channels was programmed to provide zero current and the resulting voltage required to ensure this zero current flow was monitored. This produced the accurate measurements shown in Fig. 4.28.



Figure 4.28 Output Voltage of pMOS pullup diodes of different aspect ratios for drain currents of 10pA to  $10\mu$ A (for V<sub>dd</sub>=3V).

The VCO output frequency graph of Fig. 4.24 shows that the input voltage to the VCO should be between 2.10V and 2.65V. From the results in Fig. 4.28, this linear voltage can be achieved for an input current up to 1 $\mu$ A by using a pMOS diode with an aspect ratio of 12/1. The long-channel devices are unsuitable for this task as they provide too much of a voltage drop, especially for currents above 100nA. Theoretically, a device with aspect ratio of 100/1 should perform even better, but this would require more layout area than the result would justify. For the range of input currents expected from the photodiodes, a pMOS diode with an aspect ratio of 12/1 is acceptable.

## 4.4. A Complete Receptive Field Cell

It has been shown in this chapter that all the component parts of the Microelectronic Receptive Field work separately. While the receptive field functions described in Ch. 2 can contain many photodetectors, it is enough to show that all of these components work when connected together in a prototype cell with only one photodetector. Such a cell, shown in Fig. 4.29, was assembled and tested.



Figure 4.29 Circuit diagram of a complete MERF cell.

The prototype cell worked well, providing a stable frequency that ranged from less than 2 Hz up to 500 kHz, in the presence of light ranging from total darkness to a bright desk lamp ( $5mW/cm^2$  @ 800nm). A small 100pF capacitor was needed at the input to the VCO, to smooth out 60 Hz noise from the power supply. The current consumption was around 20 µA at all times, or 60µW of power for the entire cell. The transfer function of the MERF cell was calculated from the measured data of all the component parts shown in Fig. 4.6, 4.24, and 4.28, with the resulting plot shown in Fig. 4.30.



Figure 4.30 Calculated transfer function of the MERF cell.

As the graph of Fig. 4.30 indicates, the MERF cell translates 4 decades of light intensity into 4 decades of output frequency, in a linear manner. In a MERF cell with more than one detector, the resulting output current should be scaled so as to produce the same range of output current as a single detector, in order to reproduce this transfer characteristic.

# 5. Application Issues

The final application of the MERF component circuits to the design of a large-area smart vision sensor is quite beyond the scope of this work. Indeed, any sensor based on the receptive field model will have a highly interconnected structure, so that the basic building block is not the receptive field itself, but rather the component circuits, which form a toolkit for building up an integrated sensor system. However, some attention can be given to the eventual application of these components, as a guide to further research. This chapter discusses the issues of expected receptive field outputs, performance of a receptive field with damaged detectors, expected power consumption of a large-area sensor, and potential tolerance of fabrication defects.

# 5.1. Receptive Field Simulations

Thus far, only the electrical characteristics of the component circuits have been discussed. In order to apply these components to the design of actual receptive field circuits, something must be known about the expected behavior of the particular receptive field functions, such as those discussed in Ch. 2. To that end, the outputs of several representative receptive field functions have been simulated, in order to show the approximate response to edges and bars of different contrast ratios, and to demonstrate the response of a receptive field cell with damaged detectors.

#### 5.1.1. Simulation Setup

The receptive field functions that were simulated are as follows: 3x3 and 5x5 Difference of Gaussians (DOG); 3x4 edge detector; and 4x3 bar detector. The weighting functions are shown in Fig. 5.1. Notice that the DOG weighting function of Fig. 5.1(a) has smaller weights than the function shown in Fig. 2.5. This is because the smaller weights are easier to implement with current mirrors, and both functions have a net weight of 1 (3 - 4\*(0.5) = 1).

$$\text{(a)}\ \begin{bmatrix} 0 & -0.5 & 0 \\ -0.5 & 3 & -0.5 \\ 0 & -0.5 & 0 \end{bmatrix} \qquad \text{(b)}\ \begin{bmatrix} 0 & -0.25 & -0.5 & -0.25 & 0 \\ -0.25 & -0.5 & 0.5 & -0.5 & -0.25 \\ -0.5 & 0.5 & 5 & 0.5 & -0.5 \\ -0.25 & -0.5 & 0.5 & -0.5 & -0.25 \\ 0 & -0.25 & -0.5 & -0.25 & 0 \end{bmatrix}$$

$$\text{(c)}\ \begin{bmatrix} 0 & 1 & -1 & 0 \\ 0.5 & 0.75 & -0.75 & -0.5 \\ 0 & 1 & -1 & 0 \end{bmatrix} \qquad \text{(d)}\ \begin{bmatrix} 0 & 0.5 & 0 \\ -0.75 & 1 & -0.75 \\ -0.75 & 1 & -0.75 \\ 0 & 0.5 & 0 \end{bmatrix}$$

Figure 5.1 Weighting functions for simulated receptive fields: (a) 3x3 DOG, (b) 5x5 DOG, (c) vertical edge detector, and (d) vertical bar detector.
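The net-weight claim for the two DOG functions is easy to check numerically. The sketch below simply transcribes the matrices of Fig. 5.1(a) and (b); the helper name `net_weight` is hypothetical.

```python
DOG_3X3 = [
    [0.0, -0.5, 0.0],
    [-0.5, 3.0, -0.5],
    [0.0, -0.5, 0.0],
]

DOG_5X5 = [
    [0.0, -0.25, -0.5, -0.25, 0.0],
    [-0.25, -0.5, 0.5, -0.5, -0.25],
    [-0.5, 0.5, 5.0, 0.5, -0.5],
    [-0.25, -0.5, 0.5, -0.5, -0.25],
    [0.0, -0.25, -0.5, -0.25, 0.0],
]


def net_weight(weights):
    """Sum of all weights: the field output under uniform unit illumination."""
    return sum(sum(row) for row in weights)


# Distinct non-zero weight magnitudes; each one needs its own mirror ratio.
magnitudes_3x3 = sorted({abs(w) for row in DOG_3X3 for w in row if w != 0.0})
```

Both `net_weight(DOG_3X3)` and `net_weight(DOG_5X5)` evaluate to 1, so a uniformly lit field produces an output equal to the light level itself.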

A C program was written to simulate the response of a given receptive field weighting function to a moving edge or bar stimulus, where the stimulus consisted of two light intensities, DARK and BRIGHT. The data structures to support this simulation consisted of a matrix of incident light intensities, and a second matrix of receptive field output values. The basic structure for a 3x3 weighting function is shown in Fig. 5.2. For the 3x3 case, a total of 4 receptive field outputs were computed, each one shifted 1 pixel to the right of its predecessor, so that the first and last receptive fields do not overlap at all, but only butt together. The two receptive fields shown in Fig. 5.2 differ by two positions, and share exactly 1 detector.



Figure 5.2 Cell layout for simulation of receptive field functions.

The leading edge of the light stimulus was moved across this matrix from left to right, in 10 increments per pixel. Note that the pixel layout assumes a 100% fill factor for each detector, which may not be the case in an actual layout. While this will not affect the maximum or minimum receptive field output values, it will mean that the transitions in an actual layout are more sudden. The output of each receptive field was evaluated at each increment of the light stimulus. The diagram in Fig. 5.2 shows the layout of the 3x3 DOG function of Fig. 5.1(a). In all the graphs that follow in Sec. 5.1.2, the outputs of these receptive fields are plotted against time, so that as the stimulus moves across the matrix of detectors each successive receptive field peaks one position to the right of its predecessor.

The input and output levels of these simulations are meant to represent normalized light levels and photodetector outputs, so that the relative effects of the different light input levels can be observed. The main feature of interest in the light input is the contrast ratio, which is the ratio of the bright level to the dark level. A 2:1 contrast ratio represents the minimum detectable difference, while a 10:1 ratio is taken as the upper limit for the purposes of this analysis. Since the VCO translates a linear change in photocurrent to a linear change in output frequency, any changes in the normalized receptive field outputs would theoretically be translated into a proportional output frequency change.
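The original simulation was written in C and is not reproduced here. The following Python sketch shows one way the procedure described above could be implemented, assuming that a partly covered pixel reports the area-weighted mean of the BRIGHT and DARK levels (the 100% fill factor case). All function names are hypothetical.

```python
DARK, BRIGHT = 0.1, 1.0
STEPS_PER_PIXEL = 10

DOG_3X3 = [
    [0.0, -0.5, 0.0],
    [-0.5, 3.0, -0.5],
    [0.0, -0.5, 0.0],
]


def column_intensity(col, edge_pos):
    """Mean intensity of pixel column `col` when everything left of edge_pos is bright."""
    covered = min(1.0, max(0.0, edge_pos - col))
    return covered * BRIGHT + (1.0 - covered) * DARK


def field_output(weights, left_col, edge_pos):
    """Weighted sum of detector intensities for a field whose left column is left_col."""
    total = 0.0
    for row in weights:
        for j, w in enumerate(row):
            total += w * column_intensity(left_col + j, edge_pos)
    return total


def sweep(weights, left_col, n_cols):
    """Field output at every edge position, 10 increments per pixel."""
    return [field_output(weights, left_col, k / STEPS_PER_PIXEL)
            for k in range(n_cols * STEPS_PER_PIXEL + 1)]


# Four 3x3 fields, each shifted one pixel to the right, over a 6-column matrix.
traces = [sweep(DOG_3X3, left_col=c, n_cols=6) for c in range(4)]
```

In this model each trace starts at the DARK level, dips as the edge crosses the leading surround detector, peaks when the center column is lit, and settles at the BRIGHT level once the whole field is covered.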

#### 5.1.2. Simulation Results

The plots of Fig. 5.3 and 5.4 show the photocurrent output of adjacent receptive fields to the transient edge for the 3x3 and 5x5 Difference of Gaussian functions, respectively. From left to right, the first and third plot lines of Fig. 5.3 correspond to the two receptive fields of Fig. 5.2. In this way, each successive plot line represents the output of a receptive field that is shifted to the right by exactly one pixel.

In both plots the edge stimulus moves across the overlapping receptive fields from left to right. The dark value was 0.1 and the bright value was 1.0, producing a contrast ratio of 10:1. Notice how the outputs reach a minimum as the edge covers the leading inhibitory cells, and a maximum as the edge covers the center, but not the trailing inhibitory cells. It is this on-center/off-surround arrangement that produces the contrast enhancement known as the Mach band effect [16] in human vision.



Figure 5.3 Response of 3x3 DOG receptive fields to an edge stimulus.



Figure 5.4 Response of 5x5 DOG receptive fields to an edge stimulus.

Another thing to note is the response of overlapping receptive fields. Each receptive field varies from its neighbor by an offset of exactly one detector. For the 3x3 DOG function, this means that two fields that have their centers two pixels apart share one detector. From Fig. 5.3 it can be seen that the receptive fields with peak outputs at position 3 and position 5 (corresponding to the two receptive fields of Fig. 5.2) collectively provide a continuous response to the edge stimulus: as the first output levels off after the peak, the second output reaches its minimum value. Thus, adjacent 3x3 receptive field functions of the same size need only share a single detector in order to provide continuous coverage of the input. For the 5x5 case of Fig. 5.4, the outputs with peaks at position 4 and position 8 again share one level of pixels, and provide a response that is continuous. In this way, the receptive fields could be spaced so as to overlap by only one detector, and provide an output that adequately performs contrast enhancement at a moving edge, which is the purpose of the DOG function.

The first receptive field of each kind was also isolated, and the contrast ratio of the edge stimulus was varied from 10:5 to 10:1, in order to see what effect this would have on the output of the cell. The ratio of 10:5 corresponds to a difference in intensity of one photographic f-stop. These plots are shown in Fig. 5.5 and 5.6.



Figure 5.5 3x3 DOG receptive field output for different contrast ratios.



Figure 5.6 5x5 DOG receptive field output for different contrast ratios.

Since both of these functions have a net weight of 1 in a uniform intensity field, the outputs level off at unity once the entire receptive field is passed over by the edge input. Observe that the 3x3 DOG function produces a peak output that is 50% higher than the uniform-field value, and the 5x5 DOG function produces a value almost 3 times higher at the peak. This is partly the result of using a high central weight in order to compensate for the wide-area off-surround of this function. Finally, it is worthwhile to note that the peak output reaches a limit soon after the contrast ratio reaches 10:1; if the contrast ratio were to increase to 100:1 (a dark level of 0.01), the peak normalized output of the 3x3 DOG function would not increase above 1.5.
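The limit on the peak output follows directly from the weights of Fig. 5.1(a). Writing $B$ for the bright level and $D$ for the dark level (symbols introduced here), and assuming the edge is aligned with a pixel boundary so that the center column is fully lit and only the trailing surround detector is still dark, the peak is

$$O_{peak} = 3B - 3(0.5)B - 0.5D = 1.5B - 0.5D.$$

With $B = 1$ this gives 1.45 for $D = 0.1$ and 1.495 for $D = 0.01$, so the peak approaches but never exceeds 1.5 as the dark level falls toward zero.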

Similar simulations were run for the edge detector function, as shown in Fig. 5.7 and 5.8. Again, as in the plots of Fig. 5.3 and 5.4, each successive plot line represents the output of a receptive field that is offset by exactly one pixel from its predecessor.



Figure 5.7 Adjacent edge detector receptive field outputs for edge stimulus.



Figure 5.8 Edge detector receptive field output for different contrast ratios.

Of particular interest in the plots of Fig. 5.7 and 5.8 is the fact that the output reaches a peak of almost 3, using weights that are at most 1 and light levels that are at most 1. This shows the value of using multiple detectors to produce an aggregated output of much higher value than would be possible with single detectors. Again, note the limit to the peak output under high contrast ratios. Even the 10:5 contrast ratio was able to produce a significant peak output of just over 1.5.
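The peak values quoted above can be reproduced with a few lines of arithmetic. This sketch assumes the edge-detector weights as we read them from Fig. 5.1(c), and assumes the peak occurs when the edge sits exactly between the excitatory and inhibitory columns. The function name is hypothetical.

```python
# Vertical edge detector weights, as read from Fig. 5.1(c) (an assumption).
EDGE_3X4 = [
    [0.0, 1.0, -1.0, 0.0],
    [0.5, 0.75, -0.75, -0.5],
    [0.0, 1.0, -1.0, 0.0],
]


def edge_peak(weights, bright, dark):
    """Output when all positive weights see `bright` and all negative weights see `dark`."""
    excitatory = sum(w for row in weights for w in row if w > 0)
    inhibitory = sum(w for row in weights for w in row if w < 0)
    return excitatory * bright + inhibitory * dark


peak_10_to_1 = edge_peak(EDGE_3X4, bright=1.0, dark=0.1)
peak_10_to_5 = edge_peak(EDGE_3X4, bright=1.0, dark=0.5)
```

Under these assumptions the 10:1 case gives a peak just under 3 and the 10:5 case a peak just over 1.5, in line with the plots described above. Because the excitatory weights sum to 3.25, no contrast ratio can push the normalized peak past that value.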

The bar detector receptive field function was tested with a bright moving bar against a dark background. The width of the bar was initially set to 1.0, which is the width of one detector. The resulting outputs are shown in Fig. 5.9 and 5.10.



Figure 5.9 Bar detector outputs for moving bar of width = 1.0.



Figure 5.10 Bar detector receptive field output for bar of width = 1.0 at different contrast ratios.

The same high peak output seen in the edge detector is also evident in the bar detector, as shown in Fig. 5.9. Note also that neighboring receptive fields that share only one detector (centers at positions 3 and 5) both have a minimum when the bar sits exactly between them. As a result, a more continuous output would only be possible if the two receptive field functions shared more detectors, such as when the centers of the fields were only one pixel away.

The response of the bar detector was also simulated for bar widths other than 1.0. The plot of Fig. 5.11 shows that with bar widths of 0.5 and 2.0 the output reaches only half the peak that is obtained with a width of 1.0. Thus, such a detector function discriminates quite well in favor of bars whose width equals the width of the center detectors.
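The width selectivity can be checked with a small model of a bright bar moving over the bar detector. This is a sketch only: the weights are as we read them from Fig. 5.1(d), a partly covered pixel is assumed to report the area-weighted mean intensity, and the function names are hypothetical.

```python
DARK, BRIGHT = 0.1, 1.0

# Vertical bar detector weights, as read from Fig. 5.1(d) (an assumption).
BAR_4X3 = [
    [0.0, 0.5, 0.0],
    [-0.75, 1.0, -0.75],
    [-0.75, 1.0, -0.75],
    [0.0, 0.5, 0.0],
]


def overlap(a0, a1, b0, b1):
    """Length of the intersection of intervals [a0, a1] and [b0, b1]."""
    return max(0.0, min(a1, b1) - max(a0, b0))


def bar_output(weights, lead, width):
    """Field output for a bright bar spanning [lead - width, lead] on a dark background."""
    column_sums = [sum(col) for col in zip(*weights)]
    total = 0.0
    for j, w in enumerate(column_sums):
        lit = overlap(lead - width, lead, j, j + 1)   # lit fraction of column j
        total += w * (lit * BRIGHT + (1.0 - lit) * DARK)
    return total


def peak_output(weights, width, steps_per_pixel=10):
    """Largest output seen as the bar moves across the field."""
    n_cols = len(weights[0])
    n_steps = int((n_cols + width) * steps_per_pixel)
    return max(bar_output(weights, k / steps_per_pixel, width)
               for k in range(n_steps + 1))


peaks = {w: peak_output(BAR_4X3, w) for w in (0.5, 1.0, 2.0)}
```

With these assumptions the peak for a width of 1.0 is 2.7, and the peaks for widths of 0.5 and 2.0 are both 1.35, which is the factor of two described above.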



Figure 5.11 Bar detector receptive field output for different bar widths.

One final experiment was made to see how the output of a 3x3 DOG receptive field function would be altered if one or two of the 5 detectors were to be damaged (assuming 'damaged' means no signal). Fig. 5.12 shows the result. It was assumed that only one of the surround detectors would be affected, as the loss of the center detector would be catastrophic for the entire cell. With only one detector down, the output shows little change, except for the loss of some of the inhibitory input, which would result in a negative value anyway. Even with the loss of two surround detectors, the receptive field still functions, although with a peak output that is 33% higher than normal.
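The effect of a dead detector can be explored by dropping its term from the weighted sum. The text does not say which surround detectors were disabled in the original simulation, so the choices below are arbitrary, and the resulting numbers are not claimed to match Fig. 5.12. All names are hypothetical.

```python
DARK, BRIGHT = 0.1, 1.0

# The five detectors of the 3x3 DOG field, keyed by (column, row).
DOG_DETECTORS = {
    (1, 1): 3.0,    # center
    (0, 1): -0.5,   # left (leading) surround
    (2, 1): -0.5,   # right (trailing) surround
    (1, 0): -0.5,   # top surround
    (1, 2): -0.5,   # bottom surround
}


def dog_output(edge_pos, dead=()):
    """Field output with the edge at edge_pos; detectors listed in `dead` give no signal."""
    total = 0.0
    for (col, row), weight in DOG_DETECTORS.items():
        if (col, row) in dead:
            continue
        lit = min(1.0, max(0.0, edge_pos - col))
        total += weight * (lit * BRIGHT + (1.0 - lit) * DARK)
    return total


healthy_peak = dog_output(2.0)
one_dead_peak = dog_output(2.0, dead={(2, 1)})
two_dead_peak = dog_output(2.0, dead={(0, 1), (2, 1)})
```

In this model, losing the trailing surround detector barely changes the peak, while losing both horizontal surround detectors raises it noticeably. In both cases the field still responds to the edge, because the center detector dominates the sum.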



Figure 5.12 3x3 DOG receptive field output with 1 or 2 damaged detectors.

## 5.2. Power Consumption and WSI

For the purpose of performing a crude estimate of the expected power consumption of a large-area sensor based on the MERF model, consider what would happen if an entire 10cm wafer was used for such a sensor. Each photodiode, including some overhead for current mirrors, could be about 100 x 100 $\mu$ m. Assuming that 50% of the layout area can be given to photodetectors, and further assuming that only half of the detectors are functional, this would result in some 200,000 photodetectors. Next, consider the number of VCO's used, as this will probably be the determining factor in the power consumption. If a layout can be found that resulted in the same 100:1 compression that exists in the retina [13], then we can expect to have 2000 VCO's. Each of these is about the same size as a photodetector, so that the space taken by them is almost negligible by comparison.

In the worst case, the photodiodes could each use up to  $1\mu$ A of current, considering that the 200nA output of each detector might be mirrored and shared by 4 neighbors. This would require 200mA, but would represent a short-term worst case situation. More likely is a figure only one tenth of this, or 20mA. At 20 $\mu$ A each, the total current needed by all 2000 VCO's is 40mA with a 3V supply. Thus, the total average current required by this sensor is only 60mA @ 3V, for a power dissipation of only 180mW.
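The arithmetic behind this estimate can be laid out explicitly. The sketch below uses only the figures given in the two paragraphs above; the variable names are hypothetical, and the rounded counts (200,000 detectors, 2000 VCOs) are the ones used in the text.

```python
import math

WAFER_DIAMETER_UM = 100_000.0    # 10 cm wafer
CELL_SIDE_UM = 100.0             # photodiode plus mirror overhead
PHOTO_AREA_FRACTION = 0.5        # share of layout given to photodetectors
FUNCTIONAL_FRACTION = 0.5        # share of detectors assumed to work
COMPRESSION = 100                # detectors per VCO, as in the retina
VDD = 3.0                        # supply voltage in volts

wafer_area_um2 = math.pi * (WAFER_DIAMETER_UM / 2) ** 2
detectors_exact = (wafer_area_um2 / CELL_SIDE_UM ** 2
                   * PHOTO_AREA_FRACTION * FUNCTIONAL_FRACTION)

n_detectors = 200_000            # the rounded figure used in the text
n_vcos = n_detectors // COMPRESSION

worst_case_detector_a = n_detectors * 1e-6      # 1 uA per detector
typical_detector_a = worst_case_detector_a / 10
vco_a = n_vcos * 20e-6                          # 20 uA per VCO
total_a = typical_detector_a + vco_a
total_power_w = total_a * VDD
```

The unrounded detector count comes out slightly under 200,000, and the totals reproduce the 60mA and 180mW quoted above.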

# 5.3. Potential Resistance to Fabrication Defects

A regular array sensor can be critically affected by a single bad element, especially with serial-readout schemes such as are used in CCD arrays. This vulnerability to single defects is not necessarily shared by the receptive field architecture. There are two possibilities. If a single photodetector is rendered inoperable by a fabrication defect, the remaining detectors within the cell are not affected; only the output of that receptive field will be modified. This can be seen from the results of Sec. 5.1.2, in which it was shown from simulations that a receptive field of only 5 detectors can still function reasonably well if 1 or 2 of them are dead. If, on the other hand, the processing circuitry or the VCO of an entire receptive field cell is damaged, the surrounding cells that share the same photodetectors may still continue to function, although that particular cell is 'blind'.

While they still lack the ability to recover from damage in the way that organic detectors do, the multiple-detector scheme of the receptive field design does allow a greater degree of freedom from otherwise crippling defects, compared to other sensor designs. In the future, more research will be required into the impact of defective devices on the performance of an integrated sensor system built with these Microelectronic Receptive Field components, in order to aid in the final layout of an even more defect-resistant sensor.

## 5.4. Future Work

As this work was intended mainly as a feasibility study and proof-of-concept, much work remains in the optimization of the components, and the design of the eventual sensor. Several areas for further study are suggested as follows.

Better photodetectors should be investigated, such as the BJT phototransistor. This device should offer comparable currents using less than 10% of the area currently needed by the photodiodes. In addition, metal shielding should be used to confine the light-sensitive area of each phototransistor. This was not done in the photodetector designs tested here, and prevented these structures from being used as phototransistors.

The current mirror circuits could be extended to include non-linear functions, as outlined in [33]. For example, the addition of a resistor would allow the simple current mirror to perform a squaring operation, even in the subthreshold region. As one way of reducing the number of VCO's needed, multiplexing schemes could be investigated, which could allow several receptive field outputs to share a single VCO. In addition, attention should be given to the unique packaging needs of a sensor with so many outputs.

One effect that should not be overlooked is the low frequency output of the VCO in the case of low light intensities. This would cause the system to have a slower response in darker light than in brighter light. If the frequency range is kept as it is, then a switching system may be utilized in order to adapt to the changing light level, producing an acceptable response time in any light level.

After the component circuits have been optimized, but before the actual vision sensor can be built, extensive modelling and simulation will be required in order to determine the size, number, orientation, and layout of the individual receptive fields, according to the particular needs of the desired application. This work could easily become the topic for a Ph.D. project.

## 6. Conclusions

The main focus of this thesis has been an investigation of the component circuits that can be used in the design of a smart image sensor, based on the receptive field paradigm. With such an approach it is possible to incorporate image sensing, low-level filtering, data compression and encoding in the sensor itself, thus adding a level of data complexity to machine vision sensors.

The two sizes of photodiodes examined had diameters of  $80\mu$ m and  $160\mu$ m, and were able to produce photocurrents in the range 10pA to  $1\mu A$ . These currents are large enough to overcome the dark current of 2pA, yet small enough to allow the current mirrors to operate entirely in the subthreshold region.

Long-channel pMOS current mirrors were used to bias the photodiodes and collect the photocurrent outputs, while providing current gains of 2 and 3 times with only a 20% variation over the 6 decade range of input currents. The current mirror addition and subtraction circuits both performed linearly over the input current range of 10pA to  $10\mu$ A, although they reflected the subthreshold gain offset produced primarily by the wide-channel pMOS current mirrors.

The receptive field circuits can be hardware-configured to produce any of the standard low-level image processing functions including lowpass filtering and edge enhancement, as well as detection of edges and bars. Unlike previous smart sensor designs, which used only 5 to 25% of the cell area for photodetection, the photodetectors of a receptive field cell could use 50% or more of the circuit layout area, due to the use of very compact analog current-mode processing circuitry.

A new voltage-controlled oscillator was presented and was shown to convert photocurrent signals spanning a 4-decade range into a 4-decade range of frequency. The new VCO has a high-impedance input, providing excellent isolation and buffering of the sensor signals. In addition to drawing an average of only $20\,\mu\mathrm{A}$ at 3 V, the new circuit requires only 1/4 of the layout area of a comparable nMOS design, and can be fabricated in a standard digital CMOS process. The new VCO is approximately the same size as one of the small photodiodes.

The power consumption of the photodetectors, processing circuits, and VCO has been shown to be low enough to warrant the use of this type of sensor cell in wafer scale integration. Because the receptive field model does not rely on regular layouts, as other sensors do, and because every receptive field contains several detectors, it also has the advantage of providing a potential resistance to fabrication defects.
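The VCO figures quoted above (an average of 20 µA at 3 V) give a rough sense of scale for the power argument. The sketch below is illustrative only: it assumes one VCO per receptive field, ignores the photodetector and current mirror contributions, and uses a hypothetical field count of 1000 that does not come from this work.

```python
VCO_CURRENT_A = 20e-6   # average supply current quoted in the text
SUPPLY_V = 3.0          # supply voltage quoted in the text


def vco_power_w(current_a: float = VCO_CURRENT_A, supply_v: float = SUPPLY_V) -> float:
    """Average power drawn by one VCO, P = I * V."""
    return current_a * supply_v


def array_power_w(n_fields: int) -> float:
    """Total VCO power, assuming one VCO per receptive field (an assumption)."""
    if n_fields < 0:
        raise ValueError("field count must be non-negative")
    return n_fields * vco_power_w()


N_FIELDS = 1000  # hypothetical number of receptive fields, for illustration only

print(f"per VCO: {vco_power_w() * 1e6:.0f} uW")
print(f"{N_FIELDS} fields: {array_power_w(N_FIELDS) * 1e3:.0f} mW")
```

Each VCO therefore draws about 60 µW on average, and under the stated assumptions the total scales linearly with the number of receptive fields.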

## 7. Bibliography

- [1] Eric R. Fossum, "Architectures for focal plane image processing", Optical Engineering, vol. 28, no. 8, pp. 865-71, August 1989.
- [2] D R Clark, J Owczarczyk, "VLSI Architectures for Low-level Vision", Proc. Image Processing '88, pp. 31-44, Blenheim Online Publications, Pinner, Middx, UK, 1988.
- [3] Tim Allen, Carver Mead, Federico Faggin, and Glenn Gribble, "Orientation-Selective VLSI Retina", Proc. of Visual Communications and Image Processing '88, SPIE vol. 1001, pp. 1040-1046, 1988.
- [4] Carver Mead, <u>Analog VLSI and Neural Systems</u>, Addison-Wesley Publishing Co., Reading Mass., 1989.
- [5] Marc Tremblay and Denis Poussart, "M.A.R.: An early Vision System with Integrated Optics and Processing", Proc. of Canadian Conference of VLSI, pp. 33-40, October 1989, Vancouver, BC.
- [6] Denis Poussart, Marc Tremblay, and Abdel Djemouai, "Integrated Architectures for Computer Vision Sensing", Proc. VISION INTERFACE '90, pp. 39-46, Halifax, NS, May 1990.
- [7] Ran Ginosar and Yehoshua Y. Zeevi, "Adaptive Sensitivity/Intelligent Scan Imaging Sensor Chips", Proc. of Visual Communications and Image Processing '88, SPIE vol. 1001, pp. 462-468, 1988.
- [8] Glenn Chapman, "Laser-Linking Technology for Restructurable VLSI", <u>Wafer Scale Integration</u>, Ch. 5.2, C. Jesshope & W. Moore, eds., IOP Publishing Ltd., Bristol, England, 1986.
- [9] David Marr, <u>Vision</u>, W.H. Freeman and Co., San Francisco, 1982.
- [10] J.D. Pettigrew, K.J. Sanderson, W.R. Levick, eds, <u>Visual Neuroscience</u>, Cambridge University Press, 1986.
- [11] David Heeger, "Optical Flow Using Spatiotemporal Filters", International Journal of Computer Vision, pp. 279-302, 1988.
- [12] Andrew B. Watson, "Receptive Fields and Visual Representations", Proc. of Human Vision, Visual Processing, and Digital Display, SPIE vol. 1077, pp. 190-197, 1989.
- [13] Philip Scheltens, Andrew Rawicz, Marek Syrzycki, "Modeling and Simulation of Human Retinal Vision Processing", Proc. Canadian Conference on Electrical and Computer Engineering, Ottawa, Ont., Sept. 4-6, 1990, pp. 29.1.1-29.1.4.
- [14] D.H. Hubel and T.N. Wiesel, "Receptive fields and functional architecture of monkey striate cortex", Journal of Physiology, vol. 195, pp. 215-243, 1968.
- [15] J.P. Jones and L.A. Palmer, "The two-dimensional spatial structure of simple receptive fields in cat striate-cortex cells", Journal of Neurophysiology, vol. 58, pp. 1187-1211, 1987.
- [16] Floyd Ratliff, <u>Mach Bands: Quantitative Studies on Neural Networks in the Retina</u>, Holden-Day Inc., San Francisco CA, 1965.
- [17] Rafael Gonzalez and Paul Wintz, <u>Digital Image Processing</u>, Addison-Wesley, 1987.
- [18] L. G. Roberts, "Machine Perception of Three-Dimensional Solids", Optical and Electro-Optical Information Processing, J.T. Tippett et al., eds., M.I.T. Press, Cambridge MA, 1965, pp. 159-197.
- [19] Peter Burt, Edward Adelson, "The Laplacian Pyramid as a Compact Image Code", Readings in Computer Vision, pp. 671-79, Fischler & Firschein, eds, 1987.
- [20] S. Middelhoek, S.A. Audet, <u>Silicon Sensors</u>, Ch. 2, Academic Press, San Diego, CA, 1989.
- [21] Tudor E. Jenkins, <u>Optical Sensing Techniques and Signal Processing</u>, Prentice-Hall, 1987, Ch. 2.
- [22] Andrei P. Silard, "Electro-Optical Performance of Large-Area Silicon Sensors for Radiative Energy Signals", Sensors and Actuators, vol. 12 (1987), pp. 23-34.
- [23] Rob Barman, VLSI researcher at UBC, personal communication.
- [24] <u>Guide to the Integrated Circuit Implementation Services of the Canadian Microelectronics Corporation</u>, Canadian Microelectronics Corporation, GICIS Version 4.0, March 1989.
- [25] L. Delpup et al, "An Optically coupled Neural Network for Process Control", Canadian Conference on VLSI '90, Ottawa, Ont., pp. 4.2.1-4.2.7.
- [26] Eric Vittoz, "MOS Transistors Operated in the Lateral Bipolar Mode and Their Application in CMOS Technology", IEEE Journal of Solid-State Circuits, vol. SC-18, no. 3, pp. 273-279, 1983.
- [27] Savvas G. Chamberlain and Jim P.Y. Lee, "A Novel Wide Dynamic Range Silicon Photodetector and Linear Imaging Array", IEEE Journal of Solid-State Circuits, vol. 19, no. 1, pp. 41-48, February 1984.
- [28] <u>CCD Image Sensors</u>, Dalsa, Inc., (catalog), 1989.
- [29] S. Director, W. Maly, A. Strojwas, <u>VLSI Design for Manufacturing: Yield Enhancement</u>, Kluwer Academic Publishers, 1990.
- [30] D. Ludwig, N. Woodall, M. Spanish, "On-Focal Plane Analog-to-Digital Conversion with Detector Gain and Offset Compensation", Materials, Devices, Techniques, and Applications for Z-Plane Focal Plane Array (FPA) Technology, SPIE vol. 1097, pp. 73-84, 1989.
- [31] C. Toumazou, F.J. Lidgey, D.G. Haigh, eds., <u>Analogue IC Design: the current-mode approach</u>, Institution of Electrical Engineers, Peter Peregrinus Ltd, London, UK, 1990.
- [32] S. Kawahito, M. Kameyama, T. Higuchi, H. Yamada, "A 32 x 32-bit Multiplier Using Multiple-Valued MOS Current-Mode Circuits", IEEE Journal of Solid-State Circuits, vol. 23, no. 1, pp. 124-132, Feb. 1988.
- [33] S. Kawahito, M. Ishida, T. Nakamura, "Analogue MOS Current-Mode Circuits for Three-Dimensional Integrated Smart Image Sensor", Electronics Letters, vol. 26, no. 3, pp. 177-179, 1st Feb. 1990.
- [34] K. Boahen, P. Pouliquen, A. Andreou, R. Jenkins, "A Heteroassociative Memory Using Current-Mode MOS Analog VLSI Circuits", IEEE Transactions on Circuits and Systems, vol. 36, no. 5, pp. 747-755, May 1989.
- [35] M. Maher, S. DeWeerth, M. Mahowald, C. Mead, "Implementing Neural Architectures Using Analog VLSI Circuits", IEEE Transactions on Circuits and Systems, vol. 36, no. 5, pp. 643-651, May 1989.
- [36] S. Middelhoek, A. C. Hoogerwerf, "Smart Sensors: When and Where?", Sensors and Actuators, vol. 8 (1985) pp. 39-48.
- [37] S. Leppavuori et al, "Miniature Frequency-Output Temperature Transmitter Based on a Ceramic Capacitive Sensor", Sensors and Actuators, vol. 4 (1983), pp. 573-580.
- [38] J. Neumeister, G. Schuster, W. Von Munch, "A Silicon Pressure Sensor Using MOS Ring Oscillators", Sensors and Actuators, vol. 7 (1985), pp. 167-176.
- [39] J.E. Brignell, "Digital Processing of Sensor Signals", Proc. VLSI and Microelectronic Applications to Intelligent Peripherals, vol. 3, pp. 107-114, Hamburg, FRG, 1989.
- [40] H. Reichl, H. J. Hwang, H. Riedel, "Frequency-Analog Sensors Using the I<sup>2</sup>L Technique", Sensors and Actuators, vol. 4 (1983), pp. 247-254.
- [41] Malcolm R. Haskard and Ian C. May, <u>Analog VLSI Design: nMOS and CMOS</u>, Ch. 8, Prentice Hall of Australia Pty Ltd, 1988.
- [42] "MA/MH Design Manual", Plessey Semiconductors, Inc., Sequoia Research Park, Scotts Valley, California, 1990.
- [43] Eric Vittoz, "Micropower Techniques", from <u>Design of MOS VLSI Circuits for Telecommunications</u>, Y. Tsividis and P. Antognetti, eds, Prentice Hall, New Jersey, pp. 104-144, 1985.
- [44] L. Dunlop, "An Efficient MOSFET Current Model for Analog Circuit Simulation - Subthreshold to Strong Inversion", IEEE Journal of Solid-State Circuits, vol. 25, no. 2, pp. 616-619, April 1990.
- [45] Eric Vittoz, "The Design of High-Performance Analog Circuits on Digital CMOS Chips", IEEE Journal of Solid-State Circuits, vol. SC-20, no. 3, pp. 657-665, 1985.
- [46] Mark Grigoleit and Marek Syrzycki, "Design and Characterization of Pseudo-DTL CMOS Gates", to be published.