Improvisation of Gabor Filter Design Using Verilog HDL


2010 International Conference on Electronic Devices, Systems and Applications (ICEDSA2010)

IMPROVISATION OF
GABOR FILTER DESIGN USING VERILOG HDL
Idros, M.F.M.; Mohamed, S.A.; Razak, A.H.A.; Zoolfakar, A.S.; Al-Junid, S.A.M.
Faculty of Electrical Engineering, Universiti Teknologi MARA, Malaysia

Abstract: This paper presents an improvement to a Gabor filter design written in Verilog HDL. It details the enhancements made to the digital Gabor filter to reduce its size and to make the coding style synthesizable. The intention is to study, analyze, simplify and improve the synthesis efficiency and accuracy of the design while maintaining the same functionality. The main feature of the proposed approach is the replacement of the parallel multiplication-accumulation (MAC) unit, where the matrix convolution takes place, with a serial MAC unit. This change reduces the size of the design without jeopardizing the functionality of the digital Gabor filter. The result is an area-efficient architecture.

Index Terms: Digital filter, digital design, fingerprint, FPGA image processing, Gabor filter, MAC, Verilog HDL, Xilinx.

978-1-4244-6632-0/10/$26.00 ©2010 IEEE

I. INTRODUCTION
Fingerprint enhancement using a Gabor filter is one of the most computationally complex steps in the fingerprint verification process. A Gabor filter has a complex-valued convolution kernel, and a data format with complex values is used. Implementing the Gabor filter is therefore a significant part of fingerprint verification, and a well-designed filter helps enhance the quality of the fingerprint image. In fingerprint recognition, the Gabor filter optimally captures both local orientation and frequency information from a fingerprint image. By tuning a Gabor filter to a specific frequency and direction, the local frequency and orientation information can be obtained. Thus, it is suited for extracting texture information from images [1].

The matrix convolution takes place in the multiplication-accumulation (MAC) unit of the digital filter design. The previous MAC was designed in parallel to speed up the convolution process: a parallel design allows a group of data to be transferred simultaneously [1]. The parallel MAC, however, compromised the size of the filter. The main objective of this work is to replace the parallel MAC with a serial MAC. A serial design transfers one item of data at a time [6]. Although the serial design might compromise speed, it reduces area consumption. The speed penalty of the serial design can be overcome by operating at a higher frequency.

A. Digital Gabor Filter
The digital Gabor filter was previously designed by translating the design into Verilog using Xilinx 10.1. The target device was from the Spartan 3A family. Figure 1 summarizes the synthesized design. It can be seen that the resource utilization of the device exceeded 100% [1]. This is the point where improvement was needed to achieve an effective and efficient design.

Fig 1: Design summary

There are three major parts in the filter: CLU, ALU and MEMORY [4]. The ‘convolution’ signal indicates the operation of the filter. If the signal is high, the convolution process takes place. If it is low, the filter receives image input and stores it in the memory based on the input location. The data enters the filter pixel by pixel. The ‘PIXEL_X’ and ‘PIXEL_Y’ signals give the address of the memory location [1].

Fig 2: Top level

B. Arithmetic Unit
This is the main part of the filter, which performs the convolution process. This is where the Gabor coefficients are stored [4]. It consists of 3 parts: ROM, DECODER and MAC. The ROM
has 16 address locations, but only 9 of them are used to store the coefficients. The MAC is divided into 2 parts: multiplier and adder. The multiplier part has 9 parallel multipliers, so all multiplications are done at the same time to speed up the convolution process. The adder part consists of 8 adders connected in sequence, which sum the 9 multiplier outputs. Since the design uses 9 parallel multipliers and 8 adders, it is significantly large. Both the multiplier and the adder use the Xilinx CORE Generator IP floating_point V3.0. This IP is generated from the Xilinx library [1].

II. METHODOLOGY
The focus of this work is not to design a new digital Gabor filter but to improve the existing design so that it can be implemented on the device. An ASIC designer must consider three major factors: maximization of speed, minimization of area, and minimization of power consumption. In this work, minimization of area consumption is the main priority.

A. Design
The design of the new multiplication-accumulation unit must be done precisely [6]. This is due to the sensitivity of the transitions in a single data path. Below is the design flow of the filter.

START
CONVOLUTION = 0: input data is stored in the memory.
CONVOLUTION = 1: read data from memory.
Send one data item to the multiplier-accumulator unit (MAC).
The MAC reads data from memory and a coefficient from ROM, then performs the matrix convolution.
COUNT FOR 9 TIMES = 0: repeat from the read step.
COUNT FOR 9 TIMES = 1: output = filtered image.
STOP

Fig 3: Gabor filter flow

First, when the convolution signal is ‘0’, the input data, which is in pixel format, enters the filter and is stored in the memory. The size of the memory depends on the image size in pixels. If the image is 16x16 pixels, the memory size is 16x16 too. This means that every memory location stores the value of 1 image pixel [4][10].

When the convolution signal is set to ‘1’, the convolution process starts. The controller reads the image stored in the memory from the determined memory location and sends the data to the arithmetic unit. The arithmetic unit also contains a ROM which permanently stores the kernel coefficient values. The kernel values are likewise called by the controller into the convolution circuit. When both data items have entered the convolution circuit, multiplication and accumulation take place. Only one item of data is convolved at a time. The counter counts 9 convolution operations before the filtered result is given out. The count of 9 consecutive cycles corresponds to the 9 kernel coefficients. The accumulated value is the result of the filter.

III. RESULT AND DISCUSSION
After the Gabor filter was redesigned in Verilog using the Xilinx 10.1 software, the code was synthesized. The summary of the design is shown in Figure 4. From the summary, the number of warnings was reduced from 80 to 26. The remaining warnings are related to incomplete if and else statements, from which a latch might be generated. In this summary, the target device Spartan3-S200 was used. This device contains enough resources for a design such as this. The numbers of Slices, Slice Flip Flops and LUTs were significantly reduced.

Fig 4: Design Summary

A. Top level
Figure 5 shows the schematic view of the top-level filter. There are 6 input pins and one output pin on the top level. Newdata is the unfiltered 32-bit image data. Pixel-X and Pixel-Y hold the memory position used when a memory write occurs. The clock and reset pins carry the generated clock, with a 40 ns period, and the reset for the filter. The ‘convolution’ signal indicates the operation of the filter. If the signal is high, the convolution process takes place. If it is low, the filter receives image input and stores it in the memory based on the input location.
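
As a rough illustration of the flow described above (write mode when ‘convolution’ is low, serial multiply-accumulate over 9 coefficients when it is high), the following Python sketch models the behaviour in software. It is not the authors' Verilog: the names `GaborFilterModel`, `serial_mac` and `parallel_mac`, the 3x3 window centred on the addressed pixel, and the zero value returned outside the image are all assumptions made for illustration.

```python
def parallel_mac(pixels, coeffs):
    """Parallel style: 9 multipliers at once, then a chain of 8 adders."""
    products = [p * c for p, c in zip(pixels, coeffs)]  # 9 multipliers
    total = products[0]
    for product in products[1:]:                         # 8 adders in sequence
        total = total + product
    return total


def serial_mac(pixels, coeffs):
    """Serial style: one multiplier and one adder reused for 9 steps."""
    acc = 0.0
    count = 0
    while count < 9:                # the counter counts 9 operations
        acc = acc + pixels[count] * coeffs[count]
        count += 1
    return acc


class GaborFilterModel:
    """Behavioural model only; names and windowing are assumptions."""

    def __init__(self, coeffs, size=16):
        if len(coeffs) != 9:
            raise ValueError("expected 9 kernel coefficients")
        self.rom = list(coeffs)     # coefficient ROM (9 used locations)
        self.size = size
        # One memory location per image pixel.
        self.memory = [[0.0] * size for _ in range(size)]

    def step(self, convolution, newdata=0.0, pixel_x=0, pixel_y=0):
        """Write a pixel when convolution is low, convolve when it is high."""
        if not convolution:
            self.memory[pixel_y][pixel_x] = newdata
            return None
        # Assumption: 3x3 window centred on (pixel_x, pixel_y).
        window = [self._read(pixel_x + dx, pixel_y + dy)
                  for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
        return serial_mac(window, self.rom)

    def _read(self, x, y):
        # Assumption: locations outside the image read as zero.
        inside = 0 <= x < self.size and 0 <= y < self.size
        return self.memory[y][x] if inside else 0.0
```

Because both MAC functions add the products in the same order, they return the same value. The serial form reuses one multiplier and one adder over 9 steps, which is the area-for-speed trade-off the paper describes.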

Fig 5: Top level

From Figure 7, the output of the filter is 0.006764772 (3BDDAB06), whereas the expected result in Figure 6 is 0.006764705 (3BDDAA75). The difference is about 0.000000067, an error of only about 0.001%. This verifies that, even though the multiplication and accumulation are designed in serial, the new design still gives the same result as the previous parallel design. The serial design took 222 cycles to finish the convolution process. Since the previous design had no top-level verification, the time for the parallel design to finish the convolution is estimated at 127 cycles.

Fig 6: Real convolution data

Fig 7: Verification of top level filter

B. Controller (CLU)
The control logic unit (CLU) controls the data flow in the filter. It instructs the other blocks to do their jobs: it gives the MEMORY the address from which to read data and gives the ALU the address of the coefficient.

The CLU generates address locations only when the ‘START’ signal is high. A high level indicates that the convolution process is taking place; a low level indicates that image data is being written into the memory.

The CLU contains only 2 blocks. One is the counter for the coefficient and memory addresses, and the other is the counter decoder. The design of the counter defines the relationship between the coefficient address and the memory address. When the coefficient address has counted up to 9, the memory address in the Y-direction increments by one. The X-direction address waits until the Y-direction address has counted to 16, and then increments by one.

The CLU also reads feedback from the arithmetic unit, namely the ‘SET’ and ‘RDY’ signals. This feedback is used to control the operation signal ‘OP’ of the arithmetic unit. When ‘OP’ is high, the convolution process in the arithmetic unit starts. When ‘OP’ is low, the convolution process stops. ‘OP’ was designed this way to control exactly which data is sent to the arithmetic unit, so that there is no mismatch of data. The memory decoder decodes the data and sends the correct memory address and coefficient address separately to the memory and the arithmetic unit. Figure 8 shows the schematic view of the controller and Figure 9 verifies its operation.

Fig 8: Controller

Fig 9: Verification of CLU

C. Memory
The memory block stores the image pixels. The decoder decodes the address for the Y-direction only. The clock was removed from the decoder so that the decoded Y-direction address arrives in the same clock cycle as the X-direction address. The address for the X-direction is supplied directly from the CLU or from the filter input. The image input is also connected directly from the filter input. The ‘WRITENABLE’ signal indicates whether the operation is a write or a read.

In Figure 10, the ‘WRITENABLE’ signal is first high to indicate that the writing process is taking place. The signal then goes low to read the data in the memory. The memory gives its output in the same clock cycle in which the address arrives.
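
The counter relationship described for the CLU (the coefficient address counts to 9 before the Y address advances, and Y counts to 16 before X advances) can be sketched in Python as a nested counter. The function name `next_address`, the zero-based counting and the wrap back to zero after the last pixel are assumptions, not details taken from the design.

```python
def next_address(x, y, coeff, num_coeff=9, size=16):
    """Advance the (x, y, coeff) counters by one step.

    Assumed ordering: coeff is the fastest counter, then y, then x.
    """
    coeff += 1
    if coeff == num_coeff:      # 9 coefficient addresses have been issued
        coeff = 0
        y += 1                  # Y-direction address counts plus one
        if y == size:           # Y has counted to 16
            y = 0
            x += 1              # only now does X count plus one
            if x == size:
                x = 0           # assumption: wrap after the last pixel
    return x, y, coeff


def run_steps(steps, start=(0, 0, 0)):
    """Apply next_address repeatedly and return the final counter state."""
    state = start
    for _ in range(steps):
        state = next_address(*state)
    return state
```

Under these assumptions, a full 16x16 image takes 16 * 16 * 9 = 2304 counter steps before the addresses return to their starting values.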

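
The figures quoted for the top-level verification can be checked with a few lines of Python. The sketch assumes the hexadecimal words 3BDDAB06 and 3BDDAA75 are IEEE 754 single-precision encodings, which the text does not state explicitly; the helper name `hex_to_float32` is illustrative. It also converts the reported cycle counts into time using the 40 ns clock period.

```python
import struct


def hex_to_float32(hex_word):
    """Decode an 8-digit hex word as a big-endian IEEE 754 single (assumed format)."""
    return struct.unpack(">f", bytes.fromhex(hex_word))[0]


measured = hex_to_float32("3BDDAB06")   # filter output reported for Figure 7
expected = hex_to_float32("3BDDAA75")   # expected value reported for Figure 6

abs_error = abs(measured - expected)
rel_error_percent = 100.0 * abs_error / abs(expected)

CLOCK_PERIOD_NS = 40                    # clock period stated in the paper
serial_time_us = 222 * CLOCK_PERIOD_NS / 1000.0     # serial design, 222 cycles
parallel_time_us = 127 * CLOCK_PERIOD_NS / 1000.0   # estimated parallel design

print(f"measured = {measured:.9f}, expected = {expected:.9f}")
print(f"absolute error = {abs_error:.3e}, relative error = {rel_error_percent:.4f} %")
print(f"serial: {serial_time_us} us, parallel (estimated): {parallel_time_us} us")
```

Under that assumption the two words decode to values that agree with the quoted decimals, the relative error comes out near 0.001%, and 222 cycles at 40 ns correspond to 8.88 µs against 5.08 µs for the estimated 127 cycles.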

Fig 10: Verification memory unit

D. Arithmetic (ALU)
The arithmetic unit is the main part of this work. This is where the convolution process takes place. It consists of two parts: the ROM and the MAC. The ROM stores the 9 coefficient values that are convolved with the image, while the MAC consists of a buffer, a multiplier, an adder and a counter. The crucial part of this design is to make sure that the convolution process is aligned with the correct image data and coefficient. The ‘CONVO’ signal plays an important role in ensuring that there is no mismatch in the data read.

As Figure 11 shows, ‘CLOCK’ and ‘CONVO’ are both connected to the ROM and the MAC. When ‘CONVO’ goes from low to high, the convolution process starts. The feedback signals ‘READY’ and ‘SET’ are sent to the CLU to indicate that the convolution process has completed. The CLU then drives ‘CONVO’ from high to low before the next convolution takes place. This process is repeated for 9 complete convolutions before the convolved data is sent out.

The verification of the arithmetic unit can be observed in Figure 12.

Fig 11: Arithmetic

Fig 12: Verification of ALU

Figure 13 shows the detailed structure inside the MAC. The buffer holds the ‘CONVO’ operation for 1 cycle before the multiplier, in order to wait for the correct data to arrive from the memory for the convolution process. The multiplier and the adder are connected in series. The design was done in this way to lessen the area consumption of the filter. After 9 consecutive multiplications and additions, the counter in the MAC signals that the expected result is available. Since the design uses a single data path (referred to here as a pipeline), the multiplication and addition take a longer period of time. The total number of cycles required for the convolution in this design is 222 clock cycles, with a period of 40 ns per cycle.

Fig 13: MAC

IV. CONCLUSION
The design enhancement proposed for the Gabor filter has been successfully achieved. The area of the design has been significantly reduced while the function of the filter is maintained.

The number of slices used was reduced from 5759 in the previous design to 1625. This significant change is due to the reduction in the number of multipliers and adders used in the multiplication-accumulation unit. The enhancement made to the multiplication-accumulation unit has proven to be reliable and functional.

By adjusting the memory and the controller unit, the functionality of a complete and correct digital Gabor filter is obtained. However, the output of this Gabor filter deviates by 0.001% from the calculated data. As a consequence of minimizing the area, the design is relatively slower: it took 222 complete cycles to finish the convolution.

V. REFERENCES
[1] Razak, A.H.A. Taharim, R.H. “Implementing Gabor Filter for Fingerprint Recognition using verilog HDL,” IEEE Xplore, March 2009.
[2] P. H. W. L. Ocean Y. H. Cheung, Eric K.C. Tsang, Bertam E. Shi, “Implementing Of Gabor-type Filters on Field Programmable Gate Arrays,” 2005.
[3] A. P. Arrigo Benedetti, Nello Scarabottolo, “Image Convolution on FPGAs: the implementation of a multi-FPGA structure,” 1998.
[4] K. S. Vasily G. Moshnyaga, Keikichi Tamaru, “A Memory based architecture for real-time convolution with variable kernels,” 1998.
[5] Clifford E. Cummings, “Verilog-2001 Behavioral and Synthesis Enhancement,” Dec 2001.
[6] Himanshu Bhatnagar, “Advanced ASIC Chip Synthesis,” Kluwer Academic Publishers, 1999, pp 202-203.
[7] Don Mills, Clifford E. Cummings, “RTL Coding Styles That Yield Simulation and Synthesis Mismatches,” Oct 2000.
[8] Rajesh Bawankule, “Verilog Code Writing Guidelines,” Sept 2002.
[9] Michael D. Ciletti, “Modeling, Synthesis, and Rapid Prototyping With the Verilog HDL,” Prentice Hall, Dec 1999.
