Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/182449
Title: 16-bit high-speed CMOS multiplier IC design
Authors: Feng, Haotian
Keywords: Engineering
Issue Date: 2025
Publisher: Nanyang Technological University
Source: Feng, H. (2025). 16-bit high-speed CMOS multiplier IC design. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/182449
Abstract: This dissertation presents the design of a high-speed 16-bit CMOS multiplier. Multiplication is a basic operation in digital signal processing, cryptography, and arithmetic units, and it must be implemented efficiently in hardware to meet the growing demands of contemporary computing systems. The goal of the dissertation is to identify the design with the best speed and efficiency by investigating several multiplication algorithms, namely the Vedic, Booth, and Wallace-tree algorithms. Owing to the parallelism and simplicity of the Vedic algorithm, the Vedic multiplier has the lowest latency for 16-bit unsigned multiplication, making it an ideal choice for applications that require fast computation. In contrast, the Booth algorithm with Wallace-tree optimizations achieves better partial-product compression but introduces more complexity, which limits its performance at a 16-bit operand width. This dissertation also examines how three adder architectures, the Ripple Carry Adder (RCA), the Carry Lookahead Adder (CLA), and the Kogge-Stone Adder (KSA), affect overall multiplier performance. Relative to the RCA, the CLA and the KSA improve the computational speed of the final 32-bit addition by 93% and 51%, respectively. Simulations show that the CLA-based Wallace-Booth multiplier outperforms the KSA-based one in smaller bit-width applications because of its lower wiring overhead and complexity. The designs were described in Verilog and evaluated post-synthesis in Design Vision, using default timing constraints and the ST 65 nm process library. The results show that the Vedic multiplier with cascaded CLA adders has the shortest worst-case computation time, about 1280 ps when multiplying 16-bit unsigned numbers. This is about 3.75 times faster than a conventional RCA multiplier (4800 ps) and roughly a factor of 2 faster than the multiplier unit synthesized from DC's own library and synthesis logic (2800 ps).
The findings provide guidance on the trade-offs between speed, complexity, and hardware area in multiplier design, and point toward further research on higher-bit-width and pipeline-optimized systems.
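
Illustrative sketch (not taken from the dissertation): the abstract credits the Vedic multiplier's low latency to its parallelism. One common textbook way to build such a multiplier is to split each operand into halves, compute the four half-width products independently, and combine them with adders. The Python behavioral model below follows that decomposition. The function names `vedic_2x2` and `vedic_multiply` are hypothetical, the `+` operator stands in for the hardware adders (RCA, CLA, or KSA), and the dissertation's actual Verilog structure may differ.

```python
def vedic_2x2(a: int, b: int) -> int:
    """2x2-bit vertical-and-crosswise product from AND gates and two half adders."""
    a0, a1 = a & 1, (a >> 1) & 1
    b0, b1 = b & 1, (b >> 1) & 1
    p0 = a0 & b0                 # vertical term: least significant bits
    x, y = a1 & b0, a0 & b1      # crosswise terms
    p1, c1 = x ^ y, x & y        # first half adder: sum and carry
    z = a1 & b1                  # vertical term: most significant bits
    p2, p3 = z ^ c1, z & c1      # second half adder: sum and carry
    return p0 | (p1 << 1) | (p2 << 2) | (p3 << 3)


def vedic_multiply(a: int, b: int, width: int = 16) -> int:
    """Unsigned width x width product; width is assumed to be a power of two >= 2."""
    if width < 2 or width & (width - 1):
        raise ValueError("width must be a power of two, at least 2")
    if not (0 <= a < (1 << width) and 0 <= b < (1 << width)):
        raise ValueError("operands must fit in the given width")
    if width == 2:
        return vedic_2x2(a, b)
    half = width // 2
    mask = (1 << half) - 1
    a_lo, a_hi = a & mask, a >> half
    b_lo, b_hi = b & mask, b >> half
    # The four half-width products have no data dependency on each other,
    # so hardware can evaluate them in parallel.
    ll = vedic_multiply(a_lo, b_lo, half)
    lh = vedic_multiply(a_lo, b_hi, half)
    hl = vedic_multiply(a_hi, b_lo, half)
    hh = vedic_multiply(a_hi, b_hi, half)
    # In hardware these sums are where the adder architecture matters.
    return ll + ((lh + hl) << half) + (hh << width)
```

Calling `vedic_multiply(1234, 5678)` gives the same value as Python's built-in `1234 * 5678`. The model checks functional correctness only and says nothing about gate delay or area.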
URI: https://hdl.handle.net/10356/182449
Schools: School of Electrical and Electronic Engineering 
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: EEE Theses

Files in This Item:
File: FengHaotian_Dissertation.pdf
Description: Restricted Access
Size: 1.58 MB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.