

# **DESIGN AND ANALYSIS OF LOW POWER MBFF USING DIFFERENT LOGICS**

### Amgoth Laxman

Research Scholar, University College of Engineering, Osmania University, Hyderabad, India.

### Dr. N. Siva Sankara Reddy

Associate Professor, Department of ECE, Vasavi College of Engineering, Hyderabad, India.

### Dr. B. Rajendra Naik

Professor, Department of ECE, University College of Engineering, Osmania University, Hyderabad, India.

#### Abstract

Power dissipation is a major constraint on high-performance digital systems in the sub-micron range, so managing power in compact systems is critical. This study analyzes four types of multi-bit flip-flop (MBFF) built with conventional CMOS, transmission gates, pass transistors, and gate diffusion input (GDI) gates. According to simulation results, the proposed GDI-based MBFF has the lowest PDP. Compared with the other designs, pass-transistor logic (PTL) requires fewer transistors to build an MBFF. The circuits are simulated in T-SPICE using a 45 nm technology. The simulation reports delay, power, and Power Delay Product (PDP) for supply voltages of 0.8 V and 1.2 V.

### 1. INTRODUCTION

Because of the rapid growth of nanotechnology, the increase in integrated circuit (IC) features and performance has made power consumption a major challenge for all IC makers [1]. Technological progress has led to high transistor density, increasing the complexity of IC design [2]. Power consumption is perhaps the most significant factor in any electronic device's efficiency. Achieving high performance, low energy usage, and a small area therefore requires addressing power consumption throughout the whole design flow, from system design to circuit implementation [3]. Furthermore, because the clock signal toggles every cycle, the clock network's overall power consumption can be high. Because of this high switching activity, clock power is often the dominant source of dynamic power [4].

$$P_{clk} = C_{clk}V_{dd}^2 f_{clk}$$
(1)

Where

$P_{clk}$ - power of the clock, $f_{clk}$ - frequency of the clock, $V_{dd}$ - supply voltage of the circuit, $C_{clk}$ - switching capacitance, which includes the gate capacitance of the flip-flops driven by the clock and the interconnect capacitance of the clock network, including the capacitance of the buffers/inverters used for clocking. Furthermore, leakage power reduction has become a critical component of a low-power solution [5-6]. Leakage power is the power dissipated by a circuit while it is in sleep or standby mode. The leakage power is given by equation 2, while the propagation delay ($T_{pd}$) of the circuit is given by equation 3.

$$P_{leak} = I_{leak}V_{dd}$$
(2)





(3)

Where $I_{leak}$ - leakage current that flows through the transistor when it is turned off, $V_{dd}$ - supply voltage, $V_{th}$ - transistor's threshold voltage. Leakage power can dominate dynamic power in deep sub-micron circuitry, especially in circuits that are quiescent for extended periods [7]. The two most prevalent types of leakage are sub-threshold leakage and gate leakage. Clock gating, which stops the clock for particular flip-flops when they are not needed, is an effective method for decreasing switching capacitance [8], [9]. The power savings of clock gating, however, depend heavily on the logic function. For this reason, this work examines a different type of flip-flop cell known as a multi-bit flip-flop (MBFF) [10].
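Equations (1) and (2) can be evaluated directly. The following is a minimal sketch, not the authors' simulation flow; the function names and all numeric values (capacitance, clock frequency, leakage current) are hypothetical and chosen only to illustrate how the two terms scale with supply voltage.

```python
def clock_power(c_clk: float, v_dd: float, f_clk: float) -> float:
    """Dynamic clock power, Eq. (1): P_clk = C_clk * V_dd^2 * f_clk, in watts."""
    return c_clk * v_dd ** 2 * f_clk


def leakage_power(i_leak: float, v_dd: float) -> float:
    """Static leakage power, Eq. (2): P_leak = I_leak * V_dd, in watts."""
    return i_leak * v_dd


if __name__ == "__main__":
    # Hypothetical illustration values (assumptions, not taken from the paper).
    C_CLK = 10e-15   # switched clock capacitance in farads
    F_CLK = 1e6      # clock frequency in hertz
    I_LEAK = 1e-9    # leakage current in amperes
    for v_dd in (0.8, 1.2):
        p_clk = clock_power(C_CLK, v_dd, F_CLK)
        p_leak = leakage_power(I_LEAK, v_dd)
        print(f"Vdd = {v_dd} V: P_clk = {p_clk:.3e} W, P_leak = {p_leak:.3e} W")
```

Because the dynamic term depends on the square of the supply voltage while the leakage term is linear in it, lowering the supply reduces clock power faster than leakage power.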

Figure 1 depicts a single-bit flip-flop (SBFF) circuit consisting of a chain of two inverters and two cascaded latches. Because of DFM requirements in current technologies, and to shorten the latency between the clock edge and the output data, the inverter chain is sized large and has a high drive strength. As a result, it can easily drive two cascaded latches. As shown in Figure 2, grouping several SBFFs together lets them share the drive strength of the inverter chain.



Figure 2: An MBFF with two sets of inputs and outputs.

The multi-bit flip-flop has several advantages over the single-bit flip-flop owing to its architecture. Several recent studies use numerical results to highlight these advantages; here we aim to present the foundations of the MBFF clearly, without relying on data. The key advantages of multi-bit flip-flops are area reduction, power reduction (enabling low-power design), better clock skew control, and timing improvement, which is why MBFFs are widely used today.

The rest of the article is arranged as follows. Section 2 gives a review of the MBFF literature, Section 3 covers CMOS implementations of SBFF and MBFF, and Sections 4 and 5 examine the simulation findings and conclusion, respectively.

### 2. Literature Survey

Clock gating and MBFFs are two widely used low-power techniques addressed by Yassine Attaoui et al. [11]. The authors developed a strategy for reducing power usage in IoT devices via iterative MBFF insertion. Several experiments were conducted to evaluate the use of multi-bit flip-flops at every step of logic synthesis, placement, and optimization. The first experiment examined the impact of MBFF merging on power and circuit performance at several design stages, using 2-bit and 4-bit MBFFs. The second experiment combined 2-bit and 4-bit MBFFs and evaluated power and circuit performance. Finally, an incremental MBFF merging technique was compared with standard MBFF implementation in terms of power, performance, and area (PPA). This experiment obtained a higher overall MBFF merging coverage of 76.6%, preserved routability with reduced congestion, and reduced wire length by 3.8%, yielding a 3.3% improvement in switching power with no timing or area degradation.

Jin-Tai Yan et al. [12] considered a set of 1-bit FFs in a placement plane, the timing constraints of the associated signals on those FFs, and the MBFFs available in a cell library. Based on the timing constraints of the signals on the flip-flops, a time-constrained merging graph (TCMG) is first built. An integer linear programming (ILP) formulation then uses the MBFFs available in the given cell library to merge 1-bit FFs into MBFFs for clock power reduction. Compared with the original designs, the experimental results show that the ILP-based approach saves 20.05% of clock power on average over the five tested examples.

Taehee Lee et al. [13] investigated an energy-efficient clock network built by generating MBFFs and by clustering FFs in a gated-clock-tree-aware manner, which pulls FFs close to their common integrated clock gating cells. The approach is a compelling option for lowering clock power. The work takes multi-corner and multimode timing constraints into account for both techniques. The method was applied to industrial intellectual property (IP) blocks of a state-of-the-art portable device in 14-nm CMOS. The experimental results show that MBFF generation decreases clock power by approximately 22%, and adding gated-clock-tree-aware FF clustering on top of MBFF generation reduces power by 32%.

Doron Gluzer et al. [14] addressed data-driven clock gating (DDCG) and MBFFs, two low-power design techniques that are typically considered independently. Combining them in a unified grouping method and design methodology yields further power savings. The authors study MBFF multiplicity and its relation to the FF data-to-clock toggling probabilities. Probabilistic models are built to maximize the expected energy savings by grouping FFs according to their data-to-clock toggling probabilities. A front-end design flow guided by physical layout considerations is presented for a 65-nm 32-bit MIPS processor and a 28-nm processor. It is shown to save 23% and 17% of power, respectively, compared with designs using ordinary FFs. Integrating DDCG into the MBFFs was responsible for around half of the savings.

Taehyun Kwon et al. [15] proposed a clock network optimization technique to lower the clock network's dynamic power. Virtual tiles are generated across the design, and the most effective columns for aligning flip-flops are selected. Once these columns are determined, FFs are moved within them according to the generated tiles while keeping the displacement as small as possible. Simply aligning flip-flops can significantly reduce wire capacitance and wire length. Because the method does not alter the clock structure, it incurs no loss in timing or other constraints, in contrast to standard clock network optimization approaches that employ multi-bit flip-flops or register banks. On commercial IP designs, the method reduces wire capacitance, wire length, and via count by up to 23.2%, 10.2%, and 16.4%, respectively.

Jaehyun et al. present a hybrid flip-flop with reduced leakage and a slight improvement in setup time and clock-to-Q delay [16]. Weiqiang et al. present a leakage reduction method for adiabatic circuit design that employs two-phase complementary pass-transistor (PT) adiabatic logic with a power gating mechanism [17]. Meng Tie et al. [18] offer a dual-Vth assignment strategy enhanced with clock skew scheduling. The work first applies clock skew scheduling to widen the opportunities for static power reduction, and then reduces the leakage power. Jin Tao Jiang et al. [19] investigated gate leakage reduction in PT adiabatic logic using a PMOS pull-up configuration. In [20], Hamid et al. offer a classic data-retention approach based on a latch connected to a TG-based FF. The latch and the switches that are not on critical paths use high-threshold transistors to reduce leakage power. This method requires extra data-preserving latches and careful sequencing of the data transfer between latches and flip-flops during every transition from power-down to the active state and vice versa.

### 3. CMOS Implementation of MBFF

Because memory storage systems and sequential logic are so widely used in contemporary electronics, fundamental memory components must be designed to be low-power and fast. The D flip-flop (DFF) is an essential memory element. This section presents four distinct implementations of D flip-flops using CMOS logic.

The benefits of multi-bit flip-flops all stem from their design. Figures 1 and 2 depict a single-bit FF and a two-bit MBFF. A similar design can be drawn for higher-bit MBFFs. The overall inverter count decreases when a multi-bit flip-flop is used instead of single-bit flip-flops, and the reduction becomes more pronounced as the MBFF gets larger. Because the number of inverters is reduced in an MBFF, clock power and area are saved. After the MBFF conversion, the function of the flip-flops remains unchanged.
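The inverter saving described above can be made concrete with a simple count. The sketch below assumes that every cell, whether single-bit or multi-bit, carries exactly one two-inverter clock chain, as the descriptions of Figures 1 and 2 suggest; the function name and the default of two inverters per cell are illustrative assumptions, not figures reported in this work.

```python
def clock_inverters(num_bits: int, bits_per_cell: int, inverters_per_cell: int = 2) -> int:
    """Count the clock inverters needed to store num_bits with cells of bits_per_cell bits.

    Assumption: each cell (SBFF or MBFF) has one shared clock inverter chain.
    """
    if num_bits <= 0 or bits_per_cell <= 0:
        raise ValueError("num_bits and bits_per_cell must be positive")
    cells = -(-num_bits // bits_per_cell)  # ceiling division
    return cells * inverters_per_cell


if __name__ == "__main__":
    # Compare 1-bit, 2-bit, and 4-bit cells for an 8-bit register.
    for width in (1, 2, 4):
        print(f"{width}-bit cells: {clock_inverters(8, width)} clock inverters")
```

Under this assumption the clock inverter count falls in proportion to the MBFF width, which is the source of the clock power and area savings noted in the text.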

We describe the approach as a step-by-step procedure. As Figure 3 shows, the design parameters are first determined through a literature review. Second, the appropriate device technology is chosen according to the requirements. The proposed circuit is then optimized and simulated to ensure it meets the required parameters. Finally, the proposed technique for improving performance is presented.

For low-power operation, the transistors are designed to operate in the weak or moderate inversion region. Designing a transistor for a given inversion region has been studied within the unified all-region theory, often known as the "EKV model". This all-region description of the MOSFET relies heavily on the concept of the inversion coefficient.



Figure 3: Methodology adopted

The model is well known and is used in low-noise and low-power design [21-24]. The EKV quantities are calculated using the MOS transistor equations shown below.

$$IC = \frac{I_D}{I_S}$$
(4)

$$g_m = \frac{I_D}{n\phi_t} \frac{1 - \exp(-\sqrt{IC})}{\sqrt{IC}}$$
(5)

$$I_S = 2\mu C_{ox} \frac{W}{L} \phi_t^{2}$$
(6)







Where,

$I_S$ - normalization current.

$I_D$ - drain current.

$n$ - slope factor, normally taken as 1.

$C_{ox}$ - oxide capacitance.

$$C_{ox} = \frac{\varepsilon_0 \varepsilon_{ox}}{t_{ox}}$$

$\varepsilon_0$ - permittivity of free space.

$\varepsilon_{ox}$ - relative permittivity of the gate oxide.

$t_{ox}$ - thickness of the oxide layer.

$\phi_t$ - thermal voltage at room temperature, taken as 25.6 mV.

$IC$ - inversion coefficient.
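Equations (4) to (6) can be chained to estimate the operating region of a device. The following is a minimal sketch under stated assumptions: the mobility, oxide thickness, relative permittivity, W/L ratio, and drain current are hypothetical illustration values, and the region boundaries (IC below 0.1 for weak inversion, above 10 for strong inversion) are the commonly used convention rather than values given in this work.

```python
import math

EPS_0 = 8.854e-12   # permittivity of free space, F/m
PHI_T = 25.6e-3     # thermal voltage used in the text, V


def oxide_capacitance(eps_r: float, t_ox: float) -> float:
    """Oxide capacitance per unit area, C_ox = eps_0 * eps_r / t_ox (F/m^2)."""
    return EPS_0 * eps_r / t_ox


def normalization_current(mu: float, c_ox: float, w_over_l: float,
                          phi_t: float = PHI_T) -> float:
    """Eq. (6): I_S = 2 * mu * C_ox * (W/L) * phi_t^2, in amperes."""
    return 2.0 * mu * c_ox * w_over_l * phi_t ** 2


def inversion_coefficient(i_d: float, i_s: float) -> float:
    """Eq. (4): IC = I_D / I_S."""
    return i_d / i_s


def transconductance(i_d: float, ic: float, n: float = 1.0,
                     phi_t: float = PHI_T) -> float:
    """Eq. (5): g_m = I_D / (n * phi_t) * (1 - exp(-sqrt(IC))) / sqrt(IC)."""
    root = math.sqrt(ic)
    return i_d / (n * phi_t) * (1.0 - math.exp(-root)) / root


def inversion_region(ic: float) -> str:
    """Classify the operating region using the common 0.1 / 10 boundaries (assumed)."""
    if ic < 0.1:
        return "weak"
    if ic <= 10.0:
        return "moderate"
    return "strong"


if __name__ == "__main__":
    # Hypothetical device values (assumptions, not from the paper).
    c_ox = oxide_capacitance(eps_r=3.9, t_ox=1.2e-9)
    i_s = normalization_current(mu=0.03, c_ox=c_ox, w_over_l=10.0)
    i_d = 1e-6
    ic = inversion_coefficient(i_d, i_s)
    print(f"I_S = {i_s:.3e} A, IC = {ic:.3f} ({inversion_region(ic)} inversion)")
    print(f"g_m = {transconductance(i_d, ic):.3e} S")
```

For very small IC, Eq. (5) approaches $I_D/(n\phi_t)$, the largest transconductance obtainable per unit current, which is why weak and moderate inversion suit low-power operation.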



Figure 4: D Latch based on NAND gate

Figure 4 depicts a D latch constructed from logic gates, namely four NAND gates and a NOT gate. Every NAND gate is built in CMOS logic with four transistors, two NMOS and two PMOS, connected in a push-pull arrangement. A D latch is obtained by feeding complementary inputs to the S and R inputs of an SR latch. Figure 5 shows how to build a D flip-flop by cascading two D latches in a master-slave arrangement. The NAND and NOT gates are implemented in CMOS logic, as depicted in Figures 6 and 7, respectively.
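The NAND-based latch and its master-slave cascade can be checked with a small gate-level model. This is a behavioural sketch with zero gate delay, not a transistor-level simulation; the class and method names are hypothetical, and the clock polarity (master transparent while the clock is high, slave transparent while it is low) is an assumption.

```python
def nand(a: int, b: int) -> int:
    """Two-input NAND on logic levels 0 and 1."""
    return 0 if (a and b) else 1


class NandDLatch:
    """Gated D latch built from four NAND gates and one NOT gate."""

    def __init__(self) -> None:
        self.q, self.qb = 0, 1

    def update(self, d: int, enable: int) -> int:
        s_n = nand(d, enable)        # active-low set input of the SR core
        r_n = nand(1 - d, enable)    # the NOT gate supplies the complement of D
        # Re-evaluate the cross-coupled NAND pair until its outputs settle.
        for _ in range(4):
            q = nand(s_n, self.qb)
            qb = nand(r_n, q)
            if (q, qb) == (self.q, self.qb):
                break
            self.q, self.qb = q, qb
        return self.q


class MasterSlaveDFF:
    """Two cascaded D latches driven by opposite clock phases (assumed polarity)."""

    def __init__(self) -> None:
        self.master = NandDLatch()
        self.slave = NandDLatch()

    def step(self, d: int, clk: int) -> int:
        m = self.master.update(d, clk)         # master follows D while clk = 1
        return self.slave.update(m, 1 - clk)   # slave follows master while clk = 0


if __name__ == "__main__":
    ff = MasterSlaveDFF()
    for d, clk in [(1, 1), (1, 0), (0, 0), (0, 1), (0, 0)]:
        print(f"D={d} CLK={clk} -> Q={ff.step(d, clk)}")
```

With this polarity the output Q only changes when the clock falls, because that is the moment the slave becomes transparent while the master stops following D.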



Figure 5: Master-slave D latch based on NAND gates







Figure 7: CMOS NOT gate

Design 2 uses pass transistors (PT) and inverters for the master-slave latch, as shown in Figure 8. When clk = 0, the PMOS feedback transistor (PMOS3) is switched on and the two connected inverters (PMOS1-NMOS1 and PMOS2-NMOS2) are in storage mode. The other two chained inverters (PMOS5-NMOS5 and PMOS6-NMOS6) on the right operate in the opposite phase. The flip-flop changes its state on the falling edge of the clock.

Figure 9 depicts Design 3, which uses transmission gates (TG) and inverters. At the negative edge of the clock, transmission gates PMOS1-NMOS1 (T1) and PMOS7-NMOS7 (T4) are ON, whereas transmission gates PMOS2-NMOS2 (T2) and PMOS6-NMOS6 (T3) are OFF. During this time, the slave holds its state in a loop formed by the two inverters PMOS8-NMOS8 and PMOS9-NMOS9 and T4, so it retains the D input captured before the clock edge. The master latches the next state at the same time, but because T3 is still off, that state cannot be transferred to the slave. At the positive clock edge, T2 and T3 switch on, and the newly latched value is transmitted to the slave via the loop of two inverters, PMOS4-NMOS4 and PMOS5-NMOS5, and T2.



Figure 9: Transmission Gate logic Implementation of DFF





Figure 10: GDI logic Implementation of DFF

Figure 10 depicts two GDI-based D latches in a master-slave arrangement. In this design, the body gates control the state of the circuit. These gates are controlled by the CLK signal and create two alternative paths. One corresponds to the latch's transparent state, when the clock is low and signals propagate through the PMOS transistors. The other corresponds to the latch's holding state, when the clock is high and the internal values are maintained through the conducting NMOS transistors. The inverters are responsible for maintaining complementary values of the internal signals and the circuit outputs. The single-bit FFs in Figures 5, 8, 9, and 10 are used to implement the corresponding MBFF in each logic style, as depicted in Figure 2.

### 4. Simulation Results

To assess the effectiveness of the MBFFs, the SBFFs and MBFFs in each logic style were designed and simulated in Tanner EDA tools using 45 nm CMOS technology. The evaluation is based on performance measures such as power consumption, delay, and transistor count. To ensure consistent comparisons, all circuits described in the results run at a frequency of 20 kHz and a temperature of 27°C. The simulation results of the SBFF and MBFF are depicted in Figures 11 and 12, respectively.





Figure 11: Transient Simulation of Single Bit DFF

A DFF accepts a single data input, D. Because the inputs to a flip-flop often depend on the state of its outputs, the master-slave design has the benefit of being edge-triggered, which makes it easier to use in larger circuits. The circuit comprises two D latches connected in series. While the clock is high, the D input is captured by the first latch and the second latch cannot change state. When the clock goes low, the output of the first latch is stored in the second latch, and the first latch can no longer change. Consequently, the output changes only when the clock transitions from high to low. An MBFF accepts multiple data inputs and produces multiple data outputs. Its operation is similar to that of an SBFF: when the clock is active, the flip-flop transfers all inputs to the outputs, and when it is inactive, the flip-flop holds the stored data.
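The shared-clock behaviour described above can be sketched as a behavioural model in which every bit of the MBFF is captured and released by the same clock. The class name and interface are hypothetical, and the model assumes the falling-edge update described for the master-slave design; it does not model delay or power.

```python
from typing import List, Sequence


class MultiBitFlipFlop:
    """Behavioural N-bit flip-flop: all bits share one clock and update on its falling edge."""

    def __init__(self, width: int) -> None:
        if width < 1:
            raise ValueError("width must be at least 1")
        self._master: List[int] = [0] * width
        self._q: List[int] = [0] * width
        self._prev_clk = 0

    def step(self, d: Sequence[int], clk: int) -> List[int]:
        if len(d) != len(self._q):
            raise ValueError("input width does not match flip-flop width")
        if clk == 1:
            # Master stage tracks every D input while the clock is high.
            self._master = list(d)
        elif self._prev_clk == 1:
            # Falling edge: all slave stages take the master values together.
            self._q = list(self._master)
        self._prev_clk = clk
        return list(self._q)


if __name__ == "__main__":
    mbff = MultiBitFlipFlop(2)
    for d, clk in [([1, 0], 1), ([0, 1], 0), ([0, 1], 1), ([1, 1], 0)]:
        print(f"D={d} CLK={clk} -> Q={mbff.step(d, clk)}")
```

A 1-bit instance of the same class behaves as an SBFF, which reflects the statement that the MBFF conversion leaves the function of the flip-flops unchanged.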



Figure 12: Transient Simulation of MBFF

#### 4.1 Comparison Results

Mentor Graphics tools were used to model the MBFF circuits in 45 nm technology. The circuits built with the different techniques are compared, and the proposed circuit is found to outperform the previously proposed circuits. This section therefore compares performance across the simulation results.

##### Performance Based on the Power-Delay Product (PDP)

Table 1 compares the proposed MBFF across the different techniques. The objective is to minimize the Power Delay Product. After the required number of iterations, the best PDP value from the set was selected as the optimal PDP, and the corresponding transistor widths were chosen for building the circuit. Table 1 shows the simulated delay, power, and Power Delay Product (PDP) at supply voltages of 0.8 V and 1.2 V. According to the findings, the GDI design excelled in terms of PDP. It is also observed that power consumption increases as the supply voltage increases, while the delay decreases.

Even though pass-transistor logic and TG logic excelled in terms of power, both techniques fail to pass the full voltage swing at the output. Although the GDI technique does not achieve the shortest possible delay, its delay is small enough for practical use in modern CPUs. As technology scales, interconnect performance is becoming more dependent on resistance than on capacitance. To get the best performance, the interconnect resistance and capacitance should be re-optimized.

https://seyboldreport.net/

ISSN: 1533 - 921

Table 1: Comparative Analysis of MBFF w.r.t. different techniques

| Multi-bit flip-flop | Average power at 0.6 V (W) | Average power at 0.8 V (W) | Average power at 1 V (W) | Delay at 0.8 V (ns) | Delay at 1 V (ns) | PDP at 0.8 V (J) | PDP at 1 V (J) |
|------|--------------|--------------|--------------|----------|----------|----------------|---------------|
| CMOS | 1.470082e-007 | 2.612553e-007 | 4.134225e-007 | 699.1645 | 698.4185 | 1826.6043e-16 | 2887.4157e-16 |
| PTL  | 2.324291e-008 | 5.252566e-008 | 1.735078e-007 | 529.9619 | 506.8829 | 2662.152e-17 | 879.3837e-16 |
| TG   | 3.285213e-008 | 9.464001e-008 | 1.794333e-007 | 219.1234 | 218.1420 | 2073.7840e-17 | 391.418e-16 |
| GDI  | 3.649135e-008 | 1.077991e-007 | 3.657395e-007 | 392.1101 | 60.1316 | 422.302577e-16 | 219.9250e-16 |

### 5. Conclusion

This work presents MBFF implementations using conventional CMOS, transmission gates, pass transistors, and GDI gates, simulated in Tanner EDA tools using 45 nm CMOS technology. The average power and CLK-to-Q delay of the designs are reported. The evaluation is based on performance measures such as power consumption, delay, and transistor count. To ensure consistent comparisons, all circuits run at a frequency of 20 kHz and a temperature of 27°C. The GDI-based implementation shows considerable improvements in characteristics such as power, delay, and area. Simulated delay, power, and Power Delay Product (PDP) are shown for supply voltages of 0.8 V and 1.2 V. According to the findings, the GDI-based MBFF excelled in power and PDP. PTL requires fewer transistors to build an MBFF than the other techniques.

### References

[1] Rahman, M.; Afonso, R.; Tennakoon, H.; Sechen, C. Design automation tools and libraries for low power digital design. In Proceedings of the 2010 *IEEE Dallas Circuits and Systems Workshop*, Richardson, TX, USA, 17–18 October 2010.

[2] Lin, G.J.Y.; Hsu, C.B.; Kuo, J.B. Critical-path aware power consumption optimization methodology (CAPCOM) using mixed-VTH cells for low-power SOC designs. In Proceedings of the 2014 *IEEE International Symposium on Circuits and Systems (ISCAS), Melbourne*, VIC, Australia, 1–5 June 2014; pp. 1740–1743.

[3] Gautam, S. Analysis of multi-bit flip flop low power methodology to reduce area and power in physical synthesis and clock tree synthesis in 90nm CMOS technology. In Proceedings of the 2014 International Conference on Advances in Computing, Communications and Informatics (ICACCI), Greater Noida, New Delhi, 24–27 September 2014; pp. 570–574

[4] L.-T. Wang, Y.-W. Chang, and K.-T. Cheng, Eds., Electronic Design Automation: Synthesis, Verification, and Test. Burlington, MA: Elsevier/ Morgan Kaufmann, 2009.

[5] B.S. Deepak Subramanian and Adrian Nunez, (2007) "Analysis of Sub-threshold Leakage Reduction in CMOS Digital Circuits", *Proceedings of the 13th NASA VLSI Symposium*, USA, June 5-6.

[6] International Technology Roadmap for Semiconductors: www.itrs.net/Links/2005ITRS/Design 2005.pdf.





[7] Jaehyun Kim, Chungki Oh and Youngsoo Shin, (2009) "Minimizing Leakage Power of Sequential Circuits Through Mixed-Vt Flip-Flops and Multi-VT Combinational Gates", *Journal ACM Transactions on Design Automation of Electronic Systems*, Volume 15 Issue 1, December 2009.

[8] J. M. Rabaey, A. Chandrakasan, and B. Nikolic, Digital Integrated Circuits: A Design Perspective, 2nd ed. Upper Saddle River, NJ: Prentice-Hall, 2003.

[9] M. Keating, D. Flynn, R. Aitken, A. Gibbons, and K. Shi, Low Power Methodology Manual. Berlin, Germany: Springer, 2007.

[10] L. Chen, A. Hung, H.-M. Chen, E. Y.-W. Tsai, S.-H. Chen, M.-H. Ku, and C.-C. Chen, "Using multi-bit flip-flop for clock power saving by Design Compiler," in Proc. Synopsys User Group (SNUG), 2010 [Online]. Available: http://www.synopsys.com.cn/information/snug/2010/ using-multi-bit-flip-flop-for-clockpower-saving-by-designcompiler.

[11] Yassine Attaoui; Mohamed Chentouf; Zine El Abidine Alaoui Ismaili; Aimad El Mourabit, "A new MBFF merging strategy for post-placement power optimization of IoT devices", *2021 IEEE/ACS 18th International Conference on Computer Systems and Applications (AICCSA)*.

[12] Jin-Tai Yan; Meng-Tian Chen; Chia-Heng Yen, "Cell-aware MBFF utilization for clock power reduction", *2016 IEEE International Conference on Electronics, Circuits and Systems (ICECS)*.

[13] Taehee Lee; David Z. Pan; Joon-Sung Yang, "Clock Network Optimization With Multibit Flip-Flop Generation Considering Multicorner Multimode Timing Constraint", *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, Volume: 37, Issue: 1, January 2018.

[14] Doron Gluzer; Shmuel Wimer, "Probability-Driven Multibit Flip-Flop Integration With Clock Gating", *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, Volume: 25, Issue: 3, March 2017.

[15] Taehyun Kwon; Muhammad Imran; David Z. Pan; Joon-Sung Yang, "Virtual-Tile-Based Flip-Flop Alignment Methodology for Clock Network Power Optimization", *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, Volume: 28, Issue: 5, May 2020.

[16] Jaehyun Kim, Chungki Oh and Youngsoo Shin, (2009) "Minimizing Leakage Power of Sequential Circuits Through Mixed-Vt Flip-Flops and Multi-VT Combinational Gates", *Journal ACM Transactions on Design Automation of Electronic Systems*, Volume 15 Issue 1, December 2009.

[17] Weiqiang Zhang, Yu Zhang, Shi Xuhua and Jianping Hu, (2010) "Leakage Reduction of Power Gating Sequential Circuits Based on Complementary Pass-Transistor Adiabatic Logic Circuits", Innovative Computing & Communication, *Intl. Conference on and Information Technology & Ocean Engineering*, pp.282-285.





[18] Meng Tie, Haiying Dong, Tong Wang and Xu Cheng, (2010) "Dual-Vth leakage reduction with Fast Clock Skew Scheduling Enhancement", *Design, Automation & Test in Europe Conference & Exhibition (DATE)*.

[19] Jin Tao Jiang, Li Fang Ye and Jian Ping Hu, (2010) "Leakage Reduction of P-Type Logic Circuits Using Pass- Transistor Adiabatic Logic with PMOS Pull-up Configuration", *Journal Applied Mechanics and Materials*, Vol.39, pp. 73-78.

[20] Hamid Mahmoodi-Meimand and Kaushik Roy, (2004) "Data-Retention Flip-Flops for Power-Down Applications", *Proceedings of IEEE International Symposium on Circuits and Systems (ISCAS 2004)*, pp.677-680.

[21] Enz et al., (1995) An Analytical MOS Transistor Model Valid in all Regions of Operation and Dedicated to Low Voltage and Low Current Applications, in *Analog Integrated Circuits and Signal Processing*.

[22] Sansen, (2006) Analog Design Essentials.

[23] Cunha et al., (1998) A MOS Transistor Model for Analog Circuit Design, in *Journal of Solid-State Circuits*.

[24] Sansen, (2015) Minimum Power in Analog Amplifying Blocks, in *IEEE Solid-State Circuits Magazine*, vol. 7, no. 4.

