

# 2021 PRESS KIT



**ISSCC Press Kit Disclaimer** 

The material presented here is preliminary. As of November 6, 2020, there is not enough information to guarantee its correctness. Thus, it must be used with some caution.

# ISSCC 2021 VISION STATEMENT

The International Solid-State Circuits Conference is the foremost global forum for presentation of advances in solid-state circuits and systems-on-a-chip. The Conference offers a unique opportunity for engineers working at the cutting edge of IC design and use to maintain technical currency, and to network with leading experts.

# **Table of Contents**

- Preamble
  - FAQ on ISSCC
  - Overview: ISSCC 2021 – Integrated Intelligence is the Future of Systems
  - Plenary Session (Session 1)
- Plenary Session – Invited Papers
- Special Events (SE)
  - SE: What Technologies Will Shape the Future of Computing?
  - SE: Going Remote: Challenges and Opportunities to Remote Learning, Work, and Collaboration
  - SE: Favorite Circuit Design and Testing Mistakes of Starting Engineers
  - SE: ICs in Pandemics
  - SE: Making a Career Choice
  - SE: Student Research Preview
- Session Overviews and Highlights
  - Conditions of Publication
  - PREAMBLE
  - FOOTNOTE
  - Session 2 Overview: Highlighted Chip Releases: 5G and Radar Systems
  - Session 2 Highlights: Highlighted Chip Releases: 5G and Radar Systems
  - Session 3 Overview: Highlighted Chip Releases: Modern SoC Designs
  - Session 3 Highlights: Highlighted Chip Releases: Modern SoC Designs
  - Session 4 Overview: Processors
  - Session 4 Highlights: Processors
  - Session 5 Overview: Analog Interfaces
  - Session 5 Highlights: Analog Interfaces
  - Session 6 Overview: High Performance Receivers and Transmitters for Sub-6GHz Radios
  - Session 6 Highlights: High Performance Receivers and Transmitters for Sub-6GHz Radios
  - Session 7 Highlights: Imagers and Range Sensors
  - Session 8 Overview: Ultra-High-Speed Wireline
  - Session 8 Highlights: Ultra-High-Speed Wireline
  - Session 9 Overview: ML Processors from Cloud to Edge
  - Session 9 Highlights: ML Processors from Cloud to Edge
  - Session 10 Overview: Continuous-Time ADCs and DACs
  - Session 10 Highlights: Continuous-Time ADCs and DACs
  - Session 11 Overview: Advanced Wireline Links and Techniques
  - Session 11 Highlights: Advanced Wireline Links and Techniques
  - Session 12 Overview: Innovations in Low-power & Secure IoT
  - Session 12 Highlights: Innovations in Low-power & Secure IoT
  - Session 13 Overview: Cryo-CMOS for Quantum Computing
  - Session 13 Highlights: Cryo-CMOS for Quantum Computing
  - Session 14 Overview: mm-Wave Transceivers for Communication and Radar
  - Session 14 Highlights: mm-Wave Transceivers for Communication and Radar
  - Session 15 Overview: Compute-in-Memory Processors for Deep Neural Networks
  - Session 15 Highlights: Compute-in-Memory Processors for Deep Neural Networks
  - Session 16 Overview: Computation in Memory
  - Session 16 Highlights: Compute-in-Memory
  - Session 17 Overview: DC-DC Converters
  - Session 17 Highlights: DC-DC Converters
  - Session 18 Overview: Biomedical Devices, Circuits, and Systems
  - Session 18 Highlights: Biomedical Devices, Circuits, and Systems
  - Session 19 Overview: Optical Systems for Emerging Applications
  - Session 19 Highlights: Optical Systems for Emerging Applications
  - Session 20 Overview: High-Performance VCOs
  - Session 20 Highlights: High-Performance VCOs
  - Session 21 Overview: UWB Systems and Wake-Up Receivers
  - Session 21 Highlights: UWB Systems and Wake-Up Receivers
  - Session 22 Overview: Terahertz for Communication and Sensing
  - Session 22 Highlights: Terahertz for Communication and Sensing
  - Session 23 Overview: THz Circuits and Front-Ends
  - Session 23 Highlights: THz Circuits and Front-Ends
  - Session 24 Overview: Advanced Embedded Memories
  - Session 24 Highlights: Advanced Embedded Memories
  - Session 25 Overview: DRAM
  - Session 25 Highlights: DRAM
  - Session 26 Overview: RF Power-Amplifier and Front-End Techniques
  - Session 26 Highlights: RF Power-Amplifier and Front-End Techniques
  - Session 27 Overview: Discrete-Time ADCs
  - Session 27 Highlights: Discrete-Time ADCs
  - Session 28 Overview: Biomedical Systems
  - Session 28 Highlights: Biomedical Systems
  - Session 29 Overview: Digital Circuits for Computing, Clocking, and Power Management
  - Session 29 Highlights: Digital Circuits for Computing, Clocking, and Power Management
  - Session 30 Overview: Non-Volatile Memory
  - Session 30 Highlights: Non-Volatile Memory
  - Session 31 Overview: Analog Techniques
  - Session 31 Highlights: ΔΣ Class-D Headphone Amplifier
  - Session 32 Overview: Frequency Synthesizers
  - Session 32 Highlights: Frequency Synthesizers
  - Session 33 Overview: High-Voltage, GaN and Wireless Power
  - Session 33 Highlights: High-Voltage, GaN and Wireless Power
  - Session 34 Overview: Emerging Imaging Solutions
  - Session 34 Highlights: Emerging Imaging Solutions
  - Session 35 Overview: Adaptive Digital Techniques for Variation Tolerant Systems
  - Session 35 Highlights: Adaptive Digital Techniques for Variation Tolerant Systems
  - Session 36 Overview: Hardware Security
  - Session 36 Highlights: Hardware Security
- Trends
  - Conditions of Publication
  - PREAMBLE
  - FOOTNOTE
  - Analog – 2021 Trends
  - Power Management – 2021 Trends
  - Data Converters – 2021 Trends
  - RF – 2021 Trends
  - Wireless – 2021 Trends
  - Wireline – 2021 Trends
  - Digital Architectures & Systems (DAS) – 2021 Trends
  - Digital Circuits – 2021 Trends
  - Machine Learning (ML) & AI – 2021 Trends
  - Memory – 2021 Trends
  - IMMD – 2021 Trends (Medical)
  - IMMD – 2021 Trends (Imagers)
  - Technology Directions – 2021 Trends
- INDEX
  - Technical Topics Mapped to Papers
  - Selected Presenting Companies/Institutions Mapped to Papers
- Contact Information

#### FAQ on ISSCC

#### What is ISSCC?

ISSCC (International Solid-State Circuits Conference) is the **flagship** conference of the IEEE Solid-State Circuits Society, and it continues to be the premier technical forum for presenting advances in solid-state circuits and systems. According to the SIA, the semiconductor industry generated US\$468.8 billion in sales in 2018, and worldwide semiconductor sales are expected to reach US\$433 billion in 2020. Semiconductors are crucial components of electronic devices, and the industry is highly competitive. The year-on-year growth rate in 2020 is expected to be 5.9 percent.

#### Who Attends ISSCC?

Attendance at ISSCC 2021 is expected to be around **3000**. Corporate attendees from the semiconductor and system industries typically represent around **60%**.

#### Where is ISSCC?

The 68th ISSCC will be held virtually from February 13th through February 22nd, 2021.

#### Are there Keynote Speakers?

After a day devoted to educational events, ISSCC 2021 begins formally on Monday, February 15, 2021, with four exciting plenary talks by:

- Mark Liu, Taiwan Semiconductor Manufacturing Company, Hsinchu, Taiwan
- Victor Peng, Xilinx, San Jose, CA
- Dina Katabi, Massachusetts Institute of Technology, Cambridge, MA
- Albert J. P. Theuwissen, Delft University of Technology & Harvest Imaging

#### What is the Technical Coverage at ISSCC?

ISSCC covers a full spectrum of design approaches in advanced technical areas broadly categorized as: (1) Communication Systems, (2) Analog Systems, (3) Digital Systems, and (4) Innovations including micro-machines and MEMS, imagers, sensors, biomedical devices, as well as forward-looking developments that may take three or more years for commercialization.

#### How are ISSCC Papers Selected?

Currently, around 600 submissions are received each year across the broad spectrum of specified topics. They are reviewed by a team of over 150 scientific and industry experts from the Far East, Europe, and North America. These experts are organized into 12 Subcommittees that cover the 4 broad areas described earlier:

- Communication Systems includes Wireless, RF, and Wireline Subcommittees
- Analog Systems includes Analog, Power Management, and Data Converter Subcommittees
- Digital Systems includes Memory, Digital Circuits, Digital Architectures and Systems, and Machine Learning and AI Subcommittees
- Innovative Topics includes Imagers/MEMS/Medical Devices/Displays and Technology Directions Subcommittees

#### What Companies are Presenting this year?

Companies presenting papers at ISSCC 2021 include Analog Devices, Broadcom, Huawei Technologies, IBM, Intel, Qualcomm, Realtek Semiconductor, Renesas, Samsung, Sony, SK Hynix, STMicroelectronics, TSMC, and Texas Instruments, to name just a few. A more complete list can be found in the Index.

#### Are there educational sessions?

ISSCC features a variety of educational events which include:

- Twelve Tutorials (targeted toward participants looking to broaden their horizon)
- Six Forums (targeted toward experts in an information sharing context)
- One Short Course (targeted toward in-depth appreciation of a current hot topic)

#### Are There Other Events?

A more complete list of all activities at ISSCC 2021:

- Four Plenary Presentations
- Six Invited Industry Talks on Highlighted Chip Releases
- Technical Sessions (35 distinct sessions)
- Six Special Events and Panels, including:
  - What Technologies Will Shape the Future of Computing?
  - Going Remote: Challenges and Opportunities to Remote Learning, Work, and Collaboration
  - Favorite Circuit Design and Testing Mistakes of Starting Engineers
  - ICs in PandemICs
  - Making a Career Choice
  - Student Research Preview (for the introduction of graduate-student research-in-progress)
- Educational Sessions Featuring:
  - Twelve Tutorials
  - Six Forums
  - One Short Course
- Demonstration Sessions from Academia and Industry
- Networking Events
- Author Interview Sessions
- A Number of University Alumni Events

#### How Do I Use this Press Kit?

The Press Kit provides a PREAMBLE section that features this FAQ and other general information. The kit also includes SESSION OVERVIEWS AND HIGHLIGHTS of all 35 technical sessions into which the 195 papers are grouped, together with brief descriptions and context for each. There is also an abstract for each of the Plenary talks. For your convenience, the Kit includes two structural charts in the INDEX section: (a) a list of the 4 Technical Topics and their associated Subcommittees and Sessions; (b) a list of contributing companies and institutions with their associated papers. Thus, to locate information of interest, you can use Chart 4.1 to identify sessions of interest, and then turn to the corresponding Session Overview or Highlights section. Alternatively, if your interest is in a particular organization, then Chart 4.1 will direct you immediately to the papers of interest, each of which is detailed in its corresponding Session Overview and possibly in the Highlights section. In either case, it is useful to use Chart 4.1 to reach the appropriate Trend information, which provides a broad historical view of the context of your interest and often includes references to current ISSCC 2021 papers.

#### Anything New This Year?

ISSCC will hold an invited Industry Track (Sessions 2 and 3) in which Analog Devices, Baidu, Infineon/Google, Microsoft/AMD, Nvidia, and Texas Instruments will highlight recent hot-product releases and discuss innovative ways they solved product-level challenges.

#### **Overview: ISSCC 2021 – Integrated Intelligence is the Future of Systems**

Circuits are no longer used simply as isolated functions, but rather as part of an expanding, more complex world all around us. Correspondingly, IC functions are becoming more sophisticated, with a focus on building larger smart integrated systems. We still need some general work-horse components, such as advanced accelerators, GHz-to-THz radios to communicate, and sophisticated sensors, along with new ways to convert data from analog to digital. At the same time, power management is evolving to make systems more energy efficient. Specialized circuits are used to accelerate Artificial Intelligence algorithms to better understand the world. Furthermore, new packaging technologies will enable complex multi-chip systems, and emerging devices, together with new architectures and algorithms, are expected to enable new system functions and, ultimately, future smart applications that are not yet envisioned.

#### Plenary Session (Session 1)

The Plenary Session, on the mornings of February 15 and 16, 2021, will feature four renowned speakers:

- Mark Liu, Taiwan Semiconductor Manufacturing Company, Hsinchu, Taiwan
- Victor Peng, Xilinx, San Jose, CA
- Dina Katabi, Massachusetts Institute of Technology, Cambridge, MA
- Albert J. P. Theuwissen, Delft University of Technology & Harvest Imaging

Highlights of these Plenary talks are provided in the following section.

# **ISSCC 2021** PLENARY SESSION – INVITED PAPERS



Chair: Kevin Zhang, Taiwan Semiconductor Manufacturing Company, Hsinchu, Taiwan; ISSCC Conference Chair

Associate Chair: Makoto Ikeda, University of Tokyo, Tokyo, Japan; ISSCC International Technical-Program Chair

#### 1.1 Unleashing the Future of Innovation

Mark Liu, Taiwan Semiconductor Manufacturing Company, Hsinchu, Taiwan

The foundry business model, pioneered by TSMC more than three decades ago, brought a sea change to technology innovation and how integrated circuits (ICs) and systems are designed and manufactured. Access to semiconductor technology is no longer limited to large corporations that invest billions of dollars to build a fabrication plant. The foundry model has democratized IC innovation, making it available to all visionaries and innovators. Today, an open innovation platform that connects innovators with semiconductor-technology providers is a vital link in the global supply chain. Our industry has already begun to look beyond just engineering individual chips manufactured on wafers, and has moved to integrate individual chips into systems. System performance and energy efficiency will continue to advance at historical rates, driven by innovations from many aspects, including materials, device and integration technology, circuit design, architecture, and systems. User applications drive design choices, and design choices are enabled by technology advancements. Advances in an open innovation ecosystem will further lower the entry barriers and unleash the future of innovation!

#### 1.2 Adaptive Intelligence in the New Computing Era

Victor Peng, Xilinx, San Jose, CA

We are in a new computing era where hundreds of billions of intelligent devices are being connected and deployed in the cloud, at the edge, and at endpoints, where they generate, transport, process, and store zettabytes of unstructured data. Developing products for this new era will require platforms that are not only intelligent with embedded AI, but also adaptive to enable rapid innovation and adaptation for optimizing changing workloads and market needs, both before and after deployment. Adaptive intelligence will be pervasive in all aspects of data generation, transport, storage, and processing. Infrastructure that is massively scaled out and connected is better optimized and more resilient to change when constructed with adaptive computing platforms. This paper will describe the trends driving this new computing paradigm, including the ability of technology to scale up, scale out, and scale down, and explain how adaptive platforms are enabling these trends. We will show real use-cases where adaptive intelligent platforms are helping usher in this new computing era!

#### 1.3 Working at the Intersection of Machine Learning, Signal Processing, Sensors, and Circuits

Dina Katabi, Massachusetts Institute of Technology, Cambridge, MA

The past few decades have witnessed major advances in wireless devices and sensors. We argue, however, that future innovations require novel designs that transcend the traditional boundaries between computer science and electrical engineering, and deliver software and hardware systems where neural networks can directly interpret radio signals. We show that such a design delivers a form of x-ray vision, introducing a new class of sensors that can see through walls and occlusions. Unlike cameras, which sense visible light, the new sensors exploit the fact that radio signals traverse walls and occlusions and reflect off objects and people. The sensing devices have embedded neural networks that interpret radio reflections to see people through walls and detect their actions. Our devices can also infer people's emotions (for example, sad, happy, angry) even if the emotion does not show on one's face! These sensors can also monitor breathing, heart rate, sleep, gait, and falls, without wearable devices or body contact, enabling a new generation of contactless health and wellness monitors.

#### 1.4 There's More to the Picture Than Meets the Eye (and in the future it will become only much more)

Albert J. P. Theuwissen, Delft University of Technology & Harvest Imaging

Over the last five decades, solid-state imaging has gone through a difficult "childhood", changed technology during its "adolescence", and finally grown up to become a mature "adult" that can compete with the human visual system when it comes to image quality. State-of-the-art mobile devices enjoyed by consumers rely on a multi-disciplinary mixture of analog electronics, digital circuits, mixed-signal design, optical know-how, device physics, semiconductor technology, and algorithm development. As a result, CMOS image sensors utilized in today's mobile phones come close to perfection as far as imaging characteristics are concerned. However, this does not mean that further developments in the field are no longer necessary. On the contrary, new technologies and new materials are opening up new dimensions and new applications which complement the classical imaging functionality of sensors. This trend will ultimately convert the image sensor landscape from image capturing to smart vision. Consequently, the future of solid-state imaging will not only revolve around the shooting of beautiful images, as the market driver will no longer be limited only to mobile phones.

# ISSCC 2021 SPECIAL EVENTS

# **Special Events (SE)**

ISSCC 2021 will continue the popular tradition of special sessions where experts, often with opposing views, discuss topics which range from the lighthearted to the controversial (but always informative and entertaining!). This year's topics are "What Technologies Will Shape the Future of Computing?"; "Going Remote: Challenges and Opportunities to Remote Learning, Work, and Collaboration"; "Favorite Circuit Design and Testing Mistakes of Starting Engineers"; and "ICs in Pandemics".

In addition, ISSCC 2021 will include two further special events: Making a Career Choice and a Student Research Preview.

#### SE: What Technologies Will Shape the Future of Computing?

#### Friday, February 19

General-purpose computing has derived performance gains from clock frequency and instructions-per-clock for over four decades, achieving an impressive $\sim 10^5$ performance increase over the same timeframe. With the future of the traditional computing roadmap in doubt, this event will discuss what other technologies could help shape the future of computing.

How much further can we push traditional CPU micro-architectures? Will 3D integration help extend the traditional roadmap for another decade? Will dedicated accelerators become mainstream alternatives for everyday computing tasks? Can memory materials or architectures provide a performance breakthrough for traditional architectures? And what if error-free memory is no longer a constraint?
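To put the quoted $\sim 10^5$ gain in perspective, the short calculation below converts it into an average yearly improvement and an implied doubling time. It is only a back-of-the-envelope sketch: it assumes "four decades" means exactly 40 years and that the gain compounded at a constant rate, neither of which is stated in the text, and the variable names are illustrative.

```python
import math

# Figures taken from the paragraph above.
total_gain = 1e5   # ~10^5 overall performance increase
years = 40         # ASSUMPTION: "over four decades" read as exactly 40 years

# ASSUMPTION: constant compound growth, so gain = annual_factor ** years.
annual_factor = total_gain ** (1 / years)

# Time needed for performance to double at that constant rate.
doubling_time_years = math.log(2) / math.log(annual_factor)

print(f"average gain per year: {(annual_factor - 1) * 100:.1f}%")
print(f"implied doubling time: {doubling_time_years:.2f} years")
```

Under these assumptions the historical trend corresponds to roughly a one-third improvement per year, or a doubling about every two and a half years.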

#### SE: Going Remote: Challenges and Opportunities to Remote Learning, Work, and Collaboration

#### Friday, February 19

For years there has been a call to increase remote work, conferencing, and education. Although many companies have geographically distributed teams and students have moved to online instruction, remote working and learning have yet to become the norm despite the available technology and resources. Remote work and education provide environmental benefits as well as improved work-life integration and flexibility. Today, key challenges include effective communication, laboratories, isolation, and privacy. Being at the forefront of innovation, our community often leads technology adoption. In this evening session, we explore how to shape the inevitable shift to more distributed and remote styles of working and learning.

#### SE: Favorite Circuit Design and Testing Mistakes of Starting Engineers

#### Friday, February 19

This evening event will focus on typical mistakes that all starting graduate students seem to make when they first begin designing circuits. Many of these errors are simple and easily avoided if designers are warned in advance. They show up at different phases of design, simulation, layout, and testing. Renowned speakers will share their experience and provide examples of their own mistakes!

#### **SE: ICs in Pandemics**

#### Friday, February 19

The COVID-19 pandemic has imposed a powerful test across the globe. As the current pandemic unfolds, revolutionary social and economic changes have accelerated that would otherwise have taken decades to materialize, especially the digital transformation enabling virtual presence. The IC industry continues to forge ahead, providing the building blocks for innovations that improve the economic and social prosperity of the world. From smarter robots to automation, from connected medical devices to AI-driven data analytics, cost-effective, secure, portable and high-accuracy IC technology is already in place. This evening event brings together experts from industry and academia in cloud-connected biosensors, advanced algorithms and artificial intelligence (AI) to discuss our preparedness to combat the spread of infectious diseases now and in the future. The talks will feature recent work on accelerated drug discovery, enhanced contact tracing, continuous remote patient monitoring and data analysis with related security and privacy concerns.

#### SE: Making a Career Choice

#### Saturday, February 20

This interactive event will include several distinguished panelists representing a broad variety of career choices available to graduates in electrical and computer engineering, in the areas of start-ups, industry, research, and academia. Following short introductory remarks by each panelist, the forum will open for audience interaction, in which the audience is invited to pose a broad range of questions to the panelists.

#### **SE: Student Research Preview**

#### Saturday, February 20

The Student Research Preview (SRP) will highlight selected student research projects in progress. The SRP consists of 90-second presentations, followed by a Poster Session, by graduate students from around the world who have been selected on the basis of a short submission concerning their ongoing research.

The Student Research Preview will include a talk by a distinguished member of the solid-state circuits community, Dr. Jennifer Lloyd, Analog Devices.

# ISSCC 2021 SESSION OVERVIEWS AND HIGHLIGHTS



#### PREAMBLE

The Session Overviews and Highlights to follow serve to capture the context, highlights, and potential impact of the papers to be presented in each Session at ISSCC 2021 in February.

OBTAINING COPYRIGHT to ISSCC press material is EASY!

You may quote the Subcommittee Chair as the author of the text if authorship is required.

You are welcome to use this material, copyright- and royalty-free, with the following understanding:

- That you will maintain at least one reference to ISSCC 2021 in the body of your text, ideally retaining the date and location. For detail, see the FOOTNOTE below.
- That you will provide a courtesy PDF of your excerpted press piece and particulars of its placement to shahriar@ece.ubc.ca

#### FOOTNOTE

From ISSCC's point of view, the phraseology included in the box below captures what we at ISSCC would like your readership to know about this, the 68th appearance of ISSCC, on February 13 to February 22, 2021.

This and other related topics will be discussed at length at ISSCC 2021, the foremost global forum for new developments in the integrated-circuit industry. ISSCC, the International Solid-State Circuits Conference, will be held virtually on February 13 - February 22, 2021.


## Session 2 Overview: Highlighted Chip Releases: 5G and Radar Systems

Invited Papers

Session Chair: Theodoros Georgantas, Broadcom, Athens, Greece

Session co-Chair: Yves Baeyens, Nokia-Bell Labs, Murray Hill, NJ

Session Moderator: Alice Wang, Everactive, Plano, TX

This session highlights innovations in 5G and Radar systems announced within the last year. The invited product papers are at the cutting edge of the emerging fields of 5G and Radar. In addition to circuit content and silicon measurement results, the papers delve into practical system-related topics and into mass-production challenges and solutions (e.g., cost, reliability, thermal/voltage issues, and packaging).

- In Paper 2.1, Analog Devices describes their full line-up of millimeter-wave (mmW) 5G radios, with a focus on the millimeter-wave front-end portion, and how it addresses some of the challenges related to cost, heat dissipation and array calibration.
- In Paper 2.2, Texas Instruments showcases a high-performance 76-to-81GHz FMCW automotive radar that supports multichip cascading to enable higher angular resolution, and a compact 57-to-64GHz single-chip radar with integrated antennas on package.
- In Paper 2.3, Infineon and Google jointly introduce SOLI, a new human-machine interface that represents the first ever tiny 60GHz radar system integrated into a smartphone, the Google Pixel 4.

# Session 2 Highlights: Highlighted Chip Releases: 5G and Radar Systems

## [2.1] mm-Wave 5G Radios: Baseband to Waves

Paper Authors: Ahmed Khalil

Paper Affiliation: Analog Devices, Cairo, Egypt

Invited Industry Chairs: Dennis Sylvester, University of Michigan, Ann Arbor, MI and Alice Wang, Everactive, Plano, TX

#### CONTEXT AND STATE OF THE ART

- 5G is the next-generation mobile internet technology that will provide faster speeds and more reliable connections on smartphones and other devices.
- Designing millimeter-wave radios for the new 5G standard poses many challenges, particularly those related to operating frequency, higher cost, and heat dissipation compared with 4G, the fourth-generation technology.

#### TECHNICAL HIGHLIGHTS

- Analog Devices describes their full line-up of millimeter-Wave (mmW) 5G radios used today, with a focus on the millimeter-wave front-end portion, and addresses some of the challenges related to cost, heat dissipation and array calibration.
  - The authors disclose details about their dual-polarized 24-to-30GHz mmW radio, whose 16-channel, 2×8 high-performance beamformer delivers a best-in-class linear output power of 12dBm/channel @ 3% Error Vector Magnitude (EVM) and a 21dBm P1dB.

#### APPLICATIONS AND ECONOMIC IMPACT

- The high speed and increased network capacity of 5G have the potential to open up a variety of new applications such as self-driving vehicles, Augmented Reality (AR), Virtual Reality (VR), drones and remote surgery.
- According to the World Economic Forum, "Fast, intelligent internet connectivity enabled by 5G technology is expected to create approximately \$3.6 trillion in economic output and 22.3 million jobs by 2035 in the global 5G value chain alone." [1]

[1] http://www3.weforum.org/docs/WEF\_The\_Impact\_of\_5G\_Report.pdf

# Session 2 Highlights: Highlighted Chip Releases: 5G and Radar Systems

## [2.2] High-Performance and Small Form-Factor mm-Wave CMOS Radars for Automotive and Industrial Sensing in 76-to-81GHz and 57-to-64GHz Bands

## [2.3] SOLI: A Tiny Device for a New Human-Machine Interface

**Paper 2.2 Authors:** Krishnanshu Dandu<sup>1</sup>, Sreekiran Samala<sup>1</sup>, Karan Bhatia<sup>1</sup>, Meysam Moallem<sup>1</sup>, Karthik Subburaj<sup>2</sup>, Zeshan Ahmad<sup>1</sup>, Daniel Breen<sup>1</sup>, Sunhwan Jang<sup>1</sup>, Tim Davis<sup>1</sup>, Mayank Singh<sup>1</sup>, Shankar Ram<sup>2</sup>, Vashishth Dudhia<sup>2</sup>, Marc DeWilde<sup>1</sup>, Dheeraj Shetty<sup>2</sup>, John Samuel<sup>2</sup>, Zahir Parkar<sup>2</sup>, Cathy Chi<sup>1</sup>, Pilar Loya<sup>1</sup>, Zachary Crawford<sup>1</sup>, John Herrington<sup>1</sup>, Ross Kulak<sup>1</sup>, Abhinav Daga<sup>2</sup>, Rakesh Raavi<sup>2</sup>, Ravi Teja<sup>2</sup>, Rajesh Veettil<sup>2</sup>, Daniel Khemraj<sup>1</sup>, Indu Prathapan<sup>2</sup>, Prakash Narayanan<sup>2</sup>, Naveen Narayanan<sup>2</sup>, Sangamesh Anandwade<sup>2</sup>, Jasbir Singh<sup>2</sup>, Venkatesh Srinivasan<sup>1</sup>, Neeraj Nayak<sup>1</sup>, Karthik Ramasubramanian<sup>2</sup>, Brian Ginsburg<sup>1</sup>, Vijay Rentala<sup>1</sup>

Paper 2.2 Affiliation: <sup>1</sup>Texas Instruments, Bangalore, India, <sup>2</sup>Texas Instruments, Dallas, TX

Paper 2.3 Authors: Saverio Trotta<sup>1</sup>, Dave Weber<sup>2</sup>, Reinhard W. Jungmaier<sup>1</sup>, Ashutosh Baheti<sup>1</sup>, Jaime Lien<sup>2</sup>, Dennis Noppeney<sup>1</sup>, Maryam Tabesh<sup>2</sup>, Christoph Rumpler<sup>1</sup>, Michael Aichner<sup>3</sup>, Siegfried Albel<sup>3</sup>, Jagjit Singh Bal<sup>4</sup>, Ivan Poupyrev<sup>2</sup>

Paper 2.3 Affiliation: <sup>1</sup>Infineon Technologies, Neubiberg, Germany, <sup>2</sup>Google, Mountain View, CA, <sup>3</sup>Infineon Technologies, Villach, Austria, <sup>4</sup>Infineon Technologies, Milpitas, CA

Invited Industry Chairs: Dennis Sylvester, University of Michigan, Ann Arbor, MI and Alice Wang, Everactive, Plano, TX

#### CONTEXT AND STATE OF THE ART

- Millimeter-wave (mmWave) radar sensors are a key component of advanced driver-assistance systems for enhanced automotive safety, robotics, building automation and healthcare.

## TECHNICAL HIGHLIGHTS

- Texas Instruments showcases a high-performance 76-to-81GHz FMCW Automotive Radar that supports multi-chip cascading to enable higher angular resolution and a compact 57-to-64 GHz single-chip Radar with integrated antennas on package.
  - This paper presents the highest front-end performance for an 80GHz CMOS Radar that, in MIMO mode, enables sub-2° angular resolution, and the first 60GHz single-chip CMOS antenna-on-package Radar intended for broad market applications.
- Infineon and Google jointly introduce SOLI, a new human-machine interface (HMI) that represents the first-ever tiny 60GHz radar system integrated into a smartphone, the Google Pixel 4.
  - Smart presence detection use-case runs at <5mW for a detection range up to 5m (in the boresight), for the first time enabling the adoption of mm-wave radar sensor technology in the consumer and IoT space.

## APPLICATIONS AND ECONOMIC IMPACT

- Demand for advanced driver-assistance systems (ADAS) is expected to increase over the next decade, fueled largely by regulatory and consumer interest in safety applications that protect drivers and reduce accidents. For instance, both the European Union and the United States are mandating that all vehicles be equipped with autonomous emergency-braking systems and forward-collision warning systems by 2022.
- With the introduction of the Internet of Things (IoT), there is an increasing focus on human-to-machine interaction. Nowadays, sensors enable systems and robots to see, hear, feel and intuitively "understand" their surroundings. Millimeter-wave radar provides a very attractive solution for sensing human motion, enabling specific use cases such as smart presence, hand gesture, and vital-signs monitoring. These can enhance the user experience in wearables, mobile devices, TVs, smart homes, automotive infotainment systems and AR-VR applications.

# Session 3 Overview: Highlighted Chip Releases: Modern SoC Designs

## Invited Papers

Session Chair: Thomas Burd, AMD, Santa Clara, CA
Session Co-Chair: Rangharajan Venkatesan, Nvidia, Santa Clara, CA
Session Moderator: Dennis Sylvester, University of Michigan, Ann Arbor, MI

This session highlights three major new Systems-on-Chip released recently, spanning several application areas. The invited product papers reside at the bleeding edge within the exciting fields of gaming, machine-learning accelerators, and data-center GPUs. The papers delve into practical system-related topics, mass-production related challenges and solutions (e.g., system interconnection design decisions, thermal/voltage/acoustic issues, packaging, etc.) in addition to circuit content, software interaction, and silicon measurement results.

- In Paper 3.1, Microsoft describes their new XBOX Series X System-on-Chip, with emphasis on power-reduction techniques such as fine-grained power management and supply monitoring, thermal/acoustic constraints, and yield/performance/power tradeoffs using compute-unit redundancy. The 7nm chip improves CPU/GPU performance by 3×/2× over Microsoft's prior-generation gaming SoC.
- In Paper 3.2, Nvidia highlights their new A100 datacenter GPU and Ampere architecture, focusing on a next-generation Tensor core for efficient matrix multiplies. The 826mm<sup>2</sup> 54B transistor A100 die includes a large number of new features including support of new data types, the streamlining of data movement reflecting recent advances in deep-learning algorithms, and advanced hardware and software support for multi-GPU systems including improved high-speed I/O.
- In Paper 3.3, Baidu introduces Kunlun, their first in-house design targeting artificial intelligence. The chip seeks to combine programmability in its XPU-cluster compute unit with high energy efficiency in deep learning via its XPU software-defined neural network compute unit. This hybrid architecture combined with a unified programming model allows Kunlun to be readily applied to a range of applications; diverse examples are shown in industrial defect detection and conventional search engines.

# Session 3 Highlights: Highlighted Chip Releases: Modern SoC Designs

## [3.1] XBOX Series X: A Next-Generation Gaming Console SoC

Paper Authors: Paul Paternoster<sup>1</sup>, Andy Maki<sup>2</sup>, Andres Hernandez<sup>2</sup>, Mark Grossman<sup>1</sup>, Michael Lau<sup>1</sup>, David Sutherland<sup>2</sup>, Aditya Mathad<sup>3</sup>

Paper Affiliation: <sup>1</sup>Microsoft, Sunnyvale, CA, <sup>2</sup>Microsoft, Redmond, WA, <sup>3</sup>AMD, Austin, TX

Invited Industry Chairs: Dennis Sylvester, University of Michigan, Ann Arbor, MI and Alice Wang, Everactive, Plano, TX

## CONTEXT AND STATE OF THE ART

- Gaming consoles represent a \$40B+ market and a key driver of innovations in high-performance graphics, storage, networking, and other chip-design areas.
- Several major new gaming consoles have just been launched, and the underlying circuit and system advances in these platforms are of broad interest, not only to circuit designers but also to the general public.

#### TECHNICAL HIGHLIGHTS

- Microsoft describes their new XBOX Series X system-on-chip, with emphasis on power-reduction techniques, discussion on thermal/acoustic constraints, and yield/performance/power tradeoffs using redundancy.
  - The authors describe their 358mm<sup>2</sup> chip designed in TSMC 7nm technology, which improves CPU/GPU performance by 3×/2× and delivers 2.4× higher GPU performance per watt than Microsoft's prior-generation gaming SoC.

### APPLICATIONS AND ECONOMIC IMPACT

- This SoC will be included in tens of millions of XBOX Series X gaming consoles expected to be sold in the next few years.
- Various power-reduction techniques used in the design lead to 20-to-25% power savings, directly corresponding to enormous carbon footprint reduction within this sector, which traditionally has had a large environmental impact [1].

[1] Mills, Evan, Norman Bourassa, Leo Rainer, Jimmy Mai, Arman Shehabi, and Nathaniel Mills. "Toward greener gaming: Estimating national energy use and energy efficiency potential." The Computer Games Journal 8, no. 3-4 (2019): 157-178.

# Session 3 Highlights: Highlighted Chip Releases: Modern SoC Designs

## [3.2] The A100 Datacenter GPU and Ampere Architecture

Paper Authors: Jack Choquette, Ronny Krashinsky, Edward Lee, Vishnu Balan, Brucek Khailany

Paper Affiliation: Nvidia, Santa Clara, CA

Invited Industry Chairs: Dennis Sylvester, University of Michigan, Ann Arbor, MI and Alice Wang, Everactive, Plano, TX

## CONTEXT AND STATE OF THE ART

- GPU-based acceleration in data centers is a fast-growing market due to the diversity of workloads handled in data centers today.
- New GPU architectures should offer machine-learning acceleration capabilities to manage these diverse workloads, and are often memory-bandwidth limited, making full-system optimization (rather than optimization of a single chip) critical.

#### TECHNICAL HIGHLIGHTS

- Nvidia describes their new A100 datacenter GPU, which relies on the Ampere architecture including new canonical tensor cores or execution units to dramatically accelerate deep-learning applications.
  - The authors describe their 826mm<sup>2</sup> chip designed in TSMC 7nm technology that improves performance by 1.5-2.1× over the previous-generation data center GPU on a range of scientific applications, and by up to 2.5× on common inference tasks such as speech recognition and recommendation models.

## APPLICATIONS AND ECONOMIC IMPACT

- This GPU now serves as a foundational component in data centers across the world, in use by all leading cloud providers.
- Inference in the data center is a growing market, which McKinsey expects to reach \$9B to \$10B by 2025 [1].

[1] https://www.mckinsey.com/~/media/McKinsey/Industries/Semiconductors/Our%20Insights/Artificial%20intelligence%20hardware%20New%20opportunities%20for%20semiconductor%20companies/Artificial-intelligence-hardware.ashx

# Session 3 Highlights: Highlighted Chip Releases: Modern SoC Designs

## [3.3] Kunlun: A 14nm High-Performance AI Processor for Diversified Workloads

Paper Authors: Jian Ouyang, Xueliang Du, Yin Ma, Jiaqiang Liu

Paper Affiliation: Baidu, Beijing, China

Invited Industry Chairs: Dennis Sylvester, University of Michigan, Ann Arbor, MI and Alice Wang, Everactive, Plano, TX

## CONTEXT AND STATE OF THE ART

- Accelerators are increasingly used in data centers to provide higher performance and better energy efficiency than general-purpose processors can provide.
- Machine-learning workloads are one critical part of data centers today, but scientific computing and other more 'traditional' applications should remain supported, often calling for multi-chip solutions and corresponding programming challenges.

## TECHNICAL HIGHLIGHTS

- Baidu describes their new Kunlun AI accelerator, which offers flexibility and programmability for both machine learning and traditional applications in the data center.
  - The authors describe a new data center AI-focused chip designed in Samsung 14nm technology that uses a hybrid architecture with both matrix-multiply acceleration for deep-learning tasks (both training and inference are supported), and a programmable cluster compute unit targeting data-parallel tasks such as scientific computing. Peak performance is 281 Teraoperations per second (TOPS).

## APPLICATIONS AND ECONOMIC IMPACT

- This processor is now deployed in Baidu web searches, as well as in inference-as-a-service settings.
- Inference in the data center is a growing market, which McKinsey expects to reach \$9B to \$10B by 2025 [1].

[1] https://www.mckinsey.com/~/media/McKinsey/Industries/Semiconductors/Our%20Insights/Artificial%20intelligence%20hardware%20New%20opportunities%20for%20semiconductor%20companies/Artificial-intelligence-hardware.ashx

# Session 4 Overview: Processors

## **Digital Architectures and Systems Subcommittee**

Session Chair: Sanu Mathew, Intel, Hillsboro, OR, USA

Session Co-Chair: Shidhartha Das, Arm, Cambridge, UK

Session Moderator: Hugh Mair, MediaTek, Austin, TX

This session showcases advancements in performance, power, and implementation efficiencies for general-purpose and domain-specific processors, featuring processors from a diverse range of applications, including smartphones, video, automotive, biomedical, IoT and combinatorial optimization. In addition to circuit techniques and processor chips, innovative design methodologies are featured addressing the productivity challenge of ever-increasing design complexity.

- In Paper 4.1, MediaTek describes the 3.0GHz Arm Cortex-A78 CPU from their latest 5G Smartphone SoC. The paper details implementation and circuit techniques used to achieve the 3.0GHz clock rate in volume production. A 15% performance improvement vs. the traditional core is demonstrated, while a mid-gear CPU provides improved power efficiency for a broad range of workloads.
- In Paper 4.2, Renesas details their 12nm autonomous driving SoC. The processor features a convolutional neural network achieving 60.4TOPS at 13.8TOPS/W, combined with task-separated ASIL-D control. The authors show how a device pair meets the requirements for an air-cooled Level-3 autonomous driving application.
- In Paper 4.3, the University of California at Berkeley presents an 8-core RISC-V-based SoC with programmable-precision vector accelerators. The SoC showcases an agile design methodology, where a generator-based design flow enables a small design team to achieve silicon success with a complex 1.44GHz, multi-core 1.125M-gate mixed-signal SoC using Chisel, BAG and Hammer as component generators.
- In Paper 4.4, the University of Bologna showcases a wide-dynamic-range IoT SoC, achieving a combination of 1.7μW state-retentive sleep power and a peak performance of 32GOPS at 1.3TOPS/W efficiency. The 12mm<sup>2</sup> SoC integrates an always-on cognitive wake-up unit with 10 RISC-V application cores and embedded 4MB MRAM-based NVM.
- In Paper 4.5, University of Electronic Science and Technology of China describes a 65nm 1.74mm<sup>2</sup> biomedical AI processor capable of detecting ECG, EMG, and seizure anomaly events. The processor operates at 0.75V at frequencies up to 2.5MHz with a classification energy less than 6µJ.
- In Paper 4.6, Hitachi demonstrates a scalable 144Kb, 9-chip annealing processor system operating at 100MHz. The 128×128 spin chip, featuring 5b spin coefficients and latency optimized chip-to-chip communication, is fabricated in 40nm CMOS.
- In Paper 4.7, National Taiwan University describes a power-optimized 3.33mm<sup>2</sup> super-resolution processor supporting 90fps in Full HD. The 40nm CMOS design contains 3.11M gates, operates at 200 MHz and consumes 91mW from a 0.93V supply.
- In Paper 4.8, Samsung delivers a multi-format video decoder supporting the AV1 video standard. The 5nm CMOS decoder supports up to 8K Ultra HD at 30fps with a 468MHz clock frequency with 0.12nJ/pixel energy efficiency operating at 0.7V.
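
The system capacity quoted for Paper 4.6 can be cross-checked from the per-chip array size. The short sketch below is illustrative only; it assumes one bit of state per spin and that "Kb" means 1024 bits, neither of which is stated in the press material, and the function names are hypothetical.

```python
# Back-of-envelope check of the annealing-system capacity quoted for Paper 4.6.
# Assumptions (not stated in the press kit): one bit per spin, 1Kb = 1024 bits.

SPINS_PER_SIDE = 128  # each chip holds a 128x128 spin array (from the text)
CHIPS = 9             # 9-chip system (from the text)


def total_spins(spins_per_side: int, chips: int) -> int:
    """Total number of spins across all chips in the system."""
    return spins_per_side * spins_per_side * chips


def to_kb(count: int) -> float:
    """Convert a bit count to Kb, taking 1Kb = 1024 bits."""
    return count / 1024


spins = total_spins(SPINS_PER_SIDE, CHIPS)
print(f"{spins} spins = {to_kb(spins):.0f}Kb")
```

Under these assumptions the nine 128×128 arrays account for the full 144Kb figure.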

# **Session 4 Highlights: Processors**

# [4.1] A 7nm 5G Mobile SoC Featuring a 3.0GHz Tri-Gear Application Processor Subsystem

# [4.2] A 12nm Autonomous Driving Processor with 60.4TOPS, 13.8TOPS/W CNN Executed by Task-Separated ASIL D Control

# [4.8] An Area and Energy Efficient 0.12nJ/Pixel 8K 30fps AV1 Video Decoder in 5nm CMOS Process

**Paper 4.1 Authors:** Hsinchen Chen<sup>1</sup>, Rolf Lagerquist<sup>1</sup>, Ashish Nayak<sup>1</sup>, Hugh Mair<sup>1</sup>, Gokulakrishnan Manoharan<sup>1</sup>, Ericbill Wang<sup>2</sup>, Gordon Gammie<sup>1</sup>, Efron Ho<sup>1</sup>, Anand Rajagopalan<sup>1</sup>, Lee-Kee Yong<sup>1</sup>, Ramu Madhavaram<sup>1</sup>, Madhur Jagota<sup>1</sup>, Chi-Jui Chung<sup>1</sup>, Sudhakar Maruthi<sup>1</sup>, Jenny Wiedemeier<sup>1</sup>, Tao Chen<sup>1</sup>, Henry Hsieh<sup>2</sup>, Daniel Dia<sup>2</sup>, Amjad Sikiligiri<sup>1</sup>, Manzur Rahman<sup>1</sup>, Barry Chen<sup>2</sup>, Curtis Lin<sup>2</sup>, Vincent Lin<sup>2</sup>, Elly Chiang<sup>2</sup>, Cheng-Yuh Wu<sup>2</sup>, Po-Yang Hsu<sup>2</sup>, Jason Tsai<sup>2</sup>, Wade Wu<sup>2</sup>, Achuta Thippana<sup>1</sup>, SA Huang<sup>2</sup>

**Paper 4.1 Affiliation:** <sup>1</sup>MediaTek, Austin, TX, <sup>2</sup>MediaTek, Hsinchu, Taiwan

**Paper 4.2 Authors:** Katsushige Matsubara, Lieske Hanno, Atsushi Nakamura, Manabu Koike, Kazuaki Terashima, Shun Morikawa, Yoshihiko Hotta, Takahiro Irita, Seiji Mochizuki, Hiroyuki Hamasaki, Tatsuya Kamei

**Paper 4.2 Affiliation:** Renesas Electronics, Kodaira, Japan

Paper 4.8 Authors: Tae Sung Kim, Seokhyun Lee, Kyungkoo Lee, Sunyoung Shin, SeungSick Jun, YongMi Lee, Seungyong Lee, Homin Kang, Changhyun Yim, Yohan Lim, Eikyung Moon, Sukhwan Lim, Kyungah Jeong, Inyup Kang

Paper 4.8 Affiliation: Samsung Electronics, Hwaseong, Korea

Subcommittee Chair: Thomas Burd, AMD, Santa Clara, CA

## CONTEXT AND STATE OF THE ART

- Smartphone processors utilizing 7nm CMOS and beyond break the 3GHz barrier.
- Innovation in design productivity is emerging as a tool to constrain the escalating development costs of large processor SoCs.
- Domain-specific processors play an increased role in high performance, high complexity and low power applications.

#### **TECHNICAL HIGHLIGHTS**

- MediaTek achieves full-yield 3.0GHz Arm Cortex-A78 in 7nm technology.
  - The 3.0GHz core improves performance by 15% compared with the three 2.6GHz balanced-performance cores.
  - The di/dt voltage-droop mitigation techniques are improved over previous work.
- Renesas details their 12nm autonomous driving Level-3 ASIL-D capable SoC.
  - The SoC delivers a peak performance of 60.4TOPS at 13.8TOPS/W enabling air-cooled operation for automotive applications.
- Samsung delivers a 5nm multi-format decoder supporting the AV1 video standard.
  - The decoder supports 30fps 8K Ultra HD at 468MHz clock frequency with 0.12nJ/pixel energy.
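
The throughput and efficiency figures above imply rough power levels, which helps put claims such as air-cooled operation in context. The sketch below is an estimate under stated assumptions, not a figure reported by the authors: it assumes peak throughput and peak efficiency occur at the same operating point, and that "8K" means 7680×4320 pixels. The function names are hypothetical.

```python
# Rough power estimates derived from figures quoted in this session.
# Assumptions (not from the press kit): peak TOPS and peak TOPS/W coincide,
# and an 8K frame is 7680x4320 pixels.


def power_from_efficiency(tops: float, tops_per_watt: float) -> float:
    """Power in watts implied by throughput (TOPS) and efficiency (TOPS/W)."""
    return tops / tops_per_watt


def decoder_power_mw(width: int, height: int, fps: float, nj_per_pixel: float) -> float:
    """Decoder power in milliwatts from per-pixel energy and pixel rate."""
    pixels_per_second = width * height * fps
    watts = pixels_per_second * nj_per_pixel * 1e-9
    return watts * 1e3


cnn_watts = power_from_efficiency(60.4, 13.8)          # Paper 4.2 CNN engine
av1_mw = decoder_power_mw(7680, 4320, 30, 0.12)        # Paper 4.8 decoder

print(f"Paper 4.2 CNN engine: about {cnn_watts:.1f} W at peak")
print(f"Paper 4.8 decoder: about {av1_mw:.0f} mW at 8K 30fps")
```

Under these assumptions, the CNN engine draws a few watts at peak and the decoder on the order of a hundred milliwatts.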

## APPLICATIONS AND ECONOMIC IMPACT

- Circuit and architectural innovations enable a combination of efficiency gains and performance improvements for next-generation applications in 5G and ADAS.
- Combining non-volatile memories with algorithmic optimizations enables ultra-low-power always-on processing with high-performance sensor computing for new applications in IoT and biomedical electronics.
- Agile design methodologies mitigate the cost and complexity impact of modern process technologies enabling processor designs for next-generation applications.

# Session 5 Overview: Analog Interfaces

## **Analog Subcommittee**

Session Chair: Jens Anders, University of Stuttgart, Germany

Session Co-Chair: Taeik Kim, Samsung Electronics, Korea

Session Moderator: David Blaauw, University of Michigan, MI

This session highlights advances in state-of-the-art analog interfaces. The first paper describes a power-aware high-performance humidity sensor, followed by an ultra-low-voltage capacitance-to-digital converter without external references. Next, two temperature sensors are presented, one with the most compact size ever reported for hot-spots monitoring, and the other with a high self-calibrated accuracy of up to 0.03°C using a hybrid sensor core. The following two papers report highly-efficient magnetometers for contactless current sensing. The session continues with a high-resolution MEMS Coriolis mass-flow sensor readout. The last paper introduces a high-slew single-stage amplifier for large capacitive loads, showcasing the best figures-of-merit over the state of the art.

- In Paper 5.1, Peking University describes a CMOS humidity sensor that achieves a 0.0094% RH resolution with 1.5µW power consumption, demonstrating a state-of-the-art resolution-FoM of 0.135pJ•%RH<sup>2</sup>.
- In Paper 5.2, the National University of Singapore presents a capacitance-to-digital converter (CDC) for direct harvester-powered low-cost systems, showing a 7-bit ENOB down to 0.3V at 1.37nW power without any external reference or voltage-regulation requirements.
- In Paper 5.3, Delft University of Technology demonstrates a highly digital resistor-based temperature sensor that exhibits a ±1.3°C (3σ) inaccuracy from -55°C to 125°C after a 1-point trimming, while occupying only 2210µm<sup>2</sup>.
- In Paper 5.4, Delft University of Technology introduces a hybrid thermal-diffusivity and resistor-based temperature sensor. It achieves a self-calibrated inaccuracy of ±0.25°C (3σ) from -55°C to 125°C after a 2-point trimming, without requiring any external voltage/temperature references.
- In Paper 5.5, the Massachusetts Institute of Technology showcases a BW-scalable integrated-fluxgate (IFG) magnetometer for contactless current sensing, performing efficient duty-cycled operation with 100× lower compensation energy and 20× lower quiescent power at 3kS/s compared with prior art.
- In Paper 5.6, Delft University of Technology reveals a fully integrated hybrid Hall+coil sensor for wideband magnetic current-sensing applications. Using an S-shaped lead frame for differential field measurement, it attains a high resolution of 64mA<sub>rms</sub> at 19mW and 1.8MHz BW, corresponding to a state-of-the-art FoM of 23.1.
- In Paper 5.7, Delft University of Technology develops a MEMS Coriolis Mass-Flow Sensor suitable for both liquids and gases, featuring a 300µg/h/√Hz resolution and a ±0.8mg/h zero stability while consuming only 8.4mW.
- In Paper 5.8, KAIST reports a rail-to-rail-input high-slew single-stage amplifier that uses a parallel linear OTA and dynamic Class-C amplifier configuration. It demonstrates >100dB DC gain and 0.01-to-0.127MHz GBW over 0.8-to-10nF load with >58.6° phase margin.

# **Session 5 Highlights: Analog Interfaces**

# [5.1] 1.5µW 0.135pJ·%RH<sup>2</sup> CMOS Humidity Sensor Using Adaptive Range-Shift Zoom CDC and Power-Aware Floating Inverter Amplifier Array

# [5.3] A Highly Digital 2210µm<sup>2</sup> Resistor-Based Temperature Sensor with a One-Point Trimmed Inaccuracy of ±1.3°C (3σ) from -55°C to 125°C in 65nm CMOS

Paper 5.1 Authors: Heyi Li<sup>1</sup>, Zhichao Tan<sup>2</sup>, Yuanxin Bao<sup>3</sup>, Han Xiao<sup>3</sup>, Hao Zhang<sup>1</sup>, Kaixuan Du<sup>1</sup>, Yihan Zhang<sup>1</sup>, Le Ye<sup>1,3</sup>, Ru Huang<sup>1</sup>

**Paper 5.1 Affiliation:** <sup>1</sup>Peking University, Beijing, China, <sup>2</sup>Zhejiang University, Hangzhou, China, <sup>3</sup>Advanced Institute of Information Technology of Peking University, Hangzhou, China

Paper 5.3 Authors: Jan A. Angevare<sup>1</sup>, Youngcheol Chae<sup>2</sup>, Kofi A. A. Makinwa<sup>1</sup>

Paper 5.3 Affiliation: <sup>1</sup>Delft University of Technology, Delft, The Netherlands, <sup>2</sup>Yonsei University, Seoul, Korea

Subcommittee Chair: Kofi A.A. Makinwa, Delft University of Technology, Delft, The Netherlands, Analog

## CONTEXT AND STATE OF THE ART

- Paper 5.1: Energy-efficient sensing systems enable perpetual detection/monitoring in many emerging IoT applications. Low power consumption and low noise with high flexibility are the key requirements for successful system deployment.
- Paper 5.3: Microprocessors and SoCs require multiple temperature sensors to prevent overheating and ensure reliable operation. With the ever-growing demands for high performance within a small form-factor, local sensors for hot-spots monitoring with ultra-compact size and moderate accuracy are becoming a necessity within dense layouts.

### TECHNICAL HIGHLIGHTS

- Peking University describes a CMOS humidity sensor for power-aware IoT nodes achieving the best FoM among the state-of-the-art humidity sensing systems.
  - This work enables highly flexible signal readout through dynamic OTA adjustment and adaptive range-shifting techniques. With a ~1ms conversion time, it achieves the lowest power of 1.5µW and a high resolution of 0.0094% RH, featuring a FoM of 0.135 pJ·%RH<sup>2</sup> which is >6× better than the state of the art.
- Delft University of Technology introduces a resistor-based CMOS temperature sensor featuring a competitive 1-point trimmed inaccuracy of ±1.3°C (3σ) over the military range with a highly compact area of 2210µm<sup>2</sup>.
  - Fabricated in 65nm CMOS, this work demonstrates a highly digital phase-domain delta-sigma readout with ~10× better resolution and ~50× lower power than similar works. It also achieves a ~3× area reduction compared to prior resistor-based CMOS temperature sensors.
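
The humidity-sensor figures for Paper 5.1 can be tied together with a short calculation. The sketch below assumes the commonly used resolution figure-of-merit for sensor interfaces, FoM = (energy per conversion) × (resolution)², which the press material does not define explicitly; the function names are hypothetical.

```python
# Cross-check of the Paper 5.1 figure-of-merit.
# Assumption (not stated in the press kit): the resolution FoM is defined as
# energy per conversion multiplied by the square of the resolution.


def resolution_fom_pj(power_w: float, conversion_time_s: float, resolution: float) -> float:
    """Resolution FoM in pJ*(unit)^2 for a given power, conversion time and resolution."""
    energy_j = power_w * conversion_time_s
    return energy_j * resolution ** 2 * 1e12


def implied_conversion_time_s(fom_pj: float, power_w: float, resolution: float) -> float:
    """Conversion time implied by a quoted FoM, power and resolution."""
    return fom_pj * 1e-12 / (power_w * resolution ** 2)


# 1.5uW power, ~1ms conversion time, 0.0094 %RH resolution (from the text)
fom = resolution_fom_pj(1.5e-6, 1e-3, 0.0094)
t_conv = implied_conversion_time_s(0.135, 1.5e-6, 0.0094)

print(f"FoM with exactly 1ms conversion: {fom:.4f} pJ*%RH^2")
print(f"Conversion time implied by 0.135 pJ*%RH^2: {t_conv * 1e3:.3f} ms")
```

With a conversion time of exactly 1ms the result lands slightly below the quoted 0.135pJ·%RH<sup>2</sup>, which is consistent with the "~1ms" wording in the text.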

#### APPLICATIONS AND ECONOMIC IMPACT

- Advances in energy-efficient interface circuits with improved flexibility and performance enable many new sensing systems for autonomous IoT applications.
- Self-contained temperature sensors with high energy efficiency and compactness can prevent over-temperature-induced instability while boosting the system performance.

# Session 6 Overview: High Performance Receivers and Transmitters for Sub-6GHz Radios

## **Wireless Subcommittee**

Session Chair: Yiwu Tang, Qualcomm Technologies, San Diego, CA

Session Co-Chair: Yuan-Hung Chung, MediaTek, Hsinchu, Taiwan

Session Moderator: Sudhakar Pamarti, University of California, Los Angeles, CA

In sub-6GHz radios, in-band signal fidelity, out-of-band interference tolerance, large signal bandwidth and low power consumption are highly desirable for 5G and WiFi applications. This session will demonstrate advancements of the state-of-the-art in a high-drain-efficiency transmitter, in blocker-resistant receivers by N-path filtering, in spatial and spectral suppression, and in filtering by aliasing.

- In Paper 6.1, Samsung Electronics presents a 14nm FinFET CMOS transceiver supporting legacy and New-Radio (FR1) cellular and dual-mode GNSS standards with dynamic PA biasing and adaptive supply-voltage regulation to deliver low-power operation.
- In Paper 6.2, Delft University of Technology presents a +27dBm peak-output-power digital I/Q transmitter in 40nm CMOS and demonstrates 4-way Doherty power amplification to achieve 40% efficiency at 10dB power back-off.
- In Paper 6.3, the University of California, Los Angeles presents a filtering-by-aliasing receiver front-end that enables intra- and inter-band carrier aggregation with low LO leakage re-radiation. At a supply of 0.9V, +35dBm OOB IIP3 and <-81dBm LO leakage power are achieved.
- In Paper 6.4, the University of Minnesota presents a 4×4 MIMO receiver in 65nm CMOS utilizing on-chip spatial and spectral filters for improved in-band and in-beam blocker tolerance. The measured in-band / in-beam & in-band / in-notch IIP3 are 5.48dBm and 22.81dBm respectively.
- In Paper 6.5, Delft University of Technology presents a 0.4-to-3.2GHz blocker-tolerant receiver with 3<sup>rd</sup>-order filtering that supports 160MHz RF BW for 5G NR applications while achieving +13dBm of OOB IIP3.
- In Paper 6.6, Columbia University presents a 0.1-to-1GHz full-duplex receiver with 54dB self-interference cancellation across an 80MHz bandwidth. Multi-tap cancelers with switched capacitor true time delay lines that span 0.25 to 1.75ns at RF and 12.5 to 87.5ns at baseband are used.
- In Paper 6.7, Nanyang Technological University presents a 5GHz noise-canceling LNA with 1.75dB NF and 25mW power consumption using a transformer to extract the difference between drain and gate voltages of a common-source amplifier. The associated parallel resonance absorbs parasitic capacitances and provides selectivity.

# Session 6 Highlights: High Performance Receivers and Transmitters for sub-6GHz Radios

## [6.1] A Low-Power and Low-Cost 14nm FinFET RFIC Supporting Legacy Cellular and 5G FR1

## [6.2] A 4-Way Doherty Digital Transmitter Featuring 50%-LO Signed IQ Interleave Upconversion with more than 27dBm Peak Power and 40% Drain Efficiency at 10dB Power Back-Off Operating in the 5GHz Band

## [6.4] A 1-to-3GHz Co-Channel Blocker Resistant, Spatially and Spectrally Passive MIMO Receiver in 65nm CMOS with +6dBm In-Band / In-Notch B<sub>1dB</sub>

**Paper 6.1 Authors:** Jongsoo Lee, Byoungjoong Kang, Seongwon Joo, Seokwon Lee, Joongho Lee, Seunghoon Kang, Ikkyun Jo, Suseop Ahn, Jaeseung Lee, Jeongyeol Bae, Wonjun Jung, Sangho Lee, Sangsung Lee, Euiyoung Park, Sungjun Lee, Jeongkyun Woo, Jaehoon Lee, Yanghoon Lee, Kyungmin Lee, Jongwoo Lee, Thomas Byunghak Cho, Inyup Kang

Paper 6.1 Affiliation: Samsung Electronics, Hwaseong-si, Korea

Paper 6.2 Authors: Mohammadreza Beikmirza, Yiyu Shen, Dieuwert Mul, Mohammadreza Mehrpoo, Mohsen Hashemi, Leonardus de Vreede, Morteza S. Alavi

Paper 6.2 Affiliation: Delft University of Technology, Delft, The Netherlands

Paper 6.4 Authors: Jitesh Poojary, Ramesh Harjani

Paper 6.4 Affiliation: University of Minnesota, Minneapolis, MN

Subcommittee Chair: Stefano Pellerano, Intel, Hillsboro, OR, Wireless

## CONTEXT AND STATE OF THE ART

- The much-anticipated introduction of 5G calls for low-cost, low-power transceivers that could support the increased data rates of the new standard and be backward compatible to existing 2G/3G/4G infrastructure.
- Operating in a dense spectrum environment, the new generation of transceivers needs to withstand strong unwanted signals coming from their own or nearby transmitters without a noticeable drop in receive signal quality.
- To extend the battery life and support high data rates, advanced transmitter techniques should improve system efficiency while scaling down with technology.

## TECHNICAL HIGHLIGHTS

- Samsung introduces a highly integrated low-cost, low-power transceiver IC supporting legacy 2G/3G/4G and new-radio (FR1) communications with dual-mode global navigation satellite system (GNSS).
  - The 14nm IC employs extensive integration and on-chip power-supply-rails consolidation to reduce BOM cost, a dynamic biasing and adaptive-voltage-supply scheme to reduce power consumption, and a fully digital modem interface.
- Delft University of Technology introduces a high-output-power and high-efficiency digital I/Q transmitter.
  - By employing the Doherty architecture with 4-way power combining, the transmitter achieves 40% drain efficiency at 10dB back-off from peak output power, enabling substantial reduction in power consumption when transmitting complex modulation signals with large peak-to-average power ratio.
  - Integration of 4 parallel signal paths within a small form factor is achieved using a novel area-optimized matching network/combiner design.
- The University of Minnesota presents a 4×4 MIMO receiver in 65nm CMOS utilizing on-chip spatial and spectral filters for improved in-band and in-beam blocker tolerance.
  - Blockers coming from the same direction as wanted signals are spectrally filtered, while blockers in the same band as the wanted signal are spatially filtered. The proposed architecture, based on N-path mixer-first front-ends, provides both types of filtering using all-passive techniques for superior blocker resistance.
  - In comparison to prior MIMO receivers, this design provides the best in-band / in-notch linearity with an 18dB improvement of the in-band / in-notch B<sub>1dB</sub>.
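
To put the 40% drain efficiency at 10dB back-off in perspective, the following minimal sketch compares it against an ideal class-B amplifier, whose efficiency falls in proportion to the output voltage amplitude. The class-B reference and the function name `class_b_efficiency` are illustrative assumptions, not a baseline taken from the paper.

```python
import math

def class_b_efficiency(backoff_db: float) -> float:
    """Ideal class-B drain efficiency at a given output-power back-off in dB.

    Textbook idealization: efficiency is pi/4 at peak power and scales
    linearly with the output voltage amplitude below that.
    """
    amplitude_ratio = 10 ** (-backoff_db / 20)  # voltage relative to peak
    return (math.pi / 4) * amplitude_ratio

reported_efficiency = 0.40                    # figure quoted for Paper 6.2
ideal_class_b = class_b_efficiency(10.0)      # hypothetical reference point

# For equal RF output power, DC power scales as 1 / efficiency.
dc_power_ratio = reported_efficiency / ideal_class_b

print(f"ideal class-B efficiency at 10 dB back-off: {ideal_class_b:.1%}")
print(f"class-B DC power relative to the reported design: {dc_power_ratio:.2f}x")
```

Under these assumptions the idealized class-B stage sits near 25% efficiency at the same back-off, which is why efficiency enhancement at back-off matters for signals with a large peak-to-average power ratio.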

## APPLICATIONS AND ECONOMIC IMPACT

- 5G radio technology promises tens of Gb/s data-rates with a 10× reduction of latency. This will enable applications in enhanced mobile broadband, massive internet of things and mission-critical services.
- Advances in WiFi 6/6E offer higher data throughput for immersive experience applications, complement wireline and cellular technologies, and enable the delivery of services such as mesh networking and cloud management.
- Wide data bandwidth and high-reliability wireless communication facilitate health monitoring and tracking, distance learning, remote working, and virtual conferences.

# **Session 7 Overview: Imagers and Range Sensors**

## Imagers, Medical, MEMS and Displays Subcommittee

Session Chair: Vyshnavi Suntharalingam, MIT Lincoln Laboratory, Lexington, MA

Session Co-Chair: Calvin Yi-Ping Chao, TSMC, Hsinchu, Taiwan

Session Moderator: Bruce Rae, ST Microelectronics, Edinburgh, United Kingdom

This session covers a wide variety of imagers and range sensors for different applications. For imagers, innovations are reported achieving smaller pixel pitch, higher frame rate or increased on-board intelligence. For ranging, improved SPAD- and MEMS-based LiDAR are presented and improved depth sensing is achieved. The first paper describes a photodiode-based indirect Time-of-Flight (iToF) depth sensor, followed by three Single-Photon Avalanche Diode (SPAD)-based range sensors: a direct Time-of-Flight (dToF) flash LiDAR, a LiDAR system using MEMS mirrors for scanning, and a flash LiDAR with smaller pitch and advanced process node. The next paper presents a SPAD-based photon-counting imager to eliminate the SNR dip problem in photodiode-based high-dynamic-range imagers, followed by a conventional color imager with high resolution, larger pixels, and low noise for digital cameras. The last three papers describe a programmable convolutional imager with near-sensor processing for embedded computer vision applications, a large-format imager for computational imaging with adaptive dynamic range control, and a conventional color imager with high resolution and smaller pixels for smartphone and mobile applications.

- In Paper 7.1, Samsung presents a 1.2Mpixel stacked iToF depth sensor with 4-tap 3.5µm pixels. It uses multiple interleaving to reduce peak current with minimal demodulation contrast (DC) degradation and multi-user interference cancellation by pseudo-random modulation. The design achieves QE of 38% at 940nm, DC of 96% at 100MHz and 80% at 200MHz modulation with a depth noise less than 0.35% at the 2×2 binning.
- In Paper 7.2, Ulsan National Institute of Science and Technology presents a 48×40 SPAD-based flash LiDAR achieving 45m detectable range and 13.5mm depth resolution. The zoom histogramming TDC combines a coarse SAR TDC for long distances with fine phase-domain depth extraction for short distances.
- In Paper 7.3, Sony demonstrates a MEMS-based LiDAR system for autonomous driving, using a 189×600 back-illuminated, stacked SPAD sensor to measure up to 150m with 0.1% accuracy for a 10%-reflectivity target and up to 200m with 0.1% accuracy for a 95%-reflectivity target. The sensor employs passive quenching and recharge front-end circuitry, time-correlated single-photon counting, and digital signal processing.
- In Paper 7.4, EPFL uses 7-level coincidence detection and progressive gating techniques to design a flash LiDAR with a 256×128, 7µm-pitch SPAD array stacked on a circuit layer in a 22nm process. The sensor features a shared 14b TDC with 60ps resolution, reaching 7cm depth accuracy at 100m range under 10klux background light.
- In Paper 7.5, Sony introduces a 250fps and 124dB dynamic-range, 160×264 12.24µm-pitch SPAD photon-counting image sensor with motion artifact suppression, no SNR dip, and efficient power reduction under high-light conditions. The sub-frame extrapolating architecture sufficiently decreases the number of counter bits and reduces the power consumption by a factor of 100 under high-light conditions.
- In Paper 7.6, Sony presents a 50.1Mpixel, 4.16µm-pitch, back-illuminated stacked CIS with a pipelined column-parallel kT/C noise-cancelling sample-and-hold circuit and a 14b delta-sigma ADC achieving 1.18e-rms random noise at 250fps. The design splits the pixel signal line to lower the wiring load and increase the operation speed.
- In Paper 7.7, the Université catholique de Louvain presents a 160×128 imager using in-sensor current-domain ternary-weighted MAC operations and reaching a minimum energy of 2.5pJ/pixel·frame·filter and a peak efficiency of 3.6TOPS/W. The chip can produce 8b outputs or thresholded 1b outputs for feature extraction and region-of-interest detection. It features a pixel-level current gain tunable between 40 and 100dB, which yields an inter-scene input dynamic range >73.4dB.
- In Paper 7.8, Nikon demonstrates a 4k×4k, 2.7µm pixel, back-illuminated stacked CIS with controllable integration time for each pixel block or each pixel for coded exposure applications. This design achieves a dynamic range of 110dB at 1000fps and higher than 134dB at 60fps, when block-coded exposure is performed over a period of multiple frames.
- In Paper 7.9, Samsung continues going down the pixel-scaling path and presents a back-illuminated, stacked 32Mpixel CIS with a 0.64µm pixel, isolated by full-depth deep trench isolation. Various pixel design and fabrication process improvements are developed such that the 0.64µm pixels show performance equivalent to or even better than that of the 10% larger pixels despite a 25% reduction in photodiode volume.
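
The relation between TDC timing figures and distance in the dToF papers above can be checked with simple arithmetic. The sketch below converts round-trip time to distance using d = c·t/2 and applies it to the 60ps, 14b TDC quoted for Paper 7.4. It assumes the 14 bits span one linear time range, which the summary does not state, and the function name `tof_to_distance` is illustrative.

```python
C = 299_792_458.0  # speed of light in m/s

def tof_to_distance(round_trip_s: float) -> float:
    """Convert a round-trip time of flight to a one-way distance in metres."""
    return C * round_trip_s / 2.0

tdc_lsb_s = 60e-12   # TDC resolution quoted for Paper 7.4
tdc_bits = 14        # TDC word length quoted for Paper 7.4

# Distance represented by one TDC code step.
distance_per_lsb_m = tof_to_distance(tdc_lsb_s)

# Assumption: a single linear range of 2**14 codes at 60 ps each.
full_scale_m = tof_to_distance((2 ** tdc_bits) * tdc_lsb_s)

print(f"distance per LSB: {distance_per_lsb_m * 1e3:.2f} mm")
print(f"full-scale range: {full_scale_m:.1f} m")
```

Under that assumption one code step corresponds to roughly 9mm and the full code range to roughly 147m, which is consistent with the 100m operating range quoted in the summary.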

# **Session 7 Highlights: Imagers and Range Sensors**

# [7.1] A 4-tap 3.5µm 1.2Mpixel Indirect Time-of-Flight CMOS Image Sensor with Peak Current Mitigation and Multi-User Interference Cancellation

# [7.3] A 189×600 Back-Illuminated Stacked SPAD Direct Time-of-Flight Depth Sensor for Automotive LiDAR Systems

**Paper 7.1 Authors:** Min-Sun Keel, Daeyun Kim, Yeomyung Kim, Myunghan Bae, Myoungoh Ki, Bumsik Chung, Sooho Son, Hoyong Lee, Heeyoung Jo, Seung-Chul Shin, Sunjoo Hong, Jaeil An, Yonghun Kwon, Sungyoung Seo, Sunghyuck Cho, Youngchan Kim, Young-Gu Jin, Youngsun Oh, Yitae Kim, JungChak Ahn, Kyoungmin Koh, Yongin Park

Paper 7.1 Affiliation: Samsung Electronics, System LSI, South Korea

**Paper 7.3 Authors:** Oichi Kumagai<sup>1</sup>, Junichi Ohmachi<sup>1</sup>, Masao Matsumura<sup>1</sup>, Shinichiro Yagi<sup>1</sup>, Kenichi Tayu<sup>1</sup>, Keitaro Amagawa<sup>2</sup>, Tomohiro Matsukawa<sup>1</sup>, Osamu Ozawa<sup>1</sup>, Daisuke Hirono<sup>1</sup>, Yasuhiro Shinozuka<sup>1</sup>, Ryutaro Homma<sup>1</sup>, Kumiko Mahara<sup>2</sup>, Toshio Ohyama<sup>1</sup>, Yosuke Morita<sup>1</sup>, Shohei Shimada<sup>1</sup>, Takahisa Ueno<sup>3</sup>, Akira Matsumoto<sup>1</sup>, Yusuke Otake<sup>1</sup>, Toshifumi Wakano<sup>1</sup>, Takashi Izawa<sup>1</sup>

**Paper 7.3 Affiliation:** <sup>1</sup>Sony Semiconductor Solutions, Atsugi, Japan, <sup>2</sup>Sony LSI Design, Atsugi, Japan, <sup>3</sup>Sony Depthsensing Solutions, Brussels, Belgium

Subcommittee Chair: Chris Van Hoof, IMEC, Leuven, Belgium

## CONTEXT AND STATE OF THE ART

- Direct and indirect Time of Flight (dToF and iToF) are two major techniques for depth sensing using near-infrared light sources. Both [7.1] and [7.3] continue the trend of using backside-illuminated wafer stacking process technologies.
- The iToF design presented in [7.1] uses a 4-tap photodiode-based pixel, leading the industry towards higher resolution, smaller pitches, faster speed, lower power, and greater depth accuracy at the same time.
- The dToF design presented in [7.3] is based on the Single Photon Avalanche Diode (SPAD). It pushes the detection range towards 300m, also demonstrating a compact and fully integrated LiDAR system using MEMS mirrors.
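
As a rough illustration of how a 4-tap iToF pixel yields depth, the sketch below uses the common textbook four-phase scheme: the taps sample the returned light at 0°, 90°, 180° and 270° of the modulation period, and the phase shift is recovered with an arctangent. The tap ordering, the sinusoidal signal model, and the names `synth_taps` and `itof_depth` are assumptions for illustration, not details from Paper 7.1.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def unambiguous_range(f_mod_hz: float) -> float:
    """Largest depth measurable before the phase wraps around."""
    return C / (2.0 * f_mod_hz)

def synth_taps(depth_m: float, f_mod_hz: float, amp: float = 1.0, offset: float = 2.0):
    """Hypothetical ideal tap values for a target at depth_m (sinusoidal model)."""
    phase = 4.0 * math.pi * f_mod_hz * depth_m / C
    return [offset + amp * math.cos(phase - k * math.pi / 2) for k in range(4)]

def itof_depth(a0: float, a90: float, a180: float, a270: float, f_mod_hz: float) -> float:
    """Recover depth from four tap values; differences cancel the ambient offset."""
    phase = math.atan2(a90 - a270, a0 - a180) % (2.0 * math.pi)
    return phase / (2.0 * math.pi) * unambiguous_range(f_mod_hz)

taps = synth_taps(0.5, 100e6)
print(f"recovered depth: {itof_depth(*taps, 100e6):.3f} m")
print(f"unambiguous range at 100 MHz: {unambiguous_range(100e6):.3f} m")
print(f"unambiguous range at 200 MHz: {unambiguous_range(200e6):.3f} m")
```

In this model, raising the modulation frequency shortens the unambiguous range but improves depth precision for a given phase noise, which is one reason demodulation contrast is reported at both 100MHz and 200MHz.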

## TECHNICAL HIGHLIGHTS

- Samsung presents a 4-tap, 3.5µm pixel iToF depth sensor using multiple interleaving to reduce peak current with minimal demodulation-contrast degradation, and multi-user interference cancellation by pseudo-random modulation.
  - The chip shows 38% QE at 940nm, DC of 96% at 100MHz and 80% at 200MHz modulation; depth noise is <0.35% at the 2×2 binning. Peak current is less than 0.9A, and depth distortion by two interferers can be rejected.
- Sony presents a SPAD dToF depth sensor employing passive quenching and recharge front-end circuitry, time-correlated single-photon counting, and digital signal processing.
  - Under 117klux sunlight conditions, this LiDAR can measure over ranges up to 150m with 0.1% accuracy for a 10%-reflectivity target and 200m with 0.1% accuracy for a 95%-reflectivity target.

## APPLICATIONS AND ECONOMIC IMPACT

- 3D depth sensors are already incorporated into smartphones and tablets. They will drive the next wave of new consumer applications such as augmented reality, gesture control, gaming, and face recognition beyond megapixel resolutions.
- Long-range LiDAR is one of the key enabling components that will lead to autonomous driving in the foreseeable future.

# **Session 8 Overview: Ultra-High-Speed Wireline**

#### **Wireline Subcommittee**

Session Chair: Yohan Frans, Xilinx, San Jose

Session Co-Chair: Patrick Yue, Hong Kong University of Science and Technology, Hong Kong

Session Moderator: Thomas Toifl, Cisco Systems, Wallisellen, Switzerland

Subcommittee Chair: Frank O'Mahony, Intel, Hillsboro OR

As data center and telecommunication infrastructure bandwidth requirements continue to increase, networking products with 112Gb/s electrical and optical transceivers are beginning to ramp up to support 400GE and beyond. At the same time, the industry is starting to explore paths of scaling high-speed links with data rates greater than 200Gb/s.

This session starts with two papers describing the design of ≥200Gb/s PAM-4 transmitters, one using a DSP/DAC approach in 10nm CMOS and another using an analog approach in 28nm CMOS. Another paper pushes the energy efficiency and chip area of a 112Gb/s DSP/DAC-based transmitter. Three papers in this session describe complete 112Gb/s PAM-4 electrical transceivers with emphasis on reconfigurability, power efficiency improvements, link robustness over voltage/temperature variations, and new techniques to support higher channel loss. One paper addresses DAC and ADC design for 400Gb/s coherent optical links. The session concludes with a paper describing a method to implement a large number of DFE taps in an ADC/DSP-based 112Gb/s PAM-4 receiver.

- In Paper 8.1, Intel demonstrates a 224Gb/s 4-way interleaved 7b DAC-based PAM-4 transmitter with 8-tap reconfigurable FFE in 10nm CMOS. Using a low-noise on-chip LC-PLL, inductive clock distribution network, two-stage 4:1 MUX with active peaking, and group-delay-optimized output matching network, it achieves 65fs<sub>rms</sub> random jitter and RLM/SNDR of 0.99/33.3dB at 1.9pJ/b energy efficiency.
- In Paper 8.2, University of California, Berkeley presents a 200Gb/s PAM-4 transmitter in 28nm CMOS. The design incorporates pull-up current sources to improve output bandwidth and swing, achieving >52.9mV eye height, 0.36UI eye width, and ~99% RLM under ~6dB channel loss at 50GHz.
- In Paper 8.3, IBM describes a 112Gb/s PAM-4 SST transmitter with 8b DAC driver in 7nm CMOS with up to 16-tap digital FFE in NRZ mode. Using metal gate resistors, a DSP equalizer with gated FFE LUT logic, and reference DAC weight scaling for clock power reduction, it occupies 0.032mm<sup>2</sup> and consumes 1.4pJ/b.
- In Paper 8.4, Huawei presents a reconfigurable PAM-4/Duo-PAM-4/NRZ ADC/DSP-based long-reach transceiver. It achieves BER ≤1E-05 at 112Gb/s in PAM-4 or Duo-PAM-4 across a 45dB loss channel, and <1E-15 at 56Gb/s PAM-2 over a 52dB loss channel without FEC, while consuming <6pJ/b.
- In Paper 8.5, eTopus demonstrates a fully integrated, adaptive 1.25-to-56/112Gb/s PAM-4 ADC/DSP-based transceiver in 16nm and 7nm CMOS. Using decision-directed MMSE CDR, it achieves CDR lock at 2E-2 BER at 56.25Gb/s PAM-4 over 51.9dB channel loss and was tested over temperature cycles between 0°C and 100°C at 10°C/minute.
- In Paper 8.6, Inphi presents reconfigurable 40-to-97GS/s 8b DACs and ADCs which are fully integrated in a 7nm FinFET DSP chip targeting 400Gb/s coherent optical links. It achieves 40GHz AFE bandwidth and a Walden FOM <35fJ/conv-step.
- In Paper 8.7, Inphi presents a 112Gb/s ADC/DSP-based PAM-4 transceiver in 7nm FinFET. Incorporating a 64-way time-interleaved SAR ADC, it consumes 6.51pJ/b including analog and digital power while operating over a channel with >40dB insertion loss.
- In Paper 8.8, Huawei presents a 112Gb/s PAM-4 ADC/DSP-based receiver in 7nm FinFET with 9 DFE taps in the DSP. The DSP uses a method for pipelining DFE operations which allows the implementation of a large number of DFE taps, improving link performance beyond the traditional many-tap FFE + 2-tap DFE while simultaneously reducing power and area.
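
The data rates above map to symbol rates and Nyquist frequencies through the number of bits carried per symbol. The sketch below works through that arithmetic and shows one possible bit-to-level mapping for PAM-4. The Gray-coded mapping and the helper names are illustrative assumptions and are not taken from any of the papers.

```python
import math

def symbol_rate_gbaud(bit_rate_gbps: float, levels: int) -> float:
    """Symbol rate in GBaud for a PAM signal with the given number of levels."""
    return bit_rate_gbps / math.log2(levels)

def nyquist_ghz(bit_rate_gbps: float, levels: int) -> float:
    """Nyquist frequency in GHz, i.e. half the symbol rate."""
    return symbol_rate_gbaud(bit_rate_gbps, levels) / 2.0

# Assumed Gray-coded mapping from bit pairs to the four PAM-4 levels.
GRAY_PAM4 = {(0, 0): -3, (0, 1): -1, (1, 1): 1, (1, 0): 3}

def pam4_encode(bits):
    """Map an even-length bit sequence to PAM-4 symbols, two bits per symbol."""
    if len(bits) % 2:
        raise ValueError("PAM-4 needs an even number of bits")
    return [GRAY_PAM4[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]

for rate, levels in [(224, 4), (200, 4), (112, 4), (56, 2)]:
    print(f"{rate} Gb/s PAM-{levels}: {symbol_rate_gbaud(rate, levels):.0f} GBaud, "
          f"Nyquist {nyquist_ghz(rate, levels):.0f} GHz")

print(pam4_encode([0, 0, 0, 1, 1, 1, 1, 0]))
```

This arithmetic gives a 50GHz Nyquist frequency for a 200Gb/s PAM-4 link, which matches the frequency at which Paper 8.2 quotes its channel loss.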

### Session 8 Highlights: Ultra-High-Speed Wireline

#### [8.1] A 224Gb/s DAC-Based PAM-4 Transmitter with 8-Tap FFE in 10nm CMOS

#### [8.4] A 116Gb/s DSP-Based Wireline Transceiver in 7nm CMOS achieving 6pJ/b at 45dB Loss in PAM-4/Duo-PAM-4 and 52dB in PAM-2

**Paper 8.1 Authors:** Jihwan Kim<sup>1</sup>, Sandipan Kundu<sup>1</sup>, Ajay Balankutty<sup>1</sup>, Matthew Beach<sup>2</sup>, Bong Chan Kim<sup>1</sup>, Stephen Kim<sup>1</sup>, Yutao Liu<sup>1</sup>, Savyassachi Keshava Murthy<sup>1</sup>, Priya Wali<sup>1</sup>, Kai Yu<sup>1</sup>, Hyung Seok Kim<sup>1</sup>, Chuan-chang Liu<sup>1</sup>, Dongseok Shin<sup>1</sup>, Ariel Cohen<sup>3</sup>, Yongping Fan<sup>1</sup>, Frank O'Mahony<sup>1</sup>

Paper 8.1 Affiliation: <sup>1</sup>Intel, Hillsboro, OR, <sup>2</sup>Foundation Devices, Boston, MA, <sup>3</sup>Intel, Jerusalem, Israel

**Paper 8.4 Authors:** Marc-Andre LaCroix, Euhan Chong, Weilun Shen, Ehud Nir, Faisal Ahmed Musa, Haitao Mei, Mohammad-Mahdi Mohsenpour, Semyon Lebedev, Babak Zamanlooy, Carlos Carvalho, Qian Xin, Dmitry Petrov, Henry Wong, Huong Ho, Yang Xu, Sina Shahi, Peter Krotnev, Chris Feist, Howard Huang, Davide Tonietto

Paper 8.4 Affiliation: Huawei Technologies, Ottawa, ON, Canada

Subcommittee Chair: Frank O'Mahony, Intel, Hillsboro, OR

#### CONTEXT AND STATE OF THE ART

- After the emergence of 112Gb/s wireline links in the past three years, the first components enabling 224Gb/s data rates are now demonstrated.
- 112Gb/s DSP-based wireline links were demonstrated to work at high channel loss (>45dB) and low power consumption (6pJ/b).

#### TECHNICAL HIGHLIGHTS

- Intel presents a first DAC-based PAM-4 transmitter operating at 224Gb/s.
  - The transmitter, based on a 4-way interleaved 7b DAC, is implemented in 10nm CMOS technology. The TX achieves 65fs<sub>rms</sub> RJ from a 56GHz clock pattern with 1<sup>st</sup>-order 4MHz CDR, and RLM/SNDR of 0.99/33.3dB at 224Gb/s PAM-4 operation with an energy efficiency of 1.9pJ/b.
- Huawei Technologies demonstrates a DSP-based 116Gb/s transceiver for long-reach channels with <6pJ/b energy efficiency.
  - The transceiver is capable of operating with BER ≤1E-05 at 112Gb/s in PAM-4 or Duobinary PAM-4 across a 45dB loss channel, and 58Gb/s NRZ at <1E-15 over a 52dB loss channel without FEC. The combined power consumption of the analog part and the DSP is below 6pJ/b. The transceiver achieves high TX swing (1.23V) and 22dB peaking in the RX CTLE.

#### APPLICATIONS AND ECONOMIC IMPACT

- With the demand for bandwidth in data centers, and cloud and high-performance computing still exponentially growing in the 5G era, doubling the data rate of wireline links from 112Gb/s to 224Gb/s is key to enable next-generation systems.
- At the same time, the energy efficiency (pJ/b) has to be improved to stay within the thermal budget of the chip and the system. Advanced circuit techniques and DSP-based transceivers in FinFET technologies are key components to achieve low BER over high-loss channels with low power consumption.
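
The link between energy efficiency and thermal budget follows from a one-line conversion: power equals data rate times energy per bit. The sketch below applies it to figures quoted in this session. The function name `link_power_mw` is illustrative, and the results are per-lane estimates implied by the quoted numbers rather than measured power values from the papers.

```python
def link_power_mw(bit_rate_gbps: float, energy_pj_per_bit: float) -> float:
    """Power in mW: (1e9 b/s) * (1e-12 J/b) = 1e-3 W, so Gb/s * pJ/b gives mW."""
    return bit_rate_gbps * energy_pj_per_bit

# Figures quoted in the session summaries.
tx_224g = link_power_mw(224, 1.9)   # Paper 8.1 transmitter
trx_112g = link_power_mw(112, 6.0)  # upper bound implied by <6pJ/b in Paper 8.4

# Hypothetical case: doubling the rate with no improvement in pJ/b doubles power.
trx_224g_same_efficiency = link_power_mw(224, 6.0)

print(f"224 Gb/s at 1.9 pJ/b: {tx_224g:.0f} mW")
print(f"112 Gb/s at 6 pJ/b:   {trx_112g:.0f} mW")
print(f"224 Gb/s at 6 pJ/b:   {trx_224g_same_efficiency:.0f} mW")
```

The last line illustrates why energy per bit has to fall as rates double: holding pJ/b constant would double the power of every lane.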

### **Session 9 Overview: ML Processors From Cloud to Edge**

Session Chair: SukHwan Lim, Samsung, Suwon, South Korea

Session Co-Chair: Luca Benini, ETH Zurich, Zurich, Switzerland

Session Moderator: Vivienne Sze, MIT, Cambridge, MA

Significant progress has been made in machine learning processor design in two different but important topic areas. The first addresses flexible accelerators for inference and training in the most advanced CMOS technology nodes (e.g. 5nm and 7nm) for mobile and the cloud. The second topic area covers application-specific acceleration engines for ultra-low-power applications, including wearable devices. This session comprises nine papers, covering a diverse set of neural networks targeted at a wide range of applications, including gesture recognition, smart cameras, speech-to-text and keyword spotting.

- In Paper 9.1, IBM Research presents a 7nm 4-core AI chip that offers separate floating-point and fixed-point pipelines to enable performance improvements without model accuracy degradation, while a high-bandwidth on-chip ring maintains high compute utilization. Workload-aware throttling is used to maximize performance within a specified power envelope. Their 19.6mm<sup>2</sup> chip demonstrates up to 3.5TFLOPS/W and up to 25.6TFLOPS hybrid fp8 iso-accuracy training, up to 16TOPS/W and 102TOPS int4 inference, as well as support for fp16, fp32 and int2 computation.
- In Paper 9.2, Tsinghua University describes a 28nm 12.1TOPS/W CNN processor employing effective-weight convolution and error-compensation prediction, eliminating >90% of multiplications compared to prior CNN implementations with <1% additional overhead. A residual pipeline mode is used to avoid the need for off-chip memory accesses within residual blocks, allowing hardware utilization to be maintained near 100%. The processor occupies 1.9mm<sup>2</sup> and consumes 131.6mW at 470MHz, achieving improvements in TOPS/W energy-efficiency, GOPS/mm<sup>2</sup> area-efficiency, and energy-per-frame with respect to state-of-the-art 8b CNN processors.
- In Paper 9.3, Seoul National University presents an 8b floating point training processor in 40nm technology for state-of-the-art non-sparse neural networks, achieving 69.0% ResNet-18 Top-1 accuracy on ImageNet and requiring 43% less memory access by use of 8b tensors. By combining a 24-way fused multiply-add tree with a flexible 2D routing scheme, their 7.29mm<sup>2</sup> chip achieves 4.81TFLOPS/W energy efficiency and 2.48× higher training efficiency than prior work.
- In Paper 9.4, KU Leuven introduces a 3.8mm<sup>2</sup> probabilistic inference unit (PIU) in 28nm technology, targeted at exact inference of irregular probabilistic sum-product networks (SPNs). To accelerate these workloads, PIU decouples compute/load/store streams with a co-optimized memory hierarchy, employs precision-scalable posit arithmetic for accurate manipulation of low probability values, and aligns the parallel 8, 16, and 32b floating-point computations with compiler optimizations. The resulting energy efficiency is 1173× and 271× higher than RTX2080 GPU and CPU, respectively, reaching a peak of 248GOPS/W, 33.7GOPS and 8.9GOPS/mm<sup>2</sup>.
- In Paper 9.5, Samsung Electronics presents a 5.46mm<sup>2</sup> reconfigurable neural processing unit (NPU) in 5nm technology for mobile SoCs. Their NPU has 3 cores with 6k MACs in total, boosting performance and energy-efficiency by feature-map zero skipping, a reconfigurable adder-tree-based datapath, feature map compression and fast resource scheduling. With all three NPU cores deployed, their NPU achieves high performance (623 inferences/s) and high energy efficiency (13.6TOPS/W) on an 8b Inception-V3 model.
- In Paper 9.6, Sony Semiconductor Solutions describes a 62mm<sup>2</sup> stacked-chip solution, combining a back-illuminated (BI) pixel CMOS image sensor at 4056×3040 resolution and 1.55µm pixel-pitch, together with 4.97TOPS/W of 22nm digital signal processing of 8b, 16b, or 32b integers in two different DSP cores, multiple tensor direct memory access (TDMA) engines and a scheduler optimized for convolutional neural network (CNN) processing. Capable of 120fps with parallel image readout and CNN inference, their system can offer AI processing capability at reduced size, power and cost, while addressing privacy concerns.
- In Paper 9.7, Nanyang Technological University presents a real-time ultra-low-power hand-gesture recognition system for wearable and IoT devices, combining an Edge-CNN core, a decision tree core and an error-tolerant sequence analyzer. Their 1.5mm<sup>2</sup> chip in 65nm technology achieves 184µW at 0.6V, and can recognize 24 dynamic gestures with an accuracy of 92.6% for hand-motion speed within 30-40cm/s.

- In Paper 9.8, Harvard University describes a 21.8mm<sup>2</sup> SoC in 16nm technology for noise-robust speech recognition that includes a Bayesian source separation engine optimized for unsupervised speech-denoising via Gibbs sampling using 32b fixed-point, as well as a reconfigurable processor optimized for whole-model acceleration of large vocabulary bidirectional attention-based DNNs using 8b floating-point. The proposed speech-enhancing automated speech recognition (ASR) pipeline denoises noise-corrupted speech with 7.3dB signal-to-distortion-ratio (SDR), while achieving 18ms end-to-end latency and consuming 2.24mJ of energy per frame.
- In Paper 9.9, Columbia University presents a 0.72mm<sup>2</sup> always-on keyword spotting (KWS) chip in 65nm which pairs a normalized acoustic feature extractor (NAFE) featuring divisive energy normalization (DN) together with a spiking neural network classifier. Their chip exhibits 570nW power dissipation and robustness to process variation, while maintaining high accuracy (96.5% for 1 keyword, 90.2% for 4 keywords) across a variety of strong background-noise environments.

### Session 9 Highlights: ML Processors from Cloud to Edge

#### [9.1] A 7nm 4-Core AI Chip with 25.6TFLOPS Hybrid FP8 Training, 102.4TOPS INT4 Inference and Workload-Aware Throttling

#### [9.5] A 6k-MAC Feature-Map-Sparsity-Aware Neural Processing Unit in 5nm Flagship Mobile SoC

**Paper 9.1 Authors:** Ankur Agrawal<sup>1</sup>, Sae Kyu Lee<sup>1</sup>, Joel Silberman<sup>1</sup>, Matthew Ziegler<sup>1</sup>, Mingu Kang<sup>1</sup>, Swagath Venkataramani<sup>1</sup>, Nianzheng Cao<sup>1</sup>, Bruce Fleischer<sup>1</sup>, Michael Guillorn<sup>1</sup>, Matthew Cohen<sup>1</sup>, Silvia Mueller<sup>2</sup>, Jinwook Oh<sup>1</sup>, Martin Lutz<sup>1</sup>, Jinwook Jung<sup>1</sup>, Siyu Koswatta<sup>1</sup>, Ching Zhou<sup>1</sup>, Vidhi Zalani<sup>1</sup>, James Bonanno<sup>3</sup>, Robert Casatuta<sup>4</sup>, Chia-Yu Chen<sup>1</sup>, Jungwook Choi<sup>5</sup>, Howard Haynie<sup>6</sup>, Alyssa Herbert<sup>1</sup>, Radhika Jain<sup>1</sup>, Monodeep Kar<sup>1</sup>, Kyu-Hyoun Kim<sup>1</sup>, Yulong Li<sup>1</sup>, Zhibin Ren<sup>1</sup>, Scot Rider<sup>7</sup>, Marcel Schaal<sup>1</sup>, Kerstin Schelm<sup>3</sup>, Michael Scheuermann<sup>1</sup>, Xiao Sun<sup>1</sup>, Hung Tran<sup>1</sup>, Naigang Wang<sup>1</sup>, Wei Wang<sup>1</sup>, Xin Zhang<sup>1</sup>, Vinay Shah<sup>8</sup>, Brian Curran<sup>7</sup>, Vijayalakshmi Srinivasan<sup>1</sup>, Pong-Fei Lu<sup>1</sup>, Sunil Shukla<sup>1</sup>, Leland Chang<sup>1</sup>, Kailash Gopalakrishnan<sup>1</sup>

**Paper 9.1 Affiliation:** <sup>1</sup>IBM Research, Yorktown Heights, NY, <sup>2</sup>IBM, Boeblingen, Germany, <sup>3</sup>IBM, Austin, TX, <sup>4</sup>IBM, Hopewell Junction, <sup>5</sup>Hanyang University, Seoul, Korea, <sup>6</sup>IBM, Poughkeepsie, NY, <sup>7</sup>IBM, Poughkeepsie, <sup>8</sup>IBM, Hursley, United Kingdom

**Paper 9.5 Authors:** Jun-Seok Park<sup>1</sup>, Jun-Woo Jang<sup>2</sup>, Heonsoo Lee<sup>1</sup>, Dongwoo Lee<sup>1</sup>, Sehwan Lee<sup>2</sup>, Hanwoong Jung<sup>2</sup>, Seungwon Lee<sup>2</sup>, Suknam Kwon<sup>1</sup>, Kyungah Jeong<sup>1</sup>, Joon-Ho Song<sup>2</sup>, SukHwan Lim<sup>1</sup>, Inyup Kang<sup>1</sup>

Paper 9.5 Affiliation: <sup>1</sup>Samsung Electronics, Hwaseong-si, Korea, <sup>2</sup>Samsung Advanced Institute of Technology, Suwon, Korea

Subcommittee Chair: Marian Verhelst, KU Leuven - MICAS, Leuven, Belgium

#### CONTEXT AND STATE OF THE ART

- Machine learning is becoming increasingly important, both in the cloud, as well as within mobile devices at the edge. It is being used in many different applications, such as image processing, object/face detection and recognition, speech/audio processing and user interactions such as gesture recognition.
- Gains in area and power efficiency are critical enablers for allowing machine learning to be used in various application domains due to thermal envelope and battery lifetime constraints. Efficiency is pursued through a combination of advanced nodes (e.g. 5nm) and architectural/circuit techniques, such as exploiting input, weight and output sparsity, low precision (e.g. int8, fp8, int4, binary) and heterogeneous cores.
- While performance and power requirements for cloud, mobile and always-on wearable devices tend to differ by a few orders of magnitude, there is a consistent across-the-board push toward extreme energy efficiency (over 10TOPS/W).
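
Two of the ideas above lend themselves to a small worked example: exploiting input sparsity by skipping zero-valued activations, and reading a TOPS/W figure as energy per operation. The sketch below is a software illustration only. The function names are hypothetical, and it does not represent the hardware datapath of any paper in this session.

```python
def dot_with_zero_skipping(activations, weights):
    """Dot product that skips zero activations and counts the MACs actually done."""
    acc = 0
    macs_performed = 0
    for a, w in zip(activations, weights):
        if a == 0:
            continue  # a zero activation contributes nothing, so skip the MAC
        acc += a * w
        macs_performed += 1
    return acc, macs_performed

def pj_per_op(tops_per_watt: float) -> float:
    """1 TOPS/W is 1e12 operations per joule, i.e. 1 pJ per operation."""
    return 1.0 / tops_per_watt

# Hypothetical feature-map slice, sparse as is typical after a ReLU.
acts = [0, 3, 0, 0, 2, 0, 1, 0]
wts = [5, -1, 4, 2, 3, 7, -2, 6]

result, macs = dot_with_zero_skipping(acts, wts)
dense = sum(a * w for a, w in zip(acts, wts))

print(f"result {result} (dense reference {dense}), MACs {macs} of {len(acts)}")
print(f"10 TOPS/W corresponds to {pj_per_op(10.0):.2f} pJ per operation")
```

The skipped result matches the dense reference while performing only the MACs for non-zero activations, which is the source of the performance and energy gains attributed to sparsity.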

#### TECHNICAL HIGHLIGHTS

- IBM Research presents a 7nm 4-core AI chip that offers separate hybrid fp8 floating-point pipelines for iso-accuracy training, and int4 fixed-point pipelines for fast inference without model accuracy degradation, both at high compute utilization.
  - Workload-aware throttling is used to maximize performance available within a specified power envelope.
  - Their chip demonstrates up to 3.5TFLOPS/W and up to 25.6TFLOPS hybrid fp8 iso-accuracy training, up to 16TOPS/W and 102TOPS int4 inference, as well as support for fp16, fp32 and int2 computation.
- Samsung presents a reconfigurable 3-core neural processing unit (NPU) in 5nm mobile SoC offering 6,000 MACs in total.
  - Feature-map zero skipping, reconfigurable adder-tree-based datapath, feature map compression, and fast resource scheduling boost both performance and energy-efficiency.
  - With all three NPU cores deployed, their NPU achieves high performance (623 inferences/s) and high energy efficiency (13.6TOPS/W) on an 8b Inception-V3 model.

#### APPLICATIONS AND ECONOMIC IMPACT

- Deep-learning accelerators for training and inference presented by IBM Research enable performance improvements without model accuracy degradation, while a high-bandwidth on-chip ring maintains high compute utilization. The flexible heterogeneous architecture achieves high performance on a wide range of numerical precisions, while meeting stringent thermal design power constraints for dense datacenter deployment.
- The 3-core deep-learning accelerator demonstrated by Samsung enhances versatile AI applications on 5G smartphones, with higher energy efficiency helping to run a larger mix of heavier workloads in real-time.

# **Session 10 Overview: Continuous-Time ADCs and DACs**

#### Data Converter Subcommittee

Session Chair: Seyfi Bazarjani, Qualcomm Technologies, San Diego, CA

Session Co-Chair: Jongwoo Lee, Samsung Electronics, Korea

Session Moderator: Marco Corsi, Texas Instruments, Parker, TX

Continuous-time ΔΣ ADCs offer compact silicon area and low power consumption for various applications. The first two papers in this session present different techniques in continuous-time ΔΣ ADCs for audio applications and achieve impressive figures of merit. The third paper describes a new hybrid CT/DT loop architecture for a ΔΣ modulator without requiring any calibration or tuning. The fourth paper proposes a CT loop filter with DT noise shaping achieving 4<sup>th</sup>-order noise shaping using a single OTA. The fifth paper introduces a pipelined ADC with a 1<sup>st</sup> stage SAR and 2<sup>nd</sup> stage 2×-time-interleaved CT incremental ΔΣ ADC. The last two papers in this session describe techniques utilized to achieve high-linearity multi-GHz DACs. The sixth paper describes a 16GS/s DAC for software radio base stations. The last paper presents a 64GS/s DAC for BIST of RF sampling ADCs.

- In Paper 10.1, Samsung Electronics presents a 116µW audio-band CT ΔΣ ADC using a tri-level current-steering DAC with a gate-leakage-compensated noise filter that achieves 104.4dB dynamic range in 24kHz bandwidth resulting in a 187.5dB Schreier FoM.
- In Paper 10.2, University of California San Diego reports a CT ΔΣM for audio applications with chopped AC-coupled OTA-stacking and FIR DACs, which achieves 104.8dB dynamic range in 24kHz bandwidth and consumes 139µW resulting in a 187.2dB Schreier FoM.
- In Paper 10.3, University of Michigan describes a tuning-free hybrid-loop ΔΣ modulator with an interleaved bandpass noise-shaping SAR quantizer that achieves 68dB SNDR in a 100MHz bandwidth.
- In Paper 10.4, University of Texas at Austin presents a 4th-order CT ΔΣ modulator with single-OTA and 2<sup>nd</sup>-order NS-SAR that achieves 81dB-SNDR in 12.5MHz bandwidth and consumes 3.7mW.
- In Paper 10.5, Samsung Electronics reports a 12b 600MS/s pipelined SAR and 2×-interleaved Incremental ΔΣ ADC in 7nm FinFET.
- In Paper 10.6, Intel describes a 12b 16GS/s RF-sampling capacitive DAC for multi-band soft-radio base-stations in 16nm FinFET.
- In Paper 10.7, Intel presents a 64GS/s 4×-interpolated 1b semi-digital FIR DAC for wideband calibration and BIST of RF-sampling A/D converters.
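
The Schreier figures of merit quoted for Papers 10.1 and 10.2 can be reproduced from the dynamic range, bandwidth and power given in the same bullets, using the standard definition FoM = DR + 10·log10(BW / P). The sketch below is a consistency check on the quoted numbers, and the function name `schreier_fom_db` is illustrative.

```python
import math

def schreier_fom_db(dr_db: float, bandwidth_hz: float, power_w: float) -> float:
    """Schreier figure of merit in dB, computed from dynamic range."""
    return dr_db + 10.0 * math.log10(bandwidth_hz / power_w)

# Values quoted in the session summary.
fom_10_1 = schreier_fom_db(104.4, 24e3, 116e-6)  # Paper 10.1
fom_10_2 = schreier_fom_db(104.8, 24e3, 139e-6)  # Paper 10.2

print(f"Paper 10.1: {fom_10_1:.2f} dB (quoted 187.5 dB)")
print(f"Paper 10.2: {fom_10_2:.2f} dB (quoted 187.2 dB)")
```

Both results land within about 0.1dB of the quoted values. The small residual for Paper 10.1 is plausibly due to rounding of the inputs in the summary.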

### Session 10 Highlights: Continuous-Time ADCs and DACs

#### [10.1] A 116µW 104.4dB-DR 100.6dB-SNDR CT ΔΣ Audio ADC using Tri-Level Current-Steering DAC with Gate-Leakage-Compensated Off-Transistor-Based Bias Noise Filter

#### [10.6] A 12b 16GS/s RF-Sampling Capacitive DAC for Multi-Band Soft-Radio Base-Station Applications with On-Chip Transmission-Line Matching Network in 16nm FinFET

Paper 10.1 Authors: Chilun Lo, Jongmi Lee, Yong Lim, Younghyun Yoon, Hyunseok Hwang, JaeHoon Lee, MooYeol Choi, Myung-Jin Lee, Seunghyun Oh, Jongwoo Lee

Paper 10.1 Affiliation: Samsung Electronics, Hwasung-si, Gyeonggi-do, Korea

**Paper 10.6 Authors:** Daniel Gruber<sup>1</sup>, Martin Clara<sup>2</sup>, Ramon Sanchez<sup>3</sup>, Yu-shan Wang<sup>4</sup>, Christoph Duller<sup>1</sup>, Gerald Rauter<sup>1</sup>, Patrick Torta<sup>1</sup>, Kamran Azadet<sup>2</sup>

Paper 10.6 Affiliation: <sup>1</sup>Intel, Villach, Austria, <sup>2</sup>Intel, Santa Clara, CA, <sup>3</sup>Intel, Madrid, Spain, <sup>4</sup>Intel, Hillsboro, OR

Subcommittee Chair: Michael Flynn, University of Michigan, Ann Arbor, MI

#### CONTEXT AND STATE OF THE ART

- A high-dynamic-range audio ADC with minimized power consumption and reduced BoM cost is essential for True Wireless Stereo applications. However, a dedicated reference generator for a resistive DAC consumes extra power, area, and might still need external components for filtering.
- Wideband RF-sampling DACs are implemented in current-steering architectures with the transistor stack limited by supply voltage and with extensive calibration infrastructure. The current-source array makes true wideband matching at the RF output difficult. Digital transmitters have used CDACs for synthesis of high-power RF signals of moderate bandwidth of up to 160MHz.

#### TECHNICAL HIGHLIGHTS

- Samsung introduces a low-power, high-DR audio ADC for flicker-noise-sensitive applications.
  - The ADC achieves 100.6dB SNDR and 104.4dB DR operating at 6.144MS/s while consuming 116µW.
- Intel presents the first RF-sampling capacitive DAC with state-of-the-art performance. It is scaling-friendly and is
  operated from a single supply while having the smallest reported area of published direct-sampling RF DACs.
  - The capacitive DAC serves as a direct RF-sampling DAC with a moderate output power level for direct signal synthesis over a bandwidth from 0.5GHz up to at least 8GHz. It is also the first demonstration of a wideband DAC with digital pre-distortion (DPD).
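
As a rough illustration of what the reported ADC figures imply, the sketch below converts the 100.6dB SNDR into an effective number of bits using the standard full-scale-sine relation, and evaluates a Schreier-style figure of merit from the 104.4dB DR and 116µW power. The 20kHz signal bandwidth is an assumption made only for illustration, since the text does not state the bandwidth, and the function names are hypothetical.

```python
import math


def enob_from_sndr(sndr_db: float) -> float:
    """ENOB for a full-scale sine input: (SNDR - 1.76 dB) / 6.02 dB per bit."""
    return (sndr_db - 1.76) / 6.02


def schreier_fom_db(dr_db: float, bandwidth_hz: float, power_w: float) -> float:
    """Schreier figure of merit: DR + 10*log10(BW / P), in dB."""
    return dr_db + 10.0 * math.log10(bandwidth_hz / power_w)


SNDR_DB = 100.6        # reported in the highlights
DR_DB = 104.4          # reported in the highlights
POWER_W = 116e-6       # reported in the highlights
ASSUMED_BW_HZ = 20e3   # ASSUMPTION: audio bandwidth, not stated in the text

enob = enob_from_sndr(SNDR_DB)
fom = schreier_fom_db(DR_DB, ASSUMED_BW_HZ, POWER_W)

print(f"ENOB ~ {enob:.2f} bits")
print(f"Schreier FoM with assumed 20kHz bandwidth ~ {fom:.1f} dB")
```

A different bandwidth shifts the figure of merit by 10·log10 of the bandwidth ratio, so the second number should be read only as an order-of-magnitude check.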

- Advanced circuit design techniques that need no extra external components enable a compact PCB design and high energy efficiency, allowing longer battery life or the operation of more channels for active-noise-cancelling (ANC) features. This makes the ADC suitable for True Wireless Stereo applications.
- The RF-sampling capacitive DAC with wideband DAC DPD can be applied to 5G base-station applications in a scaling-friendly process.

# Session 11 Overview: Advanced Wireline Links and Techniques

#### Wireline Subcommittee

Session Chair: Mike Shuo-Wei Chen, University of Southern California, Los Angeles, CA

Session Co-Chair: Wei-Zen Chen, National Chiao Tung University, Hsinchu, Taiwan

Session Moderator: Amir Amirkhany, Samsung Electronics, San Jose, USA

Subcommittee Chair: Frank O'Mahony, Intel, Hillsboro, USA

High-performance SerDes with both high area efficiency (mm<sup>2</sup>/lane) and energy efficiency (pJ/b) are driven by the ever-increasing demands of bandwidth and capacity in data centers. They also enable chiplets, multi-die, and silicon-photonics integration for a low-cost, high-yield, and high-throughput solution. In addition, low-power SerDes is essential to overall system power savings by reducing the power overhead and cost for cooling. This session introduces advanced wireline techniques that support both high-speed and energy-efficient data transmission over electrical, fiber, and dielectric waveguide channels. The first three papers of the session describe short-reach power- and density-optimized transceivers in state-of-the-art 7nm FinFET technology. The next two describe low-power clock generators for high-speed transceivers. The remaining four papers of the session focus on design solutions to enable future high-speed link scaling, including optical and dielectric waveguides, ultra-low power CDRs, and the potential for >50Gb/s simultaneous, bidirectional signaling over high-loss channels.

- In Paper 11.1, MediaTek describes a short-reach (XSR) PAM-4 transceiver operating at 112Gb/s with 1.7pJ/b energy efficiency. It features a delay-line based continuous-time linear equalizer and automatic calibration algorithms for performance optimization and low power operation.
- In Paper 11.2, Rambus shows an eight-lane 106.25Gb/s/lane XSR SerDes macro aimed at enabling co-packaged optics. The corresponding beachfront throughput is 722Gb/s/mm and energy efficiency is 1.55pJ/b.
- In Paper 11.3, Cadence proposes a short-reach single-ended signaling macro that uses spatial 6b/7b encoding to minimize ground reference noise. The link operates at 40Gb/s/pin with a 1.7pJ/b energy efficiency and provides 480Gb/s/mm bandwidth density.
- In Paper 11.4, Columbia University describes a 7b phase interpolator (PI) using a quadrature delay line coupled with a multiphase injection-locked ring oscillator. The measured linearity of the PI is 1.76LSB INLpp and 1.13LSB DNLpp at 7GHz, which can be utilized in low-jitter high-phase-accuracy clock and data recovery.
- In Paper 11.5, Intel presents a digital PLL with a current-reuse LC-VCO coupled into a frequency doubler. A frequency tracking loop is proposed to optimize phase noise performance across 23.9-29.4GHz, and it achieves a 65fs<sub>rms</sub> random jitter at a transmitter output.
- In Paper 11.6, Intel demonstrates a single-chip 100Gb/s PAM-4 optical receiver in a 28nm bulk CMOS process. It employs a 2-tap FFE and 2-tap direct-feedback DFE to achieve -8.3dBm optical sensitivity at 2.4E-4 BER.
- In Paper 11.7, UCLA demonstrates a high-performance NRZ CDR combined with a high-pass feedforward CTLE and a dual-loop DFE in 28nm CMOS technology. It is capable of equalizing 25dB channel loss at 28GHz while providing a bathtub eye opening of 0.4UI at BER<1E-12.
- In Paper 11.8, Marvell demonstrates a TX signal and echo cancellation scheme to enable simultaneous bidirectional signaling across a lossy channel at up to 56Gb/s PAM-4 in each direction, or 112Gb/s aggregate per channel. A novel analog hybrid canceler suppresses the TX signal and its dominant reflections by 26dB across the 14GHz signal band.
- In Paper 11.9, MIT shows a 105Gb/s link operating over a 30cm-long dielectric ribbon waveguide. The link, implemented in 130nm BiCMOS, modulates and demodulates data across three channels covering a 220-to-340GHz frequency band with each channel carrying 35Gb/s.
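
The efficiency and rate figures quoted in the list above can be related with simple arithmetic: a data rate in Gb/s multiplied by an energy efficiency in pJ/b gives power in mW. The sketch below works this out for a few of the papers. It is only a back-of-the-envelope estimate: the text does not say which blocks each pJ/b figure includes, and the helper name is hypothetical.

```python
def link_power_mw(rate_gbps: float, energy_pj_per_bit: float) -> float:
    """Power in mW: (Gb/s) * (pJ/b) = 1e9 b/s * 1e-12 J/b = 1e-3 W."""
    return rate_gbps * energy_pj_per_bit


# Paper 11.1: one 112Gb/s lane at 1.7pJ/b.
p_11_1 = link_power_mw(112.0, 1.7)

# Paper 11.2: eight lanes at 106.25Gb/s/lane and 1.55pJ/b.
# ASSUMPTION: the quoted efficiency applies to the whole eight-lane macro.
agg_11_2 = 8 * 106.25
p_11_2 = link_power_mw(agg_11_2, 1.55)

# Paper 11.8: 56Gb/s in each of two directions on one channel.
agg_11_8 = 2 * 56.0

# Paper 11.9: three channels carrying 35Gb/s each.
agg_11_9 = 3 * 35.0

print(f"11.1: ~{p_11_1:.1f} mW per lane")
print(f"11.2: {agg_11_2:.0f} Gb/s aggregate, ~{p_11_2:.1f} mW")
print(f"11.8: {agg_11_8:.0f} Gb/s aggregate per channel")
print(f"11.9: {agg_11_9:.0f} Gb/s aggregate")
```

The aggregate rates for Papers 11.8 and 11.9 match the 112Gb/s and 105Gb/s totals stated in the overview.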

# Session 11 Highlights: Advanced Wireline Links and Techniques

#### [11.1] A 1.7pJ/b 112Gb/s XSR Transceiver for Intra-Package Communication in 7nm FinFET Technology

#### [11.6] A 100Gb/s -8.3dBm-Sensitivity PAM-4 Optical Receiver with Integrated TIA, FFE and Direct-Feedback DFE in 28nm CMOS

**Paper 11.1 Authors:** Ramy Youstry<sup>\*1</sup>, Ehung Chen<sup>\*1</sup>, Yu-Ming Ying<sup>1</sup>, Mohammed Abdullatif<sup>1</sup>, Mohammad Elbadry<sup>1</sup>, Ahmed ElShater<sup>1</sup>, Tsz-Bin Liu<sup>2</sup>, Joonyeong Lee<sup>1</sup>, Dhinessh Ramachandran<sup>1</sup>, Kaiz Wang<sup>2</sup>, Chih-Hao Weng<sup>2</sup>, Mau-Lin Wu<sup>2</sup>, Tamer Ali<sup>1</sup>

Paper 11.1 Affiliation: <sup>1</sup>MediaTek, Irvine, CA, <sup>2</sup>MediaTek, Hsinchu, Taiwan

Paper 11.6 Authors: Hao Li, Jahnavi Sharma, Chun-Ming Hsu, Ganesh Balamurugan, James Jaussi

Paper 11.6 Affiliation: Intel, Hillsboro, OR

Subcommittee Chair: Frank O'Mahony, Intel, Hillsboro, OR

#### CONTEXT AND STATE OF THE ART

- The speed of the state-of-the-art extra-short reach (XSR) transceivers is extended to 112Gb/s with energy efficiency below 2pJ/b to meet the demand of low-power and high-speed on-package chip-to-chip interconnect for data center or telecommunication infrastructure. The speed, power-efficiency, and area are required to keep pace with the increasing demand for affordable solutions for the exponentially increasing data traffic in heavy computing systems.
- Integration of energy-efficient optical transceivers at 100Gb/s in CMOS can provide a low-cost and power-efficient solution for 400G Ethernet standards. Adoption of more advanced technology and continuing circuit innovation will further improve power, area, and cost of optical interconnects.

#### TECHNICAL HIGHLIGHTS

- MediaTek presents a first XSR transceiver operating at 112Gb/s while consuming only 1.7pJ/b.
  - The transceiver achieves a data rate of 112Gb/s over a 50mm package trace while consuming only 1.7pJ/b and occupying 0.228mm<sup>2</sup>. The energy-efficient PAM-4 short-reach link was enabled by a 5b SST DAC transmitter with 5-tap DSP-based FIR, a receiver with a delay-line based CTLE and two-stage sense-amplifier, and various calibration algorithms in 7nm FinFET technology.
- Intel demonstrates a first single-chip solution of a PAM-4 optical receiver in 28nm CMOS, achieving 100Gb/s and 3.9pJ/b.
  - The first single-chip PAM-4 optical receiver implemented in 28nm CMOS technology achieves the fastest data rate of 100Gb/s at the lowest power of 3.9pJ/b with -8.3dBm sensitivity at 2.4×10<sup>-4</sup> bit-error-rate. The performance was achieved at low power without DSP while incorporating advanced analog circuit techniques including a shunt-feedback linear TIA, 2-tap FFE, and 2-tap direct-feedback DFE that utilizes an integrating S/H and summer.

- As bandwidth demand for medium-reach and long-reach interconnect increases in data center or 5G switch applications, the
  data traffic crowded into a small package area must also be carried at ultra-high bandwidth under a limited power budget and limited hardware
  resources. Extreme short-reach links that bring energy-per-bit below 2pJ/b and chip area below 0.250mm<sup>2</sup> will
  be one of the key enablers of affordable next-generation data centers and 5G switches.
- 100Gb/s PAM-4 optical links are necessary to meet 400G Ethernet standards such as 400G-DR4/FR4, while data center
  applications require reduced power consumption and hardware cost. A single-chip PAM-4
  optical receiver integrated in 28nm bulk CMOS technology will substantially reduce the hardware cost, while energy-efficient
  analog equalization, such as a 100Gb/s direct-feedback PAM-4 DFE, will further bring down the power and area compared with
  DSP-based designs.

# Session 12 Overview: Innovations in Low-power & Secure IoT

#### **Technology Directions Subcommittee**

Session Chair: Sriram Vangal, Intel Corporation, Hillsboro, USA

Session Co-Chair: Long Yan, Samsung Electronics, Hwaseong-Si, Korea

Session Moderator: Frederic Gianesello, STMicroelectronics, Crolles, France

Emerging IoT systems demand higher levels of energy-efficiency and security for mobile applications. The first paper describes a low-power event-driven wake-up IC, followed by a paper proposing reflective and MIMO antenna arrays to improve the efficiency of Wi-Fi backscattering systems. The final paper addresses key security challenges by presenting an advanced PUF solution using spectral regrowth of a power amplifier as the RF fingerprint for IoT devices.

- In Paper 12.1, Peking University presents a 148nW general-purpose event-driven intelligent wake-up chip. An asynchronous spike-based feature extractor and a CNN-based intelligent inference engine achieve a keyword hit rate of up to 94.1% and an abnormal-ECG wake-up hit rate of 99.7%.
- In Paper 12.2, the University of California, San Diego introduces a 38µW IC that performs both fully reflective single-side-band (SSB) Wi-Fi backscattering for single-antenna designs and retro-reflective SSB Wi-Fi backscattering using a MIMO antenna array, providing improvements of 4dB and 15dB, respectively, over prior art and resulting in communication ranges beyond 20m.
- In Paper 12.3, Rice University demonstrates a 2.4GHz PA whose spectral regrowth is used as the RF fingerprint for robust physical-layer identification. A reliable >11.5dB out-of-band leakage power variation and <1.5dB in-band variation are achieved. Sixteen unique PUF settings are measured per chip, showing a close-to-uniform distribution and 5% false identification rate.

### Session 12 Highlights: Innovations in Low-power & Secure IoT

#### [12.1] A 148nW General-Purpose Event-Driven Intelligent Wake-Up Chip for AIoT Devices Using Asynchronous Spike-Based Feature Extractor and Convolutional Neural Network

**Paper Authors:** Zhixuan Wang<sup>1</sup>, Le Ye<sup>1,2</sup>, Ying Liu<sup>1</sup>, Peng Zhou<sup>2</sup>, Zhichao Tan<sup>3</sup>, Haitao Fan<sup>2</sup>, Yihan Zhang<sup>1</sup>, Jiayoon Ru<sup>4</sup>, Yangyuan Wang<sup>1</sup>, Ru Huang<sup>1</sup>

**Paper Affiliation:** <sup>1</sup>Peking University, Beijing, China, <sup>2</sup>Advanced Institute of Information Technology of Peking University, Hangzhou, China, <sup>3</sup>Zhejiang University, Hangzhou, China, <sup>4</sup>XINYI Information Technology, Shanghai, China

Subcommittee Chair: Makoto Nagata, Kobe University, Kobe, Japan, Technology Directions

#### CONTEXT AND STATE OF THE ART

- Ultra-low power is a strong requirement for smart wearables and edge IoT devices.
- Emerging event-driven approaches use intelligent wake-up circuitry to enable ultra-low power and intelligent event detection for IoT devices operating in random-sparse-event (RSE) environments.

#### TECHNICAL HIGHLIGHTS

- A novel, general-purpose event-driven intelligent wake-up chip demonstrates a best-in-class low power of 148nW. Additional innovations include:
  - User event-based processing for power saving in low-signal-rate edge applications.
  - The IC architecture features a level-crossing ADC feeding an asynchronous instant rate of change feature extractor that activates a CNN-based intelligent Inference Engine.
  - Excellent hit rates are achieved both for arrhythmia detection (92-99%) and keyword spotting (82-94%) applications.

- Intelligent nano-watt ICs can open up new health-monitoring applications like abnormal heart rate detection and continuous epilepsy monitoring.
- To scale to a trillion IoT nodes, devices must untether from batteries. Such intelligent nano-watt ICs, when augmented with energy harvesting circuitry, can help usher in the next generation of innovative, battery-less IoT sensor nodes.

# Session 13 Overview: Cryo-CMOS for Quantum Computing

#### **Technology Directions Subcommittee**

Session Chair: Denis Daly, Apple, Cambridge, MA

Session Co-Chair: Shawn S.H. Hsu, National Tsing Hua University, Hsinchu, Taiwan

Session Moderator: Edoardo Charbon, EPFL, Switzerland

Cryogenic CMOS (cryo-CMOS) support of quantum processors is becoming a necessity to ensure the continuous growth of qubit count, so as to achieve scalable, fault-tolerant quantum computers. The first paper of the session describes an integrated controller for spin qubits that performs state manipulation, readout, and gate pulsing fabricated in 22nm FinFET CMOS technology. The second paper presents a fully integrated SoC for spin qubit interface based on RF reflectometry of quantum dots, all implemented in 40nm CMOS technology, for scalable quantum systems operating at 3.5K. The third paper also focuses on scalable RF based readout of spin qubits for a record 0.17mW/qubit power requirement, operating at 4.2K. The fourth paper proposes a 1GS/s A/D converter for low-power digitization of the signal measured from a qubit readout that achieves a FOM of 15fJ/conv.-step at 4K.

- In Paper 13.1, Intel describes an integrated control/readout SoC to drive up to 16 qubits and read up to 6 qubits, and pulse up to 22 gates simultaneously. The chip, fabricated in 22nm FinFET technology, operates at 4K.
- In Paper 13.2, EPFL presents an SoC for the readout of spin qubits based on an intermediate-IF I/Q receiver operating at 5-to-6.5GHz with a 70dB gain and 0.55dB noise figure. The chip can read up to 70 qubits while dissipating 1.5mW/qubit.
- In Paper 13.3, Delft University of Technology proposes a cryo-CMOS readout chip for spin qubits, achieving 58dB gain and 0.6dB noise figure, all at 4.2K.
- In Paper 13.4, Delft University of Technology presents an A/D converter applied to measurements of qubits after readout. The proposed 1GS/s ADC achieves 36.2dB SNDR at 4K, supporting multiple-qubit readout at less than 0.5mW/qubit.
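
The ADC numbers quoted for Paper 13.4 (1GS/s, 36.2dB SNDR, and a FOM of 15fJ/conv.-step) can be tied together with a short calculation. The sketch below assumes the quoted FOM is the Walden figure of merit, P / (2^ENOB · fs), which the unit suggests but the text does not state, and that the FOM and SNDR refer to the same operating point. The resulting power is therefore an implied estimate, not a reported value, and the function names are hypothetical.

```python
def enob_from_sndr(sndr_db: float) -> float:
    """ENOB for a full-scale sine input: (SNDR - 1.76 dB) / 6.02 dB per bit."""
    return (sndr_db - 1.76) / 6.02


def implied_power_w(fom_j_per_step: float, sndr_db: float, fs_hz: float) -> float:
    """Invert the Walden FOM (ASSUMED definition): P = FOM * 2**ENOB * fs."""
    return fom_j_per_step * (2.0 ** enob_from_sndr(sndr_db)) * fs_hz


FOM_J_PER_STEP = 15e-15  # 15fJ/conv.-step, from the session overview
SNDR_DB = 36.2           # from the Paper 13.4 summary
FS_HZ = 1e9              # 1GS/s

adc_enob = enob_from_sndr(SNDR_DB)
adc_power_mw = implied_power_w(FOM_J_PER_STEP, SNDR_DB, FS_HZ) * 1e3

print(f"ENOB ~ {adc_enob:.2f} bits")
print(f"Implied ADC power ~ {adc_power_mw:.2f} mW")
```

Under these assumptions the implied power is below 1mW, so sharing one converter among two or more qubits would be consistent with the stated budget of less than 0.5mW/qubit.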

# Session 13 Highlights: Cryo-CMOS for Quantum Computing

#### [13.1] A Fully-Integrated Cryo-CMOS SoC for Qubit Control in Quantum Computers Capable of State Manipulation, Readout and High-Speed Gate Pulsing of Spin Qubits in Intel 22nm FFL FinFET Technology

**Paper Authors:** Jong-Seok Park<sup>1</sup>, Sushil Subramanian<sup>1</sup>, Lester Lampert<sup>1</sup>, Todor Mladenov<sup>1</sup>, Ilya Klotchkov<sup>1</sup>, Dileep J. Kurian<sup>2</sup>, Esdras Juarez-Hernandez<sup>3</sup>, Brando Perez-Esparza<sup>3</sup>, Sirisha Rani Kale<sup>1</sup>, Asma Beevi K. T.<sup>4</sup>, Shavindra Premaratne<sup>1</sup>, Thomas Watson<sup>1</sup>, Satoshi Suzuki<sup>1</sup>, Mustafijur Rahman<sup>1</sup>, Jaykant B. Timbadiya<sup>2</sup>, Saksham Soni<sup>2</sup>, Stefano Pellerano<sup>1</sup>

Paper Affiliation: <sup>1</sup>Intel Corp., Hillsboro, OR, <sup>2</sup>Intel Corp., Bangalore, India, <sup>3</sup>Intel Corp., Guadalajara, Mexico, <sup>4</sup>Intel Corp., Santa Clara, CA

Subcommittee Chair: Makoto Nagata, Kobe University, Kobe, Japan, Technology Directions

#### CONTEXT AND STATE OF THE ART

- The core of a quantum computer is an array of quantum bits (qubits) that are controlled by bulky room-temperature instrumentation today.
- Such a construction is not compact and may be prone to unreliable operation.

#### TECHNICAL HIGHLIGHTS

- Intel presents a fully integrated gate pulsing and readout for spin qubits for fault-tolerant quantum computing cores.
  - Intel delivers an SoC for the gate pulsing and readout of spin qubits implemented in 22nm FinFET CMOS technology.
  - The chip integrates a micro-controller core and can drive up to 16 spin qubits, read up to 6 qubits, and pulse up to 22 gates simultaneously.
  - Measurements of the chip at 4K are in line with predicted performance.

- Quantum computing holds the promise of solving some of today's intractable problems using superposition and entanglement, two important properties of quantum mechanics.
- Industry's investments in quantum computing research have led to a number of notable developments, creating significant economic activity in this field.
- It is expected that quantum computers will soon surpass classical computers in terms of computing power, thus reaching so-called quantum practicality. This chip represents an important milestone in this direction.

# Session 14 Overview: mm-Wave Transceivers for Communication and Radar

#### **Wireless Subcommittee**

Session Chair: Bodhisatwa Sadhu, IBM T. J. Watson Research Center, New York, NY

Session Co-Chair: Matteo Bassi, Infineon Technologies AG, Villach, Austria

Session Moderator: Vito Giannini, Uhnder Inc., Austin, TX

The session is focused on key advances in mm-wave wireless communication and radar systems. It features papers describing a state-of-the-art multi-user beamforming receiver, an early fusion radar-LiDAR system, self-interference cancellation techniques, FMCW radar MIMO transceivers, along with temperature-healing techniques and crystal-less transceivers.

- In Paper 14.1, the University of California, Berkeley, presents a 16-element by 16-beam multi-user beamforming integrated receiver in 28nm CMOS with on-chip LO generation and baseband analog BF matrix. The proposed chip supports up to 2Gb/s/user wireless links, and handles 16 concurrent user streams over the whole band.
- In Paper 14.2, Nanyang Technological University, Shanghai Jiao Tong University, and Singapore University of Technology and Design present an early fusion complementary RADAR-LiDAR TRX in 65nm CMOS supporting gear-shifting for hierarchy sensing and imaging with sub-cm resolution for smart sensing and imaging.
- In Paper 14.3, Oregon State University demonstrates a 26GHz full-duplex circulator receiver with 53dB over 400MHz (40dB over 800MHz) self-interference cancellation for mm-wave repeaters that can enhance coverage by addressing challenges of path loss/shadowing for 5G mm-wave radios.
- In Paper 14.4, TU Delft presents a 24-to-30GHz double-quadrature direct-upconversion transmitter with mutual-coupling-resilient series-Doherty balanced PA for 5G MIMO arrays. It achieves more than 30% drain efficiency over 6dB PBO while its S22 is less than -18dB at 24 to 30GHz. Without any calibration, the measured IQ image (at 100MHz) is less than -54dBc in the operation frequency band.
- In Paper 14.5, the Institute of Microelectronics of Tsinghua University presents a 1V W-band bidirectional transceiver front-end in 65nm CMOS. The coupled-line-based T/R switches, phase shifters and attenuators were integrated in the TRX FE, achieving <1dB T/R switch IL, >12.3% peak PAE at 15.1dBm output power and <1°/1dB phase/gain resolution with <±2.1dB/±6° gain/phase variation.
- In Paper 14.6, the East China Research Institute of Electronic Engineering, Eindhoven University of Technology, Southwest Integrated Circuit Design and the University of Science and Technology of China, present a 76-to-81GHz 2×8 FMCW MIMO radar transceiver with fast chirp generation and multi-feed antenna-in-package array. The radar is tailored for short- and ultrashort-range radar detection applications and is packaged with eGFO technology. The on-field measured detection range is over 36.4 meters.
- In Paper 14.7, Zhejiang University and the University of California, San Diego, present an adaptive analog temperature-healing low-power 17.7-to-19.2GHz RX front-end with ±0.005dB/°C gain variation, <1.6dB NF variation, and <2.2dB IP<sub>1dB</sub> variation across -15 to 85°C for phased-array receivers.
- In Paper 14.8, the University of Michigan presents a fully integrated 62-to-69GHz crystal-less transceiver designed to eliminate the bulky off-chip frequency reference. An on-chip transmission-line-referenced frequency-locked loop (FLL) allows locking to the desired channel frequency within 4325ppm chip-to-chip variation, and supports 12 non-overlapping channels across the 62-to-69GHz band.
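
To give a sense of scale for the Paper 14.8 figures, the sketch below converts the 4325ppm chip-to-chip variation into an absolute frequency offset at the band center and compares it with the widest channel that 12 non-overlapping channels could occupy in a 7GHz band. The text does not say whether the ppm figure is a peak or peak-to-peak value, nor what the actual channel plan is, so both readings here are assumptions and the helper name is hypothetical.

```python
BAND_LOW_HZ = 62e9     # from the text
BAND_HIGH_HZ = 69e9    # from the text
NUM_CHANNELS = 12      # from the text
VARIATION_PPM = 4325   # from the text


def ppm_to_hz(ppm: float, carrier_hz: float) -> float:
    """Convert a relative frequency error in ppm to Hz at a given carrier."""
    return ppm * 1e-6 * carrier_hz


# ASSUMPTION: evaluate the offset at the band center.
center_hz = (BAND_LOW_HZ + BAND_HIGH_HZ) / 2
offset_hz = ppm_to_hz(VARIATION_PPM, center_hz)

# Upper bound on channel width if the 12 channels evenly tile the band.
max_channel_width_hz = (BAND_HIGH_HZ - BAND_LOW_HZ) / NUM_CHANNELS

print(f"Offset at {center_hz / 1e9:.1f} GHz: ~{offset_hz / 1e6:.1f} MHz")
print(f"Max channel width: ~{max_channel_width_hz / 1e6:.1f} MHz")
```

Under these assumptions the frequency uncertainty is roughly half of the widest possible channel.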

## Session 14 Highlights: mm-Wave Transceivers for Communication and Radar

#### [14.1] A 71-to-86GHz Packaged 16-Element by 16-Beam Multi-User Beamforming Integrated Receiver in 28nm CMOS

#### [14.2] An Early Fusion Complementary RADAR-LiDAR TRX in 65nm CMOS Supporting Gear-Shifting Sub-cm Resolution for Smart Sensing and Imaging

Paper 14.1 Authors: Emily Naviasky, Lorenzo Iotti, Greg LaCaille, Elad Alon, Ali Niknejad

Paper 14.1 Affiliation: University of California at Berkeley, Berkeley, CA

**Paper 14.2 Authors:** Liheng Lou<sup>1,2</sup>, Kai Tang<sup>1</sup>, Zhongyuan Fang<sup>1</sup>, Yisheng Wang<sup>3</sup>, Bo Chen<sup>1</sup>, Ting Guo<sup>1</sup>, Xiaohua Feng<sup>1</sup>, Wensong Wang<sup>1</sup>, Yuanjin Zheng<sup>1</sup>

**Paper 14.2 Affiliation:** <sup>1</sup>Nanyang Technological University, Singapore, Singapore,<sup>2</sup>Shanghai Jiao Tong University, Shanghai, China,<sup>3</sup>Singapore University of Technology and Design, Singapore, Singapore

Subcommittee Chair: Stefano Pellerano, Intel, Hillsboro, OR, Wireless

#### CONTEXT AND STATE OF THE ART

- There is increasing interest in next-generation high-data-rate beamforming systems to enable massive Multi-User MIMO.
- Smart sensing and imaging require comprehensive and low-latency environment information with sub-cm resolution.

#### TECHNICAL HIGHLIGHTS

- The University of California Berkeley presents a 16-element by 16-beam multi-user beamforming integrated receiver in 28nm CMOS.
  - The proposed chip supports up to 2Gb/s/user wireless links, and handles 16 concurrent user streams over the whole band. Local oscillator generation and a baseband analog beamforming matrix are included. The sub-array independently steers 8× more beams with 4× more elements than the state of the art. Power/element/beam is only 7mW.
- Nanyang Technological University, Shanghai Jiao Tong University and Singapore University of Technology and Design
  present an early fusion complementary RADAR-LiDAR TRX in 65nm CMOS supporting gear-shifting for hierarchy
  sensing and imaging with sub-cm resolution for smart sensing and imaging.
  - In the proposed early fusion complementary RADAR-LiDAR architecture, phased-array RADAR fast scans across a wide FoV for long range with adaptive beamforming to extract the angle-of-arrival (AoA) and locate the targets, followed by LiDAR that senses within the narrowed-down window based on the detected AoA, attaining sub-cm resolution. This fusion is achieved in the analog domain with total power consumption of 971mW.

- High-performance MIMO chipset solutions enable the development of next-generation communication systems through multiuser beamforming capabilities. They address the need for faster networks deployed in denser environments where multi-user access by spatial diversity is required to mitigate interference.
- High-accuracy early fusion radar scenarios are paramount in the development of smart sensing, to provide comprehensive and
  effective environment information through complementary signals in data acquisition. Fusion of 77GHz radar with LiDAR
  technology allows for optimized sensing performance in many different use cases and environmental conditions.

### Session 15 Overview: Compute-in-Memory Processors for Deep Neural Networks

#### **Machine Learning Subcommittee**

Session Chair: Jun Deguchi, Kioxia Corporation, Kawasaki, Japan

Session Co-Chair: Yongpan Liu, Tsinghua University, Beijing, China

Session Moderator: Yan Li, Western Digital, Milpitas, CA

Compute-in-memory (CIM) processors for deep neural networks continue to expand their capabilities, and to scale to larger datasets and more complicated models. All four of the papers in this session have integrated CIM into their system setup, and comprehensively evaluate a variety of ML models in high bit precision. The first paper demonstrates a large CIM array with 4.5Mb with bit precision of 1-to-8b. The second paper reduces system energy by using zero skipping, a shared ADC using ping-pong CIM, and digital-predictor-assisted adaptive bit-precision to save power in the ADC. The third paper reduces memory-device footprint by replacing 6T SRAM with 3T plus capacitor. The final paper in the session applies the tensor-train method to decompose and compress neural networks so that they fit within on-chip memory.

- In Paper 15.1, Princeton University describes a scalable neural-network (NN) inference processor based on a 4×4 array of
  programmable cores combining precise mixed-signal capacitor-based in-memory-computing (IMC) with digital SIMD near-memory computing, interconnected with a flexible on-chip network. Implemented in 16nm with 1-to-8b configurable precision,
  their 25mm<sup>2</sup> chip achieves 30TOPS/W in 8b mode.
- In Paper 15.2, Tsinghua University, Pi2star Technology and National Tsing Hua University describe an energy-efficient computing-in-memory (CIM) neural-network processor. Innovations include a set-associate block-wise zero-skipping (SABZA) and a ping pong-CIM (PP-CIM) architecture using a digital-predictor-assisted adaptive 0/2/4b ADC. Their 65nm, 12mm<sup>2</sup> chip supports the ImageNet dataset (8b activations, 4b weights) with 2.75TOPS/W system energy efficiency, and can reach a peak system energy efficiency of 75.9TOPS/W with 2b activations and 1b weights.
- In Paper 15.3, Northwestern University presents a dynamic analog RAM-based computing-in-memory macro and associated CNN accelerator, leading to an effective bit size of only 75% of 6T foundry SRAM. Using special analog sparsity and retention enhancement techniques, their 3.3mm<sup>2</sup> test chip in 65nm technology achieves state-of-the-art energy efficiency of 217TOPS/W at CIM macro level and 44TOPS/W at system level, for 4b weight/input operation.
- In Paper 15.4, Tsinghua University, University of Electronic Science and Technology of China and National Tsing Hua University
  present a computing-in-memory (CIM) processor that exploits a tensor-train (TT) decomposition method to significantly
  compress neural networks. By storing all neural network layers on-chip, their TT@CIM processor in 28nm technology achieves
  5.99-to-691.1TOPS/W energy efficiency using 4b or 8b computations, referenced to the operations in the original uncompressed
  network.
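
The TOPS/W figures in the list above can be read as energy per operation: 1TOPS/W is 10^12 operations per joule, so the energy per operation in femtojoules is 1000 divided by the TOPS/W value. The sketch below applies this to several of the quoted numbers. The results are not directly comparable across papers, because bit precision, macro-level versus system-level accounting, and the definition of an operation differ. The dictionary labels and function name are illustrative only.

```python
def fj_per_op(tops_per_watt: float) -> float:
    """Energy per operation in fJ: 1 TOPS/W = 1e12 op/J, so 1000 / (TOPS/W)."""
    return 1000.0 / tops_per_watt


# TOPS/W values quoted in the session overview above.
reported_tops_per_watt = {
    "15.1, 8b mode": 30.0,
    "15.2, system, 8b activations / 4b weights": 2.75,
    "15.2, system peak, 2b activations / 1b weights": 75.9,
    "15.3, macro level, 4b": 217.0,
    "15.3, system level, 4b": 44.0,
}

energy_fj = {name: fj_per_op(v) for name, v in reported_tops_per_watt.items()}

for name, value in energy_fj.items():
    print(f"{name}: ~{value:.1f} fJ/op")
```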

## Session 15 Highlights: Compute-in-Memory Processors for Deep Neural Networks

#### [15.1] A Programmable Neural-Network Inference Accelerator Based on Scalable In-Memory Computing

Paper Authors: Hongyang Jia, Murat Ozatay, Yinqi Tang, Hossein Valavi, Rakshit Pathak, Jinseok Lee, Naveen Verma

Paper Affiliation: Princeton University

Subcommittee Chair: Marian Verhelst, KU Leuven - MICAS, Leuven, Belgium

#### CONTEXT AND STATE OF THE ART

- Improvements in the area and power efficiency of machine learning are crucial to enable more compute within a given power envelope, particularly for extending the battery life of mobile, IoT and other devices at the edge.
- Compute-in-memory where computation is performed at the location of stored data is an emerging trend for reducing computational energy.

#### TECHNICAL HIGHLIGHTS

- Princeton University describes a scalable neural network inference processor combining precise mixed-signal capacitor-based in-memory-computing (IMC) together with digital SIMD near-memory computing, interconnected with a flexible on-chip network.
  - With a 4×4 array of programmable cores offering 1-to-8b configurable precision, their 16nm chip achieves 30TOPS/W in 8b mode.

- To date, compute-in-memory (CIM) macros tend to offer high energy efficiency only when computing small neural network models, on datasets with small images, using computation of limited bit-precision (1-to-2b).
- Compute-in-memory processors that can offer high energy efficiency for large and complex networks and datasets, while supporting computation at higher bit-precision (4-to-8b) will enable a broader set of practical applications.

### **Session 16 Overview: Compute-in-Memory**

#### **Memory Subcommittee**

Session Chair: Meng-Fan Chang, National Tsing Hua University, Hsinchu, Taiwan

Session Co-Chair: Ru Huang, Peking University, Beijing, China

Session Moderator: Seung-Jun Bae, Samsung, Hwaseong, Korea

Computation in memory (CIM) continues to diversify to cover various memory technologies using computations performed in different signal domains. This session covers CIM designs using ReRAM, eDRAM, and SRAM with computations in both analog and digital domains.

Paper 16.1 describes a high-performance 22nm ReRAM design using a hybrid-precision technique that supports up to 8b-input and 8b-weight MAC operations, while achieving 11.91TOPS/W for 8b-input, 8b-weight and 14b-output, and 195.7TOPS/W for 1b-input, 2b-weight and 4b-output. Paper 16.2 describes the first 1T1C eDRAM design supporting analog 8b-input, 8b-weight and 8b-output computations at 4.76TOPS/W in a 65nm technology. Paper 16.3 presents a 28nm SRAM CIM macro with up to 22.75TOPS/W for 4b-input, 4b-weight and 12b-output and 94.31TOPS/W for 8b-input, 8b-weight and 20b-output. Paper 16.4 takes a different approach and focuses on an all-digital, area-efficient SRAM CIM macro design achieving up to 89TOPS/W with 4b-input, 4b-weight and 16b-output.

- In Paper 16.1, National Tsing Hua University presents a 22nm 4Mb 8b-Precision ReRAM CIM macro with 11.91-195.7 TOPS/W for tiny AI edge devices.
- In Paper 16.2, University of Texas at Austin shows an eDRAM-CIM design with reconfigurable embedded dynamic memory array realizing adaptive data converters and charge-domain computing.
- In Paper 16.3, National Tsing Hua University presents a 28nm 384Kb 6T SRAM CIM macro with 8b-precision for edge AI chips.
- In Paper 16.4, TSMC introduces an 89 TOPS/W and 16.3 TOPS/mm<sup>2</sup> all-digital SRAM-based full-precision CIM macro in 22nm for edge machine-learning applications.

### **Session 16 Highlights: Compute-in-Memory**

#### [16.1] A 22nm 4Mb 8b-Precision ReRAM Computing-in-memory Macro with 11.91-195.7 TOPS/W for Tiny AI Edge Devices

Paper Authors: C-X. Xue, J-M. Hung, H-Y. Kao, Y-H. Huang, S-P. Huang, F-C. Chang, P. Chen, T-W. Liu, C-J. Jhang, C-I. Su, W-S. Khwa, C-C. Lo, R-S. Liu, C-C. Hsieh, K-T. Tang, Y-D. Chih, T-Y. J. Chang, M-F. Chang

Paper Affiliation: National Tsing Hua University, Hsinchu, Taiwan

Subcommittee Chair: Jonathan Chang, TSMC, Hsinchu, Taiwan

#### CONTEXT AND STATE OF THE ART

- Battery-powered tiny-AI edge devices require large-capacity nonvolatile computing-in-memory (nvCIM) with flexible input, weight, and output precision to support a wide range of applications.
- Affected by the read-disturbance voltage of nonvolatile memory devices and the large parasitic bitline load, most existing Mb-level nvCIM macros use a current-mode read scheme and achieve low input-weight precision.

#### TECHNICAL HIGHLIGHTS

- The proposed 22nm 4Mb ReRAM macro fabricated using foundry SLC 1T1R ReRAM technology is the first nvCIM macro to support 8b-input and 8b-weight MAC operations.
- Among silicon-verified nvCIMs, it achieved the fastest access time (4.9-14.8ns) and best energy efficiency (195.7-11.91 TOPS/W) with precision from binary (1b) input and weights to 8b in, 8b weight, and 14b output. It also improves the FoM (EFMAC × input-precision × weight-precision × output-ratio / computing latency) by 3.16-4.1× for binary and up to 4b input, 4b weight configurations.

- High-performance 22nm 4Mb ReRAM computing-in-memory macro with configurable precision, covering up to 8b-input and 8b-weight MAC operation.
- Functionality demonstrated on CIFAR-10 and CIFAR-100 image classification datasets using a ResNet-20 model.

### **Session 17 Overview: DC-DC Converters**

#### **Power Management Subcommittee**

Session Chair: Li Geng, Xi'an Jiaotong University, China

Session Co-Chair: Harish Krishnamurthy, Intel, Beaverton, OR

Session Moderator: Gaël Pillonnet, CEA-Léti, France

Elegant integrated circuit design techniques are required to further enhance the performance of the next generation of DC-DC converters for various applications such as multi-core microprocessors, energy harvesting, automotive electronics and LED drivers. Novel hybrid combinations of inductors and capacitors, minimizing passive components sizes by operating at GHz, recycling energy from gate drivers using bond wire inductors, and using a single inductor for multiple output power paths for minimal cross-regulation, are some of the techniques presented in this session demonstrating the latest in DC-DC converters at both system and circuit levels.

- In Paper 17.1, Dartmouth College presents a step-down hybrid switched capacitor converter in a 0.18µm CMOS technology that uses a self-start and self-balance flying capacitor technique and fully integrated drivers. Managing a 0-to-5V input transient in less than 10µs, the converter achieves up to 85% power efficiency while maintaining <36mV over/undershoot for a 1A/µs load step.
- In Paper 17.2, the University of Toronto and NXP Semiconductors show a fault-tolerant hybrid Dickson converter for 48V automotive applications. The power stage and the driver are fully integrated in a 0.13µm BCD process with 95.3% efficiency at 0.8A and a 1/6 step-down ratio. A master-less control allows multi-phase configuration and guarantees safe operation.
- In Paper 17.3, ETH Zürich introduces a fully integrated self-oscillating DC-DC topology based on a coupled Class-D oscillator, delivering up to 0.82W/mm<sup>2</sup> at a 2.5GHz operating frequency in a 0.18µm CMOS technology.
- In Paper 17.4, Intel proposes an integrated voltage regulator in a 22nm CMOS technology operating at 60MHz that balances current sharing between multiple converter tiles with accuracy as tight as 1.2%. Providing 1A per tile and showing 89.1% power efficiency with an inductor as small as 2nH, the peak-current-controlled gang-able buck converter is dedicated to granular power management of multi-core processors.
- In Paper 17.5, the University of California, San Diego presents a 3<sup>rd</sup>-order inductor-first step-down converter in a 0.13µm CMOS technology, achieving 0.72W/mm<sup>2</sup> power density at 98.2% peak efficiency thanks to reciprocal gate-energy recycling. Using only a single 4nH PCB trace inductor, the pragmatic resonant gate driver enables 80% recycling efficiency.
- In Paper 17.6, KAIST shows a reconfigurable converter in a 65nm CMOS process with a flexible TEG/battery connection through a SIMO buck/boost structure with a single inductor, achieving 88.5% and 93.3% efficiency in each mode. The battery-TEG pileup configuration saves up to 44% of the energy drawn from the battery by maximizing the TEG power extraction.
- In Paper 17.7, National Chiao Tung University and Realtek Semiconductor propose a single-inductor four-output converter in a 0.153µm CMOS process that includes a shared feedback amplifier to minimize the cross-regulation to 0.03mV/mA and achieve 3W driving capability with 94.3% peak efficiency. A 185nA quiescent mode enables 87.5% power efficiency at light load (10mW).
- In Paper 17.8, KAIST shows a ±15V bipolar step-up converter operating from a battery with an energy-recycled fine regulation scheme, followed by a post linear regulator with a controlled dropout voltage for reducing the power loss and the switching ripple noise. The chip, fabricated in a 0.18µm BCD process, offers 28.7µV<sub>rms</sub> output noise and 90.5% peak efficiency.
- In Paper 17.9, National Chiao Tung University and Realtek Semiconductor introduce a 3-switch boost converter that achieves 97.4% power efficiency, a 7.5 step-down ratio, and 1.2A drive current for driving MiniLED arrays. In this structure, the flying-capacitor charging duration increases with the load current, thus expanding the load range and reducing AC and DC inductor currents by 25% and 20%, respectively.

### Session 17 Highlights: DC-DC Converters

#### [17.1] A Two-Stage Cascaded Hybrid Switched Capacitor DC-DC Converter With 96.9% Peak Efficiency Tolerating 0.6V/µs Input Slew Rate During Startup

Paper Authors: Ziyu Xia, Jason Stauth

Paper Affiliation: Dartmouth College, Hanover, NH

Subcommittee Chair: Yogesh Ramadass, Texas Instruments, Santa Clara, CA, Power Management

#### CONTEXT AND STATE OF THE ART

- Aggressive power management in portable hand-held applications requires efficient high-conversion-ratio power delivery in a small form factor (i.e., small volume, area, and height) while providing fast transient response and ultra-fast start-up.
- Hybrid switched-capacitor (SC) converters can deliver high efficiency while reducing the size of needed inductor(s) but they fail to ensure a fast startup while also maintaining capacitor voltage balance.

#### **TECHNICAL HIGHLIGHTS**

- A two-stage cascaded hybrid SC converter featuring fast nonlinear control with automatic flying capacitor balancing is presented that offers both rapid start-up and fast load transient response.
  - The converter achieves 96.9% peak efficiency for 5V-to-1.2V step down and maintains >85% efficiency for 5V-to-0.4V conversion (VCR=12.5).
  - With all gate-drive supplies and bootstrapping integrated on chip, the converter self-starts from 0-to-5V input in <10µs and achieves <36mV over/undershoot for ~1A/µs load current steps.
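
As a quick arithmetic check on the two operating points quoted above, the following minimal sketch computes the voltage conversion ratio (VCR), assuming it is defined as the input voltage divided by the output voltage. The function name `voltage_conversion_ratio` is illustrative and does not come from the paper.

```python
def voltage_conversion_ratio(v_in: float, v_out: float) -> float:
    """Step-down voltage conversion ratio, assumed here to be V_in / V_out."""
    if v_out <= 0:
        raise ValueError("v_out must be positive")
    return v_in / v_out


# Operating points quoted in the highlights above.
vcr_peak_point = voltage_conversion_ratio(5.0, 1.2)  # 96.9% peak-efficiency point
vcr_deep_point = voltage_conversion_ratio(5.0, 0.4)  # >85% efficiency point

print(f"5V-to-1.2V: VCR = {vcr_peak_point:.2f}")
print(f"5V-to-0.4V: VCR = {vcr_deep_point:.1f}")
```

Under this definition the 5V-to-0.4V case reproduces the VCR of 12.5 stated in the highlight.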

#### APPLICATIONS AND ECONOMIC IMPACT

- Power delivery technologies are a key enabler of small-form-factor mobile platforms with excellent battery longevity. By using nonlinear control approaches to complement the advantages of more traditional hybrid switched-capacitor topologies, this paper's contributions could challenge the buck converter's dominance in future mobile power delivery applications.

### Session 18 Overview: Biomedical Devices, Circuits, and Systems

#### **Technology Directions Subcommittee**

Session Chair: Rabia Tugce Yazicigil, Boston University, Boston, MA

Session Co-Chair: Milin Zhang, Tsinghua University, Beijing, China

Session Moderator: Patrick P. Mercier, University of California San Diego, La Jolla, CA

This session covers biomedical systems with innovations that traverse device, circuit, and system-level design. The first paper describes a retinal prosthetic that utilizes an optically-addressed nanowire array in conjunction with a transmitter-offloaded wireless neural stimulation approach for efficient operation near sensitive retinal tissue. A pneumatic-free fully-CMOS-controlled microfluidics platform for label-free cellular and bio-molecular sensing comes next, followed by a CMOS microscopic-scale thermal actuation and sensing array for localized heating of magnetic nanoparticles for hyperthermia cancer therapy. The final paper showcases a wireless multimode IC integrating electrochemical sensors, a temperature sensor, and a current stimulator for monitoring chronic wound healing processes.

- In Paper 18.1, the University of California San Diego presents a retinal prosthetic that enables scaling to 1512 channels via an optically-addressed nanowire array. The system achieves an RF-to-stimulation efficiency of 73% by off-loading charge-balancing and regulation to a wearable transmitter via an integrated charge-metering feedback approach.
- In Paper 18.2, Princeton University presents a CMOS-microfluidic bio-sensing platform that integrates cytometry sensors and actuators with AC electrokinetic fluid flow, eliminating the need for pressure-driven flow. The system achieves a velocity of up to 160µm/s by driving the bulk fluid at 100kHz, and it can precisely control the cell focus within ±3µm of a central flow.
- In Paper 18.3, Rice University presents a 1.18-to-2.62GHz thermal actuation and sensing array for localized heating of magnetic nanoparticles. This array consists of 12 pixels with 0.6mm × 0.7mm spatial resolution and an embedded electro-thermal feedback loop to regulate the temperature with 0.53/0.29°C maximum/rms error.
- In Paper 18.4, National Chiao Tung University presents a wireless system supporting C-reactive protein, uric acid, and temperature readout while providing current stimulation for chronic wound healing. This multi-mode readout IC achieves electrochemical sensing at a scanning rate ranging from 0.08 to 400V/s, while providing a resolution of 2pA for a current range of 12µA.

### Session 18 Highlights: Biomedical Devices, Circuits, and Systems

#### [18.1] An Optically-Addressed Nanowire-Based Retinal Prosthesis with 73% RF-to-Stimulation Power Efficiency and 20nC-3µC Wireless Charge Telemetering

#### [18.2] CMOS Driven Pneumatic-free Scalable Microfluidics and Fluid Processing with Label-free Cellular and Bio-molecular Sensing Capability for an End-to-End Point-of-Care System

Paper 18.1 Authors: Abraham Akinin<sup>1</sup>, Jeremy Ford<sup>1</sup>, Jiajia Wu<sup>1</sup>, Chul Kim<sup>2</sup>, Hiren Thacker<sup>3</sup>, Patrick P. Mercier<sup>1</sup>, Gert Cauwenberghs<sup>1</sup>

Paper 18.1 Affiliation: <sup>1</sup>University of California, San Diego, La Jolla, CA, <sup>2</sup>KAIST, Daejeon, Korea, <sup>3</sup>Nanovision Biosciences, La Jolla, CA

Paper 18.2 Authors: Chengjie Zhu, Jesus Manuel Maldonado Vazquez, Hao Tang, Suresh Venkatesh, Kaushik Sengupta

Paper 18.2 Affiliation: Princeton University, Princeton, NJ

Subcommittee Chair: Makoto Nagata, Kobe University, Kobe, Japan, Technology Directions

#### CONTEXT AND STATE OF THE ART

- Retinal prosthetics help restore vision to patients who are blind, but scaling to many pixels to enable vision better than 20/400 has major technical challenges. An optically-addressed approach can help enable scaling to thousands of channels in a hermetically-compatible package.
- Current platforms employing pneumatic flow control with bulky pumps to perform complex bio-sample preparation and fluid processing have been the bottleneck for enabling handheld point-of-care molecular diagnostic platforms. All-in-one integration of sample processing and label-free cell/bio-molecule sensing enables rapid and scalable analysis platforms.

#### TECHNICAL HIGHLIGHTS

- University of California, San Diego introduces an optically-addressed nanowire-based retinal prosthesis.
  - The first integrated circuit built for a scalable retinal prosthetic system using optical addressing via a co-fabricated nanowire array to enable scaling to thousands of pixels and helping to restore vision to patients who are blind. A wireless charge metering stimulator enables direct conversion of RF energy into stimulation pulses without an on-chip regulator by pushing regulation to the wearable device, enabling an RF-to-stimulation efficiency of 73%. It is important to be efficient, as the chip will be placed near heat-sensitive retinal tissue.
- Princeton University presents a CMOS-driven large-scale microfluidics platform eliminating pressure-driven flows to enable label-free cellular and bio-molecular sensing for an end-to-end point-of-care system.
  - The first all-in-one pneumatic-free, fully CMOS-controlled microfluidic and biosensing platform with controlled AC electrokinetic fluid flow, integrated multiplexed cell manipulation, detection and separation, and label-free bio-molecular detection capability, under a 250µW/pixel power consumption.

#### APPLICATIONS AND ECONOMIC IMPACT

- There are millions of people around the world who are blind, or who have severely degraded vision due to Retinitis Pigmentosa (RP) or Age-Related Macular Degeneration (AMD). The proposed retinal prosthesis technique is a step towards restoring vision to these patients.
- An end-to-end, low-cost, handheld point-of-care platform plays a significant role during the emergence of a pandemic.

### Session 19 Overview: Optical Systems for Emerging Applications

Session Chair: Munehiko Nagatani, NTT, Atsugi, Japan

Session Co-Chair: Nick Van Helleputte, imec, Heverlee, Belgium

Session Moderator: Naveen Verma, Princeton University, Princeton, USA

Optical technologies bring a new sensing and actuation modality critical to several emerging applications. The papers in this session demonstrate the progression of such technologies for increased robustness and system-level integration. The co-integration of optical and photonic technologies with CMOS offers advancements in application domains such as automation/autonomy and biomedical. This session demonstrates the proliferation of different technologies including silicon-photonics, MEMS and flexible electronics.

- In Paper 19.1, University of Southern California describes a 256-element optical phased array for FMCW lidar with on-chip self-calibration capability. The FMCW lidar consists of an optical front-end chip in 220nm silicon-photonics technology and two 180nm CMOS chips each with 136 Class-D pulse-width modulation (PWM) drivers that time-share a 10b current-steering DAC.
- In Paper 19.2, Columbia University demonstrates a mechanically flexible, 250µm thin lens-less neural device with integrated blue and green µLED arrays for fluorescence computational imaging and optogenetic stimulation. This chip achieves 125fps frame-rate with 60µm imaging resolution at 200µm distance consuming 40mW total power.
- Finally, in Paper 19.3, University of California at Berkeley presents a system for axial light focusing in scan-based optogenetics systems. A 23,852-element MEMS array with phase-modulating piston-motion MEMS mirrors achieves a 10kHz frame-rate spatial light modulation by employing a driver ASIC with linearized DACs.

### Session 19 Highlights: Optical Systems for Emerging Applications

#### [19.1] Optical Phased Array FMCW Lidar with On-Chip Calibration

Paper Authors: SungWon Chung, Makoto Nakai, Samer Idres, Yongwei Ni, Hossein Hashemi

Paper Affiliation: University of Southern California, Los Angeles, CA

Subcommittee Chair: Makoto Nagata, Kobe University, Kobe, Japan, Technology Directions

#### CONTEXT AND STATE OF THE ART

- Light detection and ranging (lidar) sensors offer high resolution and high accuracy for diverse applications such as autonomous vehicles and three-dimensional imagers.
- Large-scale optical phased array for lidar systems requires an external detector at a far-field distance for calibrating the inevitable mismatches across the array, resulting in large system size.

#### **TECHNICAL HIGHLIGHTS**

- A 256-element optical phased array for FMCW lidar with on-chip self-calibration capability.
  - By sharing a 256-element optical phased array for both transmit and receive paths, self-calibration can be achieved by monitoring an integrated FMCW receiver output.
  - The FMCW lidar consists of an optical front-end chip in 220nm silicon photonics technology and two 180nm CMOS chips each with 136 Class-D pulse-width modulation (PWM) drivers that are time-sharing a 10b current-steering DAC.

#### APPLICATIONS AND ECONOMIC IMPACT

- An optical phased array with on-chip calibration capability eliminates external optical detectors at the far-field distance.
- The demonstrated functionality opens the door to commercial applications with a compact form factor.

### Session 20 Overview: High-Performance VCOs

#### **RF Subcommittee**

Session Chair: Andrea Bevilacqua, University of Padova, Padova, Italy

Session Co-Chair: Salvatore Levantino, Politecnico di Milano, Milan, Italy

Subcommittee Chair: Jan Craninckx, imec, Leuven, Belgium, RF

Session Moderator: Hua Wang, Georgia Tech, Atlanta, GA

The quest for voltage-controlled oscillators (VCOs) with lower phase noise and higher efficiency continues. In this session, the first paper describes a technique to obtain a wideband common-mode resonance and minimize the 1/f noise upconversion into phase noise in a broadband fashion. Next, a distributed multi-core oscillator, which achieves very low phase noise and an improved FoM with respect to the state-of-the-art of multi-core oscillators, is presented. The final paper showcases a quad-core millimeter-wave oscillator with low phase noise.

- In Paper 20.1, the University of Macau presents a 5.0-to-6.36GHz VCO in 65nm CMOS with a wideband common-mode resonance, achieving a peak FoM of 196.9dBc/Hz at 10MHz offset from the 6.36GHz carrier and the minimum phase noise of -146.1dBc/Hz.
- In Paper 20.2, the University of Electronic Science and Technology of China and Zhejiang University present a distributed multi-core oscillator in 40nm CMOS with a 26.6% tuning range, a minimum phase noise of -138.9dBc/Hz, and an FoM of 195.1dBc/Hz at 1MHz offset from the 3.09GHz carrier.
- In Paper 20.3, Tsinghua University presents a triple-coupled-transformer quad-core VCO that is suitable for mm-wave operation. The VCO prototype in 65nm CMOS exhibits a -104.7dBc/Hz phase noise at 1MHz offset from the 59.12GHz carrier and a 186.5dBc/Hz FoM.

### Session 20 Highlights: High-Performance VCOs

#### [20.2] A 3.09-to-4.04GHz Distributed-Boosting and Harmonic-Impedance-Expanding Multi-Core Oscillator with -138.9dBc/Hz at 1MHz offset and 195.1dBc/Hz FoM

Paper Authors: Y. Shu<sup>1</sup>, H. J. Qian<sup>1</sup>, X. Gao<sup>2</sup>, X. Luo<sup>1</sup>

Paper Affiliation: <sup>1</sup>University of Electronic Science and Technology of China, Chengdu, China; <sup>2</sup>Zhejiang University, Hangzhou, China

Subcommittee Chair: Jan Craninckx, imec, Leuven, Belgium, RF Subcommittee

#### CONTEXT AND STATE OF THE ART

- Lowering the phase noise of local oscillators is essential to improve the spectral efficiency of future communication systems and the spatial resolution of radar systems.
- Multi-core oscillators, which offer the opportunity to scale down the phase noise of traditional oscillators, typically show a degraded noise-power trade-off.

#### TECHNICAL HIGHLIGHTS

- The University of Electronic Science and Technology of China and Zhejiang University present a 3.6GHz distributed multi-core oscillator in 40nm CMOS.
  - The minimum achieved phase noise is -138.9dBc/Hz at 1MHz offset and the minimum flicker corner frequency is 100kHz over the 26.6% tuning range.
  - The oscillator reaches an FoM of 195.1dBc/Hz, which is 6dB higher than in other published quad-core oscillators.
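
The figures above can be cross-checked with a short sketch. It assumes the commonly used oscillator figure of merit, FoM = -PN + 20·log10(f0/Δf) - 10·log10(P_DC/1mW); the press kit does not state which FoM definition the authors use, so the implied DC power computed here is an inference and not a reported result. The function names are illustrative.

```python
import math


def tuning_range_percent(f_min_hz: float, f_max_hz: float) -> float:
    """Tuning range relative to the center frequency, in percent."""
    center_hz = (f_min_hz + f_max_hz) / 2
    return 100 * (f_max_hz - f_min_hz) / center_hz


def implied_dc_power_mw(fom_dbc_hz: float, phase_noise_dbc_hz: float,
                        carrier_hz: float, offset_hz: float) -> float:
    """Invert the assumed FoM definition to get the DC power in mW.

    Assumed definition (not confirmed by the press kit):
        FoM = -PN + 20*log10(f0 / offset) - 10*log10(P_DC / 1mW)
    """
    power_db_mw = (-phase_noise_dbc_hz
                   + 20 * math.log10(carrier_hz / offset_hz)
                   - fom_dbc_hz)
    return 10 ** (power_db_mw / 10)


# Numbers quoted for Paper 20.2: 3.09-to-4.04GHz, -138.9dBc/Hz at 1MHz
# offset from the 3.09GHz carrier, FoM of 195.1dBc/Hz.
tuning_range = tuning_range_percent(3.09e9, 4.04e9)
power_mw = implied_dc_power_mw(195.1, -138.9, 3.09e9, 1e6)

print(f"Tuning range: {tuning_range:.1f}%")
print(f"DC power implied by the assumed FoM definition: {power_mw:.1f}mW")
```

The tuning range computed from the 3.09-to-4.04GHz band agrees with the 26.6% quoted above.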

#### APPLICATIONS AND ECONOMIC IMPACT

- This record FoM will reduce the power consumption of very-low-noise oscillators for wireless applications.

### Session 21 Overview: UWB Systems and Wake-Up Receivers

#### **Wireless Subcommittee**

Session Chair: Hiroyuki Ito, Tokyo Institute of Technology, Yokohama, Japan

Session Co-Chair: Renaldi Winoto, Mojo Vision, Saratoga, CA, USA

Session Moderator: Yao-Hong Liu, imec, Eindhoven, The Netherlands

This session presents two different approaches to highly efficient wireless communication. The first two papers focus on impulse-radio ultra-wideband communication, whereas the following three papers present different implementations of wake-up receivers with low power consumption, high-sensitivity and improved interference rejection.

- In Paper 21.1, Yonsei University and Ewha Womans University present an IR-UWB radio that uses advanced modulation techniques to achieve a very high data rate of 1.25Gb/s while consuming only 28mW for 2m range.
- In Paper 21.2, imec-Netherlands targets the IEEE 802.15.4a/4z IR-UWB standard and presents a coherent polar transmitter with injection-locking phase modulation and an asynchronous pulse-shaping technique to meet worldwide emission requirements.
- In Paper 21.3, Everactive presents a charge-domain wake-up receiver that achieves a high level of integration, as well as wide dynamic and temperature ranges. A sensitivity of -70.2dBm is achieved at 2.7µW power consumption.
- In Paper 21.4, Oregon State University and Columbia University present a 171µW wake-up receiver and a 440µW primary receiver front-end. The wake-up receiver uses a code-modulated 3-tone signal to provide a 6-to-8dB improvement in CW and modulated interferer tolerance.
- In Paper 21.5, the University of Virginia presents a 2.4GHz wake-up receiver with -91.5dBm sensitivity that utilizes within-packet duty-cycling and channel-embedded OOK modulation to achieve 2× lower power and up to 10× lower latency than the state of the art.

### Session 21 Highlights: UWB Systems and Wake-Up Receivers

#### [21.1] A 1.125Gb/s 28mW 2m-Radio-Range IR-UWB CMOS Transceiver

#### [21.2] A 3-to-10GHz 180pJ/bit IEEE802.15.4z/4a IR-UWB Coherent Polar Transmitter in 28nm CMOS with Asynchronous Amplitude Pulse-Shaping and Injection-Locked Phase Modulation

Paper 21.1 Authors: Geunhaeng Lee<sup>1</sup>, Sanghwa Lee<sup>1</sup>, Ji-Hoon Kim<sup>2</sup>, Tae-Wook Kim<sup>1</sup>

Paper 21.1 Affiliation: <sup>1</sup>Yonsei University, Seoul, Korea, <sup>2</sup>Ewha Womans University, Seoul, Korea

Paper 21.2 Authors: Erwin Allebes, Gaurav Singh, Yuming He, Evgenii Tiurin, Paul Mateman, Ming Ding, Johan Dijkhuis, Gert-Jan van Schaik, Elbert Bechthum, Johan van den Heuvel, Mohieddine El Soussi, Arjan Breeschoten, Hannu Korpela, Yao-Hong Liu, Christian Bachmann

Paper 21.2 Affiliation: imec-Netherlands, Eindhoven, The Netherlands

Subcommittee Chair: Stefano Pellerano, Intel, Hillsboro, OR, Wireless

#### CONTEXT AND STATE OF THE ART

- Impulse-radio (IR) UWB relies on fine-time-resolution pulses to encode information. This can lead to high-frequency clocks and/or high-frequency baseband circuits implying high-power consumption to implement.
- Asynchronous pulse-shaping and pulse-position modulation techniques are used to improve data rate, while maintaining compliance with spectral emission requirements, without increasing power consumption. These techniques are primarily enabled by fine-time resolution delay-lines available in modern CMOS processes.

#### TECHNICAL HIGHLIGHTS

- Yonsei University and Ewha Womans University introduce an IR-UWB radio with a 1.25Gb/s data rate over a 2m range while consuming only 28mW.
  - The proposed multi-pulse position modulation scheme improves the system's spectral efficiency and raises the data rate of this IR-UWB system.
  - The higher data rate achieved with 28mW power consumption makes this transceiver the most energy-efficient among state-of-the-art designs at a 2m range.
- imec-Netherlands presents a coherent IR-UWB polar transmitter targeting the IEEE 802.15.4a/4z IR-UWB standard.
  - Asynchronous amplitude pulse-shaping and injection-locked phase modulation are used to achieve good energy efficiency and worldwide spectrum emission compliance.
  - This is the first IEEE802.15.4z/4a-compliant IR-UWB coherent polar transmitter implemented in 28nm CMOS technology.

#### APPLICATIONS AND ECONOMIC IMPACT

- UWB promises high data rates and precise positioning over short distances, which can be useful in many consumer electronics applications.
- The same UWB technology has already been adopted in some cellular phones to provide spatial awareness, locating nearby objects more precisely.
- IEEE802.15.4a and 4z standardize impulse radio for data communication and indoor local positioning.

### Session 22 Overview: Terahertz for Communication and Sensing

#### **Wireless Subcommittee**

Session Chair: Q. Jane Gu, University of California, Davis

Session Co-Chair: Byung-Wook Min, Yonsei University, Seoul, South Korea

Session Moderator: Maryam Tabesh, Google Inc., Mountain View, CA

With the continuing advancement of THz technologies in silicon processes, this year's papers further push the technical frontiers in circuit performance and system demonstrations. This session features four THz papers describing THz Prism for spectrum-to-space mapping, 300GHz wideband communication, a 0.42THz coherent transceiver for phase-contrast imaging, and a phase-processing-based micrometer-range-resolution radar at 250GHz.

- In Paper 22.1, Princeton University demonstrates THz Prism: one-shot simultaneous multi-node angular localization using spectrum-to-space mapping and dual-port integrated leaky-wave antennas in 65nm CMOS. One-shot direction-finding across 1D is demonstrated with an accuracy of 0.95° and 2.1° for integration times of 5ms and 50µs, respectively.
- In Paper 22.2, Tokyo Institute of Technology and NTT present a 300GHz phased-array transceiver using outphasing and Hartley architecture in 65nm CMOS. This work demonstrates the first implementation of a wideband CMOS phased array that operates at a frequency higher than 200GHz.
- In Paper 22.3, KU Leuven presents a 0.42THz coherent TX-RX system for phase-contrast imaging implemented in 40nm CMOS, achieving 52dB SNR (100kHz RBW) at 25cm distance thanks to the 10dBm EIRP TX and the 27dB NF RX.
- In Paper 22.4, the University of Michigan and STMicroelectronics report a 250GHz autodyne FMCW radar in 55nm BiCMOS with micrometer-range resolution using a phase processing method. With +17dBm maximum TX EIRP and 66.7GHz bandwidth, a range resolution of 54µm is achieved for targets at 25.4cm distance, with an overall measured range error better than 0.025%.
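
To put the Paper 22.4 numbers in context, the sketch below compares the conventional bandwidth-limited FMCW range resolution, c/(2B), with the 54µm figure reported above. It is a minimal illustration: the c/(2B) relation is the textbook limit for range-FFT processing, not the paper's phase processing method, and the reading of the 0.025% range error as a fraction of the 25.4cm target distance is an assumption.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def bandwidth_limited_resolution_m(bandwidth_hz: float) -> float:
    """Textbook FMCW range resolution c / (2 * B) for a chirp bandwidth B."""
    return SPEED_OF_LIGHT_M_PER_S / (2 * bandwidth_hz)


# Figures quoted for Paper 22.4.
chirp_bandwidth_hz = 66.7e9
reported_resolution_m = 54e-6
target_distance_m = 0.254
relative_range_error = 0.025 / 100  # assumed to be relative to target distance

conventional_m = bandwidth_limited_resolution_m(chirp_bandwidth_hz)
improvement = conventional_m / reported_resolution_m
error_bound_m = relative_range_error * target_distance_m

print(f"c/(2B) resolution:   {conventional_m * 1e3:.2f} mm")
print(f"Reported resolution: {reported_resolution_m * 1e6:.0f} um")
print(f"Ratio:               {improvement:.0f}x finer than c/(2B)")
print(f"Range error bound:   {error_bound_m * 1e6:.1f} um at 25.4 cm")
```

The comparison shows why phase processing matters here: the chirp bandwidth alone would give millimeter-scale resolution, while the reported figure is in the tens of micrometers.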

### Session 22 Highlights: Terahertz for Communication and Sensing

#### [22.1] THz Prism: One-Shot Simultaneous Multi-Node Angular Localization Using Spectrum-to-Space Mapping with 360-to-400GHz Broadband Transceiver and Dual-Port Integrated Leaky-Wave Antennas

Paper Authors: Hooman Saeidi, Suresh Venkatesh, Xuyang Lu, Kaushik Sengupta

Paper Affiliation: Princeton University, Princeton, NJ

Subcommittee Chair: Stefano Pellerano, Intel, Hillsboro, OR, Wireless

#### CONTEXT AND STATE OF THE ART

- The spectrum above 100 GHz is expected to spawn a generation of ultra-high-speed wireless links and intelligent sensing and imaging applications. Wireless communication and sensing applications require rapid localization and direction finding of mobile nodes.
- The current schemes for direction finding and beam alignment are often non-scalable, time consuming, and computationally expensive, thus posing serious challenges for low-latency applications. Hence there is a compelling need to process such direction-finding methods with very low latencies.

#### **TECHNICAL HIGHLIGHTS**

- THz Prism, an integrated leaky-wave antenna-based spectrum-to-space mapping scheme, is developed to realize one-shot simultaneous angular localization.
  - This work establishes a unique frequency-to-angular map with two dual-port integrated leaky-wave radiators interfacing with a scalable 360-to-400GHz transceiver.
  - One-shot direction-finding is demonstrated across 1D with an accuracy of 0.95° and 2.1° for integration times of 5ms and 50µs, respectively. One-shot 2D direction-finding is also demonstrated with a standard deviation for both angles of about 1.9° for a measurement time of 50ms.

#### APPLICATIONS AND ECONOMIC IMPACT

- The frequency-dependent spatial map response enables spatial localization in a fast and effective manner without the need for scanning or mapping.
- The proposed 400GHz radio transceiver with integrated antennas is the first attempt at demonstrating a monolithic implementation of such a one-shot angular localization system in a mainstream, low-cost CMOS process.

### Session 23 Overview: THz Circuits and Front-Ends

#### **RF Subcommittee**

Session Chair: Swaminathan Sankaran, Texas Instruments Inc., Dallas, TX

Session Co-Chair: Patrick Reynaert, KU Leuven - MICAS, Leuven, Belgium

Subcommittee Chair: Jan Craninckx, imec, Leuven, Belgium, RF

Session Moderator: Shuhei Amakawa, Hiroshima University, Japan

The THz frontier continues to be pushed by mainstream-CMOS circuits with excellent performance. The papers present diverse circuits and front-ends extending state of the art on linearity, signal generation/steering, and detection sensitivity. The first paper presents a THz upconverter leveraging the benefit of parametric gain using optimally biased varactors for enhanced even-order harmonic generation. A THz 2D beam-steering pixel-array source with the capability to spatially and electronically steer is presented in the second paper. The session continues with an ultra-low power and area THz detector with leading-edge noise performance and concludes with a W-band PLL demonstrating best-in-class jitter and FoM.

- In Paper 23.1, the University of Texas, Dallas, presents a 270-to-300GHz double-balanced parametric upconverter based on asymmetric MOS varactors and a power-splitting-transformer hybrid in 65nm CMOS. Among upconverters operating near 300GHz, the paper reports the highest conversion gain and 1<sup>st</sup>-order linearity.
- In Paper 23.2, the University of California, Davis, presents a reconfigurable lens-integrated 436-to-467GHz radiating source with the peak directivity of 26dBi, 51-to-95mW power consumption, and continuous uninterrupted 2D electronic beam scanning leveraging multiple steering methods. The circuit is capable of supporting high-resolution and fast imaging for THz applications.
- In Paper 23.3, KU Leuven presents a 605GHz sub-1mW harmonic injection-locked receiver achieving 2.3pW/√Hz noise-equivalent power (NEP) in a 28nm CMOS process. The paper introduces a new approach for THz detection with the lowest published NEP for CMOS detectors operating above 500GHz.
- In Paper 23.4, KAIST presents a PLL that can directly generate an ultra-low-jitter W-band signal. The 65nm-CMOS 102GHz W-band PLL uses a gated injection-locked frequency-multiplier-based phase detector to achieve 82fs<sub>rms</sub> jitter.

### Session 23 Highlights: THz Circuits and Front-Ends

#### [23.1] 270-to-300GHz Double-Balanced Parametric Upconverter Using Asymmetric MOS Varactors and a Power-Splitting-Transformer Hybrid in 65nm CMOS

#### [23.3] A 605GHz 0.84mW Harmonic Injection-Locked Receiver Achieving 2.3pW/√Hz NEP in 28nm CMOS

Paper 23.1 Authors: Z. Chen<sup>1</sup>, W. Choi<sup>2</sup>, K. O<sup>1</sup>

Paper 23.1 Affiliation: <sup>1</sup>University of Texas at Dallas, Richardson, TX, <sup>2</sup>Oklahoma State University, Stillwater, OK

Paper 23.3 Authors: A. De Vroede<sup>1</sup>, P. Reynaert<sup>1</sup>

Paper 23.3 Affiliation: <sup>1</sup>KU Leuven - MICAS, Leuven, Belgium

Subcommittee Chair: Jan Craninckx, imec, Leuven, Belgium, RF Subcommittee

#### CONTEXT AND STATE OF THE ART

- Efficient THz circuits and front-ends in mainstream, production CMOS platforms allow superior levels of integration with low-cost adoption, enabling fine-grain THz sensing and imaging for object classification, medical, and security applications.
- Maximum operable frequency limits with CMOS technology processes critically constrain generation of power as well as sensitivity, thereby fundamentally limiting use.
- Innovations in circuit topology and novel use of existing integrated components help break the frequency barrier with improved performance over the state of the art.

#### TECHNICAL HIGHLIGHTS

- The University of Texas, Dallas, introduces a THz upconverter in 65nm CMOS achieving higher conversion gain and 1<sup>st</sup>-order linearity than previous state-of-the-art upconverters operating near 300GHz.
  - A 270-to-300GHz double-balanced parametric upconverter based on asymmetric MOS varactors and a power-splitting-transformer achieves a maximum conversion gain of -11.2dB, an OP<sub>1dB</sub> of -6.2dBm, a P<sub>sat</sub> of greater than -3dBm, and a 3dB bandwidth of ~25GHz.
- KU Leuven presents a novel, ultra-low-power approach for above-f<sub>max</sub> THz detection achieving best-in-class noise performance in 28nm CMOS.
  - A 605GHz harmonic injection-locked receiver using the same passive component for fundamental oscillation and 3<sup>rd</sup>-harmonic reception achieves 2.3pW/√Hz noise-equivalent power (NEP) while consuming 0.84mW.

#### APPLICATIONS AND ECONOMIC IMPACT

- Efficient signal generation and improved sensitivity at low power consumption allow adoption for portable/green THz imaging.
- Usable THz performance in commercial CMOS platforms enables massive integration and inexpensive adoption for a variety of cost-sensitive, previously uncharted non-invasive applications.

#### **Memory Subcommittee**

Session Chair: Eric Karl, Intel, Hillsboro, OR, USA

Session Co-Chair: Shinichiro Shiratake, Kioxia, Yokohama, Japan

Session Moderator: Jonathan Chang, TSMC, Hsinchu, Taiwan

Subcommittee Chair: Jonathan Chang, TSMC, Hsinchu, Taiwan

Advancements in embedded memories continue to enable improvements in system designs spanning from automotive to high-performance computing markets. SRAM continues to be a critical technology enabler across the full spectrum of the semiconductor industry. RRAM has emerged as a promising technology for scaling embedded non-volatile memory into advanced nodes, with applications in consumer electronics, self-driving vehicles, intelligent edge devices, and more. This session details recent advancements in SRAM and RRAM technology and circuits. The first paper describes an ultra-high-performance and low-power sensing technique for single-ended 8T SRAM arrays. The second paper outlines a state-of-the-art RRAM reporting record bit density. The third paper showcases a GAA SRAM design and its supporting assist circuitry. The final paper outlines a standard-cell-based SRAM design for small macros, demonstrating high performance and ultra-low operating voltages.

- In Paper 24.1, IBM Research describes a single-ended current-sense amplifier implemented for an 8T SRAM array with 7nm FinFETs. The array operates at 6.2GHz from a 1.0V supply and at 2GHz from a 0.5V supply, while saving 10-12% of power compared to conventional domino read circuits.
- In Paper 24.2, the Chinese Academy of Sciences presents a 14nm FinFET 1Mb RRAM with the smallest published bitcell size of 0.022µm<sup>2</sup>. A new self-adaptive delayed termination (SADT) write scheme improves the creation of robust conductive filaments and enables up to an 87% reduction in R<sub>ON</sub>-related failures.
- In Paper 24.3, Samsung demonstrates a 3nm Gate-All-Around SRAM featuring improved ADBL and ACP circuit assists to increase cell operating margins. ADBL and ACP combine to improve minimum operating voltage by 230mV, based on results from a GAA SRAM testchip.
- In Paper 24.4, TSMC presents a 5nm standard-cell-based 16T SRAM macro targeted at small-capacity macro applications with improved integration with memory periphery circuits. A 4kb array with this implementation achieves 5.7GHz operation from a 1.0V supply, with a minimum operating voltage floor of 0.35V.
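
To put the 0.022µm<sup>2</sup> bitcell of Paper 24.2 in perspective, the short sketch below converts a bitcell area into a raw cell density and into the area occupied by the cells of a 1Mb array. It counts bitcells only and ignores periphery, redundancy, and array overhead, and it assumes 1Mb means 2<sup>20</sup> bits. The function names are illustrative.

```python
UM2_PER_MM2 = 1_000_000  # 1 mm^2 = 1e6 um^2

def bits_per_mm2(bitcell_area_um2):
    """Raw bitcell density, ignoring periphery and array overhead."""
    return UM2_PER_MM2 / bitcell_area_um2

def cell_array_area_mm2(num_bits, bitcell_area_um2):
    """Area taken by the bitcells alone for an array of num_bits cells."""
    return num_bits * bitcell_area_um2 / UM2_PER_MM2

density = bits_per_mm2(0.022)              # bitcell size quoted for Paper 24.2
area = cell_array_area_mm2(2**20, 0.022)   # assumes 1Mb = 2**20 bits
print(f"{density / 1e6:.1f} Mbit/mm^2 raw cell density")
print(f"{area:.4f} mm^2 for the cells of a 1Mb array")
```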

### **Session 24 Highlights: Advanced Embedded Memories**

#### [24.1] A 6.2GHz Single-Ended Current-Based Sense Amplifier (CSA) Compileable 8T SRAM in 7nm FinFET

#### [24.3] 3nm Gate-All-Around SRAM Featuring an Adaptive Dual-BL and an Adaptive Cell-Power Assist Circuit

**Paper 24.1 Authors:** Alexander Fritsch<sup>1</sup>, Rajiv Joshi<sup>2</sup>, Sudipto Chakraborty<sup>2</sup>, Holger Wetter<sup>1</sup>, Uma Srinivasan<sup>1</sup>, Matthew Hyde<sup>3</sup>, Otto Torreiter<sup>1</sup>, Michael Kugel<sup>1</sup>, Dan Radko<sup>3</sup>, Hyong Kim<sup>3</sup>, Daniel Friedman<sup>2</sup>

Paper 24.1 Affiliation: <sup>1</sup>IBM, Boeblingen, Germany, <sup>2</sup>IBM Research, Yorktown Heights, NY, <sup>3</sup>IBM, Poughkeepsie, NY

**Paper 24.3 Authors:** Taejoong Song, Woojin Rim, Hoonki Kim, Keun Hwi Cho, Taeyeong Kim, TaeJung Lee, Geumjong Bae, Dong-Won Kim, SD Kwon, Soon-Moon Jung, Sanghoon Baek, Jonghoon Jung, Jongwook Kye, Jaehong Park

Paper 24.3 Affiliation: Samsung Electronics, Gyeonggi-Do, Korea

Subcommittee Chair: Jonathan Chang, TSMC, Hsinchu, Taiwan

#### CONTEXT AND STATE OF THE ART

- Advances in compute performance across a range of product applications are driving the need for improved bandwidth and energy efficiency from the memory hierarchy.
- On-die SRAM is increasingly important to provide low-latency and energy-efficient caches to further drive compute performance improvements.
- Multi-port memories remain an important part of high-performance system design in order to provide improved bandwidth per clock cycle.

#### TECHNICAL HIGHLIGHTS

- IBM showcases a novel current-mode sense amplifier in a multi-port 8T SRAM array, enabling 6.2GHz operation at 1V and delivering a 10-12% power reduction compared to an array using a traditional ripple-domino read architecture. The sensing architecture is demonstrated to reach down to a minimum operating voltage (V<sub>MIN</sub>) of 0.5V.
- Samsung describes a 3nm GAA SRAM array design that supports increased transistor sizing flexibility compared to a FinFET technology, leading to an improved read margin for their cell designs. Adaptive dual-BL (ADBL) and adaptive cell-power (ACP) techniques are described to improve the write margin, enabling a 230mV reduction in V<sub>MIN</sub> with next-generation GAA transistors.

#### APPLICATIONS AND ECONOMIC IMPACT

- SRAM is a foundational component of the memory hierarchy in all modern system-on-chip designs, ranging from mobile processors to servers and high-performance compute accelerators.
- Continued SRAM performance and area scaling paves the way for improved products ranging from mobile to high-performance computing applications.

# Session 25 Overview: DRAM

#### **Memory Subcommittee**

Session Chair: Dong Uk Lee, SK hynix, Icheon, Korea

Session Co-Chair: Bor-Doou Rong, Etron, Hsinchu, Taiwan

Session Moderator: Kyu-Hyoun (KH) Kim, IBM T. J. Watson, Yorktown Heights, NY

DRAM memories continue to have a significant impact on a wide range of applications, including high-performance graphics, smartphones, server applications, and machine learning. Two 8Gb GDDR6 DRAM papers for next-generation graphics applications increase the maximum pin speed to 22-24Gb/s/pin, and a 16Gb LPDDR5 DRAM with sub-1V operation and a 7.14Gb/s/pin IO speed is introduced for low-power mobile applications. Function-in-memory DRAM based on HBM2 is also introduced, which achieves a 1.2TFLOPS programmable computing unit (PCU).

- In Paper 25.1, SK hynix presents a 24Gb/s/pin 8Gb GDDR6 DRAM with a half-rate daisy-chain-based WCK tree, an LC PLL, and I/O circuits optimized for low-noise operation and a wide range of supply voltages. Based on its configuration, it performs 1.3× faster than the previously published GDDR6 DRAM.
- In Paper 25.2, Samsung Electronics presents a 16Gb sub-1V 7.14Gb/s/pin LPDDR5 DRAM. An innovative mosaic architecture is introduced to increase the density within a limited package size. I/O performance is improved by shortening the length of the top metal and by a newly introduced short-feedback SA with dedicated V<sub>REF</sub>s for a 1-tap DFE. The 1.64× long bus line is managed by a fully-source-synchronous (FSS) bus and a low-level swing (LLS) scheme. To enhance power efficiency and yield, an adaptive body bias (ABB) scheme is used.
- In Paper 25.3, Micron shows an 8Gb GDDR6X DRAM that features a PAM4 encoded, single-ended I/O to double the per-pin bandwidth for a given data-clock. Three CTLE pre-amplifiers shield the four-phase clocked latches from the pad. A de-emphasis PAM4 transmitter is implemented for read operations. The package is optimized for crosstalk to address the decreased voltage margins of PAM4 signaling. Full device operation exceeding 22Gb/s/pin is demonstrated.
- In Paper 25.4, Samsung Electronics shows an HBM2-based function-in-memory DRAM that improves system performance through on-chip data processing and reduces system power consumption by reducing inter-chip data movement. The measurement results show a 2.1× improvement in system performance with a 71% power reduction compared to a conventional system.
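
The PAM4 trade-off mentioned for Paper 25.3 can be sketched with a few lines of arithmetic: four amplitude levels carry two bits per symbol, so the symbol rate is half the bit rate, while an ideal eye shrinks to one third of the full swing. This is a minimal illustration with hypothetical function names. It assumes equally spaced levels and ignores noise, crosstalk, and equalization.

```python
import math

def bits_per_symbol(levels):
    """Bits carried by one symbol of an M-level PAM signal."""
    return math.log2(levels)

def symbol_rate_gbaud(data_rate_gbps, levels):
    """Symbol rate needed to reach a given per-pin data rate."""
    return data_rate_gbps / bits_per_symbol(levels)

def eye_fraction_of_swing(levels):
    """Ideal eye height relative to full swing, assuming equally spaced levels."""
    return 1.0 / (levels - 1)

print(symbol_rate_gbaud(22.0, 4))   # 11.0 GBaud for 22Gb/s/pin with PAM4
print(symbol_rate_gbaud(22.0, 2))   # 22.0 GBaud for the same rate with NRZ
print(eye_fraction_of_swing(4))     # each PAM4 eye is 1/3 of the full swing
```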

### Session 25 Highlights: DRAM

#### [25.1] A 24Gb/s/pin 8Gb GDDR6 with Half-Rate Daisy-Chain-Based Clocking Architecture with IO circuitry for Low-Noise Operation

#### [25.2] A 16Gb Sub 1V 7.14Gb/s/pin LPDDR5 SDRAM Applying Mosaic Architecture with Short-Feedback 1-tap DFE, FSS Bus with Low Level Swing, and Adaptively Controlled Body Biasing in 3rd Generation of 10nm DRAM

**Paper 25.1 Authors:** Kyunghoon Kim, Joo-Hyung Chae, Jaehyeok Yang, Jihyo Kang, Gangsik Lee, Sangyeon Byeon, Youngtaek Kim, Boram Kim, Dong-Hyun Kim, Yeongmuk Cho, Kangmoo Choi, Hyeongyeol Park, Junghwan Ji, Sera Jeong, Yongsuk Joo, Jaehoon Cha, Minsoo Park, Hongdeuk Kim, Sijun Park, Kyubong Kong, Sunho Kim, Sangkwon Lee, Junhyun Chun, Hyungsoo Kim, Seonyong Cha

Paper 25.1 Affiliation: SK hynix, Icheon, Korea

Paper 25.2 Authors: Yong-Hun Kim, Hyung-Jin Kim, Jaemin Choi, Min-Su Ahn, Dongkeon Lee, Seung-Hyun Cho, Dong-Yeon Park, Young-Jae Park, Min-Soo Jang, Yong-Jun Kim, Jinyong Choi, Sung-Woo Yoon, Jae-Woo Jung, Jae-Koo Park, Jae-Woo Lee, Dae-Hyun Kwon, Hyung-Seok Cha, Si-Hyeong Cho, Seong-Hoon Kim, Jihwa You, Kyoung-Ho Kim, Dae-Hyun Kim, Byung-Cheol Kim, Young-Kwan Kim, Jun-Ho Kim, Seouk-Kyu Choi, Chan-Young Kim, Byong-Wook Na, Hye-In Choi, Reum Oh, Jeong-Don Ihm, Seung-Jun Bae, Nam Sung Kim, Jung-Bae Lee

Paper 25.2 Affiliation: Samsung Electronics, Hwaseong, Korea

Subcommittee Chair: Jonathan Chang, TSMC, Hsinchu, Taiwan

#### CONTEXT AND STATE OF THE ART

- GDDR6 is a JEDEC DRAM standard on the basis of discrete components for high-speed graphics DRAM.
- LPDDR5 is a JEDEC DRAM standard on the basis of discrete components for low-power and high-speed mobile DRAM.

#### TECHNICAL HIGHLIGHTS

- SK hynix introduces a 24Gb/s/pin GDDR6 DRAM, the highest pin rate DRAM reported so far.
  - The design uses a half-rate daisy-chain-based WCK tree with an LC PLL and IO circuits optimized for low-noise operation and a wide range of supply voltages.
- Samsung introduces a 16Gb sub-1V 7.14Gb/s/pin LPDDR5, the highest density mobile DRAM reported to date.
  - An innovative mosaic architecture is introduced to increase density. Long bus lines are managed by fully-source-synchronous and low-level swing signaling. An adaptive body bias (ABB) scheme is used to enhance power efficiency and yield.

#### APPLICATIONS AND ECONOMIC IMPACT

- GDDR6 is intended for use in a large variety of high-speed applications such as graphics and artificial intelligence. GDDR6 offers an increased per-pin bandwidth over any other DRAM. It promises a major boost in memory bandwidth at lower cost.
- LPDDR5 provides the performance and power efficiency for next-generation 5G communication, on-device artificial intelligence, advanced driver assistance systems (ADAS), and high-resolution displays for mobile communication devices.

# Session 26 Overview: RF Power-Amplifier and Front-End Techniques

#### **RF Subcommittee**

Session Chair: Hongtao Xu, Fudan University, Shanghai, China

Session Co-Chair: Toshiya Mitomo, Toshiba, Kanagawa, Japan

Subcommittee Chair: Jan Craninckx, imec, Leuven, Belgium, RF Subcommittee

Session Moderator: James Buckwalter, University of California, Santa Barbara, CA

RF PAs and front-end building blocks are critical for the output power, noise, power-efficiency, out-of-band rejection, and antenna impedance tuning of wireless transceivers. The first three papers in this session cover mm-wave PAs with efficiency enhancement techniques for 5G applications. The next paper presents a 28GHz reflection-coefficient sensor for mm-wave antenna-tuner applications. The session continues with two digital-PA papers in sub-6GHz frequency bands with back-off efficiency enhancement. The session concludes with an N-path filter exhibiting impedance transformation and passive voltage gain.

- In Paper 26.1, the Georgia Institute of Technology describes a continuous Marchand Doherty PA. The proposed PA demonstrates the broadband saturation performance of P<sub>sat</sub>>18.3dBm and peak PAE>23.3% from 26 to 60GHz.
- In Paper 26.2, KU Leuven presents a Doherty-like load-modulated balanced amplifier achieving 15.5dBm of average output power and 20% average efficiency at the data-rate of 18Gb/s in 28nm bulk CMOS.
- In Paper 26.3, the Georgia Institute of Technology shows a dual-drive PA core architecture that exploits the concurrent driving of the gate/source terminals. This design achieves the maximum PAE of 50% and maximum DE of 60% at 30GHz.
- In Paper 26.4, imec presents a sensor in 22nm FDSOI that measures complex reflection coefficients for VSWR values up to 5.7. The sensor costs 0.024mm<sup>2</sup> in die area, 13mW in power consumption, and up to 0.2dB in extra insertion loss.
- In Paper 26.5, the University of Electronic Science and Technology of China introduces a watt-level quadrature switched/floated-capacitor PA with a hybrid Doherty topology and impedance boosting. The chip achieves 30.3dBm peak output power with 36.5% PAE and efficiency enhancement at 3/6/9/12/15dB PBOs.
- In Paper 26.6, the University of Southern California describes a current-mode subharmonic switching (SHS) digital-PA architecture for PBO efficiency enhancement. The 65nm CMOS PA achieves P<sub>sat</sub> of 27dBm and 40.1/29.2% efficiency at peak and -9dB PBO.
- In Paper 26.7, Columbia University describes an N-path filter that combines impedance transformation and passive voltage amplification. The two-stage (three-stage) design achieves a maximum OOB rejection of 27dB (33dB) with a 3dB BW of 71MHz (83MHz).

# Session 26 Highlights: RF Power-Amplifier and Front-End Techniques

# [26.4] A Reflection-Coefficient Sensor for 28GHz Beamforming Transmitters in 22nm FD-SOI CMOS

Paper Authors: Y. Zhang, G. Mangraviti, J. Nguyen, Z. Zong, P. Wambacq

Paper Affiliation: imec, Heverlee, Belgium

Subcommittee Chair: Jan Craninckx, imec, Leuven, Belgium, RF Subcommittee

#### CONTEXT AND STATE OF THE ART

- Mm-wave communication and radar sensors use beamforming in antenna arrays to increase the range and the directionality.
- The antennas modify each other's impedance under sharp scanning angles, which drives the PA load matching away from the optimal condition; thus, there is a need for a tunable matching network that is controlled by a VSWR sensor.

#### **TECHNICAL HIGHLIGHTS**

- A CMOS reflection-coefficient sensor, which measures antenna VSWR by sensing the voltage over a capacitor in the PA output matching network, is proposed for 28GHz beamforming transmitters.
  - The 22nm FD-SOI sensor measures complex reflection coefficients for VSWR values up to 5.7.
  - The sensor costs only 0.024mm<sup>2</sup> in die area, 13mW in power consumption, and up to 0.2dB in extra insertion loss.
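
To relate the figures above to more familiar quantities, the following sketch uses the standard textbook relations between VSWR, reflection-coefficient magnitude, and return loss, and converts the quoted 0.2dB worst-case insertion loss into a fraction of power lost. The function names are illustrative, and none of this represents the sensor's measurement method.

```python
import math

def reflection_magnitude(vswr):
    """|Gamma| = (VSWR - 1) / (VSWR + 1)."""
    return (vswr - 1.0) / (vswr + 1.0)

def return_loss_db(vswr):
    """Return loss in dB for a given VSWR."""
    return -20.0 * math.log10(reflection_magnitude(vswr))

def power_fraction_lost(insertion_loss_db):
    """Fraction of power lost for a given insertion loss in dB."""
    return 1.0 - 10.0 ** (-insertion_loss_db / 10.0)

gamma = reflection_magnitude(5.7)   # upper end of the quoted VSWR range
print(f"|Gamma| = {gamma:.3f}")
print(f"return loss = {return_loss_db(5.7):.2f} dB")
print(f"power lost at 0.2dB insertion loss = {100 * power_fraction_lost(0.2):.1f}%")
```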

#### APPLICATIONS AND ECONOMIC IMPACT

- A low-loss, low-cost mm-wave VSWR sensor can be embedded in the PA matching network for antenna arrays.
- The wide VSWR detection range makes self-healing mm-wave PAs suitable for commercial applications.

# Session 27 Overview: Discrete-Time ADCs

#### **Data Converters Subcommittee**

Session Chair: John Keane, Keysight Technologies, Santa Clara, CA

Session Co-Chair: Chih-Cheng Hsieh, National Tsing Hua University, Taiwan

Session Moderator: Bob Verbruggen, Xilinx, Dublin, Ireland

The ubiquitous SAR ADC continues to evolve with emerging noise-shaping variants enabling higher resolutions, while maintaining its power efficiency and fully dynamic nature. The first three papers demonstrate high-resolution SAR ADCs for precision applications using noise-shaping techniques and a charge-injection cell-based DAC. The next three papers describe higher bandwidth SAR ADCs using time-interleaving and pipelining techniques. The final paper is a bandpass ADC that combines an N-path filter with a noise-shaping SAR ADC.

- In Paper 27.1, Tsinghua University presents a 4<sup>th</sup>-order noise-shaping SAR that achieves 93dB SNDR over 250kHz bandwidth with 340µW power, leading to a Schreier FoM of 182dB. This NS-SAR has low complexity and is PVT-robust.
- In Paper 27.2, the University of Michigan introduces a Nyquist-rate capacitor-array-assisted cascaded charge-injection SAR ADC with 17b resolution. It achieves 14.14b ENOB and a 184.9dB Schreier FoM.
- In Paper 27.3, Georgia Institute of Technology describes a 3<sup>rd</sup>-order NS-SAR ADC featuring an EF-CIFF hybrid structure with kT/C noise cancelling. Fabricated in 65nm, the prototype achieves 13.8b ENOB consuming 119µW, leading to a 182dB Schreier FoM.
- In Paper 27.4, the University of Texas at Austin presents a pipelined SAR ADC equipped with a 3-stage fully dynamic floating inverter amplifier that achieves consistent 75.7dB SNDR and linearly scaled power from 0.4MS/s to 40MS/s.
- In Paper 27.5, MediaTek introduces a fully passive, time-interleaved noise-shaping SAR ADC in 22nm FDSOI. This ADC achieves a peak SNDR of 66.3dB over 80MHz bandwidth at 640MS/s and consumes 2.56mW.
- In Paper 27.6, the University of Macau describes a noise-shaping SAR-assisted pipeline ADC. With a partial-interleaving first stage and 4b DWA, it runs at 400MHz and achieves 75dB SNDR with 25MHz bandwidth.
- In Paper 27.7, the University of Texas at Austin presents an open-loop bandpass ΔΣ modulator that combines an N-path filter with a noise-shaping SAR ADC. This work achieves 78.7dB SNDR and a 167.1dB Schreier FoM.
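
The Schreier figures of merit quoted above can be sanity-checked from the other numbers in each summary. The sketch below assumes the commonly used SNDR-based form, FoM<sub>S</sub> = SNDR + 10·log<sub>10</sub>(BW/P), together with the standard ENOB-to-SNDR conversion. The function names are illustrative, and the papers themselves may define the FoM slightly differently (for example, using dynamic range instead of SNDR).

```python
import math

def schreier_fom_db(sndr_db, bandwidth_hz, power_w):
    """SNDR-based Schreier FoM: SNDR + 10*log10(BW / P)."""
    return sndr_db + 10.0 * math.log10(bandwidth_hz / power_w)

def enob_to_sndr_db(enob_bits):
    """Standard conversion: SNDR = 6.02 * ENOB + 1.76 dB."""
    return 6.02 * enob_bits + 1.76

# Paper 27.1 figures from the summary: 93dB SNDR, 250kHz bandwidth, 340uW.
fom = schreier_fom_db(93.0, 250e3, 340e-6)
print(f"Schreier FoM = {fom:.1f} dB")   # rounds to the quoted 182dB

# Paper 27.2 quotes 14.14b ENOB; the equivalent SNDR is:
print(f"SNDR = {enob_to_sndr_db(14.14):.1f} dB")
```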

### **Session 27 Highlights: Discrete-Time ADCs**

# [27.1] A 250kHz-BW 93dB-SNDR 4<sup>th</sup>-Order Noise-Shaping SAR Using Capacitor Stacking and Dynamic Buffering

Paper Authors: Jiaxin Liu<sup>1</sup>, Dengquan Li<sup>2</sup>, Yi Zhong<sup>1</sup>, Xiyuan Tang<sup>3</sup>, Nan Sun<sup>1,3</sup>

Paper Affiliation: <sup>1</sup>Tsinghua University, Beijing, China, <sup>2</sup>Xidian University, Xi'an, China, <sup>3</sup>University of Texas at Austin, Austin, TX

Subcommittee Chair: Michael Flynn, University of Michigan, Ann Arbor, MI

#### CONTEXT AND STATE OF THE ART

- The noise-shaping (NS) SAR is an emerging hybrid architecture aiming to combine the benefits of both SAR and  $\Delta\Sigma$  ADCs.
- The NS filters are mainly implemented using the power-consuming amplifier-based integrator with PVT variation, or passive charge sharing with signal attenuation and mild NTF.

#### TECHNICAL HIGHLIGHTS

- The paper presents the first noise-shaping SAR ADC that achieves >90dB SNDR with >100kHz bandwidth, while consuming 340µW.
  - A passive NS filter architecture uses stacking residue capacitors as an integrator and a simple source follower as a unity-gain buffer to achieve a PVT-robust 4<sup>th</sup>-order noise shaping without signal attenuation.
  - Differential integration and chopping operations are implemented to address the buffer offset and flicker noise issues.
  - The 1.1V NS-SAR achieves a dynamic range of 95dB and an SNDR of 93.3dB with a 0.3dB variation against 10% supply variation.

#### APPLICATIONS AND ECONOMIC IMPACT

- Advanced circuit techniques for low-power high-resolution ADCs make biomedical and AIoT applications more efficient.
- Compact and PVT-robust ADCs enable applications using multiple ADCs and parallel processing.

# Session 28 Overview: Biomedical Systems

#### Imagers, Medical, MEMS and Displays Subcommittee

Session Chair: Joonsung Bae, Kangwon National University, Korea

Session Co-Chair: Jennifer Lloyd, Analog Devices, Santa Clara, CA

Session Moderator: Chris Van Hoof, IMEC, Leuven, Belgium

Health-monitoring capabilities continue to expand, with increasingly low-power and artifact-tolerant operation for both non-invasive and immersive applications. The first paper describes a high-performance artifact-tolerant ECG/EOG system, followed by papers demonstrating in-ear and on-chest PPG measurement, dry-electrode tolerant ECG recording, bioimpedance measurement, and long-term ECG recording capability. Two neural-recording papers demonstrate very low power and area for immersive design.

- In Paper 28.1, the University of California, San Diego presents a sensor interface front-end for wearable applications with 95dB dynamic range and 128dB linearity in 1.68µW in 65nm CMOS. The front-end is demonstrated through in-vivo recordings of ECG and EOG in the presence of motion and other artifacts.
- In Paper 28.2, Samsung presents a monolithic PPG sensor for in-ear monitoring. An integrated near-IR sensor achieves 4-fold increase in spectral efficiency from 400 to 1000nm over conventional photodiodes. The system provides 90dB dynamic range and consumes 24µW in 5.5mm<sup>2</sup> in 65nm BSI CMOS, enabling in-vivo measurement of heart rate and blood pressure in the presence of motion artifacts.
- In Paper 28.3, IMEC introduces a sensor for chest-based PPG monitoring utilizing a 2<sup>nd</sup>-order light-to-digital converter with an on-chip DC compensation loop. The system achieves 134dB dynamic range and consumes 28µW in 0.18µm CMOS, demonstrating both chest PPG and SpO2 in vivo.
- In Paper 28.4, the University of California, San Diego introduces a VCO-based ΔΣ AFE for biopotential measurement, achieving 92.3dB SNDR in 1kHz, and enabling ECG signal recording in the presence of motion artifacts. The 65nm CMOS system includes an impedance booster to maintain >50MΩ input impedance for dry electrode measurement, while consuming <5.8µW.
- In Paper 28.5, the Institute of Microelectronics, Singapore describes a bioimpedance measurement scheme, leveraging first-order noise-shaped ΔΣ and a technique to modulate flicker noise to high frequencies for improved noise performance. The circuit achieves 101.9dB SNR within a 4Hz bandwidth, and measures bioimpedance from 20Ω to 20kΩ with a maximum of 119.3µW in 0.6mm<sup>2</sup> in 40nm CMOS, and demonstrates respiration and heart-rate measurement in vivo.
- In Paper 28.6, KAIST presents a biopotential amplifier utilizing an adaptive loop to achieve a total CMRR of >100dB. This 180nm CMOS, 22.6µW circuit demonstrates long-term recording of ECG in the presence of electrode mismatch of up to 40% and common-mode interference of up to 10V.
- In Paper 28.7, the University of Freiburg IMTEK reveals a neural recording front-end for immersible probe designs based on a continuous-time g<sub>m</sub>-C incremental ΔΣ ADC design. The modular system enables a flexible number of recording sites and circuit sharing to achieve very low per-channel area (<0.0005mm<sup>2</sup> total with electrode offset compensation) and power of <15µW per channel in 180nm CMOS.
- In Paper 28.8, the University of Toronto introduces an SoC for peripheral nerve stimulation and recording. This 64-channel neural-recording system includes cuff imbalance compensation and per-channel ΔΣ ADC consuming 140nW in 130nm CMOS. The system demonstrates nerve stimulation and fascicle selectivity in vivo.

### **Session 28 Highlights: Biomedical Systems**

#### [28.1] A Distortion-Free VCO-Based Sensor-to-Digital Front-End Achieving 178.9dB FoM and 128dB SFDR with a Calibration-Free Differential Pulse-Code Modulation Technique

#### [28.8] Multi-Modal Peripheral Nerve Active Probe and Microstimulator with On-Chip Dual-Coil Power/Data Transmission and 64 2<sup>nd</sup>-order Opamp-Less ΔΣ ADCs

Paper 28.1 Authors: Jiannan Huang, Patrick P Mercier

Paper 28.1 Affiliation: University of California, San Diego, La Jolla, CA

**Paper 28.8 Authors:** Maged ElAnsary, Jianxiong Xu, José Sales Filho, Gairik Dutta, Liam Long, Aly Shoukry, Camilo Tejeiro, Chenxi Tang, Enver Kilinc, Jaimin Joshi, Parisa Sabetian, Samantha Unger, José Zariffa, Paul Yoo, Roman Genov

Paper 28.8 Affiliation: University of Toronto, Toronto, ON, Canada

Subcommittee Chair: Chris Van Hoof, IMEC, Leuven, Belgium

#### CONTEXT AND STATE OF THE ART

- Low-power, low-area, and high-dynamic range analog front-ends (AFEs) enable biopotential measurements such as heart rate, heart rate variability, oxygen saturation (SpO2), blood pressure, respiration, and ECG recording.
- Application issues such as large common-mode signals created by motion artifacts, high impedances presented by dry electrodes, and tissue heating in immersive situations, are examples of challenges that limit the ability to provide robust, reliable data over the duration of the measurement.
- Innovations in biomedical system design, ADC conversion techniques, calibration/cancellation of common-mode signals, and circuit reuse, are advancing the state of the art in biomedical systems.

#### TECHNICAL HIGHLIGHTS

- The University of California, San Diego presents a calibration-free sensor interface front-end for wearable applications and demonstrates in-vivo recordings of ECG and EOG in the presence of large motion artifacts.
  - The state-of-the-art VCO-based ADC achieves 95dB dynamic range and 128dB linearity in 1.68µW, with a total chip area of 0.055mm<sup>2</sup> in 65nm CMOS.
- The University of Toronto introduces an SoC for peripheral nerve stimulation and recording, demonstrating nerve stimulation and fascicle selectivity in vivo.
  - This 64-channel neural-recording system includes cuff-imbalance compensation, and a per-channel ΔΣ ADC consuming 140nW in 0.01mm<sup>2</sup> in 130nm CMOS.

- Advanced circuit techniques for artifact-tolerant, low-power, and robust sensor interfaces are enabling new non-invasive and immersive biomedical measurements.
- Small area and low power consumption, while preserving dynamic range and linearity performance, enable new modalities such as in-capsule endoscopy, high-selectivity peripheral nerve stimulation, and in-ear and on-chest PPG measurement.

# Session 29 Overview: Digital Circuits for Computing, Clocking, and Power Management

#### **Digital Circuits Subcommittee**

Session Chair: Ping-Hsuan Hsieh, National Tsing Hua University, Hsinchu, Taiwan

Session Co-Chair: Mingoo Seok, Columbia University, New York City, NY

Session Moderator: Keith Bowman, Qualcomm, Raleigh, NC

Subcommittee Chair: Keith Bowman, Qualcomm, Raleigh, NC

In this session, eight papers highlight advancement in digital circuits for computing, clocking and power management. For computing, two papers demonstrate: 1) RRAM-based in-memory computing for convolutional deep neural networks, and 2) a dynamic-precision bitserial spatial accelerator for solving differential equations. For clocking, three papers demonstrate: 1) a fast-lock wide-range clock generator with asynchronous adaptive-droop mitigation, 2) a fractional-N MDLL with a background DTC calibration, and 3) a fractional output divider with replica-DTC-free background calibration. The last three papers focus on digital power management, demonstrating: 1) a distributed digital low-dropout (DLDO) voltage regulator with a high power density and a wide load current range in 5nm FinFET technology; 2) a single-inductor 4-output power converter with dynamic droop allocation and adaptive clocking, and 3) an ultra-low-power SoC with integrated dynamic voltage stacking.

- In Paper 29.1, Georgia Institute of Technology and TSMC present a hybrid compute-in-memory/digital, 64Kb 0.437mm<sup>2</sup> RRAM macro in 40nm CMOS with active-feedback-based read and in-situ write verification. The macro design highlights how digital circuit techniques can alleviate technology challenges. The in-situ write verification reduces the resistance variation of an RRAM array to 1/3, and in compute-in-memory applications the macro has an average (peak) energy efficiency of 4.15 (56.67)TOPS/W at 100MHz.
- In Paper 29.2, Nanyang Technological University presents a graph accelerator in 65nm CMOS for solving partial differential equations using the finite difference method (FDM). Energy efficiency is improved using bit-serial communication and a residue-based FDM, while performance is improved using a checkerboard update method to maximize parallelism. The graph accelerator integrates 21×21 PEs in 0.462mm<sup>2</sup> and consumes 1.59nJ per iteration at 16b precision, 1V, and 25.6MHz.
- In Paper 29.3, Intel presents an 80ns fast-lock 0.4-to-6.5GHz 0.01mm<sup>2</sup> clock generator in 10nm CMOS with self-referenced, asynchronous adaptive droop mitigation for uninterrupted, overshoot-free clocks for DVFS. When di/dt constraints prevail, measurements show gradual frequency transitions up to 650MHz per 100ns, and when unconstrained, within 80ns with <1% exit frequency error.
- In Paper 29.4, the University of Southern California presents a 0.18mm<sup>2</sup> fractional-N MDLL in 65nm CMOS with a background DTC calibration. The DTC gain and offset errors are estimated and corrected in the analog and digital domains, respectively. TDC dithering and adaptive comb-filter-assisted dither cancellation are used to further enhance calibration accuracy. Measurement results show >25dB improvement that results in -60dBc fractional spur and 1.67ps RMS jitter.
- In Paper 29.5, National Taiwan University presents a fractional output divider in 90nm CMOS with replica-DTC-free background calibration. The authors demonstrate a 0.625-to-200MHz divider frequency range with 120fs RMS jitter, occupying 0.008mm<sup>2</sup> area and consuming 1.5mW of power. The frequency switching time is less than 100ns and the spurs are below -65dBc.
- In Paper 29.6, Samsung Electronics introduces 16 distributed digital LDOs with a global controller in 5nm FinFET CMOS. The proposed time-multiplexing calibration loop achieves a 46% reduction in output mismatch and maintains 20mV droop under a 1A load step with peak current efficiency of 99.89%. The maximum load current is 6.4A in 0.16mm<sup>2</sup>, resulting in a current density of 40A/mm<sup>2</sup>.

- In Paper 29.7, the University of Washington presents a combined implementation of dynamic droop allocation and adaptive clocking for a single-inductor 4-output SoC in 65nm CMOS to improve performance and system efficiency, with 0.25mm<sup>2</sup> on-chip area. Measurements demonstrate a 97% average margin reduction over the baseline design and a cycle-loss reduction from 271 to 77 as compared to the adaptive clock scheme.
- In Paper 29.8, Nanjing Low Power IC Technology Institute presents a dynamic voltage-stacking scheme to reduce the sleep current of a 2.38mm<sup>2</sup> IoT microcontroller in 40nm CMOS. Measurements demonstrate a 115nA sleep-state leakage at 3V, a 32% reduction compared to a conventional flat architecture, and a ULPMark-CP score of 1205.
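
The current density quoted for Paper 29.6 follows directly from the maximum load current and the area, as the short check below shows. The per-LDO figure assumes the load is shared equally across the 16 distributed LDOs, which is an illustrative simplification and not a claim from the paper.

```python
def current_density_a_per_mm2(max_load_current_a, area_mm2):
    """Load current delivered per unit area."""
    return max_load_current_a / area_mm2

def per_ldo_current_a(max_load_current_a, num_ldos):
    """Current per LDO, assuming the load is shared equally (hypothetical)."""
    return max_load_current_a / num_ldos

# Figures quoted for Paper 29.6: 6.4A maximum load in 0.16mm^2, 16 LDOs.
print(f"{current_density_a_per_mm2(6.4, 0.16):.1f} A/mm^2")
print(f"{per_ldo_current_a(6.4, 16):.2f} A per LDO if shared equally")
```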

# Session 29 Highlights: Digital Circuits for Computing, Clocking, and Power Management

# [29.1] A 40nm 64Kb 56.67TOPS/W Read-Disturb-Tolerant Compute-in-Memory/Digital RRAM Macro with Active-Feedback-Based Read and In-Situ Write Verification

Paper 29.1 Authors: Jong-Hyeok Yoon<sup>1</sup>, Muya Chang<sup>1</sup>, Win-San Khwa<sup>2</sup>, Yu-Der Chih<sup>3</sup>, Meng-Fan Chang<sup>2</sup>, Arijit Raychowdhury<sup>1</sup>

**Paper 29.1 Affiliation:** <sup>1</sup>Georgia Institute of Technology, Atlanta, GA, <sup>2</sup>TSMC Corporate Research, Hsinchu, Taiwan, <sup>3</sup>TSMC Design Technology, Hsinchu, Taiwan

Subcommittee Chair: Keith Bowman, Qualcomm, Raleigh, NC

#### CONTEXT AND STATE OF THE ART

- As memory-centric workloads such as artificial intelligence continue to gain momentum, technology solutions that provide higher on-die memory capacity/bandwidth can provide scalability beyond SRAM.
- Resistive RAM (RRAM) is one of the strong candidates, but it suffers from key technological challenges such as high-resistance-state (HRS) to low-resistance-state (LRS) variation, HRS/LRS resistance distribution, and temperature-induced drift.

#### TECHNICAL HIGHLIGHTS

- Georgia Institute of Technology, jointly with Taiwan Semiconductor Manufacturing Company, presents a 40nm 64Kb 56.67TOPS/W read-disturb-tolerant compute-in-memory/digital RRAM macro with active-feedback-based read and in-situ write verification.
  - Voltage-mode sensing with RRAM-cell paired with a current source can keep a constant sampling/sensing margin, addressing the challenge related to the high-resistance-state (HRS) to low-resistance-state (LRS) variation of RRAM bitcells.
  - An in-situ and low-overhead write verification technique can reduce a wide LRS/HRS resistance distribution, improving data resolution.
  - An in-situ read-disturb monitor manages the health of each RRAM cell to limit temperature-induced drift.
  - The 64Kb RRAM macro is the first voltage-mode sensing RRAM macro overcoming a low HRS/LRS ratio within a high-endurance RRAM array. In the compute-in-memory mode, it achieves energy efficiency of 56.67TOPS/W.
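
An efficiency expressed in TOPS/W is the reciprocal of the energy spent per operation, so the quoted figures can be restated as energy per operation. The sketch below does that conversion for the peak and average values given in the session overview. The function name is illustrative, and how an "operation" is counted is defined by the paper, not here.

```python
def femtojoules_per_op(tops_per_watt):
    """Energy per operation in fJ: 1 / (TOPS/W) pJ = 1000 / (TOPS/W) fJ."""
    return 1000.0 / tops_per_watt

print(f"peak:    {femtojoules_per_op(56.67):.2f} fJ/op")   # 56.67 TOPS/W
print(f"average: {femtojoules_per_op(4.15):.1f} fJ/op")    # 4.15 TOPS/W
```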

#### APPLICATIONS AND ECONOMIC IMPACT

- RRAM macros can be a critical building block for future digital processors for emerging memory-intensive workloads, such as artificial intelligence.


# [29.3] 80ns Fast-Lock 0.4-to-6.5GHz Clock Generator with Self-Referenced, Asynchronous Adaptive Droop Mitigation

Paper 29.3 Authors: P. Mosalikanti, Q. S. Wang, K.-Y. J. Shen, M. Neidengard, S. F. S. Farooq, V. Grossnickle, N. Kurd

Paper 29.3 Affiliation: Intel Corporation, Portland, OR

Subcommittee Chair: Keith Bowman, Qualcomm, Raleigh, NC

#### CONTEXT AND STATE OF THE ART

- Uninterrupted, overshoot-free clocks for dynamic voltage and frequency scaling (DVFS) are needed for high-performance SoCs.
- Dynamic clock slowdown is the remedy for supply droops in compute subsystems, which can cause timing failures due to uneven slowdown between clock and data paths. Implementations that use a fixed PLL followed by a droop-actuated DLL are costly in area and power.

#### TECHNICAL HIGHLIGHTS

- Intel presents an 80ns fast-lock 0.4-to-6.5GHz clock generator with self-referenced, asynchronous adaptive droop mitigation in 10nm CMOS achieving uninterrupted, overshoot-free clocks for DVFS.
  - Adaptive frequency synthesis (AFS) helps mitigate guardbands induced by computation-dependent supply droop. Self-referenced droop detection with asynchronous operation reduces latency and allows for a larger F<sub>max</sub>/V<sub>min</sub> guardband reduction.
  - A 0.01mm<sup>2</sup> clock generator accomplishes all the integer-N frequency transitions within single-digit reference clock cycles between 0.4GHz and 6.5GHz without any frequency overshoot. Measurements further show gradual frequency transitions up to 650MHz per 100ns, where di/dt constraints prevail; and when unconstrained, within 80ns with <1% exit frequency error.

#### APPLICATIONS AND ECONOMIC IMPACT

- Frequency-locked loops with self-referenced asynchronous adaptive droop mitigation provide a compact solution to DVFS with uninterrupted, overshoot-free clocks.
- Self-referenced droop detection with asynchronous operation reduces latency and achieves larger F<sub>max</sub>/V<sub>min</sub> guardband reduction.
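To put the quoted ramp rate in perspective, the sketch below estimates how long a di/dt-constrained transition across the full 0.4-to-6.5GHz range would take. It assumes the maximum quoted rate of 650MHz per 100ns is held constant over the whole ramp, which the source does not state; the function name is hypothetical.

```python
def ramp_time_ns(f_start_ghz: float, f_stop_ghz: float,
                 slew_mhz_per_100ns: float = 650.0) -> float:
    """Time in ns to move between two frequencies at a constant slew rate.

    Assumption: the slew rate stays at its maximum for the entire transition.
    """
    span_mhz = abs(f_stop_ghz - f_start_ghz) * 1000.0
    return span_mhz / slew_mhz_per_100ns * 100.0


if __name__ == "__main__":
    # Full-range transition using the figures quoted for Paper 29.3.
    print(f"{ramp_time_ns(0.4, 6.5):.0f} ns")
```

Under that assumption a full-range gradual transition takes a little under 1µs, compared with the 80ns reported for the unconstrained case.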

### **Session 30 Overview: Non-Volatile Memory**

#### **Memory Subcommittee**

Session Chair: Yasuhiko Taito, Renesas Electronics, Kodaira, Japan

Session Co-Chair: Violante Moschiano, Micron Semiconductor, Avezzano, Italy

Session Moderator: Shinichiro Shiratake, Kioxia, Yokohama, Japan

3D NAND flash memory continues to increase in bit density and performance for both local and cloud data storage applications. The number of WL layers increases to more than 170, up from the 96 to 128 layers presented previously at ISSCC. A floorplanning technique used to put page buffer circuits into a small area under a highly-stacked memory array is shown in Paper 30.1. Papers 30.2 and 30.4 present independent multi-plane read techniques to improve random read performance. Papers 30.3 and 30.4 reveal high-speed 2.0Gbps interfaces.

- In Paper 30.1, SK hynix presents a 176-stacked 512Gb 3b/cell 3D NAND-flash memory that realizes an 11.0Gb/mm<sup>2</sup> bit density via an optimized floorplan and a high-efficiency charge pump, in addition to using a peripheral-circuit-under-cell-array architecture. This design achieves a 168MB/s program throughput and a 50µs read time.
- In Paper 30.2, Intel shows a 144-tier 1Tb 4b/cell 3D NAND-flash memory with a 13.8Gb/mm<sup>2</sup> bit density via a CMOS-under-array technique, achieving a 40MB/s program throughput and an 85µs read time. Independent multi-plane reads, which double random read performance, and a block-by-deck technique to reduce block size are also implemented.
- In Paper 30.3, Samsung presents a 512Gb 3b/cell 3D NAND-flash memory featuring the 7<sup>th</sup> generation of cell-over-peri (COP) 3D NAND technology, achieving a 184MB/s program throughput and a 40µs read time. Low-tapped termination-type circuits that support a 2.0Gbps interface are introduced.
- In Paper 30.4, KIOXIA describes a 170-stacked 1Tb 3b/cell 3D-flash memory with a 10.4Gb/mm<sup>2</sup> bit density. An asynchronous and independent plane read is introduced to increase random access performance. An enhanced read scheme and an IO duty-cycle correction technique are introduced to achieve a 50µs read time and a 2Gbps IO throughput.
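The capacity and bit-density figures above imply an approximate die area for each part. The following sketch does that division; it assumes 1Tb = 1024Gb and that bit density is quoted over the whole die, neither of which the source spells out, so the results are indicative only.

```python
# (capacity in Gb, bit density in Gb/mm^2), taken from the overview above.
# Assumption: 1Tb is treated as 1024Gb.
PARTS = {
    "30.1 SK hynix 512Gb TLC": (512, 11.0),
    "30.2 Intel 1Tb QLC": (1024, 13.8),
    "30.4 KIOXIA 1Tb TLC": (1024, 10.4),
}


def implied_area_mm2(capacity_gb: float, density_gb_per_mm2: float) -> float:
    """Die area implied by capacity divided by bit density."""
    return capacity_gb / density_gb_per_mm2


if __name__ == "__main__":
    for name, (capacity, density) in PARTS.items():
        print(f"{name}: about {implied_area_mm2(capacity, density):.1f} mm^2")
```

Paper 30.3 is left out because the overview does not quote a bit density for it.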

### Session 30 Highlights: Non-Volatile Memory

## [30.1] A 176-stacked 512Gb 3b/Cell 3D NAND Flash with 11.0Gb/mm<sup>2</sup> Density Using Peripheral Circuit under Cell Array

## [30.2] A 1Tb 4b/Cell, 144-Tier, 3D NAND Flash Memory with 40 MB/s Program Throughput and 13.8 Gb/mm<sup>2</sup> Bit Density

**Paper 30.1 Authors**: Jae-Woo Park, Doogon Kim, Sunghwa Ok, Jaebeom Park, Taeheui Kwon, Hyunsoo Lee, Sungmook Lim, Sun-Young Jung, Hyeongjin Choi, Taikyu Kang, Gwan Park, Chul-Woo Yang, Jeong-Gil Choi, Gwihan Ko, Jaehyeon Shin, Ingon Yang, Junghoon Nam, Hyeokchan Sohn, Seok-In Hong, Yohan Jeong, Sung-Wook Choi, Changwn Choi, Junyoun Lim, Dongkyu Youn, Sanghyuk Nam, Juyeab Lee, Myungkyu Ahn, Hoseok Lee, Seungpil Lee, Jongmin Park, Kichang Kwean, Woopyo Jeong, Jungdal Choi, Jinkook Kim, Kyo-Won Jin

Paper 30.1 Affiliation: SK hynix Semiconductor, Icheon-si, Korea

**Paper 30.2 Authors:** Ali Khakifirooz<sup>1</sup>, Sriram Balasubrahmanyam<sup>2</sup>, Richard Fastow<sup>1</sup>, Kristopher H. Gaewsky<sup>2</sup>, Chang Wan Ha<sup>1</sup>, Rezaul Haque<sup>2</sup>, Owen W. Jungroth<sup>2</sup>, Steven Law<sup>1</sup>, Aliasgar S. Madraswala<sup>2</sup>, Binh Ngo<sup>2</sup>, Naveen V. Prabhu<sup>2</sup>, Shantanu Rajwade<sup>1</sup>, Karthikeyan Ramamurthi<sup>2</sup>, Rohit S. Shenoy<sup>1</sup>, Jacqueline Snyder<sup>2</sup>, Cindy Sun<sup>1</sup>, Deepak Thimmegowda<sup>1</sup>, Bharat M. Pathak<sup>2</sup>, Pranav Kalavade<sup>1</sup>

Paper 30.2 Affiliation: <sup>1</sup>Intel, Santa Clara, CA, <sup>2</sup>Intel, Folsom, CA

Subcommittee Chair: Jonathan Chang, TSMC, Hsinchu, Taiwan

#### CONTEXT AND STATE OF THE ART

- The number of WL-layers continues to increase to realize higher bit density in 3D NAND-flash memories. CMOS peripheral circuits are placed under the memory array to increase array efficiency.
- Circuit techniques to keep, or even improve, read and write performance despite the increased number of WL-layers are introduced for TLC and QLC 3D NAND flash memories.

#### TECHNICAL HIGHLIGHTS

- SK hynix presents a 512Gb 3b/Cell 3D NAND-flash memory with 176 WL layers.
  - 11.0Gb/mm<sup>2</sup> bit density via 176 WL layers and a compact page buffer under the cell array.
  - A central-WL-driving architecture and a high-efficiency charge pump to cope with the increased WL parasitic load.
- Intel presents a 1Tb 4b/Cell 3D NAND flash memory with 144 WL layers.
  - 13.8Gb/mm<sup>2</sup> bit density for a QLC NAND flash memory with 144 WL layers and CMOS under the cell array.
  - A 1630µs program time is achieved by an improved programming algorithm. A new independent multi-plane read operation doubles the random read performance to 22.5kIOPS.

#### APPLICATIONS AND ECONOMIC IMPACT

- Maximum chip density of NAND-flash memory has been around 1Tb to 1.33Tb over the last four years. However, bit density has been steadily increasing through 3D stacking technology to more than 10Gb/mm<sup>2</sup>, even for TLC NAND-flash memory.
- Performance of both TLC and QLC NAND-flash memories continues to improve, despite physical scaling, to fulfill increasing performance demands from data centers, mobile devices and other storage applications.

#### **Analog Subcommittee**

Session Chair: Marco Berkhout, Goodix Technology, The Netherlands

Session Co-Chair: Drew A. Hall, University of California, San Diego, CA

Session Moderator: Jiawei Xu, Fudan University, China

Analog techniques continue to push the frontiers of precision, power, and performance. This year's edition features two amplifiers and two accurate, temperature insensitive frequency references. The first paper describes the first reported closed-loop, filter-less Class-D amplifier and achieves the best output efficiency with low-power operation. The second paper pushes the state-of-the-art with the highest energy efficiency and lowest area. The third paper describes another low-power, low-area frequency reference that requires only a single-point trim to cover the entire industrial temperature range. Lastly, the final paper achieves the lowest chopper-induced intermodulation distortion (IMD) of -125.9dB ever reported.

- In Paper 31.1, Analog Devices presents a digital-input, variable-frequency ΔΣ Class-D amplifier for wireless headphone applications. The amplifier achieves -93dB THD+N, 113dB SNR, and 82mW non-clipping output power with 93% efficiency.
- In Paper 31.2, Yonsei University reports a 0.9V 28MHz frequency reference with 5pJ/cycle and ±200ppm inaccuracy from -40°C to 85°C. By using dual RC polyphase filters and digital Φ-ΔΣM converters, it occupies only 0.06mm<sup>2</sup>.
- In Paper 31.3, Delft University of Technology describes a 16MHz RC-based frequency reference that achieves an inaccuracy of ±400ppm over the industrial temperature range with a single room-temperature (RT) trim. The chip draws 88µA from a 1.8V supply and occupies 0.14mm<sup>2</sup>.
- In Paper 31.4, Delft University of Technology presents a chopper-stabilized amplifier that uses a novel fill-in technique to mitigate spikes caused by non-zero amplifier delay. The amplifier achieves -107dB IMD and 28dB suppression of chopper-induced IMD for input frequencies near 4× the chopping frequency.
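The two frequency references are quoted in different units (energy per cycle for Paper 31.2, supply current for Paper 31.3). The sketch below converts between them using only P = V·I and E = P/f, so the two designs can be read side by side; the helper names are hypothetical.

```python
def energy_per_cycle_pj(current_a: float, supply_v: float, freq_hz: float) -> float:
    """Energy per output cycle in pJ, from supply current, voltage and frequency."""
    power_w = current_a * supply_v
    return power_w / freq_hz * 1e12


def power_uw(energy_pj_per_cycle: float, freq_hz: float) -> float:
    """Average power in µW, from energy per cycle and output frequency."""
    return energy_pj_per_cycle * 1e-12 * freq_hz * 1e6


if __name__ == "__main__":
    # Paper 31.3: 88µA from 1.8V at 16MHz.
    print(f"31.3: {energy_per_cycle_pj(88e-6, 1.8, 16e6):.1f} pJ/cycle")
    # Paper 31.2: 5pJ/cycle at 28MHz.
    print(f"31.2: {power_uw(5.0, 28e6):.0f} µW")
```

With the figures quoted above, Paper 31.3 works out to about 9.9pJ/cycle (158.4µW) and Paper 31.2 to about 140µW.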

### Session 31 Highlights: ΔΣ Class-D Headphone Amplifier

#### [31.1] An 82mW ΔΣ-Based Filter-Less Class-D Headphone Amplifier with -93dB THD+N, 113dB SNR and 93% Efficiency

Paper Authors: A. Matamura<sup>1</sup>, N. Nishimura<sup>1</sup>, P. Birdsong<sup>2</sup>, A. Bandyopadhyay<sup>2</sup>, A. Spirer<sup>2</sup>, M. Markova<sup>2</sup>, S. Liu<sup>2</sup>

Paper Affiliation: Analog Devices, <sup>1</sup>Tokyo, Japan, <sup>2</sup>Wilmington, MA

Subcommittee Chair: Kofi A.A. Makinwa, Delft University of Technology, Delft, The Netherlands, Analog Subcommittee

#### CONTEXT AND STATE OF THE ART

- True Wireless Stereo/True Wireless Active-Noise-Canceling (ANC) headphones require low-latency digital-input headphone drivers that consume the lowest possible power to maximize battery life while providing high-fidelity audio playback.
- Typical headphone drivers use power-inefficient Class-A/AB topologies, or Class-G/H drivers with a ground-center operation to improve power efficiency at the expense of needing external components to decouple the required extra supply rails.

#### **TECHNICAL HIGHLIGHTS**

- A ΔΣ variable-frequency operation maintains audio performance at full-scale output signals while minimizing switching at low signal levels to minimize quiescent power.
  - Low supply voltage operation and high performance are achieved using a 5<sup>th</sup>-order ΔΣ digital-input filter-less Class-D architecture with digital feedforward techniques.
  - A low-power 1<sup>st</sup>-order dynamic element matching (DEM) scheme mitigates DAC mismatch.

#### APPLICATIONS AND ECONOMIC IMPACT

- The high-fidelity audio amplifier is suitable for battery-powered True Wireless hearables.
- High efficiency extends battery life and enhances user experience.

### **Session 32 Overview: Frequency Synthesizers**

#### **RF Subcommittee**

Session Chair: Wei Deng, Tsinghua University, Beijing, China

Session Co-Chair: Jaehyouk Choi, KAIST, Daejeon, Korea

Subcommittee Chair: Jan Craninckx, imec, Leuven, Belgium, RF Subcommittee

Session Moderator: Wanghua Wu, Samsung, San Jose, CA

This session presents the latest advances in digital and analog phase-locked loops (PLLs) and FMCW chirp generators from 2.4 to 60GHz for high-performance wireless applications.

- In Paper 32.1, KAIST presents a low-jitter and low-spur fractional-N ring-DPLL that performs a DTC second/third-order nonlinearity cancelation and employs a probability-density-shaping ΔΣM. At near-integer frequencies (worst case), the rms jitter and fractional spurs are less than 365fs and -63dBc, respectively.
- In Paper 32.2, Samsung Semiconductor demonstrates a 6GHz 14nm analog sampling fractional-N PLL with a DTC range reduction technique. It achieves an rms jitter of 80fs (integrated from 10kHz to 40MHz) and -72.4dB fractional spur at 14.2mW power consumption. The PLL also supports a low-power mode with 91.5fs<sub>rms</sub> jitter and 8.2mW power consumption.
- In Paper 32.3, Politecnico di Milano presents a fractional-N bang-bang digital PLL that overcomes the typical quantization-noise limit of bang-bang phase detectors by applying an adaptively calibrated noise-shaping technique to the single-bit phase detector. Consuming 10.8mW, the PLL achieves a 107.6fs integrated jitter, which is on par with state-of-the-art analog PLLs.
- In Paper 32.4, KAIST presents a 14-to-16GHz low-jitter fractional-N SSPLL that cancels the quantization error (Q-error) in the voltage domain. The measured rms jitter and fractional spur were 104fs and -61dBc near 15GHz.
- In Paper 32.5, Peking University introduces a 24GHz self-calibrated ADPLL-based FMCW synthesizer. The synthesizer uses a self-calibration scheme with adaptive overlap compensation and continuous ramp tracking. The synthesizer achieves a 3.2GHz chirp bandwidth and 320MHz/ms slope.
- In Paper 32.6, Intel presents a 12.1-to-16.6GHz sub-sampling ADPLL that is based on a stochastic flash TDC and a coupled dual-core DCO. It is implemented in 16nm FinFET CMOS and achieves 47.3fs<sub>rms</sub> jitter performance at 56mW power dissipation.
- In Paper 32.7, Tokyo Institute of Technology demonstrates a 32kHz-reference 2.4GHz fractional-N oversampling PLL, which captures amplitude/phase information of the reference and realizes 256 phase detections per reference cycle. It realizes 200kHz loop bandwidth, 5.4ps<sub>rms</sub> jitter, and -78dBc reference spur in the fractional mode with 4.97mW power consumption.
- In Paper 32.8, Politecnico di Milano reports a 12.9-to-15.1GHz digital bang-bang PLL-based LO phase-shifting system. Implemented in 28nm CMOS, it achieves 98.4fs<sub>rms</sub> jitter at 10.8mW power and 0.6° rms accuracy.
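Jitter and power are often combined into a single figure-of-merit when comparing PLLs. The sketch below uses the widely used jitter-power form, FoM = 10·log10((σ/1s)²·(P/1mW)), where a more negative value is better. Whether the papers in this session use exactly this definition is an assumption; the jitter and power inputs are the ones quoted in the list above.

```python
import math


def jitter_fom_db(jitter_rms_s: float, power_w: float) -> float:
    """Jitter-power FoM in dB: 10*log10((jitter / 1 s)^2 * (power / 1 mW)).

    Assumption: this common definition is the one meant; the source only
    mentions a figure-of-merit without defining it.
    """
    return 10.0 * math.log10((jitter_rms_s / 1.0) ** 2 * (power_w / 1e-3))


# (rms jitter in seconds, power in watts) as quoted in the session overview.
QUOTED = {
    "32.2 Samsung": (80e-15, 14.2e-3),
    "32.2 Samsung, low-power mode": (91.5e-15, 8.2e-3),
    "32.6 Intel": (47.3e-15, 56e-3),
    "32.8 Politecnico di Milano": (98.4e-15, 10.8e-3),
}

if __name__ == "__main__":
    for name, (jitter, power) in QUOTED.items():
        print(f"{name}: {jitter_fom_db(jitter, power):.1f} dB")
```

The comparison is only indicative, since the quoted jitter values may be integrated over different bandwidths and measured at different output frequencies.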

### **Session 32 Highlights: Frequency Synthesizers**

[32.1] A 365fs<sub>rms</sub>-Jitter and -63dBc-Fractional Spur 5.3GHz-Ring-DCO-Based Fractional-N DPLL Using a DTC Second/Third-Order Nonlinearity Cancelation and a Probability-Density-Shaping ΔΣM

[32.2] A 14nm Analog Sampling Fractional-N PLL with a Digital-to-Time Converter Range-Reduction Technique Achieving 80fs Integrated Jitter and 93fs at Near-Integer Channels

Paper 32.1 Authors: H. Park\*<sup>1</sup>, C. Hwang\*<sup>1</sup>, T. Seong\*<sup>1,2</sup>, Y. Lee<sup>3</sup>, J. Choi<sup>1</sup>

Paper 32.1 Affiliation: <sup>1</sup>KAIST, Daejeon, Korea, <sup>2</sup>Ulsan National Institute of Science and Technology, Ulsan, Korea, <sup>3</sup>Samsung Electronics, Hwaseong, Korea

Paper 32.2 Authors: W. Wu<sup>1</sup>, C.-W. Yao<sup>1</sup>, C. Guo<sup>1</sup>, P.-Y. Chiang<sup>1</sup>, P.-K. Lau<sup>1</sup>, L. C.<sup>1</sup>, S. W. Son<sup>1</sup>, T. Cho<sup>2</sup>

Paper 32.2 Affiliation: <sup>1</sup>Samsung Semiconductor, San Jose, CA, <sup>2</sup>Samsung Electronics, Hwaseong, Korea

Subcommittee Chair: Jan Craninckx, imec, Leuven, Belgium, RF Subcommittee

#### CONTEXT AND STATE OF THE ART

- Ultra-low jitter performance of local oscillators for 5G wireless transceivers is required to enable spectrally efficient high-order modulation schemes.
- At the same time, an increase in the required number of PLLs for 5G wireless transceivers demands effort to reduce the power consumption and the silicon area.

#### TECHNICAL HIGHLIGHTS

- A 5.3GHz-ring-DCO-based fractional-N DPLL performs a DTC second/third-order nonlinearity cancelation
  - A 5-to-6GHz ring-oscillator-based fractional-N digital PLL performs a DTC second/third-order nonlinearity cancelation technique and employs a probability-density-shaping ΔΣM. It achieved the lowest rms jitter (i.e., 365fs) and the lowest worst-case fractional spur (i.e., -63dBc) among the state-of-the-art ring-oscillator-based fractional-N PLLs.
- A 14nm analog sampling fractional-N PLL employs a digital-to-time converter range reduction technique
  - A 6GHz fractional-N sampling PLL with a DTC-range-reduction technique is described. The PLL achieves an rms jitter of 80fs (integrated from 10kHz to 40MHz) and -72.4dB fractional spur at 14.2mW power consumption. It achieves the highest figure-of-merit for fractional-N PLLs using a reference clock below 200MHz.

#### APPLICATIONS AND ECONOMIC IMPACT

- Phase noise (jitter), power consumption, and silicon area are essential design metrics that are traded off in the design of CMOS frequency synthesizers for modern wireless communication systems.
- These PLLs present dedicated solutions to overcome the foregoing trade-off in PLL designs.

# Session 33 Overview: High-Voltage, GaN and Wireless Power

#### **Power Management Subcommittee**

Session Chair: Min Chen, Analog Devices, Santa Clara, CA

Session Co-Chair: Bernhard Wicht, University of Hannover, Hannover, Germany

Session Moderator: Kousuke Miyaji, Shinshu University, Nagano, Japan

Power conversion is ubiquitous, and a variety of power-converter topologies have found their ways into both established and emerging applications. This session aims at showcasing some of the best work in the application-specific power conversion field, such as fully integrated power GaN for offline fast-charger applications, high-voltage buck-boost for automotive LED lighting applications, 48V-to-1V hybrid converter for data center applications, fully integrated isolated power for industrial applications, wireless power transfer for medical applications, and a hybrid switching supply modulator for wireless communications.

- In Paper 33.1, National Chiao Tung University and Realtek Semiconductor present a monolithically integrated GaN driver and GaN switch with a temperature-compensated fast turn-on controller for up to 50MHz switching frequency and 118.3V/ns slew rate. A 65W offline active clamp flyback converter using the proposed GaN IC achieves 95.4% peak efficiency.
- In Paper 33.2, Southeast University and Central Semiconductor Manufacturing describe a 600V GaN active gate driver with closed-loop dynamic feedback delay compensation in a 0.5µm 600V SOI process. The proposed chip achieves 22.5% turn-on energy saving with dv/dt levels similar to those of fixed current-drive control.
- In Paper 33.3, Analog Devices presents a 2MHz, 100V<sub>OUT</sub>-capable GaN-based buck-boost LED driver for automotive LED lighting applications. The chip achieves 16.8dBµV radiated EMI noise reduction by using a digitally assisted pseudo-randomized flicker-free frequency modulation and a bootstrap-charge balancing scheme.
- In Paper 33.4, Zhejiang University presents a 12-level series-capacitor 48V-to-1V hybrid converter with on-chip switches and one off-chip GaN FET. The converter achieves 998A/inch<sup>3</sup> power density and 90.2% peak efficiency by exploiting the superior switch performance of on-chip 5V transistors.
- In Paper 33.5, the University of Science and Technology of China, Xiamen University, and Chinese Academy of Sciences, present a transformer-in-package isolated DC-DC converter using glass-based fan-out wafer-level packaging. The proposed converter is embedded in a 5×5mm<sup>2</sup> package and achieves 46.5% peak efficiency and 1.25W maximum output power.
- In Paper 33.6, Iowa State University presents a 6.78MHz wireless power transfer system with fully integrated wireless hysteretic control for bioimplants. The system eliminates the need for off-chip components and achieves 20% light-load efficiency enhancement and instant transient response.
- In Paper 33.7, KAIST, Samsung Electronics, and New York University Abu Dhabi present a wireless power and data transfer system IC over a single frequency-split link for neural prostheses. The proposed system delivers 115mW power with 89.6% power transfer efficiency while simultaneously transmitting data with a rate of 2.5Mb/s.
- In Paper 33.8, Dartmouth College presents a switched-capacitor driver circuit for high-voltage electrostatic and piezoelectric micro-robotic actuators. The system achieves over 300V drive voltage and 400mW peak reactive power delivery with a 10× power reduction factor.
- In Paper 33.9, Samsung Electronics presents a hybrid switching supply modulator for envelope tracking power amplifiers. The proposed architecture, a combined structure of a linear amplifier and an interleaved hybrid buck-boost converter, achieves 130MHz channel bandwidth and 10W output power for 2G/3G/LTE/NR RF power amplifiers.

# Session 33 Highlights: High-Voltage, GaN and Wireless Power

#### [33.1] A Fully Integrated GaN-on-Silicon Gate Driver and GaN Switch with Temperature-Compensated Fast Turn-on Technique for Improving Reliability

Paper Authors: H.-Y. Chen<sup>1</sup>, K.-H. Chen<sup>1</sup>, Y.-H. Lin<sup>2</sup>, S.-R. Lin<sup>2</sup>, T.-Y. Tsai<sup>2</sup>

Paper Affiliation: <sup>1</sup>National Chiao Tung University, Hsinchu, Taiwan, <sup>2</sup>Realtek Semiconductor, Hsinchu, Taiwan

Subcommittee Chair: Yogesh Ramadass, Texas Instruments, Santa Clara, CA, Power Management Subcommittee

#### CONTEXT AND STATE OF THE ART

- Size reduction and efficiency improvements are key performance drivers for power conversion. GaN technology is enabling an increased switching frequency and a corresponding size reduction of power converters for a wide variety of applications.
- The integration of drivers and control circuitry on the same die as the GaN power transistor can yield both performance and cost improvements but is hampered by the lack of PMOS devices.

#### TECHNICAL HIGHLIGHTS

- National Chiao Tung University and Realtek Semiconductor present a monolithically integrated 650V GaN-on-Si gate driver and power switch with internal supporting circuits.
  - Threshold voltage tracking enables a switching frequency up to 50MHz, as well as a dV<sub>DS</sub>/dt slew rate of 118.3V/ns.
  - The supporting analog circuits have been fully integrated in GaN, including a temperature-compensated fast turn-on control, a slew-rate enhancement circuitry for the driver, and an on-chip regulator with integrated voltage reference.

#### APPLICATIONS AND ECONOMIC IMPACT

- 650V GaN-on-Si gate driver and switch, enabling small size and highly efficient power conversion for grid-connected loads.
- Fully integrated in a GaN technology, enabling high performance, cost reductions, and mass production.

### **Session 34 Overview: Emerging Imaging Solutions**

#### Imagers, Medical, MEMS and Displays Subcommittee

Session Chair: Kazuko Nishimura, Panasonic, Moriguchi, Japan

Session Co-Chair: Johan Vanderhaegen, Google, Mountain View, CA

Session Moderator: Matteo Perenzoni, Fondazione Bruno Kessler (FBK), Trento, Italy

New imaging solutions are needed for future diagnostics and assistive products. The first paper describes an ultrasound imager, followed by an imager and image processor for an augmented-reality contact lens, and a THz SoC for a light-field camera array. The final paper shows an ultrasound pulser that drastically reduces dynamic power loss caused by the parasitic capacitance of transducers.

- In Paper 34.1, Butterfly Network presents an ultrasound-on-chip for point-of-care ultrasound imagers. The chip integrates 8960 capacitive micromachined ultrasound transducers (CMUTs), analog transceivers, and digital processors.
- In Paper 34.2, Mojo Vision describes an imager and an image processor to be used in an augmented-reality contact lens for low-vision patients, with acuity range from 20/75 to 20/400 Snellen.
- In Paper 34.3, the University of Wuppertal introduces a 32×32 THz SoC in 130nm CMOS with a novel current-mode readout and broadband THz antenna design to support very large light-field camera arrays.
- In Paper 34.4, KAIST demonstrates an ultrasound pulser that reduces dynamic power (CV<sup>2</sup>f) by 73.1% for 820pF of transducer parasitic capacitance compared to the conventional class-D pulser, by replenishing the supplied energy from the magnetically stored energy in an inductor.
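The CV<sup>2</sup>f expression quoted for Paper 34.4 can be evaluated directly. In the sketch below, only the 820pF capacitance and the 73.1% reduction come from the text; the 30V swing and 1MHz pulsing frequency are hypothetical placeholders chosen to make the arithmetic concrete, not values from the paper.

```python
def dynamic_power_w(c_farad: float, v_swing: float, f_hz: float) -> float:
    """Dynamic power C * V^2 * f of a driver charging a capacitive load."""
    return c_farad * v_swing ** 2 * f_hz


def reduced_power_w(baseline_w: float, reduction_fraction: float = 0.731) -> float:
    """Power remaining after the quoted 73.1% reduction is applied."""
    return baseline_w * (1.0 - reduction_fraction)


if __name__ == "__main__":
    C_PARASITIC = 820e-12   # from the text
    V_SWING = 30.0          # hypothetical drive voltage
    F_PULSE = 1e6           # hypothetical continuous pulsing frequency
    baseline = dynamic_power_w(C_PARASITIC, V_SWING, F_PULSE)
    print(f"baseline: {baseline:.3f} W, reduced: {reduced_power_w(baseline):.3f} W")
```

Because the reduction is a fixed fraction of CV<sup>2</sup>f, the relative saving is the same for any voltage and frequency plugged in here.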

## **Session 34 Highlights: Emerging Imaging Solutions**

#### [34.1] An 8960-Element Ultrasound-on-Chip for Point-of-Care Ultrasound

#### [34.2] A 21pJ/frame/pixel Imager and 34pJ/frame/pixel Image Processor for a Low-Vision Augmented-Reality Smart Contact Lens

**Paper 34.1 Authors:** Nevada Sanchez<sup>1</sup>, Kailiang Chen<sup>1</sup>, Chao Chen<sup>1</sup>, Dan McMahill<sup>1</sup>, Sewook Hwang<sup>1</sup>, Joseph Lutsky<sup>1</sup>, Jungwook Yang<sup>1</sup>, Liewei Bao<sup>1</sup>, Leung Kin Chiu<sup>1</sup>, Graham Peyton<sup>1</sup>, Hamid Soleimani<sup>1</sup>, Bob Ryan<sup>1</sup>, J.R. Petrus<sup>1</sup>, Youn-Jae Kook<sup>1</sup>, Tyler S. Ralston<sup>2</sup>, Keith G. Fife<sup>2</sup>, Jonathan M. Rothberg<sup>2</sup>

Paper 34.1 Affiliation: <sup>1</sup>Butterfly Network, Guilford, CT, <sup>2</sup>4Catalyzer, Guilford, CT

Paper 34.2 Authors: Rituraj Singh, Stevo Bailey, Phillip Chang, Ashkan Olayei, Mohammad Hekmat, Renaldi Winoto

Paper 34.2 Affiliation: Mojo Vision, Saratoga, CA

Subcommittee Chair: Chris Van Hoof, IMEC, Leuven, Belgium

#### CONTEXT AND STATE OF THE ART

- New imaging solutions, such as an ultrasound imager, an imager for an augmented-reality contact lens, a THz light-field camera SoC, and an ultrasound pulser, are needed for future diagnostics and assistive products.

#### TECHNICAL HIGHLIGHTS

- Butterfly Network presents an ultrasound-on-chip for point-of-care ultrasound imagers. The chip integrates 8960 capacitive micromachined ultrasound transducers (CMUTs), analog transceivers, and digital processors.
  - This single-probe whole-body ultrasound imager built upon an ultrasound-on-chip (UoC) platform supports 13 key imaging indications, enabling the broadest spectrum of imaging capabilities for POCUS. It is the first design that achieves MEMS-CMOS integration, TX/RX beamforming, multi-level pulsing, and on-chip time-gain-compensation/digitization/digital-signal-processing in a single chip.
- Mojo Vision describes an imager and an image processor to be used in an augmented-reality contact lens for low-vision patients, with acuity range from 20/75 to 20/400 Snellen.
  - The imager inside the smart contact lens achieves state-of-the-art energy efficiency of 21pJ/frame/pixel. It utilizes passive column circuitry along with a single PGA and ADC to save layout area and achieve 61µW of core power consumption. The image processor achieves versatile image-processing operations such as edge detection, contrast enhancement and zoom, along with an energy efficiency of 34pJ/frame/pixel.

#### APPLICATIONS AND ECONOMIC IMPACT

- Point-of-care ultrasound (POCUS) is a whole-body diagnostic tool with the potential to significantly reduce the time between symptom onset and initiation of therapy.
- An augmented-reality smart contact lens is a strong candidate to improve vision for people suffering from non-correctable vision impairment such as macular degeneration, glaucoma, diabetic retinopathy, and retinitis pigmentosa.

## Session 35 Overview: Adaptive Digital Techniques for Variation Tolerant Systems

#### **Digital Circuits Subcommittee**

Session Chair: Arijit Raychowdhury, Georgia Institute of Technology, Atlanta, GA

Session Co-Chair: Mijung Noh, Samsung Electronics, Republic of Korea

Session Moderator: Keith Bowman, Qualcomm, Raleigh, NC

Subcommittee Chair: Keith Bowman, Qualcomm, Raleigh, NC

Processors continue to improve energy efficiency through an integration of on-die sensors, real-time adaptation and closed-loop software management. The three papers in this session exemplify techniques that demonstrate such improvements on large-scale processors. In commercial 7nm processors from MediaTek and Qualcomm, the authors demonstrate 13% power and 11% performance improvements, respectively. Dolphin Design and collaborators demonstrate an adaptive body-biasing technique in 22nm FDSOI technology realizing a 30% power reduction.

- In Paper 35.1, MediaTek introduces several sensor-assisted guard-band reduction techniques applied to a 7nm Octa-Core processor. The CPU operates at 2.8GHz with a 13% reduction in power in a fully integrated 5G multi-mode smartphone SoC that supports the sub-6GHz band with 2.6Gbps upload and 4.7Gbps download speeds.
- In Paper 35.2, Dolphin Design, in collaboration with CEA-Leti and GlobalFoundries, demonstrates a PVT-aware, adaptive and scalable back-biasing regulator in 22nm FDSOI technology. Statistical analysis is carried out on 316 dies, and the proposed technique brings up to 30% power reduction by decreasing the minimal power supply by 100mV, while maintaining the target operating frequency (50MHz).
- In Paper 35.3, Qualcomm presents a thread-level power management (TPM) technique in a 7nm Hexagon processor. TPM exploits low-power phases during thread execution to appropriately adjust the thread instruction issue to achieve a higher performance at a target power as compared to global throttling. Silicon measurements demonstrate an 11% higher performance for a multi-threaded machine-learning application.
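The link between a 100mV supply reduction and a power saving at fixed frequency, as reported for Paper 35.2, can be illustrated with the usual quadratic dependence of dynamic power on supply voltage. The sketch below assumes pure CV<sup>2</sup>f scaling and ignores leakage; the nominal supply values are hypothetical, since the source does not give the starting voltage.

```python
def dynamic_power_ratio(v_nominal: float, delta_v: float = 0.1) -> float:
    """Ratio of dynamic power after lowering the supply by delta_v volts.

    Assumptions: frequency and switched capacitance stay fixed, power scales
    as V^2, and leakage is ignored.
    """
    return ((v_nominal - delta_v) / v_nominal) ** 2


if __name__ == "__main__":
    for v_nom in (0.6, 0.8, 1.0):  # hypothetical nominal supplies in volts
        saving = 1.0 - dynamic_power_ratio(v_nom)
        print(f"{v_nom:.1f} V -> {v_nom - 0.1:.1f} V: {saving * 100:.1f}% lower dynamic power")
```

The same 100mV step saves proportionally more at lower nominal supplies, which is why supply-voltage guard-band reduction is especially attractive for low-voltage designs.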

# Session 35 Highlights: Adaptive Digital Techniques for Variation Tolerant Systems

# [35.3] Thread-Level Power Management for a Current and Temperature-Limiting System in a 7nm Hexagon™ Processor

Paper 35.3 Authors: Vijay Kiran Kalyanam<sup>1</sup>, Eric Mahurin<sup>1</sup>, Keith Bowman<sup>2</sup>, Suresh Venkumahanti<sup>1</sup>

Paper 35.3 Affiliation: <sup>1</sup>Qualcomm, Austin, TX, <sup>2</sup>Qualcomm, Raleigh, NC

Subcommittee Chair: Keith Bowman, Qualcomm, Raleigh, NC

#### CONTEXT AND STATE OF THE ART

- This design demonstrates thread-level power management in a 7nm commercial compute DSP (CDSP), which integrates a master VLIW scalar processor to interface with a slave vector coprocessor to enable high-performance energy-efficient computing for multimedia, voice, audio, vision, imaging and machine-learning (ML) applications.
- Prior current and temperature limiting systems globally reduce the clock frequency (F<sub>CLK</sub>) or the instruction-issue rate without considering individual thread power or priority.
- This design adapts the instruction-issue rate based on individual thread power consumption and priority for a current and temperature limiting system thereby improving performance at a target power compared to global throttling.

#### TECHNICAL HIGHLIGHTS

- Qualcomm introduces thread-level power management in a 7nm commercial Hexagon™ compute DSP
  - Measurements from the commercial processor demonstrate 5-to-8% performance gain for the low-power threads. For a critical multi-threaded machine-learning application, an 11% higher performance is recorded compared to global throttling.

#### APPLICATIONS AND ECONOMIC IMPACT

- Fine-grain thread-level power management enables higher performance at the same target power compared to global throttling.
- Such power-management schemes can reduce design pessimism and enable exciting new applications for the energy-efficient SoC and processor markets.

### **Session 36 Overview: Hardware Security**

#### **Digital Architectures and Systems Subcommittee**

Session Chair: Hirofumi Shinohara, Waseda University, Fukuoka, Japan

Session Co-Chair: Massimo Alioto, National University of Singapore, Singapore

Session Moderator: Ingrid Verbauwhede, KU Leuven, Leuven, Belgium

With the proliferation of electronics in intelligent and connected devices, the need for hardware security continues to grow. Security primitives require increasing levels of protection against physical manipulation and passive side-channel attacks. The first paper describes a unified in-memory TRNG/PUF, followed by techniques for power and EM side-channel attack resistance. A strong PUF based on an SPN network and hot-carrier injection is presented. The next two papers cover additional PUFs, which are respectively based on oscillator collapse and a self-checking and self-healing technique.

- In Paper 36.1, the National University of Singapore uses a single 16Kb SRAM for unified dynamic/static entropy generation, achieving 3.6Mbps TRNG throughput and 1.78-to-3.84% PUF BER in 28nm CMOS.
- In Paper 36.2, Purdue University, in collaboration with Intel, evaluates the resistance against power and EM side-channel attacks of an AES-256 implemented in 65nm CMOS combined with two different countermeasures: a digital signal attenuation circuit with a synthesizable current source and digital RO-bleed, and a time-varying transfer function, improving security by 25% over existing work.
- In Paper 36.3, Waseda University shows a modeling-attack-resilient strong PUF robust against 20M training CRPs, featuring less than 0.73% BER through in-cell hot-carrier injection burn-in in 130nm CMOS.
- In Paper 36.4, Pohang University of Science and Technology presents a hybrid PUF combining a process-mismatch amplifier in an oscillator-collapse topology fabricated in 40nm CMOS. The proposed scheme achieves a BER of 0.0019% across process corners.
- In Paper 36.5, Rice University presents a self-checking and healing technique to improve the reliability of PUF cells with >70% dark-bit prediction accuracy and BER of 3.34E-8 across -40-to-125°C and 0.7-to-1.4V supply in 65nm CMOS.

### **Session 36 Highlights: Hardware Security**

#### [36.1] Unified In-Memory Dynamic (TRNG) and Multi-Bit Static (PUF) Entropy Generation for Ubiquitous Hardware Security

#### [36.2] An EM/Power SCA Resilient AES-256 with Synthesizable Signature Attenuation Using Digital-Friendly Current Source and RO-Bleed-Based Integrated Local Feedback and Global Switched-Mode Control

Paper 36.1 Authors: Sachin Taneja, Viveka Konandur Rajanna, Massimo Alioto

Paper 36.1 Affiliation: National University of Singapore, Singapore, Singapore

Paper 36.2 Authors: Archisman Ghosh<sup>1</sup>, Debayan Das<sup>1</sup>, Josef Danial<sup>1</sup>, Vivek De<sup>2</sup>, Santosh Ghosh<sup>2</sup>, Shreyas Sen<sup>1</sup>

Paper 36.2 Affiliation: <sup>1</sup>Purdue University, West Lafayette, IN, <sup>2</sup>Intel Corp., Hillsboro, OR

Subcommittee Chair: Thomas Burd, AMD, Santa Clara, CA

#### CONTEXT AND STATE OF THE ART

- With the proliferation of electronics in mobile devices, cyber-physical systems, IoT and other devices, the need for hardware security only grows. At the same time, these devices need protection against physical manipulation, especially passive side-channel attacks.
- True random number generators (TRNG) and physically unclonable functions (PUF) are foundational root-of-trust hardware
  primitives in secure platforms. Energy-efficient TRNGs and stable low bit-error-rate PUFs are critical to providing high-entropy
  keys and secure IDs for cryptographic workloads.
- Innovations to mitigate process, voltage and temperature variations with low silicon area overhead are crucial to enable reliable hardware security assurance in low-cost devices.
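The bit error rate (BER) quoted for PUFs can be illustrated with a minimal sketch. It assumes BER is measured as the fraction of bits that differ between an enrolled reference response and later re-reads of the same cells; the function name and the sample data are hypothetical and are not taken from any of the papers.

```python
def bit_error_rate(reference, readouts):
    """Fraction of bits that differ from the enrolled reference across all re-reads."""
    total_bits = 0
    flipped_bits = 0
    for readout in readouts:
        if len(readout) != len(reference):
            raise ValueError("readout length must match the reference length")
        total_bits += len(reference)
        # Count positions where the re-read bit disagrees with the enrolled bit.
        flipped_bits += sum(r != b for r, b in zip(reference, readout))
    if total_bits == 0:
        raise ValueError("at least one non-empty readout is required")
    return flipped_bits / total_bits


# Hypothetical 8-bit enrolled response and three later re-reads of the same cells.
enrolled = [1, 0, 1, 1, 0, 0, 1, 0]
rereads = [
    [1, 0, 1, 1, 0, 0, 1, 0],  # no flips
    [1, 0, 1, 0, 0, 0, 1, 0],  # one flip
    [1, 0, 1, 1, 0, 0, 1, 0],  # no flips
]
print(f"BER = {bit_error_rate(enrolled, rereads):.4f}")  # 1 flip in 24 bits
```

In practice the re-reads would be taken across the voltage and temperature corners of interest, which is why the variation-mitigation techniques mentioned above matter for the reported BER.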

#### TECHNICAL HIGHLIGHTS

- The National University of Singapore introduces a unified in-memory TRNG and PUF for key generation in 28nm CMOS achieving 8.7×/1.3× lower area than prior art TRNG/PUF designs.
  - A 16Kb SRAM is used for dynamic entropy generation (TRNG) by digitization of accumulated jitter in leakage-driven bitline discharge and for static entropy generation (PUF) by digitization of bitcell read current, achieving 3.6Mbps TRNG throughput and 1.78-to-3.84% PUF BER.
- Purdue University, in collaboration with Intel, provides an in-depth evaluation of an AES256 implementation in 65nm CMOS with multiple countermeasures against power and EM side-channel attacks.
  - An AES256 design utilizing a digital signature attenuation circuit with a synthesizable current source and digital RO-based bleed, in combination with a time-varying transfer function, improves security by 25% over existing works.

- Hardware security has become a key aspect of system-on-chip design to avoid exploits of security vulnerabilities in compute and connected systems, protect data and preserve safety.
- Resilience to physical and machine-learning attacks has become a major driver in hardware-security applications, and is being
  pursued through the adoption of specific techniques that increase the effort for a successful attack by several orders of
  magnitude.
- The need for high-quality keys for data encryption and device authentication at low silicon area and low cost is driving innovation in true random number generators, as well as physically unclonable functions with low bit error rate.

# ISSCC 2021 TRENDS



#### PREAMBLE

The Trends that follow capture the context, highlights, and potential impact of the papers to be presented in each Session at ISSCC 2021 in February.

OBTAINING COPYRIGHT to ISSCC press material is EASY!

You may quote the Subcommittee Chair as the author of the text if authorship is required.

You are welcome to use this material, copyright- and royalty-free, with the following understanding:

- That you will maintain at least one reference to ISSCC 2021 in the body of your text, ideally retaining the date and location. For detail, see the FOOTNOTE below.
- That you will provide a courtesy PDF of your excerpted press piece and particulars of its placement to shahriar@ece.ubc.ca

#### FOOTNOTE

- From ISSCC's point of view, the phraseology included in the box below captures what we at ISSCC would like your readership to know about this, the 68th appearance of ISSCC, on February 13<sup>th</sup> to February 22<sup>nd</sup>, 2021.

This and other related topics will be discussed at length at ISSCC 2021, the foremost global forum for new developments in the integrated-circuit industry. ISSCC, the International Solid-State Circuits Conference, will be held on February 13 to February 22, 2021.


# HISTORICAL TRENDS IN TECHNICAL THEMES

### ANALOG SYSTEMS

ANALOG SUBCOMMITTEE – POWER MANAGEMENT SUBCOMMITTEE – DATA CONVERTERS SUBCOMMITTEE



### Analog – 2021 Trends

#### Subcommittee Chair: Kofi A. A. Makinwa, Delft University of Technology, Delft, The Netherlands

At ISSCC 2021, new analog circuit techniques have improved the performance of RC frequency references, sensor interfaces (temperature, airflow, humidity, capacitance and magnetic field) and amplifiers (Class-D, chopper, and high-slew rate). The humidity sensor achieves state-of-the-art performance with 0.0094%RH resolution while consuming only 1.5µW using an adaptive zoom CDC and inverter-based amplifiers. The chopper-stabilized amplifier achieves high linearity with -107dB intermodulation distortion (IMD) by using a novel fill-in technique to mitigate the chopping spikes caused by amplifier delay.

In the case of RC frequency references, their temperature stability continues to dramatically improve over time, as shown in Figure 1. Critically, these improvements have not been at the expense of energy efficiency, as indicated in Figure 2. Both designs achieve this high level of performance by digitally combining the characteristics of two different types of resistors. One of these designs only requires a single room-temperature trim to achieve an inaccuracy of ±400ppm from -45°C to 85°C.



Figure 1: Trends in stability over time for non-crystal oscillators.



Figure 2: Trends in stability versus energy efficiency of non-crystal oscillators.

### Power Management – 2021 Trends

#### Subcommittee Chair: Yogesh Ramadass, Texas Instruments, Santa Clara, CA

Power management continues to support and enable a diverse array of applications spanning low-power energy harvesting, biomedical systems, and envelope tracking to high-voltage grid interface, wireless power transfer, and automotive systems. Trends seen in this year's ISSCC reflect not only the diversity in the application space for power electronics, but also a transition towards addressing key challenges faced in many industry applications. Automotive and GaN-based power electronics continue to grow in scope and importance along with the focus on their high reliability and EMI compliance. The process technologies supporting power electronics are diversifying to include conventional CMOS, high-voltage SOI-CMOS, GaN-on-Si, and other GaN-based foundry processes with new devices and higher levels of integration, which in turn bring new challenges to the field. While new circuit architectures that mix switched-capacitor and inductor-based power conversion continue to emerge and prove their ability to improve the power density versus efficiency trade-off, many challenges such as how these circuits can address line and load transients, operate at high bandwidth, and achieve robustness continue to be research focus points.

Many of the key trends in power management combine a mixture of innovation in one or more of process technologies, packaging, new architectures, or control techniques:

- Hybrid converters are gaining traction in industry applications from automotive to envelope tracking. In addition to demonstrating
  high performance with small passive components, there is a trend towards reliability, robustness, and addressing system
  challenges such as line transients, fault handling, and startup.
- High-voltage power electronics and galvanic isolation continue to be important drivers for circuit topologies, process technologies, and packaging.
- There is a continued push to leverage high switching frequencies even into the GHz regime combined with resonant or soft-charging operation to achieve fully integrated power delivery.
- The high switching speed of GaN transistors is motivating innovation in gate-driver technology to mitigate voltage overshoot and EMI, while reducing power loss. GaN integration and GaN-on-silicon are enabling new higher-density and higher-complexity drive circuits.
- A range of emerging applications from biomedical implants to novel energy harvesting devices and micro-robotics is creating opportunities for new circuit architectures and control strategies to address application-specific challenges.

### Data Converters – 2021 Trends

#### Subcommittee Chair: Michael Flynn, University of Michigan, Ann Arbor, MI

Data converters are a critical link between the analog physical world and the world of digital computing and signal processing prevalent in modern electronics. The need to faithfully preserve the signal across domains continues to pressure data converters to deliver more bandwidth and linearity while continuing to increase power efficiency. This year, ISSCC not only continues the trend of highly energy-efficient analog-to-digital converters (ADCs), but also showcases new and exciting converter architectures, which open new possibilities for data conversion. Successive-approximation-register (SAR), noise-shaping SAR and delta-sigma-based designs are pushing the envelope of the current state-of-the-art in converter design. Floating inverter amplifiers (FIAs) are taking a prominent role and show their efficiency in many discrete-time converters.

The three figures below represent traditional metrics that capture the innovative progress in ADC design. The first figure plots power dissipated relative to the Nyquist sampling rate ($P/f_{snyq}$), as a function of signal-to-noise-and-distortion ratio (SNDR), to give a measure of ADC power efficiency. Note that a lower $P/f_{snyq}$ metric represents a more efficient circuit on this chart. For low-to-medium-resolution converters, energy is primarily expended to quantize the signal; thus the overall efficiency of this operation is typically measured by the energy consumed per conversion and quantization step. The dashed trend-line represents a benchmark of 1fJ/conversion-step. Circuit noise becomes more significant with higher-resolution converters, necessitating a different benchmark proportional to the square of signal-to-noise ratio, represented by the solid line. Designs published from 1997 to 2020 are shown in circles. ISSCC 2021 designs are shown in black dots.
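The energy-per-conversion-step benchmark can be sketched numerically. The following minimal example assumes the widely used Walden figure of merit, $P/(f_{snyq} \cdot 2^{ENOB})$, with ENOB derived from SNDR for a full-scale sine wave; the function names and the sample converter are hypothetical and do not correspond to any design in the figure.

```python
def enob(sndr_db):
    """Effective number of bits for a full-scale sine wave input."""
    return (sndr_db - 1.76) / 6.02


def walden_fom(power_w, f_snyq_hz, sndr_db):
    """Energy per conversion step in joules: P / (f_snyq * 2**ENOB)."""
    return power_w / (f_snyq_hz * 2.0 ** enob(sndr_db))


def energy_per_sample_on_trendline(sndr_db, fom_j_per_step=1e-15):
    """P/f_snyq (joules) of a converter sitting on a constant-FoM line."""
    return fom_j_per_step * 2.0 ** enob(sndr_db)


# Hypothetical converter: 1 mW, 100 MS/s Nyquist rate, 61.96 dB SNDR (10 ENOB).
fom = walden_fom(1e-3, 100e6, 61.96)
print(f"{fom * 1e15:.2f} fJ/conversion-step")

# P/f_snyq that the same SNDR would need to sit on the 1 fJ/conversion-step line.
print(f"{energy_per_sample_on_trendline(61.96) * 1e12:.3f} pJ per sample")
```

Because the benchmark doubles for every additional effective bit, a constant energy per conversion step appears as a rising line when $P/f_{snyq}$ is plotted against SNDR.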

The second figure plots signal fidelity vs. the Nyquist sampling rate normalized to power consumption. At low sampling rates, converters tend to be limited by thermal noise, independent of the sample rate. Higher speeds of operation present additional challenges in maintaining accuracy in an energy-efficient manner, indicated by the roll-off vs. frequency in the dashed line. The last ten years have resulted in an improvement of over 10dB in power-normalized signal fidelity, or a 10× improvement in speed for the same normalized signal fidelity. In this year's ISSCC, no fewer than four designs are pushing towards noise-limited efficiency, approaching the challenge with discrete-time SAR, noise-shaped SAR and continuous-time delta-sigma architectures. Pipelined SAR and pipelined noise-shaping SAR architectures are setting the trends in the speed vs. efficiency corner of the graph.
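Power-normalized signal fidelity is commonly expressed with the Schreier figure of merit, SNDR plus ten times the base-10 logarithm of bandwidth over power. The sketch below assumes that definition, with the bandwidth taken as half the Nyquist sampling rate; the function name and the sample values are hypothetical.

```python
import math


def schreier_fom_db(sndr_db, f_snyq_hz, power_w):
    """Power-normalized fidelity in dB: SNDR + 10*log10(bandwidth / power)."""
    bandwidth_hz = f_snyq_hz / 2.0  # Nyquist bandwidth is half the sampling rate
    return sndr_db + 10.0 * math.log10(bandwidth_hz / power_w)


# Hypothetical converter: 70 dB SNDR, 100 MS/s Nyquist rate, 1 mW.
base = schreier_fom_db(70.0, 100e6, 1e-3)
print(f"FoM = {base:.2f} dB")

# Running ten times faster at the same SNDR and power raises the metric by 10 dB.
faster = schreier_fom_db(70.0, 1e9, 1e-3)
print(f"Gain from 10x speed = {faster - base:.2f} dB")
```

Under this definition a 10dB gain in the metric and a 10× gain in speed at fixed SNDR and power are the same improvement seen from two sides.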

The final figure plots ADC bandwidth as a function of SNDR. Sampling jitter or aperture errors coupled with an increased noise bandwidth make achieving both high resolution and high bandwidth a particularly difficult task. While ten years ago, a state-of-the-art data converter showed an aperture error of approximately 1ps<sub>rms</sub>, in recent years, designs with aperture errors below 100fs<sub>rms</sub> have been published, many of which have been published at ISSCC.
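The link between aperture error and achievable resolution can be illustrated with the standard jitter-limited SNR bound for a full-scale sine wave, $-20\log_{10}(2\pi f_{in}\sigma_t)$. The sketch below is a minimal illustration under that assumption; the function name and the 1GHz input frequency are hypothetical choices, not values from the figure.

```python
import math


def jitter_limited_snr_db(f_in_hz, jitter_rms_s):
    """Upper bound on SNR (dB) set by sampling jitter alone for a sine-wave input."""
    return -20.0 * math.log10(2.0 * math.pi * f_in_hz * jitter_rms_s)


# Compare 1 ps and 100 fs rms aperture error at a hypothetical 1 GHz input.
snr_1ps = jitter_limited_snr_db(1e9, 1e-12)
snr_100fs = jitter_limited_snr_db(1e9, 100e-15)
print(f"1 ps rms:   {snr_1ps:.2f} dB")
print(f"100 fs rms: {snr_100fs:.2f} dB")
```

Each tenfold reduction in aperture error raises the bound by 20dB at a given input frequency, or equivalently allows a tenfold higher input frequency at the same SNR.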

Finally, this year's ISSCC presents two converters advancing the state-of-the-art in high-speed digital-to-analog converter (DAC) design. While one design shows the feasibility of a high-speed DAC for wideband self-calibration of ADCs, a second design advances the state-of-the-art in capacitive RF DAC design, using integrated matching networks.



Figure 1: ADC power efficiency ($P/f_{snyq}$) as a function of SNDR.



Figure 2: Power normalized noise and distortion vs. the Nyquist sampling rate.


Figure 3: Bandwidth vs. SNDR.

# HISTORICAL TRENDS IN TECHNICAL THEMES

### COMMUNICATION SYSTEMS

RF SUBCOMMITTEE – WIRELESS SUBCOMMITTEE – WIRELINE SUBCOMMITTEE



### RF – 2021 Trends

#### Subcommittee Chair: Jan Craninckx, imec, Belgium

ISSCC 2021 features record-setting advancements in phase-locked-loop (PLL) prototypes, CMOS power amplifiers (PAs), and Terahertz concepts. RF integrated circuit advances are driven by emerging applications in 1) broadband and fifth-generation (5G) communications using massive multiple-input/multiple-output (MIMO) and millimeter-wave (mm-wave) technologies and 2) sensing and imaging at millimeter-wave and sub-millimeter-wave frequencies.

**Phase-Locked Loop Synthesizers**: ISSCC 2021 highlights new results in voltage-controlled oscillators (VCOs) and PLLs generating RF, microwave, and mm-wave frequency carriers with record-setting low jitter and power consumption and pushing jitter-power figure of merit (FoM) below -250dB. Advanced RF VCO design highlights optimized waveform shaping to improve the VCO phase-noise FoM to 196.9dBc/Hz and reduce the 1/f<sup>3</sup> corner.



PLL advancements continue to feature a variety of sub-sampling and digital approaches in fractional-N architectures. Fig. 1 highlights that ISSCC 2021 is introducing PLLs with record FoMs that factor integrated jitter (jitter variance) and power consumption. Seven outstanding fractional-N PLL papers use a diverse set of approaches. Below 6GHz, PLLs demonstrate 1) a fully digital design with 365fs<sub>rms</sub> jitter and frequency spurs below -63dBc and 2) analog sampling with digital-to-time conversion in a 14nm process that yields 80fs<sub>rms</sub> jitter. For 5G signal generation, fractional-N PLL strategies include 1) digital architectures with noise shaping to cover a 12.9-to-15.1GHz band with 79fs<sub>rms</sub> jitter and 2) digital subsampling to achieve a record 47.3fs<sub>rms</sub> jitter in 16nm FinFET. Innovative approaches to mm-wave applications are also presented, including a 24GHz FMCW synthesizer approach offering fast modulation slope and wide bandwidth and a 102GHz PLL with 82fs<sub>rms</sub> jitter. These fractional-N and mm-wave-output PLLs continue to improve power consumption and integrated jitter to keep pace with advances in communications and sensing applications.
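A jitter-power figure of merit that combines jitter variance and power consumption is commonly written as $10\log_{10}[(\sigma_t/1\mathrm{s})^2 \cdot (P/1\mathrm{mW})]$. The sketch below assumes that definition; the function name and the sample operating point are hypothetical and are not the measured results of any paper cited here.

```python
import math


def pll_jitter_power_fom_db(jitter_rms_s, power_w):
    """Jitter-power FoM in dB: 10*log10((sigma_t / 1 s)**2 * (P / 1 mW))."""
    jitter_variance = (jitter_rms_s / 1.0) ** 2  # normalized to 1 second
    power_mw = power_w / 1e-3                    # normalized to 1 milliwatt
    return 10.0 * math.log10(jitter_variance * power_mw)


# Hypothetical operating point: 100 fs rms jitter at 10 mW.
print(f"FoM = {pll_jitter_power_fom_db(100e-15, 10e-3):.1f} dB")

# Halving the jitter at the same power improves (lowers) the FoM by about 6 dB.
delta = pll_jitter_power_fom_db(50e-15, 10e-3) - pll_jitter_power_fom_db(100e-15, 10e-3)
print(f"Change from halving jitter = {delta:.2f} dB")
```

More negative values are better, so a design reaches the -250dB mark either by lowering jitter or by delivering the same jitter at lower power.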

**RF and mm-Wave PAs**: ISSCC 2021 presents exciting PA innovations to improve efficiency, linearity, and wideband operation for PAs in RF and millimeter-wave bands. Research in digital PAs highlights current-mode approaches to operate at higher frequencies and at improved back-off power. ISSCC 2021 is introducing work that 1) explores subharmonic switching for a record 29% drain efficiency at 9dB back-off and 2) demonstrates more than 30dBm output power with 23.7% power-added efficiency (PAE) at 9dB back-off power using tri-state digital PA cells. These results push PA performance operating below 6GHz as shown in Fig. 2.



Fig. 2. Under-6GHz power-amplifier trend.

In mm-wave bands, a record PAE of 50% using a coupling approach between gate and source terminals to increase drain efficiency is demonstrated in 45nm CMOS SOI. This work pushes the bound for PAE and output power for PAs operating between 20 and 50GHz as illustrated in Fig. 3. Additionally, CMOS Doherty amplifiers demonstrate optimization to balance high average efficiency with linear operation. A load-balanced Doherty design achieves a 20% average PAE at 15.5dBm output power. An integrated Marchand balun offers wideband operation for a Doherty PA covering the 26-to-60GHz frequency band.



Fig. 3. 20-to-50GHz power-amplifier trend.
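The efficiency metrics used in the PA discussion above can be made concrete with a short sketch. It uses the standard definitions of drain efficiency (output power over DC power) and power-added efficiency (output minus input power, over DC power); the function names and the operating point are hypothetical and are not measurements from the papers.

```python
def dbm_to_watts(p_dbm):
    """Convert a power level in dBm to watts."""
    return 10.0 ** (p_dbm / 10.0) / 1000.0


def drain_efficiency(p_out_w, p_dc_w):
    """RF output power divided by DC supply power."""
    return p_out_w / p_dc_w


def power_added_efficiency(p_out_w, p_in_w, p_dc_w):
    """Output power minus RF input power, divided by DC supply power."""
    return (p_out_w - p_in_w) / p_dc_w


# Hypothetical PA: 30 dBm output, 10 dBm input (20 dB gain), 2 W from the supply.
p_out = dbm_to_watts(30.0)
p_in = dbm_to_watts(10.0)
print(f"Drain efficiency: {drain_efficiency(p_out, 2.0):.1%}")
print(f"PAE:              {power_added_efficiency(p_out, p_in, 2.0):.1%}")

# Output power at 9 dB back-off from the 30 dBm peak.
print(f"Back-off output:  {dbm_to_watts(30.0 - 9.0) * 1000:.1f} mW")
```

Efficiency at back-off matters because modulated signals spend most of their time well below peak power, where a conventional PA draws nearly the same DC power for far less output.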

**Emerging Technologies for Communication and Terahertz Sensing/Imaging**: At ISSCC 2021, record-setting components and demonstrations for terahertz-frequency imaging and sensing are presented. A record sensitivity is achieved at 605GHz with a noise-equivalent power of $2.3\,\mathrm{pW}/\sqrt{\mathrm{Hz}}$, a substantial sensitivity improvement relative to the state-of-the-art that makes passive terahertz detection viable. ISSCC 2021 continues to advance the potential of sub-millimeter-wave sensing systems by introducing a 436-to-467GHz lens-integrated radiating source that offers multiple 2D steerable beams.

### Wireless – 2021 Trends

#### Subcommittee Chair: Stefano Pellerano, Intel, Hillsboro, OR

Mobile battery limitations demand low-power cellular SoC implementations and continue to drive the development of high-performance and power-efficient transceivers. This year, at ISSCC 2021, sub-6GHz 5G radios introduce low-cost, low-power transceivers that support the increased data rates of the new standard and remain backward compatible with existing 2G/3G/4G infrastructure, by employing extensive integration techniques, dynamic biasing, and an adaptive voltage supply scheme.

Figure 1 shows the trend in the power consumption per component carrier (CC) for recent cellular SoC implementations, as well as the shift in process nodes. It indicates an increasing push for power reduction to extend mobile battery life. This year, an advanced cellular transceiver features a low-power and low-cost SoC in 14nm FinFET CMOS with LTE power of 114mW/CC. The transceiver supports legacy as well as 5G frequency range 1 (FR1) cellular and dual-mode global navigation satellite system (GNSS).



Figure 1: Trends in the LTE power consumption/CC and process nodes for recent cellular SoC implementations reported at ISSCC.

With the continuing advancement of THz technologies in silicon processes, a generation of intelligent sensing and imaging applications is being developed in the spectrum above 100GHz. THz Prism, a spectrum-to-space mapping scheme utilizing an integrated leaky-wave antenna, is introduced at ISSCC 2021 for rapid one-shot localization and direction finding of mobile nodes with 2D angular accuracy of 1.9deg and 1.95deg for a measurement time of 50ms.

Reported advances in ultra-low-power radio continue the drive towards high sensitivity and energy-efficient wireless nodes as can be seen from Figure 2. This year, a 2.4GHz wake-up receiver is reported enabling 110pJ/b while achieving sensitivity <-90dBm, leveraging both a within-packet duty-cycling technique and an Uncertain-IF front-end without utilizing any off-chip components.



Figure 2: ISSCC published wake-up receiver sensitivity and energy-efficiency trends.

### Wireline – 2021 Trends

#### Subcommittee Chair: Frank O'Mahony, Intel, Hillsboro, OR

Over the past few decades, electrical and optical interconnects have been the key components bridging the gap between the exponentially growing demand for data bandwidth across electronic components/systems and the relatively gradual increase in pin/cable density. Ranging from handheld electronics to supercomputers, wireline data communication bandwidth must also grow exponentially to avoid limiting the performance scaling of these systems. By increasing the data per pin or cable of various electronic devices and systems, such as memory, graphics, chip-to-chip fabric, backplane, rack-to-rack, and LAN, wireline I/O has fueled incredible technological innovation in electronic devices and systems over the past few decades. Figure 1 shows that data rate per pin has approximately doubled every four years across various I/O standards ranging from DDR, to graphics, to high-speed Ethernet. Figure 2 shows that the data rates for published transceivers have kept pace with these standards while taking advantage of CMOS scaling. Figure 3 shows published transceiver energy efficiency vs. channel losses at the Nyquist frequency in the 40-to-50dB range. In part, this incredible improvement is enabled by the power-performance benefits of process technology scaling. However, sustaining this exponential trend for I/O bandwidth requires more than just transistor scaling. Significant advances in energy efficiency, channel equalization and clocking must be made to enable the next generation of low-power and high-performance computing systems. Papers at ISSCC this year include examples of PAM-4 transmitters at >200Gb/s, short-reach electrical interconnects operating up to 112Gb/s, long-reach copper interconnect transceivers operating up to 116Gb/s, optical transceivers and components operating up to 400Gb/s (coherent), and a dielectric-waveguide link at 105Gb/s. New techniques for extending data rate, power reduction, channel equalization, and clock recovery are reported. These transceivers and transceiver building blocks are implemented in both CMOS and BiCMOS technologies.

### Scaling Electrical Interconnects to 100Gb/s and Reaching out to 200Gb/s

Bandwidth requirements in data centers and telecommunication infrastructure continue to drive the demand for ultra-high-speed wireline communication. Over the past few years, complete transceivers operating up to 112Gb/s were demonstrated across a variety of channel lengths. Two notable trends in these transceivers, especially for long-reach channels, are the adoption of PAM-4 modulation and a transition to DAC/ADC architectures with DSP-based equalization. Although PAM-4 provides twice the data rate at the same baud rate as conventional NRZ to relax channel loss requirements for bandwidth doubling, it also comes with more stringent requirements for linearity, noise and multi-level signaling. This trend has motivated development of low-power data converters, digital equalization and clock recovery along with linear, high-bandwidth TX and RX analog front ends. Over the past two years, the first transceiver components were demonstrated to extend these transceivers to 112Gb/s. This year, ISSCC includes several implementations of 112Gb/s PAM-4 long-reach transceivers in 7nm CMOS with relatively low power consumption. In paper 8.4, Huawei Technologies describes a 112Gb/s PAM-4 transceiver for long-reach copper interconnects consuming only 6pJ/b while operating over a channel with 45dB loss. In paper 8.7, Inphi presents an 8x112Gb/s PAM-4 retimer macro, where a single lane consumes 6.5pJ/b. While 112Gb/s links are maturing, several papers at ISSCC are directed to doubling the data rate to 224Gb/s. In paper 8.1, Intel demonstrates a first DAC-based TX generating 224Gb/s PAM-4 data while consuming only 1.9pJ/b. The DAC is based on a 4:1 MUX cell using active peaking and uses an inductive clock distribution network to achieve low jitter. In paper 8.2, researchers from UC Berkeley and Hanyang University also demonstrate a 200Gb/s PAM-4 TX in 28nm CMOS. 
In paper 11.8, Marvell presents an echo cancellation technique to enable simultaneous bi-directional links with 56Gb/s PAM-4 data. Using combined analog and digital echo cancellation the circuit suppresses reflected signals from the TX by 44dB.
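Two relationships in the paragraph above lend themselves to quick arithmetic: the symbol rate implied by a modulation format, and the power implied by an energy-per-bit figure. The sketch below is a minimal illustration; the function names are hypothetical, and the resulting milliwatt values are simple products of the quoted pJ/b and Gb/s figures rather than numbers reported by the authors.

```python
import math


def symbol_rate_gbaud(data_rate_gbps, levels):
    """Symbol rate in GBaud for a pulse-amplitude modulation with the given level count."""
    return data_rate_gbps / math.log2(levels)


def link_power_mw(energy_pj_per_bit, data_rate_gbps):
    """Power in mW: pJ/b times Gb/s equals 1e-12 J/b * 1e9 b/s = 1e-3 W."""
    return energy_pj_per_bit * data_rate_gbps


# PAM-4 carries two bits per symbol, so it halves the baud rate relative to NRZ.
print(symbol_rate_gbaud(112, 4), "GBaud for 112Gb/s PAM-4")
print(symbol_rate_gbaud(112, 2), "GBaud for 112Gb/s NRZ")

# Power implied by the quoted efficiency figures.
print(f"{link_power_mw(6.0, 112):.1f} mW at 6pJ/b and 112Gb/s")
print(f"{link_power_mw(1.9, 224):.1f} mW at 1.9pJ/b and 224Gb/s")
```

The halved symbol rate is what relaxes the channel-loss requirement, at the cost of tighter linearity and noise margins across the four signal levels.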

### Short Reach Links for Intra-Package Communications

As a consequence of the increasing demand for bandwidth in high-throughput systems used in AI, HPC and switch applications, multiple devices are integrated in the same package, and data is sent between chips on the same package or between a central chip and co-packaged optics (CPO). For these applications, relatively short distances (<50mm) have to be bridged with minimum power and at the highest throughput per mm chip-edge (Gb/s/mm). Since the channel attenuation and discontinuity in these links are small, low-power analog-oriented architectures are preferred over the heavy DSP-based solutions used for longer channels. In Paper 11.1, MediaTek describes a 112Gb/s XSR transceiver, which achieves 1E-10 BER over a 50mm package trace while consuming only 1.7pJ/b. In Paper 11.2, Rambus presents a 106Gb/s XSR transceiver, which achieves 1E-9 BER over a 10dB loss channel while consuming 1.55pJ/b and enabling 722Gb/s/mm throughput per edge. In Paper 11.3, Cadence presents an alternative to differential signaling in a 40Gb/s single-ended 6b/7b-coded NRZ signaling scheme consuming 1.7pJ/b while providing 480Gb/s/mm throughput at <1E-15 BER.

#### **Optical Links for Upcoming 400G Data Center Interconnects**

The explosive growth of data and data-centric computing places stringent demands on the bandwidth and energy efficiency of data center interconnects, spurring the development of several 200-to-400G Ethernet standards. Low-power data converters and optical integration are the two key components for the development of high-performance optical pluggable modules using coherent detection. In Paper 8.6, Inphi presents a 40-to-97GS/s DAC and ADC with 40GHz bandwidth for 400Gb/s coherent optical applications in 7nm FinFET. The ADC is based on a 128× time-interleaved SAR ADC, while a fractional PLL supports a wide frequency range to cover different modulation formats and speeds. While coherent optical communication covers long distances, demand is also surging for shorter, non-coherent (<2km) optical links. Several 400G Ethernet standards (e.g. 400G-DR4/FR4) target optical line rates of 100Gb/s. Since these links are envisioned to be employed in data centers in high volume, low cost and low power are key requirements. Solutions so far have typically employed a standalone BiCMOS TIA IC followed by a 100Gb/s PAM-4 DSP-based SerDes IC, which resulted in high power dissipation and package cost. In Paper 11.8, Intel demonstrates a first optical receiver with integrated TIA for 100Gb/s PAM-4. The receiver uses purely analog signal processing to incorporate a 2-tap FFE and a 2-tap direct-feedback DFE and achieves -8.3dBm sensitivity at 2.4e-4 BER. The RX is implemented in 28nm CMOS and consumes only 3.9pJ/b.

#### **Concluding Remarks:**

Continuing to aggressively scale I/O bandwidth is essential for the industry, but the tradeoffs between bandwidth, power, area, cost and reliability are extremely challenging. Advances in circuit architecture, interconnect topologies, transistor scaling and integrated silicon photonics are changing how I/O will be done over the next decade. The most exciting and promising of these emerging technologies for electrical and optical interconnects will be highlighted at ISSCC 2021.



Figure 1: Per-lane data rate vs. year for a variety of common I/O standards.



Figure 2: Data-rate vs. process node and year.



Figure 3: Transceiver power efficiency vs. channel loss.

# **HISTORICAL TRENDS IN TECHNICAL THEMES**

### **DIGITAL SYSTEMS**

DIGITAL ARCHITECTURES & SYSTEMS SUBCOMMITTEE – DIGITAL CIRCUITS SUBCOMMITTEE – MACHINE LEARNING & AI SUBCOMMITTEE



## **Digital Architectures & Systems (DAS) – 2021 Trends**

#### Subcommittee Chair: Thomas Burd, Advanced Micro Devices, Santa Clara, CA

This year's selection of processor papers includes contributions across a diverse set of applications and process nodes. The cellphone CPU continues to increase in both frequency and performance, approaching laptop levels of performance. Products featuring 5nm CMOS technology are beginning to hit the market. While the industry waits for at-scale quantum computers, silicon-based simulated-annealing chips continue to advance exponentially to help solve combinatorial optimization problems. Additionally, to highlight processor advances in various sectors of this diverse industry, three invited industry papers feature a leading-edge gaming-console processor, a next-generation GPU and communication protocol, and a commercial data-center-scale AI processor. Finally, as CMOS process technology and processor designs become increasingly complex, the number of engineers required to handle such complexity continues to grow. Innovation in design productivity is emerging as a key tool to combat the ever-growing development costs of large SoC processors.



Figure 1: Core-count trends (red diamond designates multi-chip module).



Figure 2: Clock-frequency-scaling trends.



Figure 3: Chip-complexity scaling trends (red diamond designates multi-chip module).



Figure 4: On-die cache-size trends (red diamond designates multi-chip module).

The major innovation efforts in mobile phones focus on three key areas: 5G, artificial intelligence (AI) and gaming. 5G cellular technology is becoming more mature and the post-5G era is gaining more focus. The 6G era will feature more antennas, more intelligence, and more use cases. The range of applications of neural network processing units (NPUs) is gradually expanding beyond image and speech recognition to cellular performance improvement and SoC power and performance optimization. As a result, not only is NPU performance increasing, but tiny NPUs for low-power operation are also being applied everywhere. For better user experience and game quality, the main concern surrounding the display is moving from resolution to frame rate.

| Graphics           | OpenGL OpenGL/VG/MAX AR VR (Virtual Reality)<br>(ES1.1) (ES2.0) (Augmented Reality) Vulkan                             |
|--------------------|------------------------------------------------------------------------------------------------------------------------|
| Display            | VGA WVGA SXGA WQXGA/WQXGA+ WQXGA/WQXGA+ WQXGA/WQXGA+ 4K<br>@ 60fps @ 60fps @ 60fps @ 60fpsx2 (VR) @ 120fps @ 240fps    |
| Camera             | 5~8M 10M 16M 20M 24M 12MxDual 16MxDual 48M 200M<br>360° VR 3D Depth / AR Triple Quadruple                              |
| Image/Video        | H.264/AVC H.264/AVC H.264/AVC H.264/MVC H.265/VP9 H.265/VP9 AV1<br>(VGA) (D1) (Full HD) H.264/SVC H.265/VP9 HDR HDR10+ |
| Audio              | AAC AAC Plus WMA Dolby DSD TWS<br>Dolby 5.1 TrueHD/Digital+ Dolby Atmos Truly Wireless                                 |
| Accelerator        | SIMD Multi core Heterogeneous Neural-net 5 TOPS 15 TOPS   Multi core (2-4) (4-8) Multi-Processing Processor 15 TOPS    |
| downlink<br>[Mb/s] | UMTS HSPA HSPA+ LTE LTE-A LTE-A LTE-A 5G 5G   0.4-2 1.8-7 7-42 100 150-750 1600 2000 5000 10000                        |
| CPU [MIPS]         | 300 500 800 2400 6K 12K 13K 19K 22K 26K<br>500 800 2400 6000 12K 100K 112K 162K 180K 208K                              |
|                    | 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021                                             |

Figure 5: Application-processor trends in smart phones.

The bandwidth of wired and wireless links continues to increase at a rate of approximately 10× every five years. Compared to previous years, the changes this year are modest. Massive MIMO and mm-wave technologies are being actively studied to realize full 5G communication. The first 5G mobile devices were commercialized starting last year with more on the way in 2020/2021. The explosion of IoE devices will require the evolution of narrow-band wide-area networks.



Figure 6: Data-rate trends in wired, wireless and cellular.
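A growth rate quoted as a multiple over several years, such as the roughly tenfold increase per five years described above, can be converted to an equivalent annual factor. The sketch below is a minimal illustration under a constant exponential growth assumption; the function names and the starting data rate are hypothetical.

```python
def annual_growth_factor(multiple, years):
    """Per-year multiplier equivalent to growing by `multiple` over `years`."""
    return multiple ** (1.0 / years)


def projected_rate(rate_now, multiple, years, horizon_years):
    """Rate after `horizon_years`, assuming constant exponential growth."""
    return rate_now * multiple ** (horizon_years / years)


# Tenfold growth every five years expressed as a yearly increase.
factor = annual_growth_factor(10.0, 5.0)
print(f"Annual growth: {(factor - 1.0) * 100:.1f}% per year")

# Hypothetical link at 1 Gb/s today, projected ten years out at the same pace.
print(f"Projected rate: {projected_rate(1.0, 10.0, 5.0, 10.0):.0f} Gb/s")
```

The same helpers apply to any trend stated as a doubling or tenfold period, which makes trends quoted over different intervals directly comparable.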

**Circuits for Hardware Security:** With the increasing risk and cost of information theft and safety hazards, hardware security has become a common requirement in intelligent and connected systems. Though focus on cryptographic implementation continues, cost-effective and low bit error rate PUFs (physically unclonable functions) are increasingly adopted in smart cards, sensor nodes, consumer devices, and automotive. TRNGs (true random-number generators) are also commonly required to strengthen secret key generation in cryptographic applications.

Figure 7 illustrates trends in area scaling in PUFs (area/bit) and TRNGs published recently at ISSCC, showing relentless area and cost reductions. Figure 8 shows the energy/bit scaling in PUFs published at ISSCC, and the relatively higher energy/bit of TRNGs. The sudden PUF energy increase of three years ago is attributable to the stronger emphasis on low bit error rate. Figure 9 illustrates the native and post-processing bit error rate, the latter of which has recently seen drastic reductions thanks to new techniques mitigating bit instability.
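To make the difference between native and post-processing bit error rate concrete, here is a minimal sketch of one generic stabilization technique, temporal majority voting. The text does not say which techniques the ISSCC papers use, so this is an illustrative stand-in only. It assumes every read of a bit fails independently with the same probability `p`, which real PUF cells do not strictly obey.

```python
from math import comb


def majority_vote_ber(p: float, n: int) -> float:
    """Probability that a majority of n independent reads is wrong.

    p is the per-read (native) bit error probability; n must be odd so
    that a majority always exists.
    """
    if n % 2 == 0:
        raise ValueError("n must be odd")
    # The vote fails when more than half of the n reads are in error.
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )


native_ber = 0.05  # hypothetical native error rate, not from the source
for votes in (1, 3, 5, 9):
    print(votes, majority_vote_ber(native_ber, votes))
```

Each additional pair of votes lowers the residual error rate, at the cost of extra read energy and latency.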



Figure 7. Area/bit trends for physically unclonable functions (PUFs) and area trends for true random number generators (TRNGs) published recently at ISSCC.



Figure 8. Energy/bit trends for PUFs and TRNGs published recently at ISSCC.



Figure 9. Native and post-processing bit error rate for PUFs, showing the dramatic improvements offered by PUF post-processing.

## **Digital Circuits – 2021 Trends**

#### Subcommittee Chair: Keith Bowman, Qualcomm, Raleigh, NC

The demand for higher performance and energy-efficient platforms ranging from Internet of Everything (IoE) devices to cloud data-centers continues to drive innovations in CMOS digital-circuit building blocks with goals of improving performance and energy efficiency, while lowering cost and design effort. As classic technology scaling continues to slow down, circuit innovations exploit technology features, such as body biasing, voltage stacking, and emerging memories, to enhance performance and energy efficiency. In addition, variation-tolerant design is a major trend in digital circuits to improve performance, energy efficiency, and robustness across process, voltage, temperature and aging variations. Specifically, proposed adaptive-clocking and power-throttling techniques mitigate these effects on-chip with an increasingly closer interaction between circuits and micro-architecture, including thread-level power management.

A continued trend towards application-specific accelerators is leading to the development of new circuit techniques that benefit a range of emerging applications, such as memory-centric workloads for artificial intelligence or graph applications. Some of these accelerators leverage compute-in-memory strategies, while others rely on circuit operation in non-conventional modes/domains, such as bit-serial, time-domain or charge-domain computation.

**Digital PLLs for Low-Jitter Applications**: PLL trends include an analog-to-digital migration to provide more functionality, variability management, and lower design complexity at advanced nodes. Demand for compact low-jitter PLLs continues to increase. Furthermore, multiple operating modes and dynamic frequency scaling have increased the need for fractional division ratios, leading to advanced digital signal processing in the digital PLL control. The use of more automated digital design flows, such as synthesis and automated placement and routing, dramatically reduces development costs, but can degrade jitter, thus requiring new techniques to compensate. In addition, power and area reductions achieved by digital and mixed-signal PLLs now allow usage of analog functional block drivers, leading to the development of digital circuit techniques for spurious-tone cancellation. Figure 1 describes the PLL area and a key figure of merit (FoM) for digital PLLs and MDLLs across calendar years, highlighting a consistent trend in area reduction, while maintaining competitive FoM values.



Figure 1. Digital phase-locked loop (PLL) and multiplying delay-locked loop (MDLL) trends in area and key figure of merit (FoM), defined as:  $\mathrm{FoM} = 10 \log_{10} \left[ \left( \mathrm{Jitter}_{\mathrm{RMS}} / 1\,\mathrm{s} \right)^2 \times \left( \mathrm{Power} / 1\,\mathrm{mW} \right) \right]$.
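The FoM defined in the caption can be evaluated directly. The sketch below is a minimal implementation of that formula; the function name and the example jitter and power values are hypothetical and are not taken from any paper in the kit.

```python
import math


def pll_fom_db(jitter_rms_s: float, power_w: float) -> float:
    """FoM = 10*log10((Jitter_RMS / 1 s)^2 * (Power / 1 mW)), in dB.

    jitter_rms_s: RMS jitter in seconds.
    power_w: PLL power consumption in watts.
    """
    jitter_norm = jitter_rms_s / 1.0  # normalized to 1 s
    power_norm = power_w / 1e-3       # normalized to 1 mW
    return 10.0 * math.log10(jitter_norm**2 * power_norm)


# Hypothetical design point: 100 fs RMS jitter at 10 mW.
print(f"{pll_fom_db(100e-15, 10e-3):.1f} dB")
```

A more negative FoM is better: halving the jitter improves the FoM by about 6 dB, while halving the power improves it by about 3 dB.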

**Integrated Voltage Regulators**: Energy reduction remains a top priority as power density continues to increase. Voltage regulators, while traditionally off-chip, are increasingly integrated on-chip to enable fine-grained power management and cost reduction. While high-efficiency voltage down-conversion has driven inductor-based regulators (LCVR) and switched-capacitor voltage regulators (SCVR), efficient conversion for multiple different output voltages recently led to the development of single-inductor-multiple-output regulators (SIMO). A major trend continues towards block-level regulation with low-dropout (LDO) linear regulators. These voltage regulators are integrated in scaled process nodes to enable faster transient response and fine-grained dynamic voltage and frequency scaling (DVFS) of individual functional blocks. As a result, the low voltages supported in DVFS systems drive a move from analog-based LDOs to digital implementations. Another trend is towards hybrid LDOs that incorporate the best features of both analog and digital designs. While ultra-low-power LDOs now target both digital and analog loads with high power-supply rejection ratio and low output ripple, large digital computation loads drive the demand for output voltage droop mitigation on distributed LDOs with increasing maximum output currents. Figure 2 describes the current density and power conversion efficiency of these integrated voltage regulators across calendar years, demonstrating an increasing trend in current density, while maintaining high power efficiencies.
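For context on the power-efficiency axis, the sketch below uses the textbook first-order model of a linear (LDO) regulator, in which the load current flows straight from input to output and efficiency is bounded by the voltage ratio. This model, the function name, and the example values are assumptions for illustration and are not drawn from the papers.

```python
def ldo_efficiency(v_in: float, v_out: float, i_load: float,
                   i_quiescent: float = 0.0) -> float:
    """First-order LDO efficiency: P_out / P_in.

    P_out = v_out * i_load
    P_in  = v_in * (i_load + i_quiescent)
    """
    if v_out > v_in:
        raise ValueError("an LDO can only step the voltage down")
    return (v_out * i_load) / (v_in * (i_load + i_quiescent))


# Hypothetical DVFS operating points from a 1.0 V input rail.
for v_out in (0.9, 0.7, 0.5):
    eff = ldo_efficiency(1.0, v_out, i_load=0.1, i_quiescent=1e-4)
    print(f"Vout={v_out:.1f} V -> efficiency {eff:.1%}")
```

In this simple model, efficiency falls in proportion to the output voltage, which is one reason switching regulators (LCVR, SCVR) remain attractive for large down-conversion ratios.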



Figure 2. Integrated voltage regulator trends in current density and power efficiency.

## Machine Learning (ML) & AI – 2021 Trends

#### Subcommittee Chair: Marian Verhelst, KU Leuven - MICAS, Leuven, Belgium

In response to the world-wide trend of increased interest and enthusiasm in deep learning in recent years, ISSCC established a subcommittee dedicated to machine learning and AI, starting with last year's 2020 edition. As deep neural networks succeed in achieving higher accuracy on a wide variety of tasks, the size and computational complexity of these models also keep rising. Across datacenter, mobile, and IoT workloads, this results in continued demand for more-energy-efficient and higher-throughput neural-network computing, in both inference and training workloads. This year's submissions have targeted these objectives across a broad spectrum of applications, ranging from a 570nW inferencing solution for always-on keyword spotting, to cloud AI processors supporting inference at 102TOPS (int4) as well as training at 25.6TFLOPS (fp8).

It is important to note that the metrics that matter at the system level are energy-per-inference (or -per-training-example), and inferences/second (or training-examples/second) on a specific task at a given inference (or final trained) accuracy. This year's submissions significantly push the state-of-the-art of these efficiency and throughput numbers yet again, often by combining multiple enhancement techniques within a single chip, implemented in the most-advanced technology nodes (Figure 1):



Figure 1: Various parameters impacting low-level and system-level benchmarking metrics.

- 1.) ML accelerator chips supporting a specific set of ultra-low-power applications are increasingly presented in this conference. The application areas covered this year include smart cameras, hand gesture recognition, speech recognition and keyword spotting. Both ML engines and system-level integration with application-targeted components play crucial roles in achieving the specific system-level goals being reported (low energy consumption, high accuracy, high throughput, etc.).
- 2.) Compute-in-memory (CIM) architectures have become more popular this year, making use of static, as well as dynamic capacitive memories. Innovations here clearly shift towards support of more complicated and larger network models, and towards increased flexibility through the inclusion of digital blocks at the edge of CIM macros. Exploitation of sparsity in various forms continues to be an important focus of both inference and training acceleration, particularly for imaging applications.
- 3.) Next to inferencing workloads, there is a marked increase in the number of ISSCC submissions that can support training workloads. This introduces a wider variety of supported datatypes (fp16, fp8, bfloat, ...) and increases the demands on the memory system and chip IO bandwidth.
- 4.) At ISSCC 2021, machine learning processors are increasingly realized in the most-advanced technology nodes, with examples of chips fabricated with 5nm and 7nm technology. 3D die stacking is employed for the first time, for tight integration of a machine learning engine underneath an image sensor.

As different chipsets are often characterized on a different set of tasks, network topologies, and accuracy levels, a direct comparison of the true system-level benchmarking metrics – such as energy/inference and inferences/s – is not always straightforward. It is therefore instructive to look at the reported low-level metrics of operations/s and energy/operation within the neural network. Figure 2 displays the energy-efficiency-vs-throughput operating points demonstrated by the accelerators presented at ISSCC 2021 (red), compared to the state-of-the-art in 2016-2018 (blue), 2019 (black), and 2020 (green). Figure 3 plots the evolution of both energy-efficiency and area-efficiency (throughput-per-unit-area) over the past few years. ML accelerators are clearly still improving at a very fast, almost exponential pace. Yet, ISSCC attendees should keep in mind that these TOPS/W and TOPS specifications depend strongly on the particular neural-network topologies being used, since these determine the tolerance for reduced computational precision and the presence of high weight and activation sparsity. Going forward, as the field matures, we believe that a common benchmarking methodology must be established which can properly account for the application context and provide proper translation between low-level and system-level performance metrics. In the meantime, the clever combination of sparsity, variable precision, and in-memory computing technologies is continuing to enhance deep-learning processor efficiency and throughput. With the increase of system-level integration of machine learning engines together with important sub-systems (imaging chips, image pre-processing, audio filtering and pre-processing, etc.), these performance improvements will continue to open up new AI applications.
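The relationship between the low-level metrics (TOPS/W, TOPS) and the system-level metrics (energy/inference, inferences/s) can be sketched as below. This is a first-order model only: it assumes a fixed operation count per inference and a single utilization factor, and it ignores the precision, sparsity, and memory effects that the text warns about. All names and example numbers are hypothetical.

```python
def energy_per_op_pj(tops_per_watt: float) -> float:
    """1 TOPS/W equals 1e12 ops per joule, i.e. 1 pJ per operation."""
    return 1.0 / tops_per_watt


def energy_per_inference_uj(ops_per_inference: float,
                            tops_per_watt: float) -> float:
    """Energy for one inference in microjoules."""
    joules = ops_per_inference / (tops_per_watt * 1e12)
    return joules * 1e6


def inferences_per_second(ops_per_inference: float, peak_tops: float,
                          utilization: float) -> float:
    """Throughput when only a fraction of peak TOPS is actually used."""
    return peak_tops * 1e12 * utilization / ops_per_inference


# Hypothetical network needing 1e9 operations per inference.
print(energy_per_op_pj(10.0), "pJ/op")
print(energy_per_inference_uj(1e9, 10.0), "uJ/inference")
print(inferences_per_second(1e9, peak_tops=5.0, utilization=0.5), "inf/s")
```

Two chips with the same TOPS/W can therefore land on very different energy/inference figures once the network and utilization differ, which is the motivation for a common benchmarking methodology.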



Figure 2: Deep-learning processor energy-efficiency (TOPS/W) and throughput (GOPS).



Figure 3: Evolution in energy efficiency (TOPS/W) and throughput per unit area (TOPS/mm<sup>2</sup>) of ML inferencing processors.

### Memory – 2021 Trends

### Subcommittee Chair: Jonathan Chang, TSMC, Hsinchu, Taiwan

The demand for high-density, high-bandwidth, and low-energy memory systems continues to grow everywhere: from high-performance computing to SoC, wearables and IoT.

This year the conference presents a high-bandwidth 24Gb/s/pin GDDR6 DRAM, and LPDDR5 DRAM capacity is extended to 16Gb/channel. A 6.2GHz SRAM and the first 3nm GAA (gate-all-around) SRAM are reported. Computing in Memory (CiM) performance and energy efficiency are improved with an ReRAM CiM macro supporting 8b inputs and 8b weights. The highest densities per area are reported for 3D-NAND in TLC and QLC NANDs, both adopting a circuit-under-cell-array structure while also boosting programming throughput.

Top papers from ISSCC 2021 include:

- A 22nm 4Mb 8b-Precision ReRAM Computing-in-Memory Macro with 11.91-195.7 TOPS/W for Tiny AI Edge Device
- A 6.2GHz, 0.5V V<sub>MIN</sub> Single-Ended Current-Based Sense Amplifier (CSA) Compileable 8T SRAM in 7nm FinFET
- 3nm Gate-All-Around SRAM featuring an Adaptive Dual-BL and an Adaptive Cell-Power Assist Circuit
- A 24Gb/s/pin 8Gb GDDR6 with Half-Rate Daisy-Chain-Based Clocking Architecture with IO Circuitry for Low-Noise Operation
- A 16Gb Sub-1V 7.14Gb/s/pin LPDDR5 SDRAM Applying Mosaic Architecture with Short-Feedback 1-tap DFE, FSS Bus with Low-Level Swing, and Adaptively-Controlled Body Biasing in a 3rd Generation of 10nm DRAM
- A 176-stacked 512Gb 3b/Cell 3D NAND Flash with 11.0Gb/mm<sup>2</sup> Density Using a Peripheral Circuit under Cell Array Architecture
- A 1Tb 4b/Cell, 144-Tier, 3D NAND Flash Memory with 40 MB/s Program Throughput and 13.8Gb/mm<sup>2</sup> Bit Density

#### ADVANCED EMBEDDED MEMORIES

Scaling in SRAM continues and the first SRAM with gate-all-around (GAA) transistors is presented. Innovations in SRAM continue to support highest density, highest energy efficiency, and applications with very-low latency requirements. A new 16T bit cell and sense amplifier are presented to boost the latency and energy efficiency of 5nm and 7nm SRAM. 1Mb embedded RRAM with 1T1R configuration demonstrates a self-adaptive write pulse extension and uses a tightened reference current to improve the reliability of RRAM in 14nm FinFET.

#### HIGH-BANDWIDTH AND LOW-POWER DRAM

In order to keep pace with the ever-increasing performance requirements of various applications, from graphics/mobile to supercomputing, DRAM continues to scale density, form factor, and bandwidth. This year, ISSCC 2021 includes benchmarks for the latest interface standards, such as a 24Gb/s/pin GDDR6 and a 22Gb/s/pin GDDR6X with a PAM4 interface for graphics applications. Figure 1 shows DRAM bandwidth growth over the last 14 years.
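To relate the per-pin rates quoted above to device-level bandwidth, here is a minimal sketch. It assumes a 32-bit-wide (x32) device data interface, which is the usual GDDR6 configuration but is not stated in this document, and it uses the fact that PAM4 carries 2 bits per symbol. Function names are hypothetical.

```python
def device_bandwidth_gbytes_per_s(rate_gbps_per_pin: float,
                                  data_pins: int) -> float:
    """Aggregate bandwidth in GB/s: per-pin rate times pin count, over 8."""
    return rate_gbps_per_pin * data_pins / 8.0


def symbol_rate_gbaud(rate_gbps_per_pin: float,
                      bits_per_symbol: int) -> float:
    """Symbol rate on the wire for a multi-level signaling scheme."""
    return rate_gbps_per_pin / bits_per_symbol


# 24 Gb/s/pin GDDR6, assuming an x32 device (assumption, see above).
print(device_bandwidth_gbytes_per_s(24.0, 32), "GB/s")
# 22 Gb/s/pin GDDR6X with PAM4 (2 bits per symbol).
print(symbol_rate_gbaud(22.0, 2), "GBaud")
```

The second calculation shows why PAM4 is attractive: the channel only has to support half the symbol rate that NRZ signaling would need for the same bit rate.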



Figure 1 - DRAM data bandwidth growth

#### NON-VOLATILE MEMORY (NVM)

In the past decade, significant investment has been put into emerging memories to find an alternative to floating-gate-based non-volatile memory. The emerging NVMs, such as phase-change memory (PCM), ferroelectric RAM (FeRAM), spin-transfer-torque magnetic RAM (STT-MRAM), and resistive memory (ReRAM), are showing potential to achieve high cycling capability and lower power per bit in read/write operations. However, conventional flash memories are continuously improving, confirming them as the mainstream today and into the near future.

This year's papers report improvements in write performance for conventional 3D flash memory TLC (184MB/s) and QLC (40MB/s) and the widespread acceptance of asynchronous-page read for read performance. Figure 2 shows non-volatile memory performance trends.

This year's papers also report improvements in bit density for TLC (11Gb/mm<sup>2</sup>) and QLC (13.8Gb/mm<sup>2</sup>). These high densities are achieved through advancements in 3-dimensional architectures, with up to 177 stacked-WL. Figure 3 shows non-volatile memory capacity trends.
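As a back-of-the-envelope reading of these density figures, the sketch below estimates the die area implied by a given capacity and bit density. It assumes that the quoted density is total capacity divided by total die area and that 1Tb means 1024Gb. Neither assumption is confirmed here, so the results are indicative only.

```python
def implied_die_area_mm2(capacity_gb: float,
                         density_gb_per_mm2: float) -> float:
    """Die area in mm^2 implied by capacity (Gb) and density (Gb/mm^2)."""
    return capacity_gb / density_gb_per_mm2


# Capacity and density pairs as quoted in the paper titles listed earlier.
tlc_area = implied_die_area_mm2(512.0, 11.0)    # 512Gb TLC at 11.0Gb/mm^2
qlc_area = implied_die_area_mm2(1024.0, 13.8)   # 1Tb QLC at 13.8Gb/mm^2
print(f"TLC: {tlc_area:.1f} mm^2, QLC: {qlc_area:.1f} mm^2")
```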



Figure 2 - Read/write bandwidth comparison of non-volatile memories.



Figure 3 - Emerging non-volatile memory capacity trend.

#### NAND FLASH MEMORY

NAND flash memory continues to advance towards higher density, lower power and higher performance, resulting in low-cost storage solutions that are replacing traditional magnetic hard-disk storage with solid-state disks (SSDs). 3D memory technology is now the mainstream for NAND flash memory in mass production across the semiconductor industry. Periphery-under-the-array is currently the reference architecture for TLC and QLC: it enables higher bit density and multiple planes for throughput improvement.

The state-of-the-art for TLC uses more than 170-stacked-WL. This year, emphasis is on periphery area reduction, IO speed improvement (up to 2.0Gb/s) and independent-plane reads for random read performance improvement. One paper, related to QLC, reports the highest bit density of 13.8Gb/mm<sup>2</sup> with 144-stacked-WL and a program throughput of 40MB/s.

Figure 4 shows the observed trend in NAND Flash capacities at ISSCC over the past 20 years.



Figure 4 - NAND flash memory capacity trend.

# HISTORICAL TRENDS IN TECHNICAL THEMES: INNOVATIVE TOPICS

Imagers/MEMS/Medical/Displays Subcommittee and Technology Directions Subcommittee



### IMMD – 2021 Trends (Medical)

### Subcommittee Chair: Chris Van Hoof, IMEC, Leuven, Belgium

As illustrated at ISSCC 2021, biomedical systems for on-body wearable and in-body implantable use continue to evolve toward more robust, functionally complex, and energy-efficient solutions. These wearable and implantable SoCs can record weak biopotential signals in the presence of real-life interference and under stringent power and size constraints. These new SoCs and corresponding techniques for wireless power/data transfer pave the way toward robust microdevices that enable multimodal physiological recordings in both wearable and implantable fashion from nearly every major organ system.

The state of the art in biomedical integrated circuits and systems has further advanced at ISSCC 2021 with miniaturization, higher sensitivity, higher dynamic range, and interference mitigation as major trends in both implantable and wearable devices, while continuing to improve power efficiency. High-dynamic-range (>130dB) sensing systems improve tolerance to large-amplitude interference and motion artifacts, while new techniques are introduced to sense physiological signals such as PPG and ExG. Miniaturization combined with a high level of integration enables minimally invasive implants with low tissue displacement for interfacing with the body.

Multimodal physiological sensors have the potential to offer improved monitoring and diagnosis of a number of chronic conditions. The form-factor of a low-cost, easy-to-use, wearable device will enable continuous monitoring of multiple vital signs, allowing health tracking outside the hospital. These advances offer tremendous market potential in both the medical and consumer market spaces.

### IMMD – 2021 Trends (Imagers)

### Subcommittee Chair: Chris Van Hoof, IMEC, Leuven, Belgium

The CMOS image-sensor market remains one of the fastest-growing segments in the semiconductor industry, with revenues exceeding \$19 billion USD in 2019 (representing just under 5% of the total semiconductor market) and is expected to exceed \$25 billion by 2023. The pervasion of image sensors continues with multiple devices present in smartphones, as well as wide proliferation in automotive and other consumer applications. BSI and 3D-stacked processes continue to offer improved performance with increased on-chip functionality. Depth-sensing applications continue to mature with clear applications in AR/VR, automotive and mobile, driving innovation based on both SPAD and photodiode technologies, resulting in a market that is predicted to be worth \$18.5 billion by 2023.

At ISSCC 2021, three SPAD-based depth sensors are presented, highlighting further progress in direct time-of-flight detection. Progress has been demonstrated in pixel performance, including the level of in-pixel logic integration, system power and ambient light rejection. Sony will present a 168×63-resolution LiDAR system with 200m demonstrated max ranging distance and operation under 117klux ambient light. This is complemented by devices from UNIST and EPFL demonstrating innovation in pixel read-out and signal processing. In addition to SPAD-based depth sensing, Samsung will report a 4-tap, 3.5µm fast-photodiode indirect time of flight sensor.
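The basic arithmetic behind direct time-of-flight ranging can be sketched as follows: the sensor times a light pulse's round trip, so distance is half the delay times the speed of light. The function names and the 1 ns timing-bin example are hypothetical. Only the 200m range comes from the text.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def round_trip_time_s(distance_m: float) -> float:
    """Time for light to reach a target and return."""
    return 2.0 * distance_m / SPEED_OF_LIGHT_M_PER_S


def range_per_time_bin_m(bin_width_s: float) -> float:
    """Distance resolution corresponding to one timing bin."""
    return SPEED_OF_LIGHT_M_PER_S * bin_width_s / 2.0


print(f"200 m round trip: {round_trip_time_s(200.0) * 1e6:.3f} us")
print(f"1 ns bin: {range_per_time_bin_m(1e-9) * 100:.1f} cm")
```

A 200m target therefore returns light after a little over a microsecond, and centimeter-level depth resolution calls for timing resolution well below a nanosecond.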

A further four intensity sensors demonstrate continued progress in pixel scaling for standard RGB sensors, computational imaging, and embedded sensor intelligence. Sony will demonstrate a 25-fps, 124dB dynamic range SPAD image sensor with motion artifact suppression and low power operation along with a 50.1Mpixel, high-speed stacked BSI sensor featuring column-parallel kT/C cancelling sample and hold delta-sigma ADC. KU Leuven uses in-sensor current-domain MAC operations with a QQVGA convolutional imager to achieve feature extraction and region-of-interest detection, while Nikon have achieved a 17 Mpixel computational imaging sensor with up to 134dB dynamic range. Finally, Samsung demonstrates a further RGB pixel pitch reduction to 0.64µm, while improving or maintaining critical performance parameters.
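To put the dynamic-range figures in perspective, the sketch below converts decibels into a linear signal ratio and into photographic stops. It assumes the 20·log10 amplitude convention commonly used for image-sensor dynamic range, which the text does not spell out. Function names are hypothetical.

```python
import math


def db_to_ratio(dynamic_range_db: float) -> float:
    """Linear max-to-min signal ratio, assuming DR = 20*log10(ratio)."""
    return 10.0 ** (dynamic_range_db / 20.0)


def db_to_stops(dynamic_range_db: float) -> float:
    """Dynamic range in stops (factors of two)."""
    return math.log2(db_to_ratio(dynamic_range_db))


for dr_db in (124.0, 134.0):
    print(f"{dr_db:.0f} dB -> ratio {db_to_ratio(dr_db):.3g}, "
          f"{db_to_stops(dr_db):.1f} stops")
```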

### **Technology Directions – 2021 Trends**

### Subcommittee Chair: Makoto Nagata, Kobe University, Kobe, Japan

Technology innovations bring the promise of enabling new system functionalities or substantially increasing the efficiency of existing ones. Harnessing such innovations for solving tangible real-world problems requires novel system-level solutions. With a focus on envisioning the future, emerging trends in Technology Directions this year at ISSCC 2021 cover a wide range of topics including quantum engineering, optical circuits for FMCW lidar and optogenetics, low-power secure circuits for IoT, and biomedical sensing, stimulation, and harvesting. ISSCC 2021 features four sessions representing the latest technological innovations in the following areas:

**Quantum engineering:** Quantum technologies are emerging as a major multi-disciplinary research topic, including computing, sensing, telecommunications, information technology, and security. Common to these technologies are properties typical of quantum mechanics, such as superposition and entanglement. Recently, engineers have developed techniques to exploit these properties using solid-state circuits, which are employed to control and observe a growing number of quantum devices. Since most quantum devices must be operated at deep cryogenic temperatures, circuits must also operate at these or comparable temperatures, so as to ensure compact, reliable, and, especially, scalable systems. Leveraging over 60 years of CMOS technology development, researchers are increasingly engaged in cryogenic CMOS (cryo-CMOS) circuits and systems to fill this gap. Cryo-CMOS technologies will serve a range of quantum devices that can be used in several quantum engineering problems. ISSCC 2021 will feature a session devoted to some of these topics, including cryo-CMOS SoCs that can drive and read qubits, and a low-power ADC designed for cryo-CMOS systems.

**Low-Power and Secure Circuits for IoT:** The papers from ISSCC 2021 in this area push the frontiers of low-power, secure solutions for IoT applications. One paper presents an ultra-low-power IC designed for communicating with commodity WiFi transceivers via backscattering using a MIMO antenna array. Another introduces a nano-watt class (148nW) always-on wake-up chip for general-purpose IoT devices with a CNN-based intelligent inference engine. A third describes a method of physical-layer identification of wireless devices using out-of-band spectral regrowth techniques.

**Biomedical Devices, Circuits, and Systems:** ISSCC 2021 includes innovative and emerging biomedical systems that traverse device-, circuit-, and system-level design. The developing trends this year encompass advancements in implantable neural stimulators, bio-molecular sensing, thermal actuation/sensing, and wearable sensors/stimulators. The technologies demonstrate the promise to advance artificial retinal prostheses, end-to-end point-of-care diagnostic platforms, hyperthermia cancer therapy, and chronic wound healing process monitoring.

**Optical Systems for Emerging Applications:** Optical technologies bring new sensing and actuation modalities critical to emerging applications. The papers from ISSCC 2021 in this area demonstrate the progression of such technologies for increased robustness and system-level integration. The co-integration of optical/photonic devices with CMOS offers advancements in the critical application domains of automation/autonomy and biomedical. One paper describes an optical phased array for FMCW lidar with on-chip self-calibration capability based on Si-photonics technology. Another introduces a mechanically flexible, implantable neural device with integrated blue and green µLED arrays for fluorescence computational imaging and optogenetic stimulation. A third describes a MEMS-based dynamic light focusing system for single-cell precision in optogenetics.





A paper number mentioned in this section follows the convention S.P, where S is the session number and P is the paper number. For example 23.2 will be the second paper in the twenty-third session. You can refer back to the TECHNICAL SESSION OVERVIEWS in this Press Kit for additional details on any given paper. Some of the papers will also be available in the "not-so-technical" SESSION HIGHLIGHTS part of this Press Kit. All sessions and papers are in ascending order in both the Session Overviews and the Session Highlights sections of the Press Kit.

### **Technical Topics Mapped to Papers**

| Technical Topic                                                                                                                         | All papers in the following Sessions |
|-----------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------|
| Communication Systems                                                                                                                   | 6, 8, 11, 14, 20, 21, 22, 23, 26, 32 |
| Analog Systems<br>includes Analog, Power Management and Data<br>Converter Subcommittees                                                 | 5, 10, 17, 27, 31, 33                |
| <b>Digital Systems</b><br>includes Memory, Digital Circuits, Machine Learning and<br>AI, Digital Architectures and Systems Subcommittees | 4, 9, 15, 16, 24, 25, 29, 30, 35, 36 |
| Innovative Topics<br>includes Imagers/MEMS/Medical Devices/Displays and<br>Technology Directions Subcommittees                          | 7, 12, 13, 18, 19, 28, 34            |

### Selected Presenting Companies/Institutions Mapped to Papers

### Chart 4.1

| Affiliation                                                       | Paper Numbers         |
|-------------------------------------------------------------------|-----------------------|
| 4Catalyzer                                                        | 34.1                  |
| ADAPS Photonics                                                   | 7.4                   |
| Advanced Institute of Information Technology of Peking University | 5.1, 12.1             |
| AMD                                                               | 3.1                   |
| Ampleon                                                           | 6.5                   |
| Analog Devices                                                    | 2.1, 31.1, 31.4, 33.3 |
| ARM                                                               | 9.8                   |
| Baidu                                                             | 3.3                   |
| Broadcom                                                          | 12.2                  |
| Broadcom Netherlands                                              | 6.2, 13.3             |
| Bronkhorst BV                                                     | 5.7                   |
| Butterfly Network                                       | 34.1                                                                |
| Cadence                                                 | 11.3                                                                |
| CEA-Léti                                                | 35.2                                                                |
| Central Semiconductor Manufacturing Corporation         | 33.2                                                                |
| Chan Zuckerberg Biohub                                  | 19.3                                                                |
| Columbia University                                     | 6.6, 9.9, 11.4, 19.2, 21.4, 26.7                                    |
| Cornell University                                      | 9.8                                                                 |
| Daegu Gyeongbuk Institute of Science and Technology     | 34.4                                                                |
| Dartmouth College                                       | 17.1, 33.8                                                          |
| Delft University of Technology                          | 5.3, 5.4, 5.6, 5.7, 6.2, 6.5, 13.3, 13.4, 14.4,<br>31.2, 31.3, 31.4 |
| Dolphin Design                                          | 35.2                                                                |
| East China Research Institute of Electronic Engineering | 14.6                                                                |
| Eindhoven University of Technology                      | 14.6                                                                |
| Endolfin                                                | 34.4                                                                |
| EPFL                                                    | 7.4, 13.2, 13.3, 13.4                                               |
| ETH Zurich                                              | 4.4, 17.3                                                           |
| eTopus Technology                                       | 8.5                                                                 |
| Everactive                                              | 21.3                                                                |
| Ewha Womans University                                  | 21.1                                                                |
| Facebook                                                | 7.4                                                                 |
| Foundation Devices                                      | 8.1                                                                 |
| Fudan University                                        | 24.2                                                                |
| Georgia Institute of Technology                         | 26.1, 26.3, 27.3, 29.1                                              |
| Globalfoundries                                         | 35.2                                                                |
| Google                                                  | 2.3                                                                 |
| Greenwaves Technologies                                 | 4.4                                                                 |
| Hahn-Schickard                                          | 28.7                                                                |
| Hanyang University                                      | 8.2, 9.1                                                            |
| Harvard University                                      | 9.8                                                                 |
| Hitachi                                                 | 4.6, 13.2                                                           |
| Huawei Technologies                                     | 8.4, 8.8                                                            |
| IBM                                                              | 9.1, 24.1                                                                           |  |  |
|------------------------------------------------------------------|-------------------------------------------------------------------------------------|--|--|
| IBM Research                                                     | 8.3, 9.1, 24.1                                                                      |  |  |
| IBM Systems and Technology                                       | 8.3                                                                                 |  |  |
| imec                                                             | 26.4, 28.3                                                                          |  |  |
| imec - Holst Centre                                              | 28.3                                                                                |  |  |
| imec-Netherlands                                                 | 21.2                                                                                |  |  |
| Industrial Technology Research Institute                         | 16.3                                                                                |  |  |
| Infineon Technologies                                            | 2.3, 5.6, 32.8                                                                      |  |  |
| Inphi                                                            | 8.6, 8.7, 29.4                                                                      |  |  |
| Institute of Microelectronics                                    | 28.5                                                                                |  |  |
| Institute of Microelectronics of the Chinese Academy of Sciences | 24.2, 33.5                                                                          |  |  |
| Institute of Microelectronics of Tsinghua University             | 9.2, 14.5                                                                           |  |  |
| Instituto Superior Tecnico/University of Lisboa                  | 20.1, 27.6                                                                          |  |  |
| Intel                                                            | 8.1, 9.2, 10.6, 10.7, 11.5, 11.6, 11.9, 13.1, 16.2, 17.4, 29.3, 30.2, 32.6, 36.2 |  |  |
| Intuitive Surgical                                               | 7.4                                                                                 |  |  |
| Iowa State University                                            | 33.6                                                                                |  |  |
| KAIST                                                            | 5.8, 17.6, 17.8, 18.1, 23.4, 28.6, 32.1, 32.4, 33.7, 34.4                           |  |  |
| Kilby Labs, Texas Instruments                                    | 5.5                                                                                 |  |  |
| KIOXIA                                                           | 30.4                                                                                |  |  |
| Korea Aerospace Research Institute                               | 28.6                                                                                |  |  |
| Korea Institute of Science and Technology                        | 7.4                                                                                 |  |  |
| KU Leuven                                                        | 28.3                                                                                |  |  |
| KU Leuven - MICAS                                                | 9.4, 22.3, 23.3, 26.2                                                               |  |  |
| Kyungpook National University Hospital                           | 34.4                                                                                |  |  |
| Marvell                                                          | 11.8                                                                                |  |  |
| Massachusetts Institute of Technology                            | 5.5, 6.7, 11.9                                                                      |  |  |
| MediaTek                                                         | 4.1, 11.1, 27.5, 35.1                                                               |  |  |
| Micron Semiconductor                                             | 25.3                                                                                |  |  |
| Micron Technology                                                | 25.3                                                                                |  |  |
| Microsoft                                                        | 3.1                                                                                 |  |  |
| Mojo Vision                                                      | 34.2                                                                                |  |  |
| Nanjing Low Power IC Technology Institute                        | 29.8                                                                                |  |  |
| Nanovision Biosciences                         | 18.1                   |  |  |
| Nanyang Technological University               | 6.7, 9.7, 14.2, 29.2   |  |  |
| National Chiao Tung University                 | 17.7, 17.9, 18.4, 33.1 |  |  |
| National Chung Hsing University                | 18.4                   |  |  |
| National Institute of Standards and Technology | 19.3                   |  |  |
| National Taiwan University                     | 4.7, 29.5              |  |  |
| National Tsing Hua University                  | 15.2, 15.4, 16.1, 16.3 |  |  |
| National University of Singapore               | 5.2, 28.5, 36.1        |  |  |
| Nations Innovation Technologies                | 9.7                    |  |  |
| New York University Abu Dhabi                  | 33.7, 34.4             |  |  |
| Nikon                                          | 7.8                    |  |  |
| Northwestern University                        | 15.3                   |  |  |
| now with Apple                                 | 10.2, 10.7             |  |  |
| now with Renesas                               | 21.3                   |  |  |
| NTT                                            | 22.2                   |  |  |
| Nvidia                                         | 3.2                    |  |  |
| NXP Semiconductors                             | 17.2                   |  |  |
| Oklahoma State University                      | 23.1                   |  |  |
| Oregon State University                        | 6.6, 14.3, 21.4        |  |  |
| Peking University                              | 5.1, 12.1, 27.7, 32.5  |  |  |
| Pi2star Technology                             | 15.2                   |  |  |
| Pohang University of Science and Technology    | 36.4                   |  |  |
| Politecnico di Milano                          | 32.3, 32.8             |  |  |
| Politecnico di Torino                          | 5.2                    |  |  |
| Princeton University                           | 15.1, 18.2, 22.1       |  |  |
| Purdue University                              | 36.2                   |  |  |
| Qualcomm                                       | 23.4, 35.3             |  |  |
| Quantum Motion Technologies                    | 13.2                   |  |  |
| QuTech                                         | 13.3, 13.4             |  |  |
| Rambus                                         | 11.2                   |  |  |
| Raytheon                                       | 11.9                   |  |  |
| Realtek Semiconductor                          | 17.7, 17.9, 33.1       |  |  |
| Renesas Electronics                                | 4.2                                                                                                   |  |  |
| Rice University                                    | 12.3, 18.3, 19.2, 36.5                                                                                |  |  |
| Samsung Advanced Institute of Technology           | 9.5                                                                                                   |  |  |
| Samsung Electronics                                | 4.8, 6.1, 7.1, 7.9, 9.5, 10.1, 10.5, 24.3, 25.2, 25.4, 28.2, 29.6, 30.3, 32.1, 32.2, 32.4, 33.7, 33.9 |  |  |
| Samsung Semiconductor                              | 32.2                                                                                                  |  |  |
| Seoul National University                          | 9.3                                                                                                   |  |  |
| Silicon Austria Labs                               | 14.6                                                                                                  |  |  |
| Singapore University of Technology and Design      | 14.2                                                                                                  |  |  |
| SK hynix Semiconductor                             | 25.1, 30.1                                                                                            |  |  |
| SolidVue                                           | 7.2                                                                                                   |  |  |
| Sony Depthsensing Solutions                        | 7.3                                                                                                   |  |  |
| Sony LSI Design                                    | 7.3                                                                                                   |  |  |
| Sony Semiconductor Israel                          | 9.6                                                                                                   |  |  |
| Sony Semiconductor Manufacturing                   | 7.5, 7.6                                                                                              |  |  |
| Sony Semiconductor Solutions                       | 7.3, 7.5, 7.6, 9.6                                                                                    |  |  |
| South China University of Technology               | 6.7                                                                                                   |  |  |
| Southeast University                               | 29.8, 33.2                                                                                            |  |  |
| Southern University of Science and Technology      | 8.5                                                                                                   |  |  |
| Southwest Integrated Circuit Design                | 14.6                                                                                                  |  |  |
| STMicroelectronics                                 | 17.3, 22.4                                                                                            |  |  |
| Sungkyunkwan University                            | 7.2                                                                                                   |  |  |
| Texas Instruments                                  | 2.2, 5.5                                                                                              |  |  |
| TNO                                                | 13.3                                                                                                  |  |  |
| Tokyo Institute of Technology                      | 22.2, 32.7                                                                                            |  |  |
| TOPPAN TECHNICAL DESIGN CENTER CO., LTD.           | 4.6                                                                                                   |  |  |
| Total Design Service Co,LTD                        | 4.6                                                                                                   |  |  |
| Tsinghua University                                | 8.5, 10.4, 15.2, 15.4, 20.3, 27.1, 27.4, 27.7                                                         |  |  |
| TSMC                                               | 16.1, 16.3, 16.4, 24.4                                                                                |  |  |
| TSMC Corporate Research                            | 29.1                                                                                                  |  |  |
| TSMC Design Technology                             | 29.1                                                                                                  |  |  |
| Tufts University                                   | 9.8                                                                                                   |  |  |
| Ulsan National Institute of Science and Technology | 7.2, 32.1, 32.4                                                                                       |  |  |
| Université catholique de Louvain                         | 7.7                                                                             |  |  |
| University of Texas                                      | 23.1                                                                            |  |  |
| University College Dublin                                | 32.3, 32.8                                                                      |  |  |
| University of Bologna                                    | 4.4                                                                             |  |  |
| University of British Columbia                           | 32.5                                                                            |  |  |
| University of California                                 | 4.3, 6.3, 8.2, 10.2, 11.7, 12.2, 14.1, 14.7, 17.5, 18.1, 19.3, 23.2, 28.1, 28.4 |  |  |
| University of Cambridge                                  | 13.2                                                                            |  |  |
| University of Electronic Science and Technology of China | 4.5, 15.4, 20.2, 26.5                                                           |  |  |
| University of Freiburg - IMTEK                           | 28.7                                                                            |  |  |
| University of Macau                                      | 20.1, 27.6                                                                      |  |  |
| University of Michigan                                   | 10.3, 14.8, 22.4, 27.2                                                          |  |  |
| University of Minnesota                                  | 6.4                                                                             |  |  |
| University of Pisa                                       | 13.4                                                                            |  |  |
| University of Science and Technology of China            | 14.6, 33.5                                                                      |  |  |
| University of Southern California                        | 19.1, 26.6, 29.4                                                                |  |  |
| University of Texas                                      | 10.4, 16.2, 27.1, 27.4, 27.7                                                    |  |  |
| University of Toronto                                    | 17.2, 28.8                                                                      |  |  |
| University of Twente                                     | 5.7                                                                             |  |  |
| University of Virginia                                   | 21.5                                                                            |  |  |
| University of Washington                                 | 19.2, 29.7                                                                      |  |  |
| University of Wuppertal                                  | 34.3                                                                            |  |  |
| Waseda University                                        | 36.3                                                                            |  |  |
| Western Digital                                          | 30.4                                                                            |  |  |
| Xiamen University                                        | 33.5                                                                            |  |  |
| Xidian University                                        | 27.1                                                                            |  |  |
| XINYI Information Technology                             | 12.1                                                                            |  |  |
| Yonsei University                                        | 5.3, 21.1, 31.2                                                                 |  |  |
| Zhejiang Lab                                             | 24.2                                                                            |  |  |
| Zhejiang University                                      | 5.1, 12.1, 14.7, 20.2, 33.4                                                     |  |  |

# **CONTACT INFORMATION**



| Subcommittee | Chair | Affiliation | Work Phone | Email | Press Designates |
|--------------|-------|-------------|------------|-------|------------------|
| ANALOG | Kofi Makinwa | Delft University of Technology | +31-15-27-86466 | K.a.a.makir | NA/EU: Drew Hall; FE: Man-Kay Law |
| DATA CONVERTERS | Michael Flynn | University of Michigan Ann Arbor |  | mpflynn@umich.edu | NA/EU: Jan Westra; FE: Takashi Oshima |
| DIGITAL ARCHITECTURES & SYSTEMS | Thomas Burd | AMD | 408-749-2805 | tom.burd@amd.com | NA/EU: Junho Huh; FE: |
| DIGITAL CIRCUITS | Keith Bowman | Qualcomm | 919-237-47(50 | kbowman@qti.qualcomm.com | NA: Alicia Klinefelter; EU: Yvain Thonnart; FE: Koji Hirairi |
| IMAGERS, MEMS, MEDICAL & DISPLAYS | Chris Van Hoof | imec | +32-16-281815 | chris.vanhoof@imec.be | NA/EU: Johan Vanderhaegan; FE: Masayuki Miyamoto |
| MACHINE LEARNING & AI | Marian Verhelst | KU Leuven | 0617 | marian.verhelst@kuleuven.be | NA/EU: Geoff Burr; FE: Masato Motomura |
| MEMORY | Jonathan Chang | TSMC | +886-3-5636688 ext 7125890 | jon_chang@tsmc.com | NA/EU: Violante Moschiano; FE: Shinichiro Shiratake |
| POWER MANAGEMENT | Yogesh Ramadass | Texas Instruments | 669-721-6737 | yogesh.ramadass@ti.com | NA/EU: Johan Janssens; FE: Chan-Hong Chern |
| RF | Jan Craninckx | IMEC | +32-16-28-87-56 | jan.craninckx@imec.be | NA/EU: Jim Buckwalter; FE: Conan Zhan |
| TECHNOLOGY DIRECTIONS | Makoto Nagata | Kobe University | +81-78-803-6569 | nagata@cs.kobe-u.ac.jp | NA/EU: Denis Daly; FE: Munehiko Nagatani |
| WIRELESS | Stefano Pellerano | Intel | 503-712-4576 | stefano.pellerano@intel.com | NA/EU: Maryam Tabesh; FE: Yuan-Hung Chung |
| WIRELINE | Frank O'Mahony |  | 503-613-1467 | frank.omahony@intel.com | NA/EU: Thomas Toifl; FE: Byungsub Kim |

## **Program Chair, ISSCC 2021**

Makoto Ikeda, University of Tokyo

Work Phone: +81-3-5841-8901

Email: ikeda@silicon.t.u-tokyo.ac.jp

## **Program Vice-Chair, ISSCC 2021**

Edith Beigné, Facebook

Work Phone: 650-709-8127

Email: edith.beigne@gmail.com

## **Press Coordinator**

Shahriar Mirabbasi, University of British Columbia

Email: shahriar@ece.ubc.ca

## **Press-Relations Liaison**

Kenneth C. Smith, University of Toronto

Work Phone: 416-418-3034

Email: lcfujino@aol.com



Editor-in-Chief: Shahriar Mirabbasi

Editor-at-Large: Kenneth C. (KC) Smith

Publisher: Laura Chizuko Fujino

