In a World Ruled by Software, Hardware is Still King
By Dr. Anthony Gallippi
Senior Director of System Architecture and Hardware Development, NVXL Technology, Inc.

Consumers are enamored with hot new technology, everything from driverless cars to robots so realistic they can pass the Turing test. The reality, however, is a little more mundane. Software innovation drives artificial intelligence, which in turn drives new and exciting applications. But without the foundation of hardware, these technologies would never be realized and would never have the opportunity to change the world and our lives for the better.

The History of High Tech Hardware

The legacy of modern hardware began in earnest with the advent of “third-generation” computers in the 1960s. This quickly led to the creation of the microprocessor in the form of the Intel 4004. While the earliest microprocessor ICs contained only the processor (i.e., the central processing unit) of a computer, continued increases in integration led to chips containing most or all of a computer’s internal electronic parts.

In 1975, Olivetti launched the P6060 (see Figure 1), the world’s first complete, pre-assembled personal computer system. It had one or two 8-inch floppy disk drives, a 32-character plasma display, an 80-column graphical thermal printer, 48 Kbytes of RAM, BASIC language, and weighed in at a hefty 88 pounds. From 1975 to 1977, most microcomputers were actually marketed and sold as kits for do-it-yourselfers. Pre-assembled systems did not become popular with consumers until the introduction of the Apple II and others.

Figure 1. The P6060 weighed in at nearly 90 pounds. Computers have since gone on an extreme diet, with current laptops tipping the scales at a slim 2–6 pounds.

During the 1980s, CMOS logic gates developed into devices that could be made as fast as other circuit types, which allowed computer power consumption to be reduced dramatically. Unlike a gate based on other logic types, which draws current continuously, a CMOS gate draws significant current only during the transition between logic states, apart from leakage. More recently, multi-core CPUs became commercially available. Content-addressable memory (CAM) has become inexpensive enough to be used in networking, and is frequently used for on-chip cache memory in modern microprocessors.
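The reason switching activity dominates CMOS power can be illustrated with the standard first-order model of dynamic power, P = a * C * V^2 * f. The Python sketch below is illustrative only: the function name and all of the numeric values are assumptions chosen for the example, and leakage is ignored.

```python
def cmos_dynamic_power(activity: float, capacitance_f: float,
                       voltage_v: float, frequency_hz: float) -> float:
    """First-order switching power in watts: P = a * C * V^2 * f.

    activity      -- fraction of the capacitance switched per clock cycle
    capacitance_f -- total switched capacitance in farads
    voltage_v     -- supply voltage in volts
    frequency_hz  -- clock frequency in hertz
    """
    return activity * capacitance_f * voltage_v ** 2 * frequency_hz


# Hypothetical chip: 10% activity, 1 nF switched capacitance, 2 GHz clock.
baseline = cmos_dynamic_power(0.1, 1e-9, 1.2, 2e9)
lower_v = cmos_dynamic_power(0.1, 1e-9, 0.9, 2e9)

print(f"1.2 V supply: {baseline:.3f} W")
print(f"0.9 V supply: {lower_v:.3f} W")
print(f"Power ratio:  {lower_v / baseline:.4f}")
```

Because voltage enters the model squared, the hypothetical drop from 1.2 V to 0.9 V cuts switching power by more than 40 percent at the same clock rate, and an idle gate (activity of zero) contributes no switching power at all.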

Trends in High Tech Hardware

These developments have turned computing into a pervasive commodity that is part of our daily lives. The SoC (system on a chip) has compressed even more integrated circuitry into a single chip, and SoCs are enabling phones and PCs to converge into single hand-held wireless mobile devices.

Power consumption has long been both a concern and a design driver in hardware. In 2006, servers consumed 1.5% of the total U.S. energy budget. This figure doubled by 2011. Even with energy-saving technologies proliferating across the industry, the energy consumption of data centers continues to grow. See Figure 2.

Figure 2. There is no slowdown in sight for power needs in data centers.

Last year, MIT’s Technology Review reported that IBM has created a 50-qubit computer that can preserve its quantum state for 90 microseconds. Conventional digital computing requires that data be encoded into binary digits (bits), each of which is always in one of two definite states (0 or 1). By contrast, quantum computation uses quantum bits, or qubits, which can be in superpositions of states. Large-scale quantum computers would theoretically be able to solve certain problems much more quickly than the best known classical computers, and may be able to efficiently solve problems that are not practically feasible on classical computers.
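To make the contrast between a bit and a qubit concrete, the minimal Python sketch below models a single qubit as a pair of complex amplitudes and simulates measuring it. The function names are illustrative and do not come from any quantum computing library; this is a classical toy model, not a description of how IBM's machine works.

```python
import math
import random


def measurement_probabilities(alpha: complex, beta: complex):
    """Return (P(0), P(1)) for the state alpha|0> + beta|1>.

    The amplitudes are normalized here, so any nonzero pair is accepted.
    """
    norm = abs(alpha) ** 2 + abs(beta) ** 2
    return abs(alpha) ** 2 / norm, abs(beta) ** 2 / norm


def measure(alpha: complex, beta: complex, rng: random.Random) -> int:
    """Simulate one measurement: the superposition collapses to 0 or 1."""
    p0, _ = measurement_probabilities(alpha, beta)
    return 0 if rng.random() < p0 else 1


# An equal superposition of 0 and 1.
amp = 1 / math.sqrt(2)
p0, p1 = measurement_probabilities(amp, amp)
print(f"P(0) = {p0:.2f}, P(1) = {p1:.2f}")

# Repeated measurements of freshly prepared qubits give a mix of 0s and 1s.
rng = random.Random(0)
samples = [measure(amp, amp, rng) for _ in range(1000)]
print(f"Fraction of 1s in 1000 measurements: {sum(samples) / len(samples):.3f}")
```

A classical bit corresponds to the special cases (1, 0) and (0, 1). Describing n qubits this way takes 2^n amplitudes, which is why simulating even 50 qubits on classical hardware is so demanding.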

New generations of hardware demonstrate that the high tech industry is nothing if not resourceful. Hardware will strive to keep up with compute, storage and networking trends, keeping a pace similar to Moore’s law. See Figure 3.

Figure 3. Moore’s Law.
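The arithmetic behind a Moore's-law style trend is simple compound doubling. The Python sketch below assumes the commonly quoted two-year doubling period and uses the Intel 4004's roughly 2,300 transistors as a starting point; the function name and the 40-year horizon are choices made for this example, and the result is a projection, not a count for any real chip.

```python
def projected_transistors(start_count: float, years: float,
                          doubling_period_years: float = 2.0) -> float:
    """Project a transistor count forward assuming steady doubling."""
    return start_count * 2 ** (years / doubling_period_years)


INTEL_4004_TRANSISTORS = 2300  # approximate count for the Intel 4004

# 40 years at one doubling every two years is 20 doublings.
projection = projected_transistors(INTEL_4004_TRANSISTORS, 40)
print(f"Projected count after 40 years: {projection:,.0f}")
```

Twenty doublings multiply the starting count by about a million, taking a few thousand transistors into the billions.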

What Will High Tech Hardware Enable?

So, as hardware strives to keep pace, what will this enable? The world is waiting for driverless cars, but other, equally practical innovations are on the horizon. From wellness and location monitoring in elderly communities, to smart city monitoring using facial recognition, to drones for public safety, hardware innovation is on the rise. With respect to supporting the aging population, robots can provide elderly care including mobility, companionship, medication support and monitoring.

In the medical field, robots and AI-driven machines are being developed in ways imagined in science fiction years ago. For amputees and even some people with spinal cord injuries, connecting limbs to the central nervous system so that the brain controls the limb will mean the difference between being disabled and being mobile. The bionic man or woman is not far from reality, although multi-disciplinary engineering advances are still needed to fully realize the potential of these sorts of biomedical solutions.

Compute Acceleration

While the promise of applications such as driverless vehicles is compelling, technology challenges are slowing progress. The biggest problem facing deep learning, for example, is that these evolving technologies require massive amounts of compute power. Deep learning models require terabytes or even petabytes of training data and, for inference, must serve hundreds of thousands of requests per second with response times under tens of milliseconds. With enough compute power, these applications could rapidly identify trends and patterns that would otherwise be too difficult or time-consuming to detect, and take appropriate action. If the industry can address this insatiable need for compute power, then deep learning will become a game changer.
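A back-of-the-envelope calculation shows why these figures are demanding. The Python sketch below applies Little's law (requests in flight = arrival rate x latency) to the inference numbers and a simple size-over-rate estimate to the training data. The specific values (200,000 requests per second, 10 ms, 1 PB, 10 GB/s) are assumptions picked from within the ranges mentioned above, not measurements.

```python
def in_flight_requests(requests_per_second: float, latency_seconds: float) -> float:
    """Little's law: average number of requests being processed at once."""
    return requests_per_second * latency_seconds


def read_time_hours(dataset_bytes: float, bytes_per_second: float) -> float:
    """Time to stream a dataset once at a sustained rate, in hours."""
    return dataset_bytes / bytes_per_second / 3600


# Hypothetical inference load: 200,000 requests/s at 10 ms per request.
concurrent = in_flight_requests(200_000, 0.010)
print(f"Requests in flight: {concurrent:,.0f}")

# Hypothetical training set: 1 PB streamed at a sustained 10 GB/s.
hours = read_time_hours(1e15, 10e9)
print(f"One pass over 1 PB at 10 GB/s: {hours:.1f} hours")
```

Under these assumptions the system must keep about two thousand requests in progress at every instant, and a single pass over the training data takes more than a day even at a sustained 10 GB/s.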

NVXL is developing a software-defined compute solution called polymorphic acceleration. Along with this solution, we are engineering hardware, modules and racks to bring this technology to market. NVXL’s Plug-in Configurable Accelerator (PCA) is an Arria 10 FPGA-based 2.5-inch U.2 module that connects to the host over a high-bandwidth PCIe G3 x4 interface. PCAs can be plugged into any server that supports standard 2.5-inch U.2 drive slots, and a standard 2U NVMe server can support up to 24 PCAs. The PCA module uses OpenCL 1.2 to communicate with the host processor over that interface. It accelerates compute-intensive tasks and also supports peer-direct read/write access for reduced latency, higher throughput and reduced host CPU utilization. The applications that can leverage this solution are therefore far ranging.

Figure 4. NVXL platform enables acceleration for compute-intensive algorithms.
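As a rough sizing aid for the PCIe G3 x4 host link described above, the Python sketch below computes the theoretical link bandwidth from the PCIe 3.0 signaling rate (8 GT/s per lane) and its 128b/130b line encoding. The aggregate figure assumes that each of the 24 slots has its own dedicated x4 link and ignores packet and protocol overhead, so it is an upper bound under those assumptions rather than an NVXL performance figure.

```python
GEN3_TRANSFERS_PER_SECOND = 8e9      # PCIe 3.0: 8 GT/s per lane
GEN3_ENCODING_EFFICIENCY = 128 / 130  # 128b/130b line encoding


def pcie_gen3_bandwidth_gb_per_s(lanes: int) -> float:
    """Theoretical one-direction bandwidth in GB/s, before protocol overhead."""
    bits_per_second = GEN3_TRANSFERS_PER_SECOND * GEN3_ENCODING_EFFICIENCY * lanes
    return bits_per_second / 8 / 1e9


per_module = pcie_gen3_bandwidth_gb_per_s(4)

# Assumption: all 24 U.2 slots have dedicated, non-shared x4 links.
aggregate = 24 * per_module

print(f"One x4 link:       {per_module:.2f} GB/s")
print(f"24 modules (max.): {aggregate:.1f} GB/s")
```

Real throughput will be lower once packet headers, flow control and the host's PCIe topology are taken into account.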

NVXL’s FlexAccel technology in the PCA provides hardware acceleration for compute-intensive algorithms and functions such as CNN/DNN, genomics, transcoding, BLAS, encryption, compression and database query, among others, and delivers an order-of-magnitude performance boost for applications that make use of them. See Figure 4.

The FPGAs used in the PCA modules support both static and dynamic reconfiguration by loading a different bitstream (FPGA image) file that provides the required functionality. Using this capability, the PCA supports seamless mode switch-over to the desired functionality for any application in a few seconds. The mode switch-over feature also helps consolidate the infrastructure silos built for different applications, improving efficiency, reducing datacenter footprint and lowering Total Cost of Ownership (TCO).
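Whether a switch-over of a few seconds is negligible depends on how long each workload runs between switches. The Python sketch below is a simple illustrative model, not an NVXL tool: the function name, the 3-second switch time and the two job lengths are all assumptions made for the example.

```python
def useful_fraction(job_seconds: float, switch_seconds: float) -> float:
    """Fraction of wall-clock time spent computing when every job is
    preceded by one reconfiguration of the accelerator."""
    return job_seconds / (job_seconds + switch_seconds)


SWITCH_SECONDS = 3.0  # assumed value standing in for "a few seconds"

long_jobs = useful_fraction(300.0, SWITCH_SECONDS)   # 5-minute jobs
short_jobs = useful_fraction(10.0, SWITCH_SECONDS)   # 10-second jobs

print(f"5-minute jobs:  {long_jobs:.1%} of time spent computing")
print(f"10-second jobs: {short_jobs:.1%} of time spent computing")
```

In this model, sharing one pool of accelerators across applications costs little when jobs run for minutes, while very short jobs are better batched by mode so that switches happen less often.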

Details of NVXL’s rack solution appear in Table 1.

NVXL Solution

Server CPU                  2x Intel Xeon Scalable Processors, 6130, 2.1GHz 16-Core CPUs
Memory                      192GB or 384GB, DDR4-2666MHz RDIMMs
Storage                     2x 256GB M.2 NVMe Flash, Boot
Network Connectivity        4x 10GBase-T, 1x IPMI
                            Optional 2x dual-port 40/50GbE or 2x 100GbE NIC cards
Equipment Height            2RU
2.5” U.2 Drive Slots        24x
Expansion Boxes (JBOAs *)   2x JBOAs
                            (4-20x single-wide, 4-8x double-wide accelerators per JBOA)
Operating Environment       NVXL Acceleration Platform
OS Support                  Ubuntu 16.04
Management                  Web-based Mgmt. GUI & CLI

Table 1. NVXL DS-1 Universal Acceleration Server (UAS) specifications.


Want to talk about your hardware needs and compute acceleration?
Contact me at: and let’s talk.
