Introduction

Compilers traditionally optimize for code speed or size. But increasingly the energy efficiency of the generated code is a priority, whether to extend the battery life of devices, to reduce energy consumption in data centers, or to get the most computation out of energy scavenging devices.

Embecosm offers the first compilers and compiler optimizations that can optimize for energy. The technology combines Embecosm’s MAGEEC machine learning framework for GCC and LLVM with optimizations specifically aimed at improving energy efficiency. The result is compiled code that uses less energy than can be achieved with any existing compiler.

  • Optimize to generate energy efficient code
  • Outperforms all existing compilers
  • Can use Embecosm’s low cost energy measurement board or third party tools
  • Available today for LLVM and GCC

Technical Details

Embecosm’s optimizations for energy efficiency build on the machine learning optimization frameworks developed under the MAGEEC and TSERO projects. These allow an existing LLVM or GCC compiler to be tuned for any quantifiable optimization criterion. Energy consumption by computer systems is now easy to measure. For small embedded systems, the Embecosm energy measurement board can be used. Larger systems use the energy measurement features built into modern processors, whose readings can be collected by tools such as ARM’s Allinea Forge MAP. The machine learning framework can then be trained to optimize the energy efficiency of the compiled code.

With such detailed insight into energy efficiency, Embecosm has been able to design optimizations specifically for energy efficiency.

  • For deeply embedded systems, where code is executed directly from flash memory, it is important to align frequently executed innermost loops within a single bit line. Avoiding lighting up a second bit line can reduce the energy consumption of such loops by 12%. Implementing this optimization requires changes to both the compiler and linker, including a two-pass link-time optimization process.
  • Larger embedded systems often have a choice between executing in RAM or flash memory, although the amount of RAM is usually far smaller than the amount of flash. RAM uses far less energy than flash memory, so for heavily executed code it is more energy efficient to copy the code into RAM before execution. The initial copy carries a performance overhead, but energy usage can be significantly reduced.
  • On larger systems, there are often multiple ways to achieve the same result.  On some larger ARM systems it can be more energy efficient to always use the NEON vector co-processor for multiplication, even when handling scalars.
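
As a rough illustration of the first two points above, the sketch below checks whether a loop's code sits within a single flash line and estimates how many executions are needed before copying code into RAM pays for itself. The line size, the energy figures and all function names are hypothetical and are not taken from Embecosm's tools. The break-even model simply assumes a fixed energy cost per execution.

```python
import math
from typing import Optional


def lines_touched(start: int, size: int, line_bytes: int) -> int:
    """Count the flash lines covered by `size` bytes of code at address `start`."""
    if size <= 0:
        return 0
    first = start // line_bytes
    last = (start + size - 1) // line_bytes
    return last - first + 1


def padding_to_single_line(start: int, size: int, line_bytes: int) -> Optional[int]:
    """Bytes of padding to insert before a loop so that it occupies one line.

    Returns None when the loop is larger than a line and so can never fit.
    """
    if size > line_bytes:
        return None
    if lines_touched(start, size, line_bytes) <= 1:
        return 0
    # Move the loop up to the start of the next line.
    return (-start) % line_bytes


def break_even_runs(copy_energy: float,
                    flash_energy_per_run: float,
                    ram_energy_per_run: float) -> Optional[int]:
    """Executions needed before a one-off copy into RAM saves energy overall.

    Returns None when running from RAM is not cheaper per execution.
    """
    saving = flash_energy_per_run - ram_energy_per_run
    if saving <= 0:
        return None
    return math.ceil(copy_energy / saving)


if __name__ == "__main__":
    # Hypothetical 16-byte line, 8-byte loop starting at address 12.
    print(lines_touched(12, 8, 16))           # 2
    print(padding_to_single_line(12, 8, 16))  # 4
    # Hypothetical energies in arbitrary units.
    print(break_even_runs(100.0, 5.0, 3.0))   # 50
```

In a real toolchain the alignment decision needs the final addresses, which are only known at link time. This is consistent with the need for compiler and linker changes described above.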

More recently, research within the TSERO project and development of the GSO 2.0 superoptimization framework have allowed us to start considering superoptimization for energy efficiency. This work is still at an early stage, but it has the potential to achieve even greater energy efficiency than machine learning optimization.

Case Study

Thanks to Embecosm’s experience, unique skills and novel techniques we were able to complete ambitious research that required searching an enormous computational space.
Dr Simon Hollis University of Bristol

The Machine Guided Energy Efficient Compiler (MAGEEC) project was an InnovateUK supported research program between University of Bristol and Embecosm, with the aim of making machine learning feasible in commercial compilers, specifically for generating energy efficient code. Running from mid-2012 to the end of 2013, the objective was to achieve a 20% reduction in typical code energy usage in deeply embedded systems.

Under this project we implemented the first version of the MAGEEC framework, and integrated it with an energy measurement board (the MAGEEC Wand) designed by Dr Simon Hollis, then at Bristol University.  This allowed us to sample energy usage on small computers up to 2 million times per second to an accuracy of 1%, essential if we were to get accurate training data.  This first version of MAGEEC used the plugin interface to control the GCC pass manager directly.  We were able to demonstrate machine learning that could reduce energy consumption in deeply embedded computers, using Atmel AVR processors as the evaluation system.
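
To show why a high sampling rate matters for training data, here is a minimal sketch of turning a stream of power samples into an energy figure for one program run. It assumes the samples are already available as power readings in watts at a fixed rate. The function name and data format are illustrative only and do not describe the MAGEEC Wand's actual interface.

```python
from typing import Sequence


def energy_joules(power_watts: Sequence[float], sample_rate_hz: float) -> float:
    """Integrate fixed-rate power samples into energy (rectangle rule)."""
    if sample_rate_hz <= 0:
        raise ValueError("sample_rate_hz must be positive")
    dt = 1.0 / sample_rate_hz  # seconds covered by each sample
    return sum(power_watts) * dt


if __name__ == "__main__":
    # One second of a constant 0.5 W load sampled 2 million times per second.
    samples = [0.5] * 2_000_000
    print(energy_joules(samples, 2_000_000.0))  # about 0.5 J
```

The faster the sampling, the shorter the bursts of activity that contribute correctly to the total, which matters when the programs being measured run for only a few milliseconds.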

Total Software Energy Reduction and Optimization (TSERO) was an InnovateUK supported follow-on project between Embecosm, Allinea (now part of ARM), Concertim and STFC Daresbury Hartree Center, which aimed to apply the same techniques to high performance computing systems and data centers.  Running between 2015 and 2017, this project added superoptimization as a technique for energy optimization.  University of Bristol continued to be involved as technical advisors to the project.

Fitting custom energy loggers to valuable HPC systems was not an option, so the data for machine learning was taken from the Allinea MAP tool, which collates energy information from all nodes in an HPC system.  This required the MAGEEC system to use a much larger statistical sample to get useful data.
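
The need for a larger sample follows from basic statistics: the noisier each measurement, the more runs are needed before a small difference between two sets of compiler options can be trusted. The sketch below uses the standard normal-approximation formula for sample size. It is a textbook estimate under the assumption of independent, roughly normally distributed measurements, not a description of how MAGEEC chooses its sample sizes, and the numbers are made up.

```python
import math


def samples_needed(std_dev: float, margin: float, z: float = 1.96) -> int:
    """Runs needed so the mean is within +/- `margin` at the confidence
    level implied by `z` (1.96 corresponds to roughly 95%)."""
    if std_dev < 0 or margin <= 0:
        raise ValueError("std_dev must be >= 0 and margin must be > 0")
    return math.ceil((z * std_dev / margin) ** 2)


if __name__ == "__main__":
    # Hypothetical: readings vary by 10 units, and we want the mean to within 1 unit.
    print(samples_needed(10.0, 1.0))  # 385
    # Halving the acceptable margin roughly quadruples the number of runs.
    print(samples_needed(10.0, 0.5))
```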

The MAGEEC system was rewritten for this project, based on the previous experience. Interfacing through the GCC plugin interface was too compiler specific, and it proved very easy to select invalid combinations of optimization passes, which would crash the compiler. We switched to controlling the compiler through the command line. This also made it easier to work with LLVM, which has no standard plugin interface.
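
Driving the compiler from the command line can be pictured as follows. This is a minimal sketch, assuming each candidate configuration is a set of on/off choices for standard `-f` optimization flags on top of a base level. The function name and the choice of flags are illustrative and are not MAGEEC's actual interface.

```python
from typing import Dict, List


def build_command(compiler: str, source: str, output: str,
                  base_level: str, flags: Dict[str, bool]) -> List[str]:
    """Build a compile command that switches each flag on (-fNAME) or off (-fno-NAME)."""
    cmd = [compiler, base_level]
    for name in sorted(flags):  # sorted so the command is reproducible
        prefix = "-f" if flags[name] else "-fno-"
        cmd.append(prefix + name)
    cmd += ["-c", source, "-o", output]
    return cmd


if __name__ == "__main__":
    # One candidate configuration that a learning system might propose.
    candidate = {"unroll-loops": True, "tree-vectorize": False}
    print(" ".join(build_command("gcc", "kernel.c", "kernel.o", "-O2", candidate)))
    # gcc -O2 -fno-tree-vectorize -funroll-loops -c kernel.c -o kernel.o
```

Because only documented flags are passed, the compiler itself validates each combination, and the same driver can be pointed at a different compiler by changing the first argument.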

By late 2017 we were able to demonstrate that MAGEEC could improve execution speed compared to standard -O3 for some common HPC kernels. We did not see improvement in all cases, but where we did, the improvement averaged 8%. This is not as much as the target 20%, but still represents an important gain, particularly if we can achieve the same for energy efficiency. Google and Amazon between them spend US$1 billion each year on energy for their data centers, so an 8% reduction would be worth around US$80 million a year.

Machine Learning Optimization

Embecosm's MAGEEC is the first commercially robust implementation of a machine learning system for compiler optimization and is available for both GCC and LLVM compilers.

Superoptimization

Embecosm is the first company to offer superoptimization for commercial applications. This is a practical technology that can deliver a step change in performance and code size for your key algorithms and libraries.