Document
WAMR on GitHubWAMR Blogs
  • WAMR Document Home Page
  • Basics
    • Introduction
      • WebAssembly
      • WAMR Project
      • Security Feature
    • Getting Started
      • Host Environment Preparation
      • Hello-world Program On Host
      • Docker Environment Preparation
      • Hello-world Program On Docker
      • Build And Run WASM Application
        • More Tools To Create WASM Application
  • WAMR In Practice
    • Tutorial
      • WAMR Running Modes
      • Build Tutorial
        • Build iwasm
        • Build wamrc
      • Language Embedding
        • C/C++
        • Python
        • Go
      • Debugging & IDE Support
        • WAMR Source Debugging With LLDB
        • VS Code Support
          • Enable Debugging In VS Code
          • Move LLDB Binaries
    • Advance Tutorial
      • Performance Test
        • PolyBench
        • CoreMark
        • Sightglass
        • JetStream2
      • Memory Usage Tunning
      • Application Framework
      • Remote Application Management
        • Example 1: Install/Uninstall WASM App Remotely
        • Example 2: IoT App Store Demo
      • WAMR Porting Guide
    • Features
      • Export Native APIs To WASM Applications
        • Example 1: Export C Functions to WASM
        • Example 2: Using "native-lib"
      • Multiple Modules As Dependencies
        • Multi-modules Example
      • Multi-thread, Pthread APIs And Thread Management
        • Multi-thread Example
      • Linux SGX(Intel Software Guard Extension) Support
      • Linux SGX Remote Attestation
      • XIP(Execution In Place) Support
      • Socket Support
        • Example: Use Socket Api in WAMR
      • Post-MVP Features
        • WASM C API
        • 128-bit SIMD
        • Reference Types
    • More Examples
      • File Interaction Of WASI
      • GUI Example
        • Littlevgl
        • LVGL
      • Same WASM Program Executing Concurrently
      • Build And Run Workload
    • User Case
  • Programmer's Manual
    • Programmer's Manual
      • C API Lists
  • Community
    • How To Contribute
    • WAMR On Github
    • WAMR Blogs
  • Appendix
    • Appendix A. Background Knowledge And Glossary Of Terms
    • Appendix B. WebAssembly Details
    • Appendix C. Complete WAMR Guide
Powered by GitBook
On this page
  • Known issues
  • Tuning the XIP intrinsic functions
  • Example
  1. WAMR In Practice
  2. Features

XIP(Execution In Place) Support

Some IoT devices may require to run the AOT file from flash or ROM which is read-only, so as to reduce the memory consumption, or resolve the issue that there is no executable memory available to run AOT code. In such case, the AOT code inside the AOT file shouldn't be duplicated into memory and shouldn't be modified (or patched) by the AOT relocations. To address this, WAMR implements the XIP (Execution In Place) feature, which generates the AOT relocations as few as possible:

  • In the AOT code, an AOT function calls other functions with indirect mode: it doesn't call other functions directly, but looks up their pointers from the function pointer table passed by its first argument exec_env, and then calls the function pointer found. By this way the relocations to other functions are eliminated.

  • Eliminate the calls to the LLVM intrinsic functions, or, replace calling them with calling runtime self implemented functions instead, e.g. the calling to llvm.experimental.constrained.fadd.f32 is replaced by the calling to aot_intrinsic_fadd_f32.

The XIP file is an AOT file without (or with few) relocations to patch the AOT code (or text section). Developer can use the option --enable-indirect-mode --disable-llvm-intrinsics for wamrc to generate the AOT file, e.g.:

wamrc --enable-indirect-mode --disable-llvm-intrinsics -o <aot_file> <wasm_file>
or
wamrc --xip -o <aot_file> <wasm_file>

Note: --xip is a short option for --enable-indirect-mode --disable-llvm-intrinsics

Known issues

There may be some relocations to the ".rodata" like sections which require to patch the AOT code. More work will be done to resolve it in the future.

Tuning the XIP intrinsic functions

WAMR provides a default mapping table for some targets, but it may not be the best one for your target. And it doesn't cover all the supported targets.

So, wamrc provides the option --enable-builtin-intrinsics=<intr1,intr2,...> to make it possible to tune the intrinsic functions for your target.

Firstly, you should understand why we don't use the LLVM intrinsic functions directly. The reason is that the LLVM intrinsic functions can't map to the native instructions directly, e.g. the LLVM intrinsic function i32.div_s can't map to the native instruction if the target doesn't support the division instruction, it will be translated to a function call to the runtime function from libgcc/compiler-rt. This will cause the AOT code to have the relocations to the libgcc/compiler-rt, which is not acceptable for the XIP feature.

So, we need to replace the LLVM intrinsic functions with the runtime self implemented functions, which can be called through the function pointer table (--enable-indirect-mode) and don't have the relocations to the libgcc/compiler-rt (--disable-llvm-intrinsics).

Available intrinsic functions for tuning:

LLVM intrinsic function
Explanation

llvm.experimental.constrained.fadd.f32

float32 add

llvm.experimental.constrained.fadd.f64

float64 add

llvm.experimental.constrained.fsub.f32

float32 sub

llvm.experimental.constrained.fsub.f64

float64 sub

llvm.experimental.constrained.fmul.f32

float32 mul

llvm.experimental.constrained.fmul.f64

float64 mul

llvm.experimental.constrained.fdiv.f32

float32 div

llvm.experimental.constrained.fdiv.f64

float64 div

llvm.fabs.f32

float32 abs

llvm.fabs.f64

float64 abs

llvm.ceil.f32

float32 ceil

llvm.ceil.f64

float64 ceil

llvm.floor.f32

float32 floor

llvm.floor.f64

float64 floor

llvm.trunc.f32

float32 trunc

llvm.trunc.f64

float64 trunc

llvm.rint.f32

float32 rint

llvm.rint.f64

float64 rint

llvm.sqrt.f32

float32 sqrt

llvm.sqrt.f64

float64 sqrt

llvm.copysign.f32

float32 copysign

llvm.copysign.f64

float64 copysign

llvm.minnum.f32

float32 minnum

llvm.minnum.f64

float64 minnum

llvm.maxnum.f32

float32 maxnum

llvm.maxnum.f64

float64 maxnum

llvm.ctlz.i32

int32 count leading zeros

llvm.ctlz.i64

int64 count leading zeros

llvm.cttz.i32

int32 count trailing zeros

llvm.cttz.i64

int64 count trailing zeros

llvm.ctpop.i32

int32 count population

llvm.ctpop.i64

int64 count population

f64_convert_i32_s

int32 to float64

f64_convert_i32_u

uint32 to float64

f32_convert_i32_s

int32 to float32

f32_convert_i32_u

uint32 to float32

f64_convert_i64_s

int64 to float64

f64_convert_i64_u

uint64 to float64

f32_convert_i64_s

int64 to float32

f32_convert_i64_u

uint64 to float32

i32_trunc_f32_s

float32 to int32

i32_trunc_f32_u

float32 to uint32

i32_trunc_f64_s

float64 to int32

i32_trunc_f64_u

float64 to uint32

i64_trunc_f64_s

float64 to int64

i64_trunc_f64_u

float64 to uint64

i64_trunc_f32_s

float32 to int64

i64_trunc_f32_u

float32 to uint64

f32_demote_f64

float64 to float32

f64_promote_f32

float32 to float64

f32_cmp

float32 compare

f64_cmp

float64 compare

i64.div_s

int64 div

i64.div_u

uint64 div

i32.div_s

int32 div

i32.div_u

uint32 div

i64.rem_s

int64 rem

i64.rem_u

uint64 rem

i32.rem_s

int32 rem

i32.rem_u

uint32 rem

i64.or

int64 or

i64.and

int64 and

i32.const

emit i32 const into constant table

i64.const

emit i64 const into constant table

f32.const

emit f32 const into constant table

f64.const

emit f64 const into constant table

And also provide combined intrinsic functions to simplify the tuning:

  • all: all the above intrinsic functions

  • i32.common: i32.div_s, i32.div_u, i32.rem_s, i32.rem_u

  • i64.common: i64.div_s, i64.div_u, i64.rem_s, i64.rem_u, i64.or, i64.and

  • f32.common: f32_cmp, llvm.experimental.constrained.fadd.f32, llvm.experimental.constrained.fsub.f32, llvm.experimental.constrained.fmul.f32, llvm.experimental.constrained.fdiv.f32, llvm.fabs.f32, llvm.ceil.f32, llvm.floor.f32, llvm.trunc.f32, llvm.rint.f32, llvm.sqrt.f32, llvm.copysign.f32, llvm.minnum.f32, llvm.maxnum.f32

  • f64.common: f32_demote_f64, f64_promote_f32, f64_cmp, llvm.experimental.constrained.fadd.f64, llvm.experimental.constrained.fsub.f64, llvm.experimental.constrained.fmul.f64, llvm.experimental.constrained.fdiv.f64, llvm.fabs.f64, llvm.ceil.f64, llvm.floor.f64, llvm.trunc.f64, llvm.rint.f64, llvm.sqrt.f64, llvm.copysign.f64, llvm.minnum.f64, llvm.maxnum.f64

  • f32xi32: i32_trunc_f32_s, i32_trunc_f32_u, f32_convert_i32_s, f32_convert_i32_u

  • f64xi32: i32_trunc_f64_s, i32_trunc_f64_u, f64_convert_i32_s, f64_convert_i32_u

  • f32xi64: i64_trunc_f32_s, i64_trunc_f32_u, f32_convert_i64_s, f32_convert_i64_u

  • f64xi64: i64_trunc_f64_s, i64_trunc_f64_u, f64_convert_i64_s, f64_convert_i64_u

  • constop: i32.const, i64.const, f32.const, f64.const

  • fpxint: f32xi32, f64xi32, f32xi64, f64xi64

  • fp.common: f32.common, f64.common

Example

For ARM Cortex-M55, since it has double precision floating point unit, so it can support f32/f64 operations. But as a 32-bit MCU, it can only support 32-bit integer operations. So we can use the following command to generate the XIP binary:

wamrc --target=thumbv8m.main --cpu=cortex-m55 --xip --enable-builtin-intrinsics=i64.common -o hello.aot hello.wasm

For ARM Cortex-M3, since it has no floating point unit, and it can only support 32-bit integer operations. So we can use the following command to generate the XIP binary:

wamrc --target=thumbv7m --cpu=cortex-m3 --xip --enable-builtin-intrinsics=i64.common,fp.common,fpxint -o hello.aot hello.wasm

Other platforms can be tuned in the same way, which intrinsic should be enabled depends on the target platform's hardware capability.

PreviousLinux SGX Remote AttestationNextSocket Support

Last updated 1 year ago