4.5. Encoding Custom Operands

Depending on the operand types defined in the architecture, a custom encoding function may be required to encode the more complex types.

One example of an operand that may require custom encoding is the OpenRISC 1000 memory operand, which combines a register with an immediate offset. It is used, for example, by the l.lwz instruction, which loads a word from the memory location given by a pointer register plus an immediate offset stored in the instruction.

    l.lwz r1, 4(r2)

If an operand requires custom encoding, EncoderMethod must be specified in the operand's TableGen definition, naming the function used to encode the operand.

def MEMri : Operand<i32> {
  let PrintMethod = "printMemOperand";
  let EncoderMethod = "getMemoryOpValue";
  let MIOperandInfo = (ops GPR, i32imm);
}

	Note
	It does not matter where in an instruction an operand appears; the encoding function works within a bit field the size of the operand. The generated `getBinaryCodeForInstr` function takes care of mapping operand bits to their corresponding instruction bits.
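
As a rough illustration of the mapping described in the note, the sketch below places an operand value into an instruction word at a given bit position. It is not the code that TableGen generates: the helper name insertField is hypothetical, and the idea of a single contiguous field is a simplifying assumption made for this example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of LLVM: copy the low `width` bits of
// `value` into `insn` starting at bit `lsb`, leaving all other bits alone.
inline uint32_t insertField(uint32_t insn, uint32_t value,
                            unsigned lsb, unsigned width) {
  assert(width > 0 && width < 32 && lsb + width <= 32 && "Bad field.");
  const uint32_t mask = ((1u << width) - 1u) << lsb;
  return (insn & ~mask) | ((value << lsb) & mask);
}
```

Under these assumptions, a call such as insertField(insn, operandValue, 0, 21) would place a 21-bit operand value in bits 20..0 of insn, regardless of how the operand value itself was built.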

The following example covers the OpenRISC 1000 memory operand, but the same method can be applied to any compound operand type.

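unsigned OR1KMCCodeEmitter::
getMemoryOpValue(const MCInst &MI, unsigned Op) const {
  // Op is unused here; operand positions are hard coded (see the note below).
  unsigned encoding;
  // Base register: its number is placed above the 16-bit offset field.
  const MCOperand op1 = MI.getOperand(1);
  assert(op1.isReg() && "First operand is not register.");
  encoding = (getOR1KRegisterNumbering(op1.getReg()) << 16);
  // Immediate offset: truncated to the low 16 bits of the encoding.
  MCOperand op2 = MI.getOperand(2);
  assert(op2.isImm() && "Second operand is not immediate.");
  encoding |= (static_cast<short>(op2.getImm()) & 0xffff);
  return encoding;
}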

To create the encoding for this operand, the individual components (the register and the immediate) can be obtained in the same way as in getMachineOpValue and then shifted to the relevant operand bits.

In this example the first component of the memory operand (a register) is read, converted to its register number, and shifted left by 16 bits, since the OpenRISC 1000 memory operand is a register followed by a 16-bit immediate. The second component (the immediate offset) is then masked to 16 bits and combined with the register value to give the full encoding of the operand.
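
The arithmetic can be checked in isolation. The following sketch reproduces the shift and mask from getMemoryOpValue without any LLVM types. The function name packMemOperand, the use of plain integers in place of MCOperand, and the 5-bit register range are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-alone version of the packing step: register number in
// the bits above bit 15, offset in bits 15..0 (two's complement).
inline uint32_t packMemOperand(unsigned regNum, int64_t offset) {
  assert(regNum < 32 && "Register number does not fit in 5 bits.");
  assert(offset >= -32768 && offset <= 32767 &&
         "Offset does not fit in 16 bits.");
  uint32_t encoding = static_cast<uint32_t>(regNum) << 16;
  // Masking keeps only the low 16 bits, so negative offsets do not spill
  // into the register field.
  encoding |= static_cast<uint32_t>(offset) & 0xffffu;
  return encoding;
}
```

With these assumptions, the operand 4(r2) packs to 0x20004 and -4(r2) packs to 0x2fffc.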

	Note
	The operand locations are hard coded in this example because, in the OpenRISC 1000 implementation, memory operands are always at known locations and no instruction may have more than one memory operand. In a more generic case, it is best to use the provided Op value instead of hard coding operand placement.
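
A sketch of the more generic form follows. To keep it self-contained it uses a stand-in operand type and a std::vector in place of MCOperand and MCInst; in real LLVM code the equivalent lookups would be MI.getOperand(Op) and MI.getOperand(Op + 1). The names FakeOperand and encodeMemOperandAt are hypothetical, and the sketch assumes the register component comes first, as in the MIOperandInfo definition of MEMri.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-in for MCOperand (hypothetical, for illustration only).
struct FakeOperand {
  bool isReg;
  int64_t value; // register number if isReg, otherwise the immediate
};

// Encode the memory operand whose first component sits at index Op,
// rather than assuming fixed positions within the instruction.
inline uint32_t encodeMemOperandAt(const std::vector<FakeOperand> &operands,
                                   unsigned Op) {
  assert(Op + 1 < operands.size() && "Memory operand out of range.");
  const FakeOperand &base = operands[Op];
  const FakeOperand &offset = operands[Op + 1];
  assert(base.isReg && "First component is not a register.");
  assert(!offset.isReg && "Second component is not an immediate.");
  uint32_t encoding = static_cast<uint32_t>(base.value) << 16;
  encoding |= static_cast<uint32_t>(offset.value) & 0xffffu;
  return encoding;
}
```

Because the position is passed in, the same function would serve an instruction whose memory operand starts at index 0 as well as one where it starts at index 1.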

With the functions defined, instruction encoding should now operate correctly.