974 LISA 8-Bit Microcontroller

974 : LISA 8-Bit Microcontroller

  • Author: Ken Pettit
  • Description: 8-Bit Microcontroller SOC with 128 bytes DFFRAM module
  • GitHub repository
  • Clock: 22000000 Hz

LISA Overview?

LISA is a Microcontroller built around a custom 8-Bit Little ISA (LISA) microprocessor core. It includes several standard peripherals that would be found on commercial microcontrollers including timers, GPIO, UARTs and I2C. The following is a block diagram of the LISA Microcontroller:

  • The LISA Core has a minimal set of register that allow it to run C programs:
    • Program Counter + Return Address Resister
    • Stack Pointer and Index Register (Indexed DATA RAM access)
    • 8-bit Accumulator + 16-bit BF16 Accumulator and 4 BF16 registers

Deailed list of the features

  • Harvard architecture LISA Core (16-bit instruction, 15-bit address space)
  • Debug interface
    • UART controlled
    • Auto detects port from one of 3 interfaces
    • Auto detects the baud rate
    • Interfaces with SPI / QSPI SRAM or FLASH
    • Can erase / program the (Q)SPI FLASH
    • Read/write LISA core registers and peripherals
    • Set LISA breakpoints, halt, resume, single step, etc.
    • SPI/QSPI programmability (single/quad, port location, CE selects)
  • (Q)SPI Arbiter with 3 access channels
    • Debug interface for direct memory access
    • Instruction fetch
    • Data fetch
    • Quad or Single SPI. Hereafter called QSPI, but supports either.
  • Onboard 128 Byte RAM for DATA / DATA CACHE
  • Data bus CACHE controller with 8 16-byte CACHE lines
  • Instruction CACHE with a single 4-instruction CACHE line
  • Two 16-bit programmable timers (with pre-divide)
  • Debug UART available to LISA core also
  • Dedicated UART2 that is not shared with the debug interface
  • 8-bit Input port (PORTA)
  • 8-bit Output port (PORTB)
  • 4-bit BIDIR port (PORTC)
  • I2C Master controller
  • Hardware 8x8 integer multiplier
  • Hardware 16/8 or 16/16 integer divider
  • Hardware Brain Float 16 (BF16) Multiply/Add/Negate/Int16-to-BF16
  • Programmable I/O mux for maximum flexibility of I/O usage.

It uses a 32x32 1RW DFFRAM macro to implement a 128 bytes (1 kilobit) RAM module. The 128 Byte ram can be used either as a DATA cache for the processor data bus, giving a 32K Byte address range, or the CACHE controller can be disabled, connecting the Lisa processor core to the RAM directly, limiting the data space to 128 bytes. Inclusion of the DFFRAM is thanks to Uri Shaked (Discord urish) and his DFFRAM example.

Reseting the project does not reset the RAM contents.

Connectivity

All communication with the microcontroller is done through a UART connected to the Debug Controller. The UART I/O pins are auto-detected by the debug_autobaud module from the following choices (RX/TX):

ui_in[3]  / ui_out[4]     RP2040 UART interface   
uio_in[4] / uio_out[5]    LISA PMOD board (I am developing)
uio_in[6] / uio_out[5]    Standard UART PMOD

The RX/TX pair port is auto-detected after reset by the autobaud circuit, and the UART baud rate can either be configured manually or auto detected by the autobaud module. After reset, the ui_in[7] pin is sampled to determine the baud rate selection mode. If this input pin is HIGH, then autobaud is disabled and ui_in[6:0] is sampled as the UART baud divider and written to the Baud Rate Generator (BRG). The value of this divider should be: clk_freq / baud_rate / 8 - 1. Due to last minute additions of complex floating point operations, and only 2 hours left on the count-down clock, the timing was relaxed to 20MHz input clock max. So for a 20MHz clock and 115200 baud, the b_div[6:0] value would be 42 (for instance).

If the ui_in[7] pin is sampled LOW, then the autobaud module will monitor all three potential RX input pins for LINEFEED (ASCII 0x0A) code to detect baud rate and set the b_div value automatially. It monitors bit transistions and searches for three successive bits with the same bit period. Since ASCII code 0x0A contains a "0 1 0 1 0" bit sequence, the baud rate can be detected easily.

Regardless if the baud rate is set manually or using autobaud, the input port selection will be detect automatically by the autobaud. In the case of manual buad rate selection, it simply looks for the first transition on any of the three RX pins. For autobaud, it select the RX line with three successive eqivalent bit periods.

Debug Interface Details

The Debug interface uses a fixed, verilog coded Finite State Machine (FSM) that supports a set of commands over the UART to interface with the microcontroller. These commands are simple ASCII format such that low-level testing can be performed using any standard terminal software (such as minicom, tio. Putty, etc.). The 'r' and 'w' commands must be terminated using a NEWLINE (0x0A) with an optional CR (0x0D). Responses from the debug interface are always terminated with a LINFEED plus CR sequence (0x0A, 0x0D). The commands are as follows (responsce LF/CR ommited):

Command Description
v Report Debugger version. Should return: lisav1.2
wAAVVVV Write 16-bit HEX value 'VVVV' to register at 8-bit HEX address 'AA'.
rAA Read 16-bit register value from 8-bit HEX address 'AA'.
t Reset the LISA core.
l Grant LISA the UART. Further data will be ignored by the debugger.
+++ Revoke LISA UART. NOTE: a 0.5s guard time before/after is required.

NOTE: All HEX values must be a-f and not A-F. Uppercase is not supported.

Debug Configuration and Control Registers

The following table describes the configuration and LISA debug register addresses available via the debug 'r' and 'w' commands. The individual register details will be described in the sections to follow.

ADDR Description ADDR Description
0x00 LISA Core Run Control 0x12 LISA1 QSPI base address
0x01 LISA Accumulator / FLAGS 0x13 LISA2 QSPI base address
0x02 LISA Program Counter (PC) 0x14 LISA1 QSPI CE select
0x03 LISA Stack Pointer (SP) 0x15 LISA2 QSPI CE select
0x04 LISA Return Address (RA) 0x16 Debug QSPI CE select
0x05 LISA Index Register (IX) 0x17 QSPI Mode (QUAD, flash, 16b)
0x06 LISA Data bus 0x18 QSPI Dummy read cycles
0x07 LISA Data bus address 0x19 QSPI Write CMD value
0x08 LISA Breakpoint 1 0x1a The '+++' guard time count
0x09 LISA Breakpoint 2 0x1b Mux bits for uo_out
0x0a LISA Breakpoint 3 0x1c Mux bits for uio
0x0b LISA Breakpoint 4 0x1d CACHE control
0x0c LISA Breakpoint 5 0x1e QSPI edge / SCLK speed
0x0d LISA Breakpoint 6 0x20 Debug QSPI Read / Write
0x0f LISA Current Opcode Value 0x21 Debug QSPI custom command
0x10 Debug QSPI Address (LSB16) 0x22 Debug read SPI status reg
0x11 Debug QSPI Address (MSB8)

LISA Processor Interface Details

The LISA Core requires external memory for all Instructions and Data (well, sort of for data, the data CACHE can be disabled then it just uses internal DFFRAM). To accomodate external memory, the design uses a QSPI controller that is configurable as either single SPI or QUAD SPI, Flash or SRAM access, 16-Bit or 24-Bit addressing, and selectable Chip Enable for each type of access. To achieve this, a QSPI arbiter is used to allow multiple accessors as shown in the following diagram:

The arbiter is controlled via configuration registers (accessible by the Debug controller) that specify the operating mode per CE, and CE selection bits for each of the three interfaces:

  • Debug Interface
  • LISA1 (Instruction fetch)
  • LISA2 (Data read/write)

The arbiter gives priority to the Debug accesses and processes LISA1 and LISA2 requests using a round-robbin approach. Each requestor provides a 24-bit address along with 16-bit data read/write. For the Debug interface, the address comes from the configuration registers directly. For LISA1, the address is the Program Counter (PC) + LISA1 Base and for LISA2, it is the Data Bus address + LISA2 Base. The LISA1 and LISA2 base addresses are programmed by the Debug controller and set the upper 16-bits in the 24-bit address range. The PC and Data address provide the lower 16 bis (8-bits overlapped that are 'OR'ed together). The BASE addresses allow use of a single external QSPI SRAM for both instruction and data without needing to worry about data collisions.

When the arbiter chooses a requestor, it passes its programmed CE selection to the QSPI controller. The QSPI controller then uses the programmed QUAD, MODE, FLASH and 16B settings for the chosen CE to process the request. This allows LISA1 (Instruction) to either execute from the same SRAM as LISA2 (Data) or to execute from a separate CE (such as FLASH with permanent data storage).

Additionally the Debug interface has special access registers in the 0x20 - 0x22 range that allow special QSPI accesses such as FLASH erase and program, SRAM programming, FLASH status read, etc. In fact the Debug controller can send any arbitrary command to a target device, using access that either provide an associated address (such as erase sector) or no address. The proceedure for this is:

  1. Program Debug register 0x19 with the special 8-bit command to be sent
  2. Set the 9-th bit (reg19[8]) to 1 if a 16/24 bit address needs to be sent)
  3. Perform a read / write operation to debug address 0x21 to perform the action.

Simple QSPI data reads/write are accomplished via the Debug interface by setting the desired address in Debug config register 0x10 and 0x11, then performing read or write to address 0x20 to perform the request. Reading from Debug config register 0x22 will perform a special mode read of QSPI register 0x05 (the FLASH status register).

Data access to the QSPI arbiter come from the Data CACHE interface (described later), enabling a 32K address space for data. However the design has a CACHE disable mode that directs all Data accesses directly to the internal 128 Byte RAM, thus eliminating the need for external SRAM (and limiting the data bus to 128 bytes).

Programming the QSPI Controller

Before the LISA microcontroller can be used in any meaningful manner, a SPI / QSPI SRAM (and optionally a NOR FLASH) must be connected to the Tiny Tapeout PCB. Alternately, the RP2040 controller on the board can be configured to emulate a single SPI (the details for configuring this are outside the scope of this documentation ... search the Tiny Tapeout website for details.). For the CE signals, there are two operating modes, fixed CE output and Mux Mode 3 "latched" CE mode. Both will be described here. The other standard SPI signals are routed to dedicated pins as follows:

Pin SPI QSPI Notes
uio[0] CE0 CE0
uio[1] MOSI DQ0 Also MOSI prior to QUAD mode DQ0
uio[2] MISO DQ1 Also MISO prior to QUAD mode DQ1
uio[3] SCLK SCLK
uio[4] CE1 CE1 Must be enabled via uio MUX bits
uio[6] - DQ2 Must be enabled via uio MUX bits
uio[7] - DQ3 Must be enabled via uio MUX bits

For Special Mux Mode 3 (Debug register 0x1C uio_mux[7:6] = 2'h3), the pinout is mostly the same except the CE signals are not constant. Instead they are "latched" into an external 7475 type latch. This mode is to support a PMOD board connected to the uio PMOD which supports a QSPI Flash chip, a QSPI SRAM chip, and either Debug UART or I2C. For all of that functionality, nine pins would be required for continuous CE0/CE1, however only eight are available. So the external PMOD uses uio[0] as a CE "latch" signal and the CE0/CE1 signals are provided on uio[1]/uio[2] during the latch event. This requires a series resistor as indicated to allow CE updates if the FLASH/SRAM is driving DQ0/DQ1. The pinout then becomes:

Pin SPI/QSPI Notes
uio[0] ce_latch ce_latch HIGH at beginning of cycle
uio[1] ce0_latch/MOSI/DQ0 Connection to FLASH/SRAM via series resistor
uio[2] ce1_latch/MISO/DQ1 Connection to FLASH/SRAM via series resistor
uio[3] SCLK
uio[6] -/DQ2 Must be enabled via uio MUX bits
uio[7] -/DQ3 Must be enabled via uio MUX bits

This leaves uio[4]/uio[5] available for use as either UART or I2C.

Once the SPI/QSPI SRAM and optional FLASH have been chosen and connected, the Debug configuration registers must be programmed to indicate the nature of the external device(s). This is accompilished using Debug registers 0x12 - 0x19 and 0x1C. To programming the proper mode, follow these steps:

  1. Program the LISA1, LISA2 and Debug CE Select registers (0x14, 0x15, 0x16) indicating which CE to use.

    • 0x14, 0x15, 0x16: {6'h0, ce1_en, ce0_en} Active HIGH
  2. Program the LISA1 and LISA2 base addresses if they use the same SRAM:

    • 0x12: {LISA1_BASE, 8'h0} | {8'h0, PC}
    • 0x13: {LISA2_BASE, 8'h0} | {8'h0, DATA_ADDR}
  3. Program the mode for each Chip Enable (bits active HIGH)

    • 0x17: {10'h0, is_16b[1:0], is_flash[1:0], is_quad[1:0]}
  4. For Quad SPI, Special Mux Mode 3, or CE1, program the uio_mux mode:

    • 0x1C:
      • [7:6] = 2'h2: Normal QSPI DQ2 select
      • [7:6] = 2'h3: Special Mux Mode 3 (Latched CE)
      • [5:4] = 2'h2: Normal QSPI DQ3 select
      • [5:4] = 2'h3: Special Mux Mode 3
      • [1:0] = 2'h2: CE1 select on uio[4]
  5. For RP2040, you might need to slow down the SPI clock / delay between successive CE activations:

    • 0x1E:
      • [3:0] spi_clk_div: Number of clocks SCLK HIGH and LOW
      • [10:4] ce_delay: Number clocks between CE activations
      • [12:11] spi_mode: Per-CE FALLING SCLK edge data update
  6. Set the number of DUMMY ready required for each CE:

    • 0x18: {8'h0, dummy1[3:0], dummy0[3:0]
  7. For QSPI FLASH, set the QSPI Write opcode (it is different for various Flashes):

    • 0x19: {8'h0, quad_write_cmd}

NOTE: For register 0x1E (SPI Clock Div and CE Delay), there is only a single register, meaning this register value applies to both CE outputs. Delaying the clock of one CE will delay both, and adding delay between CE activations does not keep track of which CE was activated. So if two CE outputs are used and a CE delay is programmed, it will enforce that delay even if a different CE is used. This setting is really in place for use when the RP2040 emulation is being used in a single CE SRAM mode only (i.e. you have no external PMOD with a real SRAM / FLASH chip. In the case of real chips on a PMOD, SCLK and CE delays (most likely) are not needed. The Tech Page on the Tiny Tapeout regarding RP2040 SPI SRAM emulation indicates a delay between CE activations is likely needed, so this setting is provided in case it is needed.

Architecture Details

Below is a simplified block diagram of the LISA processor core. It uses an 8-bit accumulator for most of its operations with the 2nd argument predominately coming from either immediate data in the instruction word or from a memory location addressed by either the Stack Pointer (SP) or Index Register (IX).

There are also instructions that work on the 15-bit registers PC, SP, IX and RA (Return Address). As well as floating point operations. These will be covered in the sections to follow.

Addressing Modes

Like most processors, LISA has a few different addressing modes to get data in and out of the core. These include the following:

Mode Data Description
Register Rx[n -: 8] Transfers between registers (ix, ra, facc, etc.).
Direct inst[n:0] N-bit data stored in the instruction word.
NextOp (inst+1)[14:0] Data stored in the NEXT instruction word.
Indirect mem[inst[n:0]] Address of the data is in the instruction word.
Periph periph[inst[n:0]] Accesses to the peripheral bus.
Indexed mem[sp/ix+inst[n:0]] The SP or IX register is added to a fixed offset.
Stack mem[sp] Stack pointer points to the data (push/pop).

The Control Registers

To run meaninful programs, the Program Counter (PC) and Stack Pointer (SP) must be set to useful values for accessing program instructions and data. The PC is automatically reset to zero by rst_n, so that one is pretty much automatic. All programs start at address zero (plus any base address programmed by the Debug Controller). But as far as the LISA core is concerned, it knows nothing of base addresses and believes it is starting at address zero.

Next is to program the SP to point to a useful location in memory. The Stack is a place where C programs store their local variable values and also where we store the Return Address (RA) if we need to call nested routines, etc. The stack grows down, meaning it starts at a high RAM address and decrements as things are added to the stack. Therefore the SP should be programmed with an address in upper RAM. LISA supports different Data bus modes through it's CACHE controller, including CACHE disable where it can only access 128 bytes. But for this example, let's assume we have a full range of 32K SRAM available. The LISA ISA doesn't have an opcode for loading the SP directly. Instead it can load the IX register directly with a 15-bit value using NextOp addressing, and it supports "xchg" opcodes to exchange the IX register with either the SP or RA. So to load the SP, we would write:

Example:
  ldx     0x7FFF      // Load IX with value in next opcode
  xchg_sp             // Exchange IX with SP

The IX register can be programmed as needed to access other data within the Data Bus address range. This register is useful especially for accessing structures via a C pointer. The IX then becomes the value of the pointer to RAM, and Indexed addressing mode allows fixed offsets from that pointer (i.e. structure elements) to be accessed for read/write.

Loading the PC indirectly can be done using the "jmp ix" opcode which does the operation pc <= ix. Loading ix from the pc directly is not supported, though this can be accomplished using a function call and opcodes to save RA (sra) and pop ix:

 Example:
   get_pc:
     sra         // Push RA to the stack (Save RA)
     pop_ix      // Pop IX from the stack
     ret         // Return. Upon return, IX is the same as PC

Conditional Flow Processing

Program flow is controlled using flags (zero, carry, sign), arithemetic mode (amode) and condition flags (cond) to determine when program branches should occur. Specific opcode update the flags and condition registers based on results of the operation (AND, OR, IF, etc.). Then conditional branches are made using bz, bnz and if (and variants ifte "if-then-else" and iftt "if-then-then"). Also available are rc "Return if Carry" and rz "Return if Zero", though these are less useful in C programs as typically a routine uses local variables and the stack must be restored prior to return, mandating a branch to the function epilog to restore the stack and often the return address. Below is a list of the opcodes used for conditional program processing:

Legend for operations below:

  • acc_val = inst[7:0]
  • pc_jmp = inst[14:0]
  • pc_rel = pc + sign_extend(inst[10:0])
Opcode Operation Encoding Description
jal pc <= pc_jmp 0aaa_aaaa_aaaa_aaaa Jump And Link (call).
ra <= pc
ret pc <= ra 1000_1010_0xxx_xxxx Return
reti pc <= ra 1000_11xx_iiii_iiii Return Immediate.
acc <= acc_val
br pc <= pc_rel 1011_0rrr_rrrr_rrrr Branch Always
bz pc <= pc_rel 1011_1rrr_rrrr_rrrr Branch if Zero.
if zero=1
bnz pc <= pc_rel 1010_1rrr_rrrr_rrrr Branch if Not Zero.
if zero=0
rc pc <= ra 1000_1011_0xxx_xxxx Return if Carry
if carry=1
rz pc <= ra 1000_1011_1xxx_xxxx Return if Zero
if zero=1
call_ix pc <= ix 1000_1010_100x_xxxx Call indirect via IX
ra <= pc
jump_ix pc <= ix 1000_1010_101x_xxxx Jump indirect via IX
if cond <= ?? 1010_0010_0000_0ccc If. See below.
iftt cond <= ?? 1010_0010_0000_1ccc If then-then. See below.
ifte cond <= ?? 1010_0010_0001_0ccc If then-else. See below.

The IF Opcode

The "if" opcode and it's variants "if-then-then" and "if-then-else" control program flow in a slightly different manner than the others. Instead of affecting the value of the PC directly, they set the two condition bits "cond[1:0]" to indicate which (if any) of the two following opcodes should be executed. the cond[0] bit represents the next instruction and cond[1] represents the instruction following that. All three "if" forms take an argument that checks the current value of the FLAGS to set the condition bits. The argument is encoded as the lower three bits of the instruction word ard operate as shown in the following table:

Condition Test Encoding Description
EQ zflag=1 3'h0 Execute if Equal
NE zflag=0 3'h1 Execute if Not Equal
NC cflag=0 3'h2 Execute if Not Carry
C cflag=1 3'h3 Execute if Carry
GT ~cSigned & ~zflag 3'h4 Execute if Greater Than
LT cSigned & ~zflag 3'h5 Execute if Less Than
GTE ~cSigned zflag 3'h6
LTE cSigned zflag 3'h7

The "if" opcode will set cond[0] based on the condition above and the cond[1] bit to HIGH. It only affects the single instruction following the "if" opcode. The "iftt" opcode will set both cond[0] and cond[1] to the same value based on the condition above. It means "if true, execute the next two opcodes". And the "ifte" opcode will set cond[0] based on the condition above and cond[1] to the OPPOSITE value, meaning it will execute either the following instruction OR the one after that (then-else).

Example:
  ldi      0x41        // Load A immediate with ASCII 'A'
  cpi      0x42        // Compare A immediate with ASCII 'B'
  ifte     eq          // Test if the compare was "Equal"
  jal      L_equal     // Jump if equal
  jal      L_different // Jump if different

The above code will load the "jal L_equal" opcode but will not execute it since the compare was Not Equal. Then it will execute the "jal L_different" opcode. Note that if the compare were "ifte ne", it would call the L_equal function and then upon return would not execute the "L_different" opcode. This is because the cond[1] code is saved with the Return Address (RA) during the call and restored upon return. This means the FALSE cond[1] code would prevent the 2nd opcode from executing. As an opcode gets executed, the cond[1] value is shifted into the cond[0] location, and the cond[1] is loaded with 1'b1.

Direct Operations

To do any useful work, the LISA core must be able to load and operate on data. This is done through the accumulator using the various addressing modes. The diagram below details the Direct addressing mode where data is stored directly in the opcode / instruction word:

\clearpage The instructions that use direct addressing are:

Opcode Operation Encoding Description
adc A <= A + imm + C 1001_00xx_iiii_iiii ADD immediate with Carry
ads SP <= SP + imm 1001_01ii_iiii_iiii ADD SP + signed immediate
adx IX <= IX + imm 1001_10ii_iiii_iiii ADD IX + signed immediate
andi A <= A & imm 1000_01xx_iiii_iiii AND immediate with A
cpi Z,C <= A >= imm 1010_01xx_iiii_iiii Compare A >= immediate
cpi Z,C <= A >= imm 1010_01xx_iiii_iiii Compare A >= immediate

Accumulator Indirect Operations

The Accumulator Indirect operations use immediate data in the instruction word to index indirectly into Data memory. That memory address is then used to load, store or both load and store (swap) data with the accumulator.

Opcode Operation Encoding Description
lda A <= M[imm] 1111_01pi_iiii_iiii Load A from Memory/Peripheral
sta M[imm] <= A 1111_11pi_iiii_iiii Store A to Memory/Peripheral
swapi A <= M[imm] 1101_11pi_iiii_iiii Swap Memory/Peripheral with A
M[imm] <= A
  • p = Select Peripheral (1'b1) or RAM (1'b0)
  • iiii = Immediate data

Indexed Operations

Indexed operations use either the IX or SP register plus a fixed offset from the immediate field of the opcode. The selection to use IX vs SP is also from the opcode[9] bit. The immediate field is not sign extended, so only positive direction indexing is supported. This was selected because this mode is typically used to access either local variables (when using SP) or C struct members (when using IX), and in both cases, negative index offsets aren't very useful. The following is a diagram of indexed addressing:

Opcode Operation Encoding Description
add A <= A+ M[ind] 1100_00si_iiii_iiii ADD index memory to A
and A <= A & M[ind] 1101_00si_iiii_iiii AND A with index memory
cmp A >= M[ind]? 1110_10si_iiii_iiii Compare A with index memory
dcx M[ind] -= 1 1001_11si_iiii_iiii Decrement the value at index memory
inx M[ind] += 1 1110_01si_iiii_iiii Increment the value at index memory
ldax A <= M[ind] 1111_00si_iiii_iiii Load A from index memory
ldxx IX <= M[SP+imm] 1100_110i_iiii_iiii Load IX from memory at SP+imm
mul A <= A*M[ind]L 1100_10si_iiii_iiii Multiply index memory * A, keep LSB
mulu A <= A*M[ind]H 1000_01si_iiii_iiii Multiply index memory * A, keep MSB
or A <= A M[ind] 1101_10si_iiii_iiii
stax M[ind] <= A 1111_10si_iiii_iiii Store A to index memory
stxx M[SP+imm] <= IX 1100_111i_iiii_iiii Save IX to memory at SP+imm
sub A <= A-M[ind] 1100_10si_iiii_iiii SUBtract index memory from A
swap A <= M[ind] 1110_11si_iiii_iiii Swap A with index memory
M[ind] <= A
xor A <= A ^ M[ind] 1110_00si_iiii_iiii XOR A with index memory

\clearpage Legend for table above:

  • ind = IX or SP + immediate
  • s = Select IX (zero) or SP (one)
  • iiii = Immediate data

The Zero and Carry flags are updated for most of the above operations. The Carry flag is only updated for math operations where a Carry / Borrow could occur.

Carry Zero
adc add and
add or xor
sub cmp sub
cmp dcx inx
dcx swap ldax
inx mul mulu

Stack Operations

Stack operations use the current value of the SP register to PUSH and POP items to the stack in opcode. As items are PUSHed to the stack, the SP is decremented after each byte, and as they are POPed, the SP is incremented prior to reading from RAM.

Opcode Operation Encoding Description
lra RA <= M[SP+1] 1010_0001_0110_01xx Load {cond,RA} from stack
SP += 2
sra M[SP] <= RA 1010_0001_0110_00xx Save {cond,RA} to stack
SP -= 2
push_ix M[SP] <= IX 1010_0001_0110_10xx Save IX to stack
SP -= 2
pop_ix IX <= M[SP+1] 1010_0001_0110_11xx Load IX from stack
SP += 2
push_a M[SP] <= A 1010_0000_100x_xxxx Save A to stack
SP -= 1
pop_a A <= M[SP+1] 1010_0000_110x_xxxx Load A from stack
SP += 1

How to test

You will need to download and compile the C-based assembler, linker and C compiler I wrote (will make available) Also need to download the Python based debugger.

  • Assembler is fully functional
    • Includes limited libraries for crt0, signed int compare, math, etc.
    • Libraries are still a work in progress
  • Linker is fully functional
  • C compiler is somewhat functional (no float support at the moment) but has many bugs in the generated code and is still a work in progress.
  • Python debugger can erase/program the FLASH, program SPI SRAM, start/stop the LISA core, read SRAM and registers.

\clearpage

Legend for Pinout

  • pa: LISA GPIO PortA Input
  • pb: LISA GPIO PortB Output
  • b_div: Debug UART baud divisor sampled at reset
  • b_set: Debug UART baud divisor enable (HIGH) sampled at reset
  • baud_clk: 16x Baud Rate clock used for Debug UART baud rate generator
  • ce_latch: Latch enable for Special Mux Mode 3 as describe above
  • ce0_latch: CE0 output during Special Mux Mode 3
  • ce1_latch: CE1 output during Special Mux Mode 3
  • DQ1/2/3/4: QUAD SPI bidirection data I/O
  • pc_io: LISA GPIO Port C I/O (direction controllable by LISA)

IO

#InputOutputBidirectional
0pa_in[0]/baud_div[0]pb_out[0]ce0/ce_latch
1pa_in[1]/baud_div[1]pb_out[1]mosi/dq1/ce0
2pa_in[2]/baud_div[2]pb_out[2]miso/dq2/ce1
3pa_in[3]/baud_div[3]/rxpb_out[3]sclk
4pa_in[4]/baud_div[4]pb_out[4]/txrx /pc_io[0]/scl/sda
5pa_in[5]/baud_div[5]pb_out[5]tx /pc_io[1]/sda/scl
6pa_in[6]/baud_div[6]pb_out[6]scl /pc_io[2]/dq2/rx
7pa_in[7]/baud_setpb_out[7]/baud_clksda/pc_io[3]/dq3

Chip location

Controller Mux Mux Mux Mux Mux Mux Mux Mux Mux Mux Mux Analog Mux Mux Mux Mux Mux Mux Mux Mux Mux Mux Mux Analog Mux Mux Mux Mux Mux Mux Mux Mux Mux Mux tt_um_chip_rom (Chip ROM) tt_um_factory_test (TinyTapeout 06 Factory Test) tt_um_analog_factory_test (TT06 Analog Factory Test) tt_um_analog_factory_test (TT06 Analog Factory Test) tt_um_urish_charge_pump (Dickson Charge Pump) tt_um_psychogenic_wowa (WoWA) tt_um_oscillating_bones (Oscillating Bones) tt_um_kevinwguan (Crossbar Array) tt_um_coloquinte_moosic (Moosic logic-locked design) tt_um_alexsegura_pong (Pong) tt_um_iron_violet_simon (Iron Violet) tt_um_tomkeddie_a (VGA Experiments in Tennis) tt_um_MichaelBell_tinyQV (TinyQV Risc-V SoC) tt_um_andychip1_sn74169 (sn74169) tt_um_mattvenn_r2r_dac (Analog 8bit R2R DAC) tt_um_thorkn_audiochip_v2 (AudioChip_V2) tt_um_faramire_gate_guesser (Gate Guesser) tt_um_urish_simon (Simon Says game) tt_um_TT06_SAR_wulffern (TT06 8-bit SAR ADC) tt_um_soundgen (soundgen) tt_um_ledcontroller_Gatsch (ledcontroller) tt_um_digitaler_filter_rathmayr (Digitaler Filter) tt_um_histefan_top (Snake Game) tt_um_mayrmichael_wave_generator (Wave Generator) tt_um_advanced_counter (jku-tt06-advanced-counter) tt_um_FanCTRL_DomnikBrandstetter (PI-Based Fan Controller) tt_um_ps2_morse_encoder_top (PS/2 Keyboard to Morse Code Encoder) tt_um_calculator_muehlbb (16-bit calculator) tt_um_hpretl_tt06_tempsens (Temperature Sensor NG) tt_um_haeuslermarkus_fir_filter (FIR Filter with adaptable coefficients) tt_um_mattvenn_rgb_mixer (RGB Mixer demo) tt_um_analog_loopback (Analog loopback) tt_um_entwurf_integrierter_schaltungen_hadner (Projekt KEIS Hadner Thomas) tt_um_seven_segment_fun1 (7-segment-FUN) tt_um_moving_average_master (Moving average filter) tt_um_rgbled_decoder (SPI to RGBLED Decoder/Driver) tt_um_4bit_cpu_with_fsm (4-Bit CPU mit FSM) tt_um_flappy_bird (Flappy Bird) tt_um_drops (drops) tt_um_enieman (UART-Programmable RISC-V 32I Core) tt_um_gabejessil_timer (2 Player Game) tt_um_wokwi_384804985843168257 (playwithnumbers) tt_um_wokwi_384711264596377601 (luckyCube) tt_um_hpretl_tt06_tdc (Synthesized Time-to-Digital Converter (TDC)) tt_um_wokwi_384437973887503361 (Asynchronous Down Counter) tt_um_spi_pwm_djuara (spi_pwm) tt_um_SteffenReith_PiMACTop (PiMAC) tt_um_mattvenn_relax_osc (Relaxation oscillator) tt_um_jv_sigdel (1st passive Sigma Delta ADC) tt_um_wokwi_392873974467527681 (PILIPINAS) tt_um_scorbetta_goa (GOA - grogu on ASIC) tt_um_sanojn_ttrpg_dice (TTRPG Dice + simple I2C peripheral) tt_um_urish_dffram (DFFRAM Example (128 bytes)) tt_um_lucaz97_monobit (Monobit Test) tt_um_noritsuna_i4004 (i4004 for TinyTapeout) tt_um_hpretl_tt06_tdc_v2 (Synthesized Time-to-Digital Converter (TDC) v2) tt_um_vaf_555_timer (A 555-Timer Clone for Tiny Tapeout 6) tt_um_obriensp_be8 (8-bit CPU with Debugger (Lite)) tt_um_toivoh_retro_console (Retro Console) tt_um_mattvenn_inverter (Double Inverter) tt_um_SteffenReith_ASGTop (ASG) tt_um_lucaz97_rng_tests (rng Test) tt_um_dieroller_nathangross1 (Die Roller) tt_um_kwilke_cdc_fifo (Clock Domain Crossing FIFO) tt_um_spiff42_exp_led_pwm (LED PWM controller) tt_um_devinatkin_fastreadout (Fast Readout Image Sensor Prototype) tt_um_ja1tye_tiny_cpu (Tiny 8-bit CPU) tt_um_7seg_animated (Animated 7-segment character display) tt_um_neurocore (Neurocore) tt_um_zhwa_rgb_mixer (RGB Mixer) tt_um_wokwi_394704587372210177 (Cambio de giro de motor CD) tt_um_ian_keypad_controller (Keypad controller) tt_um_urish_spell (SPELL) tt_um_vks_pll (PLL blocks) tt_um_fountaincoder_top (multimac) tt_um_dsatizabal_opamp (Simple FET OpAmp with Sky130.) tt_um_obriensp_be8_nomacro (8-bit CPU with Debugger) tt_um_LFSR_shivam (10-bit Linear feedback shift register) tt_um_shivam (Pulse Width Modulation) tt_um_algofoogle_tt06_grab_bag (TT06 Grab Bag) tt_um_meiniKi_tt06_fazyrv_exotiny (FazyRV-ExoTiny) tt_um_wokwi_394888799427677185 (4-bit stochastic multiplier traditional) tt_um_QIF_8bit (8 Bit Digital QIF) tt_um_MATTHIAS_M_PAL_TOP_WRAPPER (easy PAL) tt_um_andrewtron3000 (Rule 30 Engine!) tt_um_tommythorn_4b_cpu_v2 (Silly 4b CPU v2) tt_um_aerox2_jrb8_computer (The James Retro Byte 8 computer) tt_um_wokwi_394898807123828737 (4-bit Stochastic Multiplier Compact with Stochastic Resonator) tt_um_argunda_tiny_opamp (Tiny Opamp) tt_um_fdc_chip (Frequency to digital converters (asynchronous and synchronous)) tt_um_8bit_cpu (8-Bit CPU In a Week) tt_um_mitssdd (co processor for precision farming) tt_um_fstolzcode (Tiny Zuse) tt_um_liu3hao_rv32e_min_mcu (tt06-RV32E_MinMCU) tt_um_kianV_rv32ima_uLinux_SoC (KianV uLinux SoC) tt_um_wokwi_395444977868278785 (*NOT WORKING* HP 5082-7500 Decoder) tt_um_wokwi_394618582085551105 (Keypad Decoder) tt_um_wokwi_395054820631340033 (Workshop Hackaday Juli) tt_um_wokwi_395055035944909825 (Some_LEDs) tt_um_wokwi_395055351144787969 (Hack a day Tiny Tapeout project) tt_um_wokwi_395054823569451009 (First TT Project) tt_um_wokwi_395054823837887489 (Dice) tt_um_wokwi_395055341723330561 (Workshop_chip) tt_um_jduchniewicz_prng (8-bit PRNG) tt_um_wokwi_395054564978002945 (Bestagon LED matrix driver) tt_um_wokwi_395054466384583681 (1-Bit ALU 2) tt_um_wokwi_395058308283408385 (test for tiny tapeout hackaday) tt_um_s1pu11i_simple_nco (Simple NCO) tt_um_wokwi_395055359324730369 (Tiny_Tapeout_6_Frank) tt_um_disp1 (Display test 1) tt_um_pckys_game (PCKY´s Successive Approximation Game) tt_um_tiny_shader_mole99 (Tiny Shader) tt_um_wokwi_393815624518031361 (My Chip) tt_um_minibyte (Minibyte CPU) tt_um_emilian_rf_playground (IDAC8 based on divide current by 2) tt_um_triple_watchdog (Triple Watchdog) tt_um_wokwi_395142547244224513 (EFAB Demo 2) tt_um_chisel_hello_schoeberl (Chisel Hello World) tt_um_aiju_8080 (8080 CPU) tt_um_wokwi_395134712676183041 (Inverters) tt_um_nubcore_default_tape (DEFAULT) tt_um_wuehr1999_servotester (Servotester) tt_um_wokwi_395055722430895105 (Servo Signal Tester) tt_um_exai_izhikevich_neuron (Izhikevich Neuron) tt_um_lisa (LISA 8-Bit Microcontroller) tt_um_wokwi_394707429798790145 (32-Bit Galois Linear Feedback Shift Register) tt_um_CKPope_top (X/Y Controller) tt_um_MNSLab_BLDC (Universal Motor and Actuator Controller) tt_um_couchand_dual_deque (Dual Deque) tt_um_JamesTimothyMeech_inverter (Programmable Thing) tt_um_signed_unsigned_4x4_bit_multiplier (Signed Unsigned multiplyer) tt_um_lipsi_schoeberl (Lipsi: Probably the Smallest Processor in the World) tt_um_i_tree_batzolislefteris (Anomaly Detection using Isolation trees) tt_um_wokwi_394830069681034241 (Cyclic Redundancy Check 8 bit) tt_um_rng_3_lucaz97 (RNG3) tt_um_wokwi_395263962779770881 (Bivium-B Non-Linear Feedback Shift Register) tt_um_dvxf_dj8 (DJ8 8-bit CPU) tt_um_silicon_tinytapeout_lm07 (Digital Temperature Monitor) tt_um_htfab_flash_adc (Flash ADC) tt_um_chisel_pong (Chisel Pong) tt_um_wokwi_395414987024660481 (HELP for tinyTapeout) tt_um_jorgenkraghjakobsen_toi2s (SPDIF to I2S decoder) tt_um_cmerrill_pdm (Parallel / SPI modulation tester) tt_um_csit_luks (CSIT-Luks) tt_um_wokwi_395357890431011841 (Trivium Non-Linear Feedback Shift Register) tt_um_drburke3_top (SADdiff_v1) tt_um_cejmu_riscv (TinyRV1 CPU) tt_um_rejunity_current_cmp (Analog Current Comparator) tt_um_loco_choco (BF Processor) tt_um_qubitbytes_alive (It's Alive) tt_um_wokwi_395061443288867841 (BCD to single 7 segment display Converter) tt_um_SJ (SiliconJackets_Systolic_Array) tt_um_ejfogleman_smsdac (8-bit DEM R2R DAC) tt_um_wokwi_395055455727667201 (Hardware Trojan Part II) tt_um_ericsmi_weste_problem_4_11 (Measurement of CMOS VLSI Design Problem 4.11) tt_um_wokwi_395034561853515777 (2 bit Binary Calculator) tt_um_mw73_pmic (Power Management IC) tt_um_Counter_1_shivam (8-bit Binary Counter) tt_um_wokwi_395054508867644417 (SynchMux) tt_um_otp_encryptor (TT06 OTP Encryptor) tt_um_wokwi_395514572866576385 (Parity Generator) tt_um_ADPCM_COMPRESSOR (ADPCM Encoder Audio Compressor) tt_um_3515_sequenceDetector (Sequence detector using 7-segment) tt_um_faramire_stopwatch (Simple Stopwatch) tt_um_ks_pyamnihc (Karplus-Strong String Synthesis) tt_um_dlmiles_muldiv8 (MULDIV unit (8-bit signed/unsigned)) tt_um_dlmiles_muldiv8_sky130faha (MULDIV unit (8-bit signed/unsigned) with sky130 HA/FA cells) tt_um_tommythorn_ncl_lfsr (NCL LFSR) tt_um_lk_ans_top (ANS Encoder/Decoder) tt_um_MichaelBell_latch_mem (Latch RAM (64 bytes)) tt_um_wokwi_395179352683141121 (Combination Lock) tt_um_Uart_Transciver (UART Transceiver) tt_um_dgkaminski (4-Bit ALU) tt_um_DigitalClockTop (TDM Digital Clock) tt_um_wokwi_394640918790880257 (IFSC Keypad Locker) tt_um_wokwi_395355133883896833 (BIT COMPARATOR) tt_um_alu (SumLatchUART_System) tt_um_alfiero88_VCII (VCII) tt_um_ALU (3-bit ALU) tt_um_topTDC (Convertidor de Tiempo a Digital (TDC)) tt_um_UABCReloj (24 H Clock) tt_um_CDMA_Santiago (CDMA_2024) tt_um_dr_skyler_clock (Clock) tt_um_motor (motor a pasos) tt_um_mult_2b (mult_2b) tt_um_CodHex7seg (Decodificador binario a display 7 segmentos hexadecimal) tt_um_S2P (Serial to Parallel Register) tt_um_PWM (PWM) tt_um_ss_register (serie_serie_register) tt_um_stepper (Stepper) tt_um_g3f (Generador digital trifásico) tt_um_ALU_DECODERS (ALU with a Gray and Octal decoders) tt_um_ram (4 bit RAM) tt_um_sap_1 (SAP-1 Computer) tt_um_guitar_pedal (Integrated Distorion Pedal) tt_um_mbalestrini_usb_cdc_devices (Two ports USB CDC device) tt_um_adammaj (Tiny ALU) tt_um_wokwi_395567106413190145 (4-Bit Full Adder and Subtractor with Hardware Trojan) tt_um_gak25_8bit_cpu_ext (Most minimal extension of friend's 'CPU In a Week' in a day) tt_um_hsc_tdc (UCSC HW Systems Collective, TDC) tt_um_BoothMulti_hhrb98 (UACJ-MIE-Booth 4) tt_um_dlmiles_poc_fskmodem_hdlctrx (FSK Modem +HDLC +UART (PoC)) tt_um_simplez_rcoeurjoly (tt6-simplez) tt_um_nurirfansyah_alits01 (Analog Test Circuit ITS: VCO) tt_um_ppca (drEEm tEEm PPCA) tt_um_wokwi_395522292785089537 (Displays CIt) tt_um_fpu (Dgrid_FPU) tt_um_duk_lif (Leaky Integrate and fire neuron(LIF)) tt_um_bomba1 (Latin_bomba) tt_um_chatgpt_rsnn_paolaunisa (ChatGPT designed Recurrent Spiking Neural Network) tt_um_bit_ctrl (Bit Control) tt_um_array_multiplier_hhrb98 (Array Multiplier) tt_um_wallace_hhrb98 (UACJ-Wallace multiplier) tt_um_I2C_to_SPI (TinyTapeout SPI Master) tt_um_rng (Random number generator) tt_um_wokwi_395599496098067457 (EVEN AND ODD COUNTERS) tt_um_8bitALU (8bit ALU) tt_um_aleena (Analog Sigmoid) tt_um_rejunity_1_58bit (Ternary 1.58-bit x 8-bit matrix multiplier) tt_um_rejunity_fp4_mul_i8 (FP4 x 8-bit matrix multiplier) tt_um_PWM_Controller (PWM Controller) tt_um_couchand_cora16 (CORA-16) tt_um_frq_divider (clk frequency divider controled by rom) tt_um_wokwi_390913889347409921 (Notre Dame Dorms LED) tt_um_timer_counter_UGM (4-Digit Scanning Digital Timer Counter) tt_um_koconnor_kstep (kstep) tt_um_lancemitrex (DIP Switch to HEX 7-segment Display) tt_um_PWM_Sine_UART (PWM_Sinewave_UART) tt_um_nicklausthompson_twi_monitor (TWI Monitor) tt_um_wokwi_395615790979120129 (Cambio de giro de motor CD) tt_um_ancho (Circuito PWM con ciclo de trabajo configurable) tt_um_wokwi_395618714068432897 (32b Fibonacci Original) tt_um_voting_thingey (Voting thingey) tt_um_hsc_tdc_buf (UCSC HW Systems Collective, TDC - BUF2x1) tt_um_hsc_tdc_mux (UCSC HW Systems Collective, TDC - MUX2x1) tt_um_petersn_micro1 (14 Hour Simple Computer) tt_um_sanojn_tlv2556_interface (UART interface to ADC TLV2556 (VHDL Test)) tt_um_gray_sobel (Gray scale and Sobel filter) tt_um_wokwi_395614106833794049 (Universal gates) Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available