Welcome to SecureRISC.org

I created this site to explore some unconventional ideas for processor architecture. For decades, this material existed only as notes, scribbles, thoughts, and so on, but I have recently written some of it down in the hope that it might someday be useful, at least for discussion purposes.

The Author’s Journey to This Proposal

In the 1970s I was programming on PDP-11s (assembler), PDP-10s (assembler and Lisp on ITS), and Multics (PL/1). Lisp is a dynamically typed language with garbage collection, and later in my career, I worked on a Common Lisp implementation for MIPS processors. I have retained an interest in better support for languages with dynamic typing and garbage collection. Also in the 1970s, members of the same office were programming the 6600 and later the Cray-1 for computational fluid dynamics, and via my indirect exposure thereto I have retained an appreciation of the Cray-1 approach to processor architecture, including RISC and vectors. Other friends were working on Lisp machines (e.g. the Symbolics 3600), and I have retained an interest in some of their problem solutions.

I have been involved in designing processor architectures since the early 1980s when I was contributing to a compiler (Pastel) and operating system (Amber) for a new processor architected by others. That big machine was called the S-1 Mark II, and several of us thought it might be possible to create a much simpler processor and explored something we called, without much imagination, the S-2, which would fit on 2 boards of ECL logic. While I wasn’t familiar with the term, the S-2 was very similar to the RISC ideas germinating in academia at the same time, except it had exactly one memory operand per instruction (similar to the PDP-10), but was otherwise a simple fixed 32-bit instruction format with a general purpose register file and simple pipelined instructions (at the time many machines were instead microcoded). Because the S-1 Mark II operating system team were writing a new operating system, incorporating many Multics ideas and many new ones, the Mark II and S-2 had Multics-like virtual memory and protection features. In subsequent years I have worked on processor architectures without such features, which the new operating system dominating the scene, Unix (later Linux), did not know how to use, but I retained an interest in reviving such features.

In 1985, I joined the MIPS compiler team to work on the code generator, but later transitioned to become Director of Architecture after the first architect left. As a member of the compiler team, I disagreed with many design choices in the original MIPS ISA, such as the lack of load interlocks, branch delay slots, absolute addresses in JAL, and so on, but it was too late to change them. These positions gave me an up-close view of academic RISC architectures. One of my early tasks as architect was to define the 64‑bit extension to the existing 32‑bit ISA in the late 1980s. Since that time I have felt that 32‑bit ISAs should generally be retired in favor of 64‑bit ones.

In the late 1990s I was working at SGI on out-of-order processors, and I did various studies of the Instruction Level Parallelism (ILP) available in programs with infinite parallelism, but with latencies based on cache hierarchies and branch prediction (e.g. an L1 data-cache hit might have a 2‑cycle latency, but an L1 miss, L2 hit might have a 6‑cycle latency). What was fascinating to me was that instruction fetch was the limiting factor for ILP in these studies, probably because branch prediction in the 1990s was not terribly good. Today with modern branch prediction, e.g. TAGE, this may no longer be as true, but I do think that techniques which address the instruction fetch bottleneck can still be valuable. The problem is that instruction fetch is like linked list processing, but worse due to parsing at each list node to find the next link. Linked list processing is particularly latency sensitive, and is often replaced by array processing in high-performance computing where possible because of the performance advantages. I tried to find ways to make instruction fetch more like array processing, but didn’t succeed, and so settled for reducing the parsing at each list node. I call the general approach block-structured instruction sets. One can think of it as replacing the Branch Target Buffer (BTB), which most contemporary processors create on-the-fly, with a compiler-generated structure resident in code segments. This approach turned out to have many other advantages (e.g. prefetching, line size fills, control flow integrity features, better support for parallel instruction decode, and so on), and I have retained an interest in exploring the potential of block-structured Instruction Set Architectures (ISAs).

It has been apparent for a long time that bad programming languages (e.g. C and C++) have encouraged programmers to write software with many security flaws. Early on I thought our industry would learn to do better over time. By the late 1990s, however, I was not so sure. I began to think it would be valuable for processors to introduce features to help languages, compilers, and operating systems address the lack of checking that leads to such flaws. As the decades have passed and little progress has been made, I have retained an interest in exploring the options for this in processor architecture.

Also in the late 1990s, when I was working at Silicon Graphics, some people there were proposing micro-architectural alternatives to Out-of-Order (OoO) for latency tolerance. I don’t know, but I suspect this was based on the decoupling in earlier scientific computing processors such as the Astronautics ZS-1. While I don’t think such decoupling is a replacement for OoO architectures, I have retained an interest in instruction sets that keep decoupling feasible, which either might be combined with OoO, or used in low-end implementations of an ISA that are less susceptible to some of the many security flaws introduced by speculation on OoO implementations (e.g. Spectre, Meltdown, Foreshadow, PACMAN, Retbleed, etc.).

Between 1997 and 2001 I worked on a processor architecture that targeted, among other things, Digital Signal Processing (DSP), and in that field it was common for processors to have zero-overhead loop features, and so Tensilica’s Xtensa ISA did as well. Given that such features solve branch prediction for such loops, and thereby generally make branch prediction elsewhere more effective, I have retained an interest in ISA features that may help in this regard, even if not necessarily being zero overhead. (A simple zero overhead loop feature in a block-structured ISA might be the ability to repeat the next basic-block N times, but that would not be general enough to support loops with branches, so I have tended toward other methods.)

Tensilica’s Xtensa ISA’s highest design priorities were extensibility and code size, and I chose to give Xtensa register windows because of their enormous benefits to code size, primarily in making function entry/exit small, which also optimizes performance for function-call intensive programs. Unfortunately, I don’t see register windows as fitting particularly well into the current ISA proposals, given the heterogeneous register file, despite their advantages for function-call intensive languages and programs. Xtensa’s register windows were much more efficient than SPARC’s (I think SPARC basically killed the register window idea for subsequent ISAs by making the register window increment 16; Xtensa instead has increments of 0, 4, 8, and 12, which allows it to get more utility out of 64 physical registers than SPARC did with its 144).

In 2022 I encountered the University of Cambridge Capability Hardware Enhanced RISC Instructions (CHERI) research effort. I found their work impressive, despite some concerns, and have sought to make my proposals be CHERI-capable for applications which can tolerate doubleword pointers. I don’t see doubleword pointers being used for everything in a system however, and so the current proposals support CHERI capabilities without requiring them everywhere.

Given the above interests, I have tinkered with a few different processor architectures that combine these things: more sophisticated virtual memory and protection, block-structured ISAs, better branch prediction, and things that address security, such as better bounds and other checking, CHERI capabilities, and support for garbage collection and dynamic typing. Some of these things are synergistic. For example, garbage collection can be a security feature, as explicit memory reclamation can lead to programming errors that introduce security issues, and some of the features that support dynamic typing support other capabilities. This synergy has encouraged me to propose architectures with all of these features.

In the late 2010s, long after I had been thinking about unconventional processor architectures, I was introduced to the RISC‑V ISA, which seemed very conventional, i.e. much like the MIPS ISA that I had worked on in the 1980s and 1990s, but which had cleaned up MIPS’ worst warts and worked on modernizing its virtual memory from the days when I last followed it. Thus RISC‑V is very much a conventional ISA, and because RISC‑V is open-source, I find it a useful point of comparison to my explorations, and I have often modified my exposition to be more RISC‑V centric and even to adopt some of RISC‑V’s innovations when they fit.

In 2023, after I described a little of the block-structured ISA idea to a colleague, he pointed me to Bird et al.’s 1993 Supercomputing paper The Effectiveness of Decoupling. The paper unfortunately did not go into detail on the mechanisms, so it is not possible to say how much of their Control Decoupling presages the block-structured ISA, but there is a modest amount of similarity between what they called the Control Processor and what I propose below, where it is called the Basic Block Engine. Their Address and Data Processor separation is also similar to some of the structures in the proposed ISA that facilitate decoupling of address generation and computation along the lines of the ZS-1 cited earlier.

One Particular Exploration

One particular exploration of these ideas has been developed a little further than others. For the time being I am calling it SecureRISC in the hope that over time I can improve it to live up to its name. It has been in the back of my mind for decades, but it got slightly more attention after my last full-time employment ended in 2001. Over time, I might introduce other explorations on these pages. For example, while SecureRISC is block-structured, there are additional ways in which one might take block-structured ISAs further, such as facilitating register renaming on blocks, rather than on individual instructions (I have tinkered with this and it seems promising). But I leave that for the future.

Please understand that SecureRISC is not a specification at this point. It is a set of explorations, some spelled out in detail, some less specific, written down with the intent that they could lead to useful discussion. So with that introduction, here is the current SecureRISC proposal.

Block Structured Renaming

Since I mentioned it above, I will elaborate slightly. Imagine the basic block descriptor included, in addition to what SecureRISC includes, the set of source registers used by the basic block, and the set of output registers of the basic block. Renaming could be done for the basic block as a whole, rather than on each instruction in the block. Within the basic block, instruction sources would either reference the Nth source register to the block or the result of the Nth instruction local to the basic block. Instructions would not need explicit destination register fields as a result (this would be in the basic block descriptor).
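
To make the idea a little more concrete, here is a minimal sketch of renaming done once per basic block. It is only an illustration under stated assumptions, not the SecureRISC design: the names BlockDescriptor, Instr, and rename_block, the operand encoding, and the trivial "next free register" allocator are all hypothetical, and the sketch allocates one physical register per instruction result purely for simplicity.

```python
from dataclasses import dataclass

@dataclass
class BlockDescriptor:
    # Hypothetical descriptor fields, invented for this sketch.
    sources: list   # architectural registers read by the block, in order
    outputs: dict   # architectural register -> index of the local instruction producing it

@dataclass
class Instr:
    op: str
    operands: list  # each operand is ("src", n) or ("local", n); no destination field

def rename_block(desc, instrs, map_table, next_phys):
    """Rename one basic block as a unit.

    map_table maps architectural to physical registers and is updated in place.
    next_phys is the first free physical register (a deliberately naive allocator).
    Returns (renamed instructions, next free physical register).
    """
    # One map-table read per block source, done once at block entry.
    src_phys = [map_table[reg] for reg in desc.sources]
    # The Nth local result gets a physical register by position alone.
    local_phys = [next_phys + i for i in range(len(instrs))]
    renamed = []
    for i, ins in enumerate(instrs):
        ops = [src_phys[n] if kind == "src" else local_phys[n]
               for kind, n in ins.operands]
        renamed.append((ins.op, local_phys[i], ops))
    # One map-table write per block output, done once at block exit.
    for reg, idx in desc.outputs.items():
        map_table[reg] = local_phys[idx]
    return renamed, next_phys + len(instrs)

# Example block: t = a0 + a1; a0 = t * a1
desc = BlockDescriptor(sources=["a0", "a1"], outputs={"a0": 1})
instrs = [Instr("add", [("src", 0), ("src", 1)]),
          Instr("mul", [("local", 0), ("src", 1)])]
table = {"a0": 10, "a1": 11}
renamed, next_free = rename_block(desc, instrs, table, 32)
print(renamed)
print(table)
```

The point of the sketch is that the map table is consulted only for the block's declared sources and updated only for its declared outputs; the instructions themselves are resolved by position within the block.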

RISC-V Proposals

SecureRISC is the primary target for these pages, but occasionally I take ideas from SecureRISC and adapt them for RISC‑V. They are on here primarily because of the tools I created for producing SecureRISC register figures. If any of these proposals were to generate interest in the RISC‑V world, it would be necessary to convert them to asciidoc.

RISC-V Garbage Collection

A primary goal of SecureRISC is to support Garbage Collection (GC) efficiently and SecureRISC Garbage Collection describes the proposal for this. Most of what is proposed for SecureRISC has been adapted to RISC‑V and is described in proposal for RISC‑V GC.

Alternative 64‑bit Translation for RISC‑V

Currently 64‑bit RISC‑V has Sv39, Sv48, and Sv57 translation models for its supervisors using 3, 4, and 5‑level page tables with 512 PTEs per level for virtual address spaces of −2^38..2^38−1, −2^47..2^47−1, and −2^56..2^56−1 respectively, with a 56‑bit physical address space. An obvious extension to Sv64 using a 6‑level page table for an address space of −2^63..2^63−1 is likely someday. As an alternative, I have created a 64‑bit translation for RISC‑V called Ssv64 based on a subset of the SecureRISC translation proposal that I believe has significant advantages compared to the existing three models and the obvious extension to 64 bits.
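
The sizes of the existing Sv39, Sv48, and Sv57 address spaces follow from simple arithmetic: a 4 KiB page gives a 12‑bit page offset, and each level of 512 PTEs translates 9 more bits. The following is a small sketch of that arithmetic; the function name sv_range is made up for illustration.

```python
PAGE_OFFSET_BITS = 12   # 4 KiB base pages
BITS_PER_LEVEL = 9      # 512 PTEs per level = 2**9

def sv_range(levels):
    """Return (virtual address bits, lowest address, highest address)."""
    va_bits = PAGE_OFFSET_BITS + BITS_PER_LEVEL * levels
    half = 1 << (va_bits - 1)   # virtual addresses are sign-extended
    return va_bits, -half, half - 1

for levels in (3, 4, 5):
    va_bits, lowest, highest = sv_range(levels)
    print(f"Sv{va_bits}: {levels} levels, {lowest:#x} .. {highest:#x}")
```

By the same arithmetic, six full levels would cover 12 + 6×9 = 66 bits, so a 6‑level Sv64 would presumably use only part of its top-level table.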

Ssv64 was designed to be as RISC‑V Sv57 etc. compatible as possible, which meant changing a number of things carried over from SecureRISC. In March 2023 I back-ported some of those changes into SecureRISC, since in most cases they don’t reduce SecureRISC functionality and gratuitous incompatibility isn’t helpful.

Alternative Smmtt Proposal

Proposal for Alternative Smmtt is a proposal that borrows from Ssv64 above to replace the fixed table structure in the proposed RISC‑V Smmtt extension with something more flexible.

Tagged RISC-V

The RISC‑V multiverse concept has been proposed, with universes representing aligned entities within a general RISC‑V framework. For example, CHERI RISC‑V is considered a separate universe from the main line of RISC‑V. I propose that a stepping stone to SecureRISC might be a new universe called Tagged RISC‑V, which is basically the RISC‑V instruction set with SecureRISC Virtual Memory and tagging, including using two tags for CHERI-128 capabilities. Unfortunately this loses the Block Structured aspect of SecureRISC, and the Control Flow Integrity aspects associated with it, but it can provide bounds checking with CHERI, or sized pointers or cliques, as well as better Garbage Collection, and support for runtime typing.

Proposal for RISC‑V Matrix

SecureRISC has a matrix accumulator feature that I propose for extending the RISC‑V Vector ISA for AI.


No Junk Email! SecureRISC.org Mail Policy
Do not send unsolicited commercial email (i.e. spam) to this site!
We reserve the right to charge up to US$5000 per violation.

<webmaster at securerisc.org>
2024-09-02