# RISC-V Libre Computing



# Freedom hardware: Current state and forward-looking statements

V. Alex Brennen, MIT; Kurt Keville, MIT ISN / ORCD

The RISC-V architecture and ecosystem have grown tremendously in recent years. We will look at the current state of RISC-V and its deployment footprint, then discuss where RISC-V may be headed and the role it may play in completely open and free datacenter servers, tablets, and cellphones. We will review the emergence of the ARM architecture and how it may be an important stepping stone to a free computing platform, and compare the ARM licensing model with the x86\_64 licensing model, including the role of ARM processors in cellphones and cloud datacenters (such as AWS). Finally, we'll take a brief look at options for starting RISC-V free and open hardware development, for both experienced FPGA programmers and newbies: physical RISC-V processors, FPGAs, and software emulation.

# What is RISC-V

- Instruction Set Architecture (ISA)
- ISAs Are Licensed Like Software
- Other Active ISAs:
  - x86\_64 (Intel, AMD)
  - ARM64 (CPU: Graviton; SoC: A16, Snapdragon)
- Licensing Status
  - RISC-V (Open, Libre, Extensible, Free [BSD])
  - x86\_64 (Proprietary, Not Extensible, Very Expensive)
  - ARM64 (Extensible, Somewhat Expensive)

# **Current State of ISAs**

x86\_64: Mature, High Performance, Very Broad Support

- Examples: Intel/AMD Datacenter Servers, Desktops, Laptops

ARM64: Less Mature, Good Performance, Broad Support

- Examples: AWS Graviton, Apple, Cellphones, Tablets

RISC-V: Developing, OK Performance, OK Support

- Examples: Dev Boards (primarily), SoC Devices, Embedded Devices, Task Offloading (Daughter Processors/Boards [Nvidia GPUs])

## x86\_64 => ARM64 => RISC-V? GNU Parallel?

- Historically ISAs were proprietary intellectual property (think Oracle)
- x86\_64 brought collaboration (Intel & AMD); small step toward openness
- ARM64 brought lower cost and a broader licensing policy; larger step toward openness
- What Did ARM Licenses Look Like?
  - ARM licenses based on usage of CPUs
  - Some companies got perpetual licenses (Apple, Amazon?)
  - ARM would license to pretty much anyone
  - Often 10% or more of die free for special use (Example: Amazon's "Everything is the network", TLS, Sockets)
  - Possible to add instructions to ARM, but not easy; ARM Corp decides

# What About RISC-V

- Libre (BSD, mostly), take it if you want it
- Compatibility? That's handled with flags
- Instructions and flags can be added through an Open Standards process
  - Examples:
    - RVV (Vector Extensions (like Intel SSE & SSE2))
    - Crypto (US/NIST AES/SHA, Russian GOST)
- Progression:
  - Microsoft (Booooo!) => GNU/Linux => GNU
  - Intel/AMD/SGI/IBM/etc => ARM64 => RISC-V
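
The compatibility "flags" mentioned above are usually written as an ISA string such as `rv64imafdc_zicsr`, where each letter or underscore-separated name stands for one extension. A minimal Python sketch of reading such a string follows. The function names are illustrative, and the sketch assumes multi-letter extensions are always separated by underscores, which real toolchains do not strictly require.

```python
import re

def parse_isa(isa: str) -> dict:
    """Split a RISC-V ISA string into register width and extension names."""
    head, *multi = isa.lower().split("_")
    match = re.fullmatch(r"rv(32|64|128)([a-z]+)", head)
    if match is None:
        raise ValueError(f"not a RISC-V ISA string: {isa!r}")
    # "g" is shorthand for the general-purpose set; this sketch expands only
    # its single-letter members (the spec also implies Zicsr and Zifencei).
    letters = list(match.group(2).replace("g", "imafd"))
    return {"xlen": int(match.group(1)), "single": letters, "multi": multi}

def has_extension(parsed: dict, name: str) -> bool:
    """True if the named extension (e.g. "v" for RVV) is present."""
    name = name.lower()
    return name in parsed["single"] or name in parsed["multi"]

if __name__ == "__main__":
    cpu = parse_isa("rv64imafdc_zicsr_zifencei")
    print(cpu["xlen"], cpu["single"], cpu["multi"])
    print("vector support:", has_extension(cpu, "v"))
```

Software can use a check like `has_extension(cpu, "v")` to choose between a vectorized code path and a scalar fallback.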

## Amazon, Apple, Cloud Providers

- Will Amazon Move to RISC-V?
  - Perpetual License (ARM is Cost Free, RISC-V is Cost Free)
  - Which Develops Faster?
  - Libre Software Natively Developed on RISC-V?
  - Extension Porting (TCP/IP, Crypto, TLS)
  - Validation and Testing Workloads
    - ARM64 Everywhere in AWS
    - I Run Almost All My MIT Workloads on ARM64 (Graviton)
    - AWS Lambda (ARM64, Whether You Want It Or Not)
- Similar Story With Apple and Google (Tau T2A)
- Other Cloud Providers (Hetzner, Rackspace... Yandex)
  - Very Different Story
  - Especially Yandex

# **Politics: Sanctions**

- Russia & China Need Chips
  - Not Just Weapons: Medical Devices; Cars; Traffic Lights; HVAC; TVs; Phones; Washing Machines; Everything!
- ARM Canceled All Russian Licenses
- How Fast Will RISC-V Develop if Russia & China Make An Emergency Move To It?
- Note: Russia Has Proprietary VLIW ISA Elbrus Developed by MCST
- Elbrus Tested vs Intel by Yandex; Test Failed
- Security: Can You Trust Proprietary Chips?
  - Pentium 90 FDIV Bug (Design)
  - Crypto Bugs (SHA3 Overflow) (Software/Standard)
  - Backdoors? (Hardcoded Credentials) (Design)
  - Stealthily Doped Gates (Manufacturing)
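
The Pentium FDIV flaw is usually demonstrated with a single division, 4195835 / 3145727: affected chips returned roughly 1.33374 instead of 1.33382, so the residual `x - (x / y) * y` came out as 256 rather than 0. A minimal Python sketch of that residual check follows. The function name is illustrative, and on any modern, correct FPU the result is zero up to rounding.

```python
def fdiv_residual(x: float = 4195835.0, y: float = 3145727.0) -> float:
    """Residual of dividing and multiplying back; near zero on a correct FPU."""
    return x - (x / y) * y

if __name__ == "__main__":
    residual = fdiv_residual()
    print("residual:", residual)
    print("divider looks correct:", abs(residual) < 1e-6)
```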

# Getting Started With RISC-V

- Ubuntu has RISC-V images; Debian (same)
- C/C++; Python; Golang; Rust; Ruby; Java; etc...
- QEMU Emulation (How GNU/Linux packages are built)
  - Built-in RISC-V 32-bit and RISC-V 64-bit cores
- FPGA (Download Verilog RISC-V off GitHub)
- RISC-V hardware (Dev Boards: MangoPi, SiFive, StarFive (Chinese))

## Fully Functional RISC-V Processor in Verilog

[Figure: Verilog source listing of a compact RISC-V core. Recoverable details: a 32-entry register file, an ALU with an iterative shifter, a one-hot funct3 branch predicate, byte/halfword load and store alignment, a cycle counter, and a four-state one-hot state machine (FETCH\_INSTR, WAIT\_INSTR, EXECUTE, WAIT\_ALU\_OR\_MEM).]
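
A core like the one on this slide begins by slicing each 32-bit instruction word into fixed fields and sign-extending the immediates. A minimal Python sketch of that decode step follows, using the standard RV32I field positions. The function and key names are illustrative and are not taken from the slide's Verilog.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)

def decode(instr: int) -> dict:
    """Split a 32-bit RISC-V instruction word into its base fields."""
    instr &= 0xFFFFFFFF
    funct3 = (instr >> 12) & 0x7
    return {
        "opcode": instr & 0x7F,            # bits 6:0
        "rd": (instr >> 7) & 0x1F,         # bits 11:7
        "funct3": funct3,                  # bits 14:12
        "funct3_onehot": 1 << funct3,      # one bit set per funct3 value
        "rs1": (instr >> 15) & 0x1F,       # bits 19:15
        "rs2": (instr >> 20) & 0x1F,       # bits 24:20
        "funct7": (instr >> 25) & 0x7F,    # bits 31:25
        "imm_i": sign_extend(instr >> 20, 12),         # I-type immediate
        "imm_u": sign_extend(instr & 0xFFFFF000, 32),  # U-type immediate
    }

if __name__ == "__main__":
    # 0x00A00093 encodes "addi x1, x0, 10".
    fields = decode(0x00A00093)
    print(fields["opcode"], fields["rd"], fields["rs1"], fields["imm_i"])
```

The one-hot `funct3` value mirrors a common hardware trick: comparing against a single bit is cheaper in gates than comparing a 3-bit field.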

# Forward-Looking Statements

- "Software is eating the world." Marc Andreessen
- Libre software/hardware evolves faster and has more eyes (fewer bugs/security issues)
- Libre software/hardware is cheaper
- A Libre CPU is inevitable, and RISC-V is the most likely candidate; therefore RISC-V is inevitable
- AI can help with hardware design
  - Altium Designer (7-Layer board path timing, EM interference prevention trace and component spacing)
  - Google chip design lead's MIT AI Lab talk
    - Google uses AI to help design chips at the gate level
- Imagine the future full stack is all Libre software (including CPU)
  - Imagine a profiler can make RISC-V CPU design adjustments/decisions
    - Lots of AES128 crypto? More AES128 crypto modules
    - Lots of network traffic? More L2 cache and paths to RAM/NIC
    - Simulations w/ bignum math? More 1024bit vector registers
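
The profiler idea above can be pictured as a small rule table that maps a workload profile to design suggestions. The Python sketch below is purely illustrative: the profile keys, the 25% threshold, and the function name are hypothetical, and only the three suggestions come from the slide.

```python
# Hypothetical profile keys mapped to the design responses listed on the slide.
RULES = [
    ("aes128_fraction", "add more AES128 crypto modules"),
    ("network_fraction", "add more L2 cache and paths to RAM/NIC"),
    ("bignum_fraction", "add more 1024-bit vector registers"),
]

def suggest_design_changes(profile: dict, threshold: float = 0.25) -> list:
    """Return a suggestion for each workload share at or above the threshold."""
    return [advice for key, advice in RULES
            if profile.get(key, 0.0) >= threshold]

if __name__ == "__main__":
    # Made-up profile: shares of execution time per workload category.
    profile = {"aes128_fraction": 0.40, "network_fraction": 0.10}
    for advice in suggest_design_changes(profile):
        print(advice)
```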

### Leading Alternative FOSSi Hardware Options

- 1) RISC-V. https://riscv.org/
- 2) MIPS (Early PlayStation consoles)
- 3) OpenRISC, DLX, and predecessors
- 4) J-Core, formerly SuperH from Hitachi. https://j-core.org/
- 5) OpenPiton. http://www.openpiton.org/
- 6) Epiphany. https://www.adapteva.com/
- 7) Nyami. https://www.cs.binghamton.edu/~millerti/nyami-ispass2015.pdf
- 8) GPLGPU. https://github.com/asicguy/gplgpu
- 9) Proof of Life needed: OpenPOWER, OpenSPARC, Geode
- 10) EOLed hardware with deprecated GNU/Linux port projects

## The RV Trade Space

#### C class microcontroller

- 32-bit, 3-8 stage, in-order variant aimed at 50-250 MHz microcontroller variants
- Optional memory protection
- Very low power static design
- Fault Tolerant variants for ISO26262 applications
- IoT variants will have compressed/reduced ISA support

#### I class processors

- 64-bit, 1-4 core, 5-8 stage out of order, aimed at 200 MHz-1 GHz industrial control / general purpose applications
- Devices aimed at networking applications will have dual-quad issue support
- Other features: shared L2 cache, AXI bus, threading support

#### M class processors

- Enhanced variants of the I-class processors aimed at general purpose compute, low end server and mobile applications
- Enhancements over I class: large issue size, quad-threaded, up to 8 cores, frequency up to 2.5 GHz, optional NoC fabric

#### S class processors

- 64-bit superscalar, multi-threaded variant for desktop/server applications: 1.2-3 GHz, 2-16 cores, crossbar/ring interconnect, segmented L3 cache
- RapidIO based external cache coherent interconnect for multi-socket applications (up to 256 sockets)
- Hybrid Memory Cube support, 256/512 bit SIMD
- Specialized variants with functional units (FUs) for database acceleration and security acceleration
- Experimental variants will be used as a test-bed for our Adaptive System Fabric project, which aims to design a data-center architecture that uses NVRAM devices and unified interconnects for memory, storage and networking, and leverages persistent memory techniques

#### H class processors

- 64-bit in-order, multi-threaded, HPC variant with 32-100 cores, 512 bit SIMD, Interconnect TBD
- Goal is 3-5+ TFLOPS (DP, sustained)
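
The H-class goal can be sanity-checked with the usual peak-throughput product: cores × clock × SIMD lanes × 2, counting a fused multiply-add as two operations. The Python sketch below assumes one 512-bit FMA pipe per core and treats the clock as a free variable, since the slide gives none for this class. Both are illustrative assumptions, and peak throughput is only an upper bound on the sustained figure the slide asks for.

```python
def peak_dp_gflops(cores: int, ghz: float,
                   simd_bits: int = 512, fma_pipes: int = 1) -> float:
    """Peak double-precision GFLOPS; an FMA counts as two operations."""
    lanes = simd_bits // 64  # 64-bit doubles per SIMD register
    return cores * ghz * lanes * 2 * fma_pipes

def ghz_needed(target_tflops: float, cores: int,
               simd_bits: int = 512, fma_pipes: int = 1) -> float:
    """Clock rate required for the peak to reach the target."""
    per_ghz = peak_dp_gflops(cores, 1.0, simd_bits, fma_pipes)
    return target_tflops * 1000.0 / per_ghz

if __name__ == "__main__":
    for cores in (32, 100):
        print(cores, "cores need", round(ghz_needed(3.0, cores), 2),
              "GHz for a 3 TFLOPS peak")
```

Under these assumptions a 100-core part needs just under 2 GHz to reach a 3 TFLOPS peak, while a 32-core part would need a clock near 6 GHz, so the low end of the core range implies more than one FMA pipe per core.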

#### T class processors

Experimental security oriented 64-bit variants with tagged ISA, single address space support, decoupling of protection from memory management.

### **Leading RV Companies**

- 1) SiFive
- 2) RIOS Labs
- 3) Imperas
- 4) Andes
- 5) Esperanto
- 6) Ventana
- 7) Codasip
- 8) Imagination Technologies
- 9) Renesas, Espressif, Syntacore
- 10) Google, Nvidia, Intel, Qualcomm, Seagate

### **Leading Projects**

- 1) Mentorship program. https://riscv.org/risc-v-mentorship-program/
- 2) RV Exchange. https://riscv.org/exchange/
- 3) OpenHW. https://www.openhwgroup.org
- 4) Chipyard. https://github.com/ucb-bar/chipyard
- 5) OpenCompute. https://www.opencompute.org/
- 6) lowRISC. https://lowrisc.org/
- 7) OpenTitan. https://opentitan.org/
- 8) OpenCores. https://opencores.org/
- 9) FOSSi Foundation. https://www.fossi-foundation.org/
- 10) Projects List. https://github.com/riscvarchive/riscv-cores-list

### **Interesting Products**

- 1) MangoPi board
- 2) RV Soldering Iron
- 3) Multiarchitecture. https://www.bunniestudios.com/blog/?p=6606
- 4) Many products on https://www.crowdsupply.com/search?q=risc-v

## The Bad News

- 1) Little market penetration in any vertical
- 2) No product in many verticals yet
- 3) HPCwire article. https://www.hpcwire.com/2022/11/18/risc-v-is-far-from-being-an-alternative-to-x86-and-arm-in-hpc/
- 4) Nvidia article hpcwire.com/2022/09/23/nvidia-shuts-out-risc-vsoftware-support-for-gpus

#### Conferences

- 1) Boston Area Architecture Conference. https://bostonarch.github.io/2022/
- 2) RISC-V Summit. https://riscv.org/event/risc-v-summit-2022/
- 3) Princeton Comp Arch Day. https://bit.ly/pton-comp-arch-day-2023
- 4) LatchUp. http://latchup.io/
- 5) Week Of Open Source Hardware. https://www.fossi-foundation.org/wosh/
- 6) RISC-V lecture next week. https://www.csail.mit.edu/event/risc-v-inevitable
- 7) Open Hardware Summit. https://2023.oshwa.org/
- 8) HPC on RV Workshop. https://riscv.epcc.ed.ac.uk/community/isc23-workshop/
- 9) MEWD. https://stamcenter.asu.edu/mewd-workshop/
- 10) OGHPC (OG = Oil & Gas). https://www.energyhpc.rice.edu/
- 11) HPC on Wall Street. https://www.hpcaiwallstreet.com/

# Decidedly Not FOSSi hardware (but interesting in the HPC space)

- 1) Morello Board (CHERI architecture)
- 2) Neocortex WSE
- 3) Lucata, formerly Emu
- 4) Alpha (big in China)
- 5) Y2K snapshot: Sun, SGI, Cray, DEC, SiCortex, Compaq, HPE
- 6) From the Debian-ports list: S390, BlueGene, CBE, hppa, ia64
- 7) Further back: Wang, Symbolics, MAI Basic, Kendall Square Research, Thinking Machines, Apollo, Prime, Mercury, Data General, ADI
- 8) Various FPGA based dev boards or SoCs

### RISC-V in the Datacenter

- 1) What workloads can be moved over?
- 2) What unique aspects of the RV design can be taken advantage of?
- 3) Compare to the incumbents (TOP500.org, Green500.org)
- 4) https://www.opennovation.org/
- 5) Advancing HPC with RISC-V. https://youtube.com/watch?v=iFlcJFcOJKk
- 6) https://riscv.org/news/2018/10/top-500-article-europeans-budget-1-4-billion-euros-to-build-next-generation-supercomputers/
- 7) https://semiengineering.com/is-risc-v-ready-for-supercomputing/
- 8) https://www.theregister.com/2023/02/08/riscv\_hpc/
- 9) https://semiengineering.com/risc-v-targets-data-center/

### **Future Directions**

- 1) RV128. http://rv128.mit.edu/
- 2) HW / SW Co-design. http://riscv.mit.edu/
- 3) Posits replacing the IEEE 754 Floating Point standard. https://posithub.org/docs/RISC-V/RISC-V.htm

News article: "Posits, a New Kind of Number, Improves the Math of AI: The first posit-based processor core gave a ten-thousandfold accuracy boost" (Dina Genkina, 25 Sep 2022)
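
A posit packs a sign bit, a variable-length regime (a run of identical bits), up to `es` exponent bits, and a fraction into one word. A minimal Python sketch of decoding a posit into an exact rational value follows. It assumes 8-bit posits with `es = 2`. The function name and defaults are illustrative and are not taken from the talk or from posithub.org.

```python
from fractions import Fraction

def decode_posit(bits: int, n: int = 8, es: int = 2):
    """Decode an n-bit posit to an exact Fraction; None stands for NaR."""
    mask = (1 << n) - 1
    bits &= mask
    if bits == 0:
        return Fraction(0)
    if bits == 1 << (n - 1):
        return None  # NaR ("not a real")
    negative = bool(bits >> (n - 1))
    if negative:
        bits = (-bits) & mask  # negative posits are stored in two's complement
    # Bits after the sign, most significant first.
    body = [(bits >> i) & 1 for i in range(n - 2, -1, -1)]
    run = 1
    while run < len(body) and body[run] == body[0]:
        run += 1
    k = run - 1 if body[0] == 1 else -run  # regime value
    rest = body[run + 1:]                  # skip the bit that ends the run
    exp_bits = rest[:es] + [0] * max(0, es - len(rest))  # pad cut-off bits
    exponent = 0
    for bit in exp_bits:
        exponent = (exponent << 1) | bit
    fraction = Fraction(0)
    for i, bit in enumerate(rest[es:], start=1):
        fraction += Fraction(bit, 1 << i)
    value = (1 + fraction) * Fraction(2) ** (k * (1 << es) + exponent)
    return -value if negative else value

if __name__ == "__main__":
    for word in (0x40, 0x44, 0x50, 0xC0, 0x7F):
        print(hex(word), decode_posit(word))
```

Because the regime grows or shrinks with magnitude, values near 1 keep the most fraction bits, which is the property behind the accuracy claims made for posits.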



# **Questions?**

- V. Alex Brennen
  - vab@mit.edu
- Kurt L. Keville
  - klk@mit.edu

