This presentation is an HTML5 website

Press key to advance.

JIT Compiling Domain Specific Languages

Jeremy Voorhis

JIT Compiling DSLs
  • Survey of DSLs
  • LLVM
  • Implementing a Compiled DSL
What are DSLs?

Domain Specific Languages are programming languages biased towards a particular problem domain.

Advantages of DSLs
  • Expressive
  • Share language with domain experts
  • Hide implementation details
  • Emphasis on meaning
The Payoff of DSL Technology
Paul Hudak (1998) Modular Domain Specific Languages and Tools
DSLs are old news

The embedded language technique dates back to 1966, in Peter Landin's The Next 700 Programming Languages

What's My Model?
  • Represent domain concepts
  • Use the language
    • In identifiers
    • In program structure
DSL Execution

Interpreted or compiled?

Interpreted DSLs
  • Easiest to implement
  • Easiest to maintain
  • Good choice in absence of performance requirements
Compiled DSLs
  • Greater initial overhead
  • Generate code in some target language
  • JIT compilation
  • Generate machine code directly
Embedded or External?
External DSLs

External DSLs define a custom syntax and semantics.

Embedded DSLs

Embedded DSLs reuse the syntax and semantics of their host language.

Musical Examples
  • Csound
  • ChucK
  • Faust
  • ...
Csound
Orchestra
instr 1			
	asig	oscil	10000, 440, 1
	out	asig
endin
Score
f1	0	256	10	1	;  a sine wave function table

;  a pentatonic scale
i1	0	.5  	0	8.01	
i1 	.5   	.	.	8.03	
i1	1.0   	.	.	8.06	
i1	1.5   	.	.	8.08	
i1	2.0   	.	.	8.10	
e				
ChucK
Digital Delay
// patch
adc => DelayL delay => dac;

// set delay parameters
.75::second => delay.max => delay.delay;

// infinite time loop
while( true ) 1::second => now;
Faust
Noise Generator
random  = +(12345)~*(1103515245);
noise   = random/2147483647.0;
process = noise * vslider("vol", 0, 0, 1, 0.1);
LLVM
LLVM logo
What is LLVM?
  • A reusable compiler backend
  • An umbrella for compiler tools
    • Clang
    • MC
    • compiler-rt
    • KLEE
reusable code for doing compilery things Chris Lattner
LLVM as a Compiler Backend
  • Generate IR
  • Run Passes
  • Generate Code
Intermediate Representation
IR language as linguistic switchbox connecting all stages of compilation Wesley Smith, Graham Wakefield (2009) Augmenting Computer Music With Just-In-Time Compilation
IR Features
  • Single Static Assignment
  • Type System
  • Modules
  • Metadata
  • Target Independent
  • Excellent Documentation
IR Example: Factorial
; ModuleID = 'Factorial'

define i64 @fac(i64) {
; <label>:1
  %2 = icmp eq i64 %0, 1                ; <i1> [#uses=1]
  br i1 %2, label %7, label %3

; <label>:3                             ; preds = %1
  %4 = sub i64 %0, 1                    ; <i64> [#uses=1]
  %5 = call i64 @fac(i64 %4)            ; <i64> [#uses=1]
  %6 = mul i64 %0, %5                   ; <i64> [#uses=1]
  br label %7

; <label>:7                             ; preds = %3, %1
  %8 = phi i64 [ 1, %1 ], [ %6, %3 ]    ; <i64> [#uses=1]
  ret i64 %8
}
Passes
  • Built-in passes
    • Instruction Combining
    • Loop Unrolling
    • ...
  • Custom pass API
Pass Manager
  • Define queue of passes
  • Passes may be repeated
  • Run at module or function level
Sample Pass Queue
  • Promote Memory To Register
  • Combine Redundant Instructions
  • Reassociate Expressions
  • Global Value Numbering
  • Dead Store Elimination
  • Simplify Control Flow Graph
  • ...
LLVM API
  • C++ classes
  • C bindings (incomplete)
  • ll
  • bc
LLVM For Your Language
  • Ruby
  • OCaml
  • Haskell
  • Python
  • C bindings are versatile
ruby-llvm
  • Written in pure Ruby
  • Interfaces to C via FFI
ruby-llvm Features
  • IR Builder
  • Pass Manager
  • Execution Engine
ruby-llvm Sample
mod = LLVM::Module.create("Factorial")

mod.functions.add(
  "fac",
  [LLVM::Int], LLVM::Int
) do |fac, p0|
  entry   = fac.basic_blocks.append
  recur   = fac.basic_blocks.append
  result  = fac.basic_blocks.append
  # continued →
end
ruby-llvm Sample Cont'd
  builder = LLVM::Builder.create
  
  builder.position_at_end(entry)
  builder.cond(
    builder.icmp(:eq, p0, LLVM::Int(1)),
    result, recur)
  
  builder.position_at_end(recur)
  fac_call = builder.call(fac,
               builder.sub(p0, LLVM::Int(1)))
  fac_result = builder.mul(p0, fac_call)
  builder.br(result)
  
  builder.position_at_end(result)
  builder.ret(
    builder.phi(LLVM::Int,
      LLVM::Int(1), entry,
      fac_result,   recur))
ruby-llvm Sample Cont'd
; ModuleID = 'Factorial'

define i64 @fac(i64) {
; <label>:1
  %2 = icmp eq i64 %0, 1                ; <i1> [#uses=1]
  br i1 %2, label %7, label %3

; <label>:3                             ; preds = %1
  %4 = sub i64 %0, 1                    ; <i64> [#uses=1]
  %5 = call i64 @fac(i64 %4)            ; <i64> [#uses=1]
  %6 = mul i64 %0, %5                   ; <i64> [#uses=1]
  br label %7

; <label>:7                             ; preds = %3, %1
  %8 = phi i64 [ 1, %1 ], [ %6, %3 ]    ; <i64> [#uses=1]
  ret i64 %8
}
ruby-llvm Future Work
  • Migrate to LLVM 2.7
  • Binary distribution
  • Bitcode reader/writer
  • C backend
  • Clang bindings
Siren
  • Embedded, JIT compiled DSL
  • Realtime execution
  • Events for control
  • Voices for synthesis
Siren: Events
Events encode musical structure
Event = Rest duration
      | Note duration params
      | Seq Event Event
      | Par Event Event
Excerpt of Satie's First Gymnopedie
rest(1) & note(1, fs5) & note(1, a5) &
note(1, g5) & note(1, fs5) & note(1, cs5) &
note(1, b4) & note(1, cs5) & note(1, d5) &
note(3, a4)
Modeling Sound
Sound is a pressure wave
Modeling Sound
Time      = ℝ
Amplitude = ℝ
Sound     = Time → Amplitude
Modeling Sound
Time      = Float
Amplitude = Float
Sound     = Time → Amplitude
Siren: Voices
A sine wave oscillator
class Oscil < Siren::Voice
  param :freq, :float, 0.0
  param :amp, :float, 0.5

  def render(gt, vt, c)
    cos(2 * PI * freq * vt)
  end
end
Abstract Syntax
  • Defines constants
  • Leverages host language
  • Untyped
  • Easy to compile
Representing Abstract Syntax
  • Techniques vary by host language
  • Tagged unions
  • Class hierarchy
  • Algebraic data type
Representing Abstract Syntax
AST = FloatLit Float
    | FAdd AST AST
    | ...
    | Cos AST
    | ...
    | IntLit Int
    | ...
DSL → Target Language
  • Code generator is a semantic function
  • Traverse AST from bottom up, making LLVM IR builder calls
  • Visitor pattern
  • Fold
Denotational Semantics
  • Semantic function maps DSL terms to their meaning
 E = Int z | E + E
〚Int z〛= z
〚x + y〛=〚x〛+〚y〛
Hutton (1998) Fold and Unfold for Program Semantics
Concrete Syntax

An embedded DSL's user interface

  • Integrate with host language
  • Catch type errors early
  • Facilitate construction of programs in abstract syntax
Runtime Support
  • Don't generate everything
  • Reuse support code
  • C calling convention is ubiquitous:
    • Call C from generated code
    • Call generated code from C
Thank You
  • http://llvm.org/
  • http://github.com/jvoorhis
  • http://jvoorhis.com/
  • @jvoorhis