Thursday, December 24, 2015

Working on the Swift compiler with JetBrains AppCode

Just a few weeks ago, Apple open-sourced the Swift compiler and standard library. This is exciting news for the Swift community: not only will Swift development now be done in the open, but the availability of Swift's source makes it possible to port Swift to other platforms.

The Swift codebase is very large (approaching 400k lines of C++ code, including test cases), so an IDE to help you navigate through it would be ideal. I'm partial to the JetBrains tools, but wasn't sure if I could use them to browse the Swift source tree. I had some difficulty getting CLion to use the CMake scripts in the Swift repo and was about to give up. Fortunately, I noticed at the bottom of Swift's README that there's a script to generate an Xcode project from the Swift repo.

So here's how I generated the Xcode project and was able to work on Swift using AppCode:

1. Create a new parent directory, cd into it and clone the Swift repo:

$ mkdir Swift
$ cd Swift
$ git clone https://github.com/apple/swift.git

2.  Prep the Swift repo, per the instructions in the README. This will import other tools that Swift depends on (like cmark and some LLVM tools)

$ cd swift
$ ./utils/update-checkout --clone

3.  Generate the Xcode project:

$ utils/build-script -X --skip-build -- --reconfigure

4. Now open AppCode and click File | Open...

5. Find the outer Swift directory you created in step 1 (the one that holds the Swift repo and all its associated tools).

6. Open the build folder, then open Xcode-DebugAssert. You'll see a folder that looks like swift-macosx-x86_64 (the exact name may differ based on your system).

Just open this folder in AppCode, wait for indexing to complete, and you'll be able to browse and edit the Swift source from AppCode!

Sunday, July 26, 2015

GCD and Parallel Collections in Swift

One of the benefits of functional programming is that it's straightforward to parallelize operations. Common FP idioms like map, filter and reduce can be adapted so they run on many cores at once, letting you get instant parallelization wherever you find a bottleneck.

The benefits of these parallel combinators are huge. Wherever you find a bottleneck in your program, you can simply replace your call to map with a call to a parallel map and your code will be able to take advantage of all the cores on your system. On my eight-core system, for example, simply using a parallel map can theoretically yield an eight-fold speed boost. Of course, there are a few reasons you might not see that theoretical speed improvement: namely, the overhead of creating threads, splitting up the work, synchronizing data between the threads, etc. Nevertheless, if you profile your code and focus on hotspots, you can see tremendous improvements with simple changes.

Swift doesn't yet come with parallel collections functions, but we can build them ourselves, using Grand Central Dispatch:
// requires Swift 2.0 or higher
import Foundation

extension Array {
    public func pmap<T>(transform: (Element -> T)) -> [T] {
        guard !self.isEmpty else {
            return []
        }
        
        var result: [(Int, [T])] = []
        
        let group = dispatch_group_create()
        let lock = dispatch_queue_create("pmap queue for result", DISPATCH_QUEUE_SERIAL)
        
        let step: Int = max(1, self.count / NSProcessInfo.processInfo().activeProcessorCount) // step can never be 0
        
        for var stepIndex = 0; stepIndex * step < self.count; stepIndex++ {
            let capturedStepIndex = stepIndex

            var stepResult: [T] = []
            dispatch_group_async(group, dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0)) {
                for i in (capturedStepIndex * step)..<((capturedStepIndex + 1) * step) {
                    if i < self.count {
                        let mappedElement = transform(self[i])
                        stepResult += [mappedElement]
                    }
                }

                dispatch_group_async(group, lock) {
                    result += [(capturedStepIndex, stepResult)]
                }
            }
        }
        
        dispatch_group_wait(group, DISPATCH_TIME_FOREVER)
        
        return result.sort { $0.0 < $1.0 }.flatMap { $0.1 }
    }
}

pmap takes the same arguments as map but runs the function across all of your system's CPUs. Let's break the function down, step by step.
  1. In the case of an empty array, pmap returns early, since the overhead of splitting up the work and synchronizing the results is non-trivial. We might take this even further by falling back to standard map for arrays with a very small element count.
  2. Create a Grand Central Dispatch group that we can associate with the GCD blocks we'll run later on. Since all of these blocks will be in the same group, the invoking thread can wait for the group to be empty at the end of the function and know for certain that all of the background work has finished before returning to the caller.
  3. Create a dedicated, sequential lock queue to control access to the result array. This is a common pattern in GCD: simulating a mutex with a sequential queue. Since a sequential queue will never run two blocks simultaneously, we can be sure that whatever operations we perform in this queue will be isolated from one another.
  4. Next, pmap breaks the array up into "steps", based on the host machine's CPU count (since this is read at runtime from NSProcessInfo, this function will automatically scale up to use all available cores). Each step is dispatched to one of GCD's global background queues. In the invoking thread, this for loop will run very, very quickly, since all it does is add closures to background queues.
  5. The main for loop iterates through each "step," capturing the stepIndex in a local variable, capturedStepIndex. If we don't do this, the closures passed to dispatch_group_async will all refer to the same storage location - as the for loop increments, all of the workers will see stepIndex increase by one and will all operate on the same step. By capturing the variable, each worker has its own copy of stepIndex, which never changes as the for loop proceeds.
  6. We calculate the start and end indices for this step. For each array element in that range, we call transform on the element and add it to this worker's local stepResult array. Because it's unlikely that the number of elements in the array will be exactly divisible by a given machine's processor count, we check that i never goes beyond the end of the array, which could otherwise happen in the very last step.
  7. After an entire step has been processed, we add this worker's results to the master result array. Since the order in which workers will finish is nondeterministic, each element of the result array is a tuple containing the stepIndex and the transformed elements in that step's range. We use the lock queue to ensure that all changes to the result array are synchronized. 
      • Note that we only have to enter this critical section once for each core - an alternative implementation of pmap might create a single master result array of the same size as the input and set each element to its mapped result as it goes. But this would have to enter the critical section once for every array element, instead of just once for each CPU, generating more memory and processor contention and benefiting less from spatial locality. 
      • We submit this block with dispatch_group_async on the same group, rather than with a plain dispatch_async, because we want the worker's changes to be applied to the result array before the group is considered done. With a plain dispatch_async, the scheduler could very easily finish all of the step blocks but leave one or more of these critical section blocks unprocessed, leaving us with an incomplete result. (A dispatch_sync onto the lock queue would give the same guarantee.)
  8. Back on the original thread, we call dispatch_group_wait, which waits until all blocks in the group have completed. At this point, we know that all work has been done and all changes to the master results array have been made.
  9. The final line sorts the master array by stepIndex (since steps finish in a nondeterministic order) and then flattens the master array in that order.
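For readers on a newer toolchain, the nine steps above can be sketched in current Swift syntax. This is a rough translation under stated assumptions, not the code from the post: it assumes the Swift 5 language mode, the names pmapSketch and ChunkStore are hypothetical, and it uses lock.sync for the critical section. It relies only on DispatchGroup, DispatchQueue and ProcessInfo from Foundation/Dispatch.

```swift
import Foundation
import Dispatch

// Hypothetical helper: a reference box so worker closures can append
// their results without capturing a mutable local variable.
final class ChunkStore<T> {
    var chunks: [(index: Int, values: [T])] = []
}

extension Array {
    // Chunked parallel map following the same steps as pmap (hypothetical name).
    func pmapSketch<T>(_ transform: @escaping (Element) -> T) -> [T] {
        guard !isEmpty else { return [] }

        let store = ChunkStore<T>()
        let group = DispatchGroup()
        let lock = DispatchQueue(label: "pmapSketch.result")  // serial by default
        let step = Swift.max(1, count / ProcessInfo.processInfo.activeProcessorCount)

        for (chunkIndex, start) in stride(from: 0, to: count, by: step).enumerated() {
            let end = Swift.min(start + step, count)  // clamp the last chunk
            DispatchQueue.global().async(group: group) {
                let mapped = self[start..<end].map(transform)
                // One critical section per chunk, not per element.
                lock.sync { store.chunks.append((index: chunkIndex, values: mapped)) }
            }
        }

        group.wait()  // block until every chunk has been stored
        return store.chunks.sorted { $0.index < $1.index }.flatMap { $0.values }
    }
}

let doubled = Array(0..<20).pmapSketch { $0 * 2 }
print(doubled == Array(0..<20).map { $0 * 2 })  // prints "true"
```

Using stride for the chunk starts removes the need for the C-style for loop and the separate captured index, since each loop iteration already binds fresh constants.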
To see how this works, let's create a simple profile function:

func profile<A>(desc: String, block: () -> A) -> Void {
    let start = NSDate().timeIntervalSince1970
    block()
    
    let duration = NSDate().timeIntervalSince1970 - start
    print("Profiler: completed \(desc) in \(duration * 1000)ms")

}
We'll test this out using a simple function called slowCalc, which adds a small sleep delay before each calculation, to ensure that each map operation does enough work. In production code, you should never sleep in code submitted to a GCD queue - this is purely to simulate a slow calculation for demonstration purposes. Without this little delay, the overhead of parallelization would be too great to see a speedup:

func slowCalc(x: Int) -> Int {
    NSThread.sleepForTimeInterval(0.1)
    return x * 2
}

let smallTestData: [Int] = [Int](0..<10)
let largeTestData = [Int](0..<300)

profile("large dataset (sequential)") { largeTestData.map { slowCalc($0) } }
profile("large dataset (parallel)") { largeTestData.pmap { slowCalc($0) } }

On my eight-core machine, this results in:

Profiler: completed large dataset (sequential) in 31239.7990226746ms
Profiler: completed large dataset (parallel) in 4005.04493713379ms

a 7.8-fold speedup, which is about what you'd expect.

It's important to remember that if each iteration doesn't do enough work, the overhead of splitting up work, setting up worker blocks and synchronizing data access will far outweigh the time savings of parallelization. The amount of overhead involved can be surprising. This code is identical to the above, except that it doesn't add the extra delay.

profile("large dataset (sequential, no delay)") { largeTestData.map { $0 * 2 } }
profile("large dataset (parallel, no delay)") { largeTestData.pmap { $0 * 2 } }

On my machine, it results in:

Profiler: completed large dataset (sequential, no delay) in 53.4629821777344ms
Profiler: completed large dataset (parallel, no delay) in 161.548852920532ms

The parallel version is three times slower than the sequential version! This is a really important consideration when using parallel collection functions:
  1. Make sure that each of your iterations does enough work to make parallelization worth it.
  2. Parallel collections are not a panacea - you can't just sprinkle them throughout your code and assume you'll get a performance boost. You still need to profile for hotspots, and it's important to focus on bottlenecks found through profiling, rather than hunches about what parts of your code are slowest.
  3. Modern CPUs are blindingly fast - basic operations like addition or multiplication are so fast that it's not worth parallelizing these, unless your array is very large.
You can use the same techniques to implement a parallel filter function:

// requires Swift 2.0 or higher
extension Array {
    public func pfilter(includeElement: Element -> Bool) -> [Element] {
        guard !self.isEmpty else {
            return []
        }
        
        var result: [(Int, [Element])] = []
        
        let group = dispatch_group_create()
        let lock = dispatch_queue_create("pfilter queue for result", DISPATCH_QUEUE_SERIAL)
        
        let step: Int = max(1, self.count / NSProcessInfo.processInfo().activeProcessorCount) // step can never be 0
        
        for var stepIndex = 0; stepIndex * step < self.count; stepIndex++ {
            let capturedStepIndex = stepIndex
            
            var stepResult: [Element] = []
            dispatch_group_async(group, dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0)) {
                for i in (capturedStepIndex * step)..<((capturedStepIndex + 1) * step) {
                    if i < self.count && includeElement(self[i]) {
                        stepResult += [self[i]]
                    }
                }
                
                dispatch_group_async(group, lock) {
                    result += [(capturedStepIndex, stepResult)]
                }
            }
        }
        
        dispatch_group_wait(group, DISPATCH_TIME_FOREVER)
        
        return result.sort { $0.0 < $1.0 }.flatMap { $0.1 }
    }
}

This code is almost identical to pmap - only the logic in the inner for loop is different.

We can now start using these combinators together (again, we have to use a slowed-down predicate function in order to see the benefit from parallelization):

func slowTest(x: Int) -> Bool {
    NSThread.sleepForTimeInterval(0.1)
    return x % 2 == 0
}

profile("large dataset (sequential)") { largeTestData.filter { slowTest($0) }.map { slowCalc($0) } }
profile("large dataset (sequential filter, parallel map)") { largeTestData.filter { slowTest($0) }.pmap { slowCalc($0) } }
profile("large dataset (parallel filter, sequential map)") { largeTestData.pfilter { slowTest($0) }.map { slowCalc($0) } }
profile("large dataset (parallel filter, parallel map)") { largeTestData.pfilter { slowTest($0) }.pmap { slowCalc($0) } }

which results in:

Profiler: completed large dataset (sequential) in 1572.28803634644ms
Profiler: completed large dataset (sequential filter, parallel map) in 1153.90300750732ms
Profiler: completed large dataset (parallel filter, sequential map) in 642.061948776245ms
Profiler: completed large dataset (parallel filter, parallel map) in 231.456995010376ms

Using one parallel combinator gives a slight improvement; combining the two parallel operations gives us an almost sevenfold performance improvement over the basic sequential implementation.

Here are some other directions to pursue:
  1. Implement parallel versions of find, any/exists and all. These are tricky because their contracts stipulate that processing stops as soon as they have a result. So you'll have to find some way to stop your parallel workers as soon as the function has its answer.
  2. Implement a parallel version of reduce. The benefit of doing this is that reduce is a "primitive" higher-order function - you can easily implement pmap and pfilter given an existing parallel reduce function.
  3. Generalize these functions to work on all collections (not just arrays), using Swift 2's protocol extensions.
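As a starting point for the second direction, here is a minimal sketch of a chunked parallel reduce in current Swift syntax (Swift 5 language mode assumed). The names preduceSketch and PartialStore are hypothetical. It is only correct under two assumptions: combine must be associative, and identity must be its identity element, because each chunk is folded independently and the partial results are then folded in chunk order. Unlike the standard reduce, this version restricts the result type to Element, so it is not yet general enough to build pmap or pfilter on top of it.

```swift
import Foundation
import Dispatch

// Hypothetical helper: holds one partial result per chunk.
final class PartialStore<T> {
    var partials: [(index: Int, value: T)] = []
}

extension Array {
    // Assumes `combine` is associative and `identity` is its identity element.
    func preduceSketch(_ identity: Element,
                       _ combine: @escaping (Element, Element) -> Element) -> Element {
        guard !isEmpty else { return identity }

        let store = PartialStore<Element>()
        let group = DispatchGroup()
        let lock = DispatchQueue(label: "preduceSketch.partials")  // serial by default
        let step = Swift.max(1, count / ProcessInfo.processInfo.activeProcessorCount)

        for (chunkIndex, start) in stride(from: 0, to: count, by: step).enumerated() {
            let end = Swift.min(start + step, count)
            DispatchQueue.global().async(group: group) {
                // Fold this chunk on a background queue.
                let partial = self[start..<end].reduce(identity, combine)
                lock.sync { store.partials.append((index: chunkIndex, value: partial)) }
            }
        }

        group.wait()
        // Fold the partial results sequentially, in chunk order.
        return store.partials
            .sorted { $0.index < $1.index }
            .reduce(identity) { combine($0, $1.value) }
    }
}

let total = Array(1...1000).preduceSketch(0, +)
print(total)  // prints 500500
```

Keeping the partial results in chunk order means non-commutative but associative operations, such as string concatenation, still give the sequential answer.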

Monday, December 15, 2014

Simple Combinators for Manipulating CGPoint/CGSize/CGRect with Swift

One of the most painful things about Objective-C was having to modify CGPoint, CGSize or CGRect values. The clunky struct interface made even simple modifications verbose and ugly, since struct expressions were read-only:

    CGRect imageBounds = self.view.bounds;
    imageBounds.size.height -= self.footer.bounds.size.height;

    self.imageView.bounds = imageBounds;

Even though we have auto-layout, I often find myself doing this kind of arithmetic with points, sizes or rects. In Objective-C, it required either generating dummy variables so you can modify members (as above), or really messy struct initialization syntax:

    self.imageView.bounds = (CGRect) { 
        .origin = self.view.bounds.origin,
        .size = CGSizeMake(self.view.bounds.size.width, self.view.bounds.size.height -    
                           self.footer.bounds.size.height) };

Fortunately, none of this boilerplate is necessary with Swift. Since Swift lets you extend even C structures with new methods, I wrote a handful of combinators that eliminate this kind of code. The above snippet can now be replaced with:

    self.imageView.bounds = self.view.bounds.mapHeight { $0 - self.footer.bounds.size.height }

I can easily enlarge a scroll view's content size to hold its pages:

    self.scrollView.contentSize = self.scrollView.bounds.size.mapWidth { $0 * CGFloat(pages.count) }

I can do calculations that previously would've required dozens of lines of code in just one or two:

    let topHalfFrame = self.view.bounds.mapHeight { $0 / 2 }
    let bottomHalfFrame = topHalfFrame.mapY { $0 + topHalfFrame.size.height }

These two lines will give me two frames that each take up half of the height of their parent view.
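To make the arithmetic concrete, here is a small worked example in current Swift syntax. It is a sketch under assumptions: the 320x480 parent bounds are a made-up value, and only the four CGRect methods the example needs are reproduced, with unlabeled parameters as current Swift would call them.

```swift
import Foundation

// A minimal subset of the combinators, rewritten for current Swift.
extension CGRect {
    func withY(_ y: CGFloat) -> CGRect {
        return CGRect(origin: CGPoint(x: origin.x, y: y), size: size)
    }

    func withHeight(_ height: CGFloat) -> CGRect {
        return CGRect(origin: origin, size: CGSize(width: size.width, height: height))
    }

    func mapY(_ f: (CGFloat) -> CGFloat) -> CGRect {
        return withY(f(origin.y))
    }

    func mapHeight(_ f: (CGFloat) -> CGFloat) -> CGRect {
        return withHeight(f(size.height))
    }
}

// Hypothetical parent view bounds, chosen only for illustration.
let parentBounds = CGRect(x: 0, y: 0, width: 320, height: 480)

let topHalfFrame = parentBounds.mapHeight { $0 / 2 }
let bottomHalfFrame = topHalfFrame.mapY { $0 + topHalfFrame.size.height }

// topHalfFrame is    (x: 0, y: 0,   width: 320, height: 240)
// bottomHalfFrame is (x: 0, y: 240, width: 320, height: 240)
print(topHalfFrame, bottomHalfFrame)
```

Each call returns a new value and leaves its receiver untouched, which is why bottomHalfFrame can be derived from topHalfFrame without disturbing it.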

In cases where I simply need to set a value, I use the primitive "with..." functions:

    self.view.bounds.withX(0).withY(0).withWidth(0).withHeight(0)

Note that these methods can all be chained to create complex expressions.

The code for these methods is trivial, yet they give you a huge boost in expressive power.

GitHub project: https://github.com/moreindirection/SwiftGeometry

Code

extension CGPoint {
    func mapX(f: (CGFloat -> CGFloat)) -> CGPoint {
        return self.withX(f(self.x))
    }
    
    func mapY(f: (CGFloat -> CGFloat)) -> CGPoint {
        return self.withY(f(self.y))
    }
    
    func withX(x: CGFloat) -> CGPoint {
        return CGPoint(x: x, y: self.y)
    }
    
    func withY(y: CGFloat) -> CGPoint {
        return CGPoint(x: self.x, y: y)
    }
}

extension CGSize {
    func mapWidth(f: (CGFloat -> CGFloat)) -> CGSize {
        return self.withWidth(f(self.width))
    }
    
    func mapHeight(f: (CGFloat -> CGFloat)) -> CGSize {
        return self.withHeight(f(self.height))
    }
    
    func withWidth(width: CGFloat) -> CGSize {
        return CGSize(width: width, height: self.height)
    }
    
    func withHeight(height: CGFloat) -> CGSize {
        return CGSize(width: self.width, height: height)
    }
}

extension CGRect {
    func mapX(f: (CGFloat -> CGFloat)) -> CGRect {
        return self.withX(f(self.origin.x))
    }
    
    func mapY(f: (CGFloat -> CGFloat)) -> CGRect {
        return self.withY(f(self.origin.y))
    }
    
    func mapWidth(f: (CGFloat -> CGFloat)) -> CGRect {
        return self.withWidth(f(self.size.width))
    }
    
    func mapHeight(f: (CGFloat -> CGFloat)) -> CGRect {
        return self.withHeight(f(self.size.height))
    }
    
    func withX(x: CGFloat) -> CGRect {
        return CGRect(origin: self.origin.withX(x), size: self.size)
    }
    
    func withY(y: CGFloat) -> CGRect {
        return CGRect(origin: self.origin.withY(y), size: self.size)
    }
    
    func withWidth(width: CGFloat) -> CGRect {
        return CGRect(origin: self.origin, size: self.size.withWidth(width))
    }
    
    func withHeight(height: CGFloat) -> CGRect {
        return CGRect(origin: self.origin, size: self.size.withHeight(height))
    }
}

Tuesday, August 26, 2014

NSNotificationCenter, Swift and blocks

The conventional way to register observers with NSNotificationCenter is to use the target-action pattern. While this gets the job done, it's inherently not type-safe.

For example, the following Swift snippet will compile perfectly:

    NSNotificationCenter.defaultCenter().addObserver(self, selector: Selector("itemAdded:"),
      name: MyNotificationItemAdded, object: nil)

even though at runtime it will fail unless self has a method named itemAdded that takes exactly one parameter (leaving off that last colon in the selector will turn this line into a no-op). Plus, this approach gives you no way to take advantage of Swift's closures, which would allow the observer to access local variables in the method that adds the observer and would eliminate the need to create a dedicated method to handle the event.

A better way to do this is to use blocks. And NSNotificationCenter does include a block-based API:

    NSNotificationCenter.defaultCenter().addObserverForName(MyNotificationItemAdded, object: nil, queue: nil) { note in
      // ...
    }

This is much nicer, especially with Swift's trailing closure syntax. There are no method names to be looked up at runtime, we can refer to local variables in the method that registered the observer and we can perform small bits of logic in reaction to events without having to create and name dedicated methods.

The catch comes in resource management. It's very important that an object remove its event observers when it's deallocated, or else NSNotificationCenter will try to invoke methods on invalid pointers.

The traditional target-action method has the one advantage that we can easily handle this requirement with a single call in deinit:

  deinit {
    NSNotificationCenter.defaultCenter().removeObserver(self)
  }

With the block API, however, since there is no explicit target object, each call to addObserverForName returns "an opaque object to act as observer." So your observer class would need to track all of these objects and then remove them all from the notification center in deinit, which is a pain.

In fact, the hassle of having to do bookkeeping on the observer objects almost cancels out the convenience of using the block API. Frustrated by this situation, I sat down and created a simple helper class, NotificationManager:

class NotificationManager {
  private var observerTokens: [AnyObject] = []

  deinit {
    deregisterAll()
  }

  func deregisterAll() {
    for token in observerTokens {
      NSNotificationCenter.defaultCenter().removeObserver(token)
    }

    observerTokens = []
  }

  func registerObserver(name: String, block: (NSNotification -> Void)) {
    let newToken = NSNotificationCenter.defaultCenter().addObserverForName(name, object: nil, queue: nil, usingBlock: block)

    observerTokens.append(newToken)
  }
  
  func registerObserver(name: String, forObject object: AnyObject, block: (NSNotification -> Void)) {
    let newToken = NSNotificationCenter.defaultCenter().addObserverForName(name, object: object, queue: nil, usingBlock: block)
    
    observerTokens.append(newToken)
  }
}

First, this simple class provides a Swift-specialized API around NSNotificationCenter.  It provides an additional convenience method without an object parameter (rarely used, in my experience) to make it easier to use trailing-closure syntax. But most importantly, it keeps track of the observer objects generated when observers are registered, and removes them when the object is deinit'd.

A client of this class can simply keep a member variable of type NotificationManager and use it to register its observers. When the parent class is deallocated, the deinit method will automatically be called on its NotificationManager member variable, and its observers will be properly disposed of:

class MyController: UIViewController {
  private let notificationManager = NotificationManager()
  
  override init() {
    notificationManager.registerObserver(MyNotificationItemAdded) { note in
      println("item added!")
    }
    
    super.init()
  }
  
  required init(coder: NSCoder) {
    fatalError("decoding not implemented")
  }
}

When the MyController instance is deallocated, its NotificationManager member variable will be automatically deallocated, triggering the call to deregisterAll that will remove the dead objects from NSNotificationCenter.
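The lifecycle described above can be checked with a small self-contained sketch in current Swift syntax (Swift 5 language mode assumed). The class name NotificationManagerSketch and the notification name are hypothetical. It uses Foundation's NotificationCenter.default.addObserver(forName:object:queue:using:), the modern spelling of the block-based API, and an optional variable stands in for the owning controller.

```swift
import Foundation

// Hypothetical modern-syntax counterpart of the NotificationManager class.
final class NotificationManagerSketch {
    private var observerTokens: [NSObjectProtocol] = []

    deinit {
        deregisterAll()
    }

    func deregisterAll() {
        for token in observerTokens {
            NotificationCenter.default.removeObserver(token)
        }
        observerTokens = []
    }

    func registerObserver(_ name: Notification.Name, object: Any? = nil,
                          block: @escaping (Notification) -> Void) {
        let token = NotificationCenter.default.addObserver(
            forName: name, object: object, queue: nil, using: block)
        observerTokens.append(token)
    }
}

// Hypothetical notification name, for illustration only.
let itemAdded = Notification.Name("MyNotificationItemAdded")
var received = 0

// The optional plays the role of the owner that holds the manager.
var manager: NotificationManagerSketch? = NotificationManagerSketch()
manager?.registerObserver(itemAdded) { _ in received += 1 }

NotificationCenter.default.post(name: itemAdded, object: nil)  // delivered
manager = nil                                                   // deinit removes the observer
NotificationCenter.default.post(name: itemAdded, object: nil)  // no longer delivered

print(received)  // prints 1
```

Because the observer is registered with a nil queue, the block runs synchronously on the posting thread, which is what makes the count deterministic here.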

In my apps, I add a notificationManager instance to my common UIViewController base class so I don't have to explicitly declare the member variable in all of my controller subclasses.

Another benefit of using my own wrapper around NSNotificationCenter is that I can add useful functionality, like group observers: an observer that's triggered when any one of a group of notifications is posted:

struct NotificationGroup {
  let entries: [String]
  
  init(_ newEntries: String...) {
    entries = newEntries
  }

}

extension NotificationManager {
  func registerGroupObserver(group: NotificationGroup, block: (NSNotification -> Void)) {
    for name in group.entries {
      registerObserver(name, block: block)
    }
  }
}

This can be a great way to easily set up an event handler to run when, for example, an item is changed in any way at all:

    let MyNotificationItemsChanged = NotificationGroup(
      MyNotificationItemAdded,
      MyNotificationItemDeleted,
      MyNotificationItemMoved,
      MyNotificationItemEdited
    )

    notificationManager.registerGroupObserver(MyNotificationItemsChanged) { note in
      // ...
    }

Thursday, June 26, 2014

Unit Testing in Swift

Since Swift was released at the beginning of the month, I've been using it for most of my iOS development. It's been a pleasant experience: I've been able to discard huge amounts of boilerplate and take advantage of a few functional programming techniques that were previously unavailable on the iPhone and iPad.

One area where Swift has made huge improvements over Objective-C is unit tests. Objective-C's verbosity made it difficult to create small, focused classes to perform specific tasks. Plus, the language's insistence on keeping only one class to a file and the cumbersome pairing of every implementation file with a header imposed a hefty penalty on programmers who tried to divide their work up into discrete, testable components.

Unit testing in Swift is done with the same XCTest framework introduced back in Xcode 5 for Objective-C. But Swift's concision and its inclusion of modern language features like closures make XCTest much more pleasant to use than it was under Objective-C. We'll walk through a very simple example of Swift unit testing below.

To get started, create an empty iOS Application project in Xcode called Counter. Xcode will generate a CounterTests folder for you and an associated test target.

First, let's create a simple class to be tested. Create the file "Counter.swift" and add the following code to it:

import Foundation

public class Counter {
  public var count: Int
  
  public init(count: Int) {
    self.count = count
  }
  
  public convenience init() {
    self.init(count: 0)
  }
  
  public func increment() {
    self.count = self.count &+ 1  // &+ wraps on overflow; ++ would trap at Int.max
  }

}

This is a very simple class, but it will be enough to illustrate how to use XCTest to test your own Swift code.

UPDATE: Note that as of Xcode 6.1, any symbols that you want to be visible in your test case should be declared public so they can be seen from the test target, which is distinct from your main application target. In the above example, the class above and any of its members that need to be accessed in the test case have been declared public. Thanks to Kaan Ersan for pointing this out in the comments.

Create a file called "CounterTest.swift" in the CounterTests folder Xcode generated for you (this simple test will be your "Hello, world" for Swift testing):

import XCTest
import Counter

class CounterTest: XCTestCase {
  func testSimpleAddition() {
    let counter = Counter()
    XCTAssertEqual(0, counter.count)
  }

}

NOTE: In the current version of Swift (Beta 2), you have to import your main target into the test target to get your tests to compile and run. This is why we import Counter at the top.

NOTE: I've seen a few Swift tutorials recommend that you use the built-in Swift function assert in your test cases - do not do this! assert will terminate your entire program if it fails. Using the XCTAssert functions provides a number of important benefits:

  • If one test case fails, your other cases can continue running; assert stops the entire program.
  • Because the XCTAssert functions are more explicit about what you're expecting, they can print helpful failure messages (e.g. "2 was not equal to 3") whereas assert can only report that its condition was false. There's a broad variety of assert functions, including XCTAssertLessThan, XCTAssertNil, etc.
  • The Swift language specification explicitly forbids string interpolation in the message passed to assert; the XCTAssert functions don't face this limitation.
To try your test code out, click "Test" on the "Product" menu. Your single test should pass.

We'll now expand the test class with cases that check the counter's basic invariants, create and exercise several instances of Counter, and ensure that the counter wraps around when it overflows:

import XCTest
import Counter

class CounterTest: XCTestCase {
  func testInvariants() {
    let counter = Counter()
    XCTAssertEqual(0, counter.count, "Counter not initialized to 0")
    
    counter.increment()
    XCTAssertEqual(1, counter.count, "Increment is broken")

    XCTAssertEqual(1, counter.count, "Count has unwanted side effects!")
  }
  
  func testMultipleIncrements() {
    let counts = [1, 2, 3, 4, 5, 6]
    
    for count in counts {
      let counter = Counter()
      
      for _ in 0..<count {
        counter.increment()
      }
      
      XCTAssertEqual(counter.count, count, "Incremented value does not match expected")
    }
  }
  
  func testWraparound() {
    let counter = Counter(count: Int.max)
    counter.increment()
    
    XCTAssertEqual(counter.count, Int.min)
  }
}

These tests should pass as well.
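The wraparound test depends on how the increment handles overflow. Swift's ordinary arithmetic operators trap at runtime when a result does not fit, so a counter that is meant to wrap has to use the overflow operator &+ instead. The sketch below, in current Swift syntax, shows the two standard-library tools involved; the variable names are only illustrative.

```swift
// &+ is the wrapping addition operator: the result wraps around
// instead of trapping.
let wrapped = Int.max &+ 1
print(wrapped == Int.min)  // prints "true"

// addingReportingOverflow returns the wrapped value together with
// a flag saying whether overflow happened.
let (partialValue, didOverflow) = Int.max.addingReportingOverflow(1)
print(partialValue == Int.min, didOverflow)  // prints "true true"
```

A test that expects Int.min after incrementing from Int.max is therefore really a test that the increment was written with wrapping arithmetic.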

You can find out more about XCTest in the Apple guide "Testing with Xcode." I hope this was helpful - please feel free to comment if anything is unclear.

Monday, February 11, 2013

anorm-typed: Statically-Typed SQL Queries for Scala Play Applications

The Play framework's default persistence framework, Anorm, is a very thin wrapper around JDBC (the whole library is about 800 lines of code). Although I like the idea of a framework that treats a database as a database - instead of trying to shoehorn databases into the OO paradigm - Anorm has never really appealed to me. Since it's just a wrapper around SQL, you end up writing lots of raw SQL in your application. This is a problem, because the Scala compiler and typechecker have no opportunity to check your database interaction for errors. As flawed as ORM approaches can be, at least they can generate valid SQL for you. Consider this Anorm call from the Play! documentation:
  SQL(
    """
      select * from Country c 
      join CountryLanguage l on l.CountryCode = c.Code 
      where c.code = {countryCode};
   """
  ).on("countryCode" -> "FRA")
Here are just some of the ways this code can go wrong at runtime:
  • A typo in an SQL keyword
  • A typo in a column or table name
  • Reference to a column or table that doesn't exist
  • A typo in the "countryCode" key passed to the "on" function
  • Passing in a non-string value for "countryCode"
  • A mismatch between the parameters named in the query string and the keys passed to "on"
With Anorm's primary competitors (SLICK and Squeryl), you create mappings between columns and class fields, then use a query DSL to translate Scala Collections-like code into SQL. These frameworks are still vulnerable to some of the above problems, but they have some advantages:
  • You map each column only once, so if you get the column's name or type wrong, there's only one place to correct it, and then the rest of your program will be free of that particular bug.
  • These frameworks generate SQL themselves from a simple Scala DSL, so most syntax errors are ruled out.
Yet, these frameworks also introduce a number of issues:
  • You need to manually maintain model mappings that can drift out of sync with the database
  • The DSLs these libraries provide are necessarily limited. Some queries that would be straightforward and fast with pure SQL are simply inexpressible in these DSLs.
  • Both frameworks are database-agnostic. This has obvious advantages, but if you need to take advantage of a database-specific data type, function or syntactical convenience, you're out of luck.
About a month ago, Play developer Guillaume Bort announced a proof-of-concept implementation of a statically-checked version of Anorm (source on GitHub). The framework was inspired by Joni Freeman's sqltyped framework. The main API of anorm-typed is the TypedSQL macro. When you compile a Scala file that contains TypedSQL calls, these calls are expanded into type-safe code that accepts parameters for any placeholders in the SQL and returns a tuple based on the column types selected in the query. Here's a short example:

  // assume
  // CREATE TABLE users(
  //    id integer,
  //    best_friend_id integer,
  //    name varchar(256)
  // );

  val q = TypedSQL("select * from users")
  q().list() // returns List[(Int, Int, String)]

  val q2 = TypedSQL("select name from users where id = ?")
  q2(5).single() // returns String

  val q3 = TypedSQL("update users set name = ? where id = 5")
  q3("Tyrone Slothrop").execute()

The anorm-typed module will catch every one of the errors I listed above - before your application can even compile. Note that everything here is type-checked, and that the code simply will not compile if we make a mistake matching Scala types to SQL types, if the SQL has a syntax error, if we use a nonexistent column, or if we provide the wrong number of arguments to the query. Awesome. Of course, there are some drawbacks to this approach:
  • The TypedSQL macro needs to connect to your database during compilation. This can cause a number of issues:
    • CI servers or other automated builds will need to be able to access a database to finish compilation
    • IDEs have no idea what to do with the TypedSQL macro - IntelliJ highlights every call as an error, even though the code compiles fine.
Still, this is pretty close to my holy grail for database interaction. I'm planning to set aside some time to work on an alternative implementation that would suit my needs a little better: instead of a macro, I'm planning to build an SBT plugin for Play apps that would, as with the conf/routes compiler, compile a list of SQL queries into an autogenerated file.