Feb 2 2010

BNRPersistence

I just open sourced a simple persistence layer for Cocoa and iPhone that uses Tokyo Cabinet. If you are frustrated with Core Data, you might find it useful. Check it out on GitHub.

Here is the Readme:

After a few years of whining that Core Data could have been better, I thought I should write a persistence framework that points in what I think is the right direction. And this is it.

One big difference? Objects can have ordered relationships. For example, a playlist of songs is an ordered collection. This is awkward to do in Core Data because it uses a relational database.

Another big difference? It doesn’t use SQLite, but rather a key-value store called Tokyo Cabinet.

BNRPersistence is not really a framework at the moment, just a set of classes that you can include in your project.

Install

First, you need to download Tokyo Cabinet:
http://1978th.net/tokyocabinet/
(There is a sourceforge page, but the latest build seems to be on this site.)

In Terminal.app, untar the tarball and cd into the resulting directory.

(Want a 64-bit library? In Terminal, set an environment variable for 64-bit builds before you configure:

export CFLAGS='-arch x86_64'

)

Configure, build, and install:

./configure
make
sudo make install

Now you have a /usr/local/lib/libtokyocabinet.a that needs to be linked into any project that uses these classes. (I usually use the “Add->Existing Frameworks” menu item to do this.)

You’ll also need to have /usr/local/include/ in your header search paths. (See the Xcode target’s build info to add this.)

Now just add the classes in the BNRPersistence directory to your project. (If you copy them, you won’t get the new version when you update your git repository. This may be exactly what you want, but these classes are pretty immature, so I would suggest that you add them by reference instead of copying.)

Using it

You’ve used Core Data? The BNRStore is analogous to NSManagedObjectContext. BNRStoredObject is analogous to NSManagedObject. There is no model. Instead, as with archiving, you must implement two methods in your BNRStoredObject subclass. Here are the methods from a Playlist class:

- (void)readContentFromBuffer:(BNRDataBuffer *)d
{
    [title release];
    title = [[d readString] retain];
    
    [songs release];
    songs = [d readArrayOfClass:[Song class]
                     usingStore:[self store]];
    [songs retain];
}

- (void)writeContentToBuffer:(BNRDataBuffer *)d
{
    [d writeString:title];
    [d writeArray:songs ofClass:[Song class]];
}

So, BNRDataBuffers are like NSData, but they have methods for reading and writing different types of data. (They do byte-swapping, so you can move the data files from PPC to Intel without a problem.)
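To make the byte-swapping idea concrete, here is a minimal sketch in plain C (which also compiles as Objective-C). It always stores integers in one fixed byte order, so the same bytes read back correctly on PPC and Intel. The names `SketchBuffer`, `sketch_write_u32`, and `sketch_read_u32` are hypothetical, and big-endian storage is an assumption for illustration. This is not BNRDataBuffer's actual implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-capacity buffer; not the real BNRDataBuffer. */
typedef struct {
    uint8_t bytes[64];
    size_t  length;   /* number of bytes written so far */
    size_t  cursor;   /* current read position */
} SketchBuffer;

/* Append a 32-bit value in big-endian order (an assumed convention),
   regardless of the host CPU's native byte order. */
void sketch_write_u32(SketchBuffer *b, uint32_t v)
{
    assert(b->length + 4 <= sizeof b->bytes);
    b->bytes[b->length++] = (uint8_t)(v >> 24);
    b->bytes[b->length++] = (uint8_t)(v >> 16);
    b->bytes[b->length++] = (uint8_t)(v >> 8);
    b->bytes[b->length++] = (uint8_t)v;
}

/* Read the next 32-bit value, reassembling it from big-endian bytes. */
uint32_t sketch_read_u32(SketchBuffer *b)
{
    assert(b->cursor + 4 <= b->length);
    uint32_t v = ((uint32_t)b->bytes[b->cursor]     << 24) |
                 ((uint32_t)b->bytes[b->cursor + 1] << 16) |
                 ((uint32_t)b->bytes[b->cursor + 2] << 8)  |
                  (uint32_t)b->bytes[b->cursor + 3];
    b->cursor += 4;
    return v;
}
```

Because the value is split and rebuilt with shifts rather than copied from memory, the code never depends on the byte order of the machine it runs on.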

(I have certainly debated the idea of a model file that would replace these methods, but for now this is fast and simple. If you would rather have a model file than implement these methods, you are invited to write a BNRPersistenceModelEditor.)

All the instances of a class are stored in a single Tokyo Cabinet file, so you will be saving to a directory containing one file for each class that you are storing.

To create a store, first you must create a backend. I have tried BerkeleyDB and GDBM, but I am quite enamored with Tokyo Cabinet right now. (If you would like to try the other backends, write me and I’ll send you the necessary classes.)

NSError *error = nil;
NSString *path = @"/tmp/complextest/";
BNRTCBackend *backend = [[BNRTCBackend alloc] initWithPath:path
                                                     error:&error];
if (!backend) {
    NSLog(@"Unable to create database at %@", path);
    // ...display error here...
}

Now that you have a backend, you can create a store and tell it which classes you are going to be saving or loading:

BNRStore *store = [[BNRStore alloc] init];
[store setBackend:backend];
[backend release];
    
[store addClass:[Song class]];
[store addClass:[Playlist class]];

(You must add the classes in the same order every time. The classID of a class (see BNRClassMetaData) is determined by the order in which the classes are added.)
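Why the order matters can be sketched in plain C. The README only says the classID is determined by the order, so this sketch assumes the ID is simply the class's position in the registration list, starting at zero. The names `SketchClassRegistry`, `sketch_add_class`, and `sketch_class_id` are hypothetical and are not part of BNRPersistence.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_MAX_CLASSES 16

/* Hypothetical registry: a class's ID is its position in the list. */
typedef struct {
    const char *names[SKETCH_MAX_CLASSES];
    int count;
} SketchClassRegistry;

/* Register a class name and return the ID it was assigned. */
int sketch_add_class(SketchClassRegistry *r, const char *name)
{
    assert(r->count < SKETCH_MAX_CLASSES);
    r->names[r->count] = name;
    return r->count++;
}

/* Look up a class's ID by name; returns -1 if it was never registered. */
int sketch_class_id(const SketchClassRegistry *r, const char *name)
{
    for (int i = 0; i < r->count; i++) {
        if (strcmp(r->names[i], name) == 0) {
            return i;
        }
    }
    return -1;
}
```

Under this assumption, registering Song then Playlist gives Song the ID 0, while registering Playlist first gives Song the ID 1. Data saved with one order would then be looked up under the wrong ID when read with the other.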

Now to get a list of the playlists in the store:

NSArray *allPlaylists = [store allObjectsForClass:[Playlist class]];

To insert a new playlist:

Playlist *playlist = [[Playlist alloc] init];
[store insertObject:playlist];

To delete a playlist:

[store deleteObject:playlist];

To update a playlist:

[store willUpdateObject:playlist];
[playlist setTitle:@"CarMusic"];

(Yes, this is a place where a model file would make the framework cooler: I could become an observer of this object and get notified when the value changed.)

(As a side note, there is some experimental support for automatically updating the undo manager. Just give the store an undo manager. This also could be made better with a model file.)

To save all the changes:

BOOL success = [store saveChanges:&error];
if (!success) {
    NSLog(@"error = %@", [error localizedDescription]);
    return -1;
}

Each class in a store can have a version. This is kept in the BNRClassMetaData object for the class. You can reach this in your readContentFromBuffer: method (because you have access to the store).
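One way a version number can be used while reading is sketched below in plain C. The record layout, the `rating` field, and the names `SketchSong` and `sketch_read_song` are all hypothetical, invented for illustration. The sketch does not show how BNRClassMetaData exposes the version.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory form of a stored song. */
typedef struct {
    uint8_t trackNumber;
    uint8_t rating;      /* assumed to have been added in version 2 */
} SketchSong;

/* Decode a record whose layout depends on the version it was saved with.
   Assumed layouts: version 1 is [trackNumber],
   version 2 is [trackNumber][rating]. */
void sketch_read_song(const uint8_t *bytes, size_t length,
                      unsigned version, SketchSong *song)
{
    assert(version == 1 || version == 2);
    assert(length >= (version == 1 ? 1u : 2u));
    song->trackNumber = bytes[0];
    /* Older records lack the field, so fall back to a default. */
    song->rating = (version >= 2) ? bytes[1] : 0;
}
```

The point is only that the reader branches on the saved version, so files written by an older build of the class stay readable after a field is added.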

In your BNRStoredObject class, you can implement these two methods if you wish:

- (void)prepareForDelete;
- (void)dissolveAllRelationships;

prepareForDelete is where you implement your delete rule: When a song is deleted, for example, it needs to remove itself from any playlists it is in.

dissolveAllRelationships is a fix for a common problem. You close the document, but the objects in the document have retain cycles so they don’t get deallocated properly. In dissolveAllRelationships, your stored objects (the ones in memory) are being asked to release any other stored objects they are retaining.

In the directory, you will find TCSpeedTest and CDSpeedTest. These are command-line tools that compare the speed of some tasks in BNRPersistence (TCSpeedTest) and Core Data (CDSpeedTest).

Your mileage may vary, but I see:

Creating 1,000 playlists, 100,000 songs, and 100 songs in each playlist and saving (ComplexInsertTest):
BNRPersistence is 10 times faster than CoreData

Reading in the playlist and getting the title of the first song in each playlist (ComplexFetchTest):
BNRPersistence is 13 times faster than CoreData

Creating 1,000,000 songs and inserting them and saving (SimpleInsertTest):
BNRPersistence is 17 times faster than CoreData

Fetching 1,000,000 songs (SimpleFetchTest):
CoreData is a little faster than BNRPersistence
(BNRPersistence is single-threaded and CoreData has some clever multi-threading in this case. I think I can do similar tricks in BNRPersistence and catch up in this case. In either case, it is very, very fast. On my machine, fetching a million songs takes 3 seconds the first time and 2 seconds the second time.)

Getting it on the phone

The first problem is that you need to compile Tokyo Cabinet for ARM. I tried every configure trick I could come up with and then just created an Xcode static library project and dumped the source into the project. This project is in the repository.

When you link to the resulting static library, you will also need to link in libz, which is part of the iPhone SDK.

License

My code is under the MIT license and Tokyo Cabinet is under the LGPL. I think this will enable you to use it how you want to use it. If something bad happens because of the code, you can’t sue us. And if you distribute a modified Tokyo Cabinet, I think you need to make those changes available under the LGPL. But I’m not a lawyer…

To Do

I recognize that there is room for improvement here:

1) The creation of a model-file architecture and editor
2) Add Tokyo Dystopia to make full-text search fast
3) Use B+ trees to make attributes indexable
4) Better automatic undo support
5) Automatic syncing to a web service
6) Easy hooks for QuickLook images and Spotlight metadata in BNRStoreDocument
7) Hook it up to Tokyo Tyrant for non-local storage

9 Comments

  1. Jonathan Wight

    Interesting stuff.

    What about multi-threading? Esp. multiple writers (although not to the same records)? CoreData provides a good model IMO with multiple MOCs, entity identifiers and conflict resolution. Any existing support/plans with BNRPersistence?

  2. Clay Bridges

    Is there any advantage to this vs. the standard NSCoding (that is, I don’t need CoreData)?

  3. Harry Jordan

    @Jonathan Wight,
    Re. multithreading, this question came up at Aaron’s presentation at NSConference. It can’t handle access from multiple threads, but it doesn’t have to run on the main thread. I think I’ll probably end up writing a wrapper that uses either locks or dispatch_queues.

    Another thing that doesn’t seem to be highlighted here.. it’s hellish fast.

    In the test that Aaron showed us, it took .3 of a second vs 8 point something for Core Data; and .107 of a second compared with something like 4.2 seconds for Core Data. Admittedly the test was stacked in BNRPersistence’s favour (ordered lists of stuff), but it was damn impressive.

  4. pixelmixture

    what do you find frustrating with coredata?

  5. John Joyce

    How much of the speed is BNR implementation and how much is TC itself?
    I see they’ve got a Ruby API which cries out MacRuby as well… (if not Rails back end from hell)

  6. Fat Johnnie

    Could it be that this is a lot of effort to avoid learning how to use MySQL properly?!

  7. asdf

    Note that using your instructions, Tokyo Cabinet is built as a static library. Fulfilling the requirements of the LGPL when linking statically requires you to either redistribute your application in source format or as a set of linkable object files.

  8. Andris

    “One big difference? Objects can have ordered relationships. For example, a playlist of songs is an ordered collection. This is awkward to do in Core Data because it uses a relational database.” – and what about ORDER BY in SQL? Or maybe I don’t get the explanation of this big difference.

  9. Andris

    I looked at your NSConference session (http://vimeo.com/17081942) where you discuss this problem… I think I have to understand Core Data better… Because fetching the first song of 1,000 playlists in SQL is very easy and should take milliseconds.
