Writing datastructures requiring pointers/references in Clojure?

Question

I've been working on toy a Database in Clojure and wanted to implement a B+ Tree. When I started thinking about it, I realised there may not be a way to have something like a pointer/reference to other nodes in Clojure. It doesn't matter for something like a BST or a lot of other Tree structures since all you need is to store a Node's child. But what do I do in something like a B+ tree where I need to be able to refer to a Node's sibling?

When looking for solutions, I came across a post in Google Groups about how you don't implement a Doubly linked list in Clojure because there are other ways of doing things in Clojure.

What do I do for a B+ Tree though?

i don't understand the problem completely, but out of curiosity: why linked list ? couldn't just solve the problem in some other way ? — Belun
– Belun, Commented Nov 22, 2010 at 8:28
I wasn't trying to implement a linked list, but mentioned it because the problem I was having with implementing a B+ tree, i.e, dealing with references to nodes, is the same that I'd encounter while dealing with Linked lists. — Tachy
– Tachy, Commented Nov 24, 2010 at 15:33

Rob Lachlan · Accepted Answer · 2010-11-21 20:56:23Z

It's not that it's difficult to have references to objects in clojure; but generally, these references are immutable. It's immutability which makes the doubly linked list impossible, because unlike a singly-linked list, you can't change any part of it without creating a mutation somewhere.

To see this, suppose I have a singly linked list,

a -> b -> c

and suppose I want to change the head of it. I can do so, with changing the entirety of the list. I create a new list by creating a new value for the head value, and reuse the tail:

a'-> b -> c

But doubly linked lists are impossible. So in clojure, and other functional languages, we sometimes use a zipper in such situations.

Now, suppose you really need mutable references in Clojure -- how do it? Well, depending on what concurrency semantics you need, clojure has vars, refs, atoms, etc.

Also, with deftype, you can create objects that have mutable fields, and these mutable fields can hold references to other things. You can also use raw java arrays in clojure for this same purpose.

Is your database going to be an in-memory database, or a disk-backed database? If on disk, I think that the issue of pointer swizzling is trickier than that of having mutable references.

Getting back to the issue of functional data structures, I believe that it is possible to create B-trees which have purely functional semantics. The first clue here is that it's a tree, and trees are the bread butter and meat of functional data structures. Secondly, note that there are databases which work in an append-only fashion -- couchDB for instance. This has the benefit that the database is its own log, in a sense. To get more of an idea of the costs and benefits of this approach you might want to watch Slava Akhmechet's presentation. His company, RethinkDB, eventually took a sort of hybrid approach, IIRC.

Thanks for your answer. I will look into things you pointed out. My toy DB will probably be in memory for now. It seems like I'm going to learn a bunch about Clojure in the process.

ivar · Accepted Answer · 2010-11-22 09:17:33Z

1

You may wish to look at Chouser's finger trees in Clojure to see how the functionality of a doubly-linked list may be implemented using functional style.

Alternatively, you may simply want to step back and ask yourself why you believe that B+ is a good choice of data structure for a functional language.

If you are unfamiliar with the alternatives, you may want to look at Chris Okazaki's book "Purely Functional Data Structures."

answered Nov 22, 2010 at 9:17

ivar

1,49413 silver badges20 bronze badges

1 Comment

Tachy Over a year ago

Thanks for the book suggestion. I will try to find an alternate to the traditional B+ Tree. The reason I wanted to use a B+ tree is because I wanted to eventually store the data on disk and B+ trees are great for such scenarios, especially range queries with few disk accesses.

Collectives™ on Stack Overflow

Writing datastructures requiring pointers/references in Clojure?

2 Answers 2

1 Comment

1 Comment

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

1 Comment

1 Comment

Your Answer

Sign up or log in

Post as a guest

Related