Anh's blog: A note on CAN implementation

Content Addressable Network is an unique structured P2P system. Its topology is a d-dimensional torus. This overlay offer constant states (number of neighbours per nodes) and short hop counts (in terms of routing path length). Some works cited CAN as using hyper-cube space as its topology, which is a mistake. It is clearly stated in the original CAN paper that the space wrapped around, meaning the shape of the space is a torus.

Having a number of interesting properties, CAN is remarkably unpopular compared to Chord and Pastry, with respect to implementations. The keys challenges that a CAN implementation is to address are as below (solutions to them involve thinking in high dimension space):

1. What is the nature of a peer's ID? How is its generated and changed during the evolution of the network?

2. How to determine if two peers are neighbors, in d-dimensional tori.

3. How to determine if one peer can "takeover" another? In other words, can two neighbors be merged?

4. How to compute the distance from a search key to a peer ?

Googling an existing implementation of CAN will most likely yield nothing. Thanks to Ryan Huebsch (author of the paper titled: Content-based multicast: comparison of implementation options), a Java implementation of the system comes out of its hidding corner. As part of the PIER project, Ryan's group has implemented the CAN overlay, whose source can be obtained in the project's website.

Reading through the code, the challenges mentioned above can be addressed as follows:

1. ID of a peer IS the coordinate of its zone. In particular, it will be the vector ((l_1,h_1), (l_2,h_2), ..,(l_d,h_d)) where l_i, h_i is the low and high value of the edge in the i^th dimension. As new peer joins, the zone is split and as the consequence the peer's ID changes. Notice that in the PhD thesis of the CAN's author, an alternative view of the ID as path from root to leaf of a partition tree is not easy to implement.

2. In a d-dimensional torus, two peers (or zones) are neighbors if their edges overlap in exactly (d-1) dimensions and abut in exactly 1 dimension. The overlapping and abut properties can be easily checked using the above representation of a zone (the edge in dimension i is the pair (l_i,h_i)). One must also take the space-wrapping into consideration. Basically, one can see the dimension i as a ring, instead of a straight line.

3. To see if zone P and Q can be merged together, the original paper of CAN provide an inelegant solution based on the partition tree. As a result of that, a peer must also store an history of other nodes it took-over, as opposed to simply storing a set of coordinates as in (1). In this implementation, P and Q are mergable if their areas added up to the new zone's area, whose coordinate set is constructed from that of P and Q. In particular: (l_i,h_i) = (min(l_i',l_i''), max(h_i',h_i'')). Simple!

4. The key is represent by its coordinate (k_1,k_2,..,k_d). Its distance from/towards a zone P is computed by first identifying the closest point in the zone to K. Then the distance is sqrt((k_1-p_1)^2 + (k_2-p_2)^2 + ... + (k_d-p_d)^2)

Anh's blog

Wednesday, June 17, 2009

A note on CAN implementation

0 Comments:

Previous Posts

About Me