文章目录
1、概述
2、epoll 与红黑树
3、红黑树介绍
4、应用开发方法
5、性能简单测试
6、github 完整源码
7、其他红黑树学习资源
1、概述
本文主要描述红黑树的概念、经典应用场景,并在应用开发层面示例代码说明红黑树的高效特性。
2、epoll 与红黑树
epoll 的高效就在于,当我们调用 epoll_ctl 保存上百万个句柄,epoll_wait 仍然可以飞快的返回,并有效的将发生事件的句柄给我们用户。这是因为我们在调用epoll_create 时,linux 内核除了帮我们在 epoll 文件系统里建了个 file 结点,以及在其内核 cache 里建了个红黑树用于存储以后 epoll_ctl 传来的 socket 外,还会再建立一个 list 链表,用于存储准备就绪的事件,当 epoll_wait 调用时,仅仅观察这个 list 链表里有没有数据即可。有数据就返回,没有数据就 sleep,等到 timeout 时间到后即使链表没数据也返回。所以,epoll_wait 非常高效。
而且,通常情况下即使我们要监控百万计的句柄,大多一次也只返回很少量的准备就绪句柄而已,所以,epoll_wait 仅需要从内核态 copy 少量的句柄到用户态而已,如何能不高效?!
那么,这个准备就绪 list 链表是怎么维护的呢?当我们执行 epoll_ctl 时,除了把 socket 放到 epoll 文件系统里 file 对象对应的红黑树上之外,还会给内核中断处理程序注册一个回调函数,告诉内核,如果这个句柄的中断到了,就把它放到准备就绪 list 链表里。所以,当一个 socket 上有数据到了,内核在把网卡上的数据 copy 到内核中后就来把 socket 插入到准备就绪链表里了。
如此,一颗红黑树,一张准备就绪句柄链表,少量的内核 cache,就帮我们解决了大并发下的 socket 处理问题。执行 epoll_create 时,创建了红黑树和就绪链表,执行 epoll_ctl 时,如果增加 socket 句柄,则检查在红黑树中是否存在,存在立即返回,不存在则添加到树干上,然后向内核注册回调函数,用于当中断事件来临时向准备就绪链表中插入数据。执行 epoll_wait 时立刻返回准备就绪链表里的数据即可。红黑树提供了对于上百万节点的数据存储和快速查找功能,对于 epoll 的高效起到了决定性作用。
本段描述基于下面链接内容修改而来:
http://blog.csdn.net/russell_tao/article/details/7160071
其他参考:
http://www.cnblogs.com/sniperHW/p/3619384.html
3、红黑树介绍
R-B Tree,全称是Red-Black Tree,又称为“红黑树”,它一种特殊的二叉查找树。红黑树的每个节点上都有存储位表示节点的颜色,可以是红(Red)或黑(Black)。
红黑树的特性:
(1)每个节点或者是黑色,或者是红色。
(2)根节点是黑色。
(3)每个叶子节点(NIL)是黑色。 [注意:这里叶子节点,是指为空(NIL或NULL)的叶子节点!]
(4)如果一个节点是红色的,则它的子节点必须是黑色的。
(5)从一个节点到该节点的子孙节点的所有路径上包含相同数目的黑节点。
注意:
(1) 特性(3)中的叶子节点,是只为空(NIL或null)的节点。
(2) 特性(5),确保没有一条路径会比其他路径长出俩倍。因而,红黑树是相对是接近平衡的二叉树。
参考:
http://www.cnblogs.com/skywang12345/p/3624202.html
红黑树的图形化介绍:
http://www.cnblogs.com/yangecnu/p/Introduce-Red-Black-Tree.html
红黑树接口逻辑示意图:
http://www.cnblogs.com/chengxuyuancc/archive/2013/04/06/3002044.html
4、应用开发方法
(1)应用场景
红黑树的应用比较广泛,主要是用它来存储有序的数据,它的时间复杂度是O(lgn),效率非常之高。例如:Epoll 实现、Java集合中的 TreeSet 和 TreeMap、C++ STL 中的 set、map,以及 Linux 虚拟内存的管理,都是通过红黑树去实现的。
参考:http://www.cnblogs.com/skywang12345/p/3624202.html
(2)源码来源:linux 源码、网上、nginx 、自实现
最早认为红黑树的功能可以在 linux 系统中直接引入头文件使用,找到其头文件的位置,引用后编译有很多的找不到的头文件,通过查找资料发现,红黑树源码可以由其他几个方式获取 —— linux 源码、网上、nginx 、自实现;其中在 linux 源码中解藕出来如下面的链接1所示内容,其中也包含其他情况的实现链接。
代码:
rbtree.h
#ifndef _LINUX_RBTREE_H
#define _LINUX_RBTREE_H
#if defined(container_of)
#undef container_of
#define container_of(ptr, type, member) ({ \
const typeof( ((type *)0)->member ) *__mptr = (ptr); \
(type *)( (char *)__mptr - offsetof(type,member) );})
#else
#define container_of(ptr, type, member) ({ \
const typeof( ((type *)0)->member ) *__mptr = (ptr); \
(type *)( (char *)__mptr - offsetof(type,member) );})
#endif
#if defined(offsetof)
#undef offsetof
#define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER)
#else
#define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER)
#endif
#undef NULL
#if defined(__cplusplus)
#define NULL 0
#else
#define NULL ((void *)0)
#endif
struct rb_node
{
unsigned long rb_parent_color;
#define RB_RED 0
#define RB_BLACK 1
struct rb_node *rb_right;
struct rb_node *rb_left;
} __attribute__((aligned(sizeof(long))));
/* The alignment might seem pointless, but allegedly CRIS needs it */
struct rb_root
{
struct rb_node *rb_node;
};
#define rb_parent(r) ((struct rb_node *)((r)->rb_parent_color & ~3))
#define rb_color(r) ((r)->rb_parent_color & 1)
#define rb_is_red(r) (!rb_color(r))
#define rb_is_black(r) rb_color(r)
#define rb_set_red(r) do { (r)->rb_parent_color &= ~1; } while (0)
#define rb_set_black(r) do { (r)->rb_parent_color |= 1; } while (0)
static inline void rb_set_parent(struct rb_node *rb, struct rb_node *p)
{
rb->rb_parent_color = (rb->rb_parent_color & 3) | (unsigned long)p;
}
static inline void rb_set_color(struct rb_node *rb, int color)
{
rb->rb_parent_color = (rb->rb_parent_color & ~1) | color;
}
#define RB_ROOT (struct rb_root) { NULL, }
#define rb_entry(ptr, type, member) container_of(ptr, type, member)
#define RB_EMPTY_ROOT(root) ((root)->rb_node == NULL)
#define RB_EMPTY_NODE(node) (rb_parent(node) == node)
#define RB_CLEAR_NODE(node) (rb_set_parent(node, node))
static inline void rb_init_node(struct rb_node *rb)
{
rb->rb_parent_color = 0;
rb->rb_right = NULL;
rb->rb_left = NULL;
RB_CLEAR_NODE(rb);
}
extern void rb_insert_color(struct rb_node *, struct rb_root *);
extern void rb_erase(struct rb_node *, struct rb_root *);
typedef void (*rb_augment_f)(struct rb_node *node, void *data);
extern void rb_augment_insert(struct rb_node *node,
rb_augment_f func, void *data);
extern struct rb_node *rb_augment_erase_begin(struct rb_node *node);
extern void rb_augment_erase_end(struct rb_node *node,
rb_augment_f func, void *data);
/* Find logical next and previous nodes in a tree */
extern struct rb_node *rb_next(const struct rb_node *);
extern struct rb_node *rb_prev(const struct rb_node *);
extern struct rb_node *rb_first(const struct rb_root *);
extern struct rb_node *rb_last(const struct rb_root *);
/* Fast replacement of a single node without remove/rebalance/add/rebalance */
extern void rb_replace_node(struct rb_node *victim, struct rb_node *new,
struct rb_root *root);
static inline void rb_link_node(struct rb_node * node, struct rb_node * parent,
struct rb_node ** rb_link)
{
node->rb_parent_color = (unsigned long )parent;
node->rb_left = node->rb_right = NULL;
*rb_link = node;
}
#endif /* _LINUX_RBTREE_H */
rbtree.c
#include "rbtree.h"
static void __rb_rotate_left(struct rb_node *node, struct rb_root *root)
{
struct rb_node *right = node->rb_right;
struct rb_node *parent = rb_parent(node);
if ((node->rb_right = right->rb_left))
rb_set_parent(right->rb_left, node);
right->rb_left = node;
rb_set_parent(right, parent);
if (parent)
{
if (node == parent->rb_left)
parent->rb_left = right;
else
parent->rb_right = right;
}
else
root->rb_node = right;
rb_set_parent(node, right);
}
static void __rb_rotate_right(struct rb_node *node, struct rb_root *root)
{
struct rb_node *left = node->rb_left;
struct rb_node *parent = rb_parent(node);
if ((node->rb_left = left->rb_right))
rb_set_parent(left->rb_right, node);
left->rb_right = node;
rb_set_parent(left, parent);
if (parent)
{
if (node == parent->rb_right)
parent->rb_right = left;
else
parent->rb_left = left;
}
else
root->rb_node = left;
rb_set_parent(node, left);
}
void rb_insert_color(struct rb_node *node, struct rb_root *root)
{
struct rb_node *parent, *gparent;
while ((parent = rb_parent(node)) && rb_is_red(parent))
{
gparent = rb_parent(parent);
if (parent == gparent->rb_left)
{
{
register struct rb_node *uncle = gparent->rb_right;
if (uncle && rb_is_red(uncle))
{
rb_set_black(uncle);
rb_set_black(parent);
rb_set_red(gparent);
node = gparent;
continue;
}
}
if (parent->rb_right == node)
{
register struct rb_node *tmp;
__rb_rotate_left(parent, root);
tmp = parent;
parent = node;
node = tmp;
}
rb_set_black(parent);
rb_set_red(gparent);
__rb_rotate_right(gparent, root);
} else {
{
register struct rb_node *uncle = gparent->rb_left;
if (uncle && rb_is_red(uncle))
{
rb_set_black(uncle);
rb_set_black(parent);
rb_set_red(gparent);
node = gparent;
continue;
}
}
if (parent->rb_left == node)
{
register struct rb_node *tmp;
__rb_rotate_right(parent, root);
tmp = parent;
parent = node;
node = tmp;
}
rb_set_black(parent);
rb_set_red(gparent);
__rb_rotate_left(gparent, root);
}
}
rb_set_black(root->rb_node);
}
static void __rb_erase_color(struct rb_node *node, struct rb_node *parent,
struct rb_root *root)
{
struct rb_node *other;
while ((!node || rb_is_black(node)) && node != root->rb_node)
{
if (parent->rb_left == node)
{
other = parent->rb_right;
if (rb_is_red(other))
{
rb_set_black(other);
rb_set_red(parent);
__rb_rotate_left(parent, root);
other = parent->rb_right;
}
if ((!other->rb_left || rb_is_black(other->rb_left)) &&
(!other->rb_right || rb_is_black(other->rb_right)))
{
rb_set_red(other);
node = parent;
parent = rb_parent(node);
}
else
{
if (!other->rb_right || rb_is_black(other->rb_right))
{
rb_set_black(other->rb_left);
rb_set_red(other);
__rb_rotate_right(other, root);
other = parent->rb_right;
}
rb_set_color(other, rb_color(parent));
rb_set_black(parent);
rb_set_black(other->rb_right);
__rb_rotate_left(parent, root);
node = root->rb_node;
break;
}
}
else
{
other = parent->rb_left;
if (rb_is_red(other))
{
rb_set_black(other);
rb_set_red(parent);
__rb_rotate_right(parent, root);
other = parent->rb_left;
}
if ((!other->rb_left || rb_is_black(other->rb_left)) &&
(!other->rb_right || rb_is_black(other->rb_right)))
{
rb_set_red(other);
node = parent;
parent = rb_parent(node);
}
else
{
if (!other->rb_left || rb_is_black(other->rb_left))
{
rb_set_black(other->rb_right);
rb_set_red(other);
__rb_rotate_left(other, root);
other = parent->rb_left;
}
rb_set_color(other, rb_color(parent));
rb_set_black(parent);
rb_set_black(other->rb_left);
__rb_rotate_right(parent, root);
node = root->rb_node;
break;
}
}
}
if (node)
rb_set_black(node);
}
void rb_erase(struct rb_node *node, struct rb_root *root)
{
struct rb_node *child, *parent;
int color;
if (!node->rb_left)
child = node->rb_right;
else if (!node->rb_right)
child = node->rb_left;
else
{
struct rb_node *old = node, *left;
node = node->rb_right;
while ((left = node->rb_left) != NULL)
node = left;
if (rb_parent(old)) {
if (rb_parent(old)->rb_left == old)
rb_parent(old)->rb_left = node;
else
rb_parent(old)->rb_right = node;
} else
root->rb_node = node;
child = node->rb_right;
parent = rb_parent(node);
color = rb_color(node);
if (parent == old) {
parent = node;
} else {
if (child)
rb_set_parent(child, parent);
parent->rb_left = child;
node->rb_right = old->rb_right;
rb_set_parent(old->rb_right, node);
}
node->rb_parent_color = old->rb_parent_color;
node->rb_left = old->rb_left;
rb_set_parent(old->rb_left, node);
goto color;
}
parent = rb_parent(node);
color = rb_color(node);
if (child)
rb_set_parent(child, parent);
if (parent)
{
if (parent->rb_left == node)
parent->rb_left = child;
else
parent->rb_right = child;
}
else
root->rb_node = child;
color:
if (color == RB_BLACK)
__rb_erase_color(child, parent, root);
}
static void rb_augment_path(struct rb_node *node, rb_augment_f func, void *data)
{
struct rb_node *parent;
up:
func(node, data);
parent = rb_parent(node);
if (!parent)
return;
if (node == parent->rb_left && parent->rb_right)
func(parent->rb_right, data);
else if (parent->rb_left)
func(parent->rb_left, data);
node = parent;
goto up;
}
/* * after inserting @node into the tree, update the tree to account for * both the new entry and any damage done by rebalance */
void rb_augment_insert(struct rb_node *node, rb_augment_f func, void *data)
{
if (node->rb_left)
node = node->rb_left;
else if (node->rb_right)
node = node->rb_right;
rb_augment_path(node, func, data);
}
/* * before removing the node, find the deepest node on the rebalance path * that will still be there after @node gets removed */
struct rb_node *rb_augment_erase_begin(struct rb_node *node)
{
struct rb_node *deepest;
if (!node->rb_right && !node->rb_left)
deepest = rb_parent(node);
else if (!node->rb_right)
deepest = node->rb_left;
else if (!node->rb_left)
deepest = node->rb_right;
else {
deepest = rb_next(node);
if (deepest->rb_right)
deepest = deepest->rb_right;
else if (rb_parent(deepest) != node)
deepest = rb_parent(deepest);
}
return deepest;
}
/* * after removal, update the tree to account for the removed entry * and any rebalance damage. */
void rb_augment_erase_end(struct rb_node *node, rb_augment_f func, void *data)
{
if (node)
rb_augment_path(node, func, data);
}
/* * This function returns the first node (in sort order) of the tree. */
struct rb_node *rb_first(const struct rb_root *root)
{
struct rb_node *n;
n = root->rb_node;
if (!n)
return NULL;
while (n->rb_left)
n = n->rb_left;
return n;
}
struct rb_node *rb_last(const struct rb_root *root)
{
struct rb_node *n;
n = root->rb_node;
if (!n)
return NULL;
while (n->rb_right)
n = n->rb_right;
return n;
}
struct rb_node *rb_next(const struct rb_node *node)
{
struct rb_node *parent;
if (rb_parent(node) == node)
return NULL;
/* If we have a right-hand child, go down and then left as far as we can. */
if (node->rb_right) {
node = node->rb_right;
while (node->rb_left)
node=node->rb_left;
return (struct rb_node *)node;
}
/* No right-hand children. Everything down and left is smaller than us, so any 'next' node must be in the general direction of our parent. Go up the tree; any time the ancestor is a right-hand child of its parent, keep going up. First time it's a left-hand child of its parent, said parent is our 'next' node. */
while ((parent = rb_parent(node)) && node == parent->rb_right)
node = parent;
return parent;
}
struct rb_node *rb_prev(const struct rb_node *node)
{
struct rb_node *parent;
if (rb_parent(node) == node)
return NULL;
/* If we have a left-hand child, go down and then right as far as we can. */
if (node->rb_left) {
node = node->rb_left;
while (node->rb_right)
node=node->rb_right;
return (struct rb_node *)node;
}
/* No left-hand children. Go up till we find an ancestor which is a right-hand child of its parent */
while ((parent = rb_parent(node)) && node == parent->rb_left)
node = parent;
return parent;
}
void rb_replace_node(struct rb_node *victim, struct rb_node *new,
struct rb_root *root)
{
struct rb_node *parent = rb_parent(victim);
/* Set the surrounding nodes to point to the replacement */
if (parent) {
if (victim == parent->rb_left)
parent->rb_left = new;
else
parent->rb_right = new;
} else {
root->rb_node = new;
}
if (victim->rb_left)
rb_set_parent(victim->rb_left, new);
if (victim->rb_right)
rb_set_parent(victim->rb_right, new);
/* Copy the pointers/colour from the victim to the replacement */
*new = *victim;
}
链接1:http://www.cnblogs.com/skywang12345/p/3624202.html
(3)应用开发接口:插入、搜索、释放;
获取到红黑树的源码后,使用红黑树还要:基于内核红黑树接口,自定义的一个结构体,来存储你所在应用场景的节点数据,方便应用开发;并实现相应红黑树的操作接口(添加、搜索、释放等)
代码:
red_black_tree_app.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <time.h>
#include <sys/time.h>
#include "rbtree.h"
#define NUM_NODES 1000000
struct mynode {
struct rb_node node;
char *string;
};
struct rb_root mytree = RB_ROOT;
struct mynode * my_search(struct rb_root *root, char *string)
{
struct rb_node *node = root->rb_node;
while (node) {
struct mynode *data = container_of(node, struct mynode, node);
int result;
result = strcmp(string, data->string);
if (result < 0)
node = node->rb_left;
else if (result > 0)
node = node->rb_right;
else
return data;
}
return NULL;
}
int my_insert(struct rb_root *root, struct mynode *data)
{
struct rb_node **new = &(root->rb_node), *parent = NULL;
/* Figure out where to put new node */
while (*new) {
struct mynode *this = container_of(*new, struct mynode, node);
int result = strcmp(data->string, this->string);
parent = *new;
if (result < 0)
new = &((*new)->rb_left);
else if (result > 0)
new = &((*new)->rb_right);
else
return 0;
}
/* Add new node and rebalance tree. */
rb_link_node(&data->node, parent, new);
rb_insert_color(&data->node, root);
return 1;
}
void my_free(struct mynode *node)
{
if (node != NULL) {
if (node->string != NULL) {
free(node->string);
node->string = NULL;
}
free(node);
node = NULL;
}
}
static float get_more_sec_time(struct timeval start, struct timeval end)
{
float ftime = (end.tv_sec-start.tv_sec) + (end.tv_usec-start.tv_usec)/1000000.0;
return ftime;
}
int main()
{
struct mynode *mn[NUM_NODES];
/* *insert */
int i = 0;
for (; i < NUM_NODES; i++)
{
mn[i] = (struct mynode *)malloc(sizeof(struct mynode));
mn[i]->string = (char *)malloc(sizeof(char) * 4);
sprintf(mn[i]->string, "%d", i);
my_insert(&mytree, mn[i]);
}
gettimeofday( &end, NULL );
printf("insert %d figure run time:%f s.\n", NUM_NODES, get_more_sec_time(start, end));
/* search 1 test */
gettimeofday( &start, NULL );
struct mynode *data = my_search(&mytree, "20");
if (data)
{
gettimeofday( &end, NULL );
printf("get one data(%s) from red_black_tree run time: %d.%06d s.\n", data->string,(end.tv_sec-start.tv_sec), (end.tv_usec-start.tv_usec));
}
return 0;
}
参考:http://www.cnblogs.com/haippy/archive/2012/09/02/2668099.html
5、性能简单测试
基于以上提供的红黑树应用接口,实现的性能测试。代码:参考上方:red_black_tree_app.c
测试数据:
(1)插入 1000 条数据、从中查出值为 20 的节点分别耗费的时间:
insert 1000 figures run time: 0.000517 s.
get one data(20) from red_black_tree run time: 0.000001 s.
(2)插入 1000000 条数据、从中查出值为 20 的节点分别耗费的时间:
insert 1000000 figure run time:0.817847 s.
get one data(20) from red_black_tree run time: 0.000002 s.
小结:从百万级数据集和千级数据集获取某个值,其时间都在微秒级;由此可以想象为什么 epoll 随着并发量的增加其性能没有明显下降的原因了。
6、github 完整源码
https://github.com/raoping2017/red_black_tree_app.git
7、其他红黑树学习资源
网上看到了一个红黑树不错的博客,这里也链接过来,作者是 July:
http://blog.csdn.net/v_JULY_v/article/details/6114226