集合与字典（上）----集合

2019年4月21日 129次阅读来源: 帅气的昵称都有人用了

集合

集合是现代数学中一个非常重要的概念，在某些情况下又被称为类、族或者搜索。

但实际上，数学上的集合与计算机当中所说的集合有很大的区别。首先数学中的集合可以是无限集，即集合中的元素可以由无穷多个，但计算机当中是一定要保证集合为有限集的。其次，根据定义来说，集合中的元素是没有排列顺序的，但为了方便运算，我们要保证集合中的元素是有序排放的，并且集合中必须存放相同数据类型的数据元素，这样才能通过关系运算符>和<来保证元素的顺序。
除了基本的概念之外，我们还要知道集合的三种基本运算，即并运算，交运算，差运算。具体的定义就不在这里赘述了。

集合的实现

先来讲一下位向量集合，其实位向量是一个数组，数组中每一个元素都是一个二进制位（0或者1）。位向量中并不存储实际的数据而只存储数据的状态，而且这个状态必须具有二值性。

计算机中每种数据都有一定的类型，每种类型的数据都只有有限个。这就从理论上确保了使用位向量表示数据状态的可行性。

我们来举一个非常简单的例子：

《集合与字典（上）----集合》位向量表示法

《集合与字典（上）----集合》 A与B并运算其位向量

《集合与字典（上）----集合》 A与B交运算其位向量

《集合与字典（上）----集合》 A与B差运算其位向量

大致的意思明白了，那么该如何实现呢？

class VecSet{
    int* contain;
    int size;
public:
    VecSet(int MaxSize = DefaultSize);
    ~VecSet();
    
    void Add(int add);
    void Del(int del);
    void MakeEmpty();
    int GetSize() {return size;}
    bool IsMember(const int x);
    VecSet& operator+(VecSet& another);
    VecSet& operator-(VecSet& another);
    VecSet& operator*(VecSet& another);
    VecSet& operator=(VecSet& another);
    bool operator==(VecSet& another);
    friend ostream& operator<<(ostream& stream, VecSet& set);
};

类中的成员函数的实现：

/** VecSet构造函数 */
VecSet::VecSet(int MaxSize):size(MaxSize) {
    assert(MaxSize > 0);
    contain = new int[size];
    assert(contain != NULL);
    MakeEmpty();
}
/** VecSet析构函数 */
VecSet::~VecSet() {
    if(contain != NULL)
        delete [] contain;
}
/** VecSet主要成员函数 */
void VecSet::Add(int add) {
    assert(add >= 0 && add < size);
    contain[add] = 1;
}
void VecSet::Del(int del) {
    assert(del >= 0 && del < size);
    contain[del] = 0;
}
void VecSet::MakeEmpty() {
    for(int i=0; i < size; i++)
        contain[i] = 0;
}
bool VecSet::IsMember(const int x) {
    assert(x >= 0 && x < size);
    return contain[x] != 0;
}

其中，函数Add(int add)用于向位向量中添加数add，即把代表add的二进制位置为1；
函数Del(int del)用于从位向量中删除数del，即把代表del的二进制位置为0；
除了一些基本的函数实现之外，还有一些运算符的重载，是用于实现我们前面所提到的交运算，并运算与差运算的。

/** 运算符的重载方法 */
VecSet& VecSet::operator+(VecSet &another) {
    assert(this -> GetSize() == another.GetSize());
    VecSet* temp = new VecSet(this -> GetSize());
    for(int i=0; i < size; i++) {
        (*temp).contain[i] = contain[i]||another.contain[i];
    }
    return *temp;
}
VecSet& VecSet::operator-(VecSet &another) {
    assert(this -> GetSize() == another.GetSize());
    VecSet* temp = new VecSet(this -> GetSize());
    for(int i=0; i<size; i++) {
        // 逻辑运算符&&的运算规则是当&&左边的条件为假时，运算结果总为假，当且仅当运算符&&左边的二进制为1，且右边的
        // 二进制值为0时运算结果为1.所以在这里使用了一个！来辅助实现这种运算规则
        (*temp).contain[i] = contain[i] && (!another.contain[i]); 
    }
    return *temp;
}
VecSet& VecSet::operator*(VecSet &another) {
    assert(this -> GetSize() == another.GetSize());
    VecSet* temp = new VecSet(this -> GetSize());
    for(int i=0; i<size; i++) {
        (*temp).contain[i] = contain[i] && another.contain[i];
    }
    return *temp;
}
VecSet& VecSet::operator=(VecSet &another) {
    assert(this -> GetSize() == another.GetSize());
    VecSet* temp = new VecSet(this -> GetSize());
    for(int i=0; i<size; i++) {
        contain[i] = another.contain[i];
    }
    return *temp;
}
bool VecSet::operator==(VecSet &another) {
    assert(this -> GetSize() == another.GetSize())
    for(int i=0; i<size; i++) {
        if(contain[i] != another.contain[i])
            return false;
    }
    return true;
}
ostream & VecSet::operator<<(ostream &stream, VecSet &set) {
    for(int i=0; i<size; i++) {
        if(set.IsMember(i)) cout << i <<" ";
    }
    cout << endl;
    return stream;
}

在这里面，+用于求两个集合的并运算，-用于求两个集合的差运算，*用于求两个集合的交运算，=用于把集合another整体赋值给当前集合，==用于判断两个集合是否相等，<<用于输出当前集合。
虽然VecSet的定义与实现都不是很复杂，但还是要注意size的取值，不要取得过大，以防会占用过多的存储空间。

位向量集合具有两个优点：
·由于位向量集合是使用静态数组实现的，因此实现方式简单，算法时间复杂度较低。
·位于位向量集合中的元素与元素之间是没有任何关联的，因此添加或删除某个元素不会对集合中的其他元素造成任何的影响。

单链表集合

虽然前面的位向量集合已经很不错了，但是其最大的一个缺点在于：无法动态的调整整个数组的大小。在之前学习数据结构的时候，我们因为数组是静态的，为了方便我们的动态调整，我们引申出了链表的概念，在这里也完全一样，我们引申出链表集合的概念。
但链表集合的概念又和刚才所说的位向量集合有很大的区别，因为刚才的位向量集合中，我们存放的是二进制的标志值，但在链表集合中，我们的目标是存放各种各样的内容，因此需要达到支持用户自定义数据类型的目的，我们在设计链表集合时采用模板机制。而且我们还需要对>和<进行重载，以便我们对所定义的内容的大小进行规定，比如对水果的名称进行排序，我们就可以按照字典序的方式进行排列。
首先我们先给出类ListSet的定义：

template <class T> class ListSet;
template <class T>
class ListSetNode
{
    friend class ListSet<T>;
    T data;
    ListSetNode<T>* link;
public:
    ListSetNode(): link(NULL) {}
    ListSetNode(T value): link(NULL), data(value) {}
};
template <class T>
class ListSet
{
    ListSetNode<T>* head;
    ListSetNode<T>* tail;
public:
    ListSet();
    ~ListSet();
    ListSet(ListSet& listSet);
    
    void Add(T add);
    void Del(T del);
    bool IsEmpty();
    void MakeEmpty();
    ListSet<T>& operator+(ListSet<T>& another);
    ListSet<T>& operator-(ListSet<T>& another);
    ListSet<T>& operator*(ListSet<T>& another);
    ListSet<T>& operator=(ListSet<T>& another);
    bool operator==(ListSet<T>& another);
    void Output();
    ListSetNode<T>* GetHead() {return head;}
};

和我们之前所学习的单链表一样，我们需要先定义节点，再定义链表。
接下来我们对每一个成员函数进行实现：

template <class T>
ListSet<T>::ListSet() {
    this->head = new ListSetNode<T>();
    this->tail = this->head;
}
template <class T>
ListSet<T>::~ListSet() {
    MakeEmpty();
    delete head;
}
template <class T>
ListSet<T>::ListSet(ListSet<T>& listSet) {
    this->head = new ListSetNode<T>();
    this->tail = this->head;
    ListSetNode<T>* current = listSet.head->link;
    while(current != nullptr)
    {
        Add(current->data);
        current = current->link;
    }
}

上述构造函数的功能是创建一个只有表头节点的空链表集合。

template <class T>
void ListSet<T>::Add(T value)
{
    ListSetNode<T>* add = new ListSetNode<T>(value);
    ListSetNode<T>* preCurrent = head;   // 记录当前节点
    ListSetNode<T>* current = preCurrent->link;  // 记录当前节点的前驱节点
    while(current != NULL && current->data < value) {   // 寻找插入位置
        preCurrent = current;
        current = preCurrent->link;
    }
    if(head == tail && current == NULL) {   // 向空链表中插入节点
        head -> link = add;
        tail = add;
    }
    if(head != tail && current == NULL) {   // 向链表尾插入值add
        preCurrent -> link = add;
        add -> link = NULL;
        tail = add;
    }
    if(current != NULL && current -> data == value) {   // 链表中已存在值add
        return;
    }
    if(current != NULL && current -> data > value) {
        preCurrent -> link = add;
        add -> link = current;
    }
}

因为我们需要保证集合中元素的唯一性，因此我们在发现链表中已存在值add时，算法直接结束。

template <class T> 
void ListSet<T>::Del(T del) {
    ListSetNode<T>* preCurrent = head;
    ListSetNode<T>* current = preCurrent -> link;
    while(current != NULL && current -> data != del) {   // 寻找删除节点
        preCurrent = current;
        current = preCurrent -> link;
    }
    if(current != NULL) {
        preCurrent -> link = current -> link;
        if(current -> link == NULL) {tail = preCurrent;}   // 更新表尾指针
        delete current;
    }
}

Add(T add)和Del(T del)方法与我们之前的链表的基本操作是基本一致的。
但既然是集合，那么就一定需要交，并，差运算，那么我们还是通过运算符重载的方法进行实现。

 /** 并运算 */
 template <class T>
 ListSet<T>& ListSet<T>::operator+(ListSet<T> &another) {
    ListSet<T>* temp = new ListSet(another);
    ListSetNode<T>* current = this -> head -> link;
    while(current != nullptr) {
        temp -> Add(current -> data);
        current = current -> link;
    }
    return *temp;
 }
 /** 差运算 */
 template <class T>
 ListSet<T>& ListSet<T>::operator-(ListSet<T> &another) {
     ListSet<T>* temp = new ListSet(*this);
     ListSetNode<T>* current = another.head -> link;
     while(current != nullptr) {
         temp -> Del(current -> data);
         current = current -> link;
     }
     return *temp;
 }
 /** 交运算 */
 template <class T>
 ListSet<T>& ListSet<T>::operator*(ListSet<T> &another) {
     ListSet<T>* temp = new ListSet(*this);
     *temp = *this - another;
     *temp = *this - *temp;
     return *temp;
 }

在交运算中，我们使用了数学公式:
《集合与字典（上）----集合》

其余成员函数实现如下：

 template <class T>
 bool ListSet<T>::IsEmpty() {
     return head == tail;
 }
 
 template <class T>
 void ListSet<T>::MakeEmpty() {
     ListSetNode<T>* current = head -> link;
     ListSetNode<T>* del = nullptr;
     while(head -> link != nullptr) {   // 循环删除集合中的元素
         head -> link = current -> link;
         del = current;
         current = current -> link;
         delete del;
     }
     tail = head;
 }
 
 template <class T>
 ListSet<T>& ListSet<T>::operator=(ListSet<T> &another) {
     MakeEmpty();
     ListSetNode<T>* current = another.head -> link;
     while(current != nullptr) {
         Add(current -> data);
         current = current -> link;
     }
     return *this;
 }
 
 template <class T>
 bool ListSet<T>::operator==(ListSet<T> &another) {
     ListSetNode<T>* current1 = head -> link;
     ListSetNode<T>* current2 = another.head -> link;
     while(current1 != nullptr) {
         if(current2 == nullptr) {return false;}
         if(current1 -> data != current2 -> data) {return false;}
         current1 = current1 -> link;
         current2 = current2 -> link;
     }
     if(current2 != nullptr) {return false;}
     return true;
 }
 
 template <class T>
 void ListSet<T>::Output() {
     ListSetNode<T>* current = this -> GetHead() -> link;
     while(current != nullptr) {
         cout << current -> data << " ";
         current = current -> next;
     }
     cout << end;
 }

以上这两种集合表示的方法都具有不同的优点和缺点，因此他们也适用于不同的情况中。因此要根据具体情况进行具体选择。在面临数据量比较大的情况中时，若存储空间允许，类ListSet可以存储“无限”多的数据，类ListSet的缺点是该类中方法的时间复杂度较高，实现效率不如类VecSet。

以上就是集合的基本概念与实现方法，下一次我们将继续讲解字典的用法，敬请期待。

    原文作者：帅气的昵称都有人用了
    原文地址: https://www.jianshu.com/p/717493054218
    本文转自网络文章，转载此文章仅为分享知识，如有侵权，请联系博主进行删除。