oracle null值和索引

2024年5月20日 386次阅读来源: 拉风的菜鸟

NULL值是关系数据库系统布尔型(true,false,unknown)中比较特殊类型的一种值，通常称为UNKNOWN或空值，即是未知的，不确定的。由于NULL存在着无数的可能，因此NULL值也不等于NULL值，所以与NULL值相关的操作同样都为NULL值。正是基于这样一个特性，对于NULL值列上的B树索引导致了is null/is not null不走索引的情形，下面描述了NULL值与索引以及索引NULL列上的执行计划，如何使得NULL值走索引的情形。注：本文仅仅讨论的是B树索引上的NULL值，位图索引不在此范围之内。

一、null值与索引
SQL> create table test(id number(2),name varchar2(10),age number(2));
SQL> create unique index test_id_ind on test(id);

//id上存在唯一索引，但由于null值不等于null值，因此能成功插入
SQL> insert into test(id,name,age) values(null,’test1′,10);
SQL> insert into test(id,name,age) values(null,’test2′,10);
SQL> commit;

//在test表上创建唯一复合索引，基于id,name列
SQL> create unique index test_id_name_ind on test(id,name);

//基于多列的复合索引，不全为null的重复行则不能被插入
SQL> insert into test(id,name,age) values(null,’test1′,20);
insert into test(id,name,age) values(null,’test1′,20)
*
第 1 行出现错误:
ORA-00001: 违反唯一约束条件 (SCOTT.TEST_ID_NAME_IND)

//(null,null)不等同于(null,null)，所以的两次null能够被插入。
SQL> insert into test(id,name,age) values(null,null,20);
SQL> insert into test(id,name,age) values(null,null,20);

SQL> commit;

SQL> select * from test;
ID NAME AGE
———- ———- ———-
test1 10
test2 10
20
20

SQL> exec dbms_stats.gather_table_stats(‘scott’,’test’,cascade=>true);

//注意：虽然我们建表时使用了”test”小写表名，但因为表名以大写存储，所以此时应该使用”TEST”
SQL> select index_name,index_type,num_rows,status,distinct_keys from user_indexes where table_name=’TEST’;

INDEX_NAME INDEX_TYPE NUM_ROWS STATUS DISTINCT_KEYS
—————— ———— ———- ——– ————-
TEST_ID_IND NORMAL 0 VALID 0
TEST_ID_NAME_IND NORMAL 2 VALID 2

二、null值与执行计划
SQL> set autotrace traceonly;

由于null值是不被存储，所以当使用id is null作为谓词时，走了全表扫描
SQL> select * from test where id is null;
————————————————————————–
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
————————————————————————–
| 0 | SELECT STATEMENT | | 3 | 21 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| TEST | 3 | 21 | 3 (0)| 00:00:01 |
————————————————————————–

尽管当前表上id列上的所有值都为null，但不排除后续记录插入的id不为null的列。所以当使用id is not null作为谓词时，此时执行计划中走了索引全扫描。
SQL> select * from test where id is not null;
————————————————————
| Id | Operation | Name | Rows |
————————————————————
| 0 | SELECT STATEMENT | | 1 |
| 1 | TABLE ACCESS BY INDEX ROWID| TEST | 1 |
|* 2 | INDEX FULL SCAN | TEST_ID_IND | 1 |
————————————————————

对于复合唯一索引的情形，当使用单列且非前导列谓词时，使用is null与 is not null等同于单列唯一索引的情形。name is null走全表扫描而name is not null走索引。
SQL> select * from test where name is null;
————————————————————————–
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
————————————————————————–
| 0 | SELECT STATEMENT | | 1 | 7 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| TEST | 1 | 7 | 3 (0)| 00:00:01 |
————————————————————————–

SQL> select * from test where name is not null;
————————————————————————
| Id | Operation | Name | Rows | Bytes |
————————————————————————
| 0 | SELECT STATEMENT | | 2 | 14 |
| 1 | TABLE ACCESS BY INDEX ROWID| TEST | 2 | 14 |
|* 2 | INDEX FULL SCAN | TEST_ID_NAME_IND | 1 | |
————————————————————————

//下面看看两个列都作为谓词的情形

尽管两个谓词列上都存在索引，一个为单列唯一索引，一个为复合唯一索引。Oracle 选择了复合索引test_id_name_ind。
SQL> select * from test where id is null and name is not null;
—————————————————————-
| Id | Operation | Name | Rows |
—————————————————————-
| 0 | SELECT STATEMENT | | 2 |
| 1 | TABLE ACCESS BY INDEX ROWID| TEST | 2 |
|* 2 | INDEX RANGE SCAN | TEST_ID_NAME_IND | 1 |
—————————————————————-

谓词的顺序与复合索引定义的顺序一样，第一个谓词为id is not null，而第二个谓词为name is null。此时Oracle 选择了单列唯一索引test_id_ind。
SQL> select * from test where id is not null and name is null;
——————————————————————-
| Id | Operation | Name | Rows | Bytes |
——————————————————————-
| 0 | SELECT STATEMENT | | 1 | 7 |
|* 1 | TABLE ACCESS BY INDEX ROWID| TEST | 1 | 7 |
|* 2 | INDEX FULL SCAN | TEST_ID_IND | 1 | |
——————————————————————-

三、is null走索引
SQL> set autot off;

SQL> delete from test where name is null;

SQL> select * from test;
ID NAME AGE
———- ———- ———-
test1 10
test2 10

SQL> update test set id=1 where name=’test1′;
SQL> update test set id=2 where name=’test2′;
SQL> commit;

SQL> alter table test modify id not null;

SQL> set autot trace exp;
SQL> select * from test where id is null;
—————————————————————-
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
—————————————————————-
| 0 | SELECT STATEMENT | | 1 | 7 | 0 (0)|
|* 1 | FILTER | | | | |
| 2 | TABLE ACCESS FULL| TEST | 3 | 21 | 3 (0)|
—————————————————————-

//组合索引中，第一个字段没有出现在where语句中，此时走INDEX SKIP SCAN
SQL> select * from test where name is null;
—————————————————————-
| Id | Operation | Name | Rows |
—————————————————————-
| 0 | SELECT STATEMENT | | 1 |
| 1 | TABLE ACCESS BY INDEX ROWID| TEST | 1 |
|* 2 | INDEX SKIP SCAN | TEST_ID_NAME_IND | 1 |
—————————————————————-

SQL> insert into test(id,name,age) values(3,null,30);
SQL> commit;

SQL> exec dbms_stats.gather_table_stats(‘scott’,’TEST’,cascade=>true);

SQL> select index_name,index_type,num_rows,status,distinct_keys from user_indexes where table_name=’TEST’;
INDEX_NAME INDEX_TYPE NUM_ROWS STATUS DISTINCT_KEYS
—————— ———– ———— ——– ————-
TEST_ID_IND NORMAL 3 VALID 3
TEST_ID_NAME_IND NORMAL 3 VALID 3

四、通过基于函数的索引来使得is null使用索引
SQL> create table test2(id number(2),name varchar2(10));

//用nvl函数来创建函数索引，即当obj_id为null值时，存储-1
SQL> create index test2_fn_ind on test2(nvl(id,-1));

SQL> select count(*) from test2 where nvl(id,-1)=-1;
———————————————————
| Id | Operation | Name | Rows | Bytes |
———————————————————
| 0 | SELECT STATEMENT | | 1 | 13 |
| 1 | SORT AGGREGATE | | 1 | 13 |
|* 2 | INDEX RANGE SCAN| TEST_FN_IND | 1 | 13 |
———————————————————

五、使用伪列创建基于函数的索引来使得is null使用索引
//通过添加一个值为-1(可取任意值)的伪列来创建索引
SQL> create index test2_new_id_ind on test2(id,-1);
SQL> select count(*) from test2 where id is null;
————————————————————-
| Id | Operation | Name | Rows | Bytes |
————————————————————-
| 0 | SELECT STATEMENT | | 1 | 13 |
| 1 | SORT AGGREGATE | | 1 | 13 |
|* 2 | INDEX RANGE SCAN| TEST_NEW_ID_IND | 1 | 13 |
————————————————————-

四、总结

无论是单列唯一索引或复合唯一索引，对于可以为null的列或复合null值，Oracle不会为其存储索引值。故在基于单列创建B树唯一索引或多列创建B树复合唯一索引的情形下，
当列上允许为null值时：
where子句使用了基于is null的情形，其执行计划走全表扫描。
where子句使用了基于is not null的情形，其执行计划走索引扫描。

当列上不允许为null值时，存在非null约束：
where子句使用了基于is null的情行，其执行计划走索引扫描。
where子句使用了基于is not null的情形，其执行计划走索引扫描。

Null与索引使用时的建议：
1、对于用于连接或经常被谓词使用到的列应尽可能避免NULL值属性，因为它容易导致索引失效。
2、为需要使用NULL值的列添加缺省值(alter table tb modify(col default ‘Y’))。
3、如果NULL值不可避免也不能使用缺省值，应考虑为该常用列使用nvl函数创建索引，或使用伪列来创建索引以提高查询性能。
4、对于复合索引应保证索引中至少有一列不为NULL值，还是因为全部列为NULL时不被索引存储，以保证使用is null是可以使用索引。
5、对于复合索引应保证索引列应使用数据类型长度最小的列来添加not null约束应节省磁盘空间。

    原文作者：拉风的菜鸟
    原文地址: https://blog.csdn.net/qq_38218238/article/details/82418392
    本文转自网络文章，转载此文章仅为分享知识，如有侵权，请联系博主进行删除。