The NULL value means “no data.”

The NULL value can be surprising until you get used to it. Conceptually, NULL means “a missing unknown value” and it is treated somewhat differently from other values.

使用

官方举例

1
2
mysql> INSERT INTO my_table (phone) VALUES (NULL);
mysql> INSERT INTO my_table (phone) VALUES ('');

分别用NULL和空字符串表达不同的含义
第一句被看作:电话号码未知
第二句被看作:这个人没有电话,所以没有电话号码
这样看来,官方是推荐我们使用NULL来表达 未知

1
2
3
4
5
6
mysql> SELECT NULL, 1+NULL, CONCAT('Invisible',NULL);
+------+--------+--------------------------+
| NULL | 1+NULL | CONCAT('Invisible',NULL) |
+------+--------+--------------------------+
| NULL | NULL | NULL |
+------+--------+--------------------------+

以上SQL语句验证了NULL的不可操作性

1
2
mysql> SELECT * FROM my_table WHERE phone = NULL;
Empty set (0.00 sec)

以上SQL语句验证了NULL的不可比性

两个常用的函数

IFNULL(expr1, expr2):expr1不为NULL时返回expr1,否则返回expr2
我们可以借助IFNULL函数判定该字段是否是一个NULL值

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
mysql> SELECT IFNULL(1,0);
+-------------+
| IFNULL(1,0) |
+-------------+
| 1 |
+-------------+
mysql> SELECT IFNULL(NULL,10);
+-----------------+
| IFNULL(NULL,10) |
+-----------------+
| 10 |
+-----------------+
mysql> SELECT IFNULL(1/0,10);
+----------------+
| IFNULL(1/0,10) |
+----------------+
| 10.0000 |
+----------------+
由于对0的整除是无意义的,所以MySQL视其为NULL
mysql> SELECT IFNULL(1/0,'yes');
+-------------------+
| IFNULL(1/0,'yes') |
+-------------------+
| yes |
+-------------------+
同上

NULLIF(expr1,expr2):expr1 = expr2相等返回NULL,否则返回expr1

1
2
3
4
5
6
7
8
9
10
11
12
13
mysql> SELECT NULLIF(1,1);
+-------------+
| NULLIF(1,1) |
+-------------+
| NULL |
+-------------+
mysql> SELECT NULLIF(1,2);
+-------------+
| NULLIF(1,2) |
+-------------+
| 1 |
+-------------+

对于经常在SQL中处理逻辑判断的需求是个快捷的方法,可以替代下面复杂的写法

1
CASE WHEN expr1 = expr2 THEN NULL ELSE expr1 END.

注意事项

  • 不区分大小写
  • 与 \N 同义 【LOAD DATA INFILE/SELECT … INTO OUTFILE的文本中NULL的显示内容】
  • 即不是0也不是空字符串
  • 在MySQL中,0和NULL都表示FALSE
  • 不能与任何值进行比较,包括NULL与NULL的比较,且不会得到任何有意义的结果

    1
    2
    3
    4
    5
    6
    mysql> SELECT 1 = NULL, 1 <> NULL, 1 < NULL, 1 > NULL;
    +----------+-----------+----------+----------+
    | 1 = NULL | 1 <> NULL | 1 < NULL | 1 > NULL |
    +----------+-----------+----------+----------+
    | NULL | NULL | NULL | NULL |
    +----------+-----------+----------+----------+
  • 0和空字符串表示实际值,所以可以插入到NOT NULL的字段中

    1
    2
    3
    4
    5
    6
    mysql> SELECT 0 IS NULL, 0 IS NOT NULL, '' IS NULL, '' IS NOT NULL;
    +-----------+---------------+------------+----------------+
    | 0 IS NULL | 0 IS NOT NULL | '' IS NULL | '' IS NOT NULL |
    +-----------+---------------+------------+----------------+
    | 0 | 1 | 0 | 1 |
    +-----------+---------------+------------+----------------+

    引申:那么NOT NULL的比较是什么样子

    1
    2
    3
    4
    5
    6
    mysql> SELECT 1 IS NOT NULL, 0 IS NOT NULL, NULL IS NOT NULL;
    +---------------+---------------+------------------+
    | 1 IS NOT NULL | 0 IS NOT NULL | NULL IS NOT NULL |
    +---------------+---------------+------------------+
    | 1 | 1 | 0 |
    +---------------+---------------+------------------+

数据类型相关

  • 对于INT和FLOAT类型,且有AUTO_INCREMENT属性的字段,插入的NULL值会被自动增加步长
    sql_auto_is_null=ON时【默认OFF】,插入一个非明确值,如:

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    katyusha@127.0.0.1:3306:test 14:35:44>desc t;
    +-------+---------+------+-----+---------+----------------+
    | Field | Type | Null | Key | Default | Extra |
    +-------+---------+------+-----+---------+----------------+
    | id | int(11) | NO | PRI | NULL | auto_increment |
    +-------+---------+------+-----+---------+----------------+
    katyusha@127.0.0.1:3306:test 12:40:24>insert into t values (null);
    Query OK, 1 row affected (0.01 sec)
    katyusha@127.0.0.1:3306:test 14:35:31>insert into t values();
    Query OK, 1 row affected (0.01 sec)
    katyusha@127.0.0.1:3306:test 14:35:35>select * from t where id is null;
    +----+
    | id |
    +----+
    | 6 |
    +----+

    对NOT NULL的字段使用IS NULL,会得到LAST_INSERT_ID()的结果

  • 对于TIMESTAMP类型,插入的NULL会被当前日期时间所替代

    前提:

    explicit_defaults_for_timestamp=OFF【默认】时,TIMESTAMP字段默值认为NOT NULL DEFAULT on update CURRENT_TIMESTAMP

    explicit_defaults_for_timestamp=ON时,TIMESTAMP类型默认为NULL DEFAULT NULL

  • 对于DATE和DATETIME类型

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    katyusha@127.0.0.1:3306:test 15:11:04>desc t;
    +-------+----------+------+-----+---------+-------+
    | Field | Type | Null | Key | Default | Extra |
    +-------+----------+------+-----+---------+-------+
    | col | datetime | NO | | NULL | |
    | col2 | date | NO | | NULL | |
    +-------+----------+------+-----+---------+-------+
    katyusha@127.0.0.1:3306:test 15:11:09>select * from t where col is null;
    +---------------------+------------+
    | col | col2 |
    +---------------------+------------+
    | 0000-00-00 00:00:00 | 0000-00-00 |
    +---------------------+------------+
    katyusha@127.0.0.1:3306:test 15:11:18>select * from t where col2 is null;
    +---------------------+------------+
    | col | col2 |
    +---------------------+------------+
    | 0000-00-00 00:00:00 | 0000-00-00 |
    +---------------------+------------+

    使用IS NULL可以得到0000-00-00 00:00:00的值

函数相关

  • 连接函数返回NULL

    1
    2
    3
    4
    5
    6
    mysql> SELECT CONCAT('My', NULL, 'QL');
    +--------------------------+
    | CONCAT('My', NULL, 'QL') |
    +--------------------------+
    | NULL |
    +--------------------------+

    引申:那么对于NULL字符串的连接,可以使用CONCAT_WS()来替代,此函数会跳过所有的NULL值

    1
    2
    3
    4
    5
    6
    SELECT CONCAT_WS(',','First name',NULL,'Last Name');
    +----------------------------------------------+
    | CONCAT_WS(',','First name',NULL,'Last Name') |
    +----------------------------------------------+
    | First name,Last Name |
    +----------------------------------------------+
  • 使用聚合函数COUNT(), MIN(), SUM()时,NULL被忽略

  • COUNT(*)包含NULL值

    也就是COUNT(*)和COUNT(column)结果不同

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    katyusha@127.0.0.1:3306:test 16:51:04>select * from t;
    +------+
    | col |
    +------+
    | 1 |
    | NULL |
    | NULL |
    +------+
    katyusha@127.0.0.1:3306:test 16:51:21>select count(*), count(col) from t;
    +----------+------------+
    | count(*) | count(col) |
    +----------+------------+
    | 3 | 1 |
    +----------+------------+

索引相关

  • 只有MyISAM、InnoDB、Memory存储引擎支持在字段上使用NULL和建立索引
  • 使用DISTINCT, GROUP BY, ORDER BY时,所有NULL被视作是等同的值

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    katyusha@127.0.0.1:3306:test 17:06:08>select * from t;
    +------+
    | col |
    +------+
    | NULL |
    | NULL |
    | 1 |
    | 3 |
    +------+
    katyusha@127.0.0.1:3306:test 17:07:44>select count(*), count(col) from t1 group by col;
    +----------+-----------+
    | count(*) | count(col) |
    +----------+-----------+
    | 2 | 0 |
    | 1 | 1 |
    | 1 | 1 |
    +----------+-----------+
  • 使用ORDER BY时,NULL处于首位;加上DESC,NULL处于末位

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    katyusha@127.0.0.1:3306:test 16:33:04>select * from t order by col;
    +------+
    | col |
    +------+
    | NULL |
    | 1 |
    +------+
    katyusha@127.0.0.1:3306:test 16:33:10>select * from t order by col desc;
    +------+
    | col |
    +------+
    | 1 |
    | NULL |
    +------+
  • 对于LEFT JOIN右侧的表,条件字段即使有索引,也不会被使用到

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    katyusha@127.0.0.1:3306:test 15:56:46>explain select * from t1 left join t2 on t1.id = t2.id where col='b';
    +----+-------------+-------+------+----------------+---------+---------+------------+------+------------------------------------+
    | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
    +----+-------------+-------+------+----------------+---------+---------+------------+------+------------------------------------+
    | 1 | SIMPLE | t1 | ref | idx_id | idx_id | 5 | test.t2.id | 1 | Using index |
    | 1 | SIMPLE | t2 | ref | idx_id,idx_col | idx_col | 32 | const | 1 | Using index condition; Using where |
    +----+-------------+-------+------+----------------+---------+---------+------------+------+------------------------------------+
    katyusha@127.0.0.1:3306:test 15:56:39>explain select * from t1 left join t2 on t1.id = t2.id where col is null;
    +----+-------------+-------+-------+---------------+--------+---------+------+------+----------------------------------------------------------------+
    | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
    +----+-------------+-------+-------+---------------+--------+---------+------+------+----------------------------------------------------------------+
    | 1 | SIMPLE | t1 | index | NULL | idx_id | 5 | NULL | 3 | Using index |
    | 1 | SIMPLE | t2 | ALL | idx_id | NULL | NULL | NULL | 2 | Using where; Not exists; Using join buffer (Block Nested Loop) |
    +----+-------------+-------+-------+---------------+--------+---------+------+------+----------------------------------------------------------------+
  • 对于混合条件

    1
    key_column=expr OR key_column IS NULL

    在子查询中比较常见,当MySQL进行了优化,EXPLAIN结果中的type是ref_or_null
    ref_or_null的工作原理:先从key中读取数据,然后对NULL值的行进行额外的搜索

  • 如果条件语句中包含多个key_column IS NULL,MySQL只优化一个

    1
    2
    3
    SELECT * FROM t1, t2
    WHERE (t1.a=t2.a AND t2.a IS NULL)
    OR (t1.b=t2.b AND t2.b IS NULL);

    对于上面语句,MySQL只会对(t1.a=t2.a AND t2.a IS NULL)进行优化,(t1.b=t2.b AND t2.b IS NULL)不会被使用到索引
    所以,不能保证MySQL会想你想象的那样工作

  • 如果查询结果集的长度超出了max_allowed_packet配置,字符串函数的返回值是NULL

“坑”相关

上面哪个没用好都是坑,不过上面的内容官方文档都有说明,下面这个属于开发自己踩到
  • NOT IN

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    katyusha@127.0.0.1:3306:test 18:17:33>select * from t1;
    +------+------+
    | a | b |
    +------+------+
    | 1 | 1 |
    | 2 | NULL |
    | NULL | 3 |
    | 2 | 1 |
    | 2 | NULL |
    | NULL | 3 |
    +------+------+
    katyusha@127.0.0.1:3306:test 18:19:03>select * from t2;
    +------+------+
    | a | b |
    +------+------+
    | 2 | NULL |
    | NULL | 5 |
    +------+------+
    katyusha@127.0.0.1:3306:test 18:17:23>select * from t1 where a not in (1, null);
    Empty set (0.00 sec)
    katyusha@127.0.0.1:3306:test 18:15:52>select * from t1 where a not in (select a from t2);
    Empty set (0.01 sec)

    单表或联表查询,WHERE条件NOT IN中如果包含NULL值,无法返回正确的结果集

运维

不同的存储引擎,存储方式不同
MyISAM每行头有一个位域用来存储每个字段的NULL编码情况,一个NULL的字段依然占据空间,所以NULL并不会减少/节省空间

InnoDB每行头存储每个字段有一个“域 开始 偏移量”,为每个字段分配1-2个字节。字段为NULL的“域 开始 偏移量”的高位为“ON”。然而这种情况,该字段应该不需要存储任何内容。所以,如果你使用了大量的NULL,你的存储会有明显的降低。

下面针对MyISAM和InnoDB,分别创建了2张表,一种是带主键ID的,一种是不带主键ID的

无索引的容量对比

带主键ID

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
CREATE TABLE `t_myisam_null` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`col1` varchar(100) DEFAULT NULL,
`col2` varchar(100) DEFAULT NULL,
`col3` varchar(100) DEFAULT NULL,
`col4` varchar(100) DEFAULT NULL,
`col5` varchar(100) DEFAULT NULL,
`col6` varchar(100) DEFAULT NULL,
`col7` varchar(100) DEFAULT NULL,
`col8` varchar(100) DEFAULT NULL,
`col9` varchar(100) DEFAULT NULL,
`col10` varchar(100) DEFAULT NULL,
`col11` varchar(100) DEFAULT NULL,
`col12` varchar(100) DEFAULT NULL,
`col13` varchar(100) DEFAULT NULL,
`col14` varchar(100) DEFAULT NULL,
`col15` varchar(100) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8
CREATE TABLE `t_innodb_null` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`col1` varchar(100) DEFAULT NULL,
`col2` varchar(100) DEFAULT NULL,
`col3` varchar(100) DEFAULT NULL,
`col4` varchar(100) DEFAULT NULL,
`col5` varchar(100) DEFAULT NULL,
`col6` varchar(100) DEFAULT NULL,
`col7` varchar(100) DEFAULT NULL,
`col8` varchar(100) DEFAULT NULL,
`col9` varchar(100) DEFAULT NULL,
`col10` varchar(100) DEFAULT NULL,
`col11` varchar(100) DEFAULT NULL,
`col12` varchar(100) DEFAULT NULL,
`col13` varchar(100) DEFAULT NULL,
`col14` varchar(100) DEFAULT NULL,
`col15` varchar(100) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

不带主键ID

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
CREATE TABLE `t_myisam_null_noid` (
`col1` varchar(100) DEFAULT NULL,
`col2` varchar(100) DEFAULT NULL,
`col3` varchar(100) DEFAULT NULL,
`col4` varchar(100) DEFAULT NULL,
`col5` varchar(100) DEFAULT NULL,
`col6` varchar(100) DEFAULT NULL,
`col7` varchar(100) DEFAULT NULL,
`col8` varchar(100) DEFAULT NULL,
`col9` varchar(100) DEFAULT NULL,
`col10` varchar(100) DEFAULT NULL,
`col11` varchar(100) DEFAULT NULL,
`col12` varchar(100) DEFAULT NULL,
`col13` varchar(100) DEFAULT NULL,
`col14` varchar(100) DEFAULT NULL,
`col15` varchar(100) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8
CREATE TABLE `t_innodb_null_noid` (
`col1` varchar(100) DEFAULT NULL,
`col2` varchar(100) DEFAULT NULL,
`col3` varchar(100) DEFAULT NULL,
`col4` varchar(100) DEFAULT NULL,
`col5` varchar(100) DEFAULT NULL,
`col6` varchar(100) DEFAULT NULL,
`col7` varchar(100) DEFAULT NULL,
`col8` varchar(100) DEFAULT NULL,
`col9` varchar(100) DEFAULT NULL,
`col10` varchar(100) DEFAULT NULL,
`col11` varchar(100) DEFAULT NULL,
`col12` varchar(100) DEFAULT NULL,
`col13` varchar(100) DEFAULT NULL,
`col14` varchar(100) DEFAULT NULL,
`col15` varchar(100) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8

写个脚本造数据,分别INSERT 1千万条记录,除了主键自增ID,所有字段均为NULL

带主键ID

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
katyusha@127.0.0.1:3306:test 14:24:29>select * from t_myisam_null limit 1;
+----+------+------+------+------+------+------+------+------+------+-------+-------+-------+-------+-------+-------+
| id | col1 | col2 | col3 | col4 | col5 | col6 | col7 | col8 | col9 | col10 | col11 | col12 | col13 | col14 | col15 |
+----+------+------+------+------+------+------+------+------+------+-------+-------+-------+-------+-------+-------+
| 1 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
+----+------+------+------+------+------+------+------+------+------+-------+-------+-------+-------+-------+-------+
katyusha@127.0.0.1:3306:test 14:20:21>select count(*) from t_myisam_null;
+----------+
| count(*) |
+----------+
| 10000001 |
+----------+
katyusha@127.0.0.1:3306:test 14:23:28>select count(*) from t_innodb_null;
+----------+
| count(*) |
+----------+
| 10000001 |
+----------+

不带主键ID

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
katyusha@127.0.0.1:3306:test 14:29:05>select * from t_myisam_null_noid limit 1;
+------+------+------+------+------+------+------+------+------+-------+-------+-------+-------+-------+-------+
| col1 | col2 | col3 | col4 | col5 | col6 | col7 | col8 | col9 | col10 | col11 | col12 | col13 | col14 | col15 |
+------+------+------+------+------+------+------+------+------+-------+-------+-------+-------+-------+-------+
| NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
+------+------+------+------+------+------+------+------+------+-------+-------+-------+-------+-------+-------+
katyusha@127.0.0.1:3306:test 14:23:24>select count(*) from t_myisam_null_noid;
+----------+
| count(*) |
+----------+
| 10000001 |
+----------+
katyusha@127.0.0.1:3306:test 14:23:41>select count(*) from t_innodb_null_noid;
+----------+
| count(*) |
+----------+
| 10000001 |
+----------+

我们来看下数据文件大小
这里的主键ID,我使用了bigint,占8 bytes
没有主键的表,系统会创建主键,占6 bytes

bigint

表数据文件大小

1
2
3
4
5
-rw-rw---- 1 mysql mysql 306M 8月 15 22:19 t_myisam_null.MYD
-rw-rw---- 1 mysql mysql 308M 8月 15 22:19 t_innodb_null.ibd
-rw-rw---- 1 mysql mysql 191M 8月 15 22:30 t_myisam_null_noid.MYD
-rw-rw---- 1 mysql mysql 284M 8月 15 22:30 t_innodb_null_noid.ibd

我们可以看到是否有主键,对于2个存储引擎来说差别还是挺大的,这个这里暂且忽略

从数据文件的大小可以来证明片头的阐述,不管什么存储引擎,VARCHAR类型的NULL值虽然不占空间,但NULL约束的字段的标识值是占空间的

下面我们来看看,有实际数据存储的表是什么样子

这里我们只测试InnoDB带主键ID,且所有字段改为NOT NULL DEFAULT ‘’和上面的区别

1
2
3
4
5
-rw-rw---- 1 mysql mysql 308M 8月 15 22:19 t_innodb_null.ibd
对比下
-rw-rw---- 1 mysql mysql 440M 8月 16 15:20 t_innodb_notnull.ibd

因为NOT NULL DEFAULT ‘’实际存储空字符,所以比NULL DEFAULT NULL的数据文件大一些,而NULL DEFAULT NULL的数据,实际上是行头数据

varcahr

varchar(M)是否与空间有关系

1
-rw-rw---- 1 mysql mysql 440M 8月 16 16:12 t_innodb_notnull_varchar10.ibd

对于varchar(10)和varchar(100),同样存储aaa字符串,文件大小是一样的。varchar(M)中的M除了控制字符串长度,也会影响该字段申请内存的大小,M多大就申请多大内存

我们将2张表改为DEFAULT ‘aaa’

1
2
-rw-rw---- 1 mysql mysql 932M 8月 16 17:26 t_innodb_null_aaa.ibd
-rw-rw---- 1 mysql mysql 912M 8月 16 17:26 t_innodb_notnull_aaa.ibd

NULL DEFAULT ‘aaa’比NOT NULL DEFAULT ‘aaa’的数据文件大一些

PRIMARY KEY ENGINE CONSTRAINT DATA LENGTH (byte) FILE SIZE (byte)
NONE MyISAM NULL DEFAULT NULL 320000032 320000032
NONE InnoDB NULL DEFAULT NULL 279805952 322961408
ID MyISAM NULL DEFAULT NULL 200000020 200000020
ID InnoDB NULL DEFAULT NULL 286081024 297795584
ID InnoDB NULL DEFAULT ‘aaa’ 878706688 977272832
ID InnoDB NOT NULL DEFAULT ‘aaa’ 909115392 956301312

有索引的容量对比

表结构同上,InnoDB 存储引擎,为col10增加非聚集索引

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
CREATE TABLE `t_innodb_null_key` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`col1` varchar(100) NULL DEFAULT NULL,
`col2` varchar(100) NULL DEFAULT NULL,
`col3` varchar(100) NULL DEFAULT NULL,
`col4` varchar(100) NULL DEFAULT NULL,
`col5` varchar(100) NULL DEFAULT NULL,
`col6` varchar(100) NULL DEFAULT NULL,
`col7` varchar(100) NULL DEFAULT NULL,
`col8` varchar(100) NULL DEFAULT NULL,
`col9` varchar(100) NULL DEFAULT NULL,
`col10` varchar(100) NULL DEFAULT NULL,
`col11` varchar(100) NULL DEFAULT NULL,
`col12` varchar(100) NULL DEFAULT NULL,
`col13` varchar(100) NULL DEFAULT NULL,
`col14` varchar(100) NULL DEFAULT NULL,
`col15` varchar(100) NULL DEFAULT NULL,
PRIMARY KEY (`id`),
KEY idx_col10(col10)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
CREATE TABLE `t_innodb_notnull_key` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`col1` varchar(100) NOT NULL DEFAULT '',
`col2` varchar(100) NOT NULL DEFAULT '',
`col3` varchar(100) NOT NULL DEFAULT '',
`col4` varchar(100) NOT NULL DEFAULT '',
`col5` varchar(100) NOT NULL DEFAULT '',
`col6` varchar(100) NOT NULL DEFAULT '',
`col7` varchar(100) NOT NULL DEFAULT '',
`col8` varchar(100) NOT NULL DEFAULT '',
`col9` varchar(100) NOT NULL DEFAULT '',
`col10` varchar(100) NOT NULL DEFAULT '',
`col11` varchar(100) NOT NULL DEFAULT '',
`col12` varchar(100) NOT NULL DEFAULT '',
`col13` varchar(100) NOT NULL DEFAULT '',
`col14` varchar(100) NOT NULL DEFAULT '',
`col15` varchar(100) NOT NULL DEFAULT '',
PRIMARY KEY (`id`),
KEY idx_col10(col10)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1
2
-rw-rw---- 1 mysql mysql 448M 8月 17 15:55 t_innodb_null_key.ibd
-rw-rw---- 1 mysql mysql 584M 8月 17 15:55 t_innodb_notnull_key.ibd

同上,NOT NULL略大一些

增加实际默认值
varchar(100) NULL DEFAULT ‘a’
varchar(100) NOT NULL DEFAULT ‘a’

1
2
-rw-rw---- 1 mysql mysql 780M 8月 17 16:15 t_innodb_null_key_a.ibd
-rw-rw---- 1 mysql mysql 752M 8月 17 16:15 t_innodb_notnull_key_a.ibd

由于MySQL为会为NULL的字段记录标识状态,当写入实际数据的时候,就体现出含NULL类型字段的表文件更大一些,那么占用磁盘空间也就更多些

索引利用情况

下面查询WHERE IS NOT NULL,即表中不存在的数据

varchar(100) NULL DEFAULT NULL
varchar(100) NOT NULL DEFAULT ‘’

1
2
3
4
5
6
7
8
9
10
11
12
13
14
katyusha@127.0.0.1:3306:test 16:51:09>explain select * from t_innodb_null_key where col10 is not null;
+----+-------------+-------------------+-------+---------------+-----------+---------+------+------+-----------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------------+-------+---------------+-----------+---------+------+------+-----------------------+
| 1 | SIMPLE | t_innodb_null_key | range | idx_col10 | idx_col10 | 303 | NULL | 1 | Using index condition |
+----+-------------+-------------------+-------+---------------+-----------+---------+------+------+-----------------------+
katyusha@127.0.0.1:3306:test 16:57:32>explain select * from t_innodb_notnull_key where col10 <> '';
+----+-------------+----------------------+-------+---------------+-----------+---------+------+------+-----------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------------+-------+---------------+-----------+---------+------+------+-----------------------+
| 1 | SIMPLE | t_innodb_notnull_key | range | idx_col10 | idx_col10 | 302 | NULL | 2 | Using index condition |
+----+-------------+----------------------+-------+---------------+-----------+---------+------+------+-----------------------+

varchar(100) NULL DEFAULT ‘a’
varchar(100) NOT NULL DEFAULT ‘a’

1
2
3
4
5
6
7
8
9
10
11
12
13
14
katyusha@127.0.0.1:3306:test 17:07:40>explain select * from t_innodb_null_key_a where col10 = 'a';
+----+-------------+---------------------+------+---------------+-----------+---------+-------+---------+-----------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------------+------+---------------+-----------+---------+-------+---------+-----------------------+
| 1 | SIMPLE | t_innodb_null_key_a | ref | idx_col10 | idx_col10 | 303 | const | 4478865 | Using index condition |
+----+-------------+---------------------+------+---------------+-----------+---------+-------+---------+-----------------------+
katyusha@127.0.0.1:3306:test 17:07:50>explain select * from t_innodb_notnull_key_a where col10 = 'a';
+----+-------------+------------------------+------+---------------+-----------+---------+-------+---------+-----------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------------------+------+---------------+-----------+---------+-------+---------+-----------------------+
| 1 | SIMPLE | t_innodb_notnull_key_a | ref | idx_col10 | idx_col10 | 302 | const | 4513315 | Using index condition |
+----+-------------+------------------------+------+---------------+-----------+---------+-------+---------+-----------------------+

NULL DEFAUL NULL多出一个标识位空间
文件头有3个字节,NULL标识符占1字节,变长类型占1字节,记录字段长度占1字节
对于UTF8编码,对于VARCHAR(M)类型字段,NULL的key_length=m*3+3,NOT NULL的key_length=m*3+3

ENGINE CONSTRAINT KEY LENGTH (byte) INDEX LENGTH (byte) DATA LENGTH (byte) FILE SIZE (byte)
InnoDB NULL DEFAULT NULL 303 141230080 295567360 469762048
InnoDB NOT NULL DEFAULT ‘’ 302 141230080 431980544 612368384
InnoDB NULL DEFAULT ‘a’ 303 153845760 583008256 817889280
InnoDB NOT NULL DEFAULT ‘a’ 302 146489344 568328192 788529152

结论,如果为一个拥有NULL约束的字段添加索引,会增加数据文件的尺寸;以此类推,这样的索引字段越多就会占用更多的磁盘空间

受系统配置影响的

在上面的注意事项中,我们发现有几个问题是依赖于系统环境变量的,这部分内容与DBA相关,所以再次拿出来列举一下
sql_auto_is_null
我们的设置:sql_auto_is_null=OFF【默认】

explicit_defaults_for_timestamp
我们的设置:explicit_defaults_for_timestamp=ON

总结

我们从开发和运维两个层面分别展示了下NULL,能看到这里的都是英雄

NULL有它便利的地方,但是它给我们带来的麻烦也不少。因此,为了让开发可以更专注在业务逻辑上,让DBA可以优化数据库的性能上

请DBA将NULL to NOT NULL补充到SQL开发规范中
请开发在创建表结构的时候尽可能将字段设置为NOT NULL

PS:pt-osc也提供了选项可以优化NULL to NOT NULL