Lesson 3 of the SQL Optimization Course: Column Not Null & Auto Increment | SQLFlash

Lesson 3 of the SQL Optimization Course: Column Not Null & Auto Increment

Lesson 3 of the SQL Optimization Course: Column Not Null & Auto Increment

Lesson 3 of the SQL Optimization Course: Column Not Null & Auto - Increment

Introduction

For relational databases, the design of tables and SQL is written are particularly crucial. It wouldn’t be an exaggeration to say that they account for 90% of performance. So this time, specifically targeting these two major knowledge areas, we’ll conduct a detailed analysis for you, peeling back the layers.

This Series uses plain and understandable language and selects a large number of examples to elaborate on the subtleties for you.

๐Ÿง‘โ€๐Ÿ’ป Target audience:

  • DBA
  • Database developers
  • Students

We will use MySQL as the demonstration database.

Lesson 3 Column Non - Null & Auto - Increment

In MySQL, fields come with a variety of attributes, each having a varying impact on performance. Therefore, it’s necessary to explore these attributes in depth.

1. NULL / NOT NULL

For external programs, NULL represents an unknown, indefinite, or inexpressible value. Many companies’ development specifications clearly state that fields must be set to NOT NULL. In fact, scenarios that use NULL can be transformed into meaningful characters or values. Firstly, it enhances data readability and long - term maintainability. Secondly, it simplifies the writing of SQL statements. The characteristics of NULL are as follows:

1.1 Concatenation with NULL fields results in NULL

When you expect to concatenate a field r1 and insert it into a new table t3, but the records in table t3 are all NULL, which is inconsistent with the expectation.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
mysql> show create table t1\G
*************************** 1. row ***************************
      Table: t1
Create Table: CREATE TABLE `t1` ( `r1` varchar(10) DEFAULT NULL) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.00 sec)

mysql> show create table t2\G
*************************** 1. row ***************************
      Table: t2
Create Table: CREATE TABLE `t2` ( `r1` varchar(10) NOT NULL) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.00 sec)

mysql> create table t3 like t1;
Query OK, 0 rows affected (0.04 sec)

mysql> insert into t3 select concat(r1,'database') from t1 limit 2;
Query OK, 2 rows affected (0.02 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> select * from t3;
+------+
| r1   |
+------+
| NULL |
| NULL |
+------+
2 rows in set (0.00 sec)

The correct approach is to use the IFNULL function to handle NULL values specially.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
mysql> insert into t3 select concat(ifnull(r1,''),'database') from t1 limit 2;
Query OK, 2 rows affected (0.01 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> select * from t3;
+----------+
| r1       |
+----------+
| database |
| database |
+----------+
2 rows in set (0.00 sec)

1.2 Inaccurate COUNT values for columns containing NULL

Although tables t1 and t2 have the same number of records, the field r1 in t1 contains NULL values, which causes these values to be ignored in the result.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
mysql> select count(r1) as rc from t1;
+-------+
| rc    |
+-------+
| 16384 |
+-------+
1 row in set (0.01 sec)

mysql> select count(r1) as rc from t2;
+-------+
| rc    |
+-------+
| 32768 |
+-------+
1 row in set (0.03 sec)

In this case, the correct methods are to use functions related to NULL or directly use COUNT(*).

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
mysql> select count(ifnull(r1,'')) as rc from t1;
+-------+
| rc    |
+-------+
| 32768 |
+-------+
1 row in set (0.03 sec)

mysql> select count(*) as rc from t1;
+-------+
| rc    |
+-------+
| 32768 |
+-------+
1 row in set (0.02 sec)

Of course, not only COUNT, but most functions are not friendly to NULL values, except for those related to NULL. From the above two points, we can see that handling NULL values requires special treatment, which adds extra difficulty to writing SQL statements.

1.3 Index columns containing NULL

When creating an index on a column containing NULL, it takes one more BIT to store compared to a column without NULL.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
mysql> pager grep -i 'key_len'
PAGER set to 'grep -i 'key_len''

mysql> explain select * from t1 where r1 = ''\G
     key_len: 43
1 row in set, 1 warning (0.00 sec)

mysql> explain select * from t2 where r1 = ''\G
     key_len: 42
1 row in set, 1 warning (0.00 sec)

The key_len values are 43 and 42 respectively. Table t1 uses one more byte than t2. You may wonder why it’s one byte instead of one BIT. You can refer to my previous article (Issue 02: The Art of MySQL Data Types - Large Object Fields) for a detailed description of BIT.

1.4 Handling of NULL by different storage engines

After the release of MySQL 8.0, only InnoDB, MyISAM, and Memory support indexing columns containing NULL, while other engines, such as NDB, do not.

2. AUTO_INCREMENT

The auto - increment attribute of a column is generally used to set an integer column to increase gradually by a certain step, similar to sequences in other databases. However, this “sequence” is based on a specific table. The relevant characteristics of the auto - increment attribute are as follows:

2.1 Variable controlling the performance of the auto - increment attribute: innodb_autoinc_lock_mode

  • innodb_autoinc_lock_mode = 0 represents the traditional mode. That is, when inserting records into a field with the auto - increment attribute, a table - level auto - increment lock is held until the statement execution ends. For example, in the following two statements, when SQL 1 is executing, it holds the table - level auto - increment lock on table f1 all the time. Then, when SQL 2 is executed, the lock times out.
  • innodb_autoinc_lock_mode = 1 represents the consecutive mode, which is similar to the traditional mode. The difference is that for simple insert statements like SQL 2, a lightweight mutex lock (at the thread level, not the transaction level) is held only during the process of allocating a new ID, rather than a table lock that is released until the statement ends.
1
2
3
4
5
6
7
8
-- SQL 1
mysql> insert into f1(c2) select rpad(uuid(),100,uuid()) from t1;
Query OK, 16777216 rows affected (3 min 35.92 sec)
Records: 16777216  Duplicates: 0  Warnings: 0

-- SQL 2
mysql> insert into f1(c2) select 'database';
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
  • innodb_autoinc_lock_mode = 2 represents the interleaved mode. In this mode, the auto - increment table lock is abandoned, and the generated values may not be consecutive. However, it is the mode with the highest performance, and multiple insert statements can be executed concurrently. MySQL 8.0 uses the interleaved mode by default.
1
2
3
4
5
6
7
8
9
-- SQL 1
mysql> insert into f1(c2) select rpad(uuid(),100,uuid()) from t1;
Query OK, 16777216 rows affected (3 min 35.92 sec)
Records: 16777216  Duplicates: 0  Warnings: 0

-- SQL 2
mysql> insert into f1(c2) select 'sql2';
Query OK, 1 row affected (0.02 sec)
Records: 1  Duplicates: 0  Warnings: 0

Regarding replication safety, for the above three modes, modes 0 and 1 are safe at the statement level, which means that the binary logs generated can be replicated to any other machine with consistent data. Mode 2 is not safe at the statement level. All three modes are safe for the row - based binary log format.

2.2 Controlling the step and offset of the auto - increment attribute

This is generally used in master - master replication or multi - source replication architectures to actively avoid primary key conflicts.

  • auto_increment_increment controls the step.
  • auto_increment_offset controls the offset.

2.3 Meeting the requirement of immediately obtaining the inserted value

If you want to obtain the maximum value of a table at any time, you should always execute SQL 3 instead of SQL 2. The last_insert_id() function in SQL 2 only retrieves the ID at the beginning of the previous statement and is only suitable for simple INSERT statements.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
-- SQL 1
mysql> insert into f1(c2) values ('xx1'),('xx2'),('xx3');
Query OK, 3 rows affected (0.01 sec)
Records: 3  Duplicates: 0  Warnings: 0

-- SQL 2
mysql> select last_insert_id() as last_insert_id;
+----------------+
| last_insert_id |
+----------------+
|              1 |
+----------------+
1 row in set (0.00 sec)

-- SQL 3
mysql> select max(c1) as last_insert_id from f1;
+----------------+
| last_insert_id |
+----------------+
|              3 |
+----------------+
1 row in set (0.00 sec)

-- SQL 4
mysql> select * from f1;
+----+------+
| c1 | c2   |
+----+------+
|  1 | xx1  |
|  2 | xx2  |
|  3 | xx3  |
+----+------+
3 rows in set (0.00 sec)

2.4 Overflow of auto - increment columns

If an auto - increment column reaches the maximum value of its data type, an overflow will occur. For example, if you change the auto - increment column of table f1 to tinyint. SQL 2 explicitly inserts the maximum value 127, and SQL 3 will report an error. Therefore, it is recommended to plan the field type of the auto - increment column in advance and understand the upper limit value.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
mysql> drop table f1;
Query OK, 0 rows affected (0.04 sec)

mysql> create table f1(c1 tinyint auto_increment primary key);
Query OK, 0 rows affected (0.05 sec)

-- SQL 1
mysql> insert into f1 values (127);
Query OK, 1 row affected (0.01 sec)

-- SQL 2
mysql> select * from f1;
+-----+
| c1  |
+-----+
| 127 |
+-----+
1 row in set (0.00 sec)

-- SQL 3
mysql> insert into f1 select null;
ERROR 1062 (23000): Duplicate entry '127' for key 'PRIMARY'

2.5 Explicitly inserting signed values into auto - increment columns

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
mysql> insert into f1 values (-10),(-20),(-30);
Query OK, 3 rows affected (0.01 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> select * from f1;
+-----+
| c1  |
+-----+
| -30 |
| -20 |
| -10 |
| 127 |
+-----+
4 rows in set (0.00 sec)

It is recommended to avoid such situations before the request reaches the database.

Summary

This article discusses whether a field should be NOT NULL. By now, it should be clear that fields are best set to NOT NULL. It also covers the auto - increment attribute of fields, including relevant examples of performance and usage. I hope this article is helpful to you.

๐Ÿ‘‹ See you in the next lesson.

The SQL Optimization Course

The SQL optimization knowledge

๐Ÿ‘‹ See you in the next lesson.

What is SQLFlash?

SQLFlash is your AI-powered SQL Optimization Partner.

Based on AI models, we accurately identify SQL performance bottlenecks and optimize query performance, freeing you from the cumbersome SQL tuning process so you can fully focus on developing and implementing business logic.

How to use SQLFlash in a database?

Ready to elevate your SQL performance?

Join us and experience the power of SQLFlash today!.