Lesson 5 of the SQL Optimization Course: Can foreign keys be used? | SQLFlash

Lesson 5 of the SQL Optimization Course: Can foreign keys be used?

Lesson 5 of the SQL Optimization Course: Can foreign keys be used?

Lesson 5 of the SQL Optimization Course: Can foreign keys be used?

Introduction

For relational databases, the design of tables and SQL is written are particularly crucial. It wouldn’t be an exaggeration to say that they account for 90% of performance. So this time, specifically targeting these two major knowledge areas, we’ll conduct a detailed analysis for you, peeling back the layers.

This Series uses plain and understandable language and selects a large number of examples to elaborate on the subtleties for you.

๐Ÿง‘โ€๐Ÿ’ป Target audience:

  • DBA
  • Database developers
  • Students

We will use MySQL as the demonstration database.

Lesson 5 Can foreign keys be used?

The original design purpose of foreign keys is to ensure the consistency and integrity of logically related table data operations at the database level.

In most enterprise development specifications, foreign keys are directly avoided! Foreign keys have both advantages and disadvantages, and it doesn’t mean they are not applicable in every scenario. There’s no need for a one - size - fits - all approach. So, can foreign keys be used? The following will provide answers based on different scenarios.

1. Advantages and Disadvantages of Foreign Keys

Advantages

  • Data Streamlining: Reduce data redundancy and avoid additional maintenance operations for handling a large amount of redundant data in the future.
  • Code Simplification: Lower the complexity of application code and reduce extra exception handling. Data management is entirely handled by the database.
  • Enhanced Readability: Improve the readability of documentation, especially when designing tables and drawing ER diagrams, with simple and clear logic.

Disadvantages

  • Performance Pressure: Foreign keys usually have cascade functions such as cascade updates and deletes. In scenarios with massive data, this can cause significant performance pressure. For example, when inserting a new record, if the table has 10 foreign keys, it is necessary to check the reasonableness of the inserted record in each of the 10 related tables, delaying the normal insertion time. Also, updates to the parent table will lock the related child tables.
  • Limited Flexibility: Other functions, such as table structure updates, may have poor flexibility.

2. Usage of Foreign Keys

List of Foreign Key Referential Actions

  • CASCADE: Cascade. The child table follows the parent table to update the foreign key value.
  • SET NULL: The child table updates the foreign key value to NULL as the parent table is updated.
  • RESTRICT/NO ACTION: Default. Restrict changes to the foreign key value in the parent table.
  • SET DEFAULT: Currently has the same effect as RESTRICT.

Let’s first take a simple look at the usage of foreign keys in MySQL. Only the InnoDB and NDB engines in MySQL support foreign keys. Here, we only focus on InnoDB. The MySQL version in this example is the latest, 8.0.19.

Example: Here, f1 is the parent table, and f2, f3, f6 represent different types of foreign key tables, i.e., child tables.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
-- Reference the base table, i.e., the parent table
mysql-(ytt_fk/3305)->create table f1(id int primary key,
    r1 int, r2 int, r3 int,
    key idx_r1(r1),
    key idx_u1 (r2,r3));
Query OK, 0 rows affected (0.02 sec)

-- Cascade update the foreign key table as the reference table is updated. That is, when the parent table is updated, the foreign key in the child table will be cascaded and updated.
mysql-(ytt_fk/3305)->create table f2(id int primary key,
    f1_r1 int, mark int,
    constraint f1_fk_r1 foreign key (f1_r1) references f1(r1) on update cascade);
Query OK, 0 rows affected (0.02 sec)

-- Update the foreign key value in the child table to NULL as the reference table is updated. That is, when the parent table is updated, the foreign key in the child table will be cascaded and updated to NULL.
mysql-(ytt_fk/3305)->create table f3 (id int primary key,
    f1_id int,
    foreign key (f1_id) references f1(id) on update set null);
Query OK, 0 rows affected (0.02 sec)

-- Multiple - key foreign keys. The child table can reference non - primary keys in the parent table.
mysql-(ytt_fk/3305)->create table f6 ( id int auto_increment primary key,
    f1_r2 int, f1_r3 int,
    foreign key (f1_r2,f1_r3) references f1(r2,r3));
Query OK, 0 rows affected (0.02 sec)

Scenario 1: Strong Requirement for Data Consistency, Weak Program, Strong Database, Small Table Structure Changes, and Low Concurrency

Verify tables f2 and f6 with a single record. From a functional perspective, the advantages of foreign keys are obvious, and the database fully meets the data integrity check.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
mysql-(ytt_fk/3305)->insert into f1 values (1,10,100,1000);
Query OK, 1 row affected (0.00 sec)

mysql-(ytt_fk/3305)->insert into f2 values (1,1);
Query OK, 1 row affected (0.01 sec)

mysql-(ytt_fk/3305)->insert into f6 values (1,100,1000);
Query OK, 1 row affected (0.00 sec)

-- Update the reference table f1
mysql-(ytt_fk/3305)->update f1 set id = 2 where id =1;
Query OK, 1 row affected (0.01 sec)
Rows matched: 1  Changed: 1  Warnings: 0

-- Table f2 is successfully cascaded and updated.
mysql-(ytt_fk/3305)->select * from f2;
+----+-------+
| id | f1_id |
+----+-------+
|  1 |     2 |
+----+-------+
1 row in set (0.00 sec)

-- The r2 field in the reference table cannot be updated because table f6 has a foreign key constraint on the r2 field.
mysql-(ytt_fk/3305)->update f1 set r2 = 11 ;
ERROR 1451 (23000): Cannot delete or update a parent row: a foreign key constraint fails (`ytt_fk`.`f6`, CONSTRAINT `f6_ibfk_1` FOREIGN KEY (`f1_r2`, `f1_r3`) REFERENCES `f1` (`r2`, `r3`))

Scenario 2: Frequent Data Loading but Strict Requirement for Database - Level Data Consistency

Here, only verify table f6, and clone a new table f6_no_fk with the same table structure as f6 except without foreign keys. Import 4 million sample records.

1
2
3
4
5
6
7
8
9
-- Import into f6 with foreign keys, taking more than 32 seconds.
mysql-(ytt_fk/3305)->load data infile '/var/lib/mysql-files/f1_sub.dat' into table f6;
Query OK, 4000000 rows affected (32.57 sec)
Records: 4000000  Deleted: 0  Skipped: 0  Warnings: 0

-- Import into f6_no_fk without foreign keys, taking more than 25 seconds.
mysql-(ytt_fk/3305)->load data infile '/var/lib/mysql-files/f1_sub.dat' into table f6_no_fk;
Query OK, 4000000 rows affected (25.95 sec)
Records: 4000000  Deleted: 0  Skipped: 0  Warnings: 0

From the above, in a single test of importing 4 million records, the table with foreign keys has no time advantage over the table without foreign keys. Optimize the above scenario by turning off the foreign key check parameter, and then turn it on after the import is completed.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
mysql-(ytt_fk/3305)->truncate f6;
Query OK, 0 rows affected (0.04 sec)

-- Turn off the foreign key check.
mysql-(ytt_fk/3305)->set foreign_key_checks=0;
Query OK, 0 rows affected (0.00 sec)

-- Re - import, taking more than 28 seconds.
mysql-(ytt_fk/3305)->load data infile '/var/lib/mysql-files/f1_sub.dat' into table f6;
Query OK, 4000000 rows affected (28.42 sec)
Records: 4000000  Deleted: 0  Skipped: 0  Warnings: 0

-- Turn on the foreign key check.
mysql-(ytt_fk/3305)->set foreign_key_checks=1;
Query OK, 0 rows affected (0.00 sec)

From the above results, after turning off the foreign key check, the import time is similar to that of the table f6_no_fk without foreign keys.

Scenario 3: Low Concurrency and Simple Transaction Blocks

Next, look at a simple transaction block submission method. I simply wrote a stored procedure that commits every 500 records.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
DELIMITER $$
CREATE DEFINER=`ytt`@`127.0.0.1` PROCEDURE `sp_generate_data`(IN `tb_name` VARCHAR(64), IN `f_number` INT)
begin
    declare i int default 0;
    set @@autocommit=0;
    while i < f_number DO
        set @stmt = concat("insert into ",tb_name,"(f1_r2,f1_r3) values (ceil(rand()*10),ceil(rand()*10))");
        prepare s1 from @stmt;
        execute s1;
        set i = i + 1;
        if mod(i,500)=0 THEN
            commit;
        end if;
    end while;
    drop prepare s1;
    commit;
    set @@autocommit=1;
end$$
DELIMITER ;

Next, insert 1 million records.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
-- The total writing time for the foreign key table is 1 minute and 14 seconds.
mysql> call sp_generate_data('f6',1000000);
Query OK, 0 rows affected (1 min 14.14 sec)

-- The writing time for the non - foreign key table is 1 minute and 8 seconds.
mysql> call sp_generate_data('f6_no_fk',1000000);
Query OK, 0 rows affected (1 min 8.45 sec)

-- Turn off the foreign key check.
mysql> set foreign_key_checks=0;
Query OK, 0 rows affected (0.00 sec)

-- The time is 1 minute and 4 seconds.
mysql> call sp_generate_data('f6',1000000);
Query OK, 0 rows affected (1 min 4.28 sec)

mysql> set foreign_key_checks=1;
Query OK, 0 rows affected (0.00 sec)

From the test results, in this scenario, the retrieval time between tables with and without foreign keys is not significantly different.

Scenario 4: Need to Expand the Data Type of the Foreign Key Reference Field in the Parent Table, and the Original Data Overflows

For example, if the data type defined for field r2 is no longer appropriate and needs to be changed to a larger one, directly modifying it will result in an error.

1
2
3
4
5
mysql-(ytt_fk/3305)->alter table f1 change r2 r2 bigint;
ERROR 3780 (HY000): Referencing column 'f1_r2' and referenced column 'r2' in foreign key constraint 'f6_ibfk_1' are incompatible.

mysql-(ytt_fk/3305)->alter table f6 change f1_r2 f1_r2 bigint;
ERROR 3780 (HY000): Referencing column 'f1_r2' and referenced column 'r2' in foreign key constraint 'f6_ibfk_1' are incompatible.

So, how to make the change? First, delete the foreign key, modify the data type, and then add the constraint back. This scenario is not suitable for using foreign keys.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
mysql-(ytt_fk/3305)->alter table f6 drop constraint f6_ibfk_1;
Query OK, 0 rows affected (0.00 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql-(ytt_fk/3305)->alter table f6 change f1_r2 f1_r2 bigint;
Query OK, 0 rows affected (0.04 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql-(ytt_fk/3305)->alter table f1 change r2 r2 bigint;
Query OK, 100000 rows affected (0.73 sec)
Records: 100000  Duplicates: 0  Warnings: 0

mysql-(ytt_fk/3305)->alter table f6 add foreign key (f1_r2,f1_r3) references f1(r2,r3);
Query OK, 0 rows affected (0.03 sec)
Records: 0  Duplicates: 0  Warnings: 0

Scenario 5: The Child Table Needs a Trigger to Update Necessary Fields

In this case, the trigger in the child table will not be cascaded and applied as the parent table is updated, that is, the trigger fails at this time. For example, add a before update trigger to table f2.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
-- Pre - update trigger
CREATE TRIGGER `tr_af_update` BEFORE UPDATE ON `f2` FOR EACH ROW set new.mark = new.f1_r1;

mysql-(ytt_fk/3305)->insert into f2 values (1,10,5);
Query OK, 1 row affected (0.00 sec)

mysql-(ytt_fk/3305)->select * from f2;
+----+-------+------+
| id | f1_r1 | mark |
+----+-------+------+
|  1 |    10 |    5 |
+----+-------+------+
1 row in set (0.00 sec)

-- Update the parent table.
mysql-(ytt_fk/3305)->update f1 set r1 = 2 where r1 = 10;
Query OK, 5133 rows affected (0.15 sec)
Rows matched: 5133  Changed: 5133  Warnings: 0

-- The child table f2 is cascaded and changed, but the trigger action is not executed.
mysql-(ytt_fk/3305)->select * from f2;
+----+-------+------+
| id | f1_r1 | mark |
+----+-------+------+
|  1 |     2 |    5 |
+----+-------+------+
1 row in set (0.00 sec)

-- The normal operation should be like this.
mysql-(ytt_fk/3305)->update f2 set id = 2;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

-- The mark field is cloned to the value of the f1_r1 field.
mysql-(ytt_fk/3305)->select * from f2;
+----+-------+------+
| id | f1_r1 | mark |
+----+-------+------+
|  2 |     2 |    2 |
+----+-------+------+
1 row in set (0.00 sec)

Scenario 6: The Parent Table Is a Partitioned Table and There Is a Need for Foreign Keys

For partitioned tables, currently, child tables cannot use partitioned tables as parent tables for foreign keys.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
mysql-(ytt_fk/3305)->create table f1_partition like f1;
Query OK, 0 rows affected (0.02 sec)

mysql-(ytt_fk/3305)->alter table f1_partition  partition by key() partitions 4;
Query OK, 0 rows affected (0.10 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql-(ytt_fk/3305)->create table f7 ( id int primary key,
    f1_partition_id int,
    foreign key (f1_partition_id) references f1_partition(id));
ERROR 1506 (HY000): Foreign keys are not yet supported in conjunction with partitioning

Scenario 7: High Daily Concurrency

In this scenario, the scope and magnitude of related transaction locks should be minimized. Take a simple example to see the situation of the parent table being updated and the child table being cascaded and locked when there are foreign keys.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
-- SESSION 1
mysql-(ytt_fk/3305)->begin;
Query OK, 0 rows affected (0.00 sec)

mysql-(ytt_fk/3305)->update f1 set r2 = 101 where r2 = 100;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql-(ytt_fk/3305)->select sys.ps_thread_id(connection_id()) as cid;
+------+
| cid  |
+------+
|   47 |
+------+
1 row in set (0.00 sec)

There are a total of 11 locks. It’s just a simple Update operation on one row.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
-- SESSION 2
mysql-((none)/3305)->select count(*) from performance_schema.data_locks where thread_id = 47;
+----------+

| count(*) |
+----------+
|       11 |
+----------+
1 row in set (0.00 sec)

-- View the detailed locks. The parent table f1 has 5 locks, and the child table f6 has 6 locks.
-- These are all forced by MySQL to ensure data consistency, which is definitely not suitable in scenarios with high TPS requirements.
mysql-((none)/3305)->select object_name,lock_type,lock_mode,lock_status,lock_data from performance_schema.data_locks where thread_id = 47 order by object_name;
+-------------+-----------+---------------+-------------+------------------------+
| object_name | lock_type | lock_mode     | lock_status | lock_data              |
+-------------+-----------+---------------+-------------+------------------------+
| f1          | TABLE     | IX            | GRANTED     | NULL                   |
| f1          | RECORD    | X             | GRANTED     | supremum pseudo-record |
| f1          | RECORD    | X             | GRANTED     | 100, 100, 1            |
| f1          | RECORD    | X,REC_NOT_GAP | GRANTED     | 1                      |
| f1          | RECORD    | X,GAP         | GRANTED     | 101, 100, 1            |
| f6          | TABLE     | IS            | GRANTED     | NULL                   |
| f6          | RECORD    | S,REC_NOT_GAP | GRANTED     | 100, 100, 12           |
| f6          | TABLE     | IX            | GRANTED     | NULL                   |
| f6          | RECORD    | X,REC_NOT_GAP | GRANTED     | 12                     |
| f6          | RECORD    | X,REC_NOT_GAP | GRANTED     | 101, 100, 12           |
| f6          | RECORD    | S,GAP         | GRANTED     | 101, 100, 12           |
+-------------+-----------+---------------+-------------+------------------------+
11 rows in set (0.00 sec)

3. Limitations of Foreign Keys

  1. Only the InnoDB and NDB engines support foreign keys.
  2. Virtual columns are not supported.
  3. Temporary tables are not supported.
  4. The data types, character sets, and collation rules of the foreign key column and the referenced column must be consistent.
  5. Indexes must be established on both the foreign key column and the referenced column.
  6. When a foreign key references multiple columns, the column order must be consistent.
  7. Large object fields cannot be used as referenced columns.
  8. The constraint name must be unique within a single database.
  9. Foreign key cascade update operations do not trigger triggers on the child table.
  10. Partitioned tables are not supported.

Summary

This article demonstrates through several examples whether foreign keys should be used and in which scenarios. From the scenarios described above, scenarios 1, 2, and 3 are very suitable for using foreign keys; scenarios 4, 5, 6, and 7 are not. In these cases, the foreign key function can be implemented outside the database.

๐Ÿ‘‹ See you in the next lesson.

The SQL Optimization Course

The SQL optimization knowledge

๐Ÿ‘‹ See you in the next lesson.

What is SQLFlash?

SQLFlash is your AI-powered SQL Optimization Partner.

Based on AI models, we accurately identify SQL performance bottlenecks and optimize query performance, freeing you from the cumbersome SQL tuning process so you can fully focus on developing and implementing business logic.

How to use SQLFlash in a database?

Ready to elevate your SQL performance?

Join us and experience the power of SQLFlash today!.