Lesson 2 of the SQL Optimization Course: The Art of MySQL Data Types:Part 2 | SQLFlash

Lesson 2 of the SQL Optimization Course: The Art of MySQL Data Types:Part 2

Lesson 2 of the SQL Optimization Course: The Art of MySQL Data Types:Part 2

Lesson 2 of the SQL Optimization Course: The Art of MySQL Data Types - Large Object Field

Introduction

For relational databases, the design of tables and SQL is written are particularly crucial. It wouldn’t be an exaggeration to say that they account for 90% of performance. So this time, specifically targeting these two major knowledge areas, we’ll conduct a detailed analysis for you, peeling back the layers.

This Series uses plain and understandable language and selects a large number of examples to elaborate on the subtleties for you.

🧑‍💻 Target audience:

  • DBA
  • Database developers
  • Students

We will use MySQL as the demonstration database.

Lesson 2: The Art of MySQL Data Types - Large Object Field

We will explain MySQL’s large object types in two parts: TEXT/BLOB and JSON.

1. TEXT/BLOB Types

The difference between TEXT and BLOB is very simple. TEXT stores data in plain text with corresponding character sets and collation rules; BLOB stores data in binary format without character sets and sorting rules, and all comparisons are performed in binary.

Example 1

Create a table named c1 with fields f1 and f2 of types tinytext and tinyblob respectively.

1
2
mysql> create table c1 (f1 tinytext, f2 tinyblob);
Query OK, 0 rows affected (0.03 sec)

Insert sample data.

1
insert into c1 values ('a','a'),('b','b'),('B','B'),('d','d'),('F','F'),('You','You'),('I','I'),('Yes','Yes');

Sort by field f1.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
mysql> select * from c1 order by f1;
+--------+--------+
| f1     | f2     |
+--------+--------+
| a      | a      |
| b      | b      |
| B      | B      |
| d      | d      |
| F      | F      |
| You    | You    |
| I      | I      |
| Yes    | Yes    |
+--------+--------+
8 rows in set (0.01 sec)

Sort by field f2.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
mysql> select * from c1 order by f2;
+--------+--------+
| f1     | f2     |
+--------+--------+
| B      | B      |
| F      | F      |
| a      | a      |
| b      | b      |
| d      | d      |
| You    | You    |
| I      | I      |
| Yes    | Yes    |
+--------+--------+
8 rows in set (0.00 sec)

The sorting results of fields f1 and f2 are inconsistent. f1 uses a case - insensitive collation rule, while f2 uses direct binary checking.

I will explain MySQL’s large object types in detail from the following categories:

1.1 Disk Space Occupancy

TypeMaximum Data OccupancyDescription
tinytext/tinyblob255BAn additional 1B is used to store the total number of bytes
text/blob65KBAn additional 2B is used to store the total number of bytes
mediumtext/mediumblob16MBAn additional 3B is used to store the total number of bytes
longtext/longblob4GBAn additional 4B is used to store the total number of bytes

1.2 Table Storage Formats

1.2.1 redundant/compact

For the redundant format, the first 768 bytes of a large object are stored in the InnoDB data page, and the remaining data is stored in overflow pages. If there are multiple TEXT/BLOB fields, the data page will become extremely bloated, significantly affecting performance. The data page will be filled with almost useless data, resulting in additional resource consumption. At the same time, in a master - slave architecture, all data will be synchronized to the slave, causing additional network consumption. Therefore, in this scenario, generally only the path of the large object is saved in the database, and the actual data is stored on the disk.

1.2.2 dynamic/compressed

For the dynamic format, if the data size stored in a large object field is less than 40 bytes, all data is stored in the data page. In other scenarios, the data page only retains a 20 - byte pointer pointing to the overflow page. In this scenario, if the data stored in each large object field is less than 40 bytes, it has the same effect as varchar(40). Therefore, whether to use large object fields cannot be generalized.

It is not recommended to use large objects in the NDB engine. There are many details involved. For example, associated fields containing large objects do not support engine - related data pushdown; tables containing large objects require additional locking, etc.

When creating an index on a large object field, it must be a prefix index. For example, if field f1 is of type text, create an index idx_f1(f1(10)) for the first 10 characters.

1
2
mysql> alter table t2 add key idx_f100(f100);
ERROR 1170 (42000): BLOB/TEXT column 'f100' used in key specification without a key length

Partition fields in partitioned tables do not support large objects.

1
2
mysql> create table p1(f1 text) partition by list columns(f1) (partition p0 values in ('1','2'));
ERROR 1502 (HY000): A BLOB field is not allowed in partition function

mysql_allowed_packet: This parameter represents the upper limit of a single data packet transmitted between the MySQL server and the client. If there are text/blob fields, this parameter should be set to the maximum value of 1GB. Of course, both the client and the server must be set simultaneously.

1.7 How to Insert/Retrieve

Example 2

Insert the contents of all files in the /var/lib/mysql-files directory.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
root@ytt-pc:/var/lib/mysql-files# ll
total 32M
1324681  14M -rw-r--r-- 1 root  root   14M Dec 30 17:23 test3
1311598 6.6M -rw-r--r-- 1 root  root  6.6M Dec 30 17:23 test1
1316840 6.6M -rw-r--r-- 1 root  root  6.6M Dec 30 17:23 test2
1316693 5.5M -rw-r--r-- 1 root  root  5.4M Dec 30 17:26 random_test
1315066 4.0K -rw-r--r-- 1 root  root  1.9K Dec 30 14:56 sample_y1.txt.nl
1311256 4.0K -rw-r----- 1 mysql mysql 1.7K Dec 30 16:23 y1.txt
1316129 4.0K -rw-r--r-- 1 root  root  1.5K Dec 30 15:28 sample_y1.txt.nl2
1312042 4.0K -rw-r--r-- 1 root  root  1.5K Dec 30 15:07 sample_y1.txt.awk
1316691 4.0K -rw-rw-rw- 1 mysql mysql 1.4K Dec 30 16:25 y1_older.txt
1316168 4.0K -rw-r----- 1 mysql mysql  874 Dec 30 14:41 sample_y1.txt
1315893 4.0K -rw-r----- 1 root  root   874 Dec 30 14:45 sample_y1.txt.bak
1311076 4.0K -rw-r----- 1 mysql mysql  662 Dec 30 13:56 ascii.txt
1312152 4.0K -rw-r--r-- 1 root  root   648 Dec 30 14:37 sample.txt
1325161 4.0K -rw-r--r-- 1 root  root   403 Dec 31 10:59 y1111.txt.10
1312256 4.0K -rw-r----- 1 mysql mysql  164 Dec 31 09:49 y111.txt
1311248 4.0K -rw-r----- 1 mysql mysql  159 Dec 31 09:42 y11.txt
1327711 4.0K -rw-r--r-- 1 root  root   126 Dec  5 09:56 performance_schema_enable.sql
1326639 4.0K -rw-r--r-- 1 root  root    22 Jan 10 11:32 系统名称

The table structure for storing file contents.

1
2
mysql-(ytt/3305)->create table file_list_upload_dir(file_name varchar(100) not null primary key,content longtext, updated datetime);
Query OK, 0 rows affected (0.03 sec)

Insert file contents.

1
2
3
4
root@ytt-pc:/var/lib/mysql-files# for i in $(ls); \
do mysql -uytt -pytt -P3305 -h127.0.0.1  \
-e "use ytt;insert into file_list_upload_dir  \
values ('$i',load_file('$(pwd)/$i'),now())";done;

2. JSON Types

Most people are very familiar with the JSON type. It is a lightweight text - based interaction format that does not depend on any statements. For reference information, please see:

The JSON type is stored in a special binary format inside MySQL, similar to the JSONB type in PostgreSQL. The maximum space it occupies is the same as that of longtext or longblob. Although text can also store JSON objects, it does not have the format validation of the JSON type or the many functions provided internally. For example:

Example 3

Variables @a and @b are in standard JSON format and non - JSON format respectively. Create the json1 table.

1
2
3
4
5
6
7
8
mysql-(ytt/3305)->set @a='{"a":1,"b":2,"c":3,"d":4}';
Query OK, 0 rows affected (0.00 sec)

mysql-(ytt/3305)->set @b="{'a':1,'b':2,'c':3,'d':4}";
Query OK, 0 rows affected (0.00 sec)

mysql-(ytt/3305)->create table json1 (str1 json,str2 longtext);
Query OK, 0 rows affected (0.02 sec)

Inserting @a into str1 succeeds, while inserting @b fails; any characters can be inserted into str2.

1
2
3
4
5
6
7
8
mysql-(ytt/3305)->insert into json1 values (@a,@a);
Query OK, 1 row affected (0.00 sec)

mysql-(ytt/3305)->insert into json1 values (@b,@b);
ERROR 3140 (22032): Invalid JSON text: "Missing a name for object member." at position 1 in value for column 'json1.str1'.

mysql-(ytt/3305)->insert into json1 values (@a,@b);
Query OK, 1 row affected (0.01 sec)

Another example is the retrieval part. Retrieving data from JSON is much easier than from text. Using the previous example, if you need to retrieve the value 1 corresponding to a, it may be more difficult for @b, but much easier for @a. You can use the json_extract function.

1
2
3
4
5
6
7
mysql-(ytt/3305)->select json_extract(@a,'$.a') as 'a';
+------+
| a    |
+------+
| 1    |
+------+
1 row in set (0.00 sec)

Fortunately, for a format like @b, a single field can be split into multiple fields, so that the result can be easily retrieved using SQL statements. The advantages and disadvantages of splitting or not splitting are not discussed in this article and will be introduced in detail in the article on table design concepts.

Well, this is the end of the introduction to MySQL’s large object types. I hope it will be helpful to you.

👋 See you in the next lession.

The SQL Optimization Course

The SQL optimization knowledge

👋 See you in the next lesson.

What is SQLFlash?

SQLFlash is your AI-powered SQL Optimization Partner.

Based on AI models, we accurately identify SQL performance bottlenecks and optimize query performance, freeing you from the cumbersome SQL tuning process so you can fully focus on developing and implementing business logic.

How to use SQLFlash in a database?

Ready to elevate your SQL performance?

Join us and experience the power of SQLFlash today!.