
Recently I was given a task: parse SQL statements and check them against a custom specification, using Python's re module and sqlparse.

e.g.

CREATE TABLE `student_info` (
`id` INT (11) UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'primary',
`stu_name` VARCHAR (10) NOT NULL DEFAULT '' COMMENT 'username',
`stu_class` VARCHAR (10) NOT NULL DEFAULT '' COMMENT 'class',
`stu_num` INT (11) NOT NULL DEFAULT '0' COMMENT 'study number',
`stu_score` SMALLINT UNSIGNED NOT NULL DEFAULT '0' COMMENT 'total',
`tuition` DECIMAL (5, 2) NOT NULL DEFAULT '0' COMMENT 'fee',
`phone_number` VARCHAR (20) NOT NULL DEFAULT '0' COMMENT 'mobile',
`create_time` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'record created time',
`update_time` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'record updated time',
`status` TINYINT NOT NULL DEFAULT '1' COMMENT 'some comment',
PRIMARY KEY (`id`),
UNIQUE KEY uniq_stu_num (`stu_num`),
KEY idx_stu_score (`stu_score`),
KEY idx_update_time_tuition (`update_time`, `tuition`)
) ENGINE = INNODB charset = utf8mb4 COMMENT 'Student table';

I am trying to check this statement with a regex against the following specification:

  • Every field must have a COMMENT
  • There must be a PRIMARY KEY, and the PRIMARY KEY column must be AUTO_INCREMENT
  • Every field must have a DEFAULT value
  • ENGINE must be INNODB
  • charset must be utf8mb4

I am using a regex pattern like this:

create\s+table\s*`\w*`\s*\(\n\s*`([\w\-_]*)`\s*([\w]*).*(auto_increment)([\n\s\w()',`]*)(primary key)\s*\(`([\w\-_]*)`\).*\n.*engine\s*=\s*(InnoDB).*charset\s*=\s*([\w\-]*);

to capture all the key information in groups and process it later.
[Regex Demo]

But I cannot capture the information for every single field, maybe because of the order. Can someone fix that regex, or just give me a clue?

  • I disagree with requiring an AUTO_INCREMENT PK. Commented Aug 25, 2017 at 5:35
  • Observe how lame the 'required' comments are. Commented Aug 25, 2017 at 5:36
  • Thanks for replying. The specification comes from another DBA; I am just trying to implement it. The comments in this example are only for testing, for convenience. Commented Aug 25, 2017 at 6:24

1 Answer

Note: It is not a good idea to do everything with a single regex.
Note: You can validate a string with regexes in multiple steps.

So, one way to start using regex for this could be:

Step 1: Check the whole CREATE command:

"^\s*create\s+table\s*`([a-z]\w+)`\s*\(([\s\S]+)\)\s*
 engine\s*=\s*innodb\s+
 charset\s*=\s*utf8mb4\s+
 comment\s'[^']+'\s*;\s*$
"giu

\1 = name of table
\2 = body of create statement

[Regex Demo]
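For the Python `re` module mentioned in the question, Step 1 could be sketched roughly as below. This is only an illustration: the names `CREATE_TABLE_RE`, `split_create_table` and `SAMPLE_SQL` are made up for this example, and `re.VERBOSE` is assumed as the way to keep the multi-line layout of the pattern (the `g` and `u` flags have no direct equivalent here; `i` becomes `re.IGNORECASE`).

```python
import re

# Step 1 pattern, written with re.VERBOSE so that the line breaks and
# indentation inside the pattern are ignored by the regex engine.
CREATE_TABLE_RE = re.compile(
    r"""
    ^\s*create\s+table\s*`([a-z]\w+)`\s*\(([\s\S]+)\)\s*
    engine\s*=\s*innodb\s+
    charset\s*=\s*utf8mb4\s+
    comment\s'[^']+'\s*;\s*$
    """,
    re.IGNORECASE | re.VERBOSE,
)

# Shortened version of the statement from the question (illustrative).
SAMPLE_SQL = """
CREATE TABLE `student_info` (
`id` INT (11) UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'primary',
`stu_name` VARCHAR (10) NOT NULL DEFAULT '' COMMENT 'username',
PRIMARY KEY (`id`)
) ENGINE = INNODB charset = utf8mb4 COMMENT 'Student table';
"""


def split_create_table(sql):
    """Return (table_name, body) or None if the statement breaks the rules."""
    m = CREATE_TABLE_RE.match(sql)
    if m is None:
        return None
    return m.group(1), m.group(2).strip()


result = split_create_table(SAMPLE_SQL)
print(result[0] if result else "statement rejected")
```

A `None` result means the statement as a whole failed the engine, charset, or table comment check; the body in the second element is what Step 2 would work on.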

Step 2: Check the structure of the body of the CREATE command (taken from \2):

"([\s\S]+)\s*
 primary\s+key\s*\(\s*`([a-z]\w*)`\s*\)\s*
 (,\s*unique\s+key\s+uniq_\w+\s*\(`([a-z]\w*)`\))?
 (,\s*key\s+idx_\w+\s*\((\s*,?\s*`([a-z]\w*)`\s*)+\)\s*)+
"giu

\1 = fields info
\2 = primary key field name
\4 = unique key field name
\7 = keys field name

[Regex Demo]
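A possible Python version of Step 2 is sketched below, again assuming `re.VERBOSE` for the layout. `BODY_RE`, `split_body` and `SAMPLE_BODY` are hypothetical names. One thing to keep in mind: in Python's `re`, a group that sits inside a repetition keeps only its last match, so group 7 yields only the last column of the last `KEY idx_...` clause, not every key column.

```python
import re

# Step 2 pattern: fields, then PRIMARY KEY, optional UNIQUE KEY,
# then one or more KEY idx_... clauses.
BODY_RE = re.compile(
    r"""
    ([\s\S]+)\s*
    primary\s+key\s*\(\s*`([a-z]\w*)`\s*\)\s*
    (,\s*unique\s+key\s+uniq_\w+\s*\(`([a-z]\w*)`\))?
    (,\s*key\s+idx_\w+\s*\((\s*,?\s*`([a-z]\w*)`\s*)+\)\s*)+
    """,
    re.IGNORECASE | re.VERBOSE,
)

# Shortened body of the statement from the question (illustrative).
SAMPLE_BODY = """
`id` INT (11) UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'primary',
`stu_num` INT (11) NOT NULL DEFAULT '0' COMMENT 'study number',
PRIMARY KEY (`id`),
UNIQUE KEY uniq_stu_num (`stu_num`),
KEY idx_stu_num (`stu_num`)
"""


def split_body(body):
    """Split the CREATE TABLE body into fields text and key column names."""
    m = BODY_RE.search(body)
    if m is None:
        return None
    return {
        "fields": m.group(1).strip(),
        "primary_key": m.group(2),
        "unique_key": m.group(4),        # None when there is no UNIQUE KEY
        "last_key_column": m.group(7),   # only the last repetition survives
    }


print(split_body(SAMPLE_BODY))
```

If every key column is needed, a separate `re.findall` over the key clauses is probably simpler than trying to get them all out of this one match.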

Step 3: Check fields info

"(`([a-z]\w*)`\s+
 (timestamp|(tiny|small|)int(\s+\(\s*\d+\s*\))?(\s+unsigned)?|varchar\s*\(\d+\)|decimal\s*\(\s*\d+\s*,\s*\d+\s*\)))\s+
 not\s+null\s+
 (auto_increment|default\s+('[^']*'|current_timestamp))\s+
 comment\s+'[^']+',
"giu

\2 = fields name

[Regex Demo]
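One way to turn Step 3 into a per-field report in Python is sketched here. It assumes each field definition starts on its own line with a backtick-quoted name, which holds for the statement in the question but is not guaranteed in general. `FIELD_RE` and `find_invalid_fields` are hypothetical names.

```python
import re

# Step 3 pattern: name, type, NOT NULL, AUTO_INCREMENT or DEFAULT, COMMENT.
FIELD_RE = re.compile(
    r"""
    (`([a-z]\w*)`\s+
     (timestamp|(tiny|small|)int(\s+\(\s*\d+\s*\))?(\s+unsigned)?|varchar\s*\(\d+\)|decimal\s*\(\s*\d+\s*,\s*\d+\s*\)))\s+
     not\s+null\s+
     (auto_increment|default\s+('[^']*'|current_timestamp))\s+
     comment\s+'[^']+',
    """,
    re.IGNORECASE | re.VERBOSE,
)


def find_invalid_fields(fields_sql):
    """Return the names of declared fields that the strict pattern rejects."""
    # Loose pass: every backtick-quoted name at the start of a line.
    declared = re.findall(r"^[ \t]*`(\w+)`", fields_sql, flags=re.MULTILINE)
    # Strict pass: names of the fields that satisfy the full specification.
    valid = {m.group(2) for m in FIELD_RE.finditer(fields_sql)}
    return [name for name in declared if name not in valid]


fields = """
`id` INT (11) UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'primary',
`stu_name` VARCHAR (10) NOT NULL DEFAULT '' COMMENT 'username',
`tuition` DECIMAL (5, 2) NOT NULL DEFAULT '0',
`status` TINYINT NOT NULL COMMENT 'some comment',
"""
print(find_invalid_fields(fields))
```

Comparing a loose pass with a strict pass tells you which field is wrong, instead of only that the whole statement failed to match.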

Step 4: Compare the field names from Step 2 with the field names from Step 3.
Then compare the primary key field name from Step 2 with \1 of the regex below:

"
 `([a-z]\w*)`.+auto_increment
"giu

[Regex Demo]
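The cross-check in Step 4 could look something like this in Python. It is a sketch only: `cross_check` is a made-up helper, the one-field-per-line layout of the question is assumed, and the capture group is written as `([a-z]\w*)` (with the `*` inside the group) so that the whole column name is captured.

```python
import re


def cross_check(fields_sql, pk_name, key_columns):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []

    # Every column used in a key must be a declared field.
    field_names = set(
        re.findall(r"^[ \t]*`(\w+)`", fields_sql, flags=re.MULTILINE)
    )
    for col in [pk_name, *key_columns]:
        if col not in field_names:
            problems.append(f"{col}: not a declared field")

    # The primary key column must carry AUTO_INCREMENT on its own line
    # ('.' does not cross line breaks without re.DOTALL).
    auto_inc = re.findall(
        r"`([a-z]\w*)`.+auto_increment", fields_sql, flags=re.IGNORECASE
    )
    if pk_name not in auto_inc:
        problems.append(f"{pk_name}: primary key is not AUTO_INCREMENT")

    return problems


FIELDS = (
    "`id` INT (11) UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'primary',\n"
    "`stu_num` INT (11) NOT NULL DEFAULT '0' COMMENT 'study number',\n"
)
print(cross_check(FIELDS, "id", ["stu_num"]))
```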


If you want to allow the engine part and the charset part in either order, your regex will change to:

^\s*create\s+table\s*`([a-z]\w+)`\s*\(([\s\S]+)\)\s*
 ((engine\s*=\s*innodb\s+)(charset\s*=\s*utf8mb4\s+)?|(charset\s*=\s*utf8mb4\s+)(engine\s*=\s*innodb\s+)?)?
 comment\s'[^']+'\s*;\s*
$

[Regex Demo]
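The lookahead idea (`?=`) raised in the comments can also handle the ordering. The sketch below is an alternative to the alternation above, not a translation of it: it checks only the table options that follow the closing parenthesis, and it requires both options to be present, as the specification in the question does, whereas the regex above treats them as optional. `OPTIONS_RE` and `table_options_ok` are hypothetical names.

```python
import re

# Each lookahead scans the whole options string from the start, so the
# relative order of ENGINE and CHARSET does not matter.
OPTIONS_RE = re.compile(
    r"""
    ^(?=.*\bengine\s*=\s*innodb\b)
     (?=.*\bcharset\s*=\s*utf8mb4\b)
     .*;\s*$
    """,
    re.IGNORECASE | re.VERBOSE | re.DOTALL,
)


def table_options_ok(options_sql):
    """True if the options contain ENGINE=InnoDB and charset=utf8mb4."""
    return OPTIONS_RE.match(options_sql) is not None


print(table_options_ok("ENGINE = INNODB charset = utf8mb4 COMMENT 'Student table';"))
print(table_options_ok("charset = utf8mb4 ENGINE = INNODB COMMENT 'Student table';"))
print(table_options_ok("ENGINE = MyISAM charset = utf8mb4;"))
```

Each additional required option is then one more lookahead, rather than another set of orderings to spell out.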

HTH


3 Comments

Thanks @shA.t ;) I'll update code later, and good night from CN
Hi, sorry, one more question: sometimes the options after the fields come in a different order, like engine=xx charset=xxx or charset=xxx engine=xxx. How can I handle this (use ?=)? Thanks again.
OMG! It will be so painful. Please check this regex ;).
