Friday, April 6, 2018

Constraint madness...

Constraint Madness

We often talk about constraints in Oracle Databases. Constraints are rules that you can define on columns in a table, that Oracle will check and enforce for you automatically. Constraints are critical to ensuring your data models integrity. Constraints are also critical to performance, for the Oracle optimizer uses the presence and type of constraints as part of the many inputs when it determines what the optimal execution path is for a given query.

In short - constraints are important. They provide the ability to ensure uniqueness, not null (ness?), primary keys (a combination of uniqueness and not null), referential integrity and so on. They can be the subject of holy wars - with those championing not using them usually not understanding the full scope of why constraints are important, and ways to work around problems they can create in real life models (such as data warehouses). These are all topics for another day. In this post, I want to focus on another consideration with respect to constraints, that I find is often ignored - that is the naming of constraints.

By default, when a constraint is created without a name being explicitly assigned to it, the database will assign a random name to it as seen in listing 2018_04_06_01:

Listing 2018_04_06_01 Creating A Not Null Column Constraint
create table b_copy.redaction_tables
( tab_id        number not null,
  tab_owner     varchar2(30),
  tab_name      varchar2(30) )
tablespace db_index;

Table created.


select owner, table_name, constraint_name, constraint_type from dba_constraints where table_name='REDACTION_TABLES';

OWNER                          TABLE_NAME                     CONSTRAINT_NAME                C
------------------------------ ------------------------------ ------------------------------ -
B_COPY                         REDACTION_TABLES               SYS_C00238040                  C


Note the use of the not null  clause in the definition of tab_id in the create table statement. This causes a constraint to be created which will prevent any NULL values from being added to a row in that table. You can see in the select statement after the table was created that the constraint was given a system generated name when it was created, SYS_C00238040. That's a pretty random and meaningless name.

Using Named Constraints

It is generally considered a best practice to explicitly name your constraints. In many cases, you won't see any direct benefit by doing this, but there are some specific use cases where naming constraints becomes really important. In these cases, having named your constraints, from the beginning, will save you a whole lot of trouble down the road.

Constraint Naming Standards 

The first thing to do, when you have decided to name your constraints, is to develop a naming standard that you will apply to those constraints. Consider the following as a possible naming standard to apply:
  • Not Null Constraints - N_{table_name}_{column_names}
  • Unique Constraints - U_{table_name}_{column_names}
  • Primary Key Constriants - P_{table_name}_{column_names}
  • Check Constraints - C_{table_name}_{column_names}
  • Referential Integrity Constraints - F_{child_table}_{parent_table}_{column_name}
Granted, there are problems with any naming standard. The biggest problem with naming of constraints is running out of space for the name itself. if you are running an Oracle database prior to Oracle Database 12.2 you are limited to a name of 30 bytes in length. This restriction is increased to 132 bytes in Oracle Database 12c.

When I need to reduce the size of a name, I will typically do the following:
  1. Remove all but the first vowel in the name. For example, lets assume we have a really nasty combination of table name and column names:

    p_master_table_primary_record_id_secondary_record_id

    That name is way to long. To reduce it's size, I'd first apply rule 1 which would give me the following name:

    p_mastr_tabl_primry_recrd_id_secndry_recrd_id
  2. Standardize commonly used names. For example:
    • TB for Table
    • REC for Record
    • MS for Master

    The name we have created after applying rule #1 is still too long. So, we will apply rule #2 which then gives us the name:

    p_ms_tb_primry_rec_id_secndry_rec_id
  3. Remove Underscores and like characters, from the right side to the left of the name. I like to use underscores in table names just to make it more readable. However, at this point, I'd remove them which would give us this name:

    p_mstbprimryrecidsecndryrecid
At this point, we are at 29 characters and within the length limit - and I'm out of shortening rules too! At this point, if I need to shorten it further, I just pretty much take a hack saw to the name and whack away.

After shortening, the resulting name might look like garbage to the uninitiated, it will mean something to those that know the standards and how they were applied. Additionally, the benefits of naming a constraint go beyond just having a name that is readable and that gives you a hint of what it's associated with. We will get into that in just a bit. 

 

Creating and Naming Constraints With the Alter Table Command

So, how do we create named constraints. There are many ways of doing it. The most common method seems to be using the alter table command to create the named constraint after the table itself has been created. We can see an example of using the alter table command to create a constraint in listing 2018_04_06_02:

Listing 2018_04_06_02 Creating A Not Null Column Constraint
SQL> create table b_copy.redaction_tables
( tab_id        number,
  tab_owner     varchar2(30),
  tab_name      varchar2(30) )
tablespace b_index;
  2    3    4    5
Table created.

SQL> select owner, table_name, constraint_name, constraint_type from dba_constraints where table_name='REDACTION_TABLES';

no rows selected

SQL> alter table b_copy.redaction_tables add constraint c_tab_id check (tab_id is not null);

Table altered.

SQL> select owner, table_name, constraint_name, constraint_type from dba_constraints where table_name='REDACTION_TABLES';

OWNER                          TABLE_NAME                     CONSTRAINT_NAME                C
------------------------------ ------------------------------ ------------------------------ -
B_COPY                         REDACTION_TABLES               C_
REDACTION_TABLES_TAB_ID      C
In this case, I started the name of the constraint with the prefix C_ - this indicates it's a check constraint and then finished out with the table name and the column name.

It seems that creating the table first and then using the alter table command to create constraints later is the form that you will see a lot of DDL generation tools create constraints. However, I prefer to create constraints, as much as possible, within the create table command itself. It turns out that there is not just one way to do this, but two ways of doing this!

 

Creating Explicitly Named Constraints In the Create Table SQL Command

Over the years, I have found that DBA's don't realize that they can define these constraints within the confines of the original create table command. In fact, one of my Jr. DBA's who is taking a class on Oracle Databases now has a textbook that says the only way to create a constraint is to use the alter table command.

There are two different ways of defining constraints within the create table command. These are:
  1. Inline definitions
  2. Out of line definitions
Most constraint types can be created either way, though creating a NOT NULL constraint using an out of line definition is a bit different than if you create it using an inline definition. 

Explicitly Naming Inline Constraints

An inline constraint is a constraint is where the constraint definition is actually included along with the definition of the column in the create table command.  For example, it is common to define a parimry key constraint using an inline constraint definition such as the one we see in listing 2018_04_06_03. Again, since the constraint is not named, it will just receive a random name assigned by the database:

Listing 2018_04_06_03 Creating An Inline Primary Key Constraint with a System Generated Name

create table b_copy.redaction_tables
( tab_id        number primary key,
  tab_owner     varchar2(30),
  tab_name      varchar2(30) )
tablespace db_index;


We could slightly alter the DDL seen in 2018_04_06_03, still using an inline constraint, and explicitly name the primary key constraint. The modified DDL in listing 2018_04_06_04 finds us removing the primary key keyword in the column definition, and replacing it with the constraint keyword. provides an example of how to do this. We changed the primary key clause to use the constraint clause, which provides more options for controlling the creation of the inline constraint. In this case, we named the primary key constraint, and we also defined what tablespace the index should be created in by using the using index clause:

Listing 2018_04_06_04 Creating A Not Null Column Constraint
create table b_copy.redaction_tables
( tab_id        number constraint p_redaction_tables_tab_id primary key
                using index tablespace b_index,
  tab_owner     varchar2(30),
  tab_name      varchar2(30) )
tablespace b_index;


Explicitly Naming Out of Line Constraints


Some people prefer to use out of line constraints within the create table command.  Either way works just fine, and I really don't have a preference either way.

An example of creating a primary key using an out of line constraint can be seen in listing 2018_04_06_05.

Listing 2018_04_06_05 Creating A Not Null Column Constraint

create table b_copy.redaction_tables
( tab_id        number,
  constraint
p_redaction_tables_tab_id primary key (tab_id) using index tablespace b_index,
  tab_owner     varchar2(30),
  tab_name      varchar2(30) )
tablespace b_index;


SQL> select owner, table_name, constraint_name, constraint_type from dba_constraints where table_name='REDACTION_TABLES';

OWNER                          TABLE_NAME                     CONSTRAINT_NAME                C
------------------------------ ------------------------------ ------------------------------ -
B                              REDACTION_TABLES               P_REDACTION_TABLES_TAB_ID      P


Note the main difference between the inline and out of line constraint is that the inline constraint occurs as a part of the column definition and the out of line constraint is independent of the column definition.

It's interesting to note that out of line constraints are not dependent on where they are included in the create table command either. In listing 2018_04_06_06 we see that a primary key constraint is actually defined in the create table command as an out of line constraint before the column that it's based on is actually defined in the DDL statement:

Listing 2018_04_06_06 Putting Your Constraint Before Your Column Definition Works

create table b_copy.redaction_tables
( tab_id        number not null,
  constraint p_redaction_tables_tab_id primary key (tab_num) using index tablespace b_index,
  tab_num       number,
  tab_owner     varchar2(30),
  varchar2(30) )
  tablespace b_index;

Table created.


Why Worry About Creating Explicitly Named Constraints? 

You might think that it really does not matter how you name constraints. After all, everything about that constraint is already included in database metadata anyway, right?

Maybe not.

First of all, assume you need to drop a constraint or disable it. I personally find it much easier to correctly type a command like alter table, that involves an explicit name:

alter table test_schema.my_table disable constraint fk_my_table_my_parent_col1;


as opposed to the alternative which might look like this:

alter table test_schema.my_table disable constraint SYS_C00202110;

Also, explicit names provide an additional check that you are dropping the correct constraint. If I mean to drop a constraint on the RECORDS table, and the constraint name is C_TYPES_TYPE_ID, then I probably am about to drop the wrong constraint. Since the constraint is named, I have a rational check that I don't have with a randomly named constraint. 

System generated constraint names can be problematic in cases where you use some form of database DDL replication. For example, assume you have a three node multi-master replication setup configured. You want to delete a specific check constraint on one of the masters and have that constraint dropped on the other two masters.

If you are using randomly generated constraint names, you have no way of knowing if the randomly generated constraint name is the same on all three masters. So, now, you have to go look on each master, find the constraint you are dropping and note the name of that constraint so that you are sure to drop it too. That's a lot of work, and a lot of room for error that can be addressed by explicit constraint naming.

You might think that in your little database environment you will never need to worry about such things. The only true constant in the world is change - so whatever you face today, will change tomorrow. That is why best practices are so important. Yes, they are there for the challenges of today, but to protect you from the unknown of tomorrow.

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.