
What’s the best way to create a unique index on a nullable column?

Problem

I’m working with SQL Server 2005. I’d like a column’s values to be unique while still allowing multiple NULLs.

My current solution is to create a unique index on a view, like this:

-- SCHEMABINDING requires two-part object names; the dbo schema is assumed here.
CREATE VIEW dbo.vw_unq WITH SCHEMABINDING AS
    SELECT Column1
      FROM dbo.MyTable
     WHERE Column1 IS NOT NULL
GO

CREATE UNIQUE CLUSTERED INDEX unq_idx ON dbo.vw_unq (Column1)
GO

Any better ideas?

Asked by Nuno G

Solution #1

In SQL Server 2008 and later, you can create a filtered index: http://msdn.microsoft.com/en-us/library/cc280372.aspx. (I notice Simon added this as a comment, but I thought it warranted its own answer because it’s easy to overlook.)

Another option is to use a trigger to check for uniqueness, but this may hurt performance.
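For illustration, a minimal sketch of the trigger approach might look like the following. It reuses MyTable and Column1 from the question; the trigger name and error message are made up for this example, and the check is not safe against concurrent transactions unless they are serialized, so treat it as a starting point rather than a finished solution.

```sql
-- Sketch only: trg_MyTable_UniqueColumn1 is a hypothetical name.
CREATE TRIGGER trg_MyTable_UniqueColumn1
ON MyTable
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    -- Check only the non-NULL values touched by this statement:
    -- does any of them now appear more than once in the table?
    IF EXISTS (
        SELECT t.Column1
          FROM MyTable AS t
         WHERE t.Column1 IN (SELECT i.Column1
                               FROM inserted AS i
                              WHERE i.Column1 IS NOT NULL)
         GROUP BY t.Column1
        HAVING COUNT(*) > 1
    )
    BEGIN
        RAISERROR('Duplicate non-NULL value in Column1.', 16, 1);
        ROLLBACK TRANSACTION;
    END
END
```

The extra lookup against MyTable on every insert and update is where the performance cost comes from.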

Answered by Phil Haselden

Solution #2

The “nullbuster” computed column method is well known; my notes credit Steve Kass:

CREATE TABLE dupNulls (
pk int identity(1,1) primary key,
X  int NULL,
nullbuster as (case when X is null then pk else 0 end),
CONSTRAINT dupNulls_uqX UNIQUE (X,nullbuster)
)
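To see why this works, here is a short usage sketch against the dupNulls table above (the inserted values are arbitrary). For a non-NULL X the constraint key is (X, 0), so duplicates collide; for a NULL X the key is (NULL, pk), which differs for every row.

```sql
INSERT INTO dupNulls (X) VALUES (1);     -- succeeds: key is (1, 0)
INSERT INTO dupNulls (X) VALUES (NULL);  -- succeeds: key is (NULL, <this row's pk>)
INSERT INTO dupNulls (X) VALUES (NULL);  -- succeeds: a different pk gives a different key
INSERT INTO dupNulls (X) VALUES (1);     -- fails: key (1, 0) already exists
```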

Answered by onedaywhen

Solution #3

Allowing repeated values defeats the purpose of a unique constraint, so I would avoid doing this.

That said, this post appears to describe a reasonable workaround: http://sqlservercodebook.blogspot.com/2008/04/multiple-null-values-in-unique-index-in.html

Answered by willasaywhat

Solution #4

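A filtered index (SQL Server 2008 and later) uses a filter predicate to specify which rows are included in the index. Rows that do not match the predicate, here the NULLs, are left out of the index and therefore out of the uniqueness check.

Example: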

CREATE TABLE Table1 (
  NullableCol int NULL
)

CREATE UNIQUE INDEX IX_Table1 ON Table1 (NullableCol) WHERE NullableCol IS NOT NULL;
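
As a quick check of the behaviour, the following inserts (with arbitrary values) could be run against Table1 once the filtered index exists:

```sql
INSERT INTO Table1 (NullableCol) VALUES (NULL);  -- succeeds
INSERT INTO Table1 (NullableCol) VALUES (NULL);  -- succeeds: NULL rows are not in the index
INSERT INTO Table1 (NullableCol) VALUES (1);     -- succeeds
INSERT INTO Table1 (NullableCol) VALUES (1);     -- fails: duplicate key in IX_Table1
```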

Answered by Martin Staufcik

Solution #5

In SQL Server, a unique nullable column (or set of columns) can be NULL (or a row of NULLs) only once, because the unique constraint treats a repeated NULL like any other repeated value.

However, this does not invalidate the concept of “unique nullable columns.” To implement it in any relational database, remember that a well-designed database is normalized, and normalization typically means adding extra (non-entity) tables that establish relationships between the entities.

Let’s start with a simple example that has just one “unique nullable column”; it is easy to extend to more.

Assume we start with the following table:

create table the_entity_incorrect
(
  id integer,
  uniqnull integer null, /* we want this to be "unique and nullable" */
  primary key (id)
);

Rather than keeping uniqnull “within” the entity, we can move it out and add a second table that relates uniqnull values to the entity:

create table the_entity
(
  id integer,
  primary key(id)
);

create table the_relation
(
  the_entity_id integer not null,
  uniqnull integer not null,

  unique(the_entity_id),
  unique(uniqnull),
  /* primary key can be both or either of the_entity_id or uniqnull */
  primary key (the_entity_id, uniqnull), 
  foreign key (the_entity_id) references the_entity(id)
);

To associate a value of uniqnull with a row in the entity, we create a row in the relation.

For rows in the entity that have no uniqnull value (that is, the ones where we would have put NULL in the_entity_incorrect), we simply do not create a row in the relation.

Note that uniqnull values are unique across the whole relation, and that each entity can have at most one row in the relation. The unique constraints on uniqnull and the_entity_id ensure this, while the foreign key guarantees that every row refers to an existing entity.

If we want to associate a uniqnull value of 5 with the entity whose id is 3, we must:

start transaction;
insert into the_entity (id) values (3); 
insert into the_relation (the_entity_id, uniqnull) values (3, 5);
commit;

And if the entity with id 10 has no uniqnull value, we simply do:

start transaction;
insert into the_entity (id) values (10); 
commit;
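
As a sketch of how the constraints behave, assume the two transactions above have been committed, so entity 3 is associated with 5 and entity 10 has no value. The ids and values below are arbitrary:

```sql
-- Rejected by unique(uniqnull): 5 is already associated with entity 3.
insert into the_relation (the_entity_id, uniqnull) values (10, 5);

-- Rejected by unique(the_entity_id): entity 3 already has a value.
insert into the_relation (the_entity_id, uniqnull) values (3, 6);

-- Accepted: entity 10 had no value yet, and 6 is not used elsewhere.
insert into the_relation (the_entity_id, uniqnull) values (10, 6);
```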

To denormalize this data and get what a table like the_entity_incorrect would hold, we run the following query:

select
  id, uniqnull
from
  the_entity left outer join the_relation
on
  the_entity.id = the_relation.the_entity_id
;

When the relation has no matching row, the “left outer join” operator still returns every row from the entity, putting NULL in the uniqnull column.
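
If this query is needed often, one option is to wrap it in a view so that readers see the same shape as the_entity_incorrect. The view name below is hypothetical and is not part of the original answer:

```sql
-- Sketch only: the_entity_denormalized is a made-up name.
create view the_entity_denormalized as
select
  the_entity.id, the_relation.uniqnull
from
  the_entity left outer join the_relation
on
  the_entity.id = the_relation.the_entity_id
;
```

With the rows inserted earlier, selecting from this view would return (3, 5) and (10, NULL).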

Remember that a few days (or weeks or months) spent developing a well-normalized database (and the denormalizing views and processes that go with it) will save you years (or decades) of agony and lost resources.

Answered by roy

Post is based on https://stackoverflow.com/questions/191421/how-to-create-a-unique-index-on-a-null-column