
[closed] SQL varchar column length best practices

Problem

Every time I create a new SQL table or add a new varchar column to an existing table, I wonder what the best value for the length is.

So, let’s say you have a varchar column called name, and you have to decide on its length. I can’t think of a name longer than 20 characters, but you never know. Instead of using 20, I always round up to the next 2^n number; in this case, I’d go with a length of 32. I do this because, from a computer scientist’s point of view, 2^n looks more “even” to me than other numbers, and I’m just assuming that the underlying architecture can handle such numbers slightly better than others.

When you choose to add a varchar column in MSSQL Server, for example, the default length value is set to 50. That makes me think. Why 50? Is it just a random number, or is it based on average column length?

It’s also possible – and likely – that different SQL server implementations (such as MySQL, MSSQL, Postgres, etc.) have varying optimal column length values.

Asked by esskar

Solution #1

I am not aware of any DBMS with an “optimization” that makes a VARCHAR with a 2^n length perform better than one with a max length that is not a power of 2.

I think early SQL Server versions treated a VARCHAR with a maximum length of 255 differently from one with a higher maximum length. I don’t know if this is still the case.

For almost all DBMS, the actual storage required is determined only by the number of characters you put into the column, not by the maximum length you define. So, from a storage point of view (and most probably a performance one as well), it makes no difference whether you declare a column as VARCHAR(100) or VARCHAR(500).

The maximum length for a VARCHAR column should be viewed as a constraint (or business rule) rather than a technical/physical limitation.

For PostgreSQL, the best setup is to use text without a length restriction, together with a CHECK constraint that limits the number of characters to whatever your business requires.

If that requirement changes, altering the check constraint is much faster than altering the table (because the table does not need to be rewritten).
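As a rough sketch of the approach described above for PostgreSQL (the table name, constraint name, and limits are made up for illustration and are not from the original answer):

```sql
-- Hypothetical table: an unrestricted text column with the business rule
-- expressed as a named CHECK constraint instead of a varchar(n) limit.
CREATE TABLE person (
    name text NOT NULL,
    CONSTRAINT person_name_length CHECK (char_length(name) <= 50)
);

-- If the business rule changes, swap the constraint rather than
-- changing the column type. Both actions run in one ALTER TABLE.
ALTER TABLE person
    DROP CONSTRAINT person_name_length,
    ADD CONSTRAINT person_name_length CHECK (char_length(name) <= 100);
```

Note that adding a CHECK constraint normally makes PostgreSQL scan the existing rows to validate them, so this is cheap compared to a table rewrite but not entirely free on a large table.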

The same may be said for Oracle and other databases; however, with Oracle, VARCHAR(4000) would be used instead of text.

In SQL Server, I don’t know if there is a physical storage difference between VARCHAR(max) and, say, VARCHAR(500). But apparently there is a performance impact when using varchar(max) as compared to varchar(8000).

Take a look at this link (posted by Erwin Brandstetter as a comment)

Edit 2013-09-22

Regarding bigown’s comment:

In Postgres versions before 9.2 (which was not available when I wrote the initial answer), a change to the column definition would rewrite the entire table, as seen here. Since version 9.2 this is no longer the case, and a quick test confirmed that increasing the column size for a table with 1.2 million rows took only 0.5 seconds.
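The kind of change being timed here would look roughly like this in PostgreSQL (a minimal sketch; the table, column, and sizes are hypothetical):

```sql
-- Hypothetical table with a deliberately small limit.
CREATE TABLE person (name varchar(20) NOT NULL);

-- Increase the maximum length. According to the answer, on 9.2 and later
-- this no longer rewrites the table, so it should finish quickly
-- even when the table is large.
ALTER TABLE person ALTER COLUMN name TYPE varchar(100);
```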

For Oracle this seems to be true as well, judging by the time it takes to alter a big table’s varchar column. But I could not find any reference for that.

For MySQL, the manual says “In most cases, ALTER TABLE makes a temporary copy of the original table”. My own tests confirm that: running an ALTER TABLE to increase the size of a column on a table with 1.2 million rows (the same as in my Postgres test) took 1.5 minutes. In MySQL, however, you cannot use the “workaround” of a check constraint to limit the number of characters in a column.

For SQL Server I could not find a clear statement on this, but the execution time for increasing the size of a varchar column (again, the 1.2 million row table from above) indicates that no rewrite takes place.

Edit 2017-01-24

It seems I was (at least partially) wrong about SQL Server. See Aaron Bertrand’s answer, which shows that the declared length of an nvarchar or varchar column makes a huge difference to performance.

Answered by a_horse_with_no_name

Solution #2

VARCHAR(255) and VARCHAR(2) both use the same amount of storage space! So the only reason to limit a column is a specific need for it to be smaller. Otherwise, make them all 255.

Larger columns take up more space while sorting, therefore if this slows down performance, you should be concerned and reduce the size of the columns. But if you only ever select one row from that table, it doesn’t matter if you make them all 255.

For more information, see What are the best varchar sizes for MySQL?

Answered by Ariel

Solution #3

When I create a new SQL table, I have the same feeling about 2^n being more “even”… but, to sum up the answers here, simply declaring varchar(2^n) or even varchar(MAX) has no significant impact on storage space.

That said, you should still anticipate the potential storage and performance implications when setting a high varchar() limit. Let’s imagine you want to store product descriptions in a varchar(MAX) column with full-text indexing. If 99 percent of your descriptions are only 500 characters long, and then someone comes along and replaces them with Wikipedia articles, you may run into unanticipated storage and performance problems.

Another point from Bill Karwin to consider:

Bottom line: come up with some reasonable business constraints and err on the slightly larger side. Family names in the United Kingdom are usually between 1 and 35 characters long, as @onedaywhen pointed out. If you decide to make it varchar(64), you’re not really going to hurt anything… unless you’re storing this guy’s family name, which is said to be up to 666 characters long. In that case, varchar(1028) might make more sense.

And, in case it’s useful, here’s what varchar 2^5 through 2^10 might look like if they were filled:
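One way to ground such a limit in real data is to look at the lengths you already store before picking a number. This is only a sketch; the table, column, and sample values are invented for illustration:

```sql
-- Hypothetical table and data.
CREATE TABLE person (family_name varchar(64) NOT NULL);
INSERT INTO person (family_name) VALUES ('Smith'), ('Featherstonehaugh');

-- char_length() works in PostgreSQL and MySQL; SQL Server uses LEN() instead.
SELECT
    MAX(char_length(family_name)) AS longest,
    AVG(char_length(family_name)) AS average
FROM person;
```

Comparing the longest observed value against the proposed limit shows how much headroom the chosen length leaves.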

varchar(32)     Lorem ipsum dolor sit amet amet.

varchar(64)     Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donecie

varchar(128)    Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donecie
                vestibulum massa. Nullam dignissim elementum molestie. Vehiculas

varchar(256)    Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donecie
                vestibulum massa. Nullam dignissim elementum molestie. Vehiculas
                velit metus, sit amet tristique purus condimentum eleifend. Quis
                que mollis magna vel massa malesuada bibendum. Proinde tincidunt

varchar(512)    Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donecie
                vestibulum massa. Nullam dignissim elementum molestie. Vehiculas
                velit metus, sit amet tristique purus condimentum eleifend. Quis
                que mollis magna vel massa malesuada bibendum. Proinde tincidunt
                dolor tellus, sit amet porta neque varius vitae. Seduse molestie
                lacus id lacinia tempus. Vestibulum accumsan facilisis lorem, et
                mollis diam pretium gravida. In facilisis vitae tortor id vulput
                ate. Proin ornare arcu in sollicitudin pharetra. Crasti molestie

varchar(1024)   Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donecie
                vestibulum massa. Nullam dignissim elementum molestie. Vehiculas
                velit metus, sit amet tristique purus condimentum eleifend. Quis
                que mollis magna vel massa malesuada bibendum. Proinde tincidunt
                dolor tellus, sit amet porta neque varius vitae. Seduse molestie
                lacus id lacinia tempus. Vestibulum accumsan facilisis lorem, et
                mollis diam pretium gravida. In facilisis vitae tortor id vulput
                ate. Proin ornare arcu in sollicitudin pharetra. Crasti molestie
                dapibus leo lobortis eleifend. Vivamus vitae diam turpis. Vivamu
                nec tristique magna, vel tincidunt diam. Maecenas elementum semi
                quam. In ut est porttitor, sagittis nulla id, fermentum turpist.
                Curabitur pretium nibh a imperdiet cursus. Sed at vulputate este
                proin fermentum pretium justo, ac malesuada eros et Pellentesque
                vulputate hendrerit molestie. Aenean imperdiet a enim at finibus
                fusce ut ullamcorper risus, a cursus massa. Nunc non dapibus vel
                Lorem ipsum dolor sit amet, consectetur Praesent ut ultrices sit

Answered by Kit

Solution #4

The best value is the one that corresponds to the data in the underlying domain.

For some domains, VARCHAR(10) is right for the Name attribute; for other domains, VARCHAR(255) may be the best choice.

Answered by Oded

Solution #5

Adding to a_horse_with_no_name’s answer, you might find the following of interest:

-- try to create a table with max varchar length
drop table if exists foo;
create table foo(name varchar(65535) not null)engine=innodb;

MySQL Database Error: Row size too large.

-- try to create a table with max varchar length - 2 bytes for the length
drop table if exists foo;
create table foo(name varchar(65533) not null)engine=innodb;

Executed Successfully

-- try to create a table with max varchar length with nullable field
drop table if exists foo;
create table foo(name varchar(65533))engine=innodb;

MySQL Database Error: Row size too large.

-- try to create a table with max varchar length with nullable field - 1 byte for the null flag
drop table if exists foo;
create table foo(name varchar(65532))engine=innodb;

Executed Successfully

Remember to account for the length byte(s) and the nullable byte, as follows:

name varchar(100) not null will be 1 byte (length) + up to 100 chars (latin1)

name varchar(500) not null will be 2 bytes (length) + up to 500 chars (latin1)

name varchar(65533) not null will be 2 bytes (length) + up to 65533 chars (latin1)

name varchar(65532) will be 2 bytes (length) + up to 65532 chars (latin1) + 1 null byte
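The limits found above follow from simple arithmetic, sketched below. It assumes MySQL’s 65,535-byte row size limit, a table with a single varchar column, and a single-byte character set such as latin1; the column aliases are made up:

```sql
-- Bytes left for characters after subtracting the per-column overhead.
SELECT
    65535 - 2     AS max_chars_not_null,  -- 2 length bytes
    65535 - 2 - 1 AS max_chars_nullable;  -- 2 length bytes + 1 null byte
```

This yields 65533 and 65532, matching the largest definitions that were accepted in the tests above.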

I hope this information is useful:)

Answered by Jon Black

Post is based on https://stackoverflow.com/questions/8295131/best-practices-for-sql-varchar-column-length