Sunday, February 19, 2012

Efficient table record size

I understand that I can put a massive 8K varchar on one data page, but I want to know whether that is the best design.

I have a data stream that can be from 10 to 7000 characters. 90% of the time the field data length is less than 700 characters. Is it better to create one massive varchar field or multiple smaller varchar fields? (Perhaps ten varchar(700) fields.) Or does it matter?

(Also, each record is inserted once and then only read from that point on.)

The query with the smaller fields would look like this:

-- Each chunk is padded back to a fixed 700 characters before the pieces are concatenated.
SELECT RTRIM(
    LEFT(ReportStream1 + SPACE(700), 700) +
    LEFT(ISNULL(ReportStream2, '') + SPACE(700), 700) +
    LEFT(ISNULL(ReportStream3, '') + SPACE(700), 700) +
    LEFT(ISNULL(ReportStream4, '') + SPACE(700), 700) +
    LEFT(ISNULL(ReportStream5, '') + SPACE(700), 700) +
    LEFT(ISNULL(ReportStream6, '') + SPACE(700), 700) +
    LEFT(ISNULL(ReportStream7, '') + SPACE(700), 700) +
    LEFT(ISNULL(ReportStream8, '') + SPACE(700), 700) +
    LEFT(ISNULL(ReportStream9, '') + SPACE(700), 700) +
    LEFT(ISNULL(ReportStream10, '') + SPACE(700), 700)
)
FROM ReportStream
WHERE StreamID = 123

It depends on what you want to do with the data. If the data is always selected as one chunk, it is better, performance-wise, to store it as a single column.
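In that case the schema and the read query stay trivial. A minimal sketch of the single-column layout (the table and column names here are hypothetical, not from the original post):

-- Hypothetical single-column design: the whole stream fits in one varchar(8000).
CREATE TABLE dbo.ReportStreamSingle
(
    StreamID     int           NOT NULL PRIMARY KEY,
    ReportStream varchar(8000) NOT NULL
);

-- Reading the stream back is a plain single-column SELECT, with no reassembly step.
SELECT ReportStream
FROM dbo.ReportStreamSingle
WHERE StreamID = 123;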

Each varchar column also carries a small amount of storage overhead (about 2 bytes per column), so a single column is better from a storage point of view as well.

However, if you frequently want to query subsets of the string (e.g., you are only interested in the first few characters and want to filter on them), it makes more sense to have multiple columns, so that you can more easily query and index those substrings.
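For example, if searches only ever look at the start of the stream, one way to arrange the split is to keep the first chunk in its own indexed column. This is just an illustrative sketch; the table, column, index names, and the 'ERROR%' search prefix are made up for the example:

-- Hypothetical split design: the searchable head of the stream gets its own indexed column.
CREATE TABLE dbo.ReportStreamSplit
(
    StreamID         int           NOT NULL PRIMARY KEY,
    ReportStreamHead varchar(700)  NOT NULL,  -- first 700 characters, used for filtering
    ReportStreamRest varchar(8000) NULL       -- remainder of the stream, read back as-is
);

CREATE INDEX IX_ReportStreamSplit_Head
    ON dbo.ReportStreamSplit (ReportStreamHead);

-- A prefix search can now seek on the narrow index instead of scanning the wide rows.
SELECT StreamID
FROM dbo.ReportStreamSplit
WHERE ReportStreamHead LIKE 'ERROR%';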

Thanks,

Marcel van der Holst
[MSFT]
