Thursday, January 2, 2014

T-SQL query optimizations with data type mismatches

We were looking into a performance issue on a database. It was the classic "the more data you have the worse the performance" performance issue. As anyone knows, you add an index to go from a full table scan to a hash-like lookup.

Well we did that. It didn't help. After further investigations, one of my awesome engineers realized that the SQL QP was not engaging the index. The reason was silly IMO. The datatype of one column was NCHAR and the other was just CHAR. Of course. It's obvious :). As soon as he converted the data type, the index kicked in and the performance flattened. Yeah for SQL :(

Here's a side note. The thing I hate to admit is that the NCHAR/CHAR columns were used to store GUID values. Now, my team didn't design the database, but whoever did was not aware of SQL Server's native data type for GUID (and the benefits of doing compares on them). I wonder if it was laziness or neglect. Either way, living with a database wherein you are deeply worried about minimizing upgrade risks makes this a life-long ownership problem. If only data was like code and you could throw it out with each new version...