Wednesday, May 9, 2007

Blob performance concerns

Maybe someone can help me answer this question. I originally wrote about this on 3/2/2006 but never really found a good answer.

Suppose you are writing a program in C# that uses SQL Server 2005 and you'd like to store some files as Blobs (Binary Large OBjectS) in the database.

This is a very simple thing to do. Take the bytes, send them to SQL Server 2005 where you store them in a varbinary(max) column.

The code for this in C# looks like:

FileStream fs = new FileStream( "file" );
byte[] buff = new byte[fs.Length];
fs.Read( buff, 0, fs.Length );

SqlCommand sc = new SqlCommand(...);
sc.Parameters.Add( ...buff... );


My concern is that this code is going to read the entire file into memory. Then its going to send all that memory to SQL server in one shot. If this is a sufficiently large file I'd expect there to be problems on both the client and the server because of this.

If the client is somewhat lacking for memory, or if the file is really large, then I'd expect this code to cause a lot of disk paging. Possibly even paging out the beginning of the file you just read from disk to make room for the end of the file, just to turn around and page out the end of the file to make room for the beginning so it can send those bytes to SQL. I'm not clear such a thing could ever actually happen, but I do suspect some paging will occur.

What I'd really like to do is specify a stream as the parameter instead of a byte array. That way I could trust the driver was only loading a portion of the full file into a buffer at a time. It seems the JDBC driver supports this, but the SqlClient driver for ADO.NET doesn't.

So my questions are, am I blowing this way out of proportion? Is there a better way to insert a Blob in SQL from C#? Why does no one else on the internet address this concern?

Partly because of this problem, along with some other performance concerns, I ended up storing the files on a shared file server instead of as Blobs in SQL. This allowed to me to handle the transfer myself using the typical buffered approach.