Implemented 'column tetris' in OrmLite for PGSql and others

We needed a simple way to compress columns on disk for PGSQL with OrmLite when a table is created. This is also called ‘column tetris’. I made some changes to OrmLite in order to limit the wasted space per row for PGSQL. The code is adaptable for other DB systems as well and I could implement this for MSSql as well.

Benefits of reordering of columns is that a row takes less space on disk since wasted space is eliminated. The implementation is a proof of concept ans I plan to extend it to a ColumnOrderingStrategy which can be plugged into any DbProvider.

I need a second pair of eyes to make the code generically usable and fit for inclusion in the OrmLite codebase. Also, I need confirmation that I changed the correct bits and pieces. How can I donate this code and make it part of the OrmLite codebase?

I’ve added an API to dialect providers which will let you change the order of columns that are created in this commit. So you can change the order of columns PostgreSQL creates with:

PostgreSqlDialectProvider.Instance.CreateTableFieldsStrategy = modelDef => mySort(modelDef.FieldDefinitions);

This change is available from v5.1.1 that’s now available on MyGet.

As this has never been requested I don’t want to bake any sorting impl in OrmLite yet, IMO the best chance of adoption is to create a post that documents the benefits gained from the custom ordering impl.

OK, this is a good start. For this to work, I need some modifications to the existing converters too, or I need the information included in the FieldDefinition.
The information needed is the ByteSize of a column in a row and wether or not is is a Fixed size column. I added the information to all PGConverters like this:

    public override List<FieldDefinition> OptimizeColumnOrder(List<FieldDefinition> fieldDefinitions)
{
  /* Ordering columns, assume 8 byte boundaries
   Stratergy:
   - Collect all fixed width columns and all non-fixed columns
   - Use ID column first and fill up to 8 with int and bool columns       
  */
  var temp = new List<FieldDefinition>(fieldDefinitions);
  var result = new List<FieldDefinition>();
  var byteCount = 0;
  var idColumn = temp.FirstOrDefault(x => x.IsPrimaryKey);
  if (idColumn != null)
  {
    temp.Remove(idColumn);
    result.Add(idColumn);

    byteCount += GetRowBytes(idColumn);
    var fixedColumns = temp.Where(x => IsFixedSize(x)).OrderByDescending(x => GetRowBytes(x));
    if (byteCount % 8 != 0)
    {
      var bytesNeeded = 8 - byteCount;
      var fillColumn = fixedColumns.FirstOrDefault(x => GetRowBytes(x) == bytesNeeded);
      if (fillColumn != null)
      {
        temp.Remove(fillColumn);
        result.Add(fillColumn);
        byteCount += GetRowBytes(fillColumn);
      }
    }
    // Add other fixed columns
    result.AddRange(fixedColumns);

    // Add non-fixed columns
    result.AddRange(temp.Where(x => !IsFixedSize(x)));
  }
  return result;
}

private int GetRowBytes(FieldDefinition fieldDefinition)
{
  var converter = GetConverter(fieldDefinition.FieldType);
  if (converter != null)
  {
    return converter.RowBytes;
  }
  return 1;
}

private bool IsFixedSize(FieldDefinition fieldDefinition)
{
  var converter = GetConverter(fieldDefinition.FieldType);
  if (converter != null)
  {
    return converter.IsFixedSize;
  }
  return false;
}

If you have any idea on how to do this now, please let me know :smile:
(this is the code before I added the change you added)

The FieldDefinition only maintains info about the code class/property metadata. You’ll need to maintain any additional info you need in your own dictionary/collections.