Vaillancourt

Could anyone explain this a little bit? Thanks.

Here is the quote:

For performance reasons, the program treats boneMatrix as an array of float4 vectors rather than an array of float3x4 matrices. The matrixIndex array contains floating-point values instead of integers, and so the addressing of a single array of vectors is more efficient than accessing an array of matrices. The implication of this is that the indices in the matrixIndex vector should be three times the actual matrix index. So, the program assumes 0 is the first matrix in the array, 3 is the second matrix, and so on. The indices are fixed for each vertex, so you improve performance by moving this "multiply by 3" outside the vertex program.

And here is the Cg program it's referring to:

// Example 6-5. The C6E5v_skin4m Vertex Program
void C6E5v_skin4m(float3   position    : POSITION,
                  float3   normal      : NORMAL,
                  float2   texCoord    : TEXCOORD0,
                  float4   weight      : TEXCOORD1,
                  float4   matrixIndex : TEXCOORD2,
              out float4   oPosition   : POSITION,
              out float2   oTexCoord   : TEXCOORD0,
              out float4   color       : COLOR,
          uniform Light    light,
          uniform float4   boneMatrix[72], // 24 matrices
          uniform float4x4 modelViewProj)
{
  float3 netPosition = 0, netNormal = 0;

  for (int i = 0; i < 4; i++) {
    float index = matrixIndex[i];
    float3x4 model = float3x4(boneMatrix[index + 0],
                              boneMatrix[index + 1],
                              boneMatrix[index + 2]);

    float3 bonePosition = mul(model, float4(position, 1));
    // Assume no scaling in matrix, just rotate & translate
    float3x3 rotate = float3x3(model[0].xyz,
                               model[1].xyz,
                               model[2].xyz);

    float3 boneNormal = mul(rotate, normal);
    netPosition += weight[i] * bonePosition;
    netNormal   += weight[i] * boneNormal;
  }

  netNormal = normalize(netNormal);
  oPosition = mul(modelViewProj, float4(netPosition, 1));
  oTexCoord = texCoord;
  color = computeLighting(light, netPosition, netNormal);
}
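
For reference, this is how I read the "multiply by 3" remark in the quote: the application would pre-scale each bone index once, before uploading the vertex data, so that matrixIndex already holds the offset of the matrix's first float4 row in boneMatrix. The sketch below is only my assumption of what that host-side step could look like; the function name, the constants, and the unsigned char input layout are hypothetical and do not come from the tutorial.

```c
#include <stddef.h>

/* Hypothetical layout constants (assumptions, not from the Cg tutorial):
   each bone matrix occupies 3 consecutive float4 rows of boneMatrix[72],
   and each vertex is influenced by 4 bones. */
enum { ROWS_PER_MATRIX = 3, INFLUENCES_PER_VERTEX = 4 };

/* Converts per-vertex bone indices (0, 1, 2, ...) into row offsets
   (0, 3, 6, ...) stored as floats, ready to be sent as TEXCOORD2.
   bone_index and matrix_index each hold vertex_count * 4 entries. */
void prescale_matrix_indices(const unsigned char *bone_index,
                             float *matrix_index,
                             size_t vertex_count)
{
    for (size_t v = 0; v < vertex_count; ++v) {
        for (size_t i = 0; i < INFLUENCES_PER_VERTEX; ++i) {
            size_t k = v * INFLUENCES_PER_VERTEX + i;
            matrix_index[k] = (float)(bone_index[k] * ROWS_PER_MATRIX);
        }
    }
}
```

With this done once per mesh, the vertex program can use matrixIndex[i] directly as the index of the first row, which is what the index + 0, index + 1, index + 2 lookups in the listing appear to rely on.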

Why is addressing an array of vectors more efficient than addressing an array of matrices in Cg?

According to Nvidia's Cg tutorial (in the note right under section 6.5.2), addressing an array of vectors is more efficient than addressing an array of matrices. The reason it gives is that the index is a floating-point value rather than an integer.

Could anyone explain this a little bit? Thanks.