Aggregate 2D arrays by row or by column (especially relevant to dynamic arrays)
Assuming the new dynamic array functionality receives widespread usage, one might expect to see more in the way of 2D arrays. Traditionally aggregation over rows (say) to give a column of row sums (or row minima, Booleans, text joins etc.) would be handled by relative referencing but this is not consistent with the concept of dynamic arrays.
My suggestion is to introduce functions BYROW and BYCOLUMN (and, possibly, BYCELL) that would cause aggregation operations to treat the 2D array as a collection of rows/columns/individual cells. The operation should return an array of results.
Example
= AND( BYROW( CriterionRange = Criterion ) )
should return TRUE/FALSE for every row of a spilt array.
A completely different aggregation might then be applied to the resulting column vector.

2 comments
-
First260 commented
Another possible approach could be to allow 'arrays as references' within database, conditional aggregation and other function arguments that currently only support ranges so that nested arrays such as INDEX({1,2;3,4},,{1,2}) would be handled in a similar way to arrays of ranges and return spilled arrays as result sets.
Stepping through the Evaluate Formula tool with ranges or arrays as function arguments suggests a link between argument type and results from passing an INDEX formula of this kind, as summarized below:
ByVal (SUMPRODUCT, MMULT) array truncation
ByRef (SUBTOTAL, SUMIFS, DSUM) array of errors
ByRef/ByVal (SUM, AGGREGATE, IMSUM) multi-cell results only with CSEMy impression is that it'd be simpler to target the second group of functions and replace the errors with calculated results (eg {4,6}) without any significant change to array calc. That said, if the multi-cell CSE results from the third group of functions could be converted to spilled arrays in a consistent manner to this request all the better.
-
Peter Bartholomew commented
Now as I have settled in to using dynamic arrays, I find that this limitation affects about 50% of the models I build. The observable symptoms are blocks of 'fill-down' formulas involving @k#
[k being a defined name applied to the anchor cell of a spilt array].Even worse is
= INDEX( twoDArray#, @k#, 0 )
[defining a row of the array]
The functions then applied to such objects might be simple aggregations or functions such as FILTER and UNIQUE.The formula needs to be applied for every value of 'k#', hence the fill, but there may be no way of telling in advance how many terms there are within the array. This makes how far to fill a matter of guesswork.