Array saving + FILTER function to replace VLOOKUP, SUMIF family, and database functions
Hopefully this is the be-all and end-all of the VLOOKUP modifications, but if it isn't, feel free to comment.
We should really have is a FILTER() function that returns an array of results. We could use it to replace VLOOKUP, HLOOKUP, the database functions (e.g. DSUM, DCOUNT), and the SUMIF family (e.g. SUMIFS, COUNTIF) and still give more functionality! Let me show you what I mean:
FILTER(index, match type, value if no match, range to filter, criteria array)
-index is the index of the match to be returned; 0 for all matches (kind of like INDEX); default to 0 if not provided
-match type is like it is in MATCH, but with 2 extra values: exact match on sorted data (1 for ascending, 1 for descending); these would probably have values 2 and -2, respectively; default to 0 if not provided
-value if no match is the error handler (kind of like IFERROR(VLOOKUP(), value if no match)); default to #N/A if not provided
-range to filter is the exactly what is sounds like
-criteria array is an array of TRUE/FALSE
-the function returns the array/element where criteria array has a TRUE value for it; range to filter and criteria array must be the same size
To get the criteria array, use ARRAYAND, ARRAYOR, and ARRAYXOR:
ARRAYAND(criteria array 1, comparison 1, criteria 1 [, criteria array 2, comparison 2, criteria 2] ...)
similar deal for ARRAYOR and ARRAYXOR
-criteria array works like the criteria ranges in SUMIFS, except that they're arrays (we'll need arrays instead of ranges when we want complex criteria)
-comparison is a number or string indicating =, <, >, <>, <=, >= (seriously, why do we have to do string concatenation?); default to = if not provided; possibly even allow for IN or NOT IN operators
-criteria works as it does in SUMIFS, but without the concatenation of a comparison operator; possibly accept arrays for IN or NOT IN operators
-the function returns an array of TRUE/FALSE; all criteria ranges must be the same size
You can't have the criteria ranges as part of FILTER because if we are to have both AND and OR, we would also need to determine where brackets go to override the order of operations between AND and OR. That would be too many arguments.
Now to have complex criteria, we do this:
Obviously, with this setup, we don't need the database functions since now we can specify AND and OR criteria easily, and we don't need the extra range!
We won't need VLOOKUP either, since we can get any match that we want.
We don't need the SUMIF family either. Filter your range and use your aggregate function on that. For example:
Now we don't need a new [AGGREGATE]IF and [AGGREGATE]IFS function for every aggregate function. We just use the ones that we have. We can use this technique for functions like STDEV.S or SMALL, which don't have [AGGREGATE]IF equivalents.
As powerful as the FILTER + ARRAYAND functions might be, it's not a new idea. SQL queries have been had this functionality for forever (SELECT range to filter FROM table WHERE condition1 AND/OR condition2...). Why Microsoft didn't implement this from the beginning, I have no idea. If they thought that reducing functionality meant simplicity and user-friendliness, they were wrong. We need that functionality and the workarounds are atrocious.
On a related note, people are discouraged from using multiple single-cell array formulas because the arrays have to evaluate for each cell, which is slow. The FILTER approach described above would be slow for this reason. However, if someone's doing multiple FILTERs, they're likely reusing many of the same arrays.
Whether FILTER actually gets implemented or not, Microsoft needs to address the issue of array reuse. Currently, if you want to evaluate an array only once, you have to use the Name Manager (or at least I think it only evaluates once). I list some potential approaches below:
- Read formulas after a paste or fill down to detect the same array being used multiple times. Evaluate once, save into memory, and all of the same arrays just refer to the one in memory.
- Array Manager
- Allow arrays to be stored in cells. Since cells always evaluate to a value (not array), we would either need a GETARRAY function to return the array returned by a cell's formula, or we could just let arrays be returned by cells. This could even be indicated with a triangle on one of the lower corners of the cell, similar to an inconsistent formula (green, top left) or comment (red, top right).
For the list of posts that this post steals from, see the comments at https://excel.uservoice.com/forums/304921-excel-for-windows-desktop-application/suggestions/10801263-re-engineer-vlookup-to-add-the-flexibility-of-inde.
i wish that i can generate the report by family name
Kenneth Barber commented
Kenneth Barber commented
-"range to filter" should be "array to filter"
-All criteria arrays (not ranges) must be the same size
Also, "comparison" in ARRAYAND & variants could be used to indicate if the next argument is a regular expression as opposed to a normal string.
I also forgot about having "unique?" and "sorted" parameters:
Another parameter to have is "options" similar to the AGGREGATE function. Then AGGREGATE and SUBTOTAL could be replaced as well.