Make a simple safer version of VLOOKUP and INDEX MATCH
The majority of Excel users use VLOOKUP to get exact matches from a row or column, It would be good if they all used INDEX MATCH but it's more complex so how about this simpler version for the masses.
=GETMATCH(ReferenceToMatch, MatchColumn/Row, GetColumn/Row, Optional Value if no match)
If seen a few other references to updating VLOOKUPS and there are lots of opportunities, but it would be good to address the simple most widely used issue first.
Thanks for the votes! Based on your feedback/votes we’ve just announced the new XLOOKUP function. You can read more about it here: https://techcommunity.microsoft.com/t5/Excel-Blog/Announcing-XLOOKUP/ba-p/811376
MATCH with 3rd argument +/- 1 is meant for sorted ranges, and that's what I meant. Tables can be sorted on columns other than the first one. Also, I understand AND EXPECT +/- 1 to be used only when one KNOWS lists are sorted.
LOOKUP always assumes sorted lists. VLOOKUP and HLOOKUP allow for tables sorted on first column or row, respectively, in ascending order or unordered, but not sorted in descending order. MATCH allows ascending, descending or unordered, and that's part of its strength compared to the lookup functions.
The reason for indicating sorted or unsorted is the algorithm the lookup functions and MATCH use. If you know you have sorted lists, those functions can use binary search, which is O(log(N)). For unsorted lists, those functions have to use linear search, which is O(N). I've dealt with workbooks with lots of unnecessary linear search; it can really slow down recalc.
Finally, checking whether a list is sorted is O(N). Meaning NOTHING GAINED checking whether a list is sorted in order to use binary search. You should only use the binary search variants when you KNOW lists are sorted. That is, there's no sensible alternative to KNOWING whether a list is sorted, then TELLING Excel to treat it as sorted for lookups.
Harlan, if you can think of a use for MATCH(,,±1) in unsorted data, please tell me. Otherwise, we can assume that MATCH(,,±1) is only useful in sorted data, which is its greatest weakness. Why should the user of MATCH(,,±1) have to assume that the data will always be sorted? Even if you add sort detection to your MATCH(,,±1) formula, you still have to manually sort the data if it is not sorted (assuming no UDFs). This is why the match types, as we know them, should not be implemented in GETMATCH.
Of course, one main use for MATCH(,,1) is to get the position of the highest match that is ≤ our lookup value. MATCH(,,-1) is similar. If the match types behaved this way but without the requirement for sorted data, then they'd be OK in GETMATCH.
Another reason why you might want GETMATCH to have match types is to speed up searches in sorted data. I think that lookup functions should switch to binary search if they detect that the column is sorted, rather than be told which search method to use. See the link below:
Totally agree Vlookuo in 2010 version doesn't work as updates and patches are applied
Vlookup alternative for neophytes to match up datat from one spreadsheet to another based on a common data link. End-users scream at us that Vlookup is impossible unless they are programmers. I agree Vlookup is just way too complicated.
Paul K Dick commented
Of the GETMATCH proposal, I especially like the last paramater (value if not found). I think its a pain to have to code an IFERROR with each lookup I do. I must admit that the IFERROR is a **BIG** improvement to ISERROR.
VLOOKUP (and HLOOKUP) still work well for threshold lookups on sorted tables, though INDEX+MATCH also works for that. Indeed, since MATCH can handle ascending and descending orders, INDEX+MATCH is more flexible for that too.
OK, find, time to deprecate the lookup functions, but no need for GETMATCH without providing at least the flexibility of INDEX(...,MATCH(...,...,1)), INDEX(...,MATCH(...,...,0)) and INDEX(...,MATCH(...,...,-1)).
For me, the absolute ideal would be structured references from Excel, regular expressions from LibreOffice Calc, and the FILTER function from Google Sheets. Maybe also the resurrection of SQL.REQUEST and SQL queries against tables.
I should point out that GETMATCH and FILTER don't entirely cover the same scope. GETMATCH is for simple lookups, but for the more complicated stuff, we need FILTER. I'd be happy enough if we had both. Anything besides VLOOKUP.
GETMATCH being the only function aside from IFERROR to provide an alternative return value doesn't make sense. If all the lookup functions did so, much better.
Excel is becoming as encrusted as PL/I back in the bad old days of the early 1970s, just before C and its descendants took over. At the very least, it's time to deprecate the Lotus 1-2-3-like database functions and the rather pointless *IF/*IFS functions in favor of the truly more useful and not difficult to explain FILTER function.
While variants of formulas can serve experienced users brilliantly we still need basic safe and simple formulas for the general Excel user population.
That's who my suggestion is aimed at. Keep it simple with one formula with one purpose.
Simple to use, simple to review / audit
Furthering Kenneth's point, FILTER would eliminate the need for
and very likely over half of actual use of LARGE, SMALL and ROW.
As for Wyn's GETMATCH, just add a 4th argument to LOOKUP to let it do exact and threshold matching. Make that optional 4th arge 1/0/-1 like MATCH, so the lookup array/range could be sorted in either order.
To summarize what Harlan said, FILTER can easily handle multi-criterion lookups and can simulate a "get the nth match" function using INDEX(FILTER(),n). When n = 1, it acts like a multi-criterion INDEX/MATCH.
For me, that's only half of the beauty of FILTER. The other half is doing away with functions like SUMIFS or MAXIFS. FILTER is like a lemma. It's not very useful on its own, but it is extremely versatile when paired with other functions.
Wyn, I don't entirely agree with your idea of trying to integrate IFERROR into a lookup function, since there are other functions that you could try to integrate. For example, you could integrate INDEX by having an "nth match" parameter, or MATCH by having a parameter to indicate whether or not the range being searched is sorted.
However, integrating IFERROR into a lookup function is better than the approach that Microsoft seemed to take. IFNA was recently introduced to deal with the "no match" scenario for VLOOKUP and MATCH. It's like IFERROR, but it's specific to the #N/A error.
Hi Harlan, I like the idea of a multiple criteria lookup function and I think there are a few suggestions for this elsewhere on the UserVoice.
I think both our approaches are the same just the syntax is re-arranged?
The simpler the better, the more options that are provided to users the more daunting most users find it and the more likely they end up with an incorrect formula.
My suggestion of having an optional value if no match again does away with the need for an IFERROR wrapper.
If you have Google Sheet's FILTER,
=INDEX(X:X,MATCH(foo,G:G,0)) == =GETMATCH(foo,G:G,X:X) == =FILTER(X:X,G:G=foo)
Note that if GETMATCH returned #N/A when no match was found, it's easy enough to handle alternative return values using IFERROR.
However, even better,
=INDEX(X:X,MATCH(TRUE,IF(G:G=foo,J:J=bar),0)) == =FILTER(X:X,G:G=foo,J:J=bar)
For that matter,
=INDEX(X:X,SMALL(IF(G:G=foo,ROW(X:X)),n)) == =INDEX(FILTER((X:X,G:G=foo),n)
An equivalent for Google Sheets's FILTER function would be more flexible than INDEX+MATCH or GETMATCH (which is LOOKUP with exact matching).
Glenn Russell commented
Looks good to me!
Harlan, if you use index match then any column can be the key field. That's why I prefer that approach to VLOOKUP. This suggested formula would make that even simpler.
Daniel, one extra thought is you could rename the columns as part of the query to make them more meaningful (either within MSQuery or using PowerQuery)
Daniel, sounds like you're using a sensible approach for your situation
I agree Tables makes life much easier, but depending on where the information comes from the column headers can be meaningless. I am constantly using MSQUERY to farm information from our AS400 back into an Excel Table but I still define the names of table to make my formulas more meaningful/understandable. Defining the names may take an extra couple minutes up front but in my opinion being able to understand the formula for future modifications makes it well worth the time spent.
Anonymous, your SCHLOOKUP notation can get shorter still if we use MATCH's -1/0/1 instead of VLOOKUP's TRUE/FALSE for the last parameter. What is SCH by the way?
I disagree with the trade-off that you make (i.e. assuming the first column rather than using an extra parameter). In my work, I have sites as the column being searched. However, these sites fall into more generalized categories, which are also in the table. Naturally, I want the most general information on the left and the most specific information on the right, and so the sites are not the first column.
The Table idea is good, but Tables would be better if they also had key fields which didn't need to be first column. That might allow for Table lookups like
Exact matching would always be assumed, but the key field would need to have only distinct values, so could be indexed, so key lookups could be fast.
Generalized lookups would be nice too, but nicer still would be Google Sheets' FILTER function. With that, the need for LOOKUP, VLOOKUP, HLOOKUP and INDEX+MATCH disappears.