Harlan Grove
My feedback

4 votes

6 votes3 comments · Excel for Windows (Desktop Application) » Formulas and Functions · Flag idea as inappropriate… · Admin →Harlan Grove commented
=ADDRESS(1,COLUMN(AB1),4) returns "AB1", so SUBSTITUTE(ADDRESS(1,COLUMN(AB1),4),1,"") returns "AB". I realize 3 function calls are more than 1, but are 3 function calls for something as rarely needed as this an excessive burden?

19 votes12 comments · Excel for Windows (Desktop Application) » Formulas and Functions · Flag idea as inappropriate… · Admin →Harlan Grove commented
@Ohil,
Excel would be far less pleasant for hundreds of millions to use if it had thousands of builtin functions to address the needs of hundreds of users each.
I realize COUNTIF(ref_to_string,"*"&substring&"*") may not be obvious, but it does address the need without all that much mysterious syntactic baggage.
I hate to put it this way, but with only 18 votes in favor, don't get your hope up on seeing this in Excel before Excel gets regular expression functions.
Harlan Grove commented@Roy, 'routine, and direct solutions to the general problems faced fairly often'
What if there are many (dozens if not hundreds) of variations on certain general problems? Text matching and manipulation are standard computing problems, but not financial arithmetic problems, so spreadsheets seldom address them.
Why not the most general solutions possible?
Anyway, blame Lotus 123 for this as well as MSFT's decision to copy 123's @function semantics so completely. Had SEARCH or FIND returned a positive integer for substring matches or 0 for no match rather than #VALUE!, this particular issue would never have arisen. However, backwards compatibility requires FIND and SEARCH continue to return #VALUE! for no match.
If Excel's eventual REGEXMATCH works the same as Google Sheets's function of the same name, it'll take care of this since if substring contained no metacharacters, REGEXMATCH(string,substring) would return TRUE when substring appeared within string and FALSE otherwise.
Harlan Grove commented@Roy, wildcad searching usually stops once a match has been established, so COUNTIF(something,"*abc*") would usually note the presence of the initial *, so find the 1st 'a' in something, then check whether the next 2 characters match 'bc'. If so, end and return 1, otherwise find the next 'a' in something.
Excel's * and ? wildcards are the most basic form of regular expressions, and regular expressions implemented correctly are quire efficient. Thus, you can't really complain COUNTIF would be inefficient.

1 vote4 comments · Excel for Windows (Desktop Application) » Formulas and Functions · Flag idea as inappropriate… · Admin →Harlan Grove commented
The Excel team has never done a good job with boundary cases for special numeric functions. For a long time, GAMMALN(1) and GAMMALN(2) didn't return 0. I figure the Excel team figures those who really need precision for such things would be using MATLAB or other math packages rather than Excel. Excel really isn't a general mathematical tool.

61 votes7 comments · Excel for Windows (Desktop Application) » Formulas and Functions · Flag idea as inappropriate… · Admin →Harlan Grove commented
There's a new UNIQUE function in the offing. It's in some Office Insider versions. Once it becomes standard, =COUNTA(UINIQUE(range)) would give the count if distinct items in range.

74 votes12 comments · Excel for Windows (Desktop Application) » Formulas and Functions · Flag idea as inappropriate… · Admin →Harlan Grove commented
Thing is dynamic arrays can also spill into a potentially variable number of columns as well as rows. There's no obvious or perhaps even sensible way to handling column indexing which doesn't require numeric column indexing. Why should rows get special treatment?
To some extent, what's really needed is a sensible SEQUENCE function like the one in Google Sheets. It'd also be handy to have backtofront indexing like in Icon and SNOBOL languages so that index 1 referred to the LAST item in an array, 2 to the 2nd to last item, etc.

23 votes11 comments · Excel for Windows (Desktop Application) » Formulas and Functions · Flag idea as inappropriate… · Admin →Harlan Grove commented
With new dynamic arrays, does it really make sense to have total rows and columns below or to the right, respectively, any longer? There's always been good reason to put such totals above or to the left of the range over which they'd evaluate, and in the coming era of dynamic arrays and spilled formulas, above/to the left seem to have become the only sensible way to do this.
That said, do we really need B3:Z3: =SUMCOLS(B5:Z999) rather than B3: =SUM(B5:B999) filled right into C3:Z3?
If spilled formulas could truly be unpredictable, so that the largest extent in terms of rows and columns really isn't known, would it ever make sense to put ANYTHING to the right or below any formula returning a variablesize array result?
I had to learn some PL/I way, way back, and I'd prefer Excel didn't copy the approach of adding anything anyone could think of to it. Minimalism may suggest a 2step approach, e.g., one new function which returns the full array extent of the results given by the topleft cell in its single reference argument or the largest single area range containing the same R1C1 formula as the topleft cell in its single reference argument. For this sort of thing, GETRANGE(B5) which would return a reference to B5:Z999, and that could be chopped up using INDEX(GETRANGE(B5),0,COLUMNS($B$3:D3)) for the sum of column D within the result range.

746 votes125 comments · Excel for Windows (Desktop Application) » Formulas and Functions · Flag idea as inappropriate… · Admin →
Thanks for the votes! Based on your feedback/votes we’ve just announced the new XLOOKUP function. You can read more about it here: https://techcommunity.microsoft.com/t5/ExcelBlog/AnnouncingXLOOKUP/bap/811376
Harlan Grove commentedWhile a useful addition, not that big a deal. Plus, given the frequency of support forum requests for finding nth exact matches, the eventual FILTER function would be better,
=INDEX(FILTER(Value_Column,Lookup_Column=Lookup_Value),n)
or for wildcard matches
=INDEX(FILTER(Value_Column,ISNUMBER(SEARCH(Lookup_Value,Lookup_Column))),n)
Harlan Grove commented@Anonymous,
PowerQuery is great for pulling data from external sources. It's overkill and slower getting data already within the same workbook as the formulas accessing that data.
Harlan Grove commentedSam,
Excel will return something for MATCH(1,(Rng=Criteria)^1), but it's the index of the last item in Rng which satisfies Criteria, and it only does that due to the bracketing binary search requires. It's useful for getting the LAST index, but it's debatable that's what most Excel users want.
Harlan Grove commentedFirst re tangent: I have a utility macro which does just 2 things: replaces all instances of =+ with = (@#$%&*! 123 users), then deletes all non3Dreference instances of the worksheet's name in formulas. I thought about creating Applicationlevel event handlers and putting that in BeforeSave, but I decided not to risk long shutdowns.
If you have multiple arrays of booleans, don't bother with N(booleanarray1)*N(booleanarray2). Just use (booleanarray1)*(booleanarray2). Excel converts TRUE/FALSE to 1/0 in arithmetic expressions. Also, it has limitations. If you had a RANGE of boolean values, say, X3:X10, N(X3:X10) doesn't return an 8x1 array, just the first cell converted to 1/0.
Multiple criteria and linear search: binary search REQUIRES the ability to bracket the value sought, that is, find contiguous subsets of the lookup set where ALL values before (location) the value sought are less than (value comparison) the value sought and ALL values after (location) the value sought are greater than (value comparison) the value sought. Multiple criteria would often if not usually produce the same composite FALSE value on both sides of the value sought, possibly even within 3items subsets (FALSE, TRUE, FALSE). Even if you have a table sorted on col A then col B, the composite criteria A:A,"<10",B:B,"=*X*" need not be sorted, which in the context of boolean values means all FALSE values before all TRUE values or vice versa. The A:A,"<10" portion would provide an initial subset, but B:B,"=*X*" wouldn't necessarily be sorted, so would require linear search.
Maybe it's possible to determine sorted state for several fields as a composite, but I figure the expected instances of sorted composites would be well below 1% of all uses. If so, the overhead to determine composite sorted state would overwhelm the rare advantage binary search would provide.
Harlan Grove commentedTangent first. I agree about the annoyance of Excel STUPIDLY adding the worksheet name to cell references in the same worksheet as the formula. If you're in SheetX!X99, start typing a formula, select a range in another worksheet, then select a range in SheetX, Excel should be smart enough to understand that it doesn't need the worksheet qualifier. At the very least there should be an option to drop the unnecessary worksheet qualifier.
MATCHIFS? What would it be? Would it return the topmost/leftmost row/column index of the first instance in which all criteria were satisfied? Or would it be a generalized lookup? The former might be interesting, but would always have to use linear search, though it could also use shortcircuit evaluation, so if the Nth item in range1 didn't satisfy criteria1, the corresponding items in the other ranges wouldn't need to be checked. Yet another way to perform lookups is problematic, and it gets back to something like the FILTER function in Google Sheets. For me, as general as possible beats all alternatives.
Harlan Grove commentedKenneth,
Your dynamic sort state defeats any benefits from binary search.
For unsorted matching, all MATCH does is iterate through its 2nd arg (which should be a 1D list) comparing it to its 1st arg, stops when it finds a match, and returns the match's index. If no match is found, it returns #N/A.
For ascending sorted matching, all MATCH does is check that its 1st arg is >= the first item in its 2nd arg (if not it returns #N/A quickly), then checks if its 1st arg is > than the last item in its 2nd arg (if so, returns the last item's index quickly). If still going, MATCH checks its 1st arg against the middle item in its 2nd arg list, and effectively replaces the whole list with the portion within which its 1st arg falls. It repeats this last bit until it finds an exact match or until item K is < 1st arg and item K+1 > 1st arg. Descending sorted matching works the opposite of this.
With N items in the list, unsorted matching involves no more than N comparisons, and sorted matching involves no more than INT(LOG(N,2))+2 comparisons. Your dynamic matching would involve 2(N1) comparisons before the matching would begin. How could that aid efficiency?
Harlan Grove commentedKenneth,
Maybe for you most large tables have filters. Not for me. One of the models I work with several times a day has a 2000+ row table in a hidden worksheet. Filters in hidden worksheets would be kinda pointless. There are over 6,000 lookup formulas into this table in other worksheets. I want that table sorted, and I DON'T want lookup functions which would only use binary search when there was an autofilter showing it was sorted.
You're talking about interrelating various pieces of functionality. I'd prefer keeping separate pieces of functionality as orthogonal as possible.
Also, tables can include columns with formulas. Autofiltered tables can be sorted on columns containing formulas, and the autofilter arrow will show sorted. However, those formulas could recalculate and change values, rendering the column no longer sorted, but the autofilter would still show it sorted. Point: relying on autofilter sorted state isn't robust. I've seen too many BIG MISTAKES arising from small assumptions in spreadsheets to be comfortable relying on any assumptions.
My Tables comment refers to looking up using columns other than the first one, and if those other columns were sorted, MATCH could use binary search. Your whole first paragraph in you previous comment showed you didn't understand that this was the situation to which I was referring.
Harlan Grove commentedKenneth,
MATCH with 3rd argument +/ 1 is meant for sorted ranges, and that's what I meant. Tables can be sorted on columns other than the first one. Also, I understand AND EXPECT +/ 1 to be used only when one KNOWS lists are sorted.
LOOKUP always assumes sorted lists. VLOOKUP and HLOOKUP allow for tables sorted on first column or row, respectively, in ascending order or unordered, but not sorted in descending order. MATCH allows ascending, descending or unordered, and that's part of its strength compared to the lookup functions.
The reason for indicating sorted or unsorted is the algorithm the lookup functions and MATCH use. If you know you have sorted lists, those functions can use binary search, which is O(log(N)). For unsorted lists, those functions have to use linear search, which is O(N). I've dealt with workbooks with lots of unnecessary linear search; it can really slow down recalc.
Finally, checking whether a list is sorted is O(N). Meaning NOTHING GAINED checking whether a list is sorted in order to use binary search. You should only use the binary search variants when you KNOW lists are sorted. That is, there's no sensible alternative to KNOWING whether a list is sorted, then TELLING Excel to treat it as sorted for lookups.
Harlan Grove commentedKenneth,
VLOOKUP (and HLOOKUP) still work well for threshold lookups on sorted tables, though INDEX+MATCH also works for that. Indeed, since MATCH can handle ascending and descending orders, INDEX+MATCH is more flexible for that too.
OK, find, time to deprecate the lookup functions, but no need for GETMATCH without providing at least the flexibility of INDEX(...,MATCH(...,...,1)), INDEX(...,MATCH(...,...,0)) and INDEX(...,MATCH(...,...,1)).
For me, the absolute ideal would be structured references from Excel, regular expressions from LibreOffice Calc, and the FILTER function from Google Sheets. Maybe also the resurrection of SQL.REQUEST and SQL queries against tables.
Harlan Grove commentedWyn,
GETMATCH being the only function aside from IFERROR to provide an alternative return value doesn't make sense. If all the lookup functions did so, much better.
Excel is becoming as encrusted as PL/I back in the bad old days of the early 1970s, just before C and its descendants took over. At the very least, it's time to deprecate the Lotus 123like database functions and the rather pointless *IF/*IFS functions in favor of the truly more useful and not difficult to explain FILTER function.
Harlan Grove commentedFurthering Kenneth's point, FILTER would eliminate the need for
COUNTIF
COUNTIFS
SUMIF
SUMIFS
AVERAGEIF
AVERAGEIFS
MAXIFS
MINIFS
DAVERAGE
DCOUNT
DCOUNTA
DGET
DMAX
DMIN
DPRODUCT
DSTDEV
DSTDEVP
DSUM
DVAR
DVARP
INDEX+exact MATCH
HLOOKUP (exact)
VLOOKUP (exact)and very likely over half of actual use of LARGE, SMALL and ROW.
As for Wyn's GETMATCH, just add a 4th argument to LOOKUP to let it do exact and threshold matching. Make that optional 4th arge 1/0/1 like MATCH, so the lookup array/range could be sorted in either order.
Harlan Grove commentedWyn,
If you have Google Sheet's FILTER,
=INDEX(X:X,MATCH(foo,G:G,0)) == =GETMATCH(foo,G:G,X:X) == =FILTER(X:X,G:G=foo)
Note that if GETMATCH returned #N/A when no match was found, it's easy enough to handle alternative return values using IFERROR.
However, even better,
=INDEX(X:X,MATCH(TRUE,IF(G:G=foo,J:J=bar),0)) == =FILTER(X:X,G:G=foo,J:J=bar)
For that matter,
=INDEX(X:X,SMALL(IF(G:G=foo,ROW(X:X)),n)) == =INDEX(FILTER((X:X,G:G=foo),n)
An equivalent for Google Sheets's FILTER function would be more flexible than INDEX+MATCH or GETMATCH (which is LOOKUP with exact matching).

115 votes17 comments · Excel for Windows (Desktop Application) » Formulas and Functions · Flag idea as inappropriate… · Admin →Harlan Grove commented
Alternatively, make ALL informational XLM functions proper worksheet functions since GET.WORKBOOK(1) returns a horizontal array of all worksheets in the active workbook and GET.DOCUMENT(1,GET.CELL(66,A1)) returns a horizontal array of all worksheets in the workbook containing this formula.
@Tony Valko: Especially annoying since Lotus 123 Release 3's (1989, so 27 years ago) @CELL could return the worksheet name, its @INFO could return the number of worksheets in a workbook, and a combination of @CELL, @COORD and @@ could be used to return all worksheet names in a workbook. I believe Quattro Pro also had this functionality in the early 1990s. OTOH, Excel requires VBA or XLM functions.

53 votes9 comments · Excel for Windows (Desktop Application) » Tables, Sorting and Filtering · Flag idea as inappropriate… · Admin →Harlan Grove supported this idea ·

62 votes13 comments · Excel for Windows (Desktop Application) » Formulas and Functions · Flag idea as inappropriate… · Admin →Harlan Grove commented
Kenneth,
I didn't mention stack access other than the top. I don't want to recreate FORTH.
Being able to create names on the fly would be good. About time most of the XLM functions became available in nonmacro worksheets. Unfortunately, SET.NAME() raises the ticklish question whether expressions in cell formulas could change ALREADY DEFINED names, and whether ad hoc names created while evaluating formulas should be deleted when formula evaluation completes.
Harlan Grove commentedPoor example, as =MIN(5,[Formula]) would suffice.
I get the point, but PUSH and POP functions would work as long as PUSH returned its argument's value. That is, something like
=IF(PUSH(very_complex_expression)>=0,SQRT(PUSH(POP())),IF(PUSH(POP())<0,PUSH(POP())&" is negative",IF(NOT(ISERROR(PUSH(POP()))),PUSH(POP())&" not numeric",POP())))
OK, PUSH(POP()) would be very common, so maybe a PPOP() or GET().

34 votes10 comments · Excel for Windows (Desktop Application) » Formulas and Functions · Flag idea as inappropriate… · Admin →Harlan Grove supported this idea ·

62 votes29 comments · Excel for Windows (Desktop Application) » Data Import · Flag idea as inappropriate… · Admin →
Thanks for the feedback!
Can you comment on whether you expect this to basically import all the lines as text to column A, or if you have some other expectation?
Also, I’m assuming people refer here to the file/open path, and not to the Data/FromText or NewQuery/FromText path.
Harlan Grove commentedFor example, genetic data often looks like dates, and opening CSVs converts such data into dates. Since opening CSVs doesn't launch Text to Columns, there's no way to prevent Excel from making this conversion other than by changing the CSV file's extention to .TXT and opening is as a text file rather than as a CSV file.
IOW, ONLY separate CSV records into columns, but otherwise leave as text with no conversion of numbers into numeric values, no conversion of anything looking like dates into dates, etc.
Harlan Grove supported this idea · 
10 votes2 comments · Excel for Windows (Desktop Application) » Opening and Saving Files · Flag idea as inappropriate… · Admin →Harlan Grove shared this idea ·

45 votes11 comments · Excel for Windows (Desktop Application) » Other · Flag idea as inappropriate… · Admin →Harlan Grove supported this idea ·
If your formulas in cells X99 and Y100 were =A5 and =B6, respectively, and you CUT cell B6 and pasted it into cell A5, you'd want both the X99 and Y100 formulas to be =A5? If Y100 were returning the correct value of the former B6 but now A5 but cell X99 returns #REF!. How would you be able to diagnose the problem in cell X99? If you made X99 the active cell, pressed [F2] to enter Edit mode, then pressed [Enter] to reenter it without changing anything, would you expect it still to return #REF! or the new value in A5?
Likewise for deleting row 5 or column A.
In any of these actions, you're DESTROYING a range to which your formulas refer. I'd argue it's better to have Excel replace the A5 reference with #REF!. It'd be better still if Excel displayed a confirmation dialog stating that you're about to destroy a range to which other cells refer and prompting you whether you really want to do that. However, if you did perform the action, Excel should alter the formulas. Maybe another approach would be Excel adding the original formula as a cell comment. However, the formula itself should be changed to reflect the fact that you've destroyed a formerly valid reference.