4 votes1 comment · Excel for Windows (Desktop Application) » Formulas and Functions · Flag idea as inappropriate… · Admin →
If your formulas in cells X99 and Y100 were =A5 and =B6, respectively, and you CUT cell B6 and pasted it into cell A5, you'd want both the X99 and Y100 formulas to be =A5? If Y100 were returning the correct value of the former B6 but now A5 but cell X99 returns #REF!. How would you be able to diagnose the problem in cell X99? If you made X99 the active cell, pressed [F2] to enter Edit mode, then pressed [Enter] to reenter it without changing anything, would you expect it still to return #REF! or the new value in A5?
Likewise for deleting row 5 or column A.
In any of these actions, you're DESTROYING a range to which your formulas refer. I'd argue it's better to have Excel replace the A5 reference with #REF!. It'd be better still if Excel displayed a confirmation dialog stating that you're about to destroy a range to which other cells refer and prompting you whether you really want to do that. However, if you did perform the action, Excel should alter the formulas. Maybe another approach would be Excel adding the original formula as a cell comment. However, the formula itself should be changed to reflect the fact that you've destroyed a formerly valid reference.
6 votes3 comments · Excel for Windows (Desktop Application) » Formulas and Functions · Flag idea as inappropriate… · Admin →
=ADDRESS(1,COLUMN(AB1),4) returns "AB1", so SUBSTITUTE(ADDRESS(1,COLUMN(AB1),4),1,"") returns "AB". I realize 3 function calls are more than 1, but are 3 function calls for something as rarely needed as this an excessive burden?
19 votes12 comments · Excel for Windows (Desktop Application) » Formulas and Functions · Flag idea as inappropriate… · Admin →
Excel would be far less pleasant for hundreds of millions to use if it had thousands of built-in functions to address the needs of hundreds of users each.
I realize COUNTIF(ref_to_string,"*"&substring&"*") may not be obvious, but it does address the need without all that much mysterious syntactic baggage.
I hate to put it this way, but with only 18 votes in favor, don't get your hope up on seeing this in Excel before Excel gets regular expression functions.
@Roy, 'routine, and direct solutions to the general problems faced fairly often'
What if there are many (dozens if not hundreds) of variations on certain general problems? Text matching and manipulation are standard computing problems, but not financial arithmetic problems, so spreadsheets seldom address them.
Why not the most general solutions possible?
Anyway, blame Lotus 1-2-3 for this as well as MSFT's decision to copy 1-2-3's @-function semantics so completely. Had SEARCH or FIND returned a positive integer for substring matches or 0 for no match rather than #VALUE!, this particular issue would never have arisen. However, backwards compatibility requires FIND and SEARCH continue to return #VALUE! for no match.
If Excel's eventual REGEXMATCH works the same as Google Sheets's function of the same name, it'll take care of this since if substring contained no metacharacters, REGEXMATCH(string,substring) would return TRUE when substring appeared within string and FALSE otherwise.
@Roy, wildcad searching usually stops once a match has been established, so COUNTIF(something,"*abc*") would usually note the presence of the initial *, so find the 1st 'a' in something, then check whether the next 2 characters match 'bc'. If so, end and return 1, otherwise find the next 'a' in something.
Excel's * and ? wildcards are the most basic form of regular expressions, and regular expressions implemented correctly are quire efficient. Thus, you can't really complain COUNTIF would be inefficient.
1 vote4 comments · Excel for Windows (Desktop Application) » Formulas and Functions · Flag idea as inappropriate… · Admin →
The Excel team has never done a good job with boundary cases for special numeric functions. For a long time, GAMMALN(1) and GAMMALN(2) didn't return 0. I figure the Excel team figures those who really need precision for such things would be using MATLAB or other math packages rather than Excel. Excel really isn't a general mathematical tool.
61 votes7 comments · Excel for Windows (Desktop Application) » Formulas and Functions · Flag idea as inappropriate… · Admin →
There's a new UNIQUE function in the offing. It's in some Office Insider versions. Once it becomes standard, =COUNTA(UINIQUE(range)) would give the count if distinct items in range.
74 votes12 comments · Excel for Windows (Desktop Application) » Formulas and Functions · Flag idea as inappropriate… · Admin →
Thing is dynamic arrays can also spill into a potentially variable number of columns as well as rows. There's no obvious or perhaps even sensible way to handling column indexing which doesn't require numeric column indexing. Why should rows get special treatment?
To some extent, what's really needed is a sensible SEQUENCE function like the one in Google Sheets. It'd also be handy to have back-to-front indexing like in Icon and SNOBOL languages so that index -1 referred to the LAST item in an array, -2 to the 2nd to last item, etc.
23 votes11 comments · Excel for Windows (Desktop Application) » Formulas and Functions · Flag idea as inappropriate… · Admin →
With new dynamic arrays, does it really make sense to have total rows and columns below or to the right, respectively, any longer? There's always been good reason to put such totals above or to the left of the range over which they'd evaluate, and in the coming era of dynamic arrays and spilled formulas, above/to the left seem to have become the only sensible way to do this.
That said, do we really need B3:Z3: =SUMCOLS(B5:Z999) rather than B3: =SUM(B5:B999) filled right into C3:Z3?
If spilled formulas could truly be unpredictable, so that the largest extent in terms of rows and columns really isn't known, would it ever make sense to put ANYTHING to the right or below any formula returning a variable-size array result?
I had to learn some PL/I way, way back, and I'd prefer Excel didn't copy the approach of adding anything anyone could think of to it. Minimalism may suggest a 2-step approach, e.g., one new function which returns the full array extent of the results given by the top-left cell in its single reference argument or the largest single area range containing the same R1C1 formula as the top-left cell in its single reference argument. For this sort of thing, GETRANGE(B5) which would return a reference to B5:Z999, and that could be chopped up using INDEX(GETRANGE(B5),0,COLUMNS($B$3:D3)) for the sum of column D within the result range.
746 votes125 comments · Excel for Windows (Desktop Application) » Formulas and Functions · Flag idea as inappropriate… · Admin →
Thanks for the votes! Based on your feedback/votes we’ve just announced the new XLOOKUP function. You can read more about it here: https://techcommunity.microsoft.com/t5/Excel-Blog/Announcing-XLOOKUP/ba-p/811376
While a useful addition, not that big a deal. Plus, given the frequency of support forum requests for finding nth exact matches, the eventual FILTER function would be better,
or for wildcard matches
PowerQuery is great for pulling data from external sources. It's overkill and slower getting data already within the same workbook as the formulas accessing that data.
Excel will return something for MATCH(1,(Rng=Criteria)^-1), but it's the index of the last item in Rng which satisfies Criteria, and it only does that due to the bracketing binary search requires. It's useful for getting the LAST index, but it's debatable that's what most Excel users want.
First re tangent: I have a utility macro which does just 2 things: replaces all instances of =+ with = (@#$%&*! 1-2-3 users), then deletes all non-3D-reference instances of the worksheet's name in formulas. I thought about creating Application-level event handlers and putting that in BeforeSave, but I decided not to risk long shutdowns.
If you have multiple arrays of booleans, don't bother with N(booleanarray1)*N(booleanarray2). Just use (booleanarray1)*(booleanarray2). Excel converts TRUE/FALSE to 1/0 in arithmetic expressions. Also, it has limitations. If you had a RANGE of boolean values, say, X3:X10, N(X3:X10) doesn't return an 8x1 array, just the first cell converted to 1/0.
Multiple criteria and linear search: binary search REQUIRES the ability to bracket the value sought, that is, find contiguous subsets of the lookup set where ALL values before (location) the value sought are less than (value comparison) the value sought and ALL values after (location) the value sought are greater than (value comparison) the value sought. Multiple criteria would often if not usually produce the same composite FALSE value on both sides of the value sought, possibly even within 3-items subsets (FALSE, TRUE, FALSE). Even if you have a table sorted on col A then col B, the composite criteria A:A,"<10",B:B,"=*X*" need not be sorted, which in the context of boolean values means all FALSE values before all TRUE values or vice versa. The A:A,"<10" portion would provide an initial subset, but B:B,"=*X*" wouldn't necessarily be sorted, so would require linear search.
Maybe it's possible to determine sorted state for several fields as a composite, but I figure the expected instances of sorted composites would be well below 1% of all uses. If so, the overhead to determine composite sorted state would overwhelm the rare advantage binary search would provide.
Tangent first. I agree about the annoyance of Excel STUPIDLY adding the worksheet name to cell references in the same worksheet as the formula. If you're in SheetX!X99, start typing a formula, select a range in another worksheet, then select a range in SheetX, Excel should be smart enough to understand that it doesn't need the worksheet qualifier. At the very least there should be an option to drop the unnecessary worksheet qualifier.
MATCHIFS? What would it be? Would it return the topmost/leftmost row/column index of the first instance in which all criteria were satisfied? Or would it be a generalized lookup? The former might be interesting, but would always have to use linear search, though it could also use short-circuit evaluation, so if the Nth item in range1 didn't satisfy criteria1, the corresponding items in the other ranges wouldn't need to be checked. Yet another way to perform lookups is problematic, and it gets back to something like the FILTER function in Google Sheets. For me, as general as possible beats all alternatives.
Your dynamic sort state defeats any benefits from binary search.
For unsorted matching, all MATCH does is iterate through its 2nd arg (which should be a 1D list) comparing it to its 1st arg, stops when it finds a match, and returns the match's index. If no match is found, it returns #N/A.
For ascending sorted matching, all MATCH does is check that its 1st arg is >= the first item in its 2nd arg (if not it returns #N/A quickly), then checks if its 1st arg is > than the last item in its 2nd arg (if so, returns the last item's index quickly). If still going, MATCH checks its 1st arg against the middle item in its 2nd arg list, and effectively replaces the whole list with the portion within which its 1st arg falls. It repeats this last bit until it finds an exact match or until item K is < 1st arg and item K+1 > 1st arg. Descending sorted matching works the opposite of this.
With N items in the list, unsorted matching involves no more than N comparisons, and sorted matching involves no more than INT(LOG(N,2))+2 comparisons. Your dynamic matching would involve 2(N-1) comparisons before the matching would begin. How could that aid efficiency?
Maybe for you most large tables have filters. Not for me. One of the models I work with several times a day has a 2000+ row table in a hidden worksheet. Filters in hidden worksheets would be kinda pointless. There are over 6,000 lookup formulas into this table in other worksheets. I want that table sorted, and I DON'T want lookup functions which would only use binary search when there was an autofilter showing it was sorted.
You're talking about interrelating various pieces of functionality. I'd prefer keeping separate pieces of functionality as orthogonal as possible.
Also, tables can include columns with formulas. Autofiltered tables can be sorted on columns containing formulas, and the autofilter arrow will show sorted. However, those formulas could recalculate and change values, rendering the column no longer sorted, but the autofilter would still show it sorted. Point: relying on autofilter sorted state isn't robust. I've seen too many BIG MISTAKES arising from small assumptions in spreadsheets to be comfortable relying on any assumptions.
My Tables comment refers to looking up using columns other than the first one, and if those other columns were sorted, MATCH could use binary search. Your whole first paragraph in you previous comment showed you didn't understand that this was the situation to which I was referring.
MATCH with 3rd argument +/- 1 is meant for sorted ranges, and that's what I meant. Tables can be sorted on columns other than the first one. Also, I understand AND EXPECT +/- 1 to be used only when one KNOWS lists are sorted.
LOOKUP always assumes sorted lists. VLOOKUP and HLOOKUP allow for tables sorted on first column or row, respectively, in ascending order or unordered, but not sorted in descending order. MATCH allows ascending, descending or unordered, and that's part of its strength compared to the lookup functions.
The reason for indicating sorted or unsorted is the algorithm the lookup functions and MATCH use. If you know you have sorted lists, those functions can use binary search, which is O(log(N)). For unsorted lists, those functions have to use linear search, which is O(N). I've dealt with workbooks with lots of unnecessary linear search; it can really slow down recalc.
Finally, checking whether a list is sorted is O(N). Meaning NOTHING GAINED checking whether a list is sorted in order to use binary search. You should only use the binary search variants when you KNOW lists are sorted. That is, there's no sensible alternative to KNOWING whether a list is sorted, then TELLING Excel to treat it as sorted for lookups.
VLOOKUP (and HLOOKUP) still work well for threshold lookups on sorted tables, though INDEX+MATCH also works for that. Indeed, since MATCH can handle ascending and descending orders, INDEX+MATCH is more flexible for that too.
OK, find, time to deprecate the lookup functions, but no need for GETMATCH without providing at least the flexibility of INDEX(...,MATCH(...,...,1)), INDEX(...,MATCH(...,...,0)) and INDEX(...,MATCH(...,...,-1)).
For me, the absolute ideal would be structured references from Excel, regular expressions from LibreOffice Calc, and the FILTER function from Google Sheets. Maybe also the resurrection of SQL.REQUEST and SQL queries against tables.
GETMATCH being the only function aside from IFERROR to provide an alternative return value doesn't make sense. If all the lookup functions did so, much better.
Excel is becoming as encrusted as PL/I back in the bad old days of the early 1970s, just before C and its descendants took over. At the very least, it's time to deprecate the Lotus 1-2-3-like database functions and the rather pointless *IF/*IFS functions in favor of the truly more useful and not difficult to explain FILTER function.
Furthering Kenneth's point, FILTER would eliminate the need for
and very likely over half of actual use of LARGE, SMALL and ROW.
As for Wyn's GETMATCH, just add a 4th argument to LOOKUP to let it do exact and threshold matching. Make that optional 4th arge 1/0/-1 like MATCH, so the lookup array/range could be sorted in either order.
If you have Google Sheet's FILTER,
=INDEX(X:X,MATCH(foo,G:G,0)) == =GETMATCH(foo,G:G,X:X) == =FILTER(X:X,G:G=foo)
Note that if GETMATCH returned #N/A when no match was found, it's easy enough to handle alternative return values using IFERROR.
However, even better,
=INDEX(X:X,MATCH(TRUE,IF(G:G=foo,J:J=bar),0)) == =FILTER(X:X,G:G=foo,J:J=bar)
For that matter,
=INDEX(X:X,SMALL(IF(G:G=foo,ROW(X:X)),n)) == =INDEX(FILTER((X:X,G:G=foo),n)
An equivalent for Google Sheets's FILTER function would be more flexible than INDEX+MATCH or GETMATCH (which is LOOKUP with exact matching).
115 votes17 comments · Excel for Windows (Desktop Application) » Formulas and Functions · Flag idea as inappropriate… · Admin →
Alternatively, make ALL informational XLM functions proper worksheet functions since GET.WORKBOOK(1) returns a horizontal array of all worksheets in the active workbook and GET.DOCUMENT(1,GET.CELL(66,A1)) returns a horizontal array of all worksheets in the workbook containing this formula.
@Tony Valko: Especially annoying since Lotus 1-2-3 Release 3's (1989, so 27 years ago) @CELL could return the worksheet name, its @INFO could return the number of worksheets in a workbook, and a combination of @CELL, @COORD and @@ could be used to return all worksheet names in a workbook. I believe Quattro Pro also had this functionality in the early 1990s. OTOH, Excel requires VBA or XLM functions.
53 votes9 comments · Excel for Windows (Desktop Application) » Tables, Sorting and Filtering · Flag idea as inappropriate… · Admin →
62 votes13 comments · Excel for Windows (Desktop Application) » Formulas and Functions · Flag idea as inappropriate… · Admin →
I didn't mention stack access other than the top. I don't want to recreate FORTH.
Being able to create names on the fly would be good. About time most of the XLM functions became available in non-macro worksheets. Unfortunately, SET.NAME() raises the ticklish question whether expressions in cell formulas could change ALREADY DEFINED names, and whether ad hoc names created while evaluating formulas should be deleted when formula evaluation completes.
Poor example, as =MIN(5,[Formula]) would suffice.
I get the point, but PUSH and POP functions would work as long as PUSH returned its argument's value. That is, something like
=IF(PUSH(very_complex_expression)>=0,SQRT(PUSH(POP())),IF(PUSH(POP())<0,PUSH(POP())&" is negative",IF(NOT(ISERROR(PUSH(POP()))),PUSH(POP())&" not numeric",POP())))
OK, PUSH(POP()) would be very common, so maybe a PPOP() or GET().
34 votes10 comments · Excel for Windows (Desktop Application) » Formulas and Functions · Flag idea as inappropriate… · Admin →
62 votes29 comments · Excel for Windows (Desktop Application) » Data Import · Flag idea as inappropriate… · Admin →
Thanks for the feedback!
Can you comment on whether you expect this to basically import all the lines as text to column A, or if you have some other expectation?
Also, I’m assuming people refer here to the file/open path, and not to the Data/From-Text or New-Query/From-Text path.
For example, genetic data often looks like dates, and opening CSVs converts such data into dates. Since opening CSVs doesn't launch Text to Columns, there's no way to prevent Excel from making this conversion other than by changing the CSV file's extention to .TXT and opening is as a text file rather than as a CSV file.
IOW, ONLY separate CSV records into columns, but otherwise leave as text with no conversion of numbers into numeric values, no conversion of anything looking like dates into dates, etc.
10 votes2 comments · Excel for Windows (Desktop Application) » Opening and Saving Files · Flag idea as inappropriate… · Admin →Harlan Grove shared this idea ·
45 votes11 comments · Excel for Windows (Desktop Application) » Other · Flag idea as inappropriate… · Admin →