How can we improve Excel for Windows (Desktop Application)?

Power Query - cache shared nodes

Update Power Query in Excel to take advantage of caching in cases where a parent node refers to a child node that has already been refreshed (as exists in Power BI desktop today).

This issue creates significant performance problems with refresh times when creating highly interdependent financial and operational models. This is a show stopper from a usability and customer acceptance standpoint.

1,240 votes
Sign in
Check!
(thinking…)
Reset
or sign in with
  • facebook
  • google
    Password icon
    I agree to the terms of service
    Signed in as (Sign out)

    We’ll send you updates on this idea

    Josh Blackman shared this idea  ·   ·  Flag idea as inappropriate…  ·  Admin →

    122 comments

    Sign in
    Check!
    (thinking…)
    Reset
    or sign in with
    • facebook
    • google
      Password icon
      I agree to the terms of service
      Signed in as (Sign out)
      Submitting...
      • Kirk Phillips commented  ·   ·  Flag as inappropriate

        Until this issue gets resolved I’m going to build my solutions in MS Access. Power Query refresh times are so bad the application is unusable.

      • Josh commented  ·   ·  Flag as inappropriate

        The tool's functionality is amazing, but unless it can run faster it is not a viable option for many business uses. Please improve the performance!

      • Anonymous commented  ·   ·  Flag as inappropriate

        Power Query is great tool but refreshing is getting very slow. I hope prioritization will be given to this project to resolve the issues as soon as possible.

      • Joe Stachyra commented  ·   ·  Flag as inappropriate

        This slowness is a major hurdle for me to convince busy users to adopt this otherwise incredible tool. I could never do a live demo for my classes.

      • Bruno Crotman commented  ·   ·  Flag as inappropriate

        This is the SECOND TIME that THE SAME THING happens:

        The PQ Query will never load when REloaded to the Excel Data Model, that is, when the query is already loaded in the Excel Data Model and it is refreshed.

        The problem was in a line that was referencing a query of the type “list”. A buffer solved the issue again:

        Before solving:

        #"ADD LINES TO REMOVE" = Table.AddColumn(#"Tipo Alterado7", "Remove line?", each List.Contains( #"List of lines to remove" , [Índice]) ),

        After solving:

        #" ADD LINES TO REMOVE " = Table.AddColumn(#"Tipo Alterado7", " Remove line?", each List.Contains( Table.Buffer( #"List of lines to remove ") , [Índice]) ),

        Now the query load fine to the Excel Data Model even if reloaded

      • Bruno Crotman commented  ·   ·  Flag as inappropriate

        I've found a BIZARRE behavior that can explain many problems. And I think it's a query folding issue

        Look at that M code after my comments.

        Sorry about things in portuguese.

        In this piece of code, we have three lines, two commented. They result in the same action.

        The first one tries to split lines from a text file using the positions given by another query that returns a list. This line results in a catastrofic performance when reloading a query that is already in the Excel Data Model.

        The second one works nice! I've simply copied the list generated by the auxilar queries to the code!

        The third one is the real solution for this specific problem: the auxiliar queries are BUFFERED. And this works very nice even when I'm realoding a query that is in the Excel Data Model!!!

        Pay atention to your references to other queries and to previous lines in your M code! Try to use Table.Buffer and List.Buffer

        let

        Fonte = Table.FromColumns({Lines.FromBinary(File.Contents("C:\CURSO EXCEL\Dados Cotações\COTAHIST_A2017_cortado.txt"), null, null, 1252)}),

        //#"Dividir Coluna pela Posição" = Table.SplitColumn(Fonte, "Column1", Splitter.SplitTextByPositions(#"Lista de Posições", false), #"Lista de Campos")

        //#"Dividir Coluna pela Posição" = Table.SplitColumn(Fonte, "Column1", Splitter.SplitTextByPositions({0,2,10,12,24,27,39,49,52,56,69,82,95,108,121,134,147,152,170,188,201,202,210,217,230,242,245}, false), {"Tipo de Registro","Data do Pregão","Código BDI","Código de Negociação do Papel","Tipo de Mercado","Nome resumido da empresa","Especificação do papel","Prazo em dias do mercado a termo","Moeda de referência","Preço de Abertura do Papel no pregão","Preço máximo pregão","Preço minimo pregão","Preço médio pregão","Preço último negócio","Preço melhor oferta de compra","Preço melhor oferta de venda","Número de negócios","Qtd total de títulos negociados","Volume total de títulos","PREÇO DE EXERCÍCIO PARA O MERCADO DE OPÇÕES OU VALOR DO CONTRATO PARA O MERCADO DE TERMO SECUNDÁRIO","INDOPC - INDICADOR DE CORREÇÃO DE PREÇOS DE EXERCÍCIOS OU VALORES DE CONTRATO PARA OS MERCADOS DE OPÇÕES OU TERMO SECUNDÁRIO","DATVEN - DATA DO VENCIMENTO PARA OS MERCADOS DE OPÇÕES OU TERMO SECUNDÁRIO","FATCOT - FATOR DE COTAÇÃO DO PAPEL","PTOEXE - PREÇO DE EXERCÍCIO EM PONTOS PARA OPÇÕES REFERENCIADAS EM DÓLAR OU VALOR DE CONTRATO EM PONTOS PARA TERMO SECUNDÁRIO","CODISI - CÓDIGO DO PAPEL NO SISTEMA ISIN OU CÓDIGO INTERNO DO PAPEL","D"})

        #"Dividir Coluna pela Posição" = Table.SplitColumn(Fonte, "Column1", Splitter.SplitTextByPositions(List.Buffer(#"Lista de Posições"), false), List.Buffer(#"Lista de Campos"))

        in
        #"Dividir Coluna pela Posição"

      • Anonymous commented  ·   ·  Flag as inappropriate

        I've been loading all my cross reference tables into power pivot and making connections there instead of merging, and that has helped a ton with the time it takes for the queries to load and refresh.

      • Anonymous commented  ·   ·  Flag as inappropriate

        Bruno call is legit, and i'v been experiencing this issues since i started working with PQ 2 years ago. This issues are pretty much about Query Folding ( you can read more about that topic here if you not familiar with the concept: http://radacad.com/not-folding-the-black-hole-of-power-query-performance ).
        However, for me it makes no sense avoiding to do merges and other operations with power pivot, when you have complex ETL processes to do and your IT is so drowned in work that can't help you with it. Power query is a tool made to extract, transform and load information and as such it should have a proper calculation engine working under the hood.
        Hopefully Excel team will have time to look about this issue and to get python on excel aswell! :)

      • Anonymous commented  ·   ·  Flag as inappropriate

        Everyone, read @Bruno's comment from October 12th. Very helpful! Simplify your tables as much as possible, avoid query merges whenever possible, use numbers instead of text for your categorical columns and map in your model.

      • Anonymous commented  ·   ·  Flag as inappropriate

        Please! it's surprising how slow refreshes are for ~500 row datasets (coming from Tableau, etc)

      • Chris White commented  ·   ·  Flag as inappropriate

        @Bruno

        Great suggestion here and I want to confirm the issue with queries and the data model. I have a simple query that calls a SQL stored procedure with some parameters. When simply a table in a workbook the query takes 30-40 seconds to refresh and typically returns around 200-300 records. After I add it to the data model, with no other table in it, the query takes more than 2 minutes to refresh. I don't know what is going on here but this alone makes it difficult to use this tool for analysis.

        I'll try disabling updates when I refresh to see what happens, but it definitely seems like a bug as the data model should pause updating until the data is fully loaded.

        Cheers, Chris W.

      • Bruno Crotman commented  ·   ·  Flag as inappropriate

        This is an importante issue. In the last weeks, I've been strugling to deal with this problem, and I think there are some workarounds that can make things a little better. But this is far from good, yet! I THINK IT´S IMPORTANT THAT SOMEONE IN MICROSOFT READ THIS:

        1 - My tests indicate that, at least in my case, its MUCH, MUCH SLOWER to load the PQ queries to the Excel Data Model than to a Worksheet.

        2 - I have some Power Pivot Tables linked to my relatively complex Excel Data Model. These Power Pivot Tables use some complex DAX measures. Clearly, these Power Pivot Tables are updated at EACH query load. If you update 5 PQ Queries, all the PV Pivot Tables are updated 5 times. I've built a VBA script that disables those updates, setting the property ManualUpdate to "True" in the Pivot Table object. This saved me half of the time. I think that it would be easy to correct this BUG.

        3 - We have to be extremely cautious with the our Excel Data Model. Each new column added to the model increases it's complexity. This gets worse when we are talking about columns that have any distinct values. So take some time to clean your model. Eliminate columns that you added "just in case". And, most important, get rid of text columns that you dont use in your final reports. When you refresh your PQ queries, something is updated inside your Excel Data Model, probably many more times than it would be necessary. So keep your Excel Data Model as clean as possible. This helped me very much.

      • Dylan commented  ·   ·  Flag as inappropriate

        Giving up on this...
        Other vendors are solving this problem, and we are not getting replies here...

      • Dylan commented  ·   ·  Flag as inappropriate

        Giving up on this...
        Other vendors are solving this problem, and we are not getting replies here...

      ← Previous 1 3 4 5 6 7

      Feedback and Knowledge Base