Feedback by UserVoice

How can we improve Excel for Windows (Desktop Application)?

Power Query - cache shared nodes

Update Power Query in Excel to take advantage of caching in cases where a parent node refers to a child node that has already been refreshed (as exists in Power BI desktop today).

This issue creates significant performance problems with refresh times when creating highly interdependent financial and operational models. This is a show stopper from a usability and customer acceptance standpoint.
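
As a rough illustration of the request above, the sketch below models queries as a small dependency graph and counts how often each node is evaluated during a "Refresh All", with and without a per-refresh cache. The query names, the `make_refresher` helper, and the graph shape are hypothetical; this is not how Excel or Power BI Desktop actually implement caching.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical query graph: name -> (names it references, function combining them).
Query = Tuple[List[str], Callable[..., int]]


def make_refresher(queries: Dict[str, Query], use_cache: bool):
    """Return a refresh_all() that reports how many times each query was evaluated."""
    evaluations: Dict[str, int] = {name: 0 for name in queries}
    cache: Dict[str, int] = {}

    def evaluate(name: str) -> int:
        # With caching on, a node that was already refreshed is reused as-is.
        if use_cache and name in cache:
            return cache[name]
        deps, fn = queries[name]
        inputs = [evaluate(dep) for dep in deps]
        evaluations[name] += 1
        result = fn(*inputs)
        if use_cache:
            cache[name] = result
        return result

    def refresh_all() -> Dict[str, int]:
        # Each refresh starts from a clean cache and clean counters.
        cache.clear()
        for name in evaluations:
            evaluations[name] = 0
        for name in queries:
            evaluate(name)
        return dict(evaluations)

    return refresh_all


queries: Dict[str, Query] = {
    "Source": ([], lambda: 10),
    "Clean": (["Source"], lambda s: s + 1),
    "ReportA": (["Clean"], lambda c: c * 2),
    "ReportB": (["Clean"], lambda c: c * 3),
    "Summary": (["ReportA", "ReportB"], lambda a, b: a + b),
}

uncached = make_refresher(queries, use_cache=False)()
cached = make_refresher(queries, use_cache=True)()
print(uncached)  # Source is evaluated 6 times, Clean 5 times
print(cached)    # every query is evaluated exactly once
```

In this toy model the cost of the uncached refresh grows with the number of references to a shared node, which is the pattern the idea describes for highly interdependent workbooks.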

1,868 votes

    Josh Blackman shared this idea
    Started  ·  Excel Team [MSFT] (Admin, Office.com) responded:

    Hi all,

    I wanted to share with you that the new query caching mechanism in Excel has been deployed to Office Insiders starting from Excel version 1801 (build 9001.*). At this stage, we would like to allow some “baking” time as we monitor the feature health metrics.

    In this scope, we need your help to ensure a quality release! I encourage you to try the following scenarios and share your feedback:
    1. Run Refresh All on a complex workbook with multiple query dependencies. Does it work faster?

    2. Run Refresh All on a complex workbook with multiple query dependencies. Do you see any issues with your data?

    3. Refresh a single query several times. Do you see any issues with your data?

    - The Excel Team

    276 comments

      • Andrey Minakov commented

        It would definitely be interesting to understand why you make 2 identical requests and not just 1. Of course, that is much better than the current standard release, where you make 4 to 6 identical requests, but why did you decide to make 2 instead of 1?

      • Andrey Minakov commented

        Guys, you have definitely improved web requests, thanks for that! At least from that point of view Excel now works better than PBI Desktop ;-). So now, if I make any number of references to a web query in Excel (loaded to a model, of course), there are NO additional requests to the web source while refreshing the model. And in PBI Desktop I still have additional requests per reference. Just cool for Excel users!

      • Anonymous commented

        I think this is obvious to anyone using PQ along with PowerPivot, but let's just make this distinction really clear:
        1- If you are not using PowerPivot, always make sure your table is not loading to the data model ("load to" window)
        2- If you are using PowerPivot, make sure you ONLY load to the data model the queries that you absolutely need.

        In my case, with a large, complex data set, I made sure to create data model specific final queries where the last step was to "Remove other columns". That serves as a gatekeeper to stop any unnecessary columns from taking up space in my data model.

        Another tip for PowerPivot users: download the PowerPivot Utilities add-in to measure the weight of each column, usage of your columns and metrics, etc. It is a fantastic tool for optimizations.

      • Anonymous commented

        To everyone who's unchecking the Load to Data Model box: the reason it speeds up the refresh is that it's not importing the data into Power Pivot, which of course has its own downsides if you are managing millions of rows and want to generate relationships between the tables.

      • Neil Good commented

        I agree with Darrell Ripkowski below - having Load to Data Model unticked does make it load a lot faster. No idea why, of course! As Laura says, Power Query is very good but I get the feeling MS do not want to go down this road.....

      • Laura commented

        Love the capability of Power Query, but I am a prisoner to the s . l . o . w . . . update time. Yes, I have 5M rows, appended from 25 underlying tables, but it takes an unbelievable amount of time to update. I would abandon it altogether if it had not revolutionized the reporting capability for the Vice Presidents of the company I am contracted at. I am a consultant, so I must do what they ask, and not offer opinions about the time wasted. To them the result is phenomenal, no matter how long it takes. : (

      • Carlos Cortinas commented

        Hi Darrell Ripkowski, can you please tell me where I can find that Store button or checkbox?

        thanks

      • Darrell Ripkowski commented

        Well, I was wondering why Power Query was slow myself, so I tried a couple of things.

        I finally found out that the main cause of the slowness was checking the "Store in data model" box while creating the table.

        If I just loaded to a table and unchecked Load to Data Model, the refresh was drastically faster. I am talking about going from around 5 minutes or more, with Excel almost unusable, to less than a minute to load the new data.

        The data I am refreshing comes from 12 different files on a SharePoint Online server, so it has a lot to gather.

      • Wally Wilinsky commented

        Just so the Microsoft team is aware... I ran an Excel Power Query test and Excel is connecting to that database and retrieving data much more than is needed.

        Setup:
        Excel 2016 64 bit
        Workbook with multiple power queries to a data model
        All data sources are from the same SQL database
        Wireshark installed to monitor network traffic between my PC and the SQL database.

        Result:
        For every query refreshed (either manually or through Refresh All), Power Query did the following:
        1. Queried the database and returned a result set of every table and view I had access to
        2. Queried the database and returned a result set of every stored procedure and user defined function I had access to
        3. Queried the database and returned the calling parameter for the user defined functions I was calling
        4. Queried the database and returned the table structure of the result set my user defined function was going to return.
        5. Queried the database and returned the primary key index names of a seemingly random list of database tables.
        6. Queried the database and returned the version of SQL
        7. And FINALLY executed the query it was designed to execute.

        My question is WHY SO MUCH OVERHEAD??????? I already designed the queries. I understand getting some of this information when I open the query builder, but this was a right-click refresh on the query list in Excel. Is this the root of all the Power Query slowness in Excel?

        If the query is already designed, why can't it just execute the query? If I don't have permission, give me that error.

        Microsoft PLEASE HELP!!!!!!! This is killing my user base.

      • Sam commented

        There is no significant improvement in speed after the update
        (Ver 1809 Build 10820.20006)

      • Ed Hansberry commented

        @Adam - anyone with 1801. The only people that won't ultimately get it are those that purchased a perpetual Office 2016 license.

        It will be in Office 2019 this fall too.

      • Adam Bender commented

        Hello,

        In an earlier response you indicated that this was available for users with version 1801 and beyond.

        Does that only apply to Office Insiders, or anyone with that version? I am on the semi-annual channel.

        Thanks,

        Adam

      • Chad kukorola commented

        @Neil Good, if you can use the incremental refresh capabilities that have recently been released, then using Azure would allow you to set up a much better process for bigger models.

        Check out https://docs.microsoft.com/en-us/power-bi/service-premium-incremental-refresh or related articles. I believe it’s still in preview and only for Power BI Premium.

        I don’t/can’t use it currently, so I am left with workarounds for my larger models.

      • Neil Good commented

        Hello!
        Is there any way working with Azure can alleviate this issue? Lots of people seem to have CSV files which need to be imported/supplemented so would going via Azure sort this out?

      • Adam Bender commented

        @Ed - thanks for the response. I think PQ is an amazing tool. We have started heavily integrating it into our finance and accounting operations and would not be able to do some of the things we do without it.

        I only wish MSFT had built PQ into Excel a decade ago. Keep improving it - this is where things are going!

      • Chad kukorola commented

        I work almost exclusively with .csv files, pulling them in from file and appending.

        The two algorithms, run-length encoding and dictionary encoding, consume the time on refresh. Each time the model is refreshed, it must run these again. For example, I have a model with a 3-million-record fact table and a few dimensions (one using the same source as the fact table). Refresh time is about 12-13 minutes.

        Now, you need to know it’s 46 fields in the fact table (needed), and the processor is very old. On my newer machines with a newer processor, it’s a little more than half that time.

        The compression and mapping algorithms must run regardless (in Excel it auto-partitions at 1 million records), so then it’s the hardware that matters: a fast processor, a large processor cache, and RAM as fast as you can buy.

        Until we can legit partition when needed, those are your bottlenecks on refresh. However, bottom line; WOW, what a great capability regardless!
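
As a rough illustration of the two encodings named in the comment above, here is a toy sketch of dictionary encoding followed by run-length encoding on a single column. The function names and sample column are hypothetical, and this is not the data model engine's actual implementation; it only shows why both passes have to touch every value on each refresh.

```python
from typing import Dict, List, Tuple


def dictionary_encode(values: List[str]) -> Tuple[List[str], List[int]]:
    """Replace each distinct value with a small integer code."""
    ids: Dict[str, int] = {}
    dictionary: List[str] = []
    codes: List[int] = []
    for value in values:
        if value not in ids:
            ids[value] = len(dictionary)
            dictionary.append(value)
        codes.append(ids[value])
    return dictionary, codes


def run_length_encode(codes: List[int]) -> List[Tuple[int, int]]:
    """Collapse consecutive repeats into (code, run length) pairs."""
    runs: List[Tuple[int, int]] = []
    for code in codes:
        if runs and runs[-1][0] == code:
            runs[-1] = (code, runs[-1][1] + 1)
        else:
            runs.append((code, 1))
    return runs


def decode(dictionary: List[str], runs: List[Tuple[int, int]]) -> List[str]:
    """Expand the runs back into the original column."""
    return [dictionary[code] for code, length in runs for _ in range(length)]


column = ["East", "East", "East", "West", "West", "East"]
dictionary, codes = dictionary_encode(column)
runs = run_length_encode(codes)
print(dictionary)  # ['East', 'West']
print(runs)        # [(0, 3), (1, 2), (0, 1)]
```

Both passes are linear in the number of rows per column, so in this simplified view a wide fact table (46 fields in the example above) multiplies the work accordingly.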

      • Ed Hansberry commented

        @nickolas, it depends heavily on your data source. If it is a bunch of CSV files, then those are pulled in. If they are on SharePoint/OneDrive, that can take even longer. If it is SQL Server, then it depends on how efficient your queries are and whether they take advantage of query folding. This is one of those "it depends" cases, based on the source (of which Power Query supports several dozen) and how efficiently you are using it.
