Large Data Sets (3/3)

Now that we have discussed paging and sorting, let’s talk about making the experience a little better for your user. Our first step is to extend the paging solution so that the DataProvider looks ahead when the user is approaching the end of a page. After that we’ll improve the performance of the DataGrid by turning off “live scrolling” so that the DataGrid doesn’t try to update while the user is navigating. Finally, we’ll build on that DataGrid to provide context information to the user so they can navigate more accurately. The examples can be found here and are a superset of what we did in articles 1 and 2.


Lookahead

The paging solution introduced in the first article is useful in many situations: it reduces the amount of data put on the wire at once and lets the user interact with your app sooner. However, it’s possible to improve things further by loading pages before the user needs them. When the DataProvider notices that someone has requested an item within a certain threshold of the edge of a page, it goes ahead and loads the adjacent page. By the time the user actually needs that page it will already exist, having been brought down in the background. This solution is great when a user moves through a list sequentially, looking at a few records at a time. However, if the usage model involves a lot of random access, this technique may not be appropriate, as you won’t gain much by downloading a page the user is unlikely to see.

Example 6 shows the lookahead solution using a new DataProvider implementation called LookaheadPagingDataProvider. Much of the code is the same as SimplePagingDataProvider; there has just been some refactoring and the addition of the lookahead logic. If you run the example you’ll see that I log whether a page is loaded because of a Request (the user wants to view the data now) or because of Lookahead. Lookahead works at both ends of a page: if you’re near the top of a page we’ll look to load the previous one, and if you’re near the bottom we’ll look to load the next one.
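To make the threshold check concrete, here is a minimal sketch of the lookahead decision. It is written in TypeScript rather than the ActionScript of the actual example, and every name in it (LookaheadPager, requestItem, threshold, the log array) is hypothetical; it is not the LookaheadPagingDataProvider source, and the remote call is replaced by a simple "loaded" flag.

```typescript
// Hypothetical sketch; names are illustrative, not the LookaheadPagingDataProvider source.
type LoadReason = "Request" | "Lookahead";

interface PageLoad {
  page: number;
  reason: LoadReason;
}

class LookaheadPager {
  // Record of every page load and why it happened.
  readonly log: PageLoad[] = [];
  private loaded: boolean[] = [];
  private totalItems: number;
  private pageSize: number;
  private threshold: number; // distance from a page edge that triggers lookahead

  constructor(totalItems: number, pageSize: number, threshold: number) {
    this.totalItems = totalItems;
    this.pageSize = pageSize;
    this.threshold = threshold;
  }

  // Load a page once; out-of-range or already-loaded pages are ignored.
  private load(page: number, reason: LoadReason): void {
    const pageCount = Math.ceil(this.totalItems / this.pageSize);
    if (page < 0 || page >= pageCount || this.loaded[page]) return;
    this.loaded[page] = true; // a real provider would issue the remote call here
    this.log.push({ page: page, reason: reason });
  }

  // Called whenever the grid asks for the item at `index`.
  requestItem(index: number): void {
    const page = Math.floor(index / this.pageSize);
    const offset = index % this.pageSize;
    this.load(page, "Request");
    // Near the top of the page: look for the previous page.
    if (offset < this.threshold) this.load(page - 1, "Lookahead");
    // Near the bottom of the page: look for the next page.
    if (offset >= this.pageSize - this.threshold) this.load(page + 1, "Lookahead");
  }
}
```

With a page size of 10 and a threshold of 2, asking for item 5 loads only page 0 as a Request, while asking for item 8 also loads page 1 as a Lookahead. A small threshold keeps wasted downloads low; a larger one gives the background load more time to finish before the user arrives.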

Live Scrolling

One issue when using a DataGrid with lots of data is that it may be slow to scroll when using the scrollbar. Remember that every time the DataGrid wants to display data it calls getItemAt on the DataProvider, and if you’ve looked at the source code you can see that there’s a lot more logic than simply returning an element in an array. This means that there’s a lot going on as you move the scrollbar “thumb” around, and performance can degrade. If you’ve played with the Flex layout controls you might have noticed that the DividedBox classes have a property called “liveDragging.” When this is true, the UI attempts to re-lay out everything as the divider is dragged around. When it is false, the UI waits to re-lay out until the divider has been released, making the dragging much faster. We can achieve noticeable performance gains if we apply this same logic to the DataGrid.

Example 7 is essentially the same as Example 6 except that I’ve created a subclass of DataGrid (ingeniously called FasterDataGrid) that allows you to disable live scrolling. Note that this implementation is only really possible because I have access to the source code, so I could see what I needed to do. It is probably not the best way to achieve this effect, but rest assured we are looking into adding this capability to our controls in a future release so you don’t have to. With that in mind, please note that this is “unsupported” and the underlying code it relies on may change at any time. Consider yourself warned if you use this in your own code.
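The idea behind turning off live scrolling can be sketched independently of the DataGrid internals. The TypeScript below is only an illustration under stated assumptions: ScrollModel, thumbMoved, thumbReleased and renderCount are hypothetical names, not part of FasterDataGrid or the Flex API, and the render method merely counts how often rows would have to be redrawn.

```typescript
// Hypothetical sketch of deferring redraws until the scrollbar thumb is released.
class ScrollModel {
  liveScrolling = true;
  renderCount = 0;      // how many times the rows were redrawn
  renderedPosition = 0; // scroll position currently shown
  private pendingPosition: number | null = null;

  // Called repeatedly while the thumb is being dragged.
  thumbMoved(position: number): void {
    if (this.liveScrolling) {
      this.render(position);
    } else {
      // Remember only the latest position; skip the expensive redraw.
      this.pendingPosition = position;
    }
  }

  // Called once when the user lets go of the thumb.
  thumbReleased(): void {
    if (this.pendingPosition !== null) {
      this.render(this.pendingPosition);
      this.pendingPosition = null;
    }
  }

  private render(position: number): void {
    this.renderedPosition = position;
    this.renderCount++; // stands in for the getItemAt calls needed to redraw rows
  }
}
```

With liveScrolling set to false, three thumb movements followed by a release produce a single redraw at the final position instead of three redraws along the way.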

ToolTips

The last improvement that I want to make is strictly for usability. When you’re dealing with a large amount of data it can be difficult to know where you are in relation to the rest of the data. If you’ve used recent versions of Microsoft Word you may have seen a feature where, as you drag the scrollbar, it shows you the page number and perhaps a heading to indicate where in the document you would land should you release the scrollbar there. I’ve tried to apply that idea to my last example. In our case we’ll take the field that was used for sorting and create a complete list of values for that field that can be displayed when the user is moving the scrollbar. Since we’re only trying to load one field, we don’t have to worry about creating thousands of complex objects; in our case it’s just Strings. In my experiments I was able to bring down all 20,000 values for the sort keys in approximately 8 seconds. That’s a lot faster than bringing down complete records, especially if you consider that a user will take a few seconds to think and process what they’re seeing before moving the scrollbar.

In order to allow user interaction while downloading, I page the sorted key list, but I use a much larger page size (7500 worked OK for me on my machine). I also don’t wait for the user to need a page from this list; as soon as the first page loads we go ahead and request the next one. Example 8 shows the implementation, with SortKeyProvider.as and ToolTipDataGrid.as being the important files. You may want to play with the SortKeyProvider pageSize attribute to see what works best for you, as the interface does pause while the keys are being read in.
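As a rough illustration of the key list, the TypeScript sketch below pages in sort keys eagerly and answers tooltip lookups from whatever has arrived so far. It is not the code in SortKeyProvider.as or ToolTipDataGrid.as: SortKeyList, FetchKeys, loadNextPage, tipFor and the "Loading..." placeholder are all hypothetical, and the fetch is synchronous here purely to keep the example short.

```typescript
// Hypothetical sketch; FetchKeys stands in for the remote call that returns sort keys.
type FetchKeys = (start: number, count: number) => string[];

class SortKeyList {
  private keys: string[] = [];
  private total: number;
  private pageSize: number;
  private fetchKeys: FetchKeys;

  constructor(total: number, pageSize: number, fetchKeys: FetchKeys) {
    this.total = total;
    this.pageSize = pageSize;
    this.fetchKeys = fetchKeys;
  }

  // Fetch the next page of keys; returns true while more pages remain.
  loadNextPage(): boolean {
    if (this.keys.length >= this.total) return false;
    const count = Math.min(this.pageSize, this.total - this.keys.length);
    this.keys = this.keys.concat(this.fetchKeys(this.keys.length, count));
    return this.keys.length < this.total;
  }

  // Text to show in the tooltip for the row at `index`.
  tipFor(index: number): string {
    const key = this.keys[index];
    return key !== undefined ? key : "Loading...";
  }
}
```

Calling loadNextPage in a loop until it returns false mirrors the "request the next page as soon as the previous one loads" behaviour; in a real client each call would instead be triggered from the previous page’s result handler so the interface can respond between pages.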

Conclusion

Many applications use a large amount of data, and unfortunately there isn’t a magic bullet for making everything ready the moment the user wants it. A common first step in working with large data is to retrieve it in pages. This allows the user to see data immediately and gives you the opportunity to download more in the background or simply wait until the user actually wants to see it. The user will often want to move around the data even if it is not all on the client, so it is up to you as the developer to provide appropriate navigation metaphors. One simple example is putting up a tooltip as the user scrolls so they can see what will be loaded when they stop. Remember, the user only needs the information that they can see. If you can deliver visible information quickly, you will have the opportunity to download the details later in the process, when the user doesn’t mind a small delay.

This concludes the current series of discussions on large data sets. The topic will continue to come up regularly since I focus on the data side of Flex, but I do think about other things and would like to discuss them as well. Hope you’ll stay with me!

19 Responses to Large Data Sets (3/3)

  1. Al Choudhury says:

    Running the examples I’ve noticed an issue with scrolling on the “Lookahead” page reader. If the mouse scroll is used to scroll down the grid, occasionally, the “fetch next chunk” trigger is not fired or the fetched chunk is not rendered and the user is left with the impression that there is no more data. It’s happened to me a few times now, but intermittently, and I’m unable to consistently reproduce it, any thoughts? Previous article examples have never exhibited this phenomenon, just this recent article. Regards Al Choudhury

  2. Matt says:

    Hmm, I didn’t really do much testing using the mouse wheel. Do you see it if you use the keyboard or press on the scroll bar arrows? You can add debugging back in by catching the miss and load events of the data provider to see if it’s firing a miss but not firing a load. How fast are you scrolling the wheel? Perhaps the lookahead logic is still retrieving a page when the datagrid wants it for real and that’s somehow causing a conflict? I have to admit since I’m just trying to show the concepts I won’t be able to spend time debugging this unless we try to integrate this logic for real.

  3. Kai Langenbach says:

    Hi Matt. Thanks for this great article about large data sets. It helped me to build a paged DataGrid in MX 04 with lookahead, sorting and filters. Unfortunately there’s no ToolTip class in Flash, but maybe I’ll write my own. Best regards, Kai Langenbach

  4. Prismix Blog says:

    Contact List (Part 1 – Exploring Large Data Sets)

    Macromedia’s Matt Chotin in his three part series about Large Data Sets in Flex is an informative hands-on tutorial about handling enterprise-level quantities of data between the Flex presentation tier and a services tier. We have started testing…

  5. Rory Douglas says:

    Hi there, I must be kind of slow, but I really can’t find Slider.swc, as required by some of the example code included with this thread. I’ve looked in the samples.war that came with Flex, and I still can’t find it. Any help would be appreciated. Thanks! Rory

  6. Rory Douglas says:

    Apologies, please ignore above comment. I changed to and updated my app with Updater 2 stuff, all works fine. Thanks.

  7. Adrian says:

    It would have been nice if flex had something like the cfoutput tag that has currentrow,startrow and maxrow.

  8. Mark Wilson says:

    I have the samples working fine but for our users the performance when scrolling or doing page up/down is just too slow. The effect is worse when viewing full screen rows. This spoils an otherwise good product. Our issue is not with lookahead or prefetch etc., that’s all great. Once all data has been retrieved just paging up/down the list is too slow. Turning off live scrolling etc. is trying to work around the issue. If we are to deliver a true RIA we really need that paging to be quick. Are there any plans to address this in future releases?

  9. Matt says:

    We’re always trying to improve performance. Some of the Maelstrom enhancements that we sneaked at MAX will address this as well as whatever other optimizations we try.

  10. Mark Wilson says:

    Sounds hopeful, look forward to trying them out, any rough ideas when this will be available?

  11. Matt says:

    Nope 🙂 1.5 JUST released

  12. Mark Wilson says:

    Should give them plenty of time to get it right then 😉 here’s hoping.

  13. Mark Wilson says:

    Interestingly paging down through a pre-loaded list using the page down key works fine. However, using the page up key is much slower. If you take one of the samples, page down several times to load a reasonable amount of data, then press page up a few times (rapidly) the grid can take a few seconds to catch up. Doing the same with the page down key is fine. Any ideas? Is this a known “feature” of the current grid implementation, something you’ve come across?

  14. Matt says:

    I don’t remember seeing it. Are you using the example code I posted for 1.5 around MAX?

  15. Mark says:

    I’m using code from http://www.markme.com/mchotin/files/data-4.zip Although the effect is the same for any grid. I am using Flex 1.5, Flash 7,0,19,0 non-debug, trace disabled, production mode.

  16. Matt says:

    Well, you can try http://www.markme.com/mchotin/files/data-5.zip since that’s the version updated for 1.5 (1.4 is still meant for 1.0). Not sure if you’ll see improvements, I’m not sure why it would act the way you’re describing.

  17. Mark says:

    Hi, same thing. I had Andrew Shorten from Macromedia UK office here today. Trying the same code on his laptop (T21), it didn’t appear to suffer as much but page up was still slower. We suspected it may have been the PC hardware I was using. However, running it at home on a 3G P4 with GeForce FX 5200 with 128 MB video memory and I still have the same issue. Very odd. Like I said page down is fine. Going down the grid you can press the down key as quick as you can (not keeping it down) and it keeps up fine. Do the same thing going up and it soon hangs for a few seconds while it catches up. Interestingly using the scroll bar even with live scrolling switched on is faster!

  18. Mark says:

    Andrew has build 7,0,35,29 installed compared to 7,0,19,0 that I got from the download site. Don’t know if that would make a difference?

  19. Matt says:

    I really don’t know and don’t have much time to play around with it right now. Sorry…