Sitecore Analytics Double Page Removal
To go along with this week's posts on Sitecore DMS, I wanted to touch on a somewhat unusual scenario. Imagine your site makes a lot of AJAX postbacks. Your analytics reports will be filled with the same page name over and over, one entry for each call. In some scenarios it is useful to see every one of these page calls, but in others marketing just wants to see how long a user spent on the page and which events occurred while they were there.
To get there, we first need to understand a couple of things. First, time spent on a page is the next page's timestamp minus that page's own timestamp; in short, the length of time between the two timestamps. So as long as we keep the very first page of a run of doubles, we still have the initial timestamp and the time-spent data stays intact. Second, page calls may also carry events, so these need to be handled too: we grab all events from the page we are about to remove and add them to the initial page.
As for the code, I call it last on my main layout so that it picks up any events registered earlier in the request.
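
Before getting to that code, the timestamp reasoning can be checked with a small standalone model. This is only a sketch: `PageView`, `TimeOnPage`, `Durations` and `CollapseRepeats` are hypothetical names made up for illustration and are not Sitecore types. It assumes time on page is simply the gap to the next recorded page.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for one tracked page view; not a Sitecore type.
public sealed class PageView
{
    public string ItemId { get; set; }
    public DateTime Timestamp { get; set; }
}

public static class TimeOnPage
{
    // Time on a page = next page's timestamp minus this page's timestamp.
    // The last page of a visit has no successor, so it gets no duration here.
    public static List<KeyValuePair<string, TimeSpan>> Durations(IList<PageView> views)
    {
        var result = new List<KeyValuePair<string, TimeSpan>>();
        for (int i = 0; i + 1 < views.Count; i++)
        {
            TimeSpan gap = views[i + 1].Timestamp - views[i].Timestamp;
            result.Add(new KeyValuePair<string, TimeSpan>(views[i].ItemId, gap));
        }
        return result;
    }

    // Keep only the first view in each run of consecutive views of the same item.
    public static List<PageView> CollapseRepeats(IEnumerable<PageView> views)
    {
        var kept = new List<PageView>();
        foreach (PageView view in views)
        {
            if (kept.Count == 0 || kept[kept.Count - 1].ItemId != view.ItemId)
            {
                kept.Add(view);
            }
        }
        return kept;
    }
}
```

Take a visit that loads page A at 10:00:00, fires AJAX calls against A at 10:00:20 and 10:00:45, then moves to page B at 10:01:30. The raw rows give A three durations of 20, 25 and 45 seconds. After `CollapseRepeats`, only the first A row remains and `Durations` reports a single 90-second stay on A, which is the same total. That is why keeping the first row of a run preserves the time-spent figure.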
public static void RemoveDoubleTrack()
{
    // check if current page and previous page are the same. If so remove page.
    if (Settings.Analytics.Enabled && Sitecore.Context.Site.EnableAnalytics)
    {
        VisitorDataSet.PagesRow currentPage = Tracker.Visitor.CurrentVisit.CurrentPage;
        if (currentPage == null)
        {
            return;
        }
        Item item = Sitecore.Context.Database.GetItem(new ID(currentPage.ItemId));
        if (item == null)
        {
            return;
        }
        VisitorDataSet.PagesRow prevPage = Tracker.Visitor.CurrentVisit.PreviousPage;
        if (prevPage == null)
        {
            return;
        }
        if (Tracker.Visitor.CurrentVisit.CurrentPage.ItemId != prevPage.ItemId)
        {
            return;
        }
        foreach (VisitorDataSet.PageEventsRow pageEventsRow in Tracker.Visitor.CurrentVisit.CurrentPage.PageEvents)
        {
            RegisterPageEvent(pageEventsRow.Name, pageEventsRow.Text, pageEventsRow.Data, pageEventsRow.DataKey, prevPage);
        }
        // cancel page
        Tracker.CurrentPage.Cancel();
    }
}
RemoveDoubleTrack runs several checks before it does any real work. First, we make sure that Analytics is enabled globally and for the site context we are calling from. Then we check that there is a current page and that the page resolves to an item. The item check isn't strictly necessary; it is more of a choice about what you want covered by the double-track check. Finally, we make sure there is a previous page record and compare the current and previous pages by item ID. If all of these pass, we loop through the current page's events and register them on the previous page. The RegisterPageEvent helper is covered in my previous article. Last, we cancel tracking of the current page. Even though Analytics has been building a new record throughout the page request, it does not fully commit until page load completes, so we can still cancel at this point.
In summary, this is a pretty simple fix to eliminate double page views in Analytics. However, having grown up in a development world where being your own DBA was just part of the job, I have to protest a little and say that it is much better to retain all data. If I had a second go at this, I might look at modifying the analytics reports to filter out double page data instead. That way the data would always be there if you ever needed to look at it a different way.
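
To make the report-side alternative a little more concrete, here is a rough sketch of what such a filter could look like. It is not tied to the actual DMS report queries: `TrackedPage`, `ReportRow` and `ReportFilter.Collapse` are hypothetical names invented for this example, and it assumes the stored rows arrive ordered by timestamp within a single visit. The stored rows are left untouched; only the view handed to the report is collapsed.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical shape of a stored page row; not a Sitecore type.
public sealed class TrackedPage
{
    public string ItemId { get; set; }
    public DateTime Timestamp { get; set; }
    public List<string> Events { get; private set; }

    public TrackedPage()
    {
        Events = new List<string>();
    }
}

// Hypothetical row shown in a report after collapsing repeats.
public sealed class ReportRow
{
    public string ItemId { get; set; }
    public DateTime FirstSeen { get; set; }
    public int Hits { get; set; }
    public List<string> Events { get; private set; }

    public ReportRow()
    {
        Events = new List<string>();
    }
}

public static class ReportFilter
{
    // Collapse each run of consecutive rows for the same item into one report
    // row. The first row's timestamp is kept, events from the repeats are
    // merged in, and the source rows themselves are not modified.
    public static List<ReportRow> Collapse(IEnumerable<TrackedPage> pages)
    {
        var rows = new List<ReportRow>();
        foreach (TrackedPage page in pages)
        {
            ReportRow last = rows.Count > 0 ? rows[rows.Count - 1] : null;
            if (last != null && last.ItemId == page.ItemId)
            {
                last.Hits++;
                last.Events.AddRange(page.Events);
                continue;
            }

            var row = new ReportRow { ItemId = page.ItemId, FirstSeen = page.Timestamp, Hits = 1 };
            row.Events.AddRange(page.Events);
            rows.Add(row);
        }
        return rows;
    }
}
```

The `Hits` count is one reason to prefer this direction: the number of AJAX calls per page stays available to anyone who wants it, while the default report still shows one row per page with all of its events.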