Loading Aggregate Data in Yukon 2005
In some cases, the source data is pre-aggregated in one, some or all dimensions. In the simplest example, employee salary is loaded to all employees. Because a manager’s salary is not the aggregate of his reports, it should not be aggregated up the employee dimension. There are several other such examples. This document discusses how pre-aggregated data can be loaded directly into Analysis Services 2005.
Note: the approach described here applies only to Analysis Services 2005 and not earlier releases.
Introduction
In Analysis Services, an assumption is usually made about the content of members. Leaf members contain data derived from underlying data sources; nonleaf members contain data derived from aggregations performed on child members.
This is not always true. The example above is one such – the salary of a manager is not the sum of his employee’s salaries. Analysis Services introduced datamembers – each non-leaf member has its own system-generated datamember. By default, datamembers containing data are aggregated long with the other children of the member. This behavior can be changed with calculations.
Loading data to non-leaf members
Data for non-leaf members is not always the aggregate of its children. Examples include salary, sales, evaluation scores, whatever. This behavior is achieved by replacing a member’s data with its datamember’s data as follows:
Calculate;
this =
[Employee].[Employee].currentmember.datamember;
And if there are two such dimensions where a member’s datamember contains the value for the member, add another similar statement:
Calculate;
this =
([Employee].[Employee].currentmember.datamember, [Channel].[Channel].currentmember.datamember);
This approach can be continued to scenarios where all aggregates are stored in the source database and where Analysis Services is used for its metadata and client application support (eg, aggregates already calculated for other purposes).
About recursion:
Experienced MDXers may wonder what happens when the currentmember is a datamember and the above assignment assigns the value of a cell to itself. In Analysis Services 2000, this generated an error. In Analysis Services 2005, the engine is a bit smarter. The engine resolves recursion by using the previous value of the cell on the right-hand side of the statement to determine the left-hand side. Things just work.
Conditional Substitution
Just as common as always loading data to non-leaf members is the conditional substitution of such data. For example, a budget might include an estimate for a full year’s revenue even if revenue exists for the first few months of the year; or a manager’s budget can replace the values of his supervisors. In this situation, the data loaded to a datamember substitutes for the member if such data exists, otherwise data aggregates normally.
For example, consider a product parent-child hierarchy ([Product].[Parent Product] where budget is made at any level in the hierarchy. If a budget value exists for a product, it is the budget for that product even if some other budget estimates was made for children of the product. This is done as:
Calculate;
this =
iif (not isleaf([Product].[Parent Product].currentmember),
iif( not isempty([Product].[Parent
Product].currentmember.datamember),
[Product].[Parent Product].currentmember.datamember,
Aggregate(
[Product].[Parent Product].currentmember.children )
),
measures.CURRENTMEMBER
);
The assignment replaces non-leaf members with their datamember only if the datamember’s value is not empty, otherwise the parent is populated with the aggregate of its children.
Advanced
note: One might wonder – why is
the aggregate function required? Why can’t the expression read as follows:
Calculate;
this =
iif (not isleaf([Product].[Parent
Product].currentmember),
iif( not isempty([Product].[Parent
Product].currentmember.datamember),
[Product].[Parent
Product].currentmember.datamember,
measures.CURRENTMEMBER
),
measures.CURRENTMEMBER
);
(Where the change is
in bold). This is simpler, but
it’s wrong. The measures.currentmember causes
recursion to kick in and goes to the prior pass. The prior pass does not
contain the assignment and data in any child datamembers
will be aggregated along with its siblings.
And if there are more than one such hierarchies in the cube, the statements can simply be stacked on top of one another. Say the budget is made by distribution channel as well. Then the script becomes:
Calculate;
this =
iif (not isleaf([Product].[Parent Product].currentmember),
iif( not isempty([Product].[Parent
Product].currentmember.datamember),
[Product].[Parent Product].currentmember.datamember,
Aggregate( [Product].[Parent Product].currentmember.children
)
),
measures.CURRENTMEMBER
);
this =
iif (not isleaf( [Channel].[Parent Channel].currentmember),
iif( not isempty([Channel].[Parent
Channel].currentmember.datamember),
[Channel].[Parent Channel].currentmember.datamember,
Aggregate(
[Channel].[Parent Channel].currentmember.children )
),
measures.CURRENTMEMBER
);
Without Parent-Child Hierarchies
Up until now, everything has been done with
parent-child hierarchies. The automated generation of the datamembers is simple
and there are built-in mdx functions to navigate to the datamember from the
parent member. But parent child hierarchies have some disadvantages. Most
importantly, parent-child hierarchies are not included in aggregations. For
small hierarchies, this is not important. But if the hierarchy is large, using
a parent child hierachy might result in a performance hit one is unable or
unwilling to accept. As welll, morphing your normal hierarchy to a parent-child
hierarchy is a pain and adds to adminitration costs.
The same approach is possible with regular
hierarchies (ie, not parent-child), but some extra work is required. Extra
members have to be created to substitute for the datamembers, these members
have to be hidden, and the expressions have to be modified to get at these
substitute datamembers (because the datamember is no longer available).
The common approach for generating these
members is to create members with the same name at every level below the level
at which the member appears as in the diagram below:

The green members represent the “user datamembers”
(which will be the term used to refer to them henceforth) created to load the
data.
To hide these user datamembers, set the
property “HideMemberIf” on each level to “ParentName”. These members will
vanish. (HideMemberIf accepts other values to indicate when a child is hidden.
The complete set of value are Never, OnlyChildWithNoName,
OnlyChildWithParentName, NoName,
ParentName. One of these
others can be used so cities like
Where parent-chilld hierarchies are used, the
“Datamember” mdx function was used to get at each member’s datamember. No such
function is available with user datamembers, so an alternative approach is
required. One simple approach is to use the strtomember function to construct
the user datamember name from the current member’s name. If the name of each
user datamember is unique on its level, the user datamember for each data
member is :
STRTOMEMBER(Customer.Geography.City.Level.UniqueName + "." + Customer.Geography.CurrentMember.Name)
(There are many different approaches for this
should the name uniqueness rule not hold up…)
Summary
Loading aggregate data is a common
requirement from non-trivial cubes. Simple mdx allows this to be done to allows
replace a member’s data with the source data in one or many dimensions, or to
do it conditionally based on the existence of the data.