Tuesday, June 19, 2018


Essbase Optimization

One of the reasons Essbase optimization can be somewhat tricky, to put it mildly, is that Essbase cube performance is almost directly linked to the design of the cube, i.e. the dimensions, hierarchies and members in each dimension, stored vs. dynamic members, etc. This is unlike other (relational) systems, where there are general technical items to check for optimization which have nothing to do with the data in the system (except perhaps the amount of data).

In Essbase the data dictates the cube structure, and the cube storage works in blocks and indexes, so every cube is different and has different performance implications. Essbase is also unusual in that simply increasing hardware specifications (specifically memory and CPU power), while bringing some improvement as a matter of course, will not yield as dramatic a performance gain as changing the cube design to be optimal for the data set used.

Please note this is applicable to block storage cubes.

The main items on the Essbase optimization checklist:

Block size

A large block size means bigger chunks of data are pulled into memory with each read, but it can also mean more of the data you need is already in memory IF your operations are done mainly in-block. Generally we prefer smaller block sizes, but there is no hard rule. The Essbase Admin Guide recommends blocks between 1 KB and 100 KB, though nowadays, with more memory on servers, they can be larger. The right block size depends entirely on the actual data density in the cube. Do not be afraid to experiment with dense and sparse settings to get to the optimal block size; we have done numerous cubes with just one dense dimension (typically a large account dimension), and cubes where neither the account nor the time dimension is dense.
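
As a rough worked example (the member counts here are hypothetical): block size is the number of stored dense cells multiplied by 8 bytes, so a cube whose only dense dimensions are Accounts with 400 stored members and Period with 13 stored members gives

    400 x 13 cells x 8 bytes = 41,600 bytes, i.e. roughly 41 KB per block

which sits comfortably within the range above.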

Block density

This gives an indication of the average percentage of each block that contains data. In general data is sparse, so a value over 1% is actually quite good. If your block density is over 5%, your dense/sparse settings are generally spot-on. A large block with high density is fine, but large blocks with very low density (< 1%) are not.
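
Continuing the hypothetical example above: that 41 KB block holds 400 x 13 = 5,200 cells, so if on average about 100 of those cells actually contain data, the block density is 100 / 5,200, or about 1.9%, which by the rule of thumb above is perfectly healthy.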

Cache settings

Never, ever leave a cube with the default cache settings. Often a client complains about Essbase performance, and sure enough the cache settings turn out to be the defaults, which are never enough (except for a very basic cube). The rule of thumb is to see if you can get the entire index file into the index cache, and to make the data cache three times the index cache, or at least some significant size. Also check the cube statistics for the hit ratio on the index and data caches; this indicates what percentage of the time the data being searched for is found in memory. For the index cache this should be as close to 1 as possible, and for the data cache as high as possible.
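
As a sketch of how these caches are typically set (the Sample.Basic database name and the sizes are purely illustrative, so size them against your own index file), the MaxL statements look something like this:

    alter database Sample.Basic set index_cache_size 256MB;
    alter database Sample.Basic set data_cache_size 768MB;

Here the index cache is sized to hold the whole .ind file and the data cache is roughly three times that, in line with the rule of thumb above.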

Outline dimension order

Remember the hourglass principle. This means ordering the dimensions in your outline as follows: first the largest (in number of members) dense dimension, then the next largest dense dimension, and so on down to the smallest dense dimension. Then put the smallest sparse dimension, then the next smallest, continuing up to the largest sparse dimension. Because of the way the Essbase calculator works, this arrangement minimizes the number of passes through the cube. A variation that also seems to work well is the hourglass on a stick, where all non-aggregating sparse dimensions (e.g. years, version, scenario) go beneath the largest sparse dimension.
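
A hypothetical outline following the hourglass-on-a-stick shape might look like this (the dimensions and member counts are made up purely to show the ordering):

    Accounts     (dense, 400 members)
    Period       (dense, 13 members)
    Cost Centre  (sparse, 150 members)
    Product      (sparse, 2,000 members)
    Customer     (sparse, 15,000 members)
    Year         (sparse, non-aggregating)
    Scenario     (sparse, non-aggregating)
    Version      (sparse, non-aggregating)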

Commit Block settings

This controls how often blocks in memory are written to disk while loading or calculating a cube. You want to minimize disk writes, as they take up a lot of processing time, so set this quite high. The default setting is 3,000 blocks; if your block size is relatively small (< 10 KB), make it much higher, say 20,000 to 50,000. This setting alone can produce dramatic performance improvements, specifically on Calc All operations and cube loads.
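
Assuming the same illustrative Sample.Basic database as before, the commit block setting can be changed with a MaxL statement along these lines (the figure of 30000 is only an example for a cube with small blocks):

    alter database Sample.Basic set implicit_commit after 30000 blocks;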

Use of FIX..ENDFIX in calc scripts or in BRs

One of the most common and most misunderstood mistakes in calc scripts is the usage of FIX..ENDFIX. Always FIX first on sparse dimensions only, and then within this FIX again on dense members, or use dense members in IF statements inside the FIX. The reason is that FIXing only on sparse members first filters down to specific blocks, which is faster than trying to fix within blocks (i.e. on dense members).
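
A minimal calc script sketch of the pattern (all member names are made up, and it assumes Scenario, Year and Entity are sparse while Accounts and Period are dense):

    /* Outer FIX on sparse members only - this narrows the calculation to specific blocks */
    FIX ("Budget", "FY18", @RELATIVE ("Entity", 0))
        /* Inner FIX on dense members - this works within the blocks already selected */
        FIX ("Jan":"Jun")
            "Net Income" = "Revenue" - "Expenses";
        ENDFIX
    ENDFIX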

Optimizing data loads

The best technique for making large data loads faster is to have the optimal order of dimensions in the source file, and to sort it optimally. To do this, order the fields in your source file (or SQL statement) so that the first field is your largest sparse dimension, the next field your next largest sparse dimension, and so on. If you are using the hourglass dimension order, your data file should therefore list dimensions from the bottom dimension upwards. Your dense dimensions should always be last, and if you have multiple data columns these should be dense dimension members. Then sort the data file in the same order, i.e. by largest sparse dimension, then next largest sparse dimension, etc. This causes blocks to be created and filled with data in sequence, making the data load faster and the cube less fragmented.
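
Using the hypothetical outline from the dimension order section above, the load file layout would be something like:

    Customer, Product, Cost Centre, Year, Scenario, Version, Accounts, Jan, Feb, ..., Dec

with the records sorted first by Customer, then by Product, then by Cost Centre and so on, so that each block is created once and filled in a single pass.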

These are just the initial, general optimization points that can deliver huge performance improvements without too much effort; in our experience they handle about 70% of optimization issues.

With Warm Regards
SST.!
