Friday 27 March 2009

Memory Usage and OS differences

The work that we have done so far on reducing memory usage has concentrated on the Windows platform, but we also run our products on UNIX (AIX & HP) and z/OS.

Each platform is different with respect to when memory is allocated, so an understanding of this is important when trying to reduce memory usage, especially if your code will operate on different platforms.

The examples that we are using relate to generated C or COBOL code, not to other environments like Java or .NET.

For C code, the storage for local views is allocated when the module is loaded. For statically linked modules this is when the executable is loaded, and for DLLs it is when the DLL itself is loaded, which by default is also when the executable starts. Once allocated, the memory remains in use until the executable terminates. For an executable loaded by the Transaction Enabler (TE), that might be some time!

You can avoid loading a DLL that might not be invoked for a particular execution by using the /DELAYLOAD linker option on Windows and equivalent options on UNIX (e.g. -blazy on AIX). We are experimenting with this and will report back on our findings in a later posting, but so far we have seen a benefit on Windows without a noticeable overhead. The main gain on Windows is that we can now keep many more executables loaded in the TE and thus avoid the overhead of their being swapped in and out.
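For reference, the relevant options look something like the fragment below. /DELAYLOAD and -blazy are the real option names, but the module names are illustrative and the exact invocation will depend on your build setup; check your linker's documentation:

```
REM Windows: link against an opslib but defer loading it until first call
link myserver.obj myopslib.lib delayimp.lib /DELAYLOAD:myopslib.dll

# AIX: enable lazy loading of shared library dependencies
ld ... -blazy ...
```

Note that on Windows the delay-load helper library (delayimp.lib) must be linked in as well.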

On z/OS, the COBOL working storage is allocated from the heap for re-entrant code (Gen programs are normally compiled with the RENT option). This means that the working storage is allocated on the first execution of the action block and then not freed until the executable terminates. If an executable contains a lot of action blocks that are rarely invoked, they therefore present no memory overhead until used, in contrast to C code, where the storage for all action blocks is allocated when the executable is loaded.

Wednesday 25 March 2009

Memory Usage Caveat

In The Mythical Man-Month, Fred Brooks provides an example of how a developer 'wasted' a few bytes of storage in the OS/360 control program by putting in a routine to handle leap years. In his view this was not necessary, since the operator could reset the date once every 4 years. In those days memory was very expensive and was one of the main ways of charging for the machine, so freeing up a few bytes for the application program was seen as a good thing.

We are now used to machines with many gigabytes of memory and so you may be wondering why this blog started with a post on reducing memory usage, especially since memory is cheap and developers are expensive.

Let me start by stating that I am not advocating spending a lot of precious time trying to save a few bytes. In our case, because of large group views, a large number of called action blocks and the reuse of code in shared libraries, some of our server load modules were requiring over 100MB. If you then have 20 of these loaded by the TE, you are using 2GB.

Paging memory to disk or swapping load modules in and out of the TE has a performance impact, so our thinking is that if you can reduce the memory usage of the application, then you can keep more load modules loaded and/or give more memory to the DBMS, which will be able to benefit from it and hence improve performance.

A second point is that most of the techniques that we are using are highly automated. We have developed tools to search for and remove unused views, and report on imperfect view matching (optionally only for repeated calls or those involving group views). There is therefore a very low overhead in implementing these good practices.

You should, however, consider the cost/benefit case for any coding standards that you adopt; spending a lot of time trying to save a few bytes (or even a few MB) is unlikely to be justifiable.

Friday 20 March 2009

Memory Usage

Reducing the memory usage of our load modules is something that we are working on at present, so to kick off the blog with a very technical subject...

Much of the code that we develop relies on repeating group views to process lists of objects.

Group views in Gen must have a fixed maximum size, and the memory for the group view is allocated at program load time. This means that there is a trade-off between the memory usage of a load module and the limits imposed by the group view size. In many cases you can size the group view to cater for the maximum number of rows needed, or repeat the processing with some sort of 'start from' value.

However, there are quite a few instances where this is not possible, and the group view size then becomes a hard limit on the data that the function can process.

Over the years we have increased these limits based on customer requirements, and this has resulted in an increased memory requirement for the load modules. This is particularly noticeable in Windows and UNIX environments where the server load modules are kept in memory by the Transaction Enabler. (Note that there are tuning parameters for the number of load modules that are loaded by the TE).

We have also made more use of DLLs (Operations Libraries) on Windows and UNIX, and these can also increase the memory requirement: if a load module loads a DLL for the sake of one action block, it also allocates the memory for all of the other action blocks in the opslib and in any dependent opslibs. Our models have a lot of shared and reused code, and the complex dependencies between modules have meant that many of the DLLs are loaded because of code dependencies, even when they are not required for a particular execution.

We are therefore working on some strategies to reduce memory usage. Ones that we have come up with so far include:

1) Delete unused views and attributes
2) Improve use of perfect view matching
3) Reduce use of uninitialised local views
4) Share group views between action blocks
5) Delay loading of DLLs
6) Delay allocation of local view storage

The first four are all standard coding techniques.

1) Maintaining an action block over many years (some of our action blocks are now 20 years old) can result in quite a few unused views. Finding and deleting these by hand is tedious and time-consuming, so we developed tools in VerifIEr and genIE to automate finding and then deleting unused views and unused attribute views.

2) If view matching is not perfect, extra working storage is required in the generated code, so improving view matching (especially for large group views) will help.

3) In C generated code, having even one uninitialised local view will double the memory usage of all of the local views, so avoid this if you can.

4) An example of sharing memory is where many action blocks require a large local group view with the same structure. For example, we have a case where many action blocks need a 1MB group view and each is called from the same parent. If each action block has its own local view, then each one needs its own 1MB of storage. However, if the group view is defined as an exportable import view and is passed from a local view in the parent, then they all share the same storage. You then need to ensure that the view is properly initialised, but there is a big memory saving: in one load module we halved the memory usage with this technique.

5) On Windows we are experimenting with the /DELAYLOAD linker option to delay loading of opslibs. If the route through the code never requires a function in the opslib, it will not get loaded.

6) This is very much work in progress, so more on this later...

Welcome

Welcome to the IET Gen Development Blog. I decided to start this blog to share some of the tips and techniques that we have learnt at IET in developing our products with CA Gen. We are passionate about CA Gen and the main aim of this blog is to help other Gen users learn from our experiences (and mistakes!).

The blog is likely to be composed of random information and is not designed to be a structured training course in Gen development!