Normally, when a flow chart executes, Flux will routinely need to access the database to obtain information about the flow chart. If a flow chart runs in a loop, for example, Flux may need to obtain a copy of the flow chart from the database each time the loop executes. This can cause performance problems if the flow charts are very large or run very frequently, causing Flux to query the database often and for large amounts of data.
Caching is designed to increase engine performance in these situations. With caching enabled, the first time a flow chart executes, a copy of that flow chart is stored in memory. The next time Flux needs to obtain the flow chart, it can use this in-memory copy, allowing the engine to minimize database queries and improve runtime performance.
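The behavior described above can be illustrated with a minimal read-through cache sketch. This is not Flux's implementation or API: the class name, the `getFlowChart` method, and the loader function that stands in for a database query are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration of a read-through cache; not part of the Flux API.
public class ReadThroughCacheSketch {
    private final Map<String, String> cache = new HashMap<>();
    // Stands in for the database query that fetches a flow chart definition.
    private final Function<String, String> databaseLoader;
    private int databaseQueries = 0;

    public ReadThroughCacheSketch(Function<String, String> databaseLoader) {
        this.databaseLoader = databaseLoader;
    }

    // Returns the in-memory copy if present; otherwise loads it once and keeps it.
    public String getFlowChart(String name) {
        String cached = cache.get(name);
        if (cached != null) {
            return cached;
        }
        databaseQueries++;
        String loaded = databaseLoader.apply(name);
        cache.put(name, loaded);
        return loaded;
    }

    public int getDatabaseQueries() {
        return databaseQueries;
    }

    public static void main(String[] args) {
        ReadThroughCacheSketch engine =
                new ReadThroughCacheSketch(name -> "definition of " + name);
        // A flow chart running in a loop asks for its definition on each pass.
        for (int i = 0; i < 5; i++) {
            engine.getFlowChart("nightly-report");
        }
        System.out.println("database queries: " + engine.getDatabaseQueries()); // prints 1
    }
}
```

Five lookups of the same flow chart result in a single simulated database query; without the cache, each pass of the loop would query again.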
There are two types of caching, local and networked (described in greater detail below). By default, Flux uses the networked cache.
NOTE: Caching is configured separately on each engine in the cluster. Setting a cache type of "LOCAL" or "NONE" on a single engine will not update any other engines in the cluster, so be sure to update each engine appropriately when changing the cache type.
With local caching, every engine uses its own cache. If the engine is participating in a cluster, other engines in the cluster cannot access this cache. For this reason, the local cache is best used when the engine is not running in a cluster, or when you do not want clustered engines to share information across the network.
To enable the local cache, set the CACHE_TYPE engine configuration option to "LOCAL".
With networked caching, all engines in the cluster use a shared network cache. RMI is used to coordinate the cache across the network, using the REGISTRY_PORT and RMI_PORT settings configured on each engine in your cluster.
You can enable the networked cache by setting the engine configuration option CACHE_TYPE as follows:
Networked caching is enabled on all engines by default. To disable caching, set the CACHE_TYPE engine configuration option to "NONE".
The cache size specifies the total number of flow charts that can be loaded into memory at one time. This is configured using the engine configuration option CACHE_SIZE. For example:
CACHE_SIZE = 100
The number on the right of the equals sign is the total number of flow charts that should be allowed in memory. The default value is 200.
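As a rough sketch of how these settings might be read, the following uses the standard java.util.Properties class to parse key/value lines in the same "KEY = value" form as the example above. It assumes the engine configuration is a properties-style file, which this section does not confirm; the class and method names are hypothetical, and only the option names, the "LOCAL" and "NONE" values, and the default of 200 come from this document.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical sketch: reads cache options from properties-style text.
// This is not how Flux itself loads its configuration.
public class CacheConfigSketch {
    static final int DEFAULT_CACHE_SIZE = 200; // default stated in this document

    static Properties parse(String text) throws IOException {
        Properties config = new Properties();
        config.load(new StringReader(text));
        return config;
    }

    // Falls back to the documented default when CACHE_SIZE is not set.
    static int cacheSize(Properties config) {
        String raw = config.getProperty("CACHE_SIZE", String.valueOf(DEFAULT_CACHE_SIZE));
        return Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) throws IOException {
        Properties configured = parse("CACHE_TYPE = LOCAL\nCACHE_SIZE = 100\n");
        System.out.println(configured.getProperty("CACHE_TYPE") + " " + cacheSize(configured)); // LOCAL 100

        Properties defaults = parse("CACHE_TYPE = NONE\n");
        System.out.println(cacheSize(defaults)); // 200
    }
}
```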
Inside the cache, each flow chart is given a weight. When the cache is full, this weight is used to determine which flow charts should be removed to make room for new, incoming flow charts. A flow chart's weight is determined by the following formula:
<size> * <date last accessed> = <weight>
Larger flow charts and flow charts that were accessed recently receive a higher weight. Likewise, small flow charts and flow charts that are rarely used receive a lower weight and are removed first, since querying the database to retrieve them again has a lower cost.
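The weighting scheme can be sketched as follows. This is an illustration under stated assumptions, not Flux's actual eviction code: the class is hypothetical, size is an arbitrary number, and a simple counter stands in for the date last accessed, since this section does not say how either value is measured.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of weight-based eviction; not Flux's actual implementation.
public class WeightedCacheSketch {
    private static final class Entry {
        final long size;
        long lastAccessed;

        Entry(long size, long lastAccessed) {
            this.size = size;
            this.lastAccessed = lastAccessed;
        }

        // <size> * <date last accessed> = <weight>
        double weight() {
            return (double) size * lastAccessed;
        }
    }

    private final int capacity; // maximum number of flow charts, like CACHE_SIZE
    private final Map<String, Entry> entries = new HashMap<>();
    private long clock = 0; // logical clock standing in for the date last accessed

    public WeightedCacheSketch(int capacity) {
        this.capacity = capacity;
    }

    public void put(String name, long size) {
        if (!entries.containsKey(name) && entries.size() >= capacity) {
            evictLowestWeight();
        }
        entries.put(name, new Entry(size, ++clock));
    }

    // Records an access, which raises the entry's weight.
    public boolean access(String name) {
        Entry entry = entries.get(name);
        if (entry == null) {
            return false;
        }
        entry.lastAccessed = ++clock;
        return true;
    }

    public boolean contains(String name) {
        return entries.containsKey(name);
    }

    private void evictLowestWeight() {
        String victim = null;
        double lowest = Double.MAX_VALUE;
        for (Map.Entry<String, Entry> e : entries.entrySet()) {
            double weight = e.getValue().weight();
            if (weight < lowest) {
                lowest = weight;
                victim = e.getKey();
            }
        }
        if (victim != null) {
            entries.remove(victim);
        }
    }

    public static void main(String[] args) {
        WeightedCacheSketch cache = new WeightedCacheSketch(2);
        cache.put("large", 5000);  // weight 5000 * 1 = 5000
        cache.put("small", 100);   // weight 100 * 2 = 200
        cache.put("medium", 1000); // cache is full, so "small" (lowest weight) is removed
        System.out.println(cache.contains("small")); // false
        System.out.println(cache.contains("large")); // true
    }
}
```

Note that the limit applies to the number of entries, not their combined size, which matches how the cache size is described above.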
Cache Size and Memory
The amount of memory an engine's cache will use in the JVM depends on both the size of the cache itself, and the size of the flow charts within the cache. Since the cache only restricts the total number of flow charts that can be stored, and not the total size, your memory usage can vary depending on how big both the cache and your flow charts are.
Because of this, as you increase your cache size and flow chart size, you will likely need to increase the amount of memory available to the Flux JVM. The default maximum heap size depends on the JVM: older JVMs default to 64 MB, while newer ones derive the default from the machine's physical memory. You can set the maximum explicitly by supplying the "-Xmx" option when starting the JVM. For example, you might use the following command to start an engine:
java -Xmx512m -classpath .\;flux.jar;<JAR files> flux.Main server start
The "-Xmx512m" option above specifies that the JVM is allocated a maximum of 512 MB of memory.
The default cache size in Flux requires at least 512 MB of available memory. You may need to adjust the cache size depending on your memory requirements.
The default scripts that ship with Flux (start-unsecured-flux-engine and start-engine) automatically set the memory to 512 MB.
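To confirm how much heap a JVM was actually given, a small check like the one below can help. It is a generic Java sketch that relies only on the standard Runtime.maxMemory() call; the class name is hypothetical and nothing here is specific to Flux. On some garbage collectors the reported value can be slightly lower than the "-Xmx" setting.

```java
// Hypothetical helper that reports the maximum heap available to the current JVM.
public class HeapCheckSketch {
    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // maxMemory() returns the most memory the JVM will attempt to use, in bytes.
        long maxHeapMb = toMegabytes(Runtime.getRuntime().maxMemory());
        System.out.println("Maximum heap available to this JVM: " + maxHeapMb + " MB");
    }
}
```

Running it with the same "-Xmx" value used for the engine (for example, java -Xmx512m HeapCheckSketch) shows roughly what the engine's JVM would have available.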
When Should I Use Caching?
Caching is most effective when you are using large, complex workflows that run frequently in a loop. Small workflows and workflows that run infrequently (or run only once) are not typically affected by caching – for these cases, we suggest disabling caching to reduce memory usage on the system.
Additionally, if you notice a large number of database queries for your workflows, especially against the FLUX_VARIABLE table, we recommend enabling caching. Doing so may improve the performance of these workflows and reduce the load on the database.