FAQ

What is the format of the configuration file for an AWS Blu Age Compare Tool project?

The JSON format. A Compare Tool project file uses the .bacmp extension and contains JSON.

Which operating system (OS) should the AWS Blu Age Compare Tool be run on?

Linux and Windows

Can I mix Binary and FlatFile modes in one .bacmp file to compare EBCDIC vs ASCII data?

No. You cannot compare two files that do not share the same name and format.
Both the Binary and FlatFile comparison modes of the Blu Age Compare Tool have the following limitations:

  • Matching filenames required: the tool expects identical filenames in the left and right folders. It cannot compare files with different names (e.g., file.ebc vs file.asc).
  • Single configuration per file: the tool applies the same processing configuration to both sides of the comparison, so it cannot handle two encoding formats simultaneously.
  • No cross-format support: when different file extensions are configured, the tool searches for each filename in both folders, resulting in file-not-found errors.
 
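A common workaround, outside the tool itself, is to normalize both sides to a single encoding and filename before running the comparison. Below is a minimal sketch using the standard iconv utility; the filenames and the IBM-1047 code page are illustrative assumptions, not tool requirements:

```shell
# Sketch: normalize encoding and filenames before running the Compare Tool.
# All paths below are placeholders; IBM-1047 is one common EBCDIC code page.
mkdir -p left right

# Fabricate a small EBCDIC sample purely for illustration.
printf 'HELLO WORLD\n' | iconv -f ASCII -t IBM-1047 > file.ebc
printf 'HELLO WORLD\n' > right/file.dat

# Convert the EBCDIC side to ASCII and give both sides the same filename,
# so the tool finds a matching file in each folder.
iconv -f IBM-1047 -t ASCII file.ebc > left/file.dat
```

After this staging step, both folders contain file.dat in the same encoding, so a single FlatFile configuration applies to both sides.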

My browser froze while viewing a comparison report that contains VB records with large arrays. What should I do?

Enable variable-length (VB) column aggregation in your configuration file by adding these two parameters:     
 

"isVBColumnsToAggregate": true,
"minVBOccursToAggregate": 100


This will aggregate VB fields with more than 100 occurrences into single formatted columns with lazy loading, preventing browser freezing. The HTML report will initially show 20 array elements with an "Expand" button to load more on demand.     
If you still experience performance issues, lower the minVBOccursToAggregate value to 50 or 25 to aggregate even smaller arrays. If your VB fields have fewer occurrences and don't need aggregation, set isVBColumnsToAggregate to false.     
Note: this option is currently available only for Binary file comparison.

When should I use the Histogram algorithm instead of the Myers diff algorithm?

Use the Histogram algorithm for better performance with large files and with file splitting:

{
 "compareAlgo": "Histogram",
 "splitLines": 100000,
 "splitSize": 500
}

Use Histogram when:

  • File size > 1GB
  • Using splitLines or splitSize options
  • Processing high-volume comparisons
  • Memory-constrained environments

Performance: Histogram is significantly faster than Myers for large files and handles split processing more efficiently.

How do I capture heap dumps and handle memory issues when the tool exits with an OutOfMemoryError or runs for days?

Configure JVM for memory debugging:

java -Xmx32g -Xms32g \
    -XX:+UseG1GC \
    -XX:+HeapDumpOnOutOfMemoryError \
    -XX:HeapDumpPath=/tmp/heap_dump.hprof \
    -jar CompareTool.jar config.bacmp

Prevent memory issues:

{
 "compareAlgo": "Histogram",
 "splitLines": 50000,
 "splitSize": 100,
 "maximumDifferences": 1000000,
 "comparisonDataStorage": "OnDisk"
}

Memory requirements:

  • Small files (<1GB): 8GB heap
  • Medium files (1-5GB): 32GB heap
  • Large files (>5GB): 48GB heap

If an OutOfMemoryError still occurs: reduce the splitLines and splitSize values, increase the heap size, or switch to the OnDisk storage mode.
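For the long-running case, a heap dump can also be requested from a live process without waiting for an OutOfMemoryError, using the standard JDK jcmd utility. The process-matching pattern below is an assumption based on the launch command shown above:

```shell
# Sketch: capture a heap dump from an already-running Compare Tool JVM.
# Requires a JDK (jcmd ships with the JDK, not the JRE).
PID=$(jcmd -l 2>/dev/null | awk '/CompareTool\.jar/ {print $1; exit}')
if [ -n "$PID" ]; then
    # Ask the live JVM to write a heap dump; the output path is a placeholder.
    jcmd "$PID" GC.heap_dump /tmp/heap_dump_live.hprof
else
    echo "no running CompareTool JVM found"
fi
```

The resulting .hprof file can be analyzed with the same tools (for example, Eclipse Memory Analyzer) as a dump produced by -XX:+HeapDumpOnOutOfMemoryError.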

Why do my SORT output files differ between the mainframe and the modernized application when the data content is identical?

This is a common observation across modernization projects and is caused by non-deterministic sort behavior in the IBM SORT program.

Root cause:

According to the official IBM documentation, when the EQUALS keyword is not specified in the SORT control card, the order of records with equal keys is not guaranteed. This means two successive runs of the same SORT - even on the same mainframe - may produce different orderings for records sharing the same key values. This non-determinism naturally carries over when comparing mainframe output against the modernized application's output.

How to handle this in the Compare Tool:

Use a business key that corresponds to the entire record (i.e., all fields concatenated) as the sort key for comparison. This approach ensures that:

  1. Both files (mainframe and modernized) are sorted in a fully deterministic order - since the key covers the complete record, no two records can have equal keys unless they are truly identical;
  2. The Compare Tool can then verify line-by-line that the content is indeed identical.

Recommended practice:

  • Before running a comparison, re-sort both output files using the full record as the sort key;
  • If your original SORT control card uses a partial key, do not rely on the relative order of equal-key records for validation purposes;
  • If preserving the original sort order matters for your business logic, consider adding the EQUALS keyword to your mainframe SORT control card to enforce stable sorting, then replicate the same behavior in the modernized application.
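The re-sort step above can be sketched with the standard sort utility; with no key option, sort(1) keys on the entire line, which for line-oriented output is exactly the full-record key described above. The filenames and sample records are placeholders:

```shell
# Sketch: make both outputs deterministic before feeding them to the Compare Tool.
# Fabricated sample: equal-key records ('KEY1') in different relative orders,
# as a mainframe SORT without EQUALS may legitimately produce.
printf 'KEY1 REC-A\nKEY1 REC-B\nKEY2 REC-C\n' > mainframe.txt
printf 'KEY1 REC-B\nKEY2 REC-C\nKEY1 REC-A\n' > modernized.txt

# No -k option: the whole line (the full record) is the sort key.
sort -o mainframe_sorted.txt  mainframe.txt
sort -o modernized_sorted.txt modernized.txt

# The re-sorted files are now byte-identical.
cmp mainframe_sorted.txt modernized_sorted.txt && echo 'identical'
```

Point the Compare Tool at the *_sorted.txt files instead of the raw SORT outputs.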