37 changes: 37 additions & 0 deletions lib/shortcuts/api.md
@@ -12,6 +12,9 @@ a Lambda permission.</p>
<dt><a href="#GlueDatabase">GlueDatabase</a></dt>
<dd><p>Create a Glue Database.</p>
</dd>
<dt><a href="#GlueIcebergTable">GlueIcebergTable</a></dt>
<dd><p>Create a Glue table backed by Apache Iceberg format on S3.</p>
</dd>
<dt><a href="#GlueJsonTable">GlueJsonTable</a></dt>
<dd><p>Create a Glue Table backed by line-delimited JSON files on S3.</p>
</dd>
@@ -202,6 +205,40 @@ const db = new cf.shortcuts.GlueDatabase({

module.exports = cf.merge(myTemplate, db);
```
<a name="GlueIcebergTable"></a>

## GlueIcebergTable
Create a Glue table backed by Apache Iceberg format on S3.

**Kind**: global class
<a name="new_GlueIcebergTable_new"></a>

### new GlueIcebergTable(options)

| Param | Type | Default | Description |
| --- | --- | --- | --- |
| options | <code>Object</code> | | Options for creating an Iceberg table. |
| options.LogicalName | <code>String</code> | | The logical name of the Glue Table within the CloudFormation template. |
| options.Name | <code>String</code> | | The name of the table. |
| options.DatabaseName | <code>String</code> | | The name of the database the table resides in. |
| options.Location | <code>String</code> | | The physical location of the table (S3 URI). Required. |
| options.Schema | <code>Object</code> | | Full Iceberg schema definition with Type: "struct" and Fields array. Each field must have Id (integer), Name (string), Type (string or object for complex types), and Required (boolean). See [AWS documentation](https://docs.aws.amazon.com/glue/latest/webapi/API_IcebergSchema.html). |
| [options.PartitionSpec] | <code>Object</code> | | Iceberg partition specification. See [AWS documentation](https://docs.aws.amazon.com/glue/latest/webapi/API_IcebergPartitionSpec.html). |
| [options.WriteOrder] | <code>Object</code> | | Iceberg write order specification. See [AWS documentation](https://docs.aws.amazon.com/glue/latest/webapi/API_IcebergSortOrder.html). |
| [options.CatalogId] | <code>String</code> | <code>AccountId</code> | The AWS account ID for the account in which to create the table. |
| [options.IcebergVersion] | <code>String</code> | <code>&#x27;2&#x27;</code> | The table version for the Iceberg table. |
| [options.EnableOptimizer] | <code>Boolean</code> | <code>false</code> | Whether to enable the snapshot retention optimizer. |
| [options.OptimizerRoleArn] | <code>String</code> | | The ARN of the IAM role for the retention optimizer. Required if EnableOptimizer is true. |
| [options.SnapshotRetentionPeriodInDays] | <code>Number</code> | <code>5</code> | The number of days to retain snapshots. |
| [options.NumberOfSnapshotsToRetain] | <code>Number</code> | <code>1</code> | The minimum number of snapshots to retain. |
| [options.CleanExpiredFiles] | <code>Boolean</code> | <code>true</code> | Whether to delete expired data files after expiring snapshots. |
| [options.EnableCompaction] | <code>Boolean</code> | <code>false</code> | Whether to enable the compaction optimizer. |
| [options.CompactionRoleArn] | <code>String</code> | | The ARN of the IAM role for the compaction optimizer. Required if EnableCompaction is true. |
| [options.EnableOrphanFileDeletion] | <code>Boolean</code> | <code>false</code> | Whether to enable the orphan file deletion optimizer. |
| [options.OrphanFileDeletionRoleArn] | <code>String</code> | | The ARN of the IAM role for the orphan file deletion optimizer. Required if EnableOrphanFileDeletion is true. |
| [options.OrphanFileRetentionPeriodInDays] | <code>Number</code> | <code>3</code> | The number of days to retain orphan files before deleting them. |
| [options.OrphanFileDeletionLocation] | <code>String</code> | | The S3 location to scan for orphan files. |
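
The `options.Schema` shape described in the table can be sketched as follows. This is a minimal illustration, not part of the shortcut: the field names (`id`, `created_at`) and the `checkIcebergSchema` helper are hypothetical, and the constructor itself only checks that `Schema` is provided.

```js
'use strict';

// Hypothetical schema for illustration; the field names are assumptions.
const schema = {
  Type: 'struct',
  Fields: [
    { Id: 1, Name: 'id', Type: 'string', Required: true },
    { Id: 2, Name: 'created_at', Type: 'timestamp', Required: false }
  ]
};

// Hypothetical helper, not part of cloudfriend: collects shape problems
// based on the field requirements listed for options.Schema.
function checkIcebergSchema(candidate) {
  const problems = [];
  if (!candidate || candidate.Type !== 'struct')
    problems.push('Schema.Type must be "struct"');

  const fields = candidate && Array.isArray(candidate.Fields) ? candidate.Fields : null;
  if (!fields) {
    problems.push('Schema.Fields must be an array');
    return problems;
  }

  fields.forEach((field, i) => {
    if (!field || typeof field !== 'object') {
      problems.push(`Fields[${i}] must be an object`);
      return;
    }
    if (!Number.isInteger(field.Id))
      problems.push(`Fields[${i}].Id must be an integer`);
    if (typeof field.Name !== 'string')
      problems.push(`Fields[${i}].Name must be a string`);
    const typeOk = typeof field.Type === 'string' ||
      (field.Type !== null && typeof field.Type === 'object');
    if (!typeOk)
      problems.push(`Fields[${i}].Type must be a string or object`);
    if (typeof field.Required !== 'boolean')
      problems.push(`Fields[${i}].Required must be a boolean`);
  });

  return problems;
}

console.log(checkIcebergSchema(schema)); // prints []
```

An empty result means the object matches the documented shape. It does not guarantee that Glue will accept the schema.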

<a name="GlueJsonTable"></a>

## GlueJsonTable
187 changes: 187 additions & 0 deletions lib/shortcuts/glue-iceberg-table.js
@@ -0,0 +1,187 @@
'use strict';

/**
 * Create a Glue table backed by Apache Iceberg format on S3.
 *
 * @param {Object} options - Options for creating an Iceberg table.
 * @param {String} options.LogicalName - The logical name of the Glue Table within the CloudFormation template.
 * @param {String} options.Name - The name of the table.
 * @param {String} options.DatabaseName - The name of the database the table resides in.
 * @param {String} options.Location - The physical location of the table (S3 URI). Required.
 * @param {Object} options.Schema - Full Iceberg schema definition with Type: "struct" and Fields array.
 * Each field must have Id (integer), Name (string), Type (string or object for complex types), and Required (boolean).
 * See [AWS
 * documentation](https://docs.aws.amazon.com/glue/latest/webapi/API_IcebergSchema.html).
 * @param {Object} [options.PartitionSpec] - Iceberg partition specification. See [AWS
 * documentation](https://docs.aws.amazon.com/glue/latest/webapi/API_IcebergPartitionSpec.html).
 * @param {Object} [options.WriteOrder] - Iceberg write order specification. See [AWS
 * documentation](https://docs.aws.amazon.com/glue/latest/webapi/API_IcebergSortOrder.html).
 * @param {String} [options.CatalogId=AccountId] - The AWS account ID for the account in which to create the table.
 * @param {String} [options.IcebergVersion='2'] - The table version for the Iceberg table.
 * @param {Boolean} [options.EnableOptimizer=false] - Whether to enable the snapshot retention optimizer.
 * @param {String} [options.OptimizerRoleArn=undefined] - The ARN of the IAM role for the retention optimizer. Required if EnableOptimizer is true.
 * @param {Number} [options.SnapshotRetentionPeriodInDays=5] - The number of days to retain snapshots.
 * @param {Number} [options.NumberOfSnapshotsToRetain=1] - The minimum number of snapshots to retain.
 * @param {Boolean} [options.CleanExpiredFiles=true] - Whether to delete expired data files after expiring snapshots.
 * @param {Boolean} [options.EnableCompaction=false] - Whether to enable the compaction optimizer.
 * @param {String} [options.CompactionRoleArn=undefined] - The ARN of the IAM role for the compaction optimizer. Required if EnableCompaction is true.
 * @param {Boolean} [options.EnableOrphanFileDeletion=false] - Whether to enable the orphan file deletion optimizer.
 * @param {String} [options.OrphanFileDeletionRoleArn=undefined] - The ARN of the IAM role for the orphan file deletion optimizer. Required if EnableOrphanFileDeletion is true.
 * @param {Number} [options.OrphanFileRetentionPeriodInDays=3] - The number of days to retain orphan files before deleting them.
 * @param {String} [options.OrphanFileDeletionLocation=undefined] - The S3 location to scan for orphan files.
 */
class GlueIcebergTable {
  constructor(options) {
    if (!options) throw new Error('Options required');
    const {
      LogicalName,
      Name,
      DatabaseName,
      Location,
      Schema,
      PartitionSpec,
      WriteOrder,
      CatalogId = { Ref: 'AWS::AccountId' },
      IcebergVersion = '2',
      EnableOptimizer = false,
      OptimizerRoleArn,
      SnapshotRetentionPeriodInDays = 5,
      NumberOfSnapshotsToRetain = 1,
      CleanExpiredFiles = true,
      EnableCompaction = false,
      CompactionRoleArn,
      EnableOrphanFileDeletion = false,
      OrphanFileDeletionRoleArn,
      OrphanFileRetentionPeriodInDays = 3,
      OrphanFileDeletionLocation
    } = options;

    // Validate required fields
    const required = [LogicalName, Name, DatabaseName, Location, Schema];
    if (required.some((variable) => !variable))
      throw new Error('You must provide a LogicalName, Name, DatabaseName, Location, and Schema');

    if (EnableOptimizer && !OptimizerRoleArn)
      throw new Error('You must provide an OptimizerRoleArn when EnableOptimizer is true');

    if (EnableCompaction && !CompactionRoleArn)
      throw new Error('You must provide a CompactionRoleArn when EnableCompaction is true');

    if (EnableOrphanFileDeletion && !OrphanFileDeletionRoleArn)
      throw new Error('You must provide an OrphanFileDeletionRoleArn when EnableOrphanFileDeletion is true');

    // Build the Iceberg table resource (no TableInput!)
    this.Resources = {
      [LogicalName]: {
        Type: 'AWS::Glue::Table',
        Properties: {
          CatalogId,
          DatabaseName,
          Name,
          OpenTableFormatInput: {
            IcebergInput: {
              MetadataOperation: 'CREATE',
              Version: IcebergVersion,
              IcebergTableInput: {
                Location,
                Schema
              }
            }
          }
        }
      }
    };

    // Add optional PartitionSpec if provided
    if (PartitionSpec) {
      this.Resources[LogicalName].Properties.OpenTableFormatInput.IcebergInput.IcebergTableInput.PartitionSpec = PartitionSpec;
    }

    // Add optional WriteOrder if provided
    if (WriteOrder) {
      this.Resources[LogicalName].Properties.OpenTableFormatInput.IcebergInput.IcebergTableInput.WriteOrder = WriteOrder;
    }

    // Optionally add TableOptimizer for configuring snapshot retention
    if (EnableOptimizer) {
      const optimizerLogicalName = `${LogicalName}RetentionOptimizer`;
      this.Resources[optimizerLogicalName] = {
        Type: 'AWS::Glue::TableOptimizer',
        DependsOn: LogicalName,
        Properties: {
          CatalogId,
          DatabaseName,
          TableName: Name,
          Type: 'retention',
          TableOptimizerConfiguration: {
            RoleArn: OptimizerRoleArn,
            Enabled: true,
            RetentionConfiguration: {
              IcebergConfiguration: {
                SnapshotRetentionPeriodInDays,
                NumberOfSnapshotsToRetain,
                CleanExpiredFiles
              }
            }
          }
        }
      };
    }

    // Optionally add TableOptimizer for compaction
    // NOTE: CloudFormation does not support CompactionConfiguration properties
    // (strategy, minInputFiles, deleteFileThreshold). These must be configured
    // via AWS CLI/API after stack creation, or will use AWS defaults.
    // See: https://github.com/aws-cloudformation/cloudformation-coverage-roadmap/issues/2257
    if (EnableCompaction) {
      const compactionLogicalName = `${LogicalName}CompactionOptimizer`;
      this.Resources[compactionLogicalName] = {
        Type: 'AWS::Glue::TableOptimizer',
        DependsOn: LogicalName,
        Properties: {
          CatalogId,
          DatabaseName,
          TableName: Name,
          Type: 'compaction',
          TableOptimizerConfiguration: {
            RoleArn: CompactionRoleArn,
            Enabled: true
          }
        }
      };
    }

    // Optionally add TableOptimizer for orphan file deletion
    if (EnableOrphanFileDeletion) {
      const orphanLogicalName = `${LogicalName}OrphanFileDeletionOptimizer`;
      const icebergConfiguration = {
        OrphanFileRetentionPeriodInDays
      };

      // Only add Location if specified, otherwise it defaults to table location
      if (OrphanFileDeletionLocation) {
        icebergConfiguration.Location = OrphanFileDeletionLocation;
      }

      this.Resources[orphanLogicalName] = {
        Type: 'AWS::Glue::TableOptimizer',
        DependsOn: LogicalName,
        Properties: {
          CatalogId,
          DatabaseName,
          TableName: Name,
          Type: 'orphan_file_deletion',
          TableOptimizerConfiguration: {
            RoleArn: OrphanFileDeletionRoleArn,
            Enabled: true,
            OrphanFileDeletionConfiguration: {
              IcebergConfiguration: icebergConfiguration
            }
          }
        }
      };
    }
  }
}

module.exports = GlueIcebergTable;
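
The constructor above adds up to three `AWS::Glue::TableOptimizer` resources next to the table, each with a logical name derived from `LogicalName`. The sketch below mirrors that naming so callers can anticipate which logical IDs end up in the template. The `expectedLogicalNames` function and the `MyTable` name are hypothetical and not part of this change.

```js
'use strict';

// Sketch only: mirrors the logical-name suffixes used by the constructor.
// It does not build any CloudFormation resources.
function expectedLogicalNames(options) {
  const {
    LogicalName,
    EnableOptimizer = false,
    EnableCompaction = false,
    EnableOrphanFileDeletion = false
  } = options;

  const names = [LogicalName];
  if (EnableOptimizer) names.push(`${LogicalName}RetentionOptimizer`);
  if (EnableCompaction) names.push(`${LogicalName}CompactionOptimizer`);
  if (EnableOrphanFileDeletion) names.push(`${LogicalName}OrphanFileDeletionOptimizer`);
  return names;
}

// Returns ['MyTable', 'MyTableCompactionOptimizer']
const names = expectedLogicalNames({ LogicalName: 'MyTable', EnableCompaction: true });
console.log(names);
```

Knowing these names up front can help when another resource needs a `DependsOn` or `Ref` to one of the generated optimizers.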
1 change: 1 addition & 0 deletions lib/shortcuts/index.js
@@ -16,6 +16,7 @@ module.exports = {
GlueJsonTable: require('./glue-json-table'),
GlueOrcTable: require('./glue-orc-table'),
GlueParquetTable: require('./glue-parquet-table'),
GlueIcebergTable: require('./glue-iceberg-table'),
GluePrestoView: require('./glue-presto-view'),
GlueSparkView: require('./glue-spark-view'),
hookshot: require('./hookshot'),