I have a simple Glue ETL Job that extracts a few tables from an RDS instance and sends the data to an external data lake. That works fine, but I need to add a Custom Transform to modify some of the data. I'm following this AWS guide. I've copied and pasted the contents of the example JSON and Python files into files named customFilterState.json and customFilterState.py, and uploaded both to the S3 bucket that contains my Job assets (the scripts and so on). I placed the two files in a folder called transforms at the base level of the bucket, per the documentation. However, the Custom Transform does not show up in the Visual ETL Editor.
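For context, a quick boto3 listing along these lines (my-glue-job-assets is a placeholder for my real assets bucket) confirms the two objects are sitting under the transforms/ prefix as described:

import boto3

# Placeholder bucket name; substitute the bucket that holds the Job assets
BUCKET = "my-glue-job-assets"

s3 = boto3.client("s3")
resp = s3.list_objects_v2(Bucket=BUCKET, Prefix="transforms/")

# Expecting transforms/customFilterState.json and transforms/customFilterState.py
for obj in resp.get("Contents", []):
    print(obj["Key"], obj["Size"])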
Here are the two files mentioned above. First, customFilterState.json:
{
    "name": "custom_filter_state",
    "displayName": "Filter State",
    "description": "A simple example to filter the data to keep only the state indicated.",
    "functionName": "custom_filter_state",
    "parameters": [
        {
            "name": "colName",
            "displayName": "Column name",
            "type": "str",
            "description": "Name of the column in the data that holds the state postal code"
        },
        {
            "name": "state",
            "displayName": "State postal code",
            "type": "str",
            "description": "The postal code of the state whose rows to keep"
        }
    ]
}
And here is customFilterState.py:

from awsglue import DynamicFrame

def custom_filter_state(self, colName, state):
    # Keep only the rows whose value in colName matches the requested state code
    return self.filter(lambda row: row[colName] == state)

# Register the function as a DynamicFrame method so the transform can call it
DynamicFrame.custom_filter_state = custom_filter_state
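My understanding from the guide is that once the transform is picked up, the generated job script calls it like any other DynamicFrame method. A hypothetical invocation would look like the sketch below; the database, table, column name "state_code", and value "CA" are all made-up placeholders, not values from my actual job:

from awsglue.context import GlueContext
from pyspark.context import SparkContext

glue_context = GlueContext(SparkContext.getOrCreate())

# Placeholder source; in the real job this comes from the upstream node
dyf = glue_context.create_dynamic_frame.from_catalog(
    database="my_database",
    table_name="my_table",
)

# Invoke the custom transform registered by customFilterState.py
filtered = dyf.custom_filter_state("state_code", "CA")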
Here’s my S3 bucket, which, again, is where the ETL Job’s scripts are stored.
And here is the list of Transforms, without the custom Filter State.
What am I missing? Why won’t Custom Transforms show up?