Tag Archive for amazon-web-services, aws-glue

AWS Glue job argument error: "option --key_name (argument) not recognized" when running a Glue job triggered by a Lambda that passes arguments to the job

I am running a Glue job as a Python shell script (Python 3.9) on Glue version 3.0. I am passing 8 arguments to the Glue job and accessing them with getResolvedOptions(args, options). One of the arguments is not recognized; when I checked the logs, they said: error: option --key_name (argument) not recognized. May I know the reason for this error?
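
The excerpt does not show the Lambda code, so the cause cannot be confirmed from here. One thing worth checking is the naming convention on both sides: Glue expects job argument keys to carry a `--` prefix when the run is started, while `getResolvedOptions` takes the option names without it. The sketch below illustrates that convention. The helper names `build_glue_arguments`, `start_job` and `read_job_arguments` are hypothetical and not part of any AWS library.

```python
import sys


def build_glue_arguments(params):
    """Return a dict for start_job_run(Arguments=...).

    Every key gets exactly one '--' prefix and every value becomes a string.
    (Hypothetical helper, shown for illustration only.)
    """
    return {"--" + str(name).lstrip("-"): str(value) for name, value in params.items()}


def start_job(job_name, params):
    """Lambda side: start the Glue job with prefixed argument keys."""
    import boto3  # imported lazily; present in the Lambda Python runtime

    glue = boto3.client("glue")
    return glue.start_job_run(JobName=job_name, Arguments=build_glue_arguments(params))


def read_job_arguments(option_names, argv=None):
    """Glue job side: option names are listed WITHOUT the '--' prefix."""
    from awsglue.utils import getResolvedOptions  # only available inside Glue

    return getResolvedOptions(argv if argv is not None else sys.argv, option_names)
```

Inside the job, `read_job_arguments(["key_name"])["key_name"]` would then return the value that the Lambda passed as `--key_name`.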

How to create a multi-column custom visual transform?

In our ETL process we are building a pipeline where someone takes input files (e.g. CSV) and maps their columns to existing column names in our DB. After the mapping is complete, a Glue workflow will perform all of the needed post-processing and eventually load the data into our database. We will be ingesting 3k+ data sources, each of which will use the same workflow once data mapping is complete.

AWS Glue classifier column updates not reflected in catalog

I have a CSV classifier with 6 columns. The source system sends data without a header, and the source recently swapped the column positions of phone and address. I have updated the classifier to reflect the change as well.

How to parallelize map function in Spark?

I'm trying to call a function over a DataFrame. The function takes an id as input and queries a DynamoDB table. If the id exists in the table, it goes on to perform other tasks (like calling another AWS service). I'm trying to call this function over all rows in the DataFrame, similar to this Python code:
df.apply(lambda x: func1(x))
but I keep running into: Error Category: UNCLASSIFIED_ERROR; PicklingError: Could not serialize object: TypeError: cannot pickle '_thread.lock' object
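
A common cause of this kind of PicklingError is a function that captures an object created on the driver, such as a boto3 client or resource, which holds thread locks and cannot be serialized to the executors. Whether that applies here is an assumption, since `func1` is not shown. The sketch below creates the DynamoDB table handle inside the function that runs on each partition instead. The table name `my-table`, the key attribute `id`, and the helper names are hypothetical.

```python
def make_dynamodb_table(table_name="my-table"):
    """Create the table handle on the executor, never on the driver.

    'my-table' is a placeholder name used only for this sketch.
    """
    import boto3  # imported inside the function so nothing unpicklable is captured

    return boto3.resource("dynamodb").Table(table_name)


def process_partition(rows, table_factory=make_dynamodb_table):
    """Look up each row's id in DynamoDB, reusing one handle per partition."""
    table = table_factory()
    for row in rows:
        response = table.get_item(Key={"id": row["id"]})
        # get_item omits the "Item" key when no matching item exists
        yield (row["id"], "Item" in response)
```

With a Spark DataFrame `df` that has an `id` column, this could be applied as `df.rdd.mapPartitions(process_partition).toDF(["id", "found"])`. Spark then runs the lookups in parallel across partitions, and any follow-up calls to other AWS services can go inside the same loop.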

How to update an on-prem Oracle table

I want to run an UPDATE statement from an AWS Glue job in the cloud against an on-prem Oracle table once the load has completed. How can I achieve this in Glue?
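
One possible approach, sketched under several assumptions, is to open a regular database connection from the job script and issue the UPDATE after the load step finishes. This assumes the Glue job has network connectivity to the on-prem database and that the `oracledb` driver has been added to the job as an additional Python module. The table `load_audit`, its columns, and both helper names are hypothetical.

```python
def mark_load_complete(conn, batch_id, status="COMPLETE"):
    """Run the UPDATE on an open DB-API connection and return the affected row count.

    'load_audit', 'status' and 'batch_id' are placeholder names for this sketch.
    """
    sql = "UPDATE load_audit SET status = :status WHERE batch_id = :batch_id"
    cur = conn.cursor()
    try:
        # Named bind variables keep the values out of the SQL text
        cur.execute(sql, {"status": status, "batch_id": batch_id})
        updated = cur.rowcount
        conn.commit()
    finally:
        cur.close()
    return updated


def connect_oracle(user, password, dsn):
    """Open a connection with python-oracledb; dsn looks like 'host:port/service_name'."""
    import oracledb  # assumed to be installed in the Glue job environment

    return oracledb.connect(user=user, password=password, dsn=dsn)
```

At the end of the job, something like `mark_load_complete(connect_oracle(user, password, dsn), batch_id)` would run the update. Credentials are better read from a secrets store than hard-coded in the script.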
