I have created a data pipeline in Azure Data Factory that reads data from a CSV file and loads it into a SQL database. The table/CSV has 369 columns. In the data flow, I have created a derived column activity between source and sink to enforce each column to its data type and length (for string columns). This was a significant amount of work to capture all 369 columns individually. The pipeline runs successfully in the development environment.
To move it to the acceptance environment, I exported the pipeline as a template and imported it into acceptance. However, when I run the pipeline in the acceptance environment, I get the following error:
Job failed due to reason: at Sink ‘sink1’: The given value of type NVARCHAR(21) from the data source cannot be converted to type nvarchar(20) of the specified target column D1001CRM_XYSQN.
In the derived column activity, I have enforced this as:
D1001CRM_XYSQN = subString(trim(D1001CRM_XYSQN), 1, 20)
This works in the DEV environment but doesn’t in Acceptance. I suspect this might be the case with other columns as well. I do not want to spend hours manually enforcing each column once again in the acceptance environment.
Here are some additional details:
- Both DEV and Acceptance environments should be identical in terms of data schema and data types.
- The derived column expressions should have been exported and imported correctly with the pipeline template.
Questions:
- Why is this issue occurring in the acceptance environment but not in the development environment?
- Is there a more efficient way to ensure that all columns have the correct data type and length enforcement without having to redo the derived column activity for all 369 columns?
Screenshots:
Data flow:
Derivedd Column setting in Data flow:
Pipeline: