Python for Azure Synapse Lake Database

  Kiến thức lập trình

What I have done in AWS:
I have a datalake in AWS. I use AWS Glue Crawlers to get the schema of data (.parquet format) in different partitions (e.g. year=2000/month=1/day=1/file1.parquet, year=2000/month=1/day=1/file2.parquet) and populate a table in a database in AWS Glue Catalog.
I use AWS Athena to query the table in the database using pyathena and use this data locally for some processing.

What I want to do:
I want to replicate the above in Azure. I can create an Azure Synapse Workspace and add a lake database and specify the column partitions when I add the tables. I run the SQL commands within Azure Synapse Workspace and everything works. However, I cannot query the database and table using a python script and have the result saved in a dataframe.

I have looked into pyodbc and tried to run the following code:

connection_string = f'DRIVER={driver};SERVER={server};PORT=1433;DATABASE={database};UID={username};PWD={password}'

try:
    # Connect to the database
    conn = pyodbc.connect(connection_string)
    print("Connection successful!")

    cursor = conn.cursor()

    query = "SELECT TOP 10 * FROM dbo.tablename"

    cursor.execute(query)
    rows = cursor.fetchall()

    if rows:
        columns = [column[0] for column in cursor.description]
        df = pd.DataFrame.from_records(rows, columns=columns)
        print(df)
    else:
        print("No results found for the query.")

    cursor.close()
    conn.close()
    
except Exception as e:
    print(f"An error occurred: {e}")

I get the following error:
An error occurred: (‘42000’, “[42000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Cannot find the CREDENTIAL ”https://.dfs.core.windows.net//year=/month=/day=/.parquet’,’, because it does not exist or you do not have permission. (15151) (SQLExecDirectW)”)

Can anyone help me with this problem? Since I am very new to Azure, it will be extremely helpful if you could guide from the beginning (maybe some steps I missed during the creation of Azure Synapse workspace and the lake database).

LEAVE A COMMENT