To read a file from Azure Data Lake Storage (ADLS) Gen2 in Databricks, you can follow these steps:

Set Up Authentication

First, you need to configure authentication to access your ADLS Gen2 account.

A recommended method is OAuth 2.0 with a Microsoft Entra ID service principal, with the client secret read from a Databricks secret scope rather than written into the notebook:

```python
spark.conf.set("fs.azure.account.auth.type.<storage-account>.dfs.core.windows.net", "OAuth")
spark.conf.set("fs.azure.account.oauth.provider.type.<storage-account>.dfs.core.windows.net", "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set("fs.azure.account.oauth2.client.id.<storage-account>.dfs.core.windows.net", "<application-id>")
spark.conf.set("fs.azure.account.oauth2.client.secret.<storage-account>.dfs.core.windows.net",
               dbutils.secrets.get(scope="<secret-scope>", key="<service-credential-key>"))
spark.conf.set("fs.azure.account.oauth2.client.endpoint.<storage-account>.dfs.core.windows.net", "https://login.microsoftonline.com/<directory-id>/oauth2/token")
```

Replace <storage-account>, <application-id>, <secret-scope>, <service-credential-key>, and <directory-id> with your specific values.
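The five settings above share the same account suffix, so they can be generated from one helper. The following is a minimal sketch, not part of any Databricks API: `oauth_conf` and `apply_conf` are hypothetical helper names, and the account, application ID, secret, and directory ID values are made up for illustration.

```python
def oauth_conf(storage_account, application_id, client_secret, directory_id):
    """Build the Spark settings for service-principal OAuth on one ADLS Gen2 account."""
    host = f"{storage_account}.dfs.core.windows.net"
    return {
        f"fs.azure.account.auth.type.{host}": "OAuth",
        f"fs.azure.account.oauth.provider.type.{host}":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        f"fs.azure.account.oauth2.client.id.{host}": application_id,
        f"fs.azure.account.oauth2.client.secret.{host}": client_secret,
        f"fs.azure.account.oauth2.client.endpoint.{host}":
            f"https://login.microsoftonline.com/{directory_id}/oauth2/token",
    }


def apply_conf(spark, conf):
    """Apply each setting to the given Spark session."""
    for key, value in conf.items():
        spark.conf.set(key, value)


# Made-up example values; never hard-code a real secret.
conf = oauth_conf("mystorageacct", "app-id-123", "not-a-real-secret", "tenant-id-456")
```

In a notebook, you would pass the value returned by `dbutils.secrets.get(scope="<secret-scope>", key="<service-credential-key>")` as `client_secret` and then call `apply_conf(spark, conf)`.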

Read the File

Once authentication is set up, you can read the file using Spark:

For CSV files:

```python
file_path = "abfss://<container-name>@<storage-account-name>.dfs.core.windows.net/<path-to-file>/file.csv"
df = spark.read.csv(file_path, header=True, inferSchema=True)
```

For JSON files:

```python
file_path = "abfss://<container-name>@<storage-account-name>.dfs.core.windows.net/<path-to-file>/file.json"
df = spark.read.json(file_path)
```

For Parquet files:

```python
file_path = "abfss://<container-name>@<storage-account-name>.dfs.core.windows.net/<path-to-file>/file.parquet"
df = spark.read.parquet(file_path)
```

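Replace <container-name>, <storage-account-name>, and <path-to-file> with your specific values. Here <storage-account-name> is the same account as <storage-account> in the authentication settings.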
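The three paths follow one pattern and differ only in the reader method, so the URI and the reader choice can be sketched as small helpers. This is an illustrative sketch only: `abfss_uri` and `read_adls` are hypothetical names, the container, account, and path values are invented, and choosing the reader from the file extension assumes your files use conventional extensions.

```python
import posixpath


def abfss_uri(container, storage_account, path):
    """Build an abfss:// URI for a file in an ADLS Gen2 container."""
    return f"abfss://{container}@{storage_account}.dfs.core.windows.net/{path.lstrip('/')}"


def read_adls(spark, uri):
    """Pick a Spark reader from the file extension (assumes conventional extensions)."""
    ext = posixpath.splitext(uri)[1].lower()
    if ext == ".csv":
        return spark.read.csv(uri, header=True, inferSchema=True)
    if ext == ".json":
        return spark.read.json(uri)
    if ext == ".parquet":
        return spark.read.parquet(uri)
    raise ValueError(f"Unsupported file extension: {ext!r}")


# Invented example values.
uri = abfss_uri("raw", "mystorageacct", "/sales/2024/file.csv")
```

In a notebook where `spark` is already defined and authentication is configured, `df = read_adls(spark, uri)` would then load the file.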

Display the Data

After reading the file, you can display the contents:

```python
display(df)
```
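`display` is a Databricks notebook function; the standard DataFrame methods `printSchema`, `limit`, and `show` cover similar ground in any PySpark session. The helper below is a hedged sketch with a hypothetical name, `preview`, that uses `display` when it is passed in and falls back to `show` otherwise.

```python
def preview(df, n=20, display_fn=None):
    """Print the schema, then show up to n rows of a Spark DataFrame."""
    df.printSchema()
    if display_fn is not None:
        # In a Databricks notebook, pass display_fn=display for the rich table view.
        display_fn(df.limit(n))
    else:
        # Plain-text fallback that works in any PySpark session.
        df.show(n, truncate=False)
```

In a Databricks notebook this could be called as `preview(df, display_fn=display)`; elsewhere, `preview(df)` prints the rows as text.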

Additional Notes

Remember to replace placeholder values with your actual ADLS Gen2 account details, container names, and file paths.