Connecting to SQL Server from Databricks
To connect to SQL Server from Databricks, you can use the sqlserver connector bundled with Databricks Runtime 11.3 LTS and above. Here's how:
When working with DataFrames, use the following syntax:
```python
remote_table = (spark.read
    .format("sqlserver")
    .option("host", "hostName")
    .option("port", "port")  # optional; defaults to 1433 if omitted
    .option("user", "username")
    .option("password", "password")
    .option("database", "databaseName")
    .option("dbtable", "schemaName.tableName")  # if schemaName is omitted, defaults to "dbo"
    .load())
```
Alternatively, for SQL queries, specify sqlserver in the USING clause:
```sql
DROP TABLE IF EXISTS sqlserver_table;

CREATE TABLE sqlserver_table
USING sqlserver
OPTIONS (
  dbtable '',
  host '',
  port '1433',
  database '',
  user '',
  password ''
);
```
For Databricks Runtime 10.4 LTS and below, you must use the JDBC driver:
```python
driver = "com.microsoft.sqlserver.jdbc.SQLServerDriver"

database_host = ""
database_port = "1433"  # update if you use a non-default port
database_name = ""
table = ""
user = ""
password = ""

url = f"jdbc:sqlserver://{database_host}:{database_port};database={database_name}"

remote_table = (spark.read
    .format("jdbc")
    .option("driver", driver)
    .option("url", url)
    .option("dbtable", table)
    .option("user", user)
    .option("password", password)
    .load())
```
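The JDBC URL above simply concatenates host, port, and database. If you build these URLs in several notebooks, a small helper keeps the format consistent and easy to unit-test. This is an illustrative sketch, not part of any Databricks API; `encrypt` maps to the Microsoft JDBC driver's connection property of the same name.

```python
def build_sqlserver_jdbc_url(host: str, database: str, port: int = 1433,
                             encrypt: bool = True) -> str:
    """Build a SQL Server JDBC URL in the form the Microsoft driver expects.

    When `encrypt` is True, the driver's `encrypt=true` property is appended,
    which requests TLS for the connection.
    """
    url = f"jdbc:sqlserver://{host}:{port};database={database}"
    if encrypt:
        url += ";encrypt=true"
    return url
```

For example, `build_sqlserver_jdbc_url("myhost.example.com", "sales")` yields `jdbc:sqlserver://myhost.example.com:1433;database=sales;encrypt=true`, which you can pass directly to the `"url"` option above.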
Frequently Asked Questions
- Q: What is the default port for SQL Server connections in Databricks?
A: The default port for SQL Server connections is 1433.
- Q: Can I use Lakehouse Federation for SQL Server connections?
A: Yes, Lakehouse Federation is recommended for full query federation support, allowing you to use Unity Catalog syntax and data governance tools.
- Q: How do I handle schema names in SQL Server connections?
A: If you don’t specify a schema name, it defaults to “dbo”. Otherwise, you can specify it in the “dbtable” option as “schemaName.tableName”.
- Q: What permissions are required to create a SQL Server connection in Azure Databricks?
A: You need “CREATE CONNECTION” privileges on the metastore to create a new connection.
- Q: Can I use SQL Server connections with older Databricks Runtime versions?
A: Yes, for versions below 11.3 LTS, you must use the JDBC driver.
- Q: How do I test the SQL Server connection in Azure Databricks?
A: You can test the connection by verifying that the host is reachable. However, this does not validate the username and password.
- Q: Are there any limitations to using SQL Server connections in Databricks?
A: Yes, the configurations for SQL Server connections are currently experimental and not fully supported by Databricks customer technical support.
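As noted above, the built-in connection test only verifies that the host is reachable, not that the credentials are valid. The same TCP-level check can be sketched in plain Python (an illustrative helper, not a Databricks API):

```python
import socket

def is_host_reachable(host: str, port: int = 1433, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    This only confirms network reachability of the SQL Server endpoint;
    it says nothing about whether the username and password are correct.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A `False` result usually points to a firewall rule, a wrong hostname, or a non-default port rather than a credential problem.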
Bottom Line: Connecting to SQL Server from Databricks is straightforward using the included driver in newer runtime versions or the JDBC driver in older versions. However, for full query federation and data governance features, consider using Lakehouse Federation.
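For the Lakehouse Federation route mentioned above, the setup is a Unity Catalog connection plus a foreign catalog. The following is a sketch based on the Lakehouse Federation SQL syntax; all names and credential values are placeholders:

```sql
-- Requires the CREATE CONNECTION privilege on the metastore
CREATE CONNECTION sqlserver_connection TYPE sqlserver
OPTIONS (
  host '<hostname>',
  port '1433',
  user '<username>',
  password '<password>'
);

-- Expose a SQL Server database as a catalog governed by Unity Catalog
CREATE FOREIGN CATALOG sqlserver_catalog
USING CONNECTION sqlserver_connection
OPTIONS (database '<database-name>');
```

Once the foreign catalog exists, tables can be queried with three-level namespace syntax, e.g. `SELECT * FROM sqlserver_catalog.dbo.my_table`.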