Providing Paths in Databricks to Load a File

When working with Databricks, the way you write a file path determines which file system it resolves against. Databricks exposes two primary file systems: the local file system of the driver node and the Databricks File System (DBFS). The required path syntax depends on the target file system and on the type of code being executed.

Default File Systems and Prefixes

Command/Code        Default location     Prefix to access DBFS    Prefix to access local file system
%fs                 DBFS root            Optional (dbfs:/)        file:/
dbutils.fs          DBFS root            Optional (dbfs:/)        file:/
spark.read/write    DBFS root            Optional (dbfs:/)        file:/
Spark SQL           DBFS root            Optional (dbfs:/)        file:/
Python code         Local file system    /dbfs                    None
%sh                 Local file system    /dbfs                    None
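For instance, dbutils.fs resolves bare paths against the DBFS root, so the dbfs:/ prefix is optional, while the file:/ prefix redirects it to the driver's local disk. A minimal sketch, where the paths are hypothetical placeholders:

# Bare paths resolve against the DBFS root, so these two calls are equivalent.
dbutils.fs.ls("/mnt/test_folder/")
dbutils.fs.ls("dbfs:/mnt/test_folder/")

# The file:/ prefix points dbutils.fs at the driver's local file system.
dbutils.fs.ls("file:/tmp/")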

Examples of Loading Files

Using Spark: To load a file from DBFS using Spark, you can use the following command:

spark.read.parquet("dbfs:/mnt/test_folder/test_folder1/file.parquet")
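Since spark.read defaults to the DBFS root, the dbfs:/ prefix above is optional. Reading from the driver's local file system instead requires the file:/ prefix. A minimal sketch with hypothetical paths (a file:/ read like this assumes the file is reachable from the cluster's nodes):

# The dbfs:/ prefix is optional; this reads the same file as above.
df = spark.read.parquet("/mnt/test_folder/test_folder1/file.parquet")

# Reading from the driver's local file system requires the file:/ prefix.
df_local = spark.read.parquet("file:/tmp/file.parquet")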

Using Python: Plain Python code defaults to the driver's local file system, so a DBFS path must be accessed through the /dbfs mount:

import os
os.listdir("/dbfs/mnt/test_folder/test_folder1/")

(The dbutils.fs equivalent is dbutils.fs.ls("/mnt/test_folder/test_folder1/") with no /dbfs prefix, because dbutils.fs resolves paths against the DBFS root.)
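The magic commands follow the same rules: a %sh cell defaults to the local file system and reaches DBFS through the /dbfs mount, while a %fs cell resolves bare paths against the DBFS root. A sketch with the same hypothetical path, one command per notebook cell:

%sh ls /dbfs/mnt/test_folder/test_folder1/

%fs ls /mnt/test_folder/test_folder1/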

Bottom Line

Providing paths in Databricks requires understanding the default file systems and prefixes for different types of code. By using the appropriate prefixes and commands, you can efficiently load files from both DBFS and the Local File System.
