Specifying Paths in Databricks

In Databricks, how you specify a path depends on which API executes the command and which file system it targets. Commands such as the %fs magic, dbutils.fs, and Spark SQL default to DBFS (Databricks File System), while plain Python code defaults to the local file system of the driver node.
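These defaults can be summarized as a small lookup. This is an illustrative summary of the rules above, not an official Databricks API:

```python
# Default file system per API surface in a Databricks notebook
# (illustrative summary only, not part of any Databricks API).
DEFAULT_FS = {
    "%fs": "dbfs:/",
    "dbutils.fs": "dbfs:/",
    "spark.read / Spark SQL": "dbfs:/",
    "python (open, os, pathlib)": "local driver file system",
}

for api, root in DEFAULT_FS.items():
    print(f"{api:30s} -> {root}")
```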

To access DBFS from Python code, you need to use the /dbfs prefix. For example, to list the contents of the DBFS root using Python, you would use:

import os

# The /dbfs prefix exposes DBFS as a FUSE mount on the driver node,
# so ordinary Python file APIs can read it.
print(os.listdir("/dbfs/"))
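Because the same DBFS location is written differently depending on the API (`dbfs:/...` for dbutils.fs and Spark, `/dbfs/...` for plain Python), it can help to translate between the two conventions. A minimal sketch, using a hypothetical helper that is not part of the Databricks API:

```python
def dbfs_uri_to_fuse_path(uri: str) -> str:
    """Translate a DBFS path as used by dbutils.fs/%fs/Spark into the
    /dbfs FUSE path that plain Python file APIs expect on the driver.
    Hypothetical helper, not part of the Databricks API."""
    if uri.startswith("dbfs:/"):
        return "/dbfs/" + uri[len("dbfs:/"):].lstrip("/")
    if uri.startswith("/"):  # bare DBFS path such as /FileStore/x
        return "/dbfs" + uri
    raise ValueError(f"not a DBFS path: {uri}")

print(dbfs_uri_to_fuse_path("dbfs:/FileStore/my_file.txt"))  # /dbfs/FileStore/my_file.txt
```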

To access the Local File System from commands that default to DBFS, you use the file:/ prefix. For instance, to list the contents of the Local File System root using %fs, you would use:

%fs ls file:/

When using dbutils.fs to copy a file from the local file system to DBFS, specify the source path with the file:/ prefix and the destination as a DBFS path, either with the dbfs:/ prefix or with no prefix at all (DBFS is the default for dbutils.fs). Note that the /dbfs prefix is only for plain Python file APIs; it is not valid in dbutils.fs paths.

dbutils.fs.cp("file:/tmp/my_file.txt", "/FileStore/")  # source is local, destination is DBFS
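Going the other way, a driver-local path handed to dbutils.fs or %fs needs the file:/ scheme, or it will be interpreted as a DBFS path. A hypothetical helper (again, not part of the Databricks API) that adds the scheme when missing:

```python
def local_path_to_file_uri(path: str) -> str:
    """Prefix a driver-local path with file:/ so that DBFS-default
    tools (dbutils.fs, %fs) treat it as local. Hypothetical helper,
    not part of the Databricks API."""
    if path.startswith("file:/"):
        return path  # already scheme-qualified
    if not path.startswith("/"):
        raise ValueError(f"expected an absolute path: {path}")
    return "file:" + path

# The copy above could then be written as:
# dbutils.fs.cp(local_path_to_file_uri("/tmp/my_file.txt"), "/FileStore/")
print(local_path_to_file_uri("/tmp/my_file.txt"))  # file:/tmp/my_file.txt
```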

Bottom Line: Specifying paths in Databricks requires understanding the default file systems for different types of code and using appropriate prefixes to access non-default file systems. This ensures seamless interaction with both DBFS and the Local File System.
