This is a Model Context Protocol (MCP) server for executing SQL queries against Databricks using the Statement Execution API. It retrieves data by issuing SQL requests through the Databricks API. When used in Agent mode, it can iterate over a series of requests to perform complex tasks, and it works even better when coupled with Unity Catalog metadata.
Install the required dependencies:

```bash
pip install -r requirements.txt
```

Or if using `uv`, ensure it's installed, then:

```bash
uv pip install -r requirements.txt
```
Set up your environment variables:
Option 1: Using a .env file (recommended)
Create a `.env` file with your Databricks credentials:

```
DATABRICKS_HOST=your-databricks-instance.cloud.databricks.com
DATABRICKS_TOKEN=your-databricks-access-token
DATABRICKS_SQL_WAREHOUSE_ID=your-sql-warehouse-id
```
Option 2: Setting environment variables directly
```bash
export DATABRICKS_HOST="your-databricks-instance.cloud.databricks.com"
export DATABRICKS_TOKEN="your-databricks-access-token"
export DATABRICKS_SQL_WAREHOUSE_ID="your-sql-warehouse-id"
```
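For illustration, here is a minimal sketch of how a server like this might load those settings, assuming the `python-dotenv` package is available (the variable names match the ones above; the actual loading code in this repo may differ):

```python
import os

from dotenv import load_dotenv

# Load variables from a .env file if present. Existing environment
# variables are not overridden, so Option 2 still takes precedence.
load_dotenv()

DATABRICKS_HOST = os.environ["DATABRICKS_HOST"]
DATABRICKS_TOKEN = os.environ["DATABRICKS_TOKEN"]
DATABRICKS_SQL_WAREHOUSE_ID = os.environ["DATABRICKS_SQL_WAREHOUSE_ID"]
```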
You can find your SQL warehouse ID in the Databricks UI under SQL Warehouses.
Before using this MCP server, ensure that:
1. SQL Warehouse Permissions: The user associated with the provided token must have appropriate permissions to access the specified SQL warehouse. You can configure warehouse permissions in the Databricks UI under SQL Warehouses > [Your Warehouse] > Permissions.
2. Token Permissions: The personal access token should carry only the minimum permissions needed for the required operations; using a dedicated, least-privilege token is strongly recommended.
3. Data Access Permissions: The user associated with the token must have appropriate permissions to access the catalogs, schemas, and tables that will be queried.
To manage SQL warehouse permissions via the Databricks REST API, you can use:

- `GET /api/2.0/sql/permissions/warehouses/{warehouse_id}` to check current permissions
- `PATCH /api/2.0/sql/permissions/warehouses/{warehouse_id}` to update permissions

For security best practices, consider regularly rotating your access tokens and auditing query history to monitor usage.
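As a rough illustration, a permissions check could look like the following with the `requests` library. The endpoint path is the one listed above; host, token, and warehouse ID come from the environment variables configured earlier. This is a sketch, not part of the server itself:

```python
import os

import requests

host = os.environ["DATABRICKS_HOST"]
warehouse_id = os.environ["DATABRICKS_SQL_WAREHOUSE_ID"]

# Fetch the current access-control list for the warehouse.
resp = requests.get(
    f"https://{host}/api/2.0/sql/permissions/warehouses/{warehouse_id}",
    headers={"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```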
To run the server in standalone mode:
```bash
python main.py
```
This will start the MCP server using stdio transport, which can be used with Agent Composer or other MCP clients.
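For context on what that looks like in code, below is a minimal sketch of a stdio MCP server using the MCP Python SDK's `FastMCP` helper. This is an assumption about the general shape of `main.py`, not its actual contents; the tool body is a placeholder:

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("databricks")

@mcp.tool()
def execute_sql_query(sql: str) -> str:
    """Execute a SQL query against Databricks and return the results."""
    # Placeholder: the real server calls the Databricks Statement
    # Execution API (see the polling sketch at the end of this README).
    raise NotImplementedError

if __name__ == "__main__":
    # stdio transport lets MCP clients such as Cursor or Agent Composer
    # spawn the server as a subprocess and talk over stdin/stdout.
    mcp.run(transport="stdio")
```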
To use this MCP server with Cursor, you need to configure it in your Cursor settings:
1. Create a `.cursor` directory in your home directory if it doesn't already exist, and create an `mcp.json` file in that directory:

```bash
mkdir -p ~/.cursor
touch ~/.cursor/mcp.json
```

2. Add the following configuration to the `mcp.json` file, replacing the directory path with the actual path to where you've installed this server:

```json
{
  "mcpServers": {
    "databricks": {
      "command": "uv",
      "args": [
        "--directory",
        "/path/to/your/mcp-databricks-server",
        "run",
        "main.py"
      ]
    }
  }
}
```
If you're not using `uv`, you can use `python` instead:

```json
{
  "mcpServers": {
    "databricks": {
      "command": "python",
      "args": [
        "/path/to/your/mcp-databricks-server/main.py"
      ]
    }
  }
}
```
Now you can use the Databricks MCP server directly within Cursor's AI assistant.
The server provides the following tools:
- `execute_sql_query(sql: str) -> str`: Execute a SQL query and return the results
- `list_schemas(catalog: str) -> str`: List all available schemas in a specific catalog
- `list_tables(schema: str) -> str`: List all tables in a specific schema
- `describe_table(table_name: str) -> str`: Describe a table's schema
In Agent Composer or other MCP clients, you can use these tools like:
execute_sql_query("SELECT * FROM my_schema.my_table LIMIT 10")
list_schemas("my_catalog")
list_tables("my_catalog.my_schema")
describe_table("my_catalog.my_schema.my_table")
The server is designed to handle long-running queries by polling the Databricks API until the query completes or times out. The default timeout is 10 minutes (60 retries at 10-second intervals), which can be adjusted in the `dbapi.py` file if needed.
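For reference, the submit-and-poll pattern against the Statement Execution API looks roughly like the sketch below. The retry count and interval mirror the defaults described above; error handling is reduced to the essentials, and the actual code in `dbapi.py` may be structured differently:

```python
import os
import time

import requests

HOST = os.environ["DATABRICKS_HOST"]
HEADERS = {"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"}

def execute_sql(sql: str, max_retries: int = 60, interval: float = 10.0) -> dict:
    # Submit the statement to the configured SQL warehouse.
    resp = requests.post(
        f"https://{HOST}/api/2.0/sql/statements/",
        headers=HEADERS,
        json={"statement": sql,
              "warehouse_id": os.environ["DATABRICKS_SQL_WAREHOUSE_ID"]},
        timeout=30,
    )
    resp.raise_for_status()
    statement = resp.json()

    # Poll until the statement leaves PENDING/RUNNING or the retry
    # budget (60 x 10 s = 10 minutes) is exhausted.
    for _ in range(max_retries):
        if statement["status"]["state"] not in ("PENDING", "RUNNING"):
            break
        time.sleep(interval)
        resp = requests.get(
            f"https://{HOST}/api/2.0/sql/statements/{statement['statement_id']}",
            headers=HEADERS,
            timeout=30,
        )
        resp.raise_for_status()
        statement = resp.json()

    if statement["status"]["state"] != "SUCCEEDED":
        raise RuntimeError(f"Query did not succeed: {statement['status']}")
    return statement.get("result", {})
```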