A Model Context Protocol (MCP) server for Firecrawl Simple, a powerful web scraping and site mapping tool that enables Large Language Models (LLMs) to access and process web content.
Before installing and using Firecrawl Simple MCP Server, ensure you have Node.js (with npm) installed and a running Firecrawl Simple instance to connect to.
```bash
# Install the package
npm install -g firecrawl-simple-mcp

# Run with default configuration
FIRECRAWL_API_URL=http://localhost:3002/v1 firecrawl-simple-mcp

# Test with a simple scrape request
curl -X POST http://localhost:3003/mcp/tool \
  -H "Content-Type: application/json" \
  -d '{"name":"firecrawl_scrape","arguments":{"url":"https://example.com"}}'
```
```bash
npm install -g firecrawl-simple-mcp
```
```bash
git clone https://github.com/dsafonov/firecrawl-simple-mcp.git
cd firecrawl-simple-mcp
npm install
npm run build
```
```bash
# Basic usage with self-hosted Firecrawl Simple
FIRECRAWL_API_URL=http://localhost:3002/v1 firecrawl-simple-mcp

# With additional configuration
FIRECRAWL_API_URL=http://localhost:3002/v1 \
FIRECRAWL_LOG_LEVEL=DEBUG \
firecrawl-simple-mcp
```
The server can be configured using environment variables:
| Variable | Description | Default |
|---|---|---|
| `FIRECRAWL_API_URL` | URL of the Firecrawl Simple API | `http://localhost:3002/v1` |
| `FIRECRAWL_API_KEY` | API key for authentication (if required) | - |
| `FIRECRAWL_API_TIMEOUT` | API request timeout in milliseconds | `30000` |
| `FIRECRAWL_SERVER_PORT` | Port for the MCP server when using SSE transport | `3003` |
| `FIRECRAWL_TRANSPORT_TYPE` | Transport type (`stdio` or `sse`) | `stdio` |
| `FIRECRAWL_LOG_LEVEL` | Logging level (DEBUG, INFO, WARN, ERROR) | `INFO` |
| `FIRECRAWL_VERSION` | Version identifier | `1.0.0` |
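As a sketch of how these variables combine, the following illustrative TypeScript helper (not the server's actual code; the interface and function names are assumptions) resolves each setting against the documented defaults:

```typescript
// Illustrative config resolution for the variables in the table above.
// Each field falls back to the documented default when the variable is unset.
interface FirecrawlConfig {
  apiUrl: string;
  apiKey?: string;
  apiTimeout: number;
  serverPort: number;
  transportType: "stdio" | "sse";
  logLevel: "DEBUG" | "INFO" | "WARN" | "ERROR";
}

function loadConfig(env: Record<string, string | undefined>): FirecrawlConfig {
  return {
    apiUrl: env.FIRECRAWL_API_URL ?? "http://localhost:3002/v1",
    apiKey: env.FIRECRAWL_API_KEY,
    apiTimeout: Number(env.FIRECRAWL_API_TIMEOUT ?? 30000),
    serverPort: Number(env.FIRECRAWL_SERVER_PORT ?? 3003),
    transportType: (env.FIRECRAWL_TRANSPORT_TYPE ?? "stdio") as "stdio" | "sse",
    logLevel: (env.FIRECRAWL_LOG_LEVEL ?? "INFO") as FirecrawlConfig["logLevel"],
  };
}
```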
Add this to your `claude_desktop_config.json`:
```json
{
  "mcpServers": {
    "firecrawl-simple-mcp": {
      "command": "npx",
      "args": ["-y", "firecrawl-simple-mcp"],
      "env": {
        "FIRECRAWL_API_URL": "http://localhost:3002/v1"
      }
    }
  }
}
```
To configure in Cursor:
```bash
env FIRECRAWL_API_URL=http://localhost:3002/v1 npx -y firecrawl-simple-mcp
```
`firecrawl_scrape`

Scrape content from a single URL with JavaScript rendering support.
```json
{
  "name": "firecrawl_scrape",
  "arguments": {
    "url": "https://example.com",
    "formats": ["markdown", "html", "rawHtml"],
    "waitFor": 1000,
    "timeout": 30000,
    "includeTags": ["article", "main"],
    "excludeTags": ["nav", "footer"],
    "headers": {
      "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
      "Accept-Language": "en-US,en;q=0.9"
    },
    "mobile": false
  }
}
```
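The same request can be built programmatically. The TypeScript sketch below is illustrative and not part of the package's API; the endpoint and port in the commented call come from the quickstart's curl example and assume the SSE transport:

```typescript
// Build the tool-call payload shown above. The helper name is illustrative.
function scrapeRequest(url: string, formats: string[] = ["markdown"]) {
  return { name: "firecrawl_scrape", arguments: { url, formats } };
}

// Sending it over HTTP (assumes the server runs with SSE transport on port 3003):
// await fetch("http://localhost:3003/mcp/tool", {
//   method: "POST",
//   headers: { "Content-Type": "application/json" },
//   body: JSON.stringify(scrapeRequest("https://example.com")),
// });
```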
Parameters:

- `url` (required): The URL to scrape.
- `formats`: Array of output formats to return. Options include "markdown", "html", and "rawHtml".
- `waitFor`: Time to wait for JavaScript execution in milliseconds.
- `timeout`: Request timeout in milliseconds.
- `includeTags`: HTML tags to include in the result.
- `excludeTags`: HTML tags to exclude from the result.
- `headers`: Custom HTTP headers to send with the request.
- `mobile`: Whether to use a mobile viewport.

`firecrawl_map`

Generate a sitemap of a given site.
```json
{
  "name": "firecrawl_map",
  "arguments": {
    "url": "https://example.com",
    "search": "optional search term",
    "ignoreSitemap": true,
    "includeSubdomains": false,
    "limit": 5000
  }
}
```
Parameters:

- `url` (required): The URL to map.
- `search`: Search term to filter URLs.
- `ignoreSitemap`: Whether to ignore sitemap.xml.
- `includeSubdomains`: Include subdomains in mapping.
- `limit`: Maximum number of URLs to map.

Error: `Failed to connect to Firecrawl API at http://localhost:3002/v1`
Solution: verify that Firecrawl Simple is running and that `FIRECRAWL_API_URL` includes the `/v1` path.

Error: `Authentication failed: Invalid API key`

Solution: check that `FIRECRAWL_API_KEY` is correct.

Error: `Request timed out after 30000ms`

Solution: increase the `FIRECRAWL_API_TIMEOUT` value, or raise the `waitFor` parameter to allow more time for JavaScript execution.

Error: `Rate limit exceeded`

Solution: reduce the frequency of requests, or wait before retrying.

Error: `Invalid URL format`

Solution: make sure the URL includes a scheme (e.g. `https://`).
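For the timeout case, a client can also bound a slow call on its own side, mirroring what `FIRECRAWL_API_TIMEOUT` does inside the server. The `withTimeout` helper below is an illustrative sketch, not part of the package:

```typescript
// Reject a pending request once the given number of milliseconds elapses,
// producing the same style of error message as the server's timeout.
async function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`Request timed out after ${ms}ms`)),
      ms,
    );
  });
  try {
    // Whichever settles first wins; the timer is cleaned up either way.
    return await Promise.race([promise, timeout]);
  } finally {
    clearTimeout(timer);
  }
}
```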
The crawl functionality has been intentionally removed from this MCP server for the following reasons:
Context Management: The crawl functionality provides too much information in the context of an MCP server, which can lead to context overflow issues for LLMs. This is because crawling multiple pages generates large amounts of text that would exceed the context limits of most models.
Asynchronous Operation: Crawling runs asynchronously, which is not ideal for the MCP server architecture that works best with synchronous request-response patterns. The asynchronous nature of crawling makes it difficult to integrate with the synchronous communication model of MCP.
Documentation Alignment: We've aligned the available tools with the primary documentation to ensure consistency and clarity for users.
If you need website crawling capabilities, consider using the individual scrape tool with multiple targeted URLs or implementing a custom solution outside the MCP server.
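As a sketch of that alternative, the loop below scrapes a fixed list of URLs one at a time. `callTool` is a placeholder for however your MCP client invokes a tool; all names here are illustrative, not part of the package:

```typescript
// Sequentially invoke the firecrawl_scrape tool for each URL, keeping each
// page's content in its own bounded response instead of one large crawl.
type ToolCaller = (name: string, args: Record<string, unknown>) => Promise<unknown>;

async function scrapeEach(urls: string[], callTool: ToolCaller): Promise<unknown[]> {
  const results: unknown[] = [];
  for (const url of urls) {
    results.push(await callTool("firecrawl_scrape", { url, formats: ["markdown"] }));
  }
  return results;
}
```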
The codebase has been optimized with several key improvements, including stricter TypeScript typing that avoids `any` types.

```bash
# Install dependencies
npm install

# Run in development mode
npm run dev

# Build
npm run build

# Run tests
npm test

# Run tests with coverage
npm run coverage

# Lint code
npm run lint

# Type check
npm run typecheck
```
The project includes comprehensive tests for all tools and services. To run the tests:
```bash
npm test
```
To generate a test coverage report:
```bash
npm run coverage
```
The test suite includes unit tests for each tool and the underlying API services.
Contributions are welcome! Please feel free to submit a Pull Request.
1. Fork the repository
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add some amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

This project is licensed under the MIT License.