ESPHome AI Collaboration Guide
This document provides essential context for AI models interacting with this project. Adhering to these guidelines will ensure consistency and maintain code quality.
1. Project Overview & Purpose
- Primary Goal: ESPHome is a system to configure microcontrollers (like ESP32, ESP8266, RP2040, and LibreTiny-based chips) using simple yet powerful YAML configuration files. It generates C++ firmware that can be compiled and flashed to these devices, allowing users to control them remotely through home automation systems.
- Business Domain: Internet of Things (IoT), Home Automation.
2. Core Technologies & Stack
- Languages: Python (>=3.11), C++ (gnu++20)
- Frameworks & Runtimes: PlatformIO, Arduino, ESP-IDF.
- Build Systems: PlatformIO is the primary build system. CMake is used as an alternative.
- Configuration: YAML.
- Key Libraries/Dependencies:
  - Python: voluptuous (for configuration validation), PyYAML (for parsing configuration files), paho-mqtt (for MQTT communication), tornado (for the web server), aioesphomeapi (for the native API).
  - C++: ArduinoJson (for JSON serialization/deserialization), AsyncMqttClient-esphome (for MQTT), ESPAsyncWebServer (for the web server).
- Package Manager(s): pip (for Python dependencies), platformio (for C++/PlatformIO dependencies).
- Communication Protocols: Protobuf (for native API), MQTT, HTTP.
3. Architectural Patterns
- Overall Architecture: The project follows a code-generation architecture. The Python code parses user-defined YAML configuration files and generates C++ source code. This C++ code is then compiled and flashed to the target microcontroller using PlatformIO.
- Directory Structure Philosophy:
  - /esphome: Contains the core Python source code for the ESPHome application.
  - /esphome/components: Contains the individual components that can be used in ESPHome configurations. Each component is a self-contained unit with its own C++ and Python code.
  - /tests: Contains all unit and integration tests for the Python code.
  - /docker: Contains Docker-related files for building and running ESPHome in a container.
  - /script: Contains helper scripts for development and maintenance.
- Core Architectural Components:
  - Configuration System (esphome/config*.py): Handles YAML parsing and validation using Voluptuous, schema definitions, and multi-platform configurations.
  - Code Generation (esphome/codegen.py, esphome/cpp_generator.py): Manages Python to C++ code generation, template processing, and build flag management.
  - Component System (esphome/components/): Contains modular hardware and software components with platform-specific implementations and dependency management.
  - Core Framework (esphome/core/): Manages the application lifecycle, hardware abstraction, and component registration.
  - Dashboard (esphome/dashboard/): A web-based interface for device configuration, management, and OTA updates.
- Platform Support:
  - ESP32 (components/esp32/): Espressif ESP32 family. Supports multiple variants (Original, C2, C3, C5, C6, H2, P4, S2, S3) with ESP-IDF framework. Arduino framework supports only a subset of the variants (Original, C3, S2, S3).
  - ESP8266 (components/esp8266/): Espressif ESP8266. Arduino framework only, with memory constraints.
  - RP2040 (components/rp2040/): Raspberry Pi Pico/RP2040. Arduino framework with PIO (Programmable I/O) support.
  - LibreTiny (components/libretiny/): Realtek and Beken chips. Supports multiple chip families and auto-generated components.
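As a rough illustration of the code-generation flow described under Overall Architecture, the sketch below turns an already-parsed configuration dictionary into C++ statements. It is a toy stand-in and not ESPHome's actual generator: generate_cpp, the dictionary layout, and the emitted variable names are all hypothetical.

```python
# Toy stand-in for the "parsed YAML -> C++ source" step.
# Nothing here is ESPHome's real generator; all names are hypothetical.


def generate_cpp(config: dict) -> str:
    """Turn an already-parsed, already-validated config dict into C++ statements."""
    lines = []
    for var_name, settings in config.items():
        class_name = settings["class"]
        lines.append(f"auto *{var_name} = new {class_name}();")
        for key, value in settings.get("setters", {}).items():
            # Strings become C++ string literals; other values are emitted via str().
            literal = f'"{value}"' if isinstance(value, str) else str(value)
            lines.append(f"{var_name}->set_{key}({literal});")
    return "\n".join(lines)


parsed_config = {
    "my_component_1": {
        "class": "my_component::MyComponent",
        "setters": {"key": "abc", "param": 42},
    },
}
generated = generate_cpp(parsed_config)
print(generated)
```

In the real project the generated source is then handed to PlatformIO for compilation; this sketch stops at producing the text.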
4. Coding Conventions & Style Guide
- Formatting:
  - Python: Uses ruff and flake8 for linting and formatting. Configuration is in pyproject.toml.
  - C++: Uses clang-format for formatting. Configuration is in .clang-format.
- Naming Conventions:
  - Python: Follows PEP 8. Use clear, descriptive names following snake_case.
  - C++: Follows the Google C++ Style Guide with these specifics (following clang-tidy conventions):
    - Function, method, and variable names: lower_snake_case
    - Class/struct/enum names: UpperCamelCase
    - Top-level constants (global/namespace scope): UPPER_SNAKE_CASE
    - Function-local constants: lower_snake_case
    - Protected/private fields: lower_snake_case_with_trailing_underscore_
    - Favor descriptive names over abbreviations
- C++ Field Visibility:
  - Prefer protected: Use protected for most class fields to enable extensibility and testing. Fields should be lower_snake_case_with_trailing_underscore_.
  - Use private for safety-critical cases: Use private visibility when direct field access could introduce bugs or violate invariants:
    - Pointer lifetime issues: When setters validate and store pointers from known lists to prevent dangling references.

          // Helper to find matching string in vector and return its pointer
          inline const char *vector_find(const std::vector<const char *> &vec, const char *value) {
            for (const char *item : vec) {
              if (strcmp(item, value) == 0)
                return item;
            }
            return nullptr;
          }

          class ClimateDevice {
           public:
            void set_custom_fan_modes(std::initializer_list<const char *> modes) {
              this->custom_fan_modes_ = modes;
              this->active_custom_fan_mode_ = nullptr;  // Reset when modes change
            }
            bool set_custom_fan_mode(const char *mode) {
              // Find mode in supported list and store that pointer (not the input pointer)
              const char *validated_mode = vector_find(this->custom_fan_modes_, mode);
              if (validated_mode != nullptr) {
                this->active_custom_fan_mode_ = validated_mode;
                return true;
              }
              return false;
            }

           private:
            std::vector<const char *> custom_fan_modes_;   // Pointers to string literals in flash
            const char *active_custom_fan_mode_{nullptr};  // Must point to entry in custom_fan_modes_
          };

    - Invariant coupling: When multiple fields must remain synchronized to prevent buffer overflows or data corruption.

          class Buffer {
           public:
            void resize(size_t new_size) {
              auto new_data = std::make_unique<uint8_t[]>(new_size);
              if (this->data_) {
                std::memcpy(new_data.get(), this->data_.get(), std::min(this->size_, new_size));
              }
              this->data_ = std::move(new_data);
              this->size_ = new_size;  // Must stay in sync with data_
            }

           private:
            std::unique_ptr<uint8_t[]> data_;
            size_t size_{0};  // Must match allocated size of data_
          };

    - Resource management: When setters perform cleanup or registration operations that derived classes might skip.
  - Provide protected accessor methods: When derived classes need controlled access to private members.
- C++ Preprocessor Directives:
  - Avoid #define for constants: Using #define for constants is discouraged and should be replaced with const variables or enums.
  - Use #define only for:
    - Conditional compilation (#ifdef, #ifndef)
    - Compile-time sizes calculated during Python code generation (e.g., configuring std::array or StaticVector dimensions via cg.add_define())
- C++ Additional Conventions:
  - Member access: Prefix all class member access with this-> (e.g., this->value_ not value_)
  - Indentation: Use spaces (two per indentation level), not tabs
  - Type aliases: Prefer using type_t = int; over typedef int type_t;
  - Line length: Wrap lines at no more than 120 characters
- Component Structure:
  - Standard Files:

        components/[component_name]/
        ├── __init__.py          # Component configuration schema and code generation
        ├── [component].h        # C++ header file (if needed)
        ├── [component].cpp      # C++ implementation (if needed)
        └── [platform]/          # Platform-specific implementations
            ├── __init__.py      # Platform-specific configuration
            ├── [platform].h     # Platform C++ header
            └── [platform].cpp   # Platform C++ implementation

  - Component Metadata:
    - DEPENDENCIES: List of required components
    - AUTO_LOAD: Components to automatically load
    - CONFLICTS_WITH: Incompatible components
    - CODEOWNERS: GitHub usernames responsible for maintenance
    - MULTI_CONF: Whether multiple instances are allowed
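To make the Component Metadata list concrete, here is a minimal sketch of how these module-level constants might look at the top of a component's __init__.py. The metadata names come from the list above; every value (the i2c and sensor entries, the conflicting component, the GitHub handle) is a made-up placeholder for a hypothetical component.

```python
# Hypothetical metadata for a made-up component; only the constant names
# come from the guide, the values are placeholders.
DEPENDENCIES = ["i2c"]                # Components that must be present in the config
AUTO_LOAD = ["sensor"]                # Components to load automatically
CONFLICTS_WITH = ["other_component"]  # Components that cannot be used alongside this one
CODEOWNERS = ["@example-user"]        # GitHub usernames responsible for maintenance
MULTI_CONF = True                     # Allow more than one instance in a configuration
```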
- Code Generation & Common Patterns:
  - Configuration Schema Pattern:

        import esphome.codegen as cg
        import esphome.config_validation as cv
        from esphome.const import CONF_KEY, CONF_ID

        CONF_PARAM = "param"  # A constant that does not yet exist in esphome/const.py

        my_component_ns = cg.esphome_ns.namespace("my_component")
        MyComponent = my_component_ns.class_("MyComponent", cg.Component)

        CONFIG_SCHEMA = cv.Schema({
            cv.GenerateID(): cv.declare_id(MyComponent),
            cv.Required(CONF_KEY): cv.string,
            cv.Optional(CONF_PARAM, default=42): cv.int_,
        }).extend(cv.COMPONENT_SCHEMA)

        async def to_code(config):
            var = cg.new_Pvariable(config[CONF_ID])
            await cg.register_component(var, config)
            cg.add(var.set_key(config[CONF_KEY]))
            cg.add(var.set_param(config[CONF_PARAM]))

  - C++ Class Pattern:

        namespace esphome::my_component {

        class MyComponent : public Component {
         public:
          void setup() override;
          void loop() override;
          void dump_config() override;

          void set_key(const std::string &key) { this->key_ = key; }
          void set_param(int param) { this->param_ = param; }

         protected:
          std::string key_;
          int param_{0};
        };

        }  // namespace esphome::my_component

  - Common Component Examples:
    - Sensor:

          from esphome.components import sensor

          CONFIG_SCHEMA = sensor.sensor_schema(MySensor).extend(cv.polling_component_schema("60s"))

          async def to_code(config):
              var = await sensor.new_sensor(config)
              await cg.register_component(var, config)

    - Binary Sensor:

          from esphome.components import binary_sensor

          CONFIG_SCHEMA = binary_sensor.binary_sensor_schema().extend({ ... })

          async def to_code(config):
              var = await binary_sensor.new_binary_sensor(config)

    - Switch:

          from esphome.components import switch

          CONFIG_SCHEMA = switch.switch_schema().extend({ ... })

          async def to_code(config):
              var = await switch.new_switch(config)
- Configuration Validation:
  - Common Validators: cv.int_, cv.float_, cv.string, cv.boolean, cv.int_range(min=0, max=100), cv.positive_int, cv.percentage.
  - Complex Validation: cv.All(cv.string, cv.Length(min=1, max=50)), cv.Any(cv.int_, cv.string).
  - Platform-Specific: cv.only_on(["esp32", "esp8266"]), esp32.only_on_variant(...), cv.only_on_esp32, cv.only_on_esp8266, cv.only_on_rp2040.
  - Framework-Specific: cv.only_with_framework(...), cv.only_with_arduino, cv.only_with_esp_idf.
  - Schema Extensions (the chain is wrapped in parentheses so the leading-dot continuation lines are valid Python):

        CONFIG_SCHEMA = (
            cv.Schema({ ... })
            .extend(cv.COMPONENT_SCHEMA)
            .extend(uart.UART_DEVICE_SCHEMA)
            .extend(i2c.i2c_device_schema(0x48))
            .extend(spi.spi_device_schema(cs_pin_required=True))
        )
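The validators listed above are composable callables: each takes a value and either returns it (possibly normalized) or raises an error, and combinators such as cv.All and cv.Any chain them. The sketch below mimics that idea in plain Python so the composition is visible. It is a simplified stand-in under that assumption: Invalid, int_range, length, all_of, and any_of are hypothetical names defined here, and they do not reproduce the coercion rules or error reporting of the real esphome.config_validation helpers.

```python
# Simplified stand-ins that illustrate validator composition.
# These are NOT the real cv.* helpers; names and behavior are assumptions.


class Invalid(Exception):
    """Raised when a value fails validation."""


def int_range(min=None, max=None):
    def validator(value):
        if isinstance(value, bool) or not isinstance(value, int):
            raise Invalid(f"expected an integer, got {value!r}")
        if min is not None and value < min:
            raise Invalid(f"{value} is below the minimum {min}")
        if max is not None and value > max:
            raise Invalid(f"{value} is above the maximum {max}")
        return value
    return validator


def string(value):
    if not isinstance(value, str):
        raise Invalid(f"expected a string, got {value!r}")
    return value


def length(min=0, max=None):
    def validator(value):
        if len(value) < min or (max is not None and len(value) > max):
            raise Invalid(f"length {len(value)} is out of range")
        return value
    return validator


def all_of(*validators):
    """Every validator must pass; each one receives the previous result."""
    def validator(value):
        for check in validators:
            value = check(value)
        return value
    return validator


def any_of(*validators):
    """The first validator that passes wins."""
    def validator(value):
        errors = []
        for check in validators:
            try:
                return check(value)
            except Invalid as err:
                errors.append(str(err))
        raise Invalid("; ".join(errors))
    return validator


name_validator = all_of(string, length(min=1, max=50))
int_or_string = any_of(int_range(min=0, max=100), string)
```

For example, name_validator("living_room") returns the string unchanged, while name_validator("") raises Invalid because the length check fails after the string check passes.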
5. Key Files & Entrypoints
- Main Entrypoint(s): esphome/__main__.py is the main entrypoint for the ESPHome command-line interface.
- Configuration:
  - pyproject.toml: Defines the Python project metadata and dependencies.
  - platformio.ini: Configures the PlatformIO build environments for different microcontrollers.
  - .pre-commit-config.yaml: Configures the pre-commit hooks for linting and formatting.
- CI/CD Pipeline: Defined in .github/workflows.
- Static Analysis & Development:
  - esphome/core/defines.h: A comprehensive header file containing all #define directives that can be added by components using cg.add_define() in Python. This file is used exclusively for development, static analysis tools, and CI testing; it is not used during runtime compilation. When developing components that add new defines, they must be added to this file to ensure proper IDE support and static analysis coverage. The file includes feature flags, build configurations, and platform-specific defines that help static analyzers understand the complete codebase without needing to compile for specific platforms.
6. Development & Testing Workflow
- Local Development Environment: Use the provided Docker container or create a Python virtual environment and install dependencies from requirements_dev.txt.
- Running Commands: Use the script/run-in-env.py script to execute commands within the project's virtual environment. For example, to run the linter: python3 script/run-in-env.py pre-commit run.
- Testing:
  - Python: Run unit tests with pytest.
  - C++: Use clang-tidy for static analysis.
  - Component Tests: YAML-based compilation tests are located in tests/. The structure is as follows:

        tests/
        ├── test_build_components/   # Base test configurations
        └── components/[component]/  # Component-specific tests

    Run them using script/test_build_components. Use -c <component> to test specific components and -t <target> for specific platforms.
  - Testing All Components Together: To verify that all components can be tested together without ID conflicts or configuration issues, use:

        ./script/test_component_grouping.py -e config --all

    This tests all components in a single build to catch conflicts that might not appear when testing components individually. Use -e config for fast configuration validation, or -e compile for full compilation testing.
- Debugging and Troubleshooting:
  - Debug Tools:
    - esphome config <file>.yaml to validate configuration.
    - esphome compile <file>.yaml to compile without uploading.
    - Check the Dashboard for real-time logs.
    - Use component-specific debug logging.
  - Common Issues:
    - Import Errors: Check component dependencies and PYTHONPATH.
    - Validation Errors: Review configuration schema definitions.
    - Build Errors: Check platform compatibility and library versions.
    - Runtime Errors: Review generated C++ code and component logic.
7. Specific Instructions for AI Collaboration
- Contribution Workflow (Pull Request Process):
  - Fork & Branch: Create a new branch based on the dev branch (always use git checkout -b <branch-name> dev to ensure you're branching from dev, not the currently checked out branch).
  - Make Changes: Adhere to all coding conventions and patterns.
  - Test: Create component tests for all supported platforms and run the full test suite locally.
  - Lint: Run pre-commit to ensure code is compliant.
  - Commit: Commit your changes. There is no strict format for commit messages.
  - Pull Request: Submit a PR against the dev branch. The Pull Request title should have a prefix of the component being worked on (e.g., [display] Fix bug, [abc123] Add new component). Update documentation, examples, and add CODEOWNERS entries as needed. Pull requests should always be made using the .github/PULL_REQUEST_TEMPLATE.md template; fill out all sections completely without removing any parts of the template.
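The title convention from the Pull Request step can be checked mechanically before opening a PR. The helper below is a small sketch of such a check, not an official project tool: pr_title_component and the regular expression are hypothetical, and the allowed character set for the component prefix (lowercase letters, digits, underscores) is an assumption inferred from the two example titles.

```python
import re

# Assumed shape: "[component] Description". The character class for the
# component name is a guess based on the examples in the guide.
PR_TITLE_RE = re.compile(r"^\[([a-z0-9_]+)\] \S.*$")


def pr_title_component(title: str):
    """Return the component prefix of a PR title, or None if the prefix is missing."""
    match = PR_TITLE_RE.match(title)
    return match.group(1) if match else None


print(pr_title_component("[display] Fix bug"))
print(pr_title_component("Fix bug"))
```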
- Documentation Contributions:
  - Documentation is hosted in the separate esphome/esphome-docs repository.
  - The contribution workflow is the same as for the codebase.
- Best Practices:
  - Component Development: Keep dependencies minimal, provide clear error messages, and write comprehensive docstrings and tests.
  - Code Generation: Generate minimal and efficient C++ code. Validate all user inputs thoroughly. Support multiple platform variations.
  - Configuration Design: Aim for simplicity with sensible defaults, while allowing for advanced customization.
  - Embedded Systems Optimization: ESPHome targets resource-constrained microcontrollers. Be mindful of flash size and RAM usage.

    STL Container Guidelines:

    ESPHome runs on embedded systems with limited resources. Choose containers carefully:

    - Compile-time-known sizes: Use std::array instead of std::vector when size is known at compile time.

          // Bad - generates STL realloc code
          std::vector<int> values;

          // Good - no dynamic allocation
          std::array<int, MAX_VALUES> values;

      Use cg.add_define("MAX_VALUES", count) to set the size from Python configuration.

      For byte buffers: Avoid std::vector<uint8_t> unless the buffer needs to grow. Use std::unique_ptr<uint8_t[]> instead.

      Note: std::unique_ptr<uint8_t[]> does not provide bounds checking or iterator support like std::vector<uint8_t>. Use it only when you do not need these features and want minimal overhead.

          // Bad - STL overhead for simple byte buffer
          std::vector<uint8_t> buffer;
          buffer.resize(256);

          // Good - minimal overhead, single allocation
          std::unique_ptr<uint8_t[]> buffer = std::make_unique<uint8_t[]>(256);
          // Or if size is constant:
          std::array<uint8_t, 256> buffer;

    - Compile-time-known fixed sizes with vector-like API: Use StaticVector from esphome/core/helpers.h for fixed-size stack allocation with push_back() interface.

          // Bad - generates STL realloc code (_M_realloc_insert)
          std::vector<ServiceRecord> services;
          services.reserve(5);  // Still includes reallocation machinery

          // Good - compile-time fixed size, stack allocated, no reallocation machinery
          StaticVector<ServiceRecord, MAX_SERVICES> services;  // Allocates all MAX_SERVICES on stack
          services.push_back(record1);  // Tracks count but all slots allocated

      Use cg.add_define("MAX_SERVICES", count) to set the size from Python configuration. Like std::array but with vector-like API (push_back(), size()) and no STL reallocation code.

    - Runtime-known sizes: Use FixedVector from esphome/core/helpers.h when the size is only known at runtime initialization.

          // Bad - generates STL realloc code (_M_realloc_insert)
          std::vector<TxtRecord> txt_records;
          txt_records.reserve(5);  // Still includes reallocation machinery

          // Good - runtime size, single allocation, no reallocation machinery
          FixedVector<TxtRecord> txt_records;
          txt_records.init(record_count);  // Initialize with exact size at runtime

      Benefits:
      - Eliminates _M_realloc_insert, _M_default_append template instantiations (saves 200-500 bytes per instance)
      - Single allocation, no upper bound needed
      - No reallocation overhead
      - Compatible with protobuf code generation when using [(fixed_vector) = true] option

    - Small datasets (1-16 elements): Use std::vector or std::array with simple structs instead of std::map/std::set/std::unordered_map.

          // Bad - 2KB+ overhead for red-black tree/hash table
          std::map<std::string, int> small_lookup;
          std::unordered_map<int, std::string> tiny_map;

          // Good - simple struct with linear search (std::vector is fine)
          struct LookupEntry {
            const char *key;
            int value;
          };
          std::vector<LookupEntry> small_lookup = {
            {"key1", 10},
            {"key2", 20},
            {"key3", 30},
          };
          // Or std::array if size is compile-time constant:
          // std::array<LookupEntry, 3> small_lookup = {{ ... }};

      Linear search on small datasets (1-16 elements) is often faster than hashing/tree overhead, but this depends on lookup frequency and access patterns. For frequent lookups in hot code paths, the O(1) vs O(n) complexity difference may still matter even for small datasets. std::vector with simple structs is usually fine; it's the heavy containers (map, set, unordered_map) that should be avoided for small datasets unless profiling shows otherwise.

    - Detection: Look for these patterns in compiler output:
      - Large code sections with STL symbols (vector, map, set)
      - alloc, realloc, dealloc in symbol names
      - _M_realloc_insert, _M_default_append (vector reallocation)
      - Red-black tree code (rb_tree, _Rb_tree)
      - Hash table infrastructure (unordered_map, hash)

    When to optimize:
    - Core components (API, network, logger)
    - Widely-used components (mdns, wifi, ble)
    - Components causing flash size complaints

    When not to optimize:
    - Single-use niche components
    - Code where readability matters more than bytes
    - Already using appropriate containers
  - State Management: Use CORE.data for component state that needs to persist during configuration generation. Avoid module-level mutable globals.

    Bad Pattern (Module-Level Globals):

        # Don't do this - state persists between compilation runs
        _component_state = []
        _use_feature = None

        def enable_feature():
            global _use_feature
            _use_feature = True

    Bad Pattern (Flat Keys):

        # Don't do this - keys should be namespaced under component domain
        MY_FEATURE_KEY = "my_component_feature"
        CORE.data[MY_FEATURE_KEY] = True

    Good Pattern (dataclass):

        from dataclasses import dataclass, field
        from esphome.core import CORE

        DOMAIN = "my_component"

        @dataclass
        class MyComponentData:
            feature_enabled: bool = False
            item_count: int = 0
            items: list[str] = field(default_factory=list)

        def _get_data() -> MyComponentData:
            if DOMAIN not in CORE.data:
                CORE.data[DOMAIN] = MyComponentData()
            return CORE.data[DOMAIN]

        def request_feature() -> None:
            _get_data().feature_enabled = True

        def add_item(item: str) -> None:
            _get_data().items.append(item)

    If you need a real-world example, search for components that use @dataclass with CORE.data in the codebase. Note: Some components may use TypedDict for dictionary-based storage; both patterns are acceptable depending on your needs.

    Why this matters:
    - Module-level globals persist between compilation runs if the dashboard doesn't fork/exec
    - CORE.data automatically clears between runs
    - Namespacing under DOMAIN prevents key collisions between components
    - @dataclass provides type safety and cleaner attribute access
- Security: Be mindful of security when making changes to the API, web server, or any other network-related code. Do not hardcode secrets or keys.
- Dependencies & Build System Integration:
  - Python: When adding a new Python dependency, add it to the appropriate requirements*.txt file and pyproject.toml.
  - C++ / PlatformIO: When adding a new C++ dependency, add it to platformio.ini and use cg.add_library.
  - Build Flags: Use cg.add_build_flag(...) to add compiler flags.
8. Public API and Breaking Changes
- Public C++ API:
  - Components: Only documented features at esphome.io are public API. Undocumented public members are internal.
  - Core/Base Classes (esphome/core/, Component, Sensor, etc.): All public members are public API.
  - Components with Global Accessors (global_api_server, etc.): All public members are public API (except config setters).
- Public Python API:
  - All documented configuration options at esphome.io are public API.
  - Python code in esphome/core/ actively used by existing core components is considered stable API.
  - Other Python code is internal unless explicitly documented for external component use.
- Breaking Changes Policy:
  - Aim for 6-month deprecation window when possible
  - Clean breaks allowed for: signature changes, deep refactorings, resource constraints
  - Must document migration path in PR description (generates release notes)
  - Blog post required for core/base class changes or significant architectural changes
  - Full details: https://developers.esphome.io/contributing/code/#public-api-and-breaking-changes
- Breaking Change Checklist:
  - Clear justification (RAM/flash savings, architectural improvement)
  - Explored non-breaking alternatives
  - Added deprecation warnings if possible (use ESPDEPRECATED macro for C++)
  - Documented migration path in PR description with before/after examples
  - Updated all internal usage and esphome-docs
  - Tested backward compatibility during deprecation period
- Deprecation Pattern (C++):

      // Remove before 2026.6.0
      ESPDEPRECATED("Use new_method() instead. Removed in 2026.6.0", "2025.12.0")
      void old_method() { this->new_method(); }

- Deprecation Pattern (Python):

      # Remove before 2026.6.0
      if CONF_OLD_KEY in config:
          _LOGGER.warning(f"'{CONF_OLD_KEY}' deprecated, use '{CONF_NEW_KEY}'. Removed in 2026.6.0")
          config[CONF_NEW_KEY] = config.pop(CONF_OLD_KEY)  # Auto-migrate
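The deprecation examples pair a deprecation release (2025.12.0) with a removal release (2026.6.0), which matches the 6-month window in the Breaking Changes Policy. The helper below sketches that arithmetic. It assumes release numbers follow a YEAR.MONTH.PATCH scheme, as the examples suggest; removal_release is a hypothetical name and not part of ESPHome.

```python
def removal_release(deprecated_in: str, window_months: int = 6) -> str:
    """Return the first release in which a deprecated item may be removed.

    Assumes versions look like YEAR.MONTH.PATCH (an inference from the
    examples in this guide, not a documented guarantee).
    """
    year, month, _patch = (int(part) for part in deprecated_in.split("."))
    # Count months from year 0 so that year rollover is handled by integer division.
    total_months = year * 12 + (month - 1) + window_months
    return f"{total_months // 12}.{total_months % 12 + 1}.0"


print(removal_release("2025.12.0"))  # 2026.6.0
```

The result can be pasted into the "Remove before" comment and the deprecation message so both stay consistent with the chosen window.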