component.h is included before defines.h in application.h,
so the #ifdef guards on declarations were evaluated before the
define was visible. Keep declarations always visible (harmless
unused declarations cost nothing) and only guard the definitions.
The setup_priority override mechanism (struct, vector, linear scan,
allocation, and cleanup) is only needed when a user explicitly sets
setup_priority: in their YAML config. In practice this is almost
never used - components define their priorities via C++ virtual
methods instead.
Gate the entire mechanism behind USE_SETUP_PRIORITY_OVERRIDE, which
is only defined when the codegen encounters a setup_priority: config
entry. This eliminates dead code (struct, std::vector with reserve,
new/delete, linear scan in get_actual_setup_priority) from nearly
all builds.
Also removes the unnecessary reserve(10) call since the override
count is always very small.
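
The two changes above (declarations always visible, definitions and storage gated) can be sketched roughly as follows. This is a simplified illustration, not the actual ESPHome code: the `PriorityOverride` struct, the `overrides` vector, and the default priority value are hypothetical stand-ins, and the macro is left undefined here, as it would be in a build without a setup_priority: entry.

```cpp
#include <vector>

// Assumed to be emitted by the code generator only when the YAML config
// contains a setup_priority: entry. Left undefined in this sketch.
// #define USE_SETUP_PRIORITY_OVERRIDE

class Component {
 public:
  virtual ~Component() = default;
  // Priority normally comes from a C++ virtual method (value is illustrative).
  virtual float get_setup_priority() const { return 0.0f; }
  float get_actual_setup_priority() const;
  // Declared unconditionally, so header include order cannot hide it.
  void set_setup_priority(float priority);
};

#ifdef USE_SETUP_PRIORITY_OVERRIDE
namespace {
// Hypothetical override record; only exists in builds that need it.
struct PriorityOverride {
  const Component *component;
  float priority;
};
std::vector<PriorityOverride> overrides;
}  // namespace

// Only the definition is guarded.
void Component::set_setup_priority(float priority) {
  overrides.push_back({this, priority});
}
#endif

float Component::get_actual_setup_priority() const {
#ifdef USE_SETUP_PRIORITY_OVERRIDE
  // Linear scan; the override count is expected to be very small.
  for (const auto &entry : overrides) {
    if (entry.component == this)
      return entry.priority;
  }
#endif
  return this->get_setup_priority();
}
```

With the macro undefined, get_actual_setup_priority() reduces to a single virtual call and no vector or scan is compiled in.
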
Components are stored in a StaticVector of ESPHOME_COMPONENT_COUNT entries
whose size type is uint16_t. Using size_t for the dump_config index wastes
2 bytes of storage and adds padding. Move it to the uint16_t group for better
struct packing.
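
As a rough illustration of the packing effect, the sketch below compares two hypothetical layouts. The struct and field names are invented for the example and do not mirror the real Application members; exact sizes depend on the target ABI.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical layout: a size_t index sits between two uint16_t fields,
// so the compiler has to insert padding around it.
struct IndexAsSizeT {
  uint16_t counter_a;
  size_t dump_config_at;
  uint16_t counter_b;
};

// Hypothetical layout: the index is narrowed and grouped with the other
// uint16_t fields, so no padding is needed.
struct IndexAsUint16 {
  uint16_t counter_a;
  uint16_t counter_b;
  uint16_t dump_config_at;
};

static_assert(sizeof(IndexAsUint16) < sizeof(IndexAsSizeT),
              "grouping the narrowed index should shrink the struct");
```
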
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace the global `Application App` with placement-new construction
in aligned .bss storage. This eliminates the global constructor and
destructor chain that were:
- Calling xSemaphoreCreateMutex() at static init time (via Scheduler's
Mutex member) before app_main() runs
- Redundantly zero-initializing all members that .bss already zeroes
- Registering the ~Application() destructor chain (~Application, ~vector,
~Mutex) with __cxa_atexit, even though it never runs on embedded targets
The storage is a char[] with a GCC asm label matching the mangled name
of esphome::App. Other translation units see a typed extern Application
(identical codegen, no indirection), while the defining TU sees a
trivially-destructible char array — so the compiler never emits
__cxa_atexit or the destructor chain.
Construction happens in pre_setup() via placement new, which is always
the first method called on App in the generated setup() function.
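
The placement-new part of this can be sketched in portable C++ as below. This is a simplified illustration under stated assumptions: it keeps a pointer to the constructed object instead of using the GCC asm-label aliasing described above (so, unlike the real change, it adds one indirection), and the `Application` stand-in, `construct_app()`, and the member value are hypothetical.

```cpp
#include <new>

// Stand-in for the real class; the member and its value are illustrative.
struct Application {
  int loop_interval_ms;
  Application() : loop_interval_ms(16) {}
};

// Raw, suitably aligned storage. A byte array is trivially destructible and
// zero-initialized in .bss, so no static constructor or atexit-registered
// destructor is generated for it.
alignas(Application) static unsigned char app_storage[sizeof(Application)];

// Pointer to the constructed object (the asm-label approach avoids this).
static Application *app_instance = nullptr;

// Hypothetical equivalent of the construction step done in pre_setup().
Application &construct_app() {
  app_instance = new (app_storage) Application();
  return *app_instance;
}
```

The object is never destroyed, which matches an embedded target where the application runs until reset.
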
Add a 15-second timeout for completing the API handshake (Noise
transport + HelloRequest). Previously, a client could connect and
stall mid-handshake, holding a connection slot for up to 150 seconds
(the keepalive disconnect timeout). With max_connections defaulting
to 8 on ESP32, this allowed all slots to be blocked with minimal
effort.
Normal clients complete the full handshake in milliseconds, so 15
seconds is generous. The check short-circuits for authenticated
connections (single bitfield compare) so there is no overhead for
established sessions.
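
A minimal sketch of such a check, assuming a per-connection start timestamp and a millisecond clock. The `Connection` struct, its fields, and `handshake_timed_out()` are hypothetical names, not the actual API connection class.

```cpp
#include <cstdint>

constexpr uint32_t HANDSHAKE_TIMEOUT_MS = 15000;  // 15 seconds, per the text above

// Hypothetical per-connection state.
struct Connection {
  uint32_t connected_at_ms;
  bool authenticated;
};

// Returns true if the connection should be dropped for stalling mid-handshake.
bool handshake_timed_out(const Connection &conn, uint32_t now_ms) {
  // Established sessions short-circuit on a single flag check.
  if (conn.authenticated)
    return false;
  // Unsigned subtraction stays correct across millisecond counter wraparound.
  return (now_ms - conn.connected_at_ms) >= HANDSHAKE_TIMEOUT_MS;
}
```
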
Single-byte varints (0-127) are the most common case in protobuf
messages (booleans, small enums, field tags). Skip the loop entirely
for these values by checking the first byte before entering the
multi-byte parsing loop.
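
The fast path can be sketched as below. The function name and the "bytes consumed, 0 on failure" return convention are assumptions made for the example, not the actual ProtoVarInt interface.

```cpp
#include <cstddef>
#include <cstdint>

// Decodes one varint from buf. Returns the number of bytes consumed,
// or 0 if the input is empty, truncated, or longer than 10 bytes.
size_t parse_varint(const uint8_t *buf, size_t len, uint64_t *out) {
  if (len == 0)
    return 0;
  // Fast path: high bit clear means the value (0-127) is in this single byte.
  if ((buf[0] & 0x80) == 0) {
    *out = buf[0];
    return 1;
  }
  // Slow path: accumulate 7 bits per byte until a byte without the
  // continuation bit is seen.
  uint64_t result = buf[0] & 0x7F;
  for (size_t i = 1; i < len && i < 10; i++) {
    result |= static_cast<uint64_t>(buf[i] & 0x7F) << (7 * i);
    if ((buf[i] & 0x80) == 0) {
      *out = result;
      return i + 1;
    }
  }
  return 0;
}
```
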
Device IDs are FNV hashes (uint32) that frequently exceed 2^28,
requiring 5 varint bytes. This test verifies the firmware correctly
decodes these values in incoming SwitchCommandRequest messages and
encodes them in state responses.
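
To see why 2^28 is the boundary: each varint byte carries 7 payload bits, so 4 bytes hold at most 28 bits. A small encoder sketch (the function name is hypothetical) makes this concrete:

```cpp
#include <cstddef>
#include <cstdint>

// Encodes value as a protobuf varint into out (at least 5 bytes available).
// Returns the number of bytes written.
size_t encode_varint32(uint32_t value, uint8_t *out) {
  size_t n = 0;
  while (value >= 0x80) {
    // Low 7 bits plus the continuation flag.
    out[n++] = static_cast<uint8_t>((value & 0x7F) | 0x80);
    value >>= 7;
  }
  out[n++] = static_cast<uint8_t>(value);
  return n;
}
```

Values up to 0x0FFFFFFF (2^28 - 1) encode in 4 bytes; 0x10000000 and above need a fifth byte.
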
Convert COLOR_OFF and COLOR_ON from extern const to inline constexpr.
The Color class already has constexpr constructors so these can be
compile-time constants, allowing the compiler to optimize default
parameter values and eliminate the runtime storage.
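
A reduced sketch of the pattern, requiring C++17 for inline variables. The `Color` struct here is a cut-down stand-in, the channel values chosen for COLOR_ON are an assumption, and `brightness()` is a hypothetical function used only to show a default parameter.

```cpp
#include <cstdint>

// Cut-down stand-in for the Color class with constexpr constructors.
struct Color {
  uint8_t r, g, b, w;
  constexpr Color() : r(0), g(0), b(0), w(0) {}
  constexpr Color(uint8_t r, uint8_t g, uint8_t b, uint8_t w = 0)
      : r(r), g(g), b(b), w(w) {}
};

// Header-defined compile-time constants: one definition shared by all
// translation units, with no separate definition in a .cpp file.
inline constexpr Color COLOR_OFF(0, 0, 0, 0);
inline constexpr Color COLOR_ON(255, 255, 255, 255);  // assumed values

static_assert(COLOR_ON.r == 255, "usable in constant expressions");

// Hypothetical function: the default argument is now a known constant.
inline int brightness(Color c = COLOR_ON) { return c.r + c.g + c.b; }
```
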
Split the rarely-taken warning path into a separate noinline cold
function so the hot path (called every component every loop iteration)
is minimal. Also make WARN_IF_BLOCKING_OVER_MS constexpr so the
compiler uses an immediate compare instead of a memory load, and
merge the two ESP_LOGW calls into one.
finish() shrinks from 108 to 30 bytes. Total flash savings: 116 bytes.
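
The hot/cold split can be sketched as follows, using the GCC/Clang `noinline` and `cold` function attributes. The threshold value, the function signatures, and the `warnings_emitted` counter are assumptions for illustration; the real code logs through ESP_LOGW rather than printf.

```cpp
#include <cstdint>
#include <cstdio>

// constexpr lets the comparison use an immediate operand (value assumed).
constexpr uint32_t WARN_IF_BLOCKING_OVER_MS = 50;

// Illustration-only counter so the effect is observable.
static int warnings_emitted = 0;

// Rarely taken path, kept out of line and marked cold so it does not
// bloat the caller.
__attribute__((noinline, cold)) static void warn_blocking(const char *name,
                                                          uint32_t elapsed_ms) {
  warnings_emitted++;
  std::printf("%s took a long time for an operation (%u ms)\n", name,
              static_cast<unsigned>(elapsed_ms));
}

// Hot path: one subtraction, one compare, and a rarely taken call.
inline void finish(const char *name, uint32_t started_ms, uint32_t now_ms) {
  uint32_t elapsed = now_ms - started_ms;
  if (elapsed > WARN_IF_BLOCKING_OVER_MS)
    warn_blocking(name, elapsed);
}
```
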
Convert setup_priority floats, component state uint8_t constants, and
status LED constants from extern const (defined in component.cpp) to
inline constexpr in the header. This lets the compiler use immediate
values instead of memory loads across all translation units.
Also removes the dead HARDWARE_LATE declaration (declared extern but
never defined).
Saves ~364 bytes flash on ESP32-S3.
On 32-bit platforms (ESP32 Xtensa), 64-bit shifts in varint parsing
compile to __ashldi3 library calls. Since the vast majority of protobuf
varint fields (message types, sizes, enum values, sensor readings) fit
in 4 bytes, the 64-bit arithmetic is unnecessary overhead on the common
path.
Split parse() into two phases:
- Bytes 0-3: uint32_t loop with native 32-bit shifts (0, 7, 14, 21)
- Bytes 4-9: noinline parse_wide_() with uint64_t, only for BLE
addresses and other 64-bit fields
The code generator auto-detects which proto messages use int64/uint64/
sint64 fields and emits USE_API_VARINT64 conditionally. On non-BLE
configs, parse_wide_() and the 64-bit accessors (as_uint64, as_int64,
as_sint64) are compiled out entirely.
Saves ~40 bytes flash on non-BLE configs. Benchmark shows 25-50%
faster parsing for 1-4 byte varints (the common case).
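
The two-phase structure can be sketched as below. This is an illustration rather than the actual ProtoVarInt code: the free-function signatures and the "bytes consumed, 0 on failure" convention are assumptions, and the wide path is always compiled here, whereas the change described above gates it behind USE_API_VARINT64.

```cpp
#include <cstddef>
#include <cstdint>

// Phase 2: bytes 4-9 with 64-bit arithmetic. Kept out of line so the
// common path stays small (GCC/Clang attribute).
__attribute__((noinline)) size_t parse_wide(const uint8_t *buf, size_t len,
                                            uint32_t low_bits, uint64_t *out) {
  uint64_t result = low_bits;
  for (size_t i = 4; i < len && i < 10; i++) {
    result |= static_cast<uint64_t>(buf[i] & 0x7F) << (7 * i);
    if ((buf[i] & 0x80) == 0) {
      *out = result;
      return i + 1;
    }
  }
  return 0;  // truncated or over-long input
}

// Phase 1: bytes 0-3 with native 32-bit shifts (0, 7, 14, 21).
// Returns the number of bytes consumed, or 0 on failure.
size_t parse(const uint8_t *buf, size_t len, uint64_t *out) {
  uint32_t result32 = 0;
  const size_t limit = len < 4 ? len : 4;
  for (size_t i = 0; i < limit; i++) {
    result32 |= static_cast<uint32_t>(buf[i] & 0x7F) << (7 * i);
    if ((buf[i] & 0x80) == 0) {
      *out = result32;
      return i + 1;
    }
  }
  if (len <= 4)
    return 0;  // ran out of input before the varint terminated
  return parse_wide(buf, len, result32, out);
}
```

Four bytes carry at most 28 payload bits, so the 32-bit accumulator in phase 1 cannot overflow.
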