agents.components.speechtotext

Module Contents

Classes

SpeechToText

This component takes in audio input and outputs a text representation of the audio using Speech-to-Text models (e.g. Whisper).

API

class agents.components.speechtotext.SpeechToText(*, inputs: List[agents.ros.Topic], outputs: List[agents.ros.Topic], model_client: Optional[agents.clients.model_base.ModelClient] = None, config: Optional[agents.config.SpeechToTextConfig] = None, trigger: Union[agents.ros.Topic, List[agents.ros.Topic]], component_name: str, **kwargs)

Bases: agents.components.model_component.ModelComponent

This component takes in audio input and outputs a text representation of the audio using Speech-to-Text models (e.g. Whisper).

Parameters:
  • inputs (list[Topic]) – The input topics for the STT. This should be a list of Topic objects, limited to Audio type.

  • outputs (list[Topic]) – The output topics for the STT. This should be a list of Topic objects; the String type is handled automatically.

  • model_client (Optional[ModelClient]) – The model client for the STT. This should be an instance of ModelClient. Optional if enable_local_model is set to True in the config.

  • config (Optional[SpeechToTextConfig]) – The configuration for the STT. This should be an instance of SpeechToTextConfig. If not provided, defaults to SpeechToTextConfig().

  • trigger (Union[Topic, list[Topic]]) – The trigger topic or topics for the STT. This can be either a single Topic object or a list of Topic objects.

  • component_name (str) – The name of the STT component. This should be a string.
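The relationship between model_client and enable_local_model can be pictured with a small, self-contained sketch. ConfigSketch and has_model_source are hypothetical names used only for illustration; they are not part of the agents package, and the library's own validation may behave differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConfigSketch:
    """Hypothetical stand-in for SpeechToTextConfig (one field only)."""

    enable_local_model: bool = False


def has_model_source(
    model_client: Optional[object], config: Optional[ConfigSketch] = None
) -> bool:
    """Return True if a remote client or the local model is available."""
    # A missing config falls back to the default config, as documented above
    config = config if config is not None else ConfigSketch()
    return model_client is not None or config.enable_local_model


print(has_model_source(None))  # False
print(has_model_source(None, ConfigSketch(enable_local_model=True)))  # True
print(has_model_source(object()))  # True
```

The sketch only checks that at least one model source exists; what the component does when both are supplied is not specified here.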

Example usage:

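from agents.clients.model_base import ModelClient
from agents.components.speechtotext import SpeechToText
from agents.config import SpeechToTextConfig
from agents.ros import Topic
# Whisper is assumed to be importable from the package's model definitions

audio_topic = Topic(name="audio", msg_type="Audio")
text_topic = Topic(name="text", msg_type="String")
config = SpeechToTextConfig(enable_vad=True)
model = Whisper(name="whisper")
model_client = ModelClient(model=model)
stt_component = SpeechToText(
    inputs=[audio_topic],
    outputs=[text_topic],
    model_client=model_client,
    config=config,
    trigger=audio_topic,  # required keyword argument; run on each audio message
    component_name='stt_component'
)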

Example usage with local model:

audio_topic = Topic(name="audio", msg_type="Audio")
text_topic = Topic(name="text", msg_type="String")
config = SpeechToTextConfig(enable_local_model=True, enable_vad=True)
stt_component = SpeechToText(
    inputs=[audio_topic],
    outputs=[text_topic],
    config=config,
    trigger=audio_topic,
    component_name='local_stt'
)

custom_on_configure()

Custom configuration

custom_on_activate()

Custom activation

custom_on_deactivate()

Destroy model client if it exists

property additional_model_clients: Optional[Dict[str, agents.clients.model_base.ModelClient]]

Get the dictionary of additional model clients registered to this component.

Returns:

A dictionary mapping client names (str) to ModelClient instances, or None if not set.

Return type:

Optional[Dict[str, ModelClient]]

fallback_to_local() → bool

Switch from remote model_client to the built-in local model at runtime.

The local model is deployed on first call (lazy initialization) to avoid consuming GPU memory until actually needed. If enable_local_model is not already set in config, it is enabled automatically.

This is commonly used as a target for Actions in the Event system.

Returns:

True if the switch was successful, False otherwise.

Return type:

bool
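The lazy deployment described above can be illustrated with a self-contained sketch. LocalModel and ComponentSketch are hypothetical stand-ins, not classes from the agents package, and the real method may differ in its details; the sketch only shows the pattern of enabling the local model and deploying it on first use.

```python
from typing import Optional


class LocalModel:
    """Hypothetical stand-in for the built-in local model."""

    def __init__(self) -> None:
        self.deployed = True  # pretend that model weights were loaded here


class ComponentSketch:
    """Toy component that mimics the documented fallback behaviour."""

    def __init__(self, enable_local_model: bool = False) -> None:
        self.enable_local_model = enable_local_model
        self.model_client: Optional[object] = object()  # pretend remote client
        self._local_model: Optional[LocalModel] = None  # nothing deployed yet

    def fallback_to_local(self) -> bool:
        try:
            # Enable the local model if the config did not already do so
            self.enable_local_model = True
            # Deploy only on the first call (lazy initialization)
            if self._local_model is None:
                self._local_model = LocalModel()
            # Stop routing requests to the remote client
            self.model_client = None
            return True
        except Exception:
            return False


component = ComponentSketch()
print(component._local_model is None)  # True: nothing deployed before the call
print(component.fallback_to_local())  # True
first = component._local_model
component.fallback_to_local()
print(component._local_model is first)  # True: the deployed model is reused
```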

Example:

    from agents.ros import Action

    # `brain` is a placeholder for a model component instance (e.g. a SpeechToText component)
    # Define an action to switch to the local model built into the component
    switch_to_local = Action(
        method=brain.fallback_to_local,
    )

    # Trigger this action if the component fails (e.g. internet outage)
    brain.on_component_fail(action=switch_to_local, max_retries=3)

change_model_client(model_client_name: str) → bool

Hot-swap the active model client at runtime.

This method replaces the component’s current model_client with one from the registered additional_model_clients. It handles the safe de-initialization of the old client and initialization of the new one.

This is commonly used as a target for Actions in the Event system.

Parameters:

model_client_name (str) – The key corresponding to the desired client in additional_model_clients.

Returns:

True if the swap was successful, False otherwise (e.g., if the name was not found or initialization failed).

Return type:

bool
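The swap logic can be sketched without the library. ClientSketch and ComponentSketch below are hypothetical stand-ins, and initializing the new client before de-initializing the old one is an assumption made here so that a failed swap leaves the current client untouched; the actual implementation may order these steps differently.

```python
from typing import Dict, Optional


class ClientSketch:
    """Hypothetical stand-in for a model client."""

    def __init__(self, name: str, healthy: bool = True) -> None:
        self.name = name
        self.healthy = healthy
        self.initialized = False

    def initialize(self) -> None:
        if not self.healthy:
            raise ConnectionError(f"cannot reach {self.name}")
        self.initialized = True

    def deinitialize(self) -> None:
        self.initialized = False


class ComponentSketch:
    """Toy component holding one active client and a registry of alternatives."""

    def __init__(
        self,
        model_client: ClientSketch,
        additional_model_clients: Optional[Dict[str, ClientSketch]] = None,
    ) -> None:
        self.model_client = model_client
        self.additional_model_clients = additional_model_clients
        self.model_client.initialize()

    def change_model_client(self, model_client_name: str) -> bool:
        clients = self.additional_model_clients or {}
        new_client = clients.get(model_client_name)
        if new_client is None:
            return False  # unknown name: keep the current client
        try:
            new_client.initialize()
        except Exception:
            return False  # initialization failed: keep the current client
        self.model_client.deinitialize()
        self.model_client = new_client
        return True


component = ComponentSketch(
    ClientSketch("primary"),
    {
        "remote_backup": ClientSketch("remote_backup"),
        "broken": ClientSketch("broken", healthy=False),
    },
)
print(component.change_model_client("missing"))  # False
print(component.change_model_client("broken"))  # False
print(component.change_model_client("remote_backup"))  # True
print(component.model_client.name)  # remote_backup
```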

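Example:

    from agents.ros import Action

    # `brain` is a placeholder for a model component instance (e.g. a SpeechToText component)
    # Define an action to switch to the 'remote_backup' client registered in additional_model_clients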
    switch_to_backup = Action(
        method=brain.change_model_client,
        args=("remote_backup",)
    )

    # Trigger this action if the component fails (e.g. server down)
    brain.on_component_fail(action=switch_to_backup, max_retries=3)

property warmup: bool

Enable warmup of the model.

create_all_subscribers()

Overridden to handle trigger topics and fixed inputs. Called by the parent BaseComponent.

activate_all_triggers() → None

Activates component triggers by attaching the execution step to their callbacks.

destroy_all_subscribers() → None

Destroys all node subscribers