Filter Function

Filter Functions are a built-in, lightweight extension mechanism in IntraLLM AI for modifying request data before it is sent to an LLM and/or after it is returned. Filters can add context, sanitize inputs, enforce formatting, and post-process outputs, including real-time adjustments during streaming.

Filter Function: Modify Inputs and Outputs

This guide covers Filter Functions in IntraLLM AI. Filters transform data before it is sent to the LLM (input) and after it is returned from the LLM (output). Common use cases include injecting context for better responses, sanitizing user text, applying formatting rules, and removing sensitive information.

Filters are not standalone models. They are workflow enhancements that sit in the message flow and modify the request/response data traveling to and from models.

What are Filters in IntraLLM AI?

A practical way to think about Filters is as checkpoints in the message flow:

  • User inputs and LLM outputs are the data stream.
  • Filters are processing stages that clean, modify, and adapt that stream before it reaches the next step.
  • Filters sit between the UI and the model, so you can intercept and adjust data without changing the model itself.

Filters can perform three kinds of modifications:

  • Modify user inputs (inlet): adjust request data before it reaches the model
  • Intercept streamed outputs (stream): modify incremental output chunks as they arrive
  • Modify final outputs (outlet): post-process the completed response before display
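The order in which these three hooks run can be sketched with a small simulation. The fake_model generator and the run_chat driver below are hypothetical stand-ins written for this illustration; they are not part of IntraLLM AI, and the real runtime may call the hooks with additional arguments.

```python
from typing import Iterator, List, Optional


class Filter:
    def __init__(self):
        # Records the order of hook calls, for illustration only
        self.calls: List[str] = []

    def inlet(self, body: dict, __user__: Optional[dict] = None) -> dict:
        self.calls.append("inlet")
        return body

    def stream(self, event: dict, __user__: Optional[dict] = None) -> dict:
        self.calls.append("stream")
        return event

    def outlet(self, body: dict, __user__: Optional[dict] = None) -> dict:
        self.calls.append("outlet")
        return body


def fake_model(body: dict) -> Iterator[dict]:
    """Hypothetical stand-in for an LLM: streams a fixed reply in two chunks."""
    for piece in ("Hi", " there"):
        yield {"choices": [{"delta": {"content": piece}}]}


def run_chat(flt: Filter, body: dict) -> dict:
    """Hypothetical driver that mimics the inlet -> stream -> outlet order."""
    body = flt.inlet(body)
    reply = ""
    for event in fake_model(body):
        event = flt.stream(event)
        for choice in event.get("choices", []):
            reply += choice.get("delta", {}).get("content", "")
    body.setdefault("messages", []).append({"role": "assistant", "content": reply})
    return flt.outlet(body)


flt = Filter()
result = run_chat(flt, {"messages": [{"role": "user", "content": "Hello"}]})
print(flt.calls)
print(result["messages"][-1]["content"])
```

In this sketch, inlet and outlet each run once per request, while stream runs once per streamed chunk.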

Key concept:

  • Filters enhance and transform data traveling to and from models, without presenting themselves as selectable models.

Structure of a Filter Function

A Filter Function is implemented as a Filter class. A common structure includes:

  • Valves for configuration (optional)
  • __init__ to initialize filter state
  • inlet(...) to preprocess request input
  • stream(...) to modify streaming events (optional)
  • outlet(...) to postprocess completed responses

Basic skeleton

from pydantic import BaseModel
from typing import Optional

class Filter:
    # Valves: Configuration options for the filter
    class Valves(BaseModel):
        pass

    def __init__(self):
        # Initialize valves (optional configuration for the Filter)
        self.valves = self.Valves()

    def inlet(self, body: dict, __user__: Optional[dict] = None) -> dict:
        # Manipulate user inputs before sending to the model
        print(f"inlet called: {body}")
        return body

    def stream(self, event: dict, __user__: Optional[dict] = None) -> dict:
        # Modify streamed chunks of model output in real time
        print(f"stream event: {event}")
        return event

    def outlet(self, body: dict, __user__: Optional[dict] = None) -> dict:
        # Manipulate model outputs after completion, before display
        print(f"outlet called: {body}")
        return body

Core components explained

1. Valves class (optional settings)

Valves are configurable parameters for your Filter. Use Valves when you want to change behavior without editing code (e.g., toggle a transformation, set a prefix, define redaction rules).

Example:

from pydantic import BaseModel, Field

class Filter:
    class Valves(BaseModel):
        TRANSFORM_UPPERCASE: bool = Field(
            default=False,
            description="If true, convert assistant output to uppercase."
        )
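A valve only has an effect once a hook reads it. The following is a minimal sketch, assuming the valve is consulted in outlet and that only assistant messages should be transformed; neither choice is prescribed by IntraLLM AI.

```python
from typing import Optional

from pydantic import BaseModel, Field


class Filter:
    class Valves(BaseModel):
        TRANSFORM_UPPERCASE: bool = Field(
            default=False,
            description="If true, convert assistant output to uppercase.",
        )

    def __init__(self):
        self.valves = self.Valves()

    def outlet(self, body: dict, __user__: Optional[dict] = None) -> dict:
        # Leave the body untouched while the valve is off
        if not self.valves.TRANSFORM_UPPERCASE:
            return body
        for message in body.get("messages", []):
            content = message.get("content")
            if message.get("role") == "assistant" and isinstance(content, str):
                message["content"] = content.upper()
        return body


def sample_body() -> dict:
    # Fresh sample conversation for each call, since outlet edits in place
    return {
        "messages": [
            {"role": "user", "content": "hi"},
            {"role": "assistant", "content": "hello"},
        ]
    }


default_filter = Filter()
unchanged = default_filter.outlet(sample_body())

enabled_filter = Filter()
enabled_filter.valves = Filter.Valves(TRANSFORM_UPPERCASE=True)
uppercased = enabled_filter.outlet(sample_body())
```

Here the valve is set in code for demonstration; in practice the point of a valve is that it is changed through configuration rather than by editing the Filter.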

2. inlet(...): input pre-processing

The inlet function receives a chat-completion-style request body and returns a modified body. This is your opportunity to improve the quality and consistency of input that reaches the model.

Typical uses:

  • Add context or instructions (system messages)
  • Normalize formatting (JSON, Markdown, templates)
  • Sanitize input (strip noise, remove unwanted characters)
  • Enforce consistency (apply policies or required structure)

Example: add system context

from typing import Optional

class Filter:
    def inlet(self, body: dict, __user__: Optional[dict] = None) -> dict:
        context_message = {
            "role": "system",
            "content": "You are a software troubleshooting assistant."
        }
        body.setdefault("messages", []).insert(0, context_message)
        return body

Example: clean input (remove unwanted tokens)

from typing import Optional

class Filter:
    def inlet(self, body: dict, __user__: Optional[dict] = None) -> dict:
        messages = body.get("messages", [])
        if not messages:
            return body

        last = messages[-1]
        if last.get("role") == "user" and isinstance(last.get("content"), str):
            last["content"] = last["content"].replace("!!!", "").strip()

        return body

3. stream(...): real-time streaming hook

The stream function allows you to intercept and modify streamed model responses while they are being generated. This is useful when you need real-time transformation or filtering instead of waiting for the final output.

Typical uses:

  • Remove or mask sensitive information as it streams
  • Enforce formatting constraints on incremental output
  • Observe streaming chunks for monitoring or debugging

Example: log streaming chunks

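from typing import Optional

class Filter:
    def stream(self, event: dict, __user__: Optional[dict] = None) -> dict:
        # Print each streamed event as it arrives (debugging only)
        print(event)
        return event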

Streamed events may contain incremental deltas, such as:

{'choices': [{'delta': {'content': 'Hi'}}]}
{'choices': [{'delta': {'content': '!'}}]}
{'choices': [{'delta': {'content': ' world'}}]}

Example: remove a specific emoji from streamed content

from typing import Optional

class Filter:
    def stream(self, event: dict, __user__: Optional[dict] = None) -> dict:
        for choice in event.get("choices", []):
            delta = choice.get("delta", {})
            if "content" in delta and isinstance(delta["content"], str):
                # Removes only this one emoji; extend the rule to cover others
                delta["content"] = delta["content"].replace("😊", "")
        return event

4. outlet(...): output post-processing

The outlet function runs after the model has completed its response. It receives the conversation body (including messages) and returns a modified version for display or downstream processing.

Typical uses:

  • Apply final formatting (lightweight)
  • Redact sensitive content
  • Add post-response annotations
  • Log response metadata for analytics (prefer logging over heavy rewriting)

Example: redact a placeholder token

from typing import Optional

class Filter:
    def outlet(self, body: dict, __user__: Optional[dict] = None) -> dict:
        for message in body.get("messages", []):
            if isinstance(message.get("content"), str):
                message["content"] = message["content"].replace("<API_KEY>", "[REDACTED]")
        return body

Example: highlight assistant output with Markdown (lightweight formatting)

from typing import Optional

class Filter:
    def outlet(self, body: dict, __user__: Optional[dict] = None) -> dict:
        for message in body.get("messages", []):
            if message.get("role") == "assistant" and isinstance(message.get("content"), str):
                message["content"] = f"**{message['content']}**"
        return body

Filters in action: practical patterns

Pattern 1: consistent context injection

Use inlet to ensure a baseline system instruction or policy is always applied, even when users provide incomplete prompts.
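One way to apply this pattern without stacking duplicate instructions is to insert the system message only when it is not already present. This is a minimal sketch, assuming the body passed to inlet may already contain the message from an earlier call; the SYSTEM_PROMPT constant is a hypothetical name that reuses the example text from this guide.

```python
from typing import Optional

# Hypothetical constant; the text is the example used earlier in this guide
SYSTEM_PROMPT = "You are a software troubleshooting assistant."


class Filter:
    def inlet(self, body: dict, __user__: Optional[dict] = None) -> dict:
        messages = body.setdefault("messages", [])
        already_present = any(
            m.get("role") == "system" and m.get("content") == SYSTEM_PROMPT
            for m in messages
        )
        # Insert the baseline instruction only once
        if not already_present:
            messages.insert(0, {"role": "system", "content": SYSTEM_PROMPT})
        return body


flt = Filter()
body = {"messages": [{"role": "user", "content": "My build fails."}]}
body = flt.inlet(body)
body = flt.inlet(body)  # a second pass does not add a duplicate
```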

Pattern 2: input normalization

Use inlet to enforce a consistent schema (e.g., wrap user content into a known template), so that the model sees uniformly structured prompts regardless of how users phrase their input.
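As an illustration of this pattern, the sketch below collapses whitespace in the latest user message and wraps it in a fixed template. The TEMPLATE string is hypothetical; substitute whatever structure your prompts require.

```python
from typing import Optional

# Hypothetical template, chosen only for this example
TEMPLATE = "### Request\n{text}\n### End of request"


class Filter:
    def inlet(self, body: dict, __user__: Optional[dict] = None) -> dict:
        messages = body.get("messages", [])
        if not messages:
            return body

        last = messages[-1]
        content = last.get("content")
        if last.get("role") == "user" and isinstance(content, str):
            # Collapse runs of whitespace, then wrap the text in the template
            normalized = " ".join(content.split())
            last["content"] = TEMPLATE.format(text=normalized)
        return body


flt = Filter()
body = flt.inlet({"messages": [{"role": "user", "content": "  fix   my\n bug "}]})
print(body["messages"][-1]["content"])
```

Only the last message is touched, which keeps earlier turns of the conversation as they were.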

Pattern 3: streaming compliance

Use stream when you need real-time enforcement (e.g., removing disallowed tokens or masking sensitive content during streaming).
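A per-chunk masking rule might look like the sketch below. The rule itself (mask any run of four or more digits) is a hypothetical policy chosen for illustration, and the event shape follows the delta examples shown earlier in this guide.

```python
import re
from typing import Optional

# Hypothetical policy: mask any run of four or more digits
DIGITS = re.compile(r"\d{4,}")


class Filter:
    def stream(self, event: dict, __user__: Optional[dict] = None) -> dict:
        for choice in event.get("choices", []):
            delta = choice.get("delta", {})
            content = delta.get("content")
            if isinstance(content, str):
                delta["content"] = DIGITS.sub("[MASKED]", content)
        return event


flt = Filter()
events = [
    {"choices": [{"delta": {"content": "PIN 123456"}}]},
    {"choices": [{"delta": {"content": " ok"}}]},
]
text = "".join(flt.stream(e)["choices"][0]["delta"]["content"] for e in events)
print(text)
```

Because each chunk is checked in isolation, a sequence that is split across two chunks (for example "12" followed by "34") is not caught by this sketch. A final pass in outlet is a reasonable safety net for such cases.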

Pattern 4: output redaction

Use outlet to remove secrets, internal markers, or provider artifacts that should not be visible to end users.
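Where secrets follow a recognizable shape, a regular expression is usually more robust than matching one literal string. The sketch below assumes a hypothetical key format ("sk-" followed by at least 16 letters or digits); adjust the pattern to the formats that actually occur in your environment.

```python
import re
from typing import Optional

# Hypothetical secret format: "sk-" followed by 16 or more letters or digits
SECRET = re.compile(r"\bsk-[A-Za-z0-9]{16,}\b")


class Filter:
    def outlet(self, body: dict, __user__: Optional[dict] = None) -> dict:
        for message in body.get("messages", []):
            content = message.get("content")
            # Only assistant output is rewritten; user messages are left as typed
            if message.get("role") == "assistant" and isinstance(content, str):
                message["content"] = SECRET.sub("[REDACTED]", content)
        return body


flt = Filter()
body = flt.outlet(
    {"messages": [{"role": "assistant", "content": "Use sk-abcdefghijklmnop1234 now."}]}
)
print(body["messages"][0]["content"])
```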

Filters vs Pipe Functions

Filters:

  • Modify data going to and from models (pre/post-processing)
  • Are typically lightweight and focused on transformation
  • Do not create new selectable models

Pipes:

  • Can implement deeper integrations and custom workflows
  • Can proxy external providers or orchestrate multiple calls
  • Appear as new models in the UI via pipes() definitions

Guideline:

  • Use Filters for lightweight transformations and enforcement.
  • Use Pipes when you need a model-like integration or heavy workflow logic.

Best practices

  • Keep Filters minimal and predictable; avoid heavy business logic unless necessary.
  • Prefer inlet for context and normalization; prefer outlet for final cleanup or redaction.
  • Avoid modifying historical messages unless required; prefer targeting specific message roles.
  • Do not log sensitive content; redact secrets before logging if needed.
  • Use Valves for configuration rather than hard-coding behavior.
  • Handle missing keys safely (get(...), setdefault(...)) to avoid runtime errors.
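Several of these practices can be combined in one small Filter. The sketch below logs a redacted, truncated copy of the latest user message and tolerates missing keys. The logger name, the email pattern, and the redact helper are all hypothetical choices made for this example.

```python
import logging
import re
from typing import Optional

logger = logging.getLogger("filter.audit")  # hypothetical logger name
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")  # simplified email pattern


def redact(text: str) -> str:
    """Replace email addresses with a placeholder before the text is logged."""
    return EMAIL.sub("[EMAIL]", text)


class Filter:
    def inlet(self, body: dict, __user__: Optional[dict] = None) -> dict:
        # Tolerate a missing or empty "messages" key
        messages = body.get("messages") or []
        last = messages[-1] if messages else {}
        content = last.get("content")
        if last.get("role") == "user" and isinstance(content, str):
            # Log a redacted, truncated copy; the request itself is unchanged
            logger.info("user message: %s", redact(content)[:200])
        return body


flt = Filter()
request = {"messages": [{"role": "user", "content": "mail bob@example.com please"}]}
returned = flt.inlet(request)
```

Redaction is applied only to the logged copy, so the model still receives the original text. If the text should also be hidden from the model, modify the message content as well.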

Recap

  • Inlet: preprocesses user input before sending to the model
  • Stream: intercepts and modifies streamed output chunks in real time (optional)
  • Outlet: postprocesses the completed output before display
  • Valves: optional configuration parameters to control Filter behavior without code changes

Notes

  • Filters should be installed only from trusted sources and reviewed before production use.
  • Validate that the Filter is compatible with your current IntraLLM AI version and its function signatures.
  • For complex integrations or multi-step workflows, consider implementing a Pipe Function instead.