Elevate Chat Safety with Murnitur: Advanced Custom Metrics
Explore how Murnitur Shield uses custom metrics to protect children's chat apps by preventing political discussions. Learn how the advanced_political_content_metric can be customized to ensure a safe and appropriate environment for young users, tailored to complex use cases.
Maintaining a safe and engaging environment is crucial when developing chat applications for children. Murnitur Shield offers robust tools for content moderation, including custom metrics to tailor the moderation process. This blog post will dive into a complex use case: preventing political discussions in a children's chat application using advanced custom metrics.
The Challenge: Detecting Political Content with Context
For a chat application designed for children, it's essential to avoid discussions about political topics. We'll use Murnitur Shield to implement a custom metric that not only identifies political keywords but also understands the context in which these terms are used. This will ensure more accurate and nuanced content moderation.
Step 1: Configuring Murnitur Shield
Set up Murnitur Shield using the Guard class directly, without instantiation:
from murnitur import Guard, GuardConfig
from murnitur.guard import Payload, RuleSet
# Create the configuration
config = GuardConfig()
Step 2: Develop an Advanced Custom Metric
We'll create a custom metric function that performs a more nuanced analysis. This function will:
- Check the message for specific political keywords, using word-boundary matching.
- Analyze the surrounding conversation context supplied in the payload.
- Use a simple scoring mechanism to decide whether that context is politically charged.
Advanced Custom Metric Function
Here's the advanced custom metric function:
import re
from typing import Tuple, Optional
from murnitur.guard import Payload
political_terms = [
"politics",
"government",
"election",
"policy",
"candidate",
"party",
"debate",
"congress",
"senate",
"president",
"parliament",
]
def advanced_political_content_metric(payload: Payload) -> Tuple[bool, Optional[str]]:
    # Retrieve the chat message from the payload
    chat_message = payload.get("output", "").lower()

    # Direct check: any political keyword in the message itself triggers the metric
    for term in political_terms:
        if re.search(r"\b" + re.escape(term) + r"\b", chat_message):
            return True, None

    # Context check: score the surrounding context; two or more political terms
    # there suggest the message is part of a politically charged conversation
    context_text = " ".join(payload.get("contexts", [])).lower()
    context_score = sum(
        1 for term in political_terms
        if re.search(r"\b" + re.escape(term) + r"\b", context_text)
    )
    if context_score >= 2:
        return True, None

    # No political content detected
    return False, chat_message
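Before registering the metric, you can sanity-check it by calling it directly with a plain dictionary standing in for the payload (a minimal local check, independent of the Shield API):
# Quick local check of the custom metric
sample_payload = {"output": "Who is your favorite candidate in the election?"}
flagged, _ = advanced_political_content_metric(sample_payload)
print(flagged)  # True: the message contains political keywords such as "election"

safe_payload = {"output": "What is your favorite dinosaur?"}
flagged, _ = advanced_political_content_metric(safe_payload)
print(flagged)  # False: no political keywords and no political context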
Registering the Custom Metric
Register the custom metric with the Guard class:
# Register the advanced custom metric
Guard.register_custom_metric('advanced_political_content', advanced_political_content_metric)
Step 3: Implementing Murnitur Shield with Custom Metrics
Configure Murnitur Shield to use the custom metric. Define rules and use the shield method to check the payload:
rulesets = [
{
"rules": [{"metric": "custom", "value": "advanced_political_content"}],
"action": {
"type": "OVERRIDE",
"fallback": "Sorry, political discussions are not allowed.",
},
}
]
payload = {
"output": "The upcoming election is a major topic in the news.",
"contexts": [
"The election year is causing a lot of debates.",
"Government policies are changing rapidly.",
],
}
# Check the payload using Murnitur Shield
response = Guard.shield(payload, rulesets, config)
print(response.text)
In this example:
- Advanced Custom Metric: Flags messages that contain political keywords and scores the surrounding context for additional political signals.
- Ruleset: Uses the custom metric to detect political content and override messages accordingly.
- Payload: Contains the chat output and context, which are evaluated for political content.
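For comparison, a harmless message should pass through without triggering the override. The sketch below assumes the shield leaves unflagged output untouched (check the Murnitur docs for the exact response fields):
# A non-political payload for comparison
safe_payload = {
    "output": "Let's talk about our favorite animals!",
    "contexts": ["The class visited the zoo yesterday."],
}

# The custom metric does not fire here, so the OVERRIDE fallback is not applied
safe_response = Guard.shield(safe_payload, rulesets, config)
print(safe_response.text)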
Conclusion
By implementing advanced custom metrics with Murnitur Shield, you can achieve precise content moderation for applications targeting children. The enhanced custom metric function ensures that political discussions are accurately identified and blocked, maintaining a safe and enjoyable chat environment. This approach not only filters out unwanted content but also respects the context in which terms are used, offering a robust solution for content moderation.