Regex Match

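To evaluate the string predictions of a chain or runnable against a custom regex, use the regex_match evaluator. The regex is passed as the reference, and the result is a score of 1 when the pattern matches the prediction and 0 when it does not.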

from langchain.evaluation import RegexMatchStringEvaluator

evaluator = RegexMatchStringEvaluator()

Alternatively via the loader:

from langchain.evaluation import load_evaluator

evaluator = load_evaluator("regex_match")
# Check for the presence of a YYYY-MM-DD string.
evaluator.evaluate_strings(
    prediction="The delivery will be made on 2024-01-05",
    reference=".*\\b\\d{4}-\\d{2}-\\d{2}\\b.*",
)
{'score': 1}
# Check for the presence of a MM-DD-YYYY string.
# The prediction only contains a YYYY-MM-DD date, so the pattern does not match.
evaluator.evaluate_strings(
    prediction="The delivery will be made on 2024-01-05",
    reference=".*\\b\\d{2}-\\d{2}-\\d{4}\\b.*",
)
{'score': 0}
# Check for the presence of a MM-DD-YYYY string.
evaluator.evaluate_strings(
    prediction="The delivery will be made on 01-05-2024",
    reference=".*\\b\\d{2}-\\d{2}-\\d{4}\\b.*",
)
{'score': 1}
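
The leading ".*" in the patterns above matters if the pattern is applied from the start of the prediction, which is how Python's re.match behaves. The sketch below illustrates that with the standard re module only. It assumes match-from-start semantics, and regex_score and DATE_ISO are hypothetical names introduced for illustration, not part of the evaluator's API.

```python
import re

# YYYY-MM-DD fragment without the leading ".*" (hypothetical name).
DATE_ISO = r"\b\d{4}-\d{2}-\d{2}\b"


def regex_score(prediction: str, pattern: str, flags: int = 0) -> int:
    """Hypothetical stand-in: 1 if `pattern` matches from the start of `prediction`, else 0."""
    return int(re.match(pattern, prediction, flags) is not None)


text = "The delivery will be made on 2024-01-05"

# Without ".*", the date would have to sit at position 0 of the prediction.
anchored = regex_score(text, DATE_ISO)
# With ".*", any text may precede the date.
prefixed = regex_score(text, ".*" + DATE_ISO)
print(anchored, prefixed)
```

Under that assumption, a pattern that should match anywhere in the prediction needs the ".*" prefix, as the reference strings above have.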

Match against multiple patterns

To match against multiple patterns, join them into a single regex with the alternation operator "|".

# Check for the presence of a MM-DD-YYYY string or YYYY-MM-DD
evaluator.evaluate_strings(
    prediction="The delivery will be made on 01-05-2024",
    reference="|".join(
        [".*\\b\\d{4}-\\d{2}-\\d{2}\\b.*", ".*\\b\\d{2}-\\d{2}-\\d{4}\\b.*"]
    ),
)
{'score': 1}
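
When the alternatives share a prefix such as ".*", one option is to write the prefix once and wrap the alternatives in a non-capturing group. Without the group, "|" has the lowest precedence, so a prefix placed before the first alternative would apply only to that alternative. The following is a minimal sketch using only the standard re module. It assumes match-from-start semantics as in re.match, and union_pattern, ISO_DATE and US_DATE are hypothetical names, not part of the library.

```python
import re

# Date fragments without the surrounding ".*" (hypothetical names for illustration).
ISO_DATE = r"\b\d{4}-\d{2}-\d{2}\b"  # YYYY-MM-DD
US_DATE = r"\b\d{2}-\d{2}-\d{4}\b"   # MM-DD-YYYY


def union_pattern(*fragments: str) -> str:
    """Wrap the alternatives in one non-capturing group so ".*" applies to all of them."""
    return ".*(?:" + "|".join(fragments) + ")"


pattern = union_pattern(ISO_DATE, US_DATE)

for text in (
    "The delivery will be made on 01-05-2024",
    "The delivery will be made on 2024-01-05",
    "The delivery will be made soon",
):
    # Report 1 when either date format is present, 0 otherwise.
    print(text, "->", int(re.match(pattern, text) is not None))
```

The resulting string could then be passed as the reference in place of the joined patterns shown above.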

Configure the RegexMatchStringEvaluator

You can specify any regex flags to use when matching.

import re

evaluator = RegexMatchStringEvaluator(flags=re.IGNORECASE)

# Alternatively
# evaluator = load_evaluator("regex_match", flags=re.IGNORECASE)
evaluator.evaluate_strings(
    prediction="I LOVE testing",
    reference="I love testing",
)
{'score': 1}
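
Flags from the re module are combined with the bitwise OR operator. The sketch below shows the effect of combining re.IGNORECASE with re.DOTALL using the standard re module directly. It assumes the flags value is handed to the re module unchanged and that matching starts at the beginning of the prediction; the sample strings are made up for illustration.

```python
import re

# Flags combine with bitwise OR.
flags = re.IGNORECASE | re.DOTALL

prediction = "Status:\nI LOVE testing"
pattern = ".*i love testing"

# Without flags, "." stops at the newline and the letter case differs: no match.
no_flags = int(re.match(pattern, prediction) is not None)
# With both flags, "." crosses the newline and letter case is ignored.
with_flags = int(re.match(pattern, prediction, flags) is not None)
print(no_flags, with_flags)
```

A combined value like this could be passed as the flags argument shown above, for example flags=re.IGNORECASE | re.DOTALL.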
