Custom Machine Learning Interpretations with Blocks

Tags: INTERPRETATION, SENTIMENT ANALYSIS

Prerequisite: This guide requires you to know about Blocks and the interpretation feature of Interfaces. Make sure to read the Guide to Blocks first, as well as the interpretation section of the Advanced Interface Features Guide.

Introduction

If you have experience working with the Interface class, then you know that interpreting the predictions of your machine learning model is as easy as setting the interpretation parameter to either "default" or "shap".
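
For reference, a minimal sketch of what that looks like with Interface is below. It assumes a Gradio 3.x release, where Interface still accepts the interpretation argument, and reuses the same sentiment classifier that the rest of this guide builds on:

import gradio as gr
from transformers import pipeline

# Same sentiment classifier used later in this guide
sentiment_classifier = pipeline("text-classification", return_all_scores=True)

def classifier(text):
    pred = sentiment_classifier(text)
    return {p["label"]: p["score"] for p in pred[0]}

# With Interface, interpretation is a single argument (Gradio 3.x)
demo = gr.Interface(
    fn=classifier,
    inputs=gr.Textbox(label="Input Text"),
    outputs=gr.Label(label="Predicted Sentiment"),
    interpretation="default",  # or "shap"
)
demo.launch()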

You may be wondering if it is possible to add the same interpretation functionality to an app built with the Blocks API. Not only is it possible, but the flexibility of Blocks lets you display the interpretation output in ways that are impossible with Interfaces!

This guide will show how to:

  1. Recreate the behavior of Interface's interpretation feature in a Blocks app.

  2. Customize how interpretations are displayed in a Blocks app.

Let's get started!

Setting up the Blocks app

Let's build a sentiment classification app with the Blocks API. This app will take text as input and output the probability that this text expresses either negative or positive sentiment. We'll have a single input Textbox and a single output Label component. Below is the code for the app as well as the app itself.

import gradio as gr 
from transformers import pipeline

sentiment_classifier = pipeline("text-classification", return_all_scores=True)

def classifier(text):
    pred = sentiment_classifier(text)
    return {p["label"]: p["score"] for p in pred[0]}

with gr.Blocks() as demo:
    with gr.Row():
        with gr.Column():
            input_text = gr.Textbox(label="Input Text")
            with gr.Row():
                classify = gr.Button("Classify Sentiment")
        with gr.Column():
            label = gr.Label(label="Predicted Sentiment")

    classify.click(classifier, input_text, label)
demo.launch()

Adding interpretations to the app

Our goal is to show our users how the words in the input contribute to the model's prediction. This will help our users understand how the model works and also evaluate its effectiveness. For example, we should expect our model to identify the words "happy" and "love" with positive sentiment - if not, it's a sign we made a mistake in training it!

For each word in the input, we will compute a score of how much the model's prediction of positive sentiment is changed by that word. Once we have those (word, score) pairs, we can use gradio to visualize them for the user.

The shap library will help us compute the (word, score) pairs, and gradio will take care of displaying the output to the user.

The following code computes the (word, score) pairs:

import shap

def interpretation_function(text):
    explainer = shap.Explainer(sentiment_classifier)
    shap_values = explainer([text])

    # Dimensions are (batch size, text size, number of classes)
    # Since we care about positive sentiment, use index 1
    scores = list(zip(shap_values.data[0], shap_values.values[0, :, 1]))
    # Scores contains (word, score) pairs

    # Format expected by gr.components.Interpretation
    return {"original": text, "interpretation": scores}

Now, all we have to do is add a button that runs this function when clicked. To display the interpretation, we will use gr.components.Interpretation. This will color each word in the input either red or blue: red if it contributes to positive sentiment, and blue if it contributes to negative sentiment. This is how Interface displays the interpretation output for text.

with gr.Blocks() as demo:
    with gr.Row():
        with gr.Column():
            input_text = gr.Textbox(label="Input Text")
            with gr.Row():
                classify = gr.Button("Classify Sentiment")
                interpret = gr.Button("Interpret")
        with gr.Column():
            label = gr.Label(label="Predicted Sentiment")
        with gr.Column():
            interpretation = gr.components.Interpretation(input_text)
    classify.click(classifier, input_text, label)
    interpret.click(interpretation_function, input_text, interpretation)

demo.launch()

Customizing how the interpretation is displayed

The gr.components.Interpretation component does a good job of showing how individual words contribute to the sentiment prediction, but what if we also wanted to display the scores themselves along with the words?

One way to do this would be to generate a bar plot where the words are on the horizontal axis and the bar height corresponds to the shap score.

We can do this by modifying our interpretation_function to additionally return a matplotlib bar plot. We will display it with the gr.Plot component in a separate tab.

This is how the interpretation function will look:

import shap
import matplotlib.pyplot as plt

def interpretation_function(text):
    explainer = shap.Explainer(sentiment_classifier)
    shap_values = explainer([text])
    # Dimensions are (batch size, text size, number of classes)
    # Since we care about positive sentiment, use index 1
    scores = list(zip(shap_values.data[0], shap_values.values[0, :, 1]))

    # Sort the (word, score) pairs by score, descending
    scores_desc = sorted(scores, key=lambda t: t[1])[::-1]

    # Filter out the empty string added by shap
    scores_desc = [t for t in scores_desc if t[0] != ""]

    fig_m = plt.figure()

    # Select the top 5 words that contribute to positive sentiment
    plt.bar(x=[s[0] for s in scores_desc[:5]],
            height=[s[1] for s in scores_desc[:5]])
    plt.title("Top words contributing to positive sentiment")
    plt.ylabel("Shap Value")
    plt.xlabel("Word")
    return {"original": text, "interpretation": scores}, fig_m

And this is how the app code will look. Note that interpretation_function now returns two values, matching the two output components ([interpretation, interpretation_plot]) passed to interpret.click:

with gr.Blocks() as demo:
    with gr.Row():
        with gr.Column():
            input_text = gr.Textbox(label="Input Text")
            with gr.Row():
                classify = gr.Button("Classify Sentiment")
                interpret = gr.Button("Interpret")
        with gr.Column():
            label = gr.Label(label="Predicted Sentiment")
        with gr.Column():
            with gr.Tabs():
                with gr.TabItem("Display interpretation with built-in component"):
                    interpretation = gr.components.Interpretation(input_text)
                with gr.TabItem("Display interpretation with plot"):
                    interpretation_plot = gr.Plot()

    classify.click(classifier, input_text, label)
    interpret.click(interpretation_function, input_text, [interpretation, interpretation_plot])

demo.launch()

You can see the demo below!

Beyond Sentiment Classification

Although we have focused on sentiment classification so far, you can add interpretations to almost any machine learning model. The output must be a gr.Image or gr.Label, but the input can be almost anything (gr.Number, gr.Slider, gr.Radio, gr.Image).

Here is a demo of interpretations for an image classification model, built with Blocks:
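
As a rough sketch of the same idea (not that demo's actual code), the app below interprets an image classifier with a simple occlusion heuristic instead of shap and displays the resulting heatmap with gr.Plot; the model checkpoint, patch size, and layout are illustrative assumptions:

import numpy as np
import matplotlib.pyplot as plt
import gradio as gr
from PIL import Image
from transformers import pipeline

# Illustrative model choice; any image-classification pipeline would work
image_classifier = pipeline("image-classification")

def classify_image(img):
    preds = image_classifier(img)
    return {p["label"]: p["score"] for p in preds}

def interpret_image(img):
    # Occlusion-based interpretation: black out one square patch at a time
    # and measure how much the top class's probability drops.
    preds = image_classifier(img)
    top_label, top_score = preds[0]["label"], preds[0]["score"]

    arr = np.array(img)
    patch = 64  # illustrative patch size
    rows = int(np.ceil(arr.shape[0] / patch))
    cols = int(np.ceil(arr.shape[1] / patch))
    heat = np.zeros((rows, cols))

    for i in range(rows):
        for j in range(cols):
            occluded = arr.copy()
            occluded[i * patch:(i + 1) * patch, j * patch:(j + 1) * patch] = 0
            occluded_preds = image_classifier(Image.fromarray(occluded))
            score = next((p["score"] for p in occluded_preds if p["label"] == top_label), 0.0)
            heat[i, j] = top_score - score  # a large drop marks an important region

    fig = plt.figure()
    plt.imshow(heat, cmap="coolwarm")
    plt.title(f"Occlusion importance for '{top_label}'")
    plt.colorbar()
    return fig

with gr.Blocks() as demo:
    with gr.Row():
        with gr.Column():
            input_img = gr.Image(type="pil", label="Input Image")
            with gr.Row():
                classify = gr.Button("Classify")
                interpret = gr.Button("Interpret")
        with gr.Column():
            label = gr.Label(label="Predicted Class")
            interpretation_plot = gr.Plot()

    classify.click(classify_image, input_img, label)
    interpret.click(interpret_image, input_img, interpretation_plot)

demo.launch()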

Closing remarks

We did a deep dive 🤿 into how interpretations work and how you can add them to your Blocks app.

We also showed how the Blocks API gives you the power to control how the interpretation is visualized in your app.

Adding interpretations is a helpful way to help your users understand and build trust in your model. Now you have all the tools you need to add them to all of your apps!