Related spaces: https://huggingface.co/spaces/abidlabs/keras-image-classifier
Tags: VISION, MOBILENET, TENSORFLOW
Image classification is a central task in computer vision. Building better classifiers to identify the object present in a picture is an active area of research, with applications ranging from traffic control systems to satellite imaging.
Such models are a natural fit for Gradio's image input component, so in this tutorial we will build a web demo that classifies images using Gradio. We will be able to build the whole web application in Python, and it will look like this (try one of the examples!):
Let's get started!
Make sure you have the gradio Python package installed. We will be using a pretrained Keras image classification model, so you should also have tensorflow installed.
First, we need an image classification model. For this tutorial, we will use a pretrained MobileNet model, as it is easy to download through Keras. You can use a different pretrained model or train your own.
import tensorflow as tf
inception_net = tf.keras.applications.MobileNetV2()
This line uses the Keras library to download the MobileNetV2 model and its weights automatically.
Next, we need to define a function that takes the user input, which in this case is an image, and returns the prediction. The prediction should be returned as a dictionary whose keys are class names and whose values are confidence probabilities. We will load the class names from a text file of human-readable ImageNet labels.
For our pretrained model, it looks like this:
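import requests

# Download human-readable labels for ImageNet.
response = requests.get("https://git.io/JJkYN")
labels = response.text.split("\n")

def classify_image(inp):
    # Add a batch dimension: (224, 224, 3) -> (1, 224, 224, 3).
    inp = inp.reshape((-1, 224, 224, 3))
    # Scale pixel values the way MobileNetV2 expects.
    inp = tf.keras.applications.mobilenet_v2.preprocess_input(inp)
    # Run the model and flatten its output into 1,000 class scores.
    prediction = inception_net.predict(inp).flatten()
    # Map each class label to its confidence.
    confidences = {labels[i]: float(prediction[i]) for i in range(1000)}
    return confidences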
Let's break this down. The function takes one parameter:
inp: the input image as a numpy array
The function then adds a batch dimension, passes the image through the model, and returns:
confidences: the predictions, as a dictionary whose keys are class labels and whose values are confidence probabilities
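The mapping from model scores to the returned dictionary can be sketched without TensorFlow. The following is a minimal illustration, not the tutorial's code: `build_confidences`, the three-entry label text, and the scores are all made up for the example. It uses `splitlines()` so that a trailing newline in a label file does not produce an empty label.

```python
def build_confidences(label_text, scores):
    """Pair each class label with its score, as classify_image does."""
    # splitlines() ignores a trailing newline, unlike split("\n").
    labels = label_text.splitlines()
    if len(labels) < len(scores):
        raise ValueError("fewer labels than scores")
    # Cast to float so the values are plain Python numbers.
    return {labels[i]: float(scores[i]) for i in range(len(scores))}


# Hypothetical three-class label file and scores, for illustration only.
example_text = "tench\ngoldfish\ngreat white shark\n"
example_scores = [0.1, 0.7, 0.2]
confidences = build_confidences(example_text, example_scores)
print(confidences)
```

A dictionary of this shape, with one entry per class, is what a label output component can consume directly.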
Now that we have our prediction function set up, we can create a Gradio Interface around it.
In this case, the input component is a drag-and-drop image component. To create this input, we can use the gr.Image class, which creates the component and handles the preprocessing needed to convert the image to a numpy array. We will instantiate the class with a shape parameter that automatically resizes the input image to 224 by 224 pixels, the size that MobileNet expects.
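To make the shapes concrete, here is a small NumPy sketch of what happens to the array once it reaches the prediction function. It assumes the image component delivers an RGB array of shape (224, 224, 3) with values from 0 to 255. The random image is a stand-in, and the scaling line mirrors what `mobilenet_v2.preprocess_input` is documented to do (rescale pixels to the range -1 to 1) rather than calling it.

```python
import numpy as np

# Stand-in for the image Gradio passes in: height x width x RGB channels.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(224, 224, 3)).astype("float32")

# -1 lets NumPy infer the batch size, which is 1 for a single image.
batch = image.reshape((-1, 224, 224, 3))

# Assumed equivalent of MobileNetV2 preprocessing: map [0, 255] to [-1, 1].
scaled = batch / 127.5 - 1.0

print(batch.shape)
```

If the incoming image were not already 224 by 224, the reshape would either fail or silently produce the wrong batch size, which is why the resizing has to happen before this step.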
The output component will be a "label", which displays the top labels in a readable form. Since we don't want to show all 1,000 class labels, we will customize it to show only the top 3 labels.
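Conceptually, limiting the label output to three classes amounts to sorting the confidence dictionary and keeping the largest entries. The sketch below illustrates that idea in plain Python. `top_k` is a hypothetical helper, not part of Gradio, and the confidence values are invented.

```python
def top_k(confidences, k=3):
    """Return the k (label, confidence) pairs with the highest confidence."""
    ranked = sorted(confidences.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Invented confidences for illustration.
example = {"banana": 0.82, "lemon": 0.09, "orange": 0.05, "sports car": 0.04}
top_three = top_k(example)
print(top_three)
```

The prediction function can therefore return all 1,000 entries and leave the truncation to the output component.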
Finally, we'll add one more parameter, examples, which lets us prepopulate the interface with a few predefined examples. The code for Gradio looks like this:
import gradio as gr

gr.Interface(fn=classify_image,
             inputs=gr.Image(shape=(224, 224)),
             outputs=gr.Label(num_top_classes=3),
             examples=["banana.jpg", "car.jpg"]).launch()
This produces the following interface, which you can try right here in your browser (try uploading your own examples!):
And you're done! That's all the code you need to build a web demo for an image classifier. If you'd like to share it with others, try setting share=True when you launch() the Interface!