9.2 TensorFlow Lite

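TensorFlow Lite (TFLite) is TensorFlow's lightweight deployment format, commonly used on phones, embedded devices, and other edge hardware. This section trains a Keras model on a synthetic three-class tabular classification dataset, converts it to a .tflite file, and then verifies the inference results with the TFLite Interpreter.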

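1. Learning objectives

A trained Keras model usually cannot be run directly on a phone or edge device. TensorFlow Lite converts the model into a more compact format so that inference can run in environments with limited resources.

The workflow in this section is:

Train a Keras model -> save as .keras -> convert to .tflite -> run inference with the Interpreter -> compare results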

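2. When is TensorFlow Lite a good fit?

Scenario	Description
Mobile apps	On-device inference on Android and iOS
Edge devices	Raspberry Pi, IoT devices, offline equipment
Low-latency inference	Avoids sending data to a server for every prediction
Privacy requirements	Data stays on the user's device and is not uploaded to a backend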

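If the model mainly runs behind a server API, you would usually first consider serving a .keras or SavedModel file, for example through FastAPI. Consider TFLite when the model has to run on a mobile or edge device.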

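3. Example data and model

The notebook uses scikit-learn's make_classification to generate a three-class tabular classification dataset and builds a small DNN on it. This kind of data trains quickly, which keeps the focus on model conversion and inference verification.

The model takes standardized numeric features as input and outputs softmax probabilities for the three classes. After conversion to TFLite, inference must still use the same preprocessing pipeline; otherwise the results will not match those of the Keras model.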
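The point about reusing the same preprocessing can be illustrated with a minimal sketch. It assumes plain per-feature standardization with NumPy; the helper names `fit_standardizer` and `standardize`, the feature count of 4, and the random data are all hypothetical stand-ins rather than the notebook's actual code.

```python
import numpy as np

def fit_standardizer(train_features):
    """Compute per-feature mean and std from the training set only."""
    mean = train_features.mean(axis=0)
    std = train_features.std(axis=0)
    # Guard against division by zero for constant features.
    std = np.where(std == 0.0, 1.0, std)
    return mean, std

def standardize(features, mean, std):
    """Apply stored statistics and cast to float32 for the model input."""
    return ((features - mean) / std).astype(np.float32)

# Hypothetical data: 200 training rows and one new row, 4 features each.
rng = np.random.default_rng(0)
train = rng.normal(loc=5.0, scale=2.0, size=(200, 4))
new_sample = rng.normal(loc=5.0, scale=2.0, size=(1, 4))

# Fit once at training time, then reuse the same mean/std at inference time.
mean, std = fit_standardizer(train)
train_scaled = standardize(train, mean, std)
sample_scaled = standardize(new_sample, mean, std)
```

The important design choice is that `mean` and `std` are computed once and stored, so the device-side code applies exactly the statistics the model was trained with instead of recomputing them on new data.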

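4. Converting to TFLite

The most basic conversion looks like this:

import tensorflow as tf

# `model` is the trained Keras model from the previous step.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()  # serialized model as bytes

# Write the serialized model to disk in binary mode.
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)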

To shrink the model further, enable optimization:

# Set this before calling converter.convert().
converter.optimizations = [tf.lite.Optimize.DEFAULT]

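An optimized model is usually smaller, largely because the converter can store weights at reduced precision, so its inference results and accuracy must be verified again.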
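To see why a size-optimized model needs to be revalidated, the following sketch quantizes a float32 weight array to int8 and measures the round-trip error. It is a simplified symmetric scheme (one scale, zero point fixed at 0) written with NumPy for illustration only; it is not the converter's implementation, and the function names and the random weights are hypothetical.

```python
import numpy as np

def quantize_int8_symmetric(weights):
    """Map float weights to int8 using a single scale and zero point 0."""
    max_abs = float(np.max(np.abs(weights)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float32 values from the int8 representation."""
    return q.astype(np.float32) * scale

# Hypothetical weight matrix standing in for one dense layer.
rng = np.random.default_rng(0)
weights = rng.normal(scale=0.5, size=(16, 3)).astype(np.float32)

q, scale = quantize_int8_symmetric(weights)
restored = dequantize(q, scale)
max_error = float(np.max(np.abs(weights - restored)))

print(f"float32 bytes: {weights.nbytes}, int8 bytes: {q.nbytes}")
print(f"max round-trip error: {max_error:.5f}")
```

Storage drops to a quarter, while each weight moves by at most about half a quantization step. Whether that shift changes any predictions depends on the model, which is why the comparison against the original has to be rerun.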

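5. Running inference with the Interpreter

TFLite does not use model.predict(). Instead, you set the input tensor through the Interpreter, run inference, and then read the output tensor:

import tensorflow as tf

# Load the converted model and allocate memory for its tensors.
interpreter = tf.lite.Interpreter(model_path='model.tflite')
interpreter.allocate_tensors()

# Each entry describes one tensor, including its index, shape, and dtype.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# `sample` must be a NumPy array matching the input's shape and dtype
# (typically a float32 array with a leading batch dimension of 1).
interpreter.set_tensor(input_details[0]['index'], sample)
interpreter.invoke()
prediction = interpreter.get_tensor(output_details[0]['index'])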

The notebook compares the predicted classes from Keras and TFLite to confirm that the converted model still behaves as expected.
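A comparison of this kind could be sketched as follows. The helper `compare_predictions` and the two small probability arrays are hypothetical; in practice the arrays would come from model.predict() and from collecting the Interpreter outputs for the same samples.

```python
import numpy as np

def compare_predictions(keras_probs, tflite_probs, atol=1e-5):
    """Summarize how closely two sets of class probabilities agree."""
    keras_probs = np.asarray(keras_probs)
    tflite_probs = np.asarray(tflite_probs)
    if keras_probs.shape != tflite_probs.shape:
        raise ValueError(
            f"shape mismatch: {keras_probs.shape} vs {tflite_probs.shape}"
        )
    keras_classes = keras_probs.argmax(axis=1)
    tflite_classes = tflite_probs.argmax(axis=1)
    return {
        # Fraction of samples where both models pick the same class.
        "class_agreement": float(np.mean(keras_classes == tflite_classes)),
        # Largest absolute difference between any pair of probabilities.
        "max_abs_diff": float(np.max(np.abs(keras_probs - tflite_probs))),
        # True if every probability differs by at most `atol`.
        "within_tolerance": bool(
            np.allclose(keras_probs, tflite_probs, rtol=0.0, atol=atol)
        ),
    }

# Made-up probabilities for two samples and three classes.
keras_probs = np.array([[0.70, 0.20, 0.10],
                        [0.10, 0.30, 0.60]], dtype=np.float32)
tflite_probs = np.array([[0.69, 0.21, 0.10],
                         [0.12, 0.30, 0.58]], dtype=np.float32)

report = compare_predictions(keras_probs, tflite_probs, atol=0.05)
print(report)
```

Reporting the probability gap alongside class agreement is useful because two models can agree on every class while their probabilities drift, which tends to show up first after quantization.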

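6. Applying this to your own model

When substituting your own model, check the following:

The model can be saved to and loaded from .keras successfully.
The inference preprocessing pipeline is fixed, for example image resize, normalization, or StandardScaler.
The TFLite input dtype and shape match what the device-side code provides.
After conversion, sample some inputs and compare the Keras and TFLite predictions.
If you use quantization, additionally check that accuracy has not dropped too much.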
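The dtype and shape item in the checklist above can be checked mechanically before calling set_tensor(). The sketch below assumes an entry shaped like those returned by get_input_details(), with 'shape' and 'dtype' keys; `check_input` is a hypothetical helper, and `fake_detail` is a hand-built stand-in so the example runs without a model file. It does not handle dynamic dimensions.

```python
import numpy as np

def check_input(sample, input_detail):
    """Return a list of mismatches between a sample and the expected input."""
    expected_dtype = np.dtype(input_detail["dtype"])
    expected_shape = tuple(int(d) for d in input_detail["shape"])
    problems = []
    if sample.dtype != expected_dtype:
        problems.append(f"dtype {sample.dtype} != expected {expected_dtype}")
    if tuple(sample.shape) != expected_shape:
        problems.append(f"shape {sample.shape} != expected {expected_shape}")
    return problems

# Stand-in for input_details[0]; the (1, 4) shape is an assumption.
fake_detail = {
    "index": 0,
    "shape": np.array([1, 4], dtype=np.int32),
    "dtype": np.float32,
}

good_sample = np.zeros((1, 4), dtype=np.float32)
bad_sample = np.zeros((4,), dtype=np.float64)  # missing batch dim, wrong dtype

print(check_input(good_sample, fake_detail))
print(check_input(bad_sample, fake_detail))
```

Returning a list of problems rather than raising on the first one makes it easier to log every mismatch at once when debugging device-side code.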

When deploying, do not keep only the .tflite file. Also keep the class names, input shape, preprocessing specification, and model version.
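One way to keep this information next to the model is a small JSON sidecar file. The sketch below uses only the standard library; the file name, field names, version string, class names, and feature count are all hypothetical placeholders, not a format that TFLite itself defines.

```python
import json
import tempfile
from pathlib import Path

# All values below are illustrative placeholders.
metadata = {
    "model_file": "model.tflite",
    "model_version": "0.1.0",
    "class_names": ["class_0", "class_1", "class_2"],
    "input_shape": [1, 4],
    "input_dtype": "float32",
    "preprocessing": {
        "type": "standardize",
        "mean": [0.0, 0.0, 0.0, 0.0],
        "std": [1.0, 1.0, 1.0, 1.0],
    },
}

def save_metadata(meta, path):
    """Write the metadata as human-readable JSON."""
    Path(path).write_text(json.dumps(meta, indent=2), encoding="utf-8")

def load_metadata(path):
    """Read the metadata back into a dictionary."""
    return json.loads(Path(path).read_text(encoding="utf-8"))

# Round-trip through a temporary directory to show that nothing is lost.
with tempfile.TemporaryDirectory() as tmp:
    meta_path = Path(tmp) / "model_metadata.json"
    save_metadata(metadata, meta_path)
    restored = load_metadata(meta_path)

print(restored["class_names"])
```

Shipping this file together with the .tflite file lets the device-side code map output indices to class names and apply the right preprocessing without hard-coding either.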

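7. Summary

TensorFlow Lite is an important format for bringing models to phones and edge devices. Completing the conversion is only the first step. What matters more is verifying the inference results with the Interpreter and making sure the preprocessing, input shape, and output classes all match the original Keras model.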