TensorFlow.jsというのがあって、これはTensorflowで作成したモデルをブラウザで使えるようにできるライブラリなのだが、これを利用して手の検出をリアルタイムにブラウザ上で行えるライブラリを使う機会があったのでメモ。

サンプルコード

必要最低限の動くコード。HTMLにcanvasとvideoタグがあるが、WEBカメラで取得した映像をvideoタグに反映させて、そこから手を検出した画像がcanvasに反映されていく。試したほうが速い

<div id="message">loading model...</div>
<canvas id="mycanvas" width="640" height="480"></canvas>
<video id="myvideo" width="640" height="480"></video>

<script src="https://cdn.jsdelivr.net/npm/handtrackjs@0.0.13/dist/handtrack.min.js"></script>
<script>
const canvas = document.getElementById('mycanvas');
const context = canvas.getContext('2d');
const video = document.getElementById('myvideo');
let model;
const options = {
  flipHorizontal: true,   // flip e.g for video  
  maxNumBoxes: 3,        // maximum number of boxes to detect
  iouThreshold: 0.5,      // ioU threshold for non-max suppression
  scoreThreshold: 0.7,    // confidence threshold for predictions.
};
handTrack.load(options).then(l_model => {
  model = l_model;
  document.getElementById('message').innerText = 'loaded!';
  handTrack.startVideo(video).then(function (status) {
    if (status) {
      console.log("video started", status);
      runDetection();
    } else {
      console.log("video error", status);
    }
  });
});

function runDetection() {
  model.detect(video).then(predictions => {
      console.log("Predictions: ", predictions);
      model.renderPredictions(predictions, canvas, context, video);
      requestAnimationFrame(runDetection);
  });
}
</script>

ポイントはいくつかって

オプションの

flipHorizontal WEBカメラ経由とかだと鏡になるので左右逆になる　これを補正するかどうか
maxNumBoxes: 3 一度に最大どれだけの手を検出するか
iouThreshold: 0.5 うーんわからん
scoreThreshold: 0.7 手とみなすしきい値　1に近ければ近いほど確からしくないと手として認めてくれなくなる

で、

handTrack.load()
handTrack.startVideo()
model.detect()
model.renderPredictions()
model.renderPredictions()

の順番に実行していく

細かいところは公式のREADME見たほうがいい　というかそれしかないが

動かざることバグの如し

近づきたいよ君の理想に

リアルタイムに手を検出できるJavascriptライブラリ「handtrack.js」

サンプルコード