Modeling tennis moves/shots (Part 1)

Can I reproduce what you see in tennis tournaments when it comes to modeling the path of balls? With YOLO, Kalman filtering, and other computer vision techniques, it is possible.

This is pose estimation on the detected object. Nowadays, YOLO can do this as well. The pose estimation here was done using MMPose, which makes this a two-stage process: detection first, then pose estimation. Two-stage processes introduce more latency, so this is not the best option for live streaming.

Once we have the pose estimates, you can take sequences of poses and label them, for example as “forehand”, “backhand”, and “at rest”. The result is below.
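One way to turn per-frame poses into labeled sequences is to slice the pose stream into fixed-length windows and give each window a single label. The sketch below assumes each pose is a list of (x, y) keypoints and that per-frame labels already exist. The function name, window length, stride, and majority-vote rule are illustrative choices, not details of the pipeline used here.

```python
from collections import Counter


def make_windows(poses, frame_labels, window=30, stride=15):
    """Slice per-frame poses into overlapping windows with one label each.

    poses:        list of per-frame poses (each a list of (x, y) keypoints)
    frame_labels: list of per-frame labels, same length as poses
    window and stride are hypothetical defaults, not values from the article.
    """
    if len(poses) != len(frame_labels):
        raise ValueError("poses and frame_labels must have the same length")
    samples = []
    for start in range(0, len(poses) - window + 1, stride):
        end = start + window
        # Majority vote over the frame labels inside this window.
        label = Counter(frame_labels[start:end]).most_common(1)[0][0]
        samples.append((poses[start:end], label))
    return samples
```

Each returned sample pairs a window of poses with a label such as "forehand", which is the shape a sequence classifier would typically train on.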

Workflow

Capture Still Frames
- Take still images from a tennis broadcast where the camera is relatively still (e.g., overhead or side views).
- Choose a consistent frame size for your dataset, such as 640x320 pixels.
Annotate Keypoints
- Manually annotate each image with keypoints (e.g., court corners, service boxes).
- Keep the order of keypoints consistent across all annotations.
Prepare the Dataset
- Resize or crop all images to the chosen resolution (e.g., 640x320).
- Normalize keypoint coordinates to be in the range [0, 1] relative to image width and height.
Train a CNN
- Use a convolutional neural network (CNN) that accepts input images of shape (3, 320, 640) — 3 color channels, 320 pixels high, 640 pixels wide.
- The model should output a fixed-length vector representing the keypoint coordinates (e.g., 8 values for 4 (x, y) points).
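The normalization step in the workflow above can be sketched in plain Python. This is a minimal example, assuming 640x320 images and four annotated court points. The function names and sample coordinates are made up for illustration.

```python
def normalize_keypoints(points, width=640, height=320):
    """Map pixel (x, y) keypoints to a flat target vector in [0, 1]."""
    flat = []
    for x, y in points:
        flat.extend([x / width, y / height])
    return flat


def denormalize_keypoints(flat, width=640, height=320):
    """Map a flat [0, 1] vector back to pixel (x, y) pairs."""
    return [(flat[i] * width, flat[i + 1] * height)
            for i in range(0, len(flat), 2)]


# Hypothetical annotation: four points in a fixed, consistent order.
corners = [(0, 0), (640, 0), (640, 320), (320, 160)]
target = normalize_keypoints(corners)  # 8 values for 4 (x, y) points
```

The flat 8-value vector is the kind of training target the CNN would be asked to predict, and `denormalize_keypoints` converts a prediction back to pixel coordinates for drawing on the frame.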

The racket is only a rough estimate: using the player’s handedness, it is constructed at a right angle to the line from the elbow to the wrist.
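A rough version of that construction might look like the following. This is a sketch under several assumptions that do not come from the article: handedness selects which arm's keypoints are used, the racket starts at the wrist, its length is a fixed multiple of the forearm length, and the side it points to flips with handedness. The keypoint names and `length_ratio` are hypothetical.

```python
import math


def estimate_racket(keypoints, handedness="right", length_ratio=1.5):
    """Return (handle, tip) points for a racket drawn from the wrist.

    keypoints: dict with "<side>_elbow" and "<side>_wrist" (x, y) entries.
    Assumption: the racket is perpendicular to the elbow-to-wrist segment
    and length_ratio times as long as the forearm.
    """
    elbow = keypoints[f"{handedness}_elbow"]
    wrist = keypoints[f"{handedness}_wrist"]
    fx, fy = wrist[0] - elbow[0], wrist[1] - elbow[1]
    forearm = math.hypot(fx, fy)
    if forearm == 0:
        return wrist, wrist  # degenerate pose: elbow and wrist coincide
    # Rotate the forearm direction by 90 degrees; the sign picks the side.
    sign = 1.0 if handedness == "right" else -1.0
    px, py = -fy / forearm * sign, fx / forearm * sign
    reach = forearm * length_ratio
    tip = (wrist[0] + px * reach, wrist[1] + py * reach)
    return wrist, tip
```

Because the result depends only on two keypoints, the drawn racket will not follow the real racket during wrist rotation, which is why it should be read as a visual aid rather than a measurement.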

How about pickleball?

As an aside, the same could be done with pickleball.
