Course: RAS 598 — Mobile Robotics
Project: Autonomous Human-Following Mobile Robot
Platform: TurtleBot 4 with iRobot Create 3 base
Sensors: OAK-D RGB camera, 2D LiDAR
Software: ROS 2 Jazzy, Python
Team: Harsh Padmalwar, Atharv Kulkarni


Autonomous Human-Following Mobile Robot — Milestone 3

1. Milestone 3 Context

This Milestone 3 report builds directly on our Milestone 2 implementation. In Milestone 2, our TurtleBot 4 could detect a person using YOLO, estimate the target distance and heading, publish follow commands, pass those commands through a safety layer, and drive the physical robot. The main weakness we identified was not the basic ROS 2 pipeline but the quality and robustness of the following behavior itself. During hardware testing, the robot sometimes drifted while moving forward, the target could slip out of the camera frame, and the controller could fall into unguided search whenever the person was temporarily lost.

For Milestone 3, we focused on the specific issues that were listed as future work in Milestone 2:

  • improve heading stability during forward motion
  • keep the person closer to the center of the camera view
  • search in the direction where the person was last seen
  • reacquire the target after short target loss
  • reduce unstable behavior caused by brief YOLO detection flicker
  • improve the response when multiple people appear in the camera view

The final system is still organized as a topic-driven ROS 2 stack, but we added more robust target tracking, tuned the follow controller, and expanded safety timing so the robot can search and recover instead of immediately stopping during short target-loss events.


2. Graphical Abstract

```mermaid
flowchart LR
    A[OAK-D RGB Camera] --> B[YOLO Person Detection]
    B --> C[Bounding Boxes]
    C --> D[Multi-Person Tracker<br/>Kalman + IoU + Hungarian]
    E[2D LiDAR /scan] --> F[Range at Target Bearing]
    D --> G[Locked Target ID]
    F --> H[Distance + Heading Estimate]
    G --> H
    H --> I[Follow Controller<br/>Range + Bearing Control]
    I --> J[Safety Supervisor<br/>Timeouts + Obstacle Check]
    J --> K[Stamped to Unstamped Twist Bridge]
    K --> L[TurtleBot 4 Base]
```
Mission summary: The robot uses the OAK-D camera to detect people, a custom tracker to keep a stable selected target, LiDAR to estimate range at the target bearing, and a follow controller to maintain a safe following distance. A safety supervisor remains between the controller and the base so the robot stops when sensor data, control commands, target visibility, or obstacle clearance become unsafe.


3. What Changed Since Milestone 2

| Area | Milestone 2 state | Milestone 3 update |
|---|---|---|
| Person detection | YOLO detected a person in the camera frame | YOLO detections are now passed into a custom multi-person tracker |
| Target identity | The system mainly followed the current/best visible detection | The system locks onto a target ID and tries to keep following that ID |
| Multiple people | If someone entered the scene, the robot could switch or lose confidence | Kalman + IoU + Hungarian matching improves target consistency |
| Short target loss | Target flicker could cause unstable stop/search behavior | Track miss counters, filters, and a longer safety timeout support recovery |
| Search behavior | Search was not strongly tied to where the target disappeared | The controller rotates toward the last seen target bearing |
| Forward following | The robot could move in a curved/drifting path while approaching | Gains and speed limits were tuned to reduce aggressive curved motion |
| Safety timing | Shorter sensor and target timeouts | Sensor/controller timeouts set to 3.0 s and the target-loss safety timeout set to 10.50 s |

These updates directly address the questions raised during earlier feedback: what happens if another person enters the scene, whether the robot estimates target motion during temporary loss, and whether the robot can recover after the person leaves the camera frame for a short time.


4. Final Runtime Architecture

The active ROS 2 pipeline is:

```
OAK-D camera + LiDAR
        ↓
target_tracker.py + multi_tracker.py
        ↓
follow_controller.py
        ↓
safety_supervisor.py
        ↓
person_follower.py
        ↓
TurtleBot 4 base
```

target_tracker.py subscribes to the OAK-D compressed image stream and the /scan LiDAR topic. YOLO detects people in the image, and the multi-person tracker assigns stable track IDs. The camera is used for person detection and bearing estimation, while LiDAR is used to estimate the distance along that bearing. The output target state is published as /see/target_in_frame, /see/target_distance, /see/target_heading, and /see/person_location.

follow_controller.py consumes the target state and generates a stamped velocity command on /think/follow_cmd_vel. safety_supervisor.py checks that the command, camera, LiDAR, and target state are fresh and safe before publishing to /cmd_vel. person_follower.py converts the safe stamped command into the unstamped Twist command used by the TurtleBot base on /cmd_vel_unstamped.
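
As a concrete illustration of the final pipeline stage, the bridge step can be written as a very small node. The sketch below is a minimal stand-in for person_follower.py under the topic names described above, not the actual file; in particular, the command-timeout check that the real bridge performs is omitted here.

```python
# Minimal sketch of the stamped-to-unstamped bridge (illustrative only;
# the real person_follower.py also enforces a command timeout).
import rclpy
from rclpy.node import Node
from geometry_msgs.msg import Twist, TwistStamped

class TwistBridge(Node):
    def __init__(self):
        super().__init__('twist_bridge')
        self.pub = self.create_publisher(Twist, '/cmd_vel_unstamped', 10)
        self.create_subscription(TwistStamped, '/cmd_vel', self.forward, 10)

    def forward(self, msg: TwistStamped):
        # Drop the header and pass the velocity through to the base.
        self.pub.publish(msg.twist)

def main():
    rclpy.init()
    rclpy.spin(TwistBridge())

if __name__ == '__main__':
    main()
```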


5. Core Algorithm

5.1 Multi-person tracking

The major Milestone 3 custom module is multi_tracker.py. Each tracked person is represented by an 8-state bounding-box model:

\[\mathbf{x} = [x_1,y_1,x_2,y_2,\dot{x}_1,\dot{y}_1,\dot{x}_2,\dot{y}_2]^T\]

YOLO provides a bounding-box measurement:

\[\mathbf{z} = [x_1,y_1,x_2,y_2]^T\]

The tracker uses a constant-velocity Kalman prediction model:

\[\mathbf{x}_{k|k-1}=F\mathbf{x}_{k-1|k-1}\]

and a measurement model that observes only the bounding-box position:

\[\mathbf{z}_k=H\mathbf{x}_k\]
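
For reference, the F and H matrices implied by these equations have the standard constant-velocity block structure. This is a generic sketch, not code taken from multi_tracker.py:

```python
import numpy as np

def constant_velocity_model(dt=1.0):
    """F and H for the 8-state box model: four corners plus their velocities."""
    F = np.eye(8)
    F[:4, 4:] = dt * np.eye(4)   # each corner coordinate integrates its velocity
    H = np.zeros((4, 8))
    H[:, :4] = np.eye(4)         # only the corner positions are observed
    return F, H
```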

After prediction, detections are matched to predicted tracks using Intersection over Union:

\[IoU(A,B)=\frac{|A\cap B|}{|A\cup B|}\]

The Hungarian algorithm finds the best global matching between detections and active tracks. If the IoU score is below the threshold, that detection is not assigned to the track. Unmatched detections create new tracks, and tracks that are missing for too long are removed. In our current implementation, the tracker uses max_age = 5, min_hits = 1, and iou_threshold = 0.3.
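
A simplified sketch of this matching step is shown below, using SciPy's linear_sum_assignment for the Hungarian step; the actual multi_tracker.py may organize the track bookkeeping differently:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(a, b):
    """IoU of two boxes given as [x1, y1, x2, y2]."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / (union + 1e-9)

def associate(detections, predictions, iou_threshold=0.3):
    """Match detections to predicted track boxes with the Hungarian algorithm."""
    if len(detections) == 0 or len(predictions) == 0:
        return [], list(range(len(detections))), list(range(len(predictions)))
    # Negative IoU as cost, since linear_sum_assignment minimizes total cost.
    cost = np.array([[-iou(d, p) for p in predictions] for d in detections])
    rows, cols = linear_sum_assignment(cost)
    matches = [(r, c) for r, c in zip(rows, cols) if -cost[r, c] >= iou_threshold]
    matched_d = {r for r, _ in matches}
    matched_t = {c for _, c in matches}
    unmatched_dets = [i for i in range(len(detections)) if i not in matched_d]
    unmatched_trks = [j for j in range(len(predictions)) if j not in matched_t]
    return matches, unmatched_dets, unmatched_trks
```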

This does not solve long-term person re-identification, but it helps with the real problem observed in our project: short detection flicker and another person briefly entering the camera view.

5.2 Target selection and camera-LiDAR fusion

target_tracker.py first filters YOLO detections to the person class. On initial acquisition, it selects the largest visible confirmed track, which usually corresponds to the closest person. After that, it tries to stay locked on the same track ID. If that ID disappears, the node falls back to the largest remaining confirmed track.

The center of the selected bounding box is converted into a camera bearing:

\[\theta_{cam}=-\tan^{-1}\left(\frac{u-c_x}{f_x}\right)\]

where u is the bounding-box center x-coordinate, c_x is the x-coordinate of the camera principal point, and f_x is the focal length in pixels. The camera parameters currently used in the code are:

| Parameter | Value |
|---|---|
| fx | 196.32 |
| fy | 196.32 |
| cx | 127.18 |
| cy | 126.63 |

Because the camera and LiDAR are not located at exactly the same point, the bearing is adjusted using the camera-to-LiDAR offset:

\[\theta_{lidar}=\tan^{-1}\left(\frac{\sin(\theta_{cam})-d_y}{\cos(\theta_{cam})-d_x}\right)\]

with dx = 0.0635 m and dy = 0.0381 m. The LiDAR range is then searched near that bearing. Invalid values are ignored, a nearby window of scan values is checked, and the closest valid range is smoothed using an alpha-beta distance filter. The target position in the robot frame is:

\[\mathbf{p}_{target}=\begin{bmatrix}d\cos\theta \\ d\sin\theta\end{bmatrix}\]

A median position filter is also used with window_size = 7 and min_confidence = 3 to reduce unstable target measurements before the controller receives them.
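
A condensed sketch of the bearing computation and LiDAR range lookup is given below. The constants come from the values reported above; the scan window size and the minimum-range validity cutoff are illustrative assumptions, and the median and alpha-beta filters are left out for brevity:

```python
import math

FX, CX = 196.32, 127.18   # camera intrinsics from the table above
DX, DY = 0.0635, 0.0381   # camera-to-LiDAR offset [m]

def target_bearing(u):
    """Bearing of the bounding-box center column u, offset-corrected."""
    theta_cam = -math.atan2(u - CX, FX)
    return math.atan2(math.sin(theta_cam) - DY, math.cos(theta_cam) - DX)

def range_at_bearing(scan, theta, window=5):
    """Closest valid LiDAR return in a small window around bearing theta.
    scan is a sensor_msgs/LaserScan message; window and the 0.05 m
    validity cutoff are assumed values."""
    idx = int(round((theta - scan.angle_min) / scan.angle_increment))
    lo, hi = max(0, idx - window), min(len(scan.ranges), idx + window + 1)
    valid = [r for r in scan.ranges[lo:hi] if math.isfinite(r) and r > 0.05]
    return min(valid) if valid else None
```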

5.3 Follow controller

The controller compares target distance d with the desired follow distance d* = 1.0 m:

\[e_d=d-d^*\]

The heading error is the measured bearing:

\[e_\alpha=\alpha\]

The proportional control law is:

\[v_{raw}=k_d e_d\] \[\omega_{raw}=k_\alpha e_\alpha\]

The current tuned gains are kp_linear = 0.50 and kp_angular = 0.50. The maximum forward speed was increased to 0.30 m/s, while maximum angular speed remains 0.60 rad/s. To avoid small jitter, the controller uses a deadband:

\[\omega_{raw}=0 \quad \text{if} \quad |e_\alpha|<\alpha_{deadband}\]

Forward speed is scaled down when the person is off-center:

\[v=v_{raw}\max(0.2,\cos(e_\alpha))\]

Angular velocity is smoothed using the previous command:

\[\omega_k=\lambda\omega_{raw}+(1-\lambda)\omega_{k-1}\]

When the person is not available, the robot now rotates toward the last seen bearing:

\[\omega_{search}=\operatorname{sign}(\alpha_{last})\cdot 0.6\ \text{rad/s}\]

This is one of the most important changes from Milestone 2. Instead of waiting or performing an unguided search, the robot searches toward the side where the target disappeared.
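
Collecting the pieces above, one control cycle reduces to roughly the following function. The gains, speed limits, desired distance, and search rate are the reported values; the deadband angle and smoothing weight are placeholder assumptions, since the report does not state them:

```python
import math

KP_LIN, KP_ANG = 0.50, 0.50          # reported proportional gains
D_STAR = 1.0                          # desired follow distance [m]
V_MAX, W_MAX = 0.30, 0.60             # reported speed limits
ALPHA_DEADBAND = math.radians(3.0)    # placeholder deadband, not reported
LAMBDA = 0.6                          # placeholder smoothing weight

def follow_step(d, alpha, w_prev, target_visible, alpha_last):
    """One control cycle: returns the (v, w) command."""
    if not target_visible:
        # Rotate toward the side where the target was last seen.
        return 0.0, math.copysign(0.6, alpha_last)
    v_raw = KP_LIN * (d - D_STAR)
    w_raw = 0.0 if abs(alpha) < ALPHA_DEADBAND else KP_ANG * alpha
    # Slow down when the person is off-center, then clamp and smooth.
    v = max(-V_MAX, min(V_MAX, v_raw)) * max(0.2, math.cos(alpha))
    w = LAMBDA * w_raw + (1.0 - LAMBDA) * w_prev
    return v, max(-W_MAX, min(W_MAX, w))
```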

5.4 Safety supervisor

The safety supervisor remains the final authority before motion reaches the base. It stops the robot if:

  • camera data becomes stale
  • LiDAR data becomes stale
  • controller commands become stale
  • the target has been lost for too long
  • a front obstacle is inside the safety distance
  • the controller command exceeds velocity limits

The updated safety parameters are:

| Parameter | Current value | Purpose |
|---|---|---|
| min_obstacle_distance | 0.50 m | Stop if an obstacle is inside the front safety zone |
| target_loss_timeout | 10.50 s | Allows search/recovery before a full safety stop |
| camera_timeout | 3.0 s | Stop if camera data stops arriving |
| lidar_timeout | 3.0 s | Stop if LiDAR data stops arriving |
| controller_timeout | 3.0 s | Stop if follow commands stop arriving |
| max_linear_speed | 0.30 m/s | Reject unsafe forward commands |
| max_angular_speed | 1.20 rad/s | Reject unsafe rotation commands |
| scan_angle_min_deg | -25° | Front obstacle sector start |
| scan_angle_max_deg | 25° | Front obstacle sector end |
| publish_rate | 20 Hz | Safety loop rate |

The longer target-loss timeout is intentional. The follow controller enters search mode quickly, but the safety supervisor allows more time for the robot to rotate and reacquire the person before forcing a full stop.
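
Per cycle, the supervisor's decision reduces to a freshness check plus a front-clearance check. The sketch below uses the timeout values from the table with an illustrative interface, not the real safety_supervisor.py API:

```python
import time

# Timeout values from the parameter table above.
TIMEOUTS = {"camera": 3.0, "lidar": 3.0, "controller": 3.0, "target": 10.50}

def motion_allowed(last_stamp, front_range=None,
                   min_obstacle_distance=0.50, now=None):
    """Return (allowed, reason). last_stamp maps each stream name to the
    monotonic time of its most recent message."""
    now = time.monotonic() if now is None else now
    for stream, timeout in TIMEOUTS.items():
        if now - last_stamp.get(stream, float("-inf")) > timeout:
            return False, f"{stream} data stale"
    if front_range is not None and front_range < min_obstacle_distance:
        return False, "obstacle inside front safety zone"
    return True, "ok"
```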


6. Demonstration Video

Final TurtleBot Following Demo

The final video will be added here after the hardware run. The video should show the TurtleBot following a person, maintaining distance, correcting heading, and recovering when the target briefly leaves the field of view.

Use this Markdown block after adding the GIF and MP4 files to docs/Videos/:

[![Milestone 3 TurtleBot following demo](/Autonomous-Human-Following-Mobile-Robot/docs/Videos/M3_Following.gif)](/Autonomous-Human-Following-Mobile-Robot/docs/Videos/M3_Following.mp4)

**Video file:** [`M3_Following.mp4`](/Autonomous-Human-Following-Mobile-Robot/docs/Videos/M3_Following.mp4)

Recommended video points to capture:

  • person detected with bounding box and target ID
  • TurtleBot following without the strong curved drift seen in Milestone 2
  • another person briefly entering the camera view
  • robot maintaining or recovering the selected target
  • short target loss followed by last-seen-direction search
  • safety stop or safety status output if an obstacle/timeout test is shown

7. Benchmarking and Results

7.1 Baseline from Milestone 2

Milestone 2 proved that the robot could detect and move toward a person on hardware, but the major runtime issues were:

  • sideways drift during forward approach
  • poor camera centering during motion
  • unstable behavior when YOLO flickered
  • generic search behavior after target loss
  • no persistent ID when multiple people appeared

7.2 Quantitative sensor data already collected

The stationary target test from Milestone 2 is still useful because it explains why filtering is needed. A person stood approximately 2.8 m from the robot while /see/target_distance and /see/target_heading were recorded.

| Quantity | Samples | Mean | Std. dev. | Min | Max |
|---|---|---|---|---|---|
| Target distance | 18 | 2.81 m | 0.22 m | 2.24 m | 3.01 m |
| Target heading | 18 | 23.91° | 0.77° | 21.97° | 24.88° |

The heading estimate was stable, while distance had higher noise. This supports the final design choice of using median filtering and alpha-beta smoothing before sending target distance to the controller.
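
For completeness, an alpha-beta smoother of the kind applied to the distance estimate can be written in a few lines; the gains and sample time here are assumed values for illustration:

```python
class AlphaBetaFilter:
    """Alpha-beta smoother for a noisy scalar such as target distance."""

    def __init__(self, alpha=0.4, beta=0.1, dt=0.05):  # gains and dt assumed
        self.alpha, self.beta, self.dt = alpha, beta, dt
        self.x = None   # filtered distance estimate
        self.v = 0.0    # filtered rate of change

    def update(self, z):
        if self.x is None:          # initialize on the first measurement
            self.x = z
            return z
        pred = self.x + self.v * self.dt    # predict forward one step
        residual = z - pred
        self.x = pred + self.alpha * residual        # correct position
        self.v += (self.beta / self.dt) * residual   # correct velocity
        return self.x
```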

7.3 Milestone 3 final trial table

The final hardware trials were used to validate the improvements added after Milestone 2. All ten final tests were completed successfully. The tests focused on stable following, smoother turning, target reacquisition after temporary target loss, last-seen-bearing search, multi-person tracking, and safety behavior. The TurtleBot stopped at approximately 0.5 m during the final hardware run and continued to follow the originally selected person even when another person came between the robot and the target.

| Trial | Scenario | Success criterion | Final result |
|---|---|---|---|
| 1 | Straight approach | Robot approaches the selected person and stops at the tuned safe following distance | Success - The TurtleBot approached the selected person and stopped at approximately 0.5 m. |
| 2 | Continuous following | Robot follows without the strong curved drift seen in Milestone 2 | Success - The robot followed the person smoothly and the curved following behavior from Milestone 2 was reduced. |
| 3 | Person moves left | Robot recenters the target smoothly when the person moves left | Success - The robot turned smoothly and kept the selected person in view. |
| 4 | Person moves right | Robot recenters the target smoothly when the person moves right | Success - The robot corrected its heading smoothly and continued following. |
| 5 | Short target loss | Robot searches toward the last known target bearing instead of going idle | Success - When the target briefly left the camera view, the robot searched in the direction of the last seen bearing. |
| 6 | Target reacquisition | Robot resumes following after the person returns to the camera view | Success - After searching, the robot reacquired the person and resumed following. |
| 7 | Second person enters frame | Robot keeps following the original selected person even if another person enters between the robot and the target | Success - The tracker continued following the first selected person even when another person came in between. |
| 8 | Obstacle in front | Safety supervisor stops the robot when an obstacle enters the front safety zone | Success - The safety supervisor stopped robot motion when an obstacle/person entered the front safety region. |
| 9 | Camera/LiDAR interruption | Safety supervisor reports timeout and stops robot motion when required sensor data becomes stale | Success - The robot stopped safely when the required perception/sensor stream was interrupted. |
| 10 | Controller interruption | Robot stops when the command stream from the follow controller becomes stale | Success - The robot stopped when the controller command stream was interrupted or became stale. |

Final trial success rate:

Success Rate = (Number of successful trials / Total trials) × 100
Success Rate = (10 / 10) × 100 = 100%

The final tests show a clear improvement over Milestone 2. The robot now turns more smoothly, searches using the last known target bearing, reacquires the target after a short loss, and maintains the first selected target even when another person enters between the robot and the followed person.

7.4 Final result summary

In the final hardware test, the robot successfully completed 10 out of 10 validation trials, giving a final success rate of 100%. The strongest improvement compared with Milestone 2 was the stability of the following behavior. In Milestone 2, the robot sometimes moved in a curved path while following and could lose the person when the target moved out of the camera frame. In Milestone 3, the tuned controller produced smoother turns, the search behavior used the last seen target bearing, and the robot resumed following after reacquiring the person.

The multi-person tracking update also improved robustness. When another person entered between the TurtleBot and the originally selected target, the robot continued following the first selected person instead of immediately switching to the new person. This behavior is supported by the Kalman-filter-based track prediction, IoU matching, Hungarian assignment, and selected target ID locking added after Milestone 2.


8. Feedback Integration

| Feedback / question | Milestone 2 status | Milestone 3 update |
|---|---|---|
| What happens if someone walks between the robot and the followed person? | Not fully solved; documented as a limitation | Added multi-person tracking with Kalman prediction, IoU matching, and Hungarian assignment. This improves target consistency during brief interruptions, but long occlusion is still a limitation. |
| Are you estimating the future trajectory of the person? | No explicit future prediction | The Kalman tracker predicts bounding-box motion using a constant-velocity model. This is short-term target prediction in image space, not full human trajectory forecasting. |
| How do you keep the person unique compared to other people? | YOLO detected people but did not maintain persistent identity | The system now locks onto a selected track ID and tries to keep following that ID. |
| Why did the robot curve or drift while following? | Controller tuning and target centering were unstable | Follow gains and speed limits were updated, and the controller continues correcting heading while reducing forward speed when the target is off-center. |
| What happens when the person leaves the camera view? | Robot entered search behavior, but search direction was limited | The robot now searches toward the last seen bearing of the target. |
| Is safety handled separately from control? | Yes | Safety timing was updated, and /safety/motion_allowed plus /safety/status_text provide clearer runtime status. |

9. Ethical Impact Statement

A human-following robot creates ethical concerns because it senses people, moves near people, and may operate in shared spaces. From a privacy perspective, the OAK-D camera captures identifiable visual information. In this project, images are used for local YOLO-based detection and debugging visualization. Future versions should avoid storing raw video unless necessary, and any shared demo video should blur bystanders who did not consent. A utilitarian view supports this project when it helps with education, assistive transport, or robotics research, but those benefits do not justify unnecessary personal data collection.

From a safety perspective, the TurtleBot 4 is small, but it can still collide with people, bags, glass objects, or other robots. Our design reduces risk by using velocity limits, front LiDAR obstacle checks, stale-data timeouts, command validation, and a safety supervisor that can override the controller. The updated target-loss timeout also prevents the robot from immediately giving up during short loss, while still stopping if the target is lost too long. Future versions should add clear intention signals such as lights, sounds, or screen indicators so nearby people know whether the robot is following, searching, stopped, or yielding.

From a justice and bias perspective, the system depends on camera visibility, YOLO confidence, and LiDAR returns. Performance may change with lighting, clothing, body size, crowded scenes, reflective surfaces, glass, or people who are underrepresented in detector training data. A deployable version would need testing across different people, environments, lighting conditions, and mobility patterns. Overall, the project is acceptable as a supervised lab prototype, but real deployment would require stronger privacy handling, broader bias testing, better human-robot communication, and more formal safety validation.


10. Custom Code Links

Before final submission, verify that GitHub line anchors still point to the intended code after the final commit. If the files render as one long line, reformat the Python files with normal line breaks and commit them before using line-specific links.

| Module | Custom logic | Code link |
|---|---|---|
| multi_tracker.py | IoU matching function | multi_tracker.py |
| multi_tracker.py | KalmanBoxTracker constant-velocity bounding-box prediction | multi_tracker.py |
| multi_tracker.py | Hungarian assignment and track lifecycle | multi_tracker.py |
| target_tracker.py | YOLO person detection and selected target publishing | target_tracker.py |
| target_tracker.py | Camera bearing and LiDAR range fusion | target_tracker.py |
| target_tracker.py | Target ID lock, missing-frame handling, fallback selection | target_tracker.py |
| follow_controller.py | Range/heading controller and last-seen-bearing search | follow_controller.py |
| safety_supervisor.py | Sensor timeout, obstacle, target-loss, and command safety checks | safety_supervisor.py |
| person_follower.py | Stamped-to-unstamped velocity bridge with command timeout | person_follower.py |

11. Individual Contribution and Audit Appendix

| Team member | Primary technical role | Key commits / PRs | Specific file authorship / ownership |
|---|---|---|---|
| Atharv Kulkarni | Multi-person tracking, perception updates, safety parameter tuning, hardware testing/documentation | 35dc1bc (updated target_tracker to incorporate multi-person tracking); 385dd40 (fine-tuned safety parameters); c8198ef (launch file updates) | target_tracker.py, multi_tracker.py, safety tuning, final testing notes |
| Harsh Padmalwar | Target tracker updates, report/code integration, repository documentation | 944a692 (update target_tracker.py); 89207e4 (merge camera intrinsics fix); report/documentation commits | target_tracker.py, report integration, repository documentation |

Each team member should be able to explain the custom modules listed above during the peer review audit. The most important custom logic for audit is the multi-person tracker, the camera-LiDAR target-state estimation, the last-seen-bearing search behavior, and the safety supervisor stop/recovery logic.


12. Conclusion

Milestone 3 moves the project from basic hardware integration toward robust human following. Milestone 2 proved that the TurtleBot could detect and move toward a person, but it also revealed real hardware issues: curved drifting, target loss, search limitations, and lack of persistent target identity. The Milestone 3 updates directly address those issues by adding Kalman-based multi-person tracking, target ID locking, last-seen-bearing search, updated controller tuning, and longer safety recovery timing.

The system is still a supervised lab prototype, not a fully deployable product. It can handle short target flicker and some multi-person cases better than before, but long occlusion, crowded environments, transparent obstacles, and lighting changes remain challenging. Overall, the final system demonstrates a complete ROS 2 person-following pipeline with custom perception, tracking, control, safety, and actuation modules running on real TurtleBot hardware.

