Course: RAS 598 — Mobile Robotics
Project: Autonomous Human-Following Mobile Robot
Platform: TurtleBot 4 with iRobot Create 3 base
Sensors: OAK-D RGB camera, 2D LiDAR
Software: ROS 2 Jazzy, Python
Team: Harsh Padmalwar, Atharv Kulkarni


Autonomous Human-Following Mobile Robot — Milestone 3

1. Milestone 3 Context

This Milestone 3 report builds directly on our Milestone 2 implementation. In Milestone 2, our TurtleBot 4 could detect a person using YOLO, estimate the target distance and heading, publish follow commands, pass those commands through a safety layer, and drive the physical robot. The main weakness we identified was not the basic ROS 2 pipeline but the quality and robustness of the following behavior itself. During hardware testing, the robot sometimes drifted while moving forward, the target could slip out of the camera frame, and the controller could fall into unguided search whenever the person was temporarily lost.

For Milestone 3, we focused on the specific issues that were listed as future work in Milestone 2:

  • improve heading stability during forward motion
  • keep the person closer to the center of the camera view
  • search in the direction where the person was last seen
  • reacquire the target after short target loss
  • reduce unstable behavior caused by brief YOLO detection flicker
  • improve the response when multiple people appear in the camera view

The final system is still organized as a topic-driven ROS 2 stack, but we added more robust target tracking, tuned the follow controller, and expanded safety timing so the robot can search and recover instead of immediately stopping during short target-loss events.


2. Graphical Abstract

```mermaid
flowchart LR
    A[OAK-D RGB Camera] --> B[YOLO Person Detection]
    B --> C[Bounding Boxes]
    C --> D[Multi-Person Tracker<br/>Kalman + IoU + Hungarian]
    E[2D LiDAR /scan] --> F[Range at Target Bearing]
    D --> G[Locked Target ID]
    F --> H[Distance + Heading Estimate]
    G --> H
    H --> I[Follow Controller<br/>Range + Bearing Control]
    I --> J[Safety Supervisor<br/>Timeouts + Obstacle Check]
    J --> K[Stamped to Unstamped Twist Bridge]
    K --> L[TurtleBot 4 Base]
```
Mission summary: The robot uses the OAK-D camera to detect people, a custom tracker to keep a stable selected target, LiDAR to estimate range at the target bearing, and a follow controller to maintain a safe following distance. A safety supervisor remains between the controller and the base so the robot stops when sensor data, control commands, target visibility, or obstacle clearance become unsafe.


3. What Changed Since Milestone 2

| Area | Milestone 2 state | Milestone 3 update |
|---|---|---|
| Person detection | YOLO detected a person in the camera frame | YOLO detections are now passed into a custom multi-person tracker |
| Target identity | The system mainly followed the current/best visible detection | The system locks onto a target ID and tries to keep following that ID |
| Multiple people | If someone entered the scene, the robot could switch or lose confidence | Kalman + IoU + Hungarian matching improves target consistency |
| Short target loss | Target flicker could cause unstable stop/search behavior | Track miss counters, filters, and a longer safety timeout support recovery |
| Search behavior | Search was not strongly tied to where the target disappeared | The controller rotates toward the last seen target bearing |
| Forward following | The robot could move in a curved/drifting path while approaching | Gains and speed limits were tuned to reduce aggressive curved motion |
| Safety timing | Shorter sensor and target timeouts | Sensor/controller timeouts set to 3.0 s and the target-loss safety timeout set to 10.50 s |

These updates directly address the questions raised during earlier feedback: what happens if another person enters the scene, whether the robot estimates target motion during temporary loss, and whether the robot can recover after the person leaves the camera frame for a short time.


4. Final Runtime Architecture

The active ROS 2 pipeline is:

```
OAK-D camera + LiDAR
        ↓
target_tracker.py + multi_tracker.py
        ↓
follow_controller.py
        ↓
safety_supervisor.py
        ↓
person_follower.py
        ↓
TurtleBot 4 base
```

target_tracker.py subscribes to the OAK-D compressed image stream and the /scan LiDAR topic. YOLO detects people in the image, and the multi-person tracker assigns stable track IDs. The camera is used for person detection and bearing estimation, while LiDAR is used to estimate the distance along that bearing. The output target state is published as /see/target_in_frame, /see/target_distance, /see/target_heading, and /see/person_location.

follow_controller.py consumes the target state and generates a stamped velocity command on /think/follow_cmd_vel. safety_supervisor.py checks that the command, camera, LiDAR, and target state are fresh and safe before publishing to /cmd_vel. person_follower.py converts the safe stamped command into the unstamped Twist command used by the TurtleBot base on /cmd_vel_unstamped.
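
As a concrete illustration of the final pipeline stage, the bridge step can be written as a very small node. The sketch below is a minimal stand-in for person_follower.py under the topic names described above, not the actual file; in particular, the command-timeout check that the real bridge performs is omitted here.

```python
# Minimal sketch of the stamped-to-unstamped bridge (illustrative only;
# the real person_follower.py also enforces a command timeout).
import rclpy
from rclpy.node import Node
from geometry_msgs.msg import Twist, TwistStamped

class TwistBridge(Node):
    def __init__(self):
        super().__init__('twist_bridge')
        self.pub = self.create_publisher(Twist, '/cmd_vel_unstamped', 10)
        self.create_subscription(TwistStamped, '/cmd_vel', self.forward, 10)

    def forward(self, msg: TwistStamped):
        # Drop the header and pass the velocity through to the base.
        self.pub.publish(msg.twist)

def main():
    rclpy.init()
    rclpy.spin(TwistBridge())

if __name__ == '__main__':
    main()
```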


5. Core Algorithm

5.1 Multi-person tracking

The major Milestone 3 custom module is multi_tracker.py. Each tracked person is represented by an 8-state bounding-box model:

\[\mathbf{x} = [x_1,y_1,x_2,y_2,\dot{x}_1,\dot{y}_1,\dot{x}_2,\dot{y}_2]^T\]

YOLO provides a bounding-box measurement:

\[\mathbf{z} = [x_1,y_1,x_2,y_2]^T\]

The tracker uses a constant-velocity Kalman prediction model:

\[\mathbf{x}_{k|k-1}=F\mathbf{x}_{k-1|k-1}\]

and a measurement model that observes only the bounding-box position:

\[\mathbf{z}_k=H\mathbf{x}_k\]
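
For reference, the F and H matrices implied by these equations have the standard constant-velocity block structure. This is a generic sketch, not code taken from multi_tracker.py:

```python
import numpy as np

def constant_velocity_model(dt=1.0):
    """F and H for the 8-state box model: four corners plus their velocities."""
    F = np.eye(8)
    F[:4, 4:] = dt * np.eye(4)   # each corner coordinate integrates its velocity
    H = np.zeros((4, 8))
    H[:, :4] = np.eye(4)         # only the corner positions are observed
    return F, H
```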

After prediction, detections are matched to predicted tracks using Intersection over Union:

\[IoU(A,B)=\frac{|A\cap B|}{|A\cup B|}\]

The Hungarian algorithm finds the best global matching between detections and active tracks. If the IoU score is below the threshold, that detection is not assigned to the track. Unmatched detections create new tracks, and tracks that are missing for too long are removed. In our current implementation, the tracker uses max_age = 5, min_hits = 1, and iou_threshold = 0.3.
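
A simplified sketch of this matching step is shown below, using SciPy's linear_sum_assignment for the Hungarian step; the actual multi_tracker.py may organize the track bookkeeping differently:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(a, b):
    """IoU of two boxes given as [x1, y1, x2, y2]."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / (union + 1e-9)

def associate(detections, predictions, iou_threshold=0.3):
    """Match detections to predicted track boxes with the Hungarian algorithm."""
    if len(detections) == 0 or len(predictions) == 0:
        return [], list(range(len(detections))), list(range(len(predictions)))
    # Negative IoU as cost, since linear_sum_assignment minimizes total cost.
    cost = np.array([[-iou(d, p) for p in predictions] for d in detections])
    rows, cols = linear_sum_assignment(cost)
    matches = [(r, c) for r, c in zip(rows, cols) if -cost[r, c] >= iou_threshold]
    matched_d = {r for r, _ in matches}
    matched_t = {c for _, c in matches}
    unmatched_dets = [i for i in range(len(detections)) if i not in matched_d]
    unmatched_trks = [j for j in range(len(predictions)) if j not in matched_t]
    return matches, unmatched_dets, unmatched_trks
```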

This does not solve long-term person re-identification, but it helps with the real problem observed in our project: short detection flicker and another person briefly entering the camera view.

5.2 Target selection and camera-LiDAR fusion

target_tracker.py first filters YOLO detections to the person class. On initial acquisition, it selects the largest visible confirmed track, which usually corresponds to the closest person. After that, it tries to stay locked on the same track ID. If that ID disappears, the node falls back to the largest remaining confirmed track.

The center of the selected bounding box is converted into a camera bearing:

\[\theta_{cam}=-\tan^{-1}\left(\frac{u-c_x}{f_x}\right)\]

where u is the bounding-box center x-coordinate, c_x is the x-coordinate of the camera principal point, and f_x is the focal length in pixels. The camera parameters currently used in the code are:

| Parameter | Value |
|---|---|
| fx | 196.32 |
| fy | 196.32 |
| cx | 127.18 |
| cy | 126.63 |

Because the camera and LiDAR are not located at exactly the same point, the bearing is adjusted using the camera-to-LiDAR offset:

\[\theta_{lidar}=\tan^{-1}\left(\frac{\sin(\theta_{cam})-d_y}{\cos(\theta_{cam})-d_x}\right)\]

with dx = 0.0635 m and dy = 0.0381 m. The LiDAR range is then searched near that bearing. Invalid values are ignored, a nearby window of scan values is checked, and the closest valid range is smoothed using an alpha-beta distance filter. The target position in the robot frame is:

\[\mathbf{p}_{target}=\begin{bmatrix}d\cos\theta \\ d\sin\theta\end{bmatrix}\]

A median position filter is also used with window_size = 7 and min_confidence = 3 to reduce unstable target measurements before the controller receives them.
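
A condensed sketch of the bearing computation and LiDAR range lookup is given below. The constants come from the values reported above; the scan window size and the minimum-range validity cutoff are illustrative assumptions, and the median and alpha-beta filters are left out for brevity:

```python
import math

FX, CX = 196.32, 127.18   # camera intrinsics from the table above
DX, DY = 0.0635, 0.0381   # camera-to-LiDAR offset [m]

def target_bearing(u):
    """Bearing of the bounding-box center column u, offset-corrected."""
    theta_cam = -math.atan2(u - CX, FX)
    return math.atan2(math.sin(theta_cam) - DY, math.cos(theta_cam) - DX)

def range_at_bearing(scan, theta, window=5):
    """Closest valid LiDAR return in a small window around bearing theta.
    scan is a sensor_msgs/LaserScan message; window and the 0.05 m
    validity cutoff are assumed values."""
    idx = int(round((theta - scan.angle_min) / scan.angle_increment))
    lo, hi = max(0, idx - window), min(len(scan.ranges), idx + window + 1)
    valid = [r for r in scan.ranges[lo:hi] if math.isfinite(r) and r > 0.05]
    return min(valid) if valid else None
```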

5.3 Follow controller

The controller compares target distance d with the desired follow distance d* = 1.0 m:

\[e_d=d-d^*\]

The heading error is the measured bearing:

\[e_\alpha=\alpha\]

The proportional control law is:

\[v_{raw}=k_d e_d\] \[\omega_{raw}=k_\alpha e_\alpha\]

The current tuned gains are kp_linear = 0.50 and kp_angular = 0.50. The maximum forward speed was increased to 0.30 m/s, while maximum angular speed remains 0.60 rad/s. To avoid small jitter, the controller uses a deadband:

\[\omega_{raw}=0 \quad \text{if} \quad |e_\alpha|<\alpha_{deadband}\]

Forward speed is scaled down when the person is off-center:

\[v=v_{raw}\max(0.2,\cos(e_\alpha))\]

Angular velocity is smoothed using the previous command:

\[\omega_k=\lambda\omega_{raw}+(1-\lambda)\omega_{k-1}\]

When the person is not available, the robot now rotates toward the last seen bearing:

\[\omega_{search}=\operatorname{sign}(\alpha_{last})\cdot 0.6\ \text{rad/s}\]

This is one of the most important changes from Milestone 2. Instead of waiting or performing an unguided search, the robot searches toward the side where the target disappeared.
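
Collecting the pieces above, one control cycle reduces to roughly the following function. The gains, speed limits, desired distance, and search rate are the reported values; the deadband angle and smoothing weight are placeholder assumptions, since the report does not state them:

```python
import math

KP_LIN, KP_ANG = 0.50, 0.50          # reported proportional gains
D_STAR = 1.0                          # desired follow distance [m]
V_MAX, W_MAX = 0.30, 0.60             # reported speed limits
ALPHA_DEADBAND = math.radians(3.0)    # placeholder deadband, not reported
LAMBDA = 0.6                          # placeholder smoothing weight

def follow_step(d, alpha, w_prev, target_visible, alpha_last):
    """One control cycle: returns the (v, w) command."""
    if not target_visible:
        # Rotate toward the side where the target was last seen.
        return 0.0, math.copysign(0.6, alpha_last)
    v_raw = KP_LIN * (d - D_STAR)
    w_raw = 0.0 if abs(alpha) < ALPHA_DEADBAND else KP_ANG * alpha
    # Slow down when the person is off-center, then clamp and smooth.
    v = max(-V_MAX, min(V_MAX, v_raw)) * max(0.2, math.cos(alpha))
    w = LAMBDA * w_raw + (1.0 - LAMBDA) * w_prev
    return v, max(-W_MAX, min(W_MAX, w))
```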

5.4 Safety supervisor

The safety supervisor remains the final authority before motion reaches the base. It stops the robot if:

  • camera data becomes stale
  • LiDAR data becomes stale
  • controller commands become stale
  • the target has been lost for too long
  • a front obstacle is inside the safety distance
  • the controller command exceeds velocity limits

The updated safety parameters are:

| Parameter | Current value | Purpose |
|---|---|---|
| min_obstacle_distance | 0.50 m | Stop if an obstacle is inside the front safety zone |
| target_loss_timeout | 10.50 s | Allows search/recovery before a full safety stop |
| camera_timeout | 3.0 s | Stop if camera data stops arriving |
| lidar_timeout | 3.0 s | Stop if LiDAR data stops arriving |
| controller_timeout | 3.0 s | Stop if follow commands stop arriving |
| max_linear_speed | 0.30 m/s | Reject unsafe forward commands |
| max_angular_speed | 1.20 rad/s | Reject unsafe rotation commands |
| scan_angle_min_deg | -25° | Front obstacle sector start |
| scan_angle_max_deg | 25° | Front obstacle sector end |
| publish_rate | 20 Hz | Safety loop rate |

The longer target-loss timeout is intentional. The follow controller enters search mode quickly, but the safety supervisor allows more time for the robot to rotate and reacquire the person before forcing a full stop.
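
Per cycle, the supervisor's decision reduces to a freshness check plus a front-clearance check. The sketch below uses the timeout values from the table with an illustrative interface, not the real safety_supervisor.py API:

```python
import time

# Timeout values from the parameter table above.
TIMEOUTS = {"camera": 3.0, "lidar": 3.0, "controller": 3.0, "target": 10.50}

def motion_allowed(last_stamp, front_range=None,
                   min_obstacle_distance=0.50, now=None):
    """Return (allowed, reason). last_stamp maps each stream name to the
    monotonic time of its most recent message."""
    now = time.monotonic() if now is None else now
    for stream, timeout in TIMEOUTS.items():
        if now - last_stamp.get(stream, float("-inf")) > timeout:
            return False, f"{stream} data stale"
    if front_range is not None and front_range < min_obstacle_distance:
        return False, "obstacle inside front safety zone"
    return True, "ok"
```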


6. Demonstration Video

Final TurtleBot Following Demo

The final video will be added here after the hardware run. The video should show the TurtleBot following a person, maintaining distance, correcting heading, and recovering when the target briefly leaves the field of view.

Use this Markdown block after adding the GIF and MP4 files to docs/Videos/:

[![Milestone 3 TurtleBot following demo](/Autonomous-Human-Following-Mobile-Robot/docs/Videos/M3_Following.gif)](/Autonomous-Human-Following-Mobile-Robot/docs/Videos/M3_Following.mp4)

**Video file:** [`M3_Following.mp4`](/Autonomous-Human-Following-Mobile-Robot/docs/Videos/M3_Following.mp4)

Recommended video points to capture:

  • person detected with bounding box and target ID
  • TurtleBot following without the strong curved drift seen in Milestone 2
  • another person briefly entering the camera view
  • robot maintaining or recovering the selected target
  • short target loss followed by last-seen-direction search
  • safety stop or safety status output if an obstacle/timeout test is shown

7. Benchmarking and Results

7.1 Baseline from Milestone 2

Milestone 2 proved that the robot could detect and move toward a person on hardware, but the major runtime issues were:

  • sideways drift during forward approach
  • poor camera centering during motion
  • unstable behavior when YOLO flickered
  • generic search behavior after target loss
  • no persistent ID when multiple people appeared

7.2 Quantitative sensor data already collected

The stationary target test from Milestone 2 is still useful because it explains why filtering is needed. A person stood approximately 2.8 m from the robot while /see/target_distance and /see/target_heading were recorded.

| Quantity | Samples | Mean | Std. dev. | Min | Max |
|---|---|---|---|---|---|
| Target distance | 18 | 2.81 m | 0.22 m | 2.24 m | 3.01 m |
| Target heading | 18 | 23.91° | 0.77° | 21.97° | 24.88° |

The heading estimate was stable, while distance had higher noise. This supports the final design choice of using median filtering and alpha-beta smoothing before sending target distance to the controller.
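
For completeness, an alpha-beta smoother of the kind applied to the distance estimate can be written in a few lines; the gains and sample time here are assumed values for illustration:

```python
class AlphaBetaFilter:
    """Alpha-beta smoother for a noisy scalar such as target distance."""

    def __init__(self, alpha=0.4, beta=0.1, dt=0.05):  # gains and dt assumed
        self.alpha, self.beta, self.dt = alpha, beta, dt
        self.x = None   # filtered distance estimate
        self.v = 0.0    # filtered rate of change

    def update(self, z):
        if self.x is None:          # initialize on the first measurement
            self.x = z
            return z
        pred = self.x + self.v * self.dt    # predict forward one step
        residual = z - pred
        self.x = pred + self.alpha * residual        # correct position
        self.v += (self.beta / self.dt) * residual   # correct velocity
        return self.x
```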

7.3 Milestone 3 final trial table

The final hardware trials were used to validate the improvements added after Milestone 2. All ten final tests were completed successfully. The tests focused on stable following, smoother turning, target reacquisition after temporary target loss, last-seen-bearing search, multi-person tracking, and safety behavior. The TurtleBot stopped at approximately 0.5 m during the final hardware run and continued to follow the originally selected person even when another person came between the robot and the target.

| Trial | Scenario | Success criterion | Final result |
|---|---|---|---|
| 1 | Straight approach | Robot approaches the selected person and stops at the tuned safe following distance | Success - The TurtleBot approached the selected person and stopped at approximately 0.5 m. |
| 2 | Continuous following | Robot follows without the strong curved drift seen in Milestone 2 | Success - The robot followed the person smoothly and the curved following behavior from Milestone 2 was reduced. |
| 3 | Person moves left | Robot recenters the target smoothly when the person moves left | Success - The robot turned smoothly and kept the selected person in view. |
| 4 | Person moves right | Robot recenters the target smoothly when the person moves right | Success - The robot corrected its heading smoothly and continued following. |
| 5 | Short target loss | Robot searches toward the last known target bearing instead of going idle | Success - When the target briefly left the camera view, the robot searched in the direction of the last seen bearing. |
| 6 | Target reacquisition | Robot resumes following after the person returns to the camera view | Success - After searching, the robot reacquired the person and resumed following. |
| 7 | Second person enters frame | Robot keeps following the original selected person even if another person enters between the robot and the target | Success - The tracker continued following the first selected person even when another person came in between. |
| 8 | Obstacle in front | Safety supervisor stops the robot when an obstacle enters the front safety zone | Success - The safety supervisor stopped robot motion when an obstacle/person entered the front safety region. |
| 9 | Camera/LiDAR interruption | Safety supervisor reports timeout and stops robot motion when required sensor data becomes stale | Success - The robot stopped safely when the required perception/sensor stream was interrupted. |
| 10 | Controller interruption | Robot stops when the command stream from the follow controller becomes stale | Success - The robot stopped when the controller command stream was interrupted or became stale. |

Final trial success rate:

Success Rate = (Number of successful trials / Total trials) × 100
Success Rate = (10 / 10) × 100 = 100%

The final tests show a clear improvement over Milestone 2. The robot now turns more smoothly, searches using the last known target bearing, reacquires the target after a short loss, and maintains the first selected target even when another person enters between the robot and the followed person.

7.4 Final result summary

In the final hardware test, the robot successfully completed 10 out of 10 validation trials, giving a final success rate of 100%. The strongest improvement compared with Milestone 2 was the stability of the following behavior. In Milestone 2, the robot sometimes moved in a curved path while following and could lose the person when the target moved out of the camera frame. In Milestone 3, the tuned controller produced smoother turns, the search behavior used the last seen target bearing, and the robot resumed following after reacquiring the person.

The multi-person tracking update also improved robustness. When another person entered between the TurtleBot and the originally selected target, the robot continued following the first selected person instead of immediately switching to the new person. This behavior is supported by the Kalman-filter-based track prediction, IoU matching, Hungarian assignment, and selected target ID locking added after Milestone 2.


8. Feedback Integration

| Feedback / question | Milestone 2 status | Milestone 3 update |
|---|---|---|
| What happens if someone walks between the robot and the followed person? | Not fully solved; documented as a limitation | Added multi-person tracking with Kalman prediction, IoU matching, and Hungarian assignment. This improves target consistency during brief interruptions, but long occlusion is still a limitation. |
| Are you estimating the future trajectory of the person? | No explicit future prediction | The Kalman tracker predicts bounding-box motion using a constant-velocity model. This is short-term target prediction in image space, not full human trajectory forecasting. |
| How do you keep the person unique compared to other people? | YOLO detected people but did not maintain persistent identity | The system now locks onto a selected track ID and tries to keep following that ID. |
| Why did the robot curve or drift while following? | Controller tuning and target centering were unstable | Follow gains and speed limits were updated, and the controller continues correcting heading while reducing forward speed when the target is off-center. |
| What happens when the person leaves the camera view? | Robot entered search behavior, but search direction was limited | The robot now searches toward the last seen bearing of the target. |
| Is safety handled separately from control? | Yes | Safety timing was updated, and /safety/motion_allowed plus /safety/status_text provide clearer runtime status. |

9. Ethical Impact Statement

A human-following robot creates ethical concerns because it senses people, moves near people, and may operate in shared spaces. From a privacy perspective, the OAK-D camera captures identifiable visual information. In this project, images are used for local YOLO-based detection and debugging visualization. Future versions should avoid storing raw video unless necessary, and any shared demo video should blur bystanders who did not consent. A utilitarian view supports this project when it helps with education, assistive transport, or robotics research, but those benefits do not justify unnecessary personal data collection.

From a safety perspective, the TurtleBot 4 is small, but it can still collide with people, bags, glass objects, or other robots. Our design reduces risk by using velocity limits, front LiDAR obstacle checks, stale-data timeouts, command validation, and a safety supervisor that can override the controller. The updated target-loss timeout also prevents the robot from immediately giving up during short loss, while still stopping if the target is lost too long. Future versions should add clear intention signals such as lights, sounds, or screen indicators so nearby people know whether the robot is following, searching, stopped, or yielding.

From a justice and bias perspective, the system depends on camera visibility, YOLO confidence, and LiDAR returns. Performance may change with lighting, clothing, body size, crowded scenes, reflective surfaces, glass, or people who are underrepresented in detector training data. A deployable version would need testing across different people, environments, lighting conditions, and mobility patterns. Overall, the project is acceptable as a supervised lab prototype, but real deployment would require stronger privacy handling, broader bias testing, better human-robot communication, and more formal safety validation.


10. Custom Code Links

Before final submission, verify that GitHub line anchors still point to the intended code after the final commit. If the files render as one long line, reformat the Python files with normal line breaks and commit them before using line-specific links.

| Module | Custom logic | Code link |
|---|---|---|
| multi_tracker.py | IoU matching function | multi_tracker.py |
| multi_tracker.py | KalmanBoxTracker constant-velocity bounding-box prediction | multi_tracker.py |
| multi_tracker.py | Hungarian assignment and track lifecycle | multi_tracker.py |
| target_tracker.py | YOLO person detection and selected target publishing | target_tracker.py |
| target_tracker.py | Camera bearing and LiDAR range fusion | target_tracker.py |
| target_tracker.py | Target ID lock, missing-frame handling, fallback selection | target_tracker.py |
| follow_controller.py | Range/heading controller and last-seen-bearing search | follow_controller.py |
| safety_supervisor.py | Sensor timeout, obstacle, target-loss, and command safety checks | safety_supervisor.py |
| person_follower.py | Stamped-to-unstamped velocity bridge with command timeout | person_follower.py |

11. Individual Contribution and Audit Appendix

| Team member | Primary technical role | Key commits / PRs | Specific file authorship / ownership |
|---|---|---|---|
| Atharv Kulkarni | Multi-person tracking, perception updates, safety parameter tuning, hardware testing/documentation | 35dc1bc (updated target_tracker to incorporate multi-person tracking); 385dd40 (fine-tuned safety parameters); c8198ef (launch file updates) | target_tracker.py, multi_tracker.py, safety tuning, final testing notes |
| Harsh Padmalwar | Target tracker updates, report/code integration, repository documentation | 944a692 (update target_tracker.py); 89207e4 (merge camera intrinsics fix); report/documentation commits | target_tracker.py, report integration, repository documentation |

Each team member should be able to explain the custom modules listed above during the peer review audit. The most important custom logic for audit is the multi-person tracker, the camera-LiDAR target-state estimation, the last-seen-bearing search behavior, and the safety supervisor stop/recovery logic.


12. Conclusion

Milestone 3 moves the project from basic hardware integration toward robust human following. Milestone 2 proved that the TurtleBot could detect and move toward a person, but it also revealed real hardware issues: curved drifting, target loss, search limitations, and lack of persistent target identity. The Milestone 3 updates directly address those issues by adding Kalman-based multi-person tracking, target ID locking, last-seen-bearing search, updated controller tuning, and longer safety recovery timing.

The system is still a supervised lab prototype, not a fully deployable product. It can handle short target flicker and some multi-person cases better than before, but long occlusion, crowded environments, transparent obstacles, and lighting changes remain challenging. Overall, the final system demonstrates a complete ROS 2 person-following pipeline with custom perception, tracking, control, safety, and actuation modules running on real TurtleBot hardware.

