This project has three aims. First, it seeks to develop a perception-based module that enables robots to analyze the spatiotemporal relationships between humans and objects in the environment and to predict the intent of the human operator. Second, it proposes a vision-based controller that uses this intent-awareness module to optimize robot trajectories. Third, it aims to improve workflow efficiency through task scheduling and allocation mechanisms. Throughout, we intend to place significant emphasis on the principles of swarm robotics, which offer the potential for greater system robustness, flexibility, and scalability.