Predicting Long-term Visual Outcomes for Robot Manipulation Using Vision-based Techniques

Munir Ahmad; Taib Ali; Nabeel Ali Khan; Asima Afzal; Talha Bin Sohail; Tehmina Shahid

doi:10.21015/vtcs.v12i2.1961

Authors

Munir Ahmad: Abasyn University Peshawar, Islamabad Campus, Pakistan (https://orcid.org/0000-0001-7919-1329)
Taib Ali: University of Management & Technology, Lahore, Pakistan (https://orcid.org/0009-0008-9842-6830)
Nabeel Ali Khan: University of Management & Technology, Lahore, Pakistan (https://orcid.org/0009-0006-4740-8478)
Asima Afzal: Comsats University Islamabad, Pakistan (https://orcid.org/0009-0008-8887-2706)
Talha Bin Sohail: University of South Asia, Lahore, Pakistan (https://orcid.org/0009-0007-4544-8270)
Tehmina Shahid: Lahore College for Women University, Lahore, Pakistan (https://orcid.org/0009-0002-5854-1611)

Abstract

Predicting long-term visual outcomes for robot manipulation tasks is crucial for enabling robots to anticipate future changes in their environment and plan actions accordingly. This research presents a novel approach to long-term visual prediction using vision-based techniques and deep learning models. We propose a hybrid architecture in which a convolutional neural network (CNN) extracts spatial features from each frame and a recurrent neural network (RNN) models how those features evolve over time, so that future visual states can be predicted. The predictive model is trained on annotated datasets of robot manipulation sequences, allowing it to learn complex spatial and temporal relationships in the data. Experimental results demonstrate the effectiveness of the proposed approach in predicting long-term visual outcomes for a variety of manipulation tasks.
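
The hybrid CNN and RNN idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' model: the class name `TinyCnnRnnPredictor`, the layer sizes, the single convolution layer, the plain tanh recurrence, and the linear frame decoder are all assumptions made for illustration, and the weights are random rather than trained. The sketch only shows the data flow: a convolutional encoder turns each frame into a feature vector, a recurrent state accumulates those features over time, and long-term prediction is obtained by feeding each predicted frame back in as the next input.

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Single-channel 'valid' 2-D cross-correlation (no padding, stride 1)."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


class TinyCnnRnnPredictor:
    """Hypothetical illustration of a CNN encoder feeding an RNN (untrained)."""

    def __init__(self, frame_size=8, n_kernels=4, kernel_size=3, hidden=16, seed=0):
        rng = np.random.default_rng(seed)
        feat_side = frame_size - kernel_size + 1
        feat_dim = n_kernels * feat_side * feat_side
        self.frame_size = frame_size
        self.hidden = hidden
        # Spatial part: a bank of convolution kernels.
        self.kernels = rng.normal(0.0, 0.1, (n_kernels, kernel_size, kernel_size))
        # Temporal part: input-to-hidden and hidden-to-hidden weights.
        self.W_xh = rng.normal(0.0, 0.1, (hidden, feat_dim))
        self.W_hh = rng.normal(0.0, 0.1, (hidden, hidden))
        # Decoder: hidden state to a predicted next frame.
        self.W_hy = rng.normal(0.0, 0.1, (frame_size * frame_size, hidden))

    def encode(self, frame):
        """Convolve with each kernel, apply ReLU, and flatten to one vector."""
        maps = [np.maximum(conv2d_valid(frame, k), 0.0) for k in self.kernels]
        return np.concatenate([m.ravel() for m in maps])

    def step(self, frame, h):
        """Update the recurrent state with one frame and predict the next frame."""
        h = np.tanh(self.W_xh @ self.encode(frame) + self.W_hh @ h)
        next_frame = (self.W_hy @ h).reshape(self.frame_size, self.frame_size)
        return next_frame, h

    def rollout(self, context_frames, horizon):
        """Consume observed frames, then predict `horizon` frames autoregressively."""
        if len(context_frames) == 0:
            raise ValueError("at least one context frame is required")
        h = np.zeros(self.hidden)
        for frame in context_frames:
            pred, h = self.step(frame, h)
        preds = []
        for _ in range(horizon):
            preds.append(pred)
            # Feed the prediction back in as the next input frame.
            pred, h = self.step(pred, h)
        return np.stack(preds)


# Usage: five random 8x8 "observed" frames, ten predicted future frames.
model = TinyCnnRnnPredictor()
context = np.random.default_rng(1).random((5, 8, 8))
future = model.rollout(context, horizon=10)
print(future.shape)  # (10, 8, 8)
```

In a trained system of this kind, the weights would be fitted on recorded manipulation sequences, for example by minimizing the error between predicted and observed future frames. Because errors compound when predictions are fed back as inputs, the rollout horizon is the part of such a design that long-term prediction stresses most.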
