Shanmuga

EE 645 3-D Computer Vision 2018-19

Lectures - M Slot
Tutorial - P1 Slot
Lecture and Tutorial Hall - 7-102
Instructor - Shanmuga, AB6-327A, PH 2453. E-mail: shanmuga@iitgn.ac.in
TAs - Gagan Kanojia, Diptiben Patel
Pre-requisite - Nil

Course Content

Review - linear algebra, calculus of variations, signals and systems
Camera and image formation - optics
Feature detectors - edge and corner detection
Feature descriptors - SIFT, SURF, feature matching
Shape from X - reflectance map, BRDF, shape from shading, photometric stereo, depth from defocus, depth from focus, RGB-D images
Single view geometry - finite projective cameras, camera parameters, point correspondences, estimation of camera matrix, direct linear transformation (DLT)
Two view geometry - homography, epipolar geometry, estimation of fundamental matrix, image rectification, stereo correspondence, shape from stereo
Three view geometry - trifocal tensors
Motion - optical flow field, estimation of dense and accurate optical flow field
Multi view geometry - structure from motion, triangulation, factorization, bundle adjustment
Internet vision - mining community photo collections (Flickr, Facebook, etc.)

Textbooks

Hartley, R., & Zisserman, A. (2004). Multiple View Geometry in Computer Vision. 2nd Edition. Cambridge University Press.
Szeliski, R. (2010). Computer Vision: Algorithms and Applications. Springer-Verlag New York Inc. Available Online.
Horn, B. K. P. (1986). Robot Vision. The MIT Press.
Nixon, M. S., & Aguado, A. S. (2012). Feature Extraction & Image Processing for Computer Vision. Third Edition. Academic Press.
Davies, E. R. (2012). Computer and Machine Vision: Theory, Algorithms, Practicalities. 4th Edition. Academic Press.
Forsyth, D. A., & Ponce, J. (2015). Computer Vision: A Modern Approach. Second Edition. Prentice Hall of India.
Klette, R. (2014). Concise Computer Vision: An Introduction Into Theory and Algorithms. Springer Publishing Company, Incorporated.

The first 3 books are the best ones to learn from, while the next 4 books provide alternative treatments of certain topics.

References

Marr, D. (2010). Vision: A Computational Investigation into the Human Representation and Processing of Visual Information. The MIT Press.
Sonka, M., Hlavac, V., & Boyle, R. (2014). Image Processing, Analysis, and Machine Vision. 4th Edition. Cengage Learning.
Trucco, E., & Verri, A. (1998). Introductory Techniques for 3D Computer Vision. Prentice Hall.
Prince, S. J. (2012). Computer Vision: Models, Learning, and Inference. Cambridge University Press. Available Online.
Ikeuchi, K. (2014). Computer Vision: A Reference Guide. Springer Publishing Company, Incorporated.
Fisher, R. B., Breckon, T. P., Dawson-Howe, K., Fitzgibbon, A., Robertson, C., Trucco, E., & Williams, C. K. (2013). Dictionary of computer vision and image processing. John Wiley & Sons.
Awesome Computer Vision Reading List

The book by Marr provides a viewpoint based on visual neuroscience concepts. The next 5 books can be used as references for certain topics. Apart from these books, some topics will be taught from selected research papers.

The lecture notes you make in the classroom will provide pointers to topics in the different books listed above. The topics taught in a lecture may draw on multiple books and research papers. Reading the books will certainly supplement the lectures but can never replace them.

Grading

Programming Assignments - 40%
Course Project - 30%
Final Exam - 30%

Expected Learning Outcomes

The world we live in has three dimensions (3D). The human visual system has evolved to perceive all of these dimensions. However, the images we capture using conventional cameras are just 2D projections of the 3D world. In this 3D Computer Vision course, we shall explore various techniques for recovering the missing third dimension (depth information) from 2D images, primarily using variational methods and projective geometry concepts.
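To illustrate why depth is lost, here is a minimal sketch of an idealized pinhole projection. It is not course material: the function name `project`, the unit focal length, and the sample points are assumptions chosen for illustration only. Two points lying on the same ray through the camera centre land on the same image location, so a single image cannot tell them apart.

```python
def project(point_3d, focal_length=1.0):
    """Project a 3D point (X, Y, Z), given in camera coordinates,
    onto the image plane of an idealized pinhole camera.

    Hypothetical helper for illustration; assumes no lens distortion
    and a principal point at the image origin.
    """
    X, Y, Z = point_3d
    if Z <= 0:
        raise ValueError("point must lie in front of the camera (Z > 0)")
    # Perspective division: the depth Z is divided out and not retained.
    return (focal_length * X / Z, focal_length * Y / Z)


near = (1.0, 2.0, 4.0)  # a point 4 units in front of the camera
far = (2.0, 4.0, 8.0)   # a point on the same ray, twice as far away

print(project(near))  # (0.25, 0.5)
print(project(far))   # (0.25, 0.5)
```

Both points map to the same 2D coordinates, which is the ambiguity that additional cues (a second view, shading, focus, and so on) are needed to resolve.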

The course content will enable students to reconstruct a real-world 3D scene from 2D images using various methods. The applications of this course range from cultural heritage to medical imaging, and from robot navigation to 3D modeling. The assignments and projects associated with the course are to be completed using OpenCV-Python and will enable students to develop state-of-the-art 3D computer vision applications.

This course is offered as an elective for BTech, MTech, and PhD students of IIT Gandhinagar. This course is also prescribed for a minor degree in Computer Science.

Contacting Instructor

The primary mode of contact is to send an email and fix an appointment to meet. Queries may also be posted in Google Classroom.