User:Sftrabbit/GSoC 2013/Proposal
Proposal for Google Summer of Code 2013
Name
Joseph Mansfield
Email / IRC / WWW
- Email: sftrabbit@gmail.com
- IRC: sftrabbit
Additional Contact Info
- Phone
- Physical Address
Synopsis
Expand and improve Blender's motion tracking features. The motion tracking module is a young but important aspect of Blender and has much room for development. I plan to focus on multicamera solving and reconstruction from footage with variable focal length. In addition, I will make various smaller improvements to the motion tracking workflow and interface.
Benefits to Blender
Motion tracking is an increasingly critical part of digital film and visual effects production. Improved functionality will make Blender much more attractive to amateur and professional filmmakers and animators alike, as its tracking toolset will be more complete and polished.
Multicamera reconstruction will allow tracking to be performed on multiple clips, each with different views of the same scene, to improve camera and object solving. Support for variable focal length footage is required for clips that change focal length for zoom or effect (e.g. dolly zoom). Such features are indispensable for high quality motion tracking.
Development of motion tracking features will also involve collaboration with other open source projects such as libmv. This will continue to promote Blender within the open source development community.
Deliverables
Multicamera reconstruction:
- Ability to view multiple clips simultaneously (e.g. main footage alongside witness cameras), possibly through multiple clip editor spaces or a dedicated multi-clip editing mode.
- User can manually define relations between clips, such as synchronisation, matching of features, triangulation, known positional data, survey data, etc.
- Display relations between clips graphically with appropriate options to configure them.
- Camera intrinsics should be specified for each clip individually (witness cameras are often of lower quality than the main camera).
- Automatically identify common features across clips (perhaps via SIFT) to streamline the multi-view triangulation process. This will probably involve improvements to the current feature detection algorithm.
- Perform camera and object solving using unified data from multiple clips. Likely requires contributions to libmv.
- An appropriate Python API for scripting the above functionality (see the sketch below this list).
- High-quality end-user documentation.
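To give a flavour of the intended scripting surface, here is a minimal sketch of how a multicamera set-up might be driven from Python. None of the multicamera calls exist yet: clip_relations, match_tracks, and solve_multicamera are hypothetical names that only illustrate the shape of the proposed API, while bpy.data.movieclips and the per-clip camera intrinsics are existing Blender API. The clip and track names are placeholders.

    import bpy

    # Hypothetical sketch -- clip_relations, match_tracks and
    # solve_multicamera do not exist; they illustrate the proposed API.
    main = bpy.data.movieclips["shot_main.mov"]        # primary footage
    witness = bpy.data.movieclips["witness_left.mov"]  # witness camera

    # Per-clip intrinsics already exist in Blender's tracking API.
    witness.tracking.camera.focal_length = 24.0

    # Proposed: declare how the clips relate.
    link = main.tracking.clip_relations.new(other=witness)  # hypothetical
    link.frame_offset = 12                   # synchronisation, hypothetical
    link.match_tracks("Track", "Track.001")  # matched features, hypothetical

    # Proposed: solve using the unified data from all linked clips.
    bpy.ops.clip.solve_multicamera()         # hypothetical operator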
Variable focal length tracking:
- Make the focal length property animatable so that it can vary over time.
- Use the graph editor to author complex changes in focal length with smooth, accurate interpolation (sketched below this list).
- Attempt to automatically detect changes in focal length where possible.
- Perform camera and object solving given information about changes in focal length. Likely requires contributions to libmv.
- An appropriate Python API for scripting the above functionality.
- High-quality end-user documentation.
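As a rough illustration of that workflow, the sketch below keyframes the tracking camera's focal length from a script, assuming the property is made animatable as proposed. bpy.data.movieclips and keyframe_insert are existing Blender API; the clip name and focal length values are placeholders, and the resulting F-Curve would then be refined in the graph editor.

    import bpy

    clip = bpy.data.movieclips["dolly_zoom.mov"]  # placeholder clip name
    cam = clip.tracking.camera

    # Today focal_length is a single static value; assuming the proposal
    # makes it animatable, the standard keyframing API would drive it.
    for frame, focal_mm in [(1, 35.0), (48, 50.0), (96, 85.0)]:
        cam.focal_length = focal_mm
        cam.keyframe_insert(data_path="focal_length", frame=frame)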
If designed appropriately, the two features should combine, making it possible to perform motion tracking with multiple cameras that each have a variable focal length.
Project Details
I plan to help augment Blender's motion tracking module by contributing two important features: multicamera solving and variable focal length tracking.
Motion tracking for camera and object solving can be improved and refined by providing more information about the scene. Shooting a scene with multiple cameras provides substantially more data; the extra cameras used to aid matchmoving are often called witness cameras. The ability to solve from multicamera footage would be a valuable addition to Blender's motion tracking toolset.
The user should be able to view and define relations between multiple video clips. Ideally, an operation would automatically detect common features between clips, but the user must always be able to configure the relations manually, which makes manual configuration a good first target. The camera and object solvers can then use the additional data to better reconstruct bundles, and the user interface should make this workflow as intuitive as possible.
Further advancements for multicamera reconstruction might include automatic multi-view feature detection for triangulation, and the ability to object-solve the position of one camera from the perspective of a witness camera to aid camera solving.
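To make the triangulation step concrete, the sketch below performs a minimal linear (DLT) triangulation of a single feature seen by two calibrated cameras. The projection matrices and pixel coordinates would be illustrative inputs; libmv's actual reconstruction refines such estimates with bundle adjustment, so this shows only the underlying idea.

    import numpy as np

    def triangulate(P1, P2, x1, x2):
        """Linear (DLT) triangulation of one feature seen in two views.

        P1, P2: 3x4 camera projection matrices.
        x1, x2: 2D image coordinates of the feature in each view.
        """
        # Each observation gives two linear constraints on the homogeneous
        # 3D point X, derived from x cross (P @ X) = 0.
        A = np.array([
            x1[0] * P1[2] - P1[0],
            x1[1] * P1[2] - P1[1],
            x2[0] * P2[2] - P2[0],
            x2[1] * P2[2] - P2[1],
        ])
        # The least-squares solution of A @ X = 0 is the right singular
        # vector associated with the smallest singular value.
        _, _, Vt = np.linalg.svd(A)
        X = Vt[-1]
        return X[:3] / X[3]  # back to non-homogeneous coordinates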
Video footage often exhibits variable focal length: the camera's focal length may be changed to zoom or for effect (such as the dolly zoom). Changing the focal length alters the apparent depth in the video and, if not accounted for, causes motion tracking software to distort the reconstructed scene in the same way.
Supporting such footage will require making a camera's focal length a dynamic property. The user should be able to use the graph editor to plot the change in focal length over time. The camera and object solvers will then use this extra information to accurately reconstruct the 3D positions of tracks without distortion.
Further advancements for variable focal length tracking might include automatic detection of changes in focal length.
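The distortion is easy to see with the pinhole model: a zoom moves a static point's image position exactly as if the camera had translated toward the scene, which a solver unaware of the zoom will wrongly explain with camera motion. The numbers below are purely illustrative:

    import numpy as np

    def project(f_px, X, principal=(960.0, 540.0)):
        # Pinhole projection of a camera-space point X with focal length
        # f_px (in pixels). A varying focal length means a different
        # intrinsics matrix for every frame.
        cx, cy = principal
        return np.array([f_px * X[0] / X[2] + cx,
                         f_px * X[1] / X[2] + cy])

    X = np.array([0.5, 0.2, 4.0])  # a static point 4 units from the camera
    print(project(800.0, X))       # [1060.  580.]
    print(project(1200.0, X))      # zoomed in: the same point appears to
                                   # move outward, [1110.  600.], mimicking
                                   # a camera move toward the scene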
For each of these features, the back-end motion tracking API will require significant changes. The API should be designed to accommodate both features and to scale to complex multicamera and variable focal length set-ups. Such changes might depend on not-yet-implemented features of libraries such as libmv, so contributions to those open source projects may be necessary.
In addition, I will help polish the motion tracking module in general, including smaller features, its interface and workflow, bug fixes, and API development.
Project Schedule
I expect to work for the duration of GSoC 2013 (17th June to 23rd September). In the interim, I plan to study the Blender source code and contribute bug fixes and wiki edits, so I should be very familiar with the project by the time GSoC officially begins.
- Milestone 1 - Design and implement a scalable back-end API for performing multicamera tracking/reconstruction and any support required for variable focal length footage.
- Milestone 2 - Implement interface for multicamera tracking, and hook it up to the back-end.
- Milestone 3 - Provide mechanism for plotting the variable focal length of a video clip. Incorporate this data into the camera and object solvers.
- Milestone 4 - Provide end-user documentation for all implemented features.
I have a week-long trip planned for the end of July or the beginning of August; it can be scheduled to avoid important dates. I have no other conflicts from work or university.
Bio
I am twenty-two years old and live in the United Kingdom. I am currently in the third taught year of a Computer Science with Artificial Intelligence master's degree at the University of York. I am first and foremost a software developer, with interests in many areas including graphics, computer vision, and game development. I have been looking for a suitable opportunity to contribute to open source software and, given my interests, I feel that working on Blender for GSoC 2013 would be a great fit.
I have extensive experience with C++, and I am pleased to be among the most active users in the C++ tag on Stack Overflow (please see my Stack Overflow profile). I am comfortable with C, have working knowledge of Python, and am familiar with using Blender itself.
Next year is the final year of my degree, and my dissertation will involve computer vision and machine learning: in essence, extracting depth information from 360-degree video for display on a virtual reality headset. Because of this, the motion tracking functionality of Blender would be a great area for me to work on to gain further insight into computer vision.
I also have industry experience from a year-long industrial placement at Philips Research Cambridge, where I worked on projects involving Android development, GPS tracking, and web development. There I improved my ability to work in development teams, including teams with half of their members based in Eindhoven.
Besides software development, I am also interested in writing, playing music, and graphic design.
Previous relevant work:
- 2011-2012 - Year-long industrial placement at Philips Research Cambridge.
- 2013 - University computer vision assignment (submitted by email to Ton Roosendaal): a review of the paper Sharing Visual Features for Multiclass and Multiview Object Detection by A. Torralba et al., discussing and critiquing the JointBoost multitask learning algorithm.
Future relevant work:
- 2013-2014 - Master's dissertation: Learning depth maps from omnidirectional imagery for 360 degree stereo on Oculus Rift