
First Name | Last Name | Position
Mykhaylo | Andriluka | People Detection and Tracking
Roland | Angst | Vision, Geometry, and Computational Perception
Tamay | Aykut |
Vahid | Babaei |
Pierpaolo | Baccichet | Distributed Media Systems
Volker | Blanz | Learning-Based Modeling of Objects
Martin | Bokeloh | Inverse Procedural Modeling
Adrian | Butscher | Geometry Processing and Discrete Differential Geometry
Renjie | Chen | Images and Geometry

Researcher


Dr. Michael Zollhöfer


Visual Computing, Deep Learning and Optimization

Name of Research Group: Visual Computing, Deep Learning and Optimization
Homepage Research Group: web.stanford.edu/~zollhoef
Personal Homepage: zollhoefer.com
Mentor Saarbrücken: Hans-Peter Seidel
Mentor Stanford: Pat Hanrahan
Research Mission: The primary focus of my research is to teach computers to reconstruct and analyze our world at frame rate based on visual input. The extracted knowledge is the foundation for a broad range of applications, not only in visual effects, computer animation, autonomous driving and man-machine interaction, but also in related fields such as medicine and biomechanics. In particular, the increasing popularity of virtual, augmented and mixed reality brings a rising demand for real-time, low-latency solutions to the underlying core problems.

My research tackles these challenges with novel mathematical models and algorithms that enable computers to first reconstruct and subsequently analyze our world. The main focus is on fast and robust algorithms that address the underlying reconstruction and machine learning problems for static as well as dynamic scenes. To this end, I develop key technology to invert the image formation models of computer graphics based on data-parallel optimization and state-of-the-art deep learning techniques.

The extraction of 3D and 4D information from visual data is highly challenging and under-constrained, since image formation convolves multiple physical dimensions into flat color measurements. 3D and 4D reconstruction at real-time rates poses additional challenges, since it lies at the intersection of multiple important research fields, namely computer graphics, computer vision, machine learning, optimization, and high-performance computing. However, a solution to these problems provides strong cues for the extraction of higher-order semantic knowledge. Solving the underlying core problems is important, since it will have a high impact in multiple research fields and provide key technological insights that have the potential to transform the visual computing industry.

In summer 2019, Michael Zollhöfer joined Facebook.

Researcher

Name of Researcher: Aykut, Tamay
Homepage of Research Group: www.tamay-aykut.com
Research Mission

The Visual Computing and Artificial Intelligence (VCAI) group conducts research on visual computing by means of artificial intelligence. A key priority is to improve the visual realism that remote-reality and telepresence applications offer the human user. A 3D impression is achieved by providing omnidirectional (360°) stereo vision.

The user wears a binocular Head Mounted Display (HMD), such as the Oculus Rift VR system, on which the visual content can be either monoscopic or stereoscopic. In the monoscopic case, the same content is shown to both eyes, whereas a stereoscopic visualization conveys depth by showing each eye imagery from a separate vantage point. Stereoscopic VR systems thus enable a 3D impression and provide a more realistic and immersive experience of the remote environment. Most 360° videos available today on platforms like YouTube or Facebook are monoscopic and therefore provide no perception of depth, even when watched through an HMD.

In the case of telepresence, where the visual content needs to be streamed from a remote vision system to a local user over a communication network, the primary goal is to develop vision systems that are not only real-time capable but also provide an omnidirectional 3D impression through stereo vision while keeping the computational and financial burden low. However, sending two complete monocular 360° videos would require substantial communication capacity, even though large portions of the imagery are never displayed to the user. Smart acquisition, streaming and rendering strategies are hence needed to avoid occupying large parts of the communication network with unused data. The main challenge is to select the portions of the user's prospective viewport ahead of time, especially when the content is streamed over a communication network.

The latency between the remote vision system and the local user, known as Motion-to-Photon (M2P) latency, causes incongruities between ego-motion and visual response; once it exceeds a certain threshold, the user suffers from visual discomfort. The VCAI group works on algorithmic and AI-based solutions to overcome the M2P latency that limits Quality of Experience (QoE) while maintaining a high degree of visual comfort.

Name of Research Group: Visual Computing and Artificial Intelligence
