First Name | Last Name | Position |
---|---|---|
Mykhaylo | Andriluka | People Detection and Tracking |
Roland | Angst | Vision, Geometry, and Computational Perception |
Tamay | Aykut | |
Vahid | Babaei | |
Pierpaolo | Baccichet | Distributed Media Systems |
Volker | Blanz | Learning-Based Modeling of Objects |
Martin | Bokeloh | Inverse Procedural Modeling |
Adrian | Butscher | Geometry Processing and Discrete Differential Geometry |
Renjie | Chen | Images and Geometry |
Researcher

Dr. Michael Zollhöfer | Visual Computing, Deep Learning and Optimization
Name of Research Group: | Visual Computing, Deep Learning and Optimization |
Homepage Research Group: | web.stanford.edu/~zollhoef |
Personal Homepage: | zollhoefer.com |
Mentor Saarbrücken: | Hans-Peter Seidel |
Mentor Stanford: | Pat Hanrahan |
Research Mission: | The primary focus of my research is to teach computers to reconstruct and analyze our world at frame rate based on visual input. The extracted knowledge is the foundation for a broad range of applications not only in visual effects, computer animation, autonomous driving and human-machine interaction, but is also essential in related fields such as medicine and biomechanics. In particular, with the increasing popularity of virtual, augmented and mixed reality comes a rising demand for real-time, low-latency solutions to the underlying core problems. My research tackles these challenges with novel mathematical models and algorithms that enable computers to first reconstruct and subsequently analyze our world. The main focus is on fast and robust algorithms that approach the underlying reconstruction and machine learning problems for static as well as dynamic scenes. To this end, I develop key technology to invert the image formation models of computer graphics based on data-parallel optimization and state-of-the-art deep learning techniques. The extraction of 3D and 4D information from visual data is highly challenging and under-constrained, since image formation convolves multiple physical dimensions into flat color measurements. 3D and 4D reconstruction at real-time rates is harder still, since it requires solving unique challenges at the intersection of multiple research fields, namely computer graphics, computer vision, machine learning, optimization, and high-performance computing. A solution to these problems, however, provides strong cues for the extraction of higher-order semantic knowledge. Solving these core problems is important because it will have high impact across all of these fields and yield key technological insights with the potential to transform the visual computing industry. In summer 2019, Michael Zollhöfer joined Facebook. |
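The "inverting the image formation model" idea from the mission statement can be illustrated with a toy analysis-by-synthesis loop. This is a minimal sketch under stated assumptions, not the group's actual pipeline: a 1D Gaussian blob stands in for the renderer, and plain finite-difference gradient descent stands in for data-parallel optimization.

```python
import numpy as np

# Toy stand-in for an image formation model: render a 1D Gaussian blob
# of brightness `amp` centred at `mu` onto a 64-pixel scanline.
xs = np.linspace(0.0, 1.0, 64)
SIGMA = 0.15

def render(mu, amp):
    return amp * np.exp(-((xs - mu) ** 2) / (2.0 * SIGMA ** 2))

# "Observed" image produced by unknown ground-truth parameters.
observed = render(0.6, 1.3)

def energy(mu, amp):
    # Photometric error between the synthesized and observed images.
    r = render(mu, amp) - observed
    return float(np.mean(r * r))

# Analysis by synthesis: invert the renderer by gradient descent on the
# photometric energy (finite-difference gradients for brevity).
mu, amp, lr, eps = 0.4, 1.0, 0.02, 1e-5
for _ in range(10000):
    g_mu = (energy(mu + eps, amp) - energy(mu - eps, amp)) / (2 * eps)
    g_amp = (energy(mu, amp + eps) - energy(mu, amp - eps)) / (2 * eps)
    mu, amp = mu - lr * g_mu, amp - lr * g_amp

print(f"mu={mu:.3f}, amp={amp:.3f}")  # recovers roughly (0.600, 1.300)
```

Real systems replace the blob with a full graphics pipeline and the finite differences with analytic or learned gradients, but the structure of the loop is the same: synthesize, compare against the observation, and update the scene parameters.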
Researcher
- Name of Researcher: Aykut, Tamay
- Homepage of Research Group: www.tamay-aykut.com
- First Name: Tamay
- Last Name: Aykut
- Foto:
- Homepage: www.tamay-aykut.com
- Phone:
- Position:
- Mentor in Saarbruecken: Hans-Peter Seidel
- Mentor in Stanford: Bernd Girod
- Categories: Current Groups
- Research Mission:
- The VCAI group conducts cutting-edge research in visual computing by means of artificial intelligence. A key priority is to bring visual realism in remote-reality and telepresence applications to the human user. A 3D impression is achieved by providing omnidirectional (360°) stereo vision. The user is equipped with a binocular Head-Mounted Display (HMD), such as the Oculus Rift VR system, where the visual content can be either monoscopic or stereoscopic. While in the monoscopic case the same content is shown to both eyes, a stereoscopic visualization provides the perception of depth by presenting different imagery from separate vantage points to each eye. Stereoscopic VR systems thus enable a 3D impression and provide a more realistic and immersive experience of the remote environment. Most 360° videos available today on platforms like YouTube or Facebook are monoscopic and therefore do not provide the perception of depth, even when watched through an HMD. In telepresence, where the visual content must be streamed from a remote vision system to a local user over a communication network, the primary goal is to develop vision systems that are not only real-time capable but also provide an omnidirectional 3D impression through stereo vision while keeping the computational and financial burden low. However, sending two complete monocular 360° videos would require substantial communication capacity, even though large portions of the imagery are never displayed to the user. Smart acquisition, streaming and rendering strategies are therefore needed to avoid occupying large parts of the communication network with unused data. The main challenge is to select the user's prospective viewport ahead of time, especially when streaming over a communication network.
The latency between the remote vision system and the local user causes incongruities between ego-motion and visual response, known as Motion-to-Photon (M2P) latency, which provokes visual discomfort once it exceeds a certain threshold. The VCAI group works on sophisticated algorithmic and AI-based solutions to overcome the QoE-limiting M2P latency while maintaining a high degree of visual comfort.
- Name of Research Group: Visual Computing and Artificial Intelligence
Personal Info
- Photo:
- Website, Blog or Social Media Link:
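The mission statement's two quantitative claims, that streaming the full panorama wastes bandwidth and that M2P latency can be masked by anticipating the viewport, can be made concrete with a back-of-the-envelope sketch. All numbers here (field of view, head speed, latency) are illustrative assumptions, not values from the VCAI group's systems.

```python
# 1) Viewport-dependent streaming: fraction of an equirectangular
#    360-degree frame that an HMD viewport actually displays.
FULL_H, FULL_V = 360.0, 180.0   # full panorama coverage in degrees
FOV_H, FOV_V = 110.0, 90.0      # assumed HMD field of view

# First-order pixel-fraction estimate (ignores equirectangular
# oversampling near the poles).
viewport_fraction = (FOV_H / FULL_H) * (FOV_V / FULL_V)
print(f"displayed: {viewport_fraction:.1%}")  # only ~15% of each frame

# 2) Masking M2P latency: request the viewport where the user will be
#    looking when the frame arrives, here by linearly extrapolating yaw.
def predict_yaw(yaw_deg, yaw_rate_deg_s, m2p_latency_s):
    """Extrapolate head yaw over the motion-to-photon delay."""
    return (yaw_deg + yaw_rate_deg_s * m2p_latency_s) % 360.0

# Turning at 90 deg/s with 150 ms of latency: fetch the viewport
# about 13.5 degrees ahead of the current gaze direction.
print(predict_yaw(45.0, 90.0, 0.150))  # 58.5
```

The first figure motivates viewport-adaptive streaming: per eye, roughly 85% of a naively transmitted panorama is never shown. The second is the simplest possible predictor; the AI-based approaches mentioned above replace linear extrapolation with learned models of head motion.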