First Name | Last Name | Position |
---|---|---|
Mykhaylo | Andriluka | People Detection and Tracking |
Roland | Angst | Vision, Geometry, and Computational Perception |
Tamay | Aykut | |
Vahid | Babaei | |
Pierpaolo | Baccichet | Distributed Media Systems |
Volker | Blanz | Learning-Based Modeling of Objects |
Martin | Bokeloh | Inverse Procedural Modeling |
Adrian | Butscher | Geometry Processing and Discrete Differential Geometry |
Renjie | Chen | Images and Geometry |
Researcher

Dr. Michael Zollhöfer | Visual Computing, Deep Learning and Optimization
Name of Research Group: | Visual Computing, Deep Learning and Optimization |
Homepage Research Group: | web.stanford.edu/~zollhoef |
Personal Homepage: | zollhoefer.com |
Mentor Saarbrücken: | Hans-Peter Seidel |
Mentor Stanford: | Pat Hanrahan |
Research Mission: | The primary focus of my research is to teach computers to reconstruct and analyze our world at frame rate based on visual input. The extracted knowledge is the foundation for a broad range of applications not only in visual effects, computer animation, autonomous driving and human-machine interaction, but is also essential in related fields such as medicine and biomechanics. In particular, with the increasing popularity of virtual, augmented and mixed reality comes a rising demand for real-time, low-latency solutions to the underlying core problems. My research tackles these challenges with novel mathematical models and algorithms that enable computers to first reconstruct and subsequently analyze our world. The main focus is on fast and robust algorithms that approach the underlying reconstruction and machine learning problems for static as well as dynamic scenes. To this end, I develop key technology to invert the image formation models of computer graphics based on data-parallel optimization and state-of-the-art deep learning techniques. The extraction of 3D and 4D information from visual data is highly challenging and under-constrained, since image formation convolves multiple physical dimensions into flat color measurements. 3D and 4D reconstruction at real-time rates is harder still, since it requires solving unique challenges at the intersection of multiple research fields, namely computer graphics, computer vision, machine learning, optimization, and high-performance computing. A solution to these problems, however, provides strong cues for the extraction of higher-order semantic knowledge. Solving these core problems is important because it will have high impact across all of these fields and yield key technological insights with the potential to transform the visual computing industry. In summer 2019, Michael Zollhöfer joined Facebook. |
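The "inverting the image formation model" idea from the mission statement can be illustrated with a toy analysis-by-synthesis loop. This is a minimal sketch under stated assumptions, not the group's actual pipeline: a 1D Gaussian blob stands in for the renderer, and plain finite-difference gradient descent stands in for data-parallel optimization.

```python
import numpy as np

# Toy stand-in for an image formation model: render a 1D Gaussian blob
# of brightness `amp` centred at `mu` onto a 64-pixel scanline.
xs = np.linspace(0.0, 1.0, 64)
SIGMA = 0.15

def render(mu, amp):
    return amp * np.exp(-((xs - mu) ** 2) / (2.0 * SIGMA ** 2))

# "Observed" image produced by unknown ground-truth parameters.
observed = render(0.6, 1.3)

def energy(mu, amp):
    # Photometric error between the synthesized and observed images.
    r = render(mu, amp) - observed
    return float(np.mean(r * r))

# Analysis by synthesis: invert the renderer by gradient descent on the
# photometric energy (finite-difference gradients for brevity).
mu, amp, lr, eps = 0.4, 1.0, 0.02, 1e-5
for _ in range(10000):
    g_mu = (energy(mu + eps, amp) - energy(mu - eps, amp)) / (2 * eps)
    g_amp = (energy(mu, amp + eps) - energy(mu, amp - eps)) / (2 * eps)
    mu, amp = mu - lr * g_mu, amp - lr * g_amp

print(f"mu={mu:.3f}, amp={amp:.3f}")  # recovers roughly (0.600, 1.300)
```

Real systems replace the blob with a full graphics pipeline and the finite differences with analytic or learned gradients, but the structure of the loop is the same: synthesize, compare against the observation, and update the scene parameters.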
Researcher
- Name of Researcher: Aykut, Tamay
- Homepage of Research Group: www.tamay-aykut.com
- First Name: Tamay
- Last Name: Aykut
- Foto:
- Homepage: www.tamay-aykut.com
- Phone:
- Position:
- Mentor in Saarbruecken: Hans-Peter Seidel
- Mentor in Stanford: Bernd Girod
- Categories: Current Groups
- Research Mission:
- The VCAI group conducts cutting-edge research in visual computing by means of artificial intelligence. A key priority is to bring visual realism in remote-reality and telepresence applications to the human user. A 3D impression is achieved by providing omnidirectional (360°) stereo vision. The user is equipped with a binocular Head-Mounted Display (HMD), such as the Oculus Rift VR system, where the visual content can be either monoscopic or stereoscopic. While in the monoscopic case the same content is shown to both eyes, a stereoscopic visualization provides the perception of depth by presenting different imagery from separate vantage points to each eye. Stereoscopic VR systems thus enable a 3D impression and provide a more realistic and immersive experience of the remote environment. Most 360° videos available today on platforms like YouTube or Facebook are monoscopic and therefore do not provide the perception of depth, even when watched through an HMD. In telepresence, where the visual content must be streamed from a remote vision system to a local user over a communication network, the primary goal is to develop vision systems that are not only real-time capable but also provide an omnidirectional 3D impression through stereo vision while keeping the computational and financial burden low. However, sending two complete monocular 360° videos would require substantial communication capacity, even though large portions of the imagery are never displayed to the user. Smart acquisition, streaming and rendering strategies are therefore needed to avoid occupying large parts of the communication network with unused data. The main challenge is to select the user's prospective viewport ahead of time, especially when streaming over a communication network.
The latency between the remote vision system and the local user causes incongruities between ego-motion and visual response, known as Motion-to-Photon (M2P) latency, which provokes visual discomfort once it exceeds a certain threshold. The VCAI group works on sophisticated algorithmic and AI-based solutions to overcome the QoE-limiting M2P latency while maintaining a high degree of visual comfort.
- Name of Research Group: Visual Computing and Artificial Intelligence
Personal Info
- Photo:
- Website, Blog or Social Media Link:
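The mission statement's two quantitative claims, that streaming the full panorama wastes bandwidth and that M2P latency can be masked by anticipating the viewport, can be made concrete with a back-of-the-envelope sketch. All numbers here (field of view, head speed, latency) are illustrative assumptions, not values from the VCAI group's systems.

```python
# 1) Viewport-dependent streaming: fraction of an equirectangular
#    360-degree frame that an HMD viewport actually displays.
FULL_H, FULL_V = 360.0, 180.0   # full panorama coverage in degrees
FOV_H, FOV_V = 110.0, 90.0      # assumed HMD field of view

# First-order pixel-fraction estimate (ignores equirectangular
# oversampling near the poles).
viewport_fraction = (FOV_H / FULL_H) * (FOV_V / FULL_V)
print(f"displayed: {viewport_fraction:.1%}")  # only ~15% of each frame

# 2) Masking M2P latency: request the viewport where the user will be
#    looking when the frame arrives, here by linearly extrapolating yaw.
def predict_yaw(yaw_deg, yaw_rate_deg_s, m2p_latency_s):
    """Extrapolate head yaw over the motion-to-photon delay."""
    return (yaw_deg + yaw_rate_deg_s * m2p_latency_s) % 360.0

# Turning at 90 deg/s with 150 ms of latency: fetch the viewport
# about 13.5 degrees ahead of the current gaze direction.
print(predict_yaw(45.0, 90.0, 0.150))  # 58.5
```

The first figure motivates viewport-adaptive streaming: per eye, roughly 85% of a naively transmitted panorama is never shown. The second is the simplest possible predictor; the AI-based approaches mentioned above replace linear extrapolation with learned models of head motion.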