SaifKhayoon t1_j69e65n wrote on January 28, 2023 at 6:17 PM

They had a problem sourcing labeled 3D video training data. You can tell this tech is still early from the shield in the bottom-right example.

They could generate labeled 3D environments from 2D images using InstantNGP and GET3D, together with LAION's labeled dataset of 5.85 billion CLIP-filtered image-text pairs, to create a useful training dataset. Right now this relies on a workaround: it is trained only on text-image pairs and unlabeled videos, because labeled 3D training data is lacking.

hapliniste t1_j6gvcgp wrote on January 30, 2023 at 5:45 AM

I guess AR glasses will make access to 3D video (as in first-person scanned scenes) way easier, at least for the companies that control the glasses' OS.