Viewing a single comment thread. View all comments

SaifKhayoon t1_j69e65n wrote

They had a problem sourcing labeled training data of 3D videos, you can tell this tech is still early from the shield in the bottom right example

They could generate a labeled 3D environments from 2D images using InstantNGP and GET3D with Laion's labeled dataset of 5.85 billion CLIP-filtered image-text pairs to create a useful dataset for training because this currently relies on a workaround of only being trained on text-image pairs and unlabeled videos due to lack of labeled 3D training data.

1

hapliniste t1_j6gvcgp wrote

I guess AR glasses will make access to 3d video (as in first person scanned scenes) way easier (for the companies that control the glasses OS).

2