Submitted by SpatialComputing t3_10nccbg in MachineLearning
SaifKhayoon t1_j69e65n wrote
They had a problem sourcing labeled 3D video training data. You can tell this tech is still early from the shield in the bottom-right example.
They could generate labeled 3D environments from 2D images using InstantNGP and GET3D, together with LAION's labeled dataset of 5.85 billion CLIP-filtered image-text pairs, to create a useful training dataset. Right now this relies on a workaround: it's only trained on text-image pairs and unlabeled videos, due to the lack of labeled 3D training data.
hapliniste t1_j6gvcgp wrote
I guess AR glasses will make access to 3D video (as in first-person scanned scenes) way easier (for the companies that control the glasses' OS).