Submitted by I_will_delete_myself t3_115z9hc in MachineLearning
gamerx88 t1_j9evm62 wrote
How do you utilize a spot instance for training? How do you automatically resume training from a checkpoint? Or are you referring to something like SageMaker's managed spot training?
I_will_delete_myself OP t1_j9fp5fh wrote
Try checking whether they have an API. Shutdowns are rare, but they do happen; I have only run into one. Having the cloud on your mobile device is great: it lets you check in from anywhere and do some simple things quickly.
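On the resume-from-checkpoint question, here is a minimal sketch of the usual pattern, not any particular provider's or framework's method: save state every few steps, write it atomically, and load it at startup if it exists. The names `save_checkpoint`, `load_checkpoint`, `train`, the `stop_at` argument (which only simulates an interruption), and the toy `weight` update are all made up for illustration; in a real job the state would be your model and optimizer, and the file would live on storage that survives the instance.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state atomically so an interruption never leaves a half-written file."""
    dir_name = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=dir_name, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    # Rename over the old checkpoint; readers see either the old or the new file.
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists yet."""
    if not os.path.exists(path):
        return {"step": 0, "weight": 0.0}
    with open(path) as f:
        return json.load(f)


def train(path, total_steps, save_every=10, stop_at=None):
    """Toy training loop that resumes from `path` when a checkpoint is present."""
    state = load_checkpoint(path)
    while state["step"] < total_steps:
        if stop_at is not None and state["step"] >= stop_at:
            return state  # hypothetical: simulates the instance being shut down
        state["weight"] += 0.5  # stand-in for a real optimizer update
        state["step"] += 1
        if state["step"] % save_every == 0 or state["step"] == total_steps:
            save_checkpoint(path, state)
    return state
```

If the run dies between saves, the next start picks up from the last saved step, so at most `save_every` steps of work are repeated. Launching the same script again on a new instance (by hand or through the provider's API) is then enough to continue.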