Submitted by Atom_101 t3_1050cw1 in MachineLearning
Hi, so I need to create an application (for Windows and Linux) that runs a few PyTorch models on a user's local device. I have only ever deployed models to the cloud, which is pretty straightforward: package your dependencies and code inside a Docker container, create an API for calling the model, and run it on a cloud instance.
But how do I do it when the model needs to run on the end user's device? Docker doesn't really work, since there seems to be no way to keep the user from accessing the container and hence my source code. I would like to avoid TorchScript, since my models are quite complex and it would take a lot of effort to make everything scriptable. There seems to be a Python compiler called Nuitka that supports PyTorch. But how do Python compilers deal with dependencies? Python libraries can be handled by following import statements, but what about CUDA?
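To be concrete, the TorchScript route I'm weighing (and trying to avoid) would look roughly like the sketch below; tracing sidesteps some of the scripting work, but it still doesn't solve bundling Python or CUDA. TinyModel is just a stand-in for my actual models:

```python
import torch
import torch.nn as nn

# Stand-in for one of my real models; the real ones are much more complex.
class TinyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 8, 3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(8, 2),
        )

    def forward(self, x):
        return self.net(x)

model = TinyModel().eval()
example = torch.randn(1, 3, 224, 224)

with torch.no_grad():
    # Tracing records the ops executed for this example input, so it works
    # without making every code path "scriptable" (as long as control flow
    # doesn't depend on the input).
    traced = torch.jit.trace(model, example)

traced.save("model_traced.pt")  # reloadable via torch.jit.load without the class definition
```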
I would ideally like a single executable with all the libraries and CUDA stuff stored inside. When run, this executable should spawn some API processes in the background and display a frontend that lets the user interact with the models, roughly as sketched below. Is there a better way to achieve this? I would prefer not to make users set up CUDA themselves.
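This is only the shape I have in mind, not a final design; Flask, the port, and the /predict endpoint are placeholders, and the real app would serve my actual models:

```python
import threading
import webbrowser

import torch
from flask import Flask, jsonify, request

app = Flask(__name__)
model = torch.nn.Linear(4, 2).eval()  # stand-in for a real PyTorch model

@app.route("/predict", methods=["POST"])
def predict():
    data = request.get_json()
    x = torch.tensor(data["inputs"], dtype=torch.float32)
    with torch.no_grad():
        y = model(x)
    return jsonify({"outputs": y.tolist()})

def run_api():
    # Bind to localhost only, since everything runs on the user's machine.
    app.run(host="127.0.0.1", port=5000)

if __name__ == "__main__":
    threading.Thread(target=run_api, daemon=True).start()
    webbrowser.open("http://127.0.0.1:5000")  # placeholder for the real frontend
    input("Press Enter to quit...\n")
```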
CyberDainz t1_j3awj2p wrote
look at my project https://github.com/iperov/DeepFaceLive
I made a builder that creates an all-in-one standalone folder for Windows containing everything needed to run a Python application, including the CUDA runtime. The release folder also contains a portable VSCode with a preconfigured project, so you only ever modify the folder's code. No conda, no docker and other redundant shit.
The builder is located here https://github.com/iperov/DeepFaceLive/blob/master/build/windows/WindowsBuilder.py and can be extended to suit your needs.
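The core idea, stripped down to a sketch (folder names here are only illustrative, the real layout and logic live in WindowsBuilder.py): a tiny launcher points PATH at the bundled interpreter and CUDA DLLs, then starts the app with that interpreter, so the user never installs Python or CUDA.

```python
import os
import subprocess
from pathlib import Path

root = Path(__file__).resolve().parent
bundled_python = root / "python" / "python.exe"  # assumed bundled CPython
cuda_bin = root / "cuda" / "bin"                 # assumed bundled CUDA runtime DLLs

env = os.environ.copy()
# Put the bundled CUDA DLLs and interpreter first on PATH so nothing outside
# the folder is ever picked up.
env["PATH"] = os.pathsep.join([str(cuda_bin), str(bundled_python.parent), env.get("PATH", "")])

subprocess.run([str(bundled_python), str(root / "app" / "main.py")], env=env, check=True)
```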