Submitted by Atom_101 t3_1050cw1 in MachineLearning
Hi, so I need to create an application (for Windows and Linux) that runs a few PyTorch models on a user's local device. I have only ever deployed models to the cloud, which is pretty straightforward: package your dependencies and code inside a Docker container, create an API for calling the model, and run it on a cloud instance.
But how do I do it when the model needs to run on the end user's device? Docker doesn't really work, since there seems to be no way to keep the user from accessing the container and hence my source code. I would like to avoid TorchScript, since my models are quite complex and it would take a lot of effort to make everything scriptable. There seems to be a Python compiler called Nuitka that supports PyTorch. But how do Python compilers deal with dependencies? Python libraries can be handled by following import statements, but what about CUDA?
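To be concrete, the TorchScript route I'm weighing (and trying to avoid) would look roughly like the sketch below; tracing sidesteps some of the scripting work, but it still doesn't solve bundling Python or CUDA. TinyModel is just a stand-in for my actual models:

```python
import torch
import torch.nn as nn

# Stand-in for one of my real models; the real ones are much more complex.
class TinyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 8, 3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(8, 2),
        )

    def forward(self, x):
        return self.net(x)

model = TinyModel().eval()
example = torch.randn(1, 3, 224, 224)

with torch.no_grad():
    # Tracing records the ops executed for this example input, so it works
    # without making every code path "scriptable" (as long as control flow
    # doesn't depend on the input).
    traced = torch.jit.trace(model, example)

traced.save("model_traced.pt")  # reloadable via torch.jit.load without the class definition
```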
I would ideally like a single executable with all the libraries and CUDA stuff stored inside. When run, this executable should spawn some API processes in the background and display a frontend that lets the user interact with the models, roughly as sketched below. Is there a better way to achieve this? I would prefer not to make users set up CUDA themselves.
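This is only the shape I have in mind, not a final design; Flask, the port, and the /predict endpoint are placeholders, and the real app would serve my actual models:

```python
import threading
import webbrowser

import torch
from flask import Flask, jsonify, request

app = Flask(__name__)
model = torch.nn.Linear(4, 2).eval()  # stand-in for a real PyTorch model

@app.route("/predict", methods=["POST"])
def predict():
    data = request.get_json()
    x = torch.tensor(data["inputs"], dtype=torch.float32)
    with torch.no_grad():
        y = model(x)
    return jsonify({"outputs": y.tolist()})

def run_api():
    # Bind to localhost only, since everything runs on the user's machine.
    app.run(host="127.0.0.1", port=5000)

if __name__ == "__main__":
    threading.Thread(target=run_api, daemon=True).start()
    webbrowser.open("http://127.0.0.1:5000")  # placeholder for the real frontend
    input("Press Enter to quit...\n")
```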
CyberDainz t1_j3awj2p wrote
look at my project https://github.com/iperov/DeepFaceLive
I made a builder that creates an all-in-one standalone folder for Windows containing everything needed to run a Python application, including the CUDA runtime. The release folder also contains a portable VSCode with a preconfigured project, so you only ever modify the folder's code. No conda, no docker and other redundant shit.
The builder is located here https://github.com/iperov/DeepFaceLive/blob/master/build/windows/WindowsBuilder.py and can be extended to suit your needs.
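The core idea, stripped down to a sketch (folder names here are only illustrative, the real layout and logic live in WindowsBuilder.py): a tiny launcher points PATH at the bundled interpreter and CUDA DLLs, then starts the app with that interpreter, so the user never installs Python or CUDA.

```python
import os
import subprocess
from pathlib import Path

root = Path(__file__).resolve().parent
bundled_python = root / "python" / "python.exe"  # assumed bundled CPython
cuda_bin = root / "cuda" / "bin"                 # assumed bundled CUDA runtime DLLs

env = os.environ.copy()
# Put the bundled CUDA DLLs and interpreter first on PATH so nothing outside
# the folder is ever picked up.
env["PATH"] = os.pathsep.join([str(cuda_bin), str(bundled_python.parent), env.get("PATH", "")])

subprocess.run([str(bundled_python), str(root / "app" / "main.py")], env=env, check=True)
```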