
cautioushedonist t1_iwgzytj wrote

  1. Can you confirm whether the example on this webpage fits within your size and memory constraints?

The example is under "How to use" on

https://huggingface.co/distilbert-base-uncased?text=The+goal+of+life+is+%5BMASK%5D

  2. Now, if you're open to paying for GPT-3's API, then this answer might be helpful.

https://stackoverflow.com/questions/73370817/how-to-use-gpt-3-for-fill-mask-tasks

These would be API calls, so you wouldn't need to worry about inference time or model size.

So, you can either find the smallest LM that supports fill-mask (sketch below) or use an API service to get around the size/memory bottleneck.
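For the first option, the "How to use" snippet boils down to roughly this - a minimal sketch, assuming you have `transformers` and a PyTorch backend installed:

```python
from transformers import pipeline

# distilbert-base-uncased is one of the smaller fill-mask models (~66M parameters),
# so it's a reasonable starting point if size/memory is the bottleneck.
unmasker = pipeline("fill-mask", model="distilbert-base-uncased")

# The pipeline returns the top predictions for the [MASK] token, each with a confidence score.
for pred in unmasker("The goal of life is [MASK]."):
    print(pred["token_str"], round(pred["score"], 3))
```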

2

cautioushedonist t1_iwffey3 wrote

It sounds like, out of all the words with the highest Levenshtein similarity to the misspelling, you need some way to find out which is the most "right" word.

For example -

Original : Mary had a lill lamb

Soundex : Mary had a (little/lit/let/list) lamb

The words in brackets are the ones with a very high Levenshtein similarity, and you want to find out which one is the most "right", right?

Solution -

I would leverage a large language model by "masking" the misspelling and letting the model predict which word should go there. Each of the model's predictions comes with a confidence score, which lets you build a ranked list of the model's candidates.

So, once you have the LLM's ranked list and Soundex's ranked list, you should be able to come up with a heuristic that picks the most "right" word based on its rank in each list (see the sketch below).

You should be able to get started on the LLM part easily from the example here - https://huggingface.co/tasks/fill-mask
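Putting it together might look something like this - just a rough sketch, where the hardcoded Soundex shortlist and the sum-of-ranks heuristic are placeholders for whatever you end up using:

```python
from transformers import pipeline

unmasker = pipeline("fill-mask", model="distilbert-base-uncased")

sentence = "Mary had a [MASK] lamb"                    # "lill" replaced by the model's mask token
soundex_candidates = ["little", "lit", "let", "list"]  # your Soundex/Levenshtein shortlist, best-first

# Ranked list from the LLM - ask for enough predictions to cover the shortlist.
llm_predictions = unmasker(sentence, top_k=100)
llm_rank = {p["token_str"]: rank for rank, p in enumerate(llm_predictions)}

def combined_rank(word):
    # Lower combined rank = more likely to be the "right" word.
    # Candidates the LLM never predicted get pushed to the bottom of its list.
    return soundex_candidates.index(word) + llm_rank.get(word, len(llm_predictions))

print(min(soundex_candidates, key=combined_rank))  # hopefully "little"
```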

Lmk if you have any questions!

4

cautioushedonist t1_ivbvhp8 wrote

For someone who has mostly been occupied with the modeling side of things, what's the easiest deployment + monitoring / MLOps framework to pick up that is moderately 'batteries included' but still flexible enough to customize?

1

cautioushedonist t1_iuzeog4 wrote

Not as famous, and it might not qualify as a 'trick', but I'll mention "Geometric Deep Learning" anyway.

It tries to explain all the successful neural nets (CNNs, RNNs, Transformers) within one unified, universal mathematical framework. The most exciting extrapolation is that we'll be able to quickly discover new architectures using this framework.

Link - https://geometricdeeplearning.com/

16