cautioushedonist t1_iwffey3 wrote
Reply to [D] Phonetic Algorithm Spellcheck Metric by Devinco001
It sounds like, out of all the words with the highest Levenshtein score against the misspelling, you need some way to find out which of them is the most "right" word.
For example -
Original : Mary had a lill lamb
Soundex : Mary had a (little/lit/let/list) lamb
The words in brackets are the ones with a very high Levenshtein score, and you want to find out which one is the most right, right?
Solution -
I would leverage a large language model by "masking" the misspelling and letting the model predict what word should go there. Each of the model's predictions comes with a confidence score, which you can use to build a ranked list of candidates.
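A minimal sketch of that idea, assuming the Hugging Face `transformers` library (the model name and sentence are just examples):

```python
# Minimal fill-mask sketch; assumes `pip install transformers` plus a backend
# like PyTorch. The model choice here is just an example.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="distilbert-base-uncased")

# Replace the suspected misspelling with the model's mask token.
predictions = fill_mask("Mary had a [MASK] lamb", top_k=10)

for p in predictions:
    # Each prediction carries a confidence score -> this is the LLM's ranked list.
    print(p["token_str"], round(p["score"], 4))
```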
So, once you have the LLM's ranked list and Soundex's ranked list, you should be able to come up with a heuristic that picks the most right word based on its rank in each list.
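One possible heuristic (my own suggestion, not the only option) is reciprocal-rank fusion over the two lists:

```python
# Toy rank-combination sketch; the candidate lists below are made up for illustration.
def fuse_ranks(soundex_ranked, llm_ranked, k=60):
    """Both arguments are lists of candidate words, best first."""
    scores = {}
    for ranked in (soundex_ranked, llm_ranked):
        for rank, word in enumerate(ranked):
            scores[word] = scores.get(word, 0.0) + 1.0 / (k + rank + 1)
    return max(scores, key=scores.get)

soundex_candidates = ["little", "lit", "let", "list"]  # from Soundex/Levenshtein
llm_candidates = ["little", "baby", "pet", "white"]    # from the fill-mask model
print(fuse_ranks(soundex_candidates, llm_candidates))  # -> "little"
```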
You should be able to get started on the LLM part easily with the example here - https://huggingface.co/tasks/fill-mask
Lmk if you have any questions!
cautioushedonist OP t1_iw6bez1 wrote
Reply to comment by chewxy in [D] When was the last time you wrote a custom neural net? by cautioushedonist
Shameless (/s) of you not to plug it, if it's public.
cautioushedonist OP t1_iw5y4mu wrote
Reply to comment by AConcernedCoder in [D] When was the last time you wrote a custom neural net? by cautioushedonist
Could many/most of those downloads come from it being listed in the requirements.txt of other widely used repos?
cautioushedonist OP t1_iw5xlhh wrote
Reply to comment by Zephos65 in [D] When was the last time you wrote a custom neural net? by cautioushedonist
I am only slightly old-fashioned :)
cautioushedonist OP t1_iw5vajt wrote
Reply to comment by ThatInternetGuy in [D] When was the last time you wrote a custom neural net? by cautioushedonist
Great comment! It always helps newbies like me to read about how things used to be done.
I was "born" into these luxuries of Hugging Face, papers with GitHub repos, and extensive online community interaction that keeps on giving for years. I feel grateful.
cautioushedonist OP t1_iw5uajx wrote
Reply to comment by moist_buckets in [D] When was the last time you wrote a custom neural net? by cautioushedonist
Astrophysics is an interesting use-case!
Can you share with us what the data looks like? Is it structured, tabular data?
cautioushedonist t1_ivcx548 wrote
Reply to comment by BrisklyBrusque in [D] What are the major general advances in techniques? by windoze
Yes, it's different.
Universal function approximation guarantees/implies that you can approximate any mapping function given the right configuration/weights of a neural net; it doesn't actually guide us to that configuration.
cautioushedonist t1_ivbvhp8 wrote
Reply to [D] Simple Questions Thread by AutoModerator
For someone who has mostly been occupied with the modeling side of things, what's the easiest deployment + monitoring / MLOps framework to pick up that is moderately "batteries included" but still flexible enough to customize?
cautioushedonist t1_iuzeog4 wrote
Not as famous, and it might not qualify as a 'trick', but I'll mention "Geometric Deep Learning" anyway.
It tries to explain all the successful neural nets (CNNs, RNNs, Transformers) within a unified, universal mathematical framework. The most exciting extrapolation is that we'll be able to quickly discover new architectures using this framework.
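As a toy illustration of the symmetry idea the framework is built on (my own example, not taken from the GDL text): convolution commutes with translation of its input, which is the kind of structural argument GDL uses to explain why CNNs work on grids.

```python
# Translation equivariance of a (circular) 1D convolution, in plain numpy.
import numpy as np

def conv1d_circular(x, kernel):
    n, k = len(x), len(kernel)
    return np.array([sum(kernel[j] * x[(i + j) % n] for j in range(k))
                     for i in range(n)])

rng = np.random.default_rng(0)
x, kernel, shift = rng.normal(size=16), rng.normal(size=3), 5

# Shifting then convolving equals convolving then shifting.
assert np.allclose(conv1d_circular(np.roll(x, shift), kernel),
                   np.roll(conv1d_circular(x, kernel), shift))
```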
cautioushedonist t1_iwgzytj wrote
Reply to comment by Devinco001 in [D] Phonetic Algorithm Spellcheck Metric by Devinco001
The example is under "How to use" on:
https://huggingface.co/distilbert-base-uncased?text=The+goal+of+life+is+%5BMASK%5D
https://stackoverflow.com/questions/73370817/how-to-use-gpt-3-for-fill-mask-tasks
Those would be API calls, so you wouldn't need to worry about inference times and model sizes on your end.
So, you can either find the smallest LM that works with fill-mask, or use some API service to get around the size/memory bottlenecks.
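A rough sketch of the API route, assuming the Hugging Face hosted Inference API (the endpoint and response format are my recollection of the docs, so double-check them; `HF_TOKEN` is a placeholder for your own key):

```python
import requests

API_URL = "https://api-inference.huggingface.co/models/distilbert-base-uncased"
headers = {"Authorization": "Bearer HF_TOKEN"}  # placeholder token

resp = requests.post(API_URL, headers=headers,
                     json={"inputs": "Mary had a [MASK] lamb"})

# Expected: a list of candidate dicts with "token_str" and "score".
for candidate in resp.json():
    print(candidate["token_str"], candidate["score"])
```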