Comments

Miserable-Program679 t1_j29bmqy wrote

This question reminds me of a TED talk a guy gave about building a model to detect whether his cat was coming in through the cat door with a dead animal in its mouth. Cool talk and story.

12

rikeshmm t1_j2cvga8 wrote

Interested to check the talk out! Please share link!

1

HGFlyGirl t1_j2aen6d wrote

I trained a model to find duplicate music files in my brother's huge collection of digital music. He was frustrated by so many duplicates that still had different file names, file sizes and tags. We couldn't find any existing software that could do it - because they were all just looking for matches on those parameters. The model ended up working quite well.

5

iantimmis t1_j2bb6x0 wrote

How did you set it up?

1

HGFlyGirl t1_j2cpl6v wrote

For each pair of files, I used their filename lengths, the Levenshtein distance between their filenames, their sizes in bytes, and their durations in ticks.

I used the ML.NET AutoML API to train a binary classifier.
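A minimal sketch of the pair features described above, written in Python rather than the ML.NET stack actually used. The `Track` record, its field names, and the feature order are hypothetical, and it assumes the Levenshtein distance is taken between the two filenames. The resulting feature lists would then go to whatever binary classifier you train on labelled duplicate/non-duplicate pairs.

```python
from dataclasses import dataclass


@dataclass
class Track:
    # Hypothetical record; field names are illustrative, not from the post.
    filename: str
    size_bytes: int
    duration_ticks: int


def levenshtein(a: str, b: str) -> int:
    """Edit distance where insert, delete, and substitute each cost 1."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete from a
                current[j - 1] + 1,      # insert into a
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def pair_features(x: Track, y: Track) -> list:
    """Feature vector for one candidate pair (assumed layout)."""
    return [
        len(x.filename),
        len(y.filename),
        levenshtein(x.filename, y.filename),
        x.size_bytes,
        y.size_bytes,
        x.duration_ticks,
        y.duration_ticks,
    ]
```

Each labelled pair becomes one row: the seven numbers above plus a duplicate / not-duplicate label.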

1

Apprehensive_Maize_4 t1_j2du7wc wrote

>duplicates

fdupes or any of the programs here didn't work for you?

https://www.tecmint.com/find-and-delete-duplicate-files-in-linux/

1

HGFlyGirl t1_j2ewwry wrote

Tried a few of these things. The problem was that a lot of the songs had been ripped from CDs using different software. So some would be called things like track01.mp3, with a duplicate under a completely different file name. These could also have different byte lengths and durations. Then there are the ones that come from the original recording, the live version and/or the compilation album, which often differ a bit in all the parameters.

1

Apprehensive_Maize_4 t1_j2a7tfp wrote

Haven't completed it yet, but I'm working on training an ML model to split some audio from an old TV show that had a foreign-language dub superimposed on it. The TV show is nearly lost, and these dubs are some of the few episodes that remain. I hope I can save one or two episodes this way.

4

Lintaar OP t1_j2bf911 wrote

That's a worthy project, best of luck!

2

bitemenow999 t1_j2bzpsx wrote

I just finished a tool/ML model (a shallow one, technically ML) that suggests keywords and ideas for your next paper/project for maximum citations...

A bit of background for motivation: a couple of weeks ago I was getting drunk with a friend and talking about big tech's monopoly on deep learning and the pattern of publications at major conferences, e.g. this year it was diffusion, last year it was transformers, etc. So I came home a bit drunk and wrote a script to scrape data from papers, nothing fancy, just the keywords and citations. Woke up to a decent-sized dataset, so I trained a quick decision tree (don't know why it made sense to my half-drunk brain). Sent it to some friends in my lab to play with and got some funny results and suggestions, so it looks like I'm going to keep working on it as a side project and add more features.

It takes a couple of keywords from your area as input, e.g. x-ray, cancer detection, or inverse PDE/system identification, and returns the next couple of keywords, like diffusion, transformers, or CLIP-guided, along with a predicted number of citations.

3

Lintaar OP t1_j2c1pzp wrote

A living example of the Ballmer Peak. In all seriousness though, that's pretty impressive to build that much in 24 hours... especially while being drunk. And more impressive, it sounds like it could be an actually useful model if you keep working on it. Good luck!

1

frantonio1 t1_j2dh2nd wrote

I built a model to predict what kind of coffee (hot or iced) to drink given the current weather. I made an iOS shortcut to invoke the model. That was a few years ago, but I still use it.

2

Lintaar OP t1_j2f9ksa wrote

How many/which weather features does it use?

1

K_Siegs t1_j2bf00t wrote

I don't code or even understand how to use ML, but I'm thinking of hiring someone to help me make my job easier. I make sample libraries (pro audio plugins), and sorting audio samples is a painstaking process. If a computer could analyze the specific instrument samples, gain match them based on velocity zone, and delete samples that don't fit, my life would be made much easier. I don't even know where to begin looking, which is why I started following this sub. I have already scripted the editing process.

1

hannahmontana1814 t1_j27p6jg wrote

If I built an ML model for my own use, I'd never have to leave my house again.

−7

kyoko9 t1_j27ot04 wrote

I haven't built any ML models for my own use, but I have built a few for my friends' use.

−12

od3tzk1 t1_j28mvle wrote

Can you tell more about what these models do?

1

kyoko9 t1_j28noy1 wrote

These models are for people who want to feel like they're doing something without actually accomplishing anything.

−19