Submitted by SimonJDPrince t3_10jlq1q in MachineLearning

I've been writing a new textbook on deep learning for publication by MIT Press late this year. The current draft is at:

https://udlbook.github.io/udlbook/

It contains a lot more detail than most similar textbooks and will likely be useful for all practitioners, people learning about this subject, and anyone teaching it. It's (supposed to be) fairly easy to read and has hundreds of new visualizations.

Most recently, I've added a section on generative models, including chapters on GANs, VAEs, normalizing flows, and diffusion models.

Looking for feedback from the community.

  • If you are an expert, then what is missing?
  • If you are a beginner, then what did you find hard to understand?
  • If you are teaching this, then what can I add to support your course better?

Plus of course any typos or mistakes. It's kind of hard to proofread your own 500-page book!

310

Comments

arsenyinfo t1_j5m33oc wrote

As a practitioner, I am surprised to see no chapter on finetuning

79

SimonJDPrince OP t1_j5o0orv wrote

Can you give me an example of a review article or chapter in another book that covers roughly what you expect to see?

7

NoRexTreX t1_j5o9hd8 wrote

I can't give example material beyond the Hugging Face documentation, but leveraging pretrained models is the big thing right now, so if your book doesn't mention it then it's missing the hottest topic. Also worth covering: AdapterHub.
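
E.g., a rough sketch with the transformers library (the bert-base-uncased checkpoint and the 2-class head here are just illustrative choices, not a recommendation):

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Pull a pretrained checkpoint from the Hugging Face Hub and attach a
# freshly initialized 2-class classification head for the new task.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# The new head is untrained, so these logits are meaningless until fine-tuned.
inputs = tokenizer("This textbook is great!", return_tensors="pt")
logits = model(**inputs).logits
```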

5

new_name_who_dis_ t1_j5oix1c wrote

Fine-tuning isn't really any different from ordinary training…? You just don't start with a randomly initialized network; besides that and the size of the dataset, there isn't anything different about it.
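
As a minimal sketch of the usual recipe in PyTorch (torchvision's ResNet-18, the 10-class head, and the dummy batch are all just placeholder choices):

```python
import torch
import torch.nn as nn
from torchvision import models

# Start from ImageNet weights instead of a random initialization.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Swap the classification head for the new task (here: 10 classes).
model.fc = nn.Linear(model.fc.in_features, 10)

# Optionally freeze the backbone so only the new head is trained.
for name, param in model.named_parameters():
    param.requires_grad = name.startswith("fc")

# From here it is ordinary supervised training, usually with a small
# learning rate. A dummy batch stands in for a real labeled dataset.
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)
x, y = torch.randn(8, 3, 224, 224), torch.randint(0, 10, (8,))
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()
optimizer.step()
```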

1

SimonJDPrince OP t1_j5olux3 wrote

That was kind of my impression. And I do discuss this in the chapters on transformers and regularization. Was wondering if there is more to it.

3

aristotle137 t1_j5m25xi wrote

Btw, I absolutely loved your computer vision textbook: clear, comprehensible, and so much fun! Best visualizations in the biz. Also loved your UCL course on the subject (I was there 2010/2011). Will definitely check out the next book.

20

Comfortable_End5976 t1_j5m4w76 wrote

Having a skim, it looks good, mate. I like your writing style. Please let us know once it's published and we can pick up a physical copy.

10

Philpax t1_j5lqko9 wrote

Awesome! I'll add it to my reading list :)

7

aamir23 t1_j5mzuw2 wrote

There's another book with the same title: Understanding Deep Learning.

7

Qpylon t1_j5nlguz wrote

That one’s full title is actually “Understanding Deep Learning: Application in Rare Event Prediction”.

4

SimonJDPrince OP t1_j5o0gbn wrote

Yeah -- I feel a bit bad about that, but as someone else pointed out, the title is not actually the same. I should put a link to that book on my website though, so anyone looking for it can find it.

4

taleofbenji t1_j5lw5ie wrote

I love your book and refer to it often. I keep hitting F5 for Chapter 19. :-)

3

K_is_for_Karma t1_j5mh00k wrote

how recent is your chapter on generative models? I’m starting to pursue research in the area and need to get up to date

3

SimonJDPrince OP t1_j5o0jd9 wrote

There are five chapters and around 100 pages. I think it would be a good start.

3

Own_Quality_5321 t1_j5lt463 wrote

Nice. I will have a look and possibly recommend it. Thanks for sharing; that must have been a huge amount of work.

2

promiise t1_j5lu8wl wrote

Nice, thanks for sharing your hard work!

2

PabloEs_ t1_j5mdolq wrote

Looks good and fills a gap; imo there is no good DL book out there. What could be better: state results more clearly, as theorems with all the needed assumptions.

2

profjonathanbriggs t1_j5mljdr wrote

Added to my reading stack. Thanks for this. Will report back with comments.

2

sweetlou357 t1_j5mtytp wrote

Wow this looks like an amazing resource!

2

bacocololo t1_j5obxnl wrote

On page 41, just near Problem 3.9, you wrote "the" twice. Do you need this type of comment too?

2

SimonJDPrince OP t1_j5ocgd0 wrote

Yes! Any tiny errors (even punctuation) are super useful! Couldn't find this though. Can you give me more info about which sentence?

2

bacocololo t1_j5of34s wrote

Just above Section 3.1.2: "...sum of slopes from the the regions".

2

TheMachineTookShape t1_j5ohns9 wrote

There's another on page 349 in section "Combination with other models":

>...will ensure that the the aggregated posterior...

1

SimonJDPrince OP t1_j5olz4s wrote

Thanks! If you send your real name to the e-mail on the front page of the book, then I'll add you to the acknowledgements.

3

TheMachineTookShape t1_j5ofex4 wrote

What is the most efficient way for someone to tell you about typos, or provide suggestions? I'll try to have a read over the weekend.

2

TheMachineTookShape t1_j5ofkzb wrote

Sorry, you've written the instructions right there on the Web page! Just ignore me...

1

LornartheBreton t1_j6i180q wrote

Please let us know when it's published so I can tell my university to buy some copies for its library!

2

like_a_tensor t1_j5n7akg wrote

Very nice work! Do you plan to release any solutions to the problems?

1

SimonJDPrince OP t1_j5o09dv wrote

I will release solutions to about half of them. Have to keep the rest back for professors. You can always message me if you want to know the solution to a particular problem.

2

bacocololo t1_j5ng5fs wrote

Will be pleased to look at it, especially the figures explaining the algorithms. Thanks!

1

bacocololo t1_j5nk0vo wrote

First impression, based on the transformer figures: your book could become a best seller…

1

libai123456 t1_j5nkizf wrote

I really like the book: it provides many beautiful pictures and conveys many of the intuitions behind deep learning algorithms. I really appreciate the work you have put into it.

1

[deleted] t1_j5o68mg wrote

[deleted]

1

SimonJDPrince OP t1_j5ocrdo wrote

I'd say that mine is more internally consistent -- the notation is consistent across all equations and figures. I have made 275 new figures, whereas he has curated existing figures from papers. Mine goes deeper on the topics it covers (only deep learning), but his has much greater breadth. His is more of a reference work, whereas mine is intended mainly for people learning the material for the first time.

Full credit to Kevin Murphy -- writing a book is much more work than people think, so completing that monster is quite an achievement.

Thanks for the tip about Hacker News -- that's a good idea.

3

SatoshiNotMe t1_j5ponoh wrote

Looks like a great book so far. I think it is definitely valuable to focus on giving a clear understanding of selected topics rather than covering everything and compromising depth.

1

bythenumbers10 t1_j5pa9iz wrote

One thing to add: when to reach for deep learning over older, simpler methods. Just an executive summary to keep folks from sandblasting soda crackers, or from being forced to.

1

AdFew4357 t1_j5pqkqb wrote

I have one minor gripe about deep learning textbooks. I think they are great references, but they should not be how beginners get into the field. I genuinely feel a student's time is better spent going down the rabbit hole of actual papers on the topic of maybe one of the chapters; say, a student reads the chapter on graph neural networks and then proceeds to read everything on graph neural networks, rather than reading the whole book across its different subsections.

1

SimonJDPrince OP t1_j5q2nbo wrote

Agreed -- in some cases. It depends on the level of the student, whether they are studying in a class, etc. My goal was to write the first thing you should read about each area.

1

NeoKov t1_j5wmjkr wrote

As a novice, I'm not understanding why the test loss continues to increase -- in general, but also in Fig. 8.2b, if anyone can explain… Does the model continue to update and (over)fit throughout testing? I thought it was static after training. And is the testing batch always the same size as the training batch? And they don't occur simultaneously, right? So the test plot is only generated after the training plot.

1

SimonJDPrince OP t1_j5yc4n2 wrote

You are correct -- they don't usually occur simultaneously. Usually, you would train and then test afterwards, but I've shown the test performance as a function of the number of training iterations, just so you can see what happens with generalization.

(Sometimes people do examine curves like this using validation data, though, so they can see when the best time to stop training is.)

The test loss goes back up because the model classifies some of the test examples incorrectly. With more training iterations, it becomes more certain about its answers (e.g., it pushes the likelihood of its chosen class from 0.9 to 0.99 to 0.999, etc.). For the training data, where everything is classified correctly, that makes the labels more likely and decreases the loss. For the cases in the test data that are classified incorrectly, it makes the true labels less likely, and so the loss starts to go back up.
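
You can see this directly from the cross-entropy loss, which for one example is just -log p, where p is the probability the model assigns to the true class. A quick numerical check:

```python
import math

# Cross-entropy contribution of one example: -log p, where p is the
# probability the model assigns to the TRUE class.
for p in [0.9, 0.99, 0.999]:
    print(f"classified right, p = {p}: loss = {-math.log(p):.4f}")  # shrinks toward 0

for p in [0.1, 0.01, 0.001]:
    print(f"classified wrong, p = {p}: loss = {-math.log(p):.4f}")  # grows without bound
```

So as the model grows more confident, the (shrinking) gains on correctly classified test examples are eventually outweighed by the (growing) losses on the misclassified ones.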

Hope this helps. I will try to clarify in the book. It's always helpful to learn where people are getting confused.

1

NeoKov t1_j5zvrkz wrote

I see, thanks! This seems like a great resource. Thank you for making it available. I’ll post any further questions here, unless GitHub is the preference.

1

SimonJDPrince OP t1_j648ce9 wrote

GitHub or e-mail is better; I'm only occasionally on Reddit.

1

NeoKov t1_j60xeip wrote

Fig. 8.5 mentions a “brown line” for panel b), but the line appears to be black.

1

SimonJDPrince OP t1_j648umm wrote

Thanks! Definitely a mistake. If you send your real name to the e-mail address on the website, I'll add you to the acknowledgements in the book.

Let me know if you find any more.

1

NihonNoRyu t1_j5md892 wrote

Will you add a section for Forward-forward algorithm?

−1

H0lzm1ch3l t1_j5nrium wrote

Didn't Hinton just try to start talking about it again at NeurIPS? Isn't it, like, super irrelevant right now, or am I missing something?

5

SimonJDPrince OP t1_j5o0cj7 wrote

I'm planning to add extra material online for things like this, where it's still unclear how important they are. If they get widely adopted, I'll incorporate them into the next edition.

1

Nhabls t1_j5m4lxa wrote

Obviously I haven't had the time to read through it, and this is a clear nitpick, but I really don't like it when sites like this force you to download files rather than displaying them in the browser by default.

−7